  • Stampy help

    I've recently run Stampy for the first time using this:

    ./stampy.py -t 12 -g xentro3 -h xentro3 -o /ichec/work/nglif015b/stampy-1.0.21/out -M /ichec/work/nglif0/rawdata/batch1/1.control_1.txt /ichec/work/nglif0/rawdata/batch1/1.control_2.txt

    From this I get the output file I specified ('out') and a run summary, which contains:

    stampy: Mapping...
    stampy: # Nucleotides (all/1/2): 9581823944 4790911972 4790911972
    stampy: # Variants: 811646897 406424420 405222477
    stampy: # Fraction: 0.0847 0.0848 0.0846
    stampy: # Paired-end insert size: 141.5 +/- 42.7 (29776104 pairs)
    stampy: Done

    The output file I get doesn't seem to be recognised as SAM (which I assumed was default) and I'm not sure what to do with it - any help would be hugely appreciated! Here is the head and tail of the file:

    @SQ SN:GL172637 LN:7817814 AS:xentro3 SP:frog
    @SQ SN:GL172638 LN:7803671 AS:xentro3 SP:frog
    @SQ SN:GL172639 LN:7514772 AS:xentro3 SP:frog
    @SQ SN:GL172640 LN:6577953 AS:xentro3 SP:frog
    @SQ SN:GL172641 LN:6532643 AS:xentro3 SP:frog
    @SQ SN:GL172642 LN:6423007 AS:xentro3 SP:frog
    @SQ SN:GL172643 LN:5836443 AS:xentro3 SP:frog

    HWI-ST169:272:C0RCGACXX:7:2308:18249:200865 77 * 0 0 * * 0 0 CAGGGCCGTTGGATTTGTGTTGTCGTTTTTGGTTTTCTGTCTCTTTGGGTTCTTCGTTGTGTGGCTGTGCGTAGTCTGGTGGGCTATTAGTGCTGTCTTTC ??1=??BDDDDDBEIIIE2+2::*1?)11??@BDDDD0*09BCDBD/.'-;(.)=(.=??D@@######################################
    HWI-ST169:272:C0RCGACXX:7:2308:18249:200865 141 * 0 0 * * 0 0 GGCCGTGGCCGGCATATTATCGGGTTTTTTTGACGGCCCAGTCCTACGCACCGCCTGCCCCTGCCCCTCTCAACTGGACAACTGGACTGTTTGTGTTGGCG =?+A=0@D)2<CDEEGI*:EF9DG#############################################################################
    HWI-ST169:272:C0RCGACXX:7:2308:18122:200875 77 * 0 0 * * 0 0 CTCTCTTTCAATGTTTGTTACCATGGCGGGTTGGGGCTTTGACTTTGTTCGATGTGTCGGTGCTTTTTTCAGTGGTTTGGTTGGGTTGTGTGGGATCTCAC ??+4=ADDFF4A?G,2CEE,3A###############################################################################
    HWI-ST169:272:C0RCGACXX:7:2308:18122:200875 141 * 0 0 * * 0 0 GTTGTTTTTTCCTGTAGGTTCATTGTGTTCTCCCGACCTACATTACCCCACACTTTTCTTTAGTGCTTGGCGACTACATATGACTTTCATGAGATTGGTTT +1:BA222=C??CBGBHI:+2A22+4A?*:??:??8)8)0?98*?*9BBB'(;.)7=EH>;==)==>37@@##############################
    HWI-ST169:272:C0RCGACXX:7:2308:18307:200879 77 * 0 0 * * 0 0 CCGTTTCAAGTTCCCAGTGGCCCATGATCCTGGTGCTGTTGTGTGCGTTCTTCACAGCGAAGGACGCATCTTTTATTTAAAGCGATTGTTTCTCACCAGTT @@1ADDDDFFFFF,++3<C3<C3+4+22++2:EEG*1CDGGGII:*)08?BF#################################################
    HWI-ST169:272:C0RCGACXX:7:2308:18307:200879 141 * 0 0 * * 0 0 GATTAGTGGAGCGATTTGTACCAAGAGCCTTTCCGAACTATACCAAGCCGCACTTCTTATCAAACTTAGCCAACCGTCTCAGGCGGCCAATTAGAGTTGAG :+4A++2AFHGFHAHGGIH,3A+2++11:?GHIIGGGGHJJAHCDFFGF<5;A;?..=?);77;B@(;;@###############################
    HWI-ST169:272:C0RCGACXX:7:2308:17992:200846 77 * 0 0 * * 0 0 CCCCTCGACACTAAAGGGGACTACTCTGGCAGAGTACCAGTTGTGTGATTAAGACAGTGGAATGGCGATTGAGGGGTCGCAGTCATTGTGTATCGTCATAC @@@DDDDDFFFFF224AF))1:1***1:C*)0)00*00**00BBFFF######################################################
    HWI-ST169:272:C0RCGACXX:7:2308:17992:200846 141 * 0 0 * * 0 0 TAGGCTTGGTCTCTGCCTTAGTGTTTGGGTCCCGTTGCTAAACCTGCCGGGTCTTATTATCCAGCCTGGCTGAGCACCCTACACTTCCATTTCACTGTTTT ++1=+2A)+2<ADGIH+<A+3<CCJHIIJGGEIIGIEGHHHGHGHCEHFGH'[email protected]:?CE:CEDE?BCC#############################
    HWI-ST169:272:C0RCGACXX:7:2308:18578:200773 77 * 0 0 * * 0 0 CGATACTGGTATCTGGACTTTGCGTCTTGAGGTTGATTGAGGGCACTGTGATGGGCTTGTCAGGCGTAATGATGGTGTCGTTGCAGGATTTGGTTGTTTAC ?@<DDDDDHADDH2<+2+3<CF+)4)1?F########################################################################
    HWI-ST169:272:C0RCGACXX:7:2308:18578:200773 141 * 0 0 * * 0 0 ATCCAGTCTGCCTTCAGTAGAGTGTTATCAGTACCTGCGCGTTCCACCCTCACGGTTCAGCTACCTTACGGTGCCGCGCTGCAGGGCCGGAGGAGGTGCTG ++1=42222ACFHG+2+,3<CH<+3CHIDFHDIHGHBEFGH6DBDIGEG=8BG@;5;C;;C?CA?C###################################

    All the very best,

    N
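
A note on the records quoted above: each one has `*` for the reference name, `0` for the position, and a FLAG of 77 or 141. That is valid SAM for read pairs in which neither mate was aligned, so those particular lines do not indicate a malformed file. A minimal sketch of decoding the FLAG column follows; the bit meanings are taken from the SAM specification, while the names `SAM_FLAG_BITS` and `decode_flag` are made up for this illustration.

```python
# Bit values and meanings of the SAM FLAG field (per the SAM specification).
# The short labels are illustrative names, not official identifiers.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


if __name__ == "__main__":
    for value in (77, 141):
        print(value, decode_flag(value))
```

Both values decode to a paired read that is itself unmapped and whose mate is unmapped; 77 marks the first read of the pair and 141 the second.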

  • #2
    Um, how do you know it is not recognized as SAM?


    • #3
      Sorry - I should have included that. I tried to convert to .bam using

      samtools view -hb output.sam

      and get this error message

      [bam_header_read] EOF marker is absent. The input is probably truncated.
      [bam_header_read] invalid BAM binary header (this is not a BAM file).
      [main_samview] fail to read the header from "output.sam".

      Thanks


      • #4
        Okay. That does not necessarily mean that you did anything wrong with stampy - your command looks fine to me (though I am far from being an expert...).

        It seems samtools has a problem reading the file as given, perhaps with the stampy-generated header?

        If you try:

        samtools view -bS output.sam >output.bam


        Does that work?
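
For context: the `[bam_header_read]` messages quoted earlier come from samtools trying to parse the file as binary BAM. In the older samtools releases that print those messages, `view` assumes BAM input unless `-S` says the input is SAM text (newer releases detect the format themselves). The sketch below is one rough way to check which kind of file you have before converting. It relies only on the facts that BAM is BGZF (gzip-compatible) compressed and that SAM is tab-separated text; the function name `sniff_alignment_format` is hypothetical.

```python
def sniff_alignment_format(first_bytes):
    """Guess 'BAM', 'SAM' or 'unknown' from the first bytes of a file.

    This is only a heuristic: any gzip-compressed file (for example a
    gzipped SAM) also starts with the same two magic bytes as BAM.
    """
    # BAM is BGZF-compressed, so it begins with the gzip magic number.
    if first_bytes[:2] == b"\x1f\x8b":
        return "BAM"
    # SAM header lines start with '@' (e.g. @HD, @SQ, @PG).
    if first_bytes[:1] == b"@":
        return "SAM"
    # A header-less SAM record has at least 11 tab-separated fields.
    first_line = first_bytes.split(b"\n", 1)[0]
    if first_line.count(b"\t") >= 10:
        return "SAM"
    return "unknown"


if __name__ == "__main__":
    # Demonstration on in-memory bytes rather than a real file.
    print(sniff_alignment_format(b"@SQ\tSN:GL172637\tLN:7817814\n"))
    print(sniff_alignment_format(b"\x1f\x8b\x08\x04"))
```

On a real file, pass it the first few kilobytes read in binary mode, for example `open(path, "rb").read(4096)`, where `path` is whatever output name you gave Stampy.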


        • #5
          Thanks a lot - that worked!

          Cheers!


          • #6
            OK - everything has worked well. Now I want to view the mapping in IGV. When I load the genome, can I use the sthash or stidx files? I've tried using an index of the genome built by bowtie, but my reads don't appear in the browser (presumably because they were aligned to a different index?).

            Thanks,

            N
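
Some hedged pointers on the IGV question. As far as I know, the `.sthash` and `.stidx` files are Stampy's own index formats, and neither they nor a bowtie index can be loaded as an IGV genome; IGV expects the reference sequence itself (for example the FASTA the index was built from). IGV also needs the BAM to be coordinate-sorted and indexed. Reads can still fail to appear if the sequence names in the alignment header differ from those in the loaded genome. The sketch below compares the `@SQ` names of a SAM header with the names in a FASTA file; the function names and the example inputs are assumptions made for illustration.

```python
def sam_reference_names(sam_lines):
    """Collect the SN: values from the @SQ lines of a SAM header."""
    names = []
    for line in sam_lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment record
        fields = line.split()
        if fields and fields[0] == "@SQ":
            for field in fields[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def fasta_sequence_names(fasta_lines):
    """Collect sequence names (first word after '>') from FASTA lines."""
    names = []
    for line in fasta_lines:
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                names.append(words[0])
    return names


def names_missing_from_genome(sam_lines, fasta_lines):
    """Return SAM reference names that do not occur in the FASTA."""
    genome = set(fasta_sequence_names(fasta_lines))
    return [name for name in sam_reference_names(sam_lines) if name not in genome]


if __name__ == "__main__":
    # Made-up inputs; the FASTA content here is purely illustrative.
    sam = ["@SQ SN:GL172637 LN:7817814 AS:xentro3 SP:frog\n",
           "@SQ SN:GL172638 LN:7803671 AS:xentro3 SP:frog\n"]
    fasta = [">GL172637 example description\n", "ACGT\n"]
    print(names_missing_from_genome(sam, fasta))
```

If the returned list is not empty, the genome loaded in IGV uses different sequence names from the alignments, which would be one explanation for an empty track.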
