  • Difficulty with bowtie's output sam format

    Hello. I was wondering if I could get some help.

    I'm trying to use bowtie to align a fasta file using an index and output it in sam format. Later on, I'll need to process the output sam file using another script. Anyway, I called bowtie with these options

    bowtie -f -t -p 8 -n 3 -l 32 -k 1 -m 100 -S -y --chunkmbs 1024 --max FASTA_FILE.mm.fasta --best

    using input of the form

    >38-1
    TGGAACGGAACGGAATGGAAGGGAATGGAATGGAAT


    and got output of the form

    38-1 0 chrY:28807964-28808132 275 255 36M * 0 0 TGGAACGGAACGGAATGGAAGGGAATGGAATGGAAT IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII XA:i:2 MD:Z:5T14A15 NM:i:2

    Maybe I'm misunderstanding something, but this doesn't exactly appear to be in sam format. What I would like is for the coordinate of the leftmost position of the sequence to be in the fourth field (instead of 275--I'm not sure what that number represents, but given the label in field three, I don't think it's the coordinate) and for only the chromosome to be in the third field. I could manually modify the output, but I'm afraid to throw away that 275 because I have no idea what it is.

    Does anyone know what I'm doing wrong? Any help is appreciated. If you need more information, I'll do my best to provide it.

    Thanks,

    David

  • #2

    Your parameters seem OK, I assume you did also include the name of your index.

    It does look like sam format, although as you say, the 275 seems difficult to explain.

    Have you compared your read sequence to the sequence of whatever is called chrY:28807964-28808132 in your reference to see if the alignment makes sense?
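A minimal sketch of the comparison suggested above, assuming an ungapped forward-strand alignment (a CIGAR such as 36M) and a 1-based POS. The function name `find_mismatches` and the toy reference are made up for illustration: the toy is a 10-N pad followed by the 36-base stretch of the posted reference that the read lines up with.

```python
def find_mismatches(reference, pos, read):
    """Compare a read to the reference window starting at 1-based `pos`.

    Assumes an ungapped, forward-strand alignment. Returns a list of
    (0-based read offset, reference base, read base) for each mismatch.
    Comparison is case-insensitive because soft-masked references use
    lowercase letters.
    """
    start = pos - 1  # SAM POS is 1-based; Python slices are 0-based
    window = reference[start:start + len(read)]
    if len(window) != len(read):
        raise ValueError("read runs past the end of the reference")
    return [
        (i, ref_base.upper(), read_base.upper())
        for i, (ref_base, read_base) in enumerate(zip(window, read))
        if ref_base.upper() != read_base.upper()
    ]

# Toy reference (hypothetical padding, real 36-base window from the thread).
toy_reference = "N" * 10 + "tggaatggaacggaatggaaaggaatggaatggaat" + "N" * 5
read = "TGGAACGGAACGGAATGGAAGGGAATGGAATGGAAT"

mismatches = find_mismatches(toy_reference, 11, read)
print(mismatches)
```

The two mismatches it reports, at read offsets 5 and 20, agree with the MD:Z:5T14A15 and NM:i:2 tags in the posted alignment.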


    • #3
      Check this: http://bowtie-bio.sourceforge.net/ma...-bowtie-output


      • #4
        I think that's all proper SAM format

        38-1 is the name of the read, 0 is the flag (the read maps to the forward strand), the reference is named "chrY:28807964-28808132", 275 is the position, 255 is the mapping quality (255 means that the mapping quality is not available), 36M is the CIGAR, etc.
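The field-by-field reading above can be sketched in code. This is only an illustration, assuming the line is tab-separated as in a real SAM file (the forum display shows spaces); `parse_sam_line` is a hypothetical helper, not part of bowtie or any library.

```python
# The 11 mandatory SAM columns, in the order given by the SAM specification.
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
INT_FIELDS = ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN")

def parse_sam_line(line):
    """Split one SAM alignment line into a dict of named fields.

    Optional tags (columns 12 and beyond) are kept as a list under "TAGS".
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) < len(SAM_FIELDS):
        raise ValueError("expected at least 11 tab-separated columns")
    record = dict(zip(SAM_FIELDS, cols[:11]))
    for name in INT_FIELDS:
        record[name] = int(record[name])
    record["TAGS"] = cols[11:]
    return record

# The alignment from the first post, rebuilt with tabs.
example = "\t".join([
    "38-1", "0", "chrY:28807964-28808132", "275", "255", "36M", "*", "0", "0",
    "TGGAACGGAACGGAATGGAAGGGAATGGAATGGAAT", "I" * 36,
    "XA:i:2", "MD:Z:5T14A15", "NM:i:2",
])
record = parse_sam_line(example)
print(record["RNAME"], record["POS"], record["MAPQ"])
```

Here RNAME is the whole FASTA header of the reference record and POS is the 1-based offset within that record, which is why 275 is not a chromosome coordinate.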


        • #5
          Thanks for the quick responses, everyone.

          Quoting #2:

          "Your parameters seem OK, I assume you did also include the name of your index.

          It does look like sam format, although as you say, the 275 seems difficult to explain.

          Have you compared your read sequence to the sequence of whatever is called chrY:28807964-28808132 in your reference to see if the alignment makes sense?"

          I did include the name of the index when I called bowtie. I found the reference sequence in the fasta file on which the index was based. The sequence corresponding to chrY:28807964-28808132 is

          >chrY:28807964-28808132
          NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNtgaagtggagtggagtgtaacgaaatggggtggaatgtaattgaatggagtggagtgtttggagtctactggagtggaatggaacggaatggaaaggaatggaatggaatggagtgaagtgcagtgcagtgaaatggagtggaaaggaatggaatggaatcaaatggaNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN


          This doesn't match exactly, but that can be acceptable from what I understand, so I believe this alignment makes sense. That being said...

          Quoting #4:

          "I think that's all proper SAM format

          38-1 is the name of the read, 0 is the flag (the read maps forward) the reference is named "chrY:28807964-28808132", 275 is the position, 255 is the mapping quality (255 seems to mean that the mapping quality is not available), 36M is the CIGAR, etc"

          At first I was going to say that 275 can't be the position, since I would expect the position to fall between 28807964 and 28808132. But then I realized that this number is the position within the reference sequence shown above, counted from its first character, and sure enough, when I checked the characters starting there, they matched the ones given in the bowtie output!

          Anyway, I feel comfortable with making some manual adjustments now.
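One way such a manual adjustment could be scripted is sketched below. Everything here is an assumption rather than bowtie behavior: the helper names are hypothetical, the reference name is assumed to have the form chrom:start-end, `flank` is the number of padding bases (the leading Ns) placed before the named interval, and whether the start in the name is 1-based or 0-based depends on how the reference FASTA was generated.

```python
import re

def leading_n_count(sequence):
    """Count the run of N/n characters at the start of a reference record."""
    return len(sequence) - len(sequence.lstrip("Nn"))

def to_genomic(rname, pos, flank, start_is_one_based=True):
    """Convert a 1-based POS within a padded record to a chromosome coordinate.

    rname: reference name of the form "chrom:start-end"
    pos:   1-based position reported in SAM field 4
    flank: number of padding bases before the named interval
    Returns (chrom, 1-based genomic position).
    """
    match = re.fullmatch(r"(.+):(\d+)-(\d+)", rname)
    if match is None:
        raise ValueError("reference name is not of the form chrom:start-end")
    chrom = match.group(1)
    start = int(match.group(2))
    first_base = start if start_is_one_based else start + 1
    offset_in_interval = pos - flank - 1  # 0 for the interval's first base
    if offset_in_interval < 0:
        raise ValueError("alignment starts inside the leading padding")
    return chrom, first_base + offset_in_interval

# flank=200 is an assumed padding length for illustration only; measure it
# on your own FASTA record with leading_n_count().
chrom, genomic_pos = to_genomic("chrY:28807964-28808132", 275, flank=200)
print(chrom, genomic_pos)
```

Writing `chrom` into field 3 and `genomic_pos` into field 4 keeps the original 275 recoverable, since it can be recomputed from the same offsets.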

          Thanks again for your help.
