  • bowtie - invalid CIGAR string - wrong sam format

    Hi,

    I am trying to align short reads from Illumina with different aligners.

    I have used the same .fq files to align with BWA, Bowtie and Maq.

    BWA and Maq align without errors and give me the expected output files in the right format.

    But when I use the same fq and reference files to align with Bowtie, I get the following:

    My Command:

    bowtie --chunkmbs 100 -p 2 -X 500 prefixDB -1 /path/to/file/file_1.fq -2 /path/to/file/file_2.fq result.sam

    Result:

    # reads processed: xxx
    # reads with at least one reported alignment: yyy (57.06%)
    # reads that failed to align: zzz (42.94%)
    Reported aaa paired-end alignments to 1 output stream(s)

    So that looks alright, but when I tried converting the resultant sam file to bam:

    samtools view -bS -o result.bam result.sam

    I get:

    [samopen] no @SQ lines in the header.
    [sam_read1] missing header? Abort!

    So I indexed the reference.fasta file and tried:

    samtools view -bt ref.fasta.fai result.sam > result.bam

    I get:

    [sam_header_read2] 1 sequences loaded.
    Parse error at line 1: invalid CIGAR character
    Aborted

    Now this is the first line of result.sam (the file I am trying to convert to bam)

    SRR034509.174/1 + NC_000913.2 1272036 GCACCACAGGCGTCGCCTATCGACTGCCAGAAGAGACGCTGGAGCAGGAACTAACCCTGTTGTGGAAGCGAGAGATGATTAATGGCTGTGTTTGTTTATCA IIII;III>-8II*II.IIIIII@II@:I.IIIIII567II?EI);>>DI,I?H0&7F8AB=*&.;F5';E.(0)2,?,44%$)%!&%#$&"$#$##%!#" 0 81:C>A,82:C>T,89:T>G,90:A>T,92:C>T,93:T>G,94:G>T,95:C>T,96:C>T,98:A>T,100:C>A

    All lines follow the same format, which I realize doesn't look like a typical .sam file (it is missing fields). I am wondering where I have gone wrong and how I can correct this.

    Things to note:

    - I am running the latest versions of bowtie and samtools.
    - My bowtie-build command on reference.fasta produced the right index files, as I have used them to align other .fq reads.
    - As mentioned previously, BWA and Maq worked fine with the same data, so it's very likely that nothing is wrong with the input files.

    Thanks

  • #2
    That's not a .sam file. It's Bowtie native format. You need to add -S or --sam to the command line to make the output .sam format.

    The "+" is the giveaway. SAM output has the bitwise flag, which is a number, in the second column of the output.
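
    The second-column check described above can be sketched as a small shell helper. This is only an illustration, not part of bowtie or samtools: the function name alignment_format and the labels it prints are hypothetical, and it assumes the file is tab-delimited.

```shell
# Hypothetical helper (not part of bowtie or samtools): inspect column 2 of
# the first non-header line. SAM carries a numeric bitwise flag there, while
# the Bowtie native line in this thread has "+" (the strand) instead.
# Assumes tab-delimited input.
alignment_format() {
    awk -F'\t' '
        /^@/ { next }                # skip SAM header lines such as @SQ
        {
            if ($2 ~ /^[0-9]+$/) print "sam-like"
            else                 print "not-sam"
            found = 1
            exit                     # only the first alignment line is checked
        }
        END { if (!found) print "empty" }
    ' "$1"
}
```

    Calling alignment_format result.sam on the file from the original command should print not-sam, since its second column is "+". After rerunning the same bowtie command with -S added, the helper should report sam-like, and samtools view -bS should then find the header it was missing.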



    • #3
      Originally posted by swbarnes2
      That's not a .sam file. It's Bowtie native format. You need to add -S or --sam to the command line to make the output .sam format.

      The "+" is the giveaway. SAM output has the bitwise flag, which is a number, in the second column of the output.
      :P thanku!
