Hello, I’m hoping someone can help me out with this issue. I’ve tried looking for the answer here as well as across the web and can’t find a solution. I apologize if I’ve missed something.
I have a Trinity-based de novo reference transcriptome and I’m attempting to align the transcripts to the reference genome. I’m a novice, so I’m following a protocol, and I hit the error below when I try to convert my SAM file into a BAM file. I’m looking for a fix to the underlying problem: I’ve already tried removing the offending transcript from the SAM file, but I still get the same error, just on a different transcript farther down in the file. Thanks for any guidance.
gmap -n 0 -D /cribi -t 30 -d cribi 588160VVout.fasta -f samse > 588160.sam
samtools view -Sb 588160.sam > 588160.bam
[E::sam_parse1] CIGAR and query sequence are of different length
[W::sam_read1] parse error at line 979
[main_samview] truncated file.
Here is line 979. The sequence in this record is 644 bp long. I don’t understand how the length implied by the CIGAR string could differ from the sequence length if the sequence was used to produce the alignment in the first place.
c33925_g1_i4 0 chr5 17008787 40 92M31I39M21I184M89N33M97N68M151N24M * 0 0 CGACATTTCGCCCCCCTCTTTCGTTTCGCCGACCTCCGAAACACGAAATTTCGCCGAAATTTCGGCGAAAAAAACAGAAATTTTCATCCTTGCCCATAACACAAGATTTCCATTTAAAACCCCCTTAAAACCCTAGCTACAGAGTCCACCGGCAGCGCCTGCGTCGAAACCTGCCCAACACCTCCAGGTCGGGTTAAAGGCTTGCTGGATCAGAGTCTAAACCATCTGGCAAGTTGGTTTTCTGTGGGACGTAACGGTGGAGAGTCTAGAGCTGGTGAAGCGCTTTGTTCTCTGCCCTGGGTGTCTTTAGGGGAGGAGAAACTCAGTTTCGAACTGGTCGCCGGAAGTAAACCCTCTGCGGGTCGGTTGGCCGAGACCAAGAGAGAACGCTCAGACCGGTTATCTGAATTGGAATATATCTCCAATGGAGATTAATTTGGGAAAGCTTCCATTCGATATTGATTTCCATCCTTCCAACACGCTCGTTGCTGCAGGTCTGATTAATGGAGACCTTCACTTGTATCGTTATGGTGCCAATTCCTTACCACAAAGGCTATTGGAGGTTCACGCCCATGGCGAAGAATCTTGTAGAGCTGTTCGTTTTATCAACGAGGGACATGCAATTGTGACCGGTTCTCCAGACC * MD:Z:108GCTT3T9C20G8C31A1T216A33 NH:i:2 HI:i:1 NM:i:63 SM:i:40 XQ:i:40 X2:i:0 XO:Z:UM XS:A:+ PG:Z:M
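To see how far apart the two lengths are, and how many records are affected, the check that samtools performs can be approximated in a few lines of Python. This is only a sketch: the function names `cigar_query_length` and `find_mismatches` are made up for illustration, and it assumes a tab-delimited SAM file with the CIGAR in column 6 and the sequence in column 10. Per the SAM specification, only the M, I, S, = and X operations consume query bases, so only those are summed.

```python
import re

# CIGAR operations that consume query (read) bases, per the SAM specification.
QUERY_CONSUMING = set("MIS=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar):
    """Sum the lengths of the CIGAR operations that consume query bases."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in QUERY_CONSUMING)


def find_mismatches(sam_lines):
    """Yield (line_number, read_name, cigar_length, seq_length) for each
    record whose CIGAR-implied length differs from its SEQ length."""
    for lineno, line in enumerate(sam_lines, start=1):
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        cigar, seq = fields[5], fields[9]
        if cigar == "*" or seq == "*":  # nothing to compare
            continue
        qlen = cigar_query_length(cigar)
        if qlen != len(seq):
            yield lineno, fields[0], qlen, len(seq)


# CIGAR string from line 979 of the SAM file above
print(cigar_query_length("92M31I39M21I184M89N33M97N68M151N24M"))  # prints 492
```

For the record on line 979 the CIGAR accounts for 492 query bases against the 644 bp sequence, a gap of 152 bases. Passing an open file handle, for example `find_mismatches(open("588160.sam"))`, should list every record with the same problem rather than one per samtools run. A gap like this could be explained by unaligned ends of the transcript not being recorded as soft clips (S) in the CIGAR, but that is a guess, not something the error message confirms.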