When I mapped RNA-seq data to the reference genome using tophat 1.3.0, I ran into a problem in the SAM output. Version 1.2.0 did not have this problem.
Here is the result:
1. The tophat 1.3.0 result:
HWI-ST_0101:7:8:14864:67306#0 99 scaffold_8 3915632 255 95M = 3915887 420 GGACCGGTAGAAATTTTCCAATGAGAGATCATGTGAAGATTGAAAAGAAGAGTCCATGACAAATTTACATTGGCTGCTGCAATAGCTGAGGAGCG HHHHHHHFHHHHBHHHHHHFHHHHHFHFFHFGGEGFEGEGGGCFECHHHHHHEHHHFHEEBBEFG@AFFFDFFFFF54;.9;4*.:>>8@B#### NM:i:1 NH:i:1
HWI-ST_0101:7:8:14864:67306#0 147 scaffold_8 3915887 255 68M70N27M = 3915632 420 TTTCCAAGTCATCCTCGTTGCCAATCGGTGCTTGACCGTCTTGCTGGGCCTCATGGATGCGACGATGTTGTGCCAGGTTGTCTGATCGAGAAAAG * NM:i:0 XS:A:- NH:i:1
2. The tophat 1.2.0 result:
HWI-ST_0101:7:8:14864:67306#0 99 scaffold_8 3915632 255 95M = 3915887 0 GGACCGGTAGAAATTTTCCAATGAGAGATCATGTGAAGATTGAAAAGAAGAGTCCATGACAAATTTACATTGGCTGCTGCAATAGCTGAGGAGCG HHHHHHHFHHHHBHHHHHHFHHHHHFHFFHFGGEGFEGEGGGCFECHHHHHHEHHHFHEEBBEFG@AFFFDFFFFF54;.9;4*.:>>8@B#### NM:i:1 NH:i:1
HWI-ST_0101:7:8:14864:67306#0 147 scaffold_8 3915887 255 68M70N27M = 3915632 0 TTTCCAAGTCATCCTCGTTGCCAATCGGTGCTTGACCGTCTTGCTGGGCCTCATGGATGCGACGATGTTGTGCCAGGTTGTCTGATCGAGAAAAG *?EFEGGGEGD>GGEGGDFHHHEHEHHHHFHHHHFGGFFHDFHHGGHHHHHHHHHHHHGHHFHHHHHHFHHHHHHHHHHHHHHHHHHHHHHHHHF NM:i:0 XS:A:- NH:i:1
When I run htseq-count on the tophat 1.3.0 output, it reports an error:
python -m HTSeq.scripts.count accepted_hits.unique.sam ../../../pde.release.v3.gff
39609 GFF lines processed.
Error occured in line 876 of file accepted_hits.unique.sam.
Error: ("'seq' and 'qualstr' do not have the same length.", 'line 876 of file accepted_hits.unique.sam')
[Exception type: ValueError, raised in _HTSeq.pyx:765]
Is it a bug?
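To see how many records are affected, here is a minimal sketch that scans SAM lines and reports every record whose SEQ and QUAL fields differ in length, which is the condition the HTSeq error message describes. The function name `find_length_mismatches` is my own and not part of HTSeq or tophat. It assumes the file is tab-separated as in a normal SAM file. Note that the SAM format allows QUAL to be a single `*` when qualities are not stored, so a hit here does not by itself prove the file is invalid; it only shows which lines this HTSeq check would reject.

```python
def find_length_mismatches(sam_lines):
    """Yield (line_number, read_name, seq_length, qual_length) for each
    alignment record whose SEQ and QUAL fields differ in length.

    Hypothetical helper for illustration; not part of HTSeq or tophat.
    """
    for line_number, line in enumerate(sam_lines, start=1):
        # Header lines start with '@' and carry no alignment fields.
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A SAM alignment record has at least 11 mandatory fields.
        if len(fields) < 11:
            continue
        seq, qual = fields[9], fields[10]  # SEQ is field 10, QUAL is field 11
        if len(seq) != len(qual):
            yield line_number, fields[0], len(seq), len(qual)
```

It can be used as `for hit in find_length_mismatches(open("accepted_hits.unique.sam")): print(hit)`. For the second tophat 1.3.0 record above it would report a 95-base SEQ against a 1-character QUAL.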