Dear All,
I am working on a paired-end (75+35 bp) SOLiD dataset. We used LifeScope for the analysis (mapping to the genome, read counting on genes, etc.). However, we found that the mapping quality reported by LifeScope is really low.
For example, one library has around 66 million (M) raw reads, of which 10M were unmapped. Among the mapped reads, 45M have a mapping quality below 10 (most of them are actually 0).
We double-checked our experiment settings, and the sequencing machine also reported good quality for the raw reads. We are now unsure which step could have gone wrong.
I looked at the low-quality alignments in the BAM file. Reads like the following record have mapping quality 0, which is counterintuitive to us:
698_1442_2018 113 Chr01 200535 0 3S72M = 7706595 7506094 TGAGATGATTAACATATAAATCTGTAGCTACATGGAATTAAGGTAGTGGAGCAGAGGAGGAAGGATGATAGAAGA 6;@JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ RG:Z:1_1 NH:i:2 CM:i:0 NM:i:0 CQ:Z:@@@@@@@@@@@@@@>@@@@@@@@@@@@@@@@@@@@@@@@?@;@6@@@@?@@?=@@?6?@@@@=<@@@@@@@@@<8 CS:Z:T022022332132020202202221322011231020303020131132323112230033331103032132221
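To make the record above easier to read, here is a minimal Python sketch that splits out the fields I am looking at (FLAG bits, MAPQ, CIGAR and the optional tags). It uses only the standard library and the field layout from the SAM format; the function name `summarize_sam_record` is just illustrative, and SEQ, QUAL and the CQ/CS tags are replaced or dropped for brevity. It also assumes whitespace-separated fields, as in the pasted record, whereas a real SAM file is tab-delimited.

```python
# Bit meanings of the SAM FLAG field, as defined in the SAM format specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def summarize_sam_record(line):
    """Return the main alignment fields of one SAM record as a dict.

    Illustrative helper, not part of LifeScope or any library.
    """
    fields = line.split()
    flag = int(fields[1])

    # Optional fields come after the 11 mandatory columns, as TAG:TYPE:VALUE.
    tags = {}
    for tag in fields[11:]:
        name, typ, value = tag.split(":", 2)
        tags[name] = int(value) if typ == "i" else value

    return {
        "qname": fields[0],
        "flags": [name for bit, name in FLAG_BITS.items() if flag & bit],
        "rname": fields[2],
        "pos": int(fields[3]),
        "mapq": int(fields[4]),
        "cigar": fields[5],
        "tlen": int(fields[8]),
        "tags": tags,
    }


# The record from above, with SEQ and QUAL replaced by "*" and CQ/CS omitted.
record = (
    "698_1442_2018 113 Chr01 200535 0 3S72M = 7706595 7506094 * * "
    "RG:Z:1_1 NH:i:2 CM:i:0 NM:i:0"
)
summary = summarize_sam_record(record)
print(summary)
```

For this record the sketch reports the FLAG bits paired, reverse_strand, mate_reverse_strand and first_in_pair (the proper_pair bit is not set), MAPQ 0, CIGAR 3S72M, NH = 2 and NM = 0.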
Could anyone shed some light on what might have gone wrong?
Thanks so much!
Best,
Zheng