Hello everyone,
First of all, I know there have been a lot of discussions about estimating uniquely mapped reads (sorry for bringing it up again).
My question is about QNAME, the first column of a SAM file, i.e.
"Reads/segments having identical QNAME are regarded to come from the same template"
Can this information be used to estimate the number of uniquely mapped reads?
For example, in a SAM file from a paired-end dataset, uniquely mapped reads could be identified as the ones whose QNAME occurs exactly twice (two identical QNAMEs). Similarly, if the same QNAME occurs more than twice, those could be "multi-mapped reads". And if a QNAME occurs only once, it could be classified as a "broken read".
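To make the idea concrete, here is a minimal Python sketch of the counting I have in mind. The function name `classify_qnames` and the toy records are made up for illustration. It assumes a paired-end SAM file in which unmapped reads are not written and there are no supplementary or chimeric records, which may not hold for every aligner or setting.

```python
from collections import Counter


def classify_qnames(sam_lines):
    """Bin templates by how many SAM records share their QNAME.

    Assumption (paired-end data): 2 records = uniquely mapped pair,
    more than 2 = multi-mapped, 1 = "broken" (only one mate present).
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'
        if not line.strip() or line.startswith("@"):
            continue
        qname = line.split("\t", 1)[0]  # QNAME is the first tab-separated column
        counts[qname] += 1

    summary = {"unique": 0, "multi": 0, "broken": 0}
    for n in counts.values():
        if n == 1:
            summary["broken"] += 1
        elif n == 2:
            summary["unique"] += 1
        else:
            summary["multi"] += 1
    return summary


# Toy records (hypothetical, not real alignments)
example = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t255\t50M\t=\t300\t250\t*\t*",
    "r1\t147\tchr1\t300\t255\t50M\t=\t100\t-250\t*\t*",
    "r2\t99\tchr1\t500\t3\t50M\t=\t700\t250\t*\t*",
    "r2\t147\tchr1\t700\t3\t50M\t=\t500\t-250\t*\t*",
    "r2\t355\tchr2\t900\t3\t50M\t=\t1100\t250\t*\t*",
    "r2\t403\tchr2\t1100\t3\t50M\t=\t900\t-250\t*\t*",
    "r3\t73\tchr1\t1500\t255\t50M\t*\t0\t0\t*\t*",
]
print(classify_qnames(example))
```

For a real file, the same function could be given an open file handle, e.g. `with open("aligned.sam") as fh: classify_qnames(fh)`, where `aligned.sam` is a placeholder name.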
I checked this against the output statistics from the STAR aligner and it matches them perfectly (% of uniquely mapped and multi-mapped reads). However, I am not sure about other aligners and don't know if this approach is acceptable.
Can anyone please let me know whether this is an acceptable approach, and/or suggest a better one?
PS: I am just an ordinary biologist without much bioinformatics background.
Thanks