Dear all,
I believe this questions has been partially brought up several times. But I cannot find a definite answer from all the previous discussion.
I am now working on bam files of paired-end RNA-Seq samples. And I want to extract uniquely mapped paired-end reads (no multiple alignments). Basically, from the summary of tophat below, I want to get the aligned pairs and remove the reads with multiple alignments and disconcordant alignments, which result in 43226526*(1-21.2%-12.6%)=28615960 pe reads.
Left reads:
Input: 160074322
Mapped: 63468641 (39.6% of input)
of these: 15015591 (23.7%) have multiple alignments (13075 have >20)
Right reads:
Input: 160074322
Mapped: 54678109 (34.2% of input)
of these: 12050250 (22.0%) have multiple alignments (7459 have >20)
36.9% overall read alignment rate.
Aligned pairs: 43226526
of these: 9158530 (21.2%) have multiple alignments
and: 5451746 (12.6%) are discordant alignments
23.6% concordant pair alignment rate.
I have tried several command in samtools, including
samtools view -c -f 1 -F 12 accepted_hits.bam
samtools view -c -hf 0x2 accepted_hits.bam.
But all failed to do the job. Could anyone give any suggestions?
Thank you!
I believe this questions has been partially brought up several times. But I cannot find a definite answer from all the previous discussion.
I am now working on bam files of paired-end RNA-Seq samples. And I want to extract uniquely mapped paired-end reads (no multiple alignments). Basically, from the summary of tophat below, I want to get the aligned pairs and remove the reads with multiple alignments and disconcordant alignments, which result in 43226526*(1-21.2%-12.6%)=28615960 pe reads.
Left reads:
Input: 160074322
Mapped: 63468641 (39.6% of input)
of these: 15015591 (23.7%) have multiple alignments (13075 have >20)
Right reads:
Input: 160074322
Mapped: 54678109 (34.2% of input)
of these: 12050250 (22.0%) have multiple alignments (7459 have >20)
36.9% overall read alignment rate.
Aligned pairs: 43226526
of these: 9158530 (21.2%) have multiple alignments
and: 5451746 (12.6%) are discordant alignments
23.6% concordant pair alignment rate.
I have tried several command in samtools, including
samtools view -c -f 1 -F 12 accepted_hits.bam
samtools view -c -hf 0x2 accepted_hits.bam.
But all failed to do the job. Could anyone give any suggestions?
Thank you!