I just got a data set from my collaborator: 2 x 150 bp Illumina RNA-seq data for human samples. We ran QC on the data and trimmed it to remove adaptors and low-quality bases. I then used TopHat2 to map the reads to hg19 in three ways:
1. paired-end mapping with default parameters;
2. single-end mapping for each end separately;
3. paired-end mapping with the inner distance between the two ends set to a mean of 0 and sd=50.
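
To make explicit what the inner distance in option 3 means, here is a minimal sketch of the arithmetic. It assumes the usual definition (fragment length minus the lengths of both reads) and untrimmed 150 bp reads. The function name and the fragment lengths are hypothetical, chosen only for illustration and not measured from this data set.

```python
def inner_distance(fragment_len: int, read_len: int = 150) -> int:
    """Inner distance between mates: fragment length minus both read lengths.

    A negative value means the two mates overlap each other.
    """
    return fragment_len - 2 * read_len


# Hypothetical fragment lengths (bp), for illustration only.
for fragment_len in (400, 300, 200):
    print(fragment_len, inner_distance(fragment_len))
```

With 2 x 150 bp reads, any fragment shorter than 300 bp gives a negative inner distance under this definition, which is why I tried a mean of 0 in option 3.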
The mapping rates for 1 and 3 are both around 45%, but the mapping rate for 2 is around 75%. So most of the reads are mappable but cannot be paired, even when the inner distance between the two ends is allowed to be 0 or negative (if the distance can go negative at all). I took a look at the BAM files of accepted hits. It looks like read pairs with alignment
------------>>>----------------
------------<<<<----------------
can be paired. But read pairs with alignment
-------------->>>>--------------
-----------<<<<---------------
cannot be paired. Does anyone know why, and how I can get those unpaired alignments back? Thanks in advance!