09-18-2014, 03:46 AM





Question aligning nucleoplasm fraction RNA

Hi, this is my first post, so I hope I don't miss any details.

I'm trying to align RNA sequenced from the subcellular nucleoplasm fraction using TopHat2. The command I'm using is something like this:

tophat2 --num-threads 5 -g 1 --library-type fr-secondstrand -a 5 -r 3000 -o ./Nucleop_untreated_rep1_topHat /GenoStorage/Genomas/hg19/Genome_indexFiles/Bowtie2/hg19 NucleopRNA_1_1.fastq,NucleopRNA_2_1.fastq NucleopRNA_1_2.fastq,NucleopRNA_2_2.fastq

I used -r 3000 because, when I previously aligned chromatin fraction RNA, that was the only setting that got the reads properly paired.

After aligning, my align_summary.txt file looks like this:
Left reads:
Input: 23788444
Mapped: 22989305 (96.6% of input)
of these: 148191 ( 0.6%) have multiple alignments (148191 have >1)
Right reads:
Input: 23788444
Mapped: 22863972 (96.1% of input)
of these: 151161 ( 0.7%) have multiple alignments (151161 have >1)
96.4% overall read alignment rate.

Aligned pairs: 22400527
of these: 4825 ( 0.0%) have multiple alignments
and: 8473407 (37.8%) are discordant alignments
58.5% concordant pair alignment rate.
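
For anyone checking the percentages, here is a minimal sketch of how they appear to follow from the counts above. The formulas are an inference from the numbers in this summary (discordant relative to aligned pairs, concordant relative to input pairs), not something taken from the TopHat2 documentation, and the variable names are only illustrative.

```python
# Counts copied from the align_summary.txt output above.
input_pairs = 23788444       # "Input" (same for left and right reads)
aligned_pairs = 22400527     # "Aligned pairs"
discordant_pairs = 8473407   # "are discordant alignments"

# Assumption: concordant pairs are the aligned pairs that are not discordant.
concordant_pairs = aligned_pairs - discordant_pairs

# Discordant share is relative to aligned pairs; concordant rate is
# relative to all input pairs (inferred from the reported figures).
discordant_pct = 100 * discordant_pairs / aligned_pairs
concordant_pct = 100 * concordant_pairs / input_pairs

print(f"discordant: {discordant_pct:.1f}% of aligned pairs")
print(f"concordant: {concordant_pct:.1f}% of input pairs")
```

Both values round to the figures TopHat2 printed (37.8% and 58.5%), so the summary is at least internally consistent.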

Because of the high % of discordant alignments, I used samtools flagstat to check for any issues:
46399134 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
46399134 + 0 mapped (100.00%:-nan%)
46399134 + 0 paired in sequencing
23308173 + 0 read1
23090961 + 0 read2
19935278 + 0 properly paired (42.96%:-nan%)
45487286 + 0 with itself and mate mapped
911848 + 0 singletons (1.97%:-nan%)
11390956 + 0 with mate mapped to a different chr
11390956 + 0 with mate mapped to a different chr (mapQ>=5)
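
The proportions in the flagstat output above can be worked out with a short script. This is only a sketch: the parsing assumes the older flagstat text layout shown here ("N + M label"), and `parse_flagstat` and `percent` are hypothetical helper names, not part of samtools.

```python
import re

# flagstat output pasted from the post (older samtools text format).
FLAGSTAT = """\
46399134 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
46399134 + 0 mapped (100.00%:-nan%)
46399134 + 0 paired in sequencing
23308173 + 0 read1
23090961 + 0 read2
19935278 + 0 properly paired (42.96%:-nan%)
45487286 + 0 with itself and mate mapped
911848 + 0 singletons (1.97%:-nan%)
11390956 + 0 with mate mapped to a different chr
11390956 + 0 with mate mapped to a different chr (mapQ>=5)
"""


def parse_flagstat(text):
    """Return {label: QC-passed count} for each flagstat line."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"(\d+) \+ (\d+) (.+)", line.strip())
        if not m:
            continue
        # Drop the trailing "(QC-passed ...)" or "(xx%:yy%)" annotation,
        # but keep "(mapQ>=5)" so that line stays a distinct key.
        label = re.sub(r"\s*\((?:QC-passed.*|[^()]*%[^()]*)\)$", "", m.group(3))
        counts[label] = int(m.group(1))
    return counts


def percent(counts, label):
    """Percentage of all reads that fall under the given label."""
    return 100 * counts[label] / counts["in total"]


counts = parse_flagstat(FLAGSTAT)
for label in ("properly paired", "singletons",
              "with mate mapped to a different chr"):
    print(f"{label}: {percent(counts, label):.2f}%")
```

On these numbers, roughly 24.5% of all reads have a mate mapped to a different chromosome, and every one of them passes the mapQ>=5 filter.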

So it seems that in many read pairs the two mates map to different chromosomes.
I have tried varying some parameters (library type, -r) but had no success.
I should also mention that this happens in 2 different experiments (with 2 different cell types), in a total of 4 datasets (one for one, three for the other).
I also looked at other (rare) cases where subcellular fraction RNA was aligned, but they didn't report anything like this.

Thanks in advance to anyone who can help
tomgom

