#1 |
Member
Location: North Carolina Join Date: Sep 2011
Posts: 38
I have an RNA-seq dataset of single-end 100bp reads (30 million per sample). On my first run with TopHat2, the mapping rate to the reference genome was only 5%. I then tried trimming the raw data to 40-100bp, and the mapping rate increased to 18%. I'm running the mapping on the untrimmed data right now...
What else can I try to increase the mapping rate? Restrict the trimmed read length to 50-100bp? Raise the Phred score cutoff based on the FastQC results? Any comments will be appreciated!
#2 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,080
Can you post the FastQC plots so we can see what the data looks like? There is no point in trimming the data at random.
Take a few reads and run an old-fashioned BLAST search to make sure the data is from your sample and the correct genome. Mistakes sometimes happen at sequencing cores.
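A minimal sketch of the "take a few reads" step, assuming a plain four-line-per-record FASTQ file. The helper names (`read_fastq`, `sample_reads`, `to_fasta`) are hypothetical and not part of any tool mentioned in this thread. It draws a small random sample of reads and prints them as FASTA, which can then be pasted into a BLAST search.

```python
import io
import random


def read_fastq(handle):
    """Yield (name, sequence) pairs from a FASTQ handle with 4 lines per record."""
    while True:
        header = handle.readline()
        if header == "":
            return  # end of file
        if not header.strip():
            continue  # skip stray blank lines
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality string, not needed for BLAST
        yield header.strip()[1:], seq


def sample_reads(records, k, seed=0):
    """Reservoir-sample k records so the whole file never has to fit in memory."""
    rng = random.Random(seed)
    reservoir = []
    for i, rec in enumerate(records):
        if i < k:
            reservoir.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = rec
    return reservoir


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Tiny in-memory demo; for real data, replace `demo` with open("1.fastq").
demo = io.StringIO("@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nGGGGCCCC\n+\nIIIIIIII\n")
print(to_fasta(sample_reads(read_fastq(demo), k=2)), end="")
```

Sampling from the whole file rather than taking the first few records avoids reads from the edge of the flow cell, which are often of lower quality; that choice is an assumption of this sketch, not something the poster specified.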
#3 |
Member
Location: North Carolina Join Date: Sep 2011
Posts: 38
The alignment of the untrimmed data just finished, and the mapping rate is roughly 15%:
15.66% overall alignment rate
I will post the FastQC plots soon. Thank you!
#4 |
Member
Location: North Carolina Join Date: Sep 2011
Posts: 38
Attached is the FastQC report from before trimming.
#5 |
Member
Location: North Carolina Join Date: Sep 2011
Posts: 38
This is the FastQC report after trimming with Trimmomatic (reads of 40-100bp are kept):
java -jar /usr/local/apps/trimmomatic/Trimmomatic-0.32/trimmomatic-0.32.jar SE 1.fastq 1.trimmed.fastq ILLUMINACLIP:/usr/local/apps/trimmomatic/Trimmomatic-0.32/adapters/TruSeq3-SE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:40
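The "40-100bp" range follows from the command above: the input reads are 100bp, and `MINLEN:40` drops any read shorter than 40 bases after clipping and quality trimming. A rough way to confirm this on the trimmed file is to tabulate read lengths. This is a sketch only; `length_histogram` and `summarize` are hypothetical helper names, and the input is assumed to be a plain four-line-per-record FASTQ.

```python
from collections import Counter


def length_histogram(fastq_lines):
    """Count read lengths; the sequence is the 2nd line of each 4-line record."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            counts[len(line.strip())] += 1
    return counts


def summarize(counts):
    """Return read count plus min, max, and mean length, or None if empty."""
    total = sum(counts.values())
    if total == 0:
        return None
    mean = sum(length * n for length, n in counts.items()) / total
    return {"reads": total, "min": min(counts), "max": max(counts), "mean": mean}


# Usage on real data (file name taken from the command above):
#     with open("1.trimmed.fastq") as fh:
#         print(summarize(length_histogram(fh)))
```

If the reported minimum is 40 and the maximum is 100, the trimming behaved as described. Comparing the read count against the untrimmed file also shows how many reads were discarded entirely.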
#6 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,080
Q-score-wise there is no issue, so the problem must lie elsewhere. It is possible to get high-quality data that does not align at all, so base quality is only one part of the QC. Report back on the BLAST result. Do the GC plots look strange?
#7 |
Super Moderator
Location: Walnut Creek, CA Join Date: Jan 2014
Posts: 2,707
I don't really understand what you mean by "trimming to 40-100bp". But it would not surprise me if your problem were adapter contamination. Do you know what kind of adapters were used? They might not be TruSeq.
#8 |
Senior Member
Location: Germany Join Date: Apr 2012
Posts: 215
What organism are you working with and what is your reference?
I have seen FastQC results like these just recently. The reason was severe rRNA contamination. Maybe mRNA enrichment / ribo-depletion didn't work (or wasn't done)? If the rRNA sequences are not (or are only partially) represented in your reference, the reads derived from them cannot be mapped. Look at the sequence duplication levels: an increase at 10k is one indication of rRNA contamination. If you are working with human samples, the relatively high GC content is another. To verify this, simply use the rRNA sequences as the reference and map your reads to them.
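The two signals mentioned here, heavy duplication and elevated GC content, can also be checked directly on the reads. The sketch below is an illustration under stated assumptions: `gc_fraction` and `duplication_summary` are hypothetical helpers, the counts are exact over whatever sequences are passed in, and this is not how FastQC computes its own plots.

```python
from collections import Counter


def gc_fraction(seq):
    """Fraction of G/C among called bases (A, C, G, T); N and others are ignored."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def duplication_summary(seqs, top=5):
    """Return the `top` most frequent sequences and the share of reads they cover."""
    counts = Counter(seqs)
    total = sum(counts.values())
    most_common = counts.most_common(top)
    share = sum(n for _, n in most_common) / total if total else 0.0
    return most_common, share


# Example with a handful of made-up reads:
reads = ["GGGCCC", "GGGCCC", "GGGCCC", "ATATAT"]
top_seqs, share = duplication_summary(reads, top=1)
mean_gc = sum(gc_fraction(r) for r in reads) / len(reads)
print(top_seqs, share, mean_gc)
```

If a few sequences account for a large share of all reads, running those sequences through BLAST is a quick way to see whether they are rRNA before building an rRNA reference to map against.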
#9 |
Member
Location: North Carolina Join Date: Sep 2011
Posts: 38
Here is the overall FastQC report.
#10 | |
Member
Location: North Carolina Join Date: Sep 2011
Posts: 38
Quote:
#11 |
Member
Location: North Carolina Join Date: Sep 2011
Posts: 38
The library was prepared with the NEBNext® RNA Library Prep Kit for Illumina, so the adaptors should be TruSeq.
#12 |
Senior Member
Location: Germany Join Date: Apr 2012
Posts: 215
From looking at the FastQC output (btw: there is a new, slightly better FastQC version available), I can only say again that it looks very similar to our rRNA "contaminated" samples. More interestingly, we also used the NEB kit...
#13 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,080
@bbm: Were you mapping to the entire genome or just the transcriptome?