SEQanswers

Old 02-06-2014, 02:19 AM   #1
starbug16
Junior Member
 
Location: London

Join Date: Feb 2014
Posts: 8
Over-represented sequences

Hi,

I have one lane (~360 million reads) of Illumina transcriptome data that I am trying to assemble. Because I only have a small server, I am trying to reduce the dataset with digital normalization first. Before doing that, I ran FastQC, which flagged some over-represented sequences. When I BLAST these, they match my organism's 16S, 18S and 28S rRNA genes.

As a bit of a beginner, I was wondering whether this is normal, and whether I need to worry or do anything about it.

Any help or advice very much appreciated!!
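
For readers who want to see what "over-represented" means in practice, here is a minimal Python sketch that counts identical read sequences in a FASTQ file and reports those above a chosen fraction of all reads. It is not FastQC's implementation (which, as far as I understand, works from a subset of reads); the function name `overrepresented`, the `min_fraction` parameter and the toy reads are all made up for illustration. The 0.1% default is meant to mirror FastQC's documented warning level.

```python
from collections import Counter
from io import StringIO


def overrepresented(fastq_handle, min_fraction=0.001):
    """Return (sequence, count, fraction) for sequences above min_fraction.

    Assumes a plain 4-line-per-record FASTQ stream (hypothetical helper,
    not part of FastQC).
    """
    counts = Counter()
    total = 0
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # second line of each record holds the bases
            counts[line.strip()] += 1
            total += 1
    if total == 0:
        return []
    return [
        (seq, n, n / total)
        for seq, n in counts.most_common()
        if n / total > min_fraction
    ]


# Toy input: three reads, two of them identical (invented data).
toy = StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n@r3\nTTTT\n+\nIIII\n")
hits = overrepresented(toy, min_fraction=0.5)
for seq, n, frac in hits:
    print(seq, n, round(frac, 3))
```

With hundreds of millions of reads, counting every distinct sequence in memory is unlikely to fit on a small server, so in practice you would run something like this on a subsample. The sequences it reports can then be BLASTed, as described above.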
Old 02-06-2014, 03:07 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

That's completely normal; don't give it a second thought.
Old 02-06-2014, 03:09 AM   #3
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

It's quite normal, especially if the library was prepared from total RNA without any step to deplete or remove ribosomal RNA.
Old 02-06-2014, 03:32 AM   #4
starbug16
Junior Member
 
Location: London

Join Date: Feb 2014
Posts: 8

Phew, thank you for the responses. Mind much at rest and now onwards to diginorm-ing!
Old 02-06-2014, 04:34 PM   #5
yueluo
Member
 
Location: Guangzhou China

Join Date: Aug 2013
Posts: 82

If you're not interested in ribosomal genes, you might as well remove all those reads.
Old 06-23-2014, 08:59 AM   #6
Marianna85
Member
 
Location: Italy

Join Date: Mar 2012
Posts: 32

Hi everybody,

I actually had the same issue (not really a problem, since it is quite normal), and I'd like to ask for your opinion on something that might also be helpful for starbug16.

I prepared several libraries with different kits and got different percentages of reads mapping to rRNA, ranging from 1% to 30%.
I think the problem arises when you have to compare samples with very different rRNA percentages. For example, if I test for differential expression (DE) between two samples with 15% and 30% rRNA reads respectively, I have the impression that the results are biased by that difference, because rRNA competes for sequencing capacity and reduces the number of mRNA reads. Does that make sense?
In this case, do you think a normalization procedure (such as TMM or DESeq) would minimize the issue?
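
To make the question concrete, here is a small Python sketch of a median-of-ratios size factor calculation, which is the general idea behind DESeq-style normalization. It is a simplified illustration, not the DESeq code: the function name `size_factors` and all counts are invented. The toy data assume two samples of 1,000 reads each with identical mRNA composition, but with 15% and 30% of reads going to a single rRNA feature.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (simplified, hypothetical helper).

    counts: dict mapping sample name -> list of counts, same feature order.
    Features with a zero count in any sample are skipped.
    """
    samples = list(counts)
    n_features = len(counts[samples[0]])
    log_geo_means = []
    for g in range(n_features):
        vals = [counts[s][g] for s in samples]
        if all(v > 0 for v in vals):
            log_geo_means.append(sum(math.log(v) for v in vals) / len(vals))
        else:
            log_geo_means.append(None)
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][g]) - lg
            for g, lg in enumerate(log_geo_means)
            if lg is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Invented counts: three mRNA genes (proportions 0.2 / 0.3 / 0.5) plus rRNA last.
counts = {
    "A": [170, 255, 425, 150],  # 15% rRNA
    "B": [140, 210, 350, 300],  # 30% rRNA
}
sf = size_factors(counts)
normalized = {s: [c / sf[s] for c in counts[s]] for s in counts}
for s in counts:
    print(s, round(sf[s], 3), [round(x, 1) for x in normalized[s][:3]])
```

In this toy case the three mRNA genes end up with the same normalized value in both samples, because the size factor is driven by the median feature rather than by the total read count. That only holds under the usual assumption that most genes are not differentially expressed, and it says nothing about the loss of statistical power when fewer reads land on mRNA.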

Thanks to anyone who shares their opinion!

Marianna
Old 06-23-2014, 06:04 PM   #7
yueluo
Member
 
Location: Guangzhou China

Join Date: Aug 2013
Posts: 82

I usually remove reads that map to rRNA (and/or other sources of contamination), then proceed with mapping and DE analysis.
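
A minimal Python sketch of that removal step, assuming you already have the IDs of the reads that aligned to an rRNA reference (for example, extracted from an alignment you ran separately). The function name `drop_reads` and the toy records are hypothetical, and the code assumes plain 4-line FASTQ records.

```python
from io import StringIO


def drop_reads(fastq_in, fastq_out, ids_to_drop):
    """Copy FASTQ records to fastq_out, skipping reads whose ID is in ids_to_drop.

    Returns (kept, dropped). Hypothetical helper; the ID is taken as the
    first whitespace-delimited token after the leading '@'.
    """
    kept = dropped = 0
    while True:
        record = [fastq_in.readline() for _ in range(4)]
        if not record[0].strip():  # end of input
            break
        read_id = record[0][1:].split()[0]
        if read_id in ids_to_drop:
            dropped += 1
        else:
            fastq_out.writelines(record)
            kept += 1
    return kept, dropped


# Toy input (invented): r2 is assumed to have mapped to rRNA.
reads = StringIO("@r1\nACGT\n+\nIIII\n@r2 lane1\nGGGG\n+\nIIII\n@r3\nTTTT\n+\nIIII\n")
filtered = StringIO()
kept, dropped = drop_reads(reads, filtered, {"r2"})
print(kept, dropped)
```

For paired-end data you would want to drop both mates whenever either one maps to rRNA, so that the two files stay in sync.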
Old 06-23-2014, 11:15 PM   #8
Marianna85
Member
 
Location: Italy

Join Date: Mar 2012
Posts: 32

Hi Yueluo,
so you think it is not a problem if your samples have very different percentages of rRNA?

Marianna
Tags
assembly, fastqc, illumina, overrespresented, transcriptome
