
Old 09-26-2012, 07:30 AM   #1
Location: HK

Join Date: Sep 2012
Posts: 18
How can I remove DNA from RNA dataset by bioinformatics?

Dear all,

I am a beginner who has just started doing research in bioinformatics, and I am now facing a big problem.

My demonstrator BLASTed the whole dataset and told me that my RNA data contain DNA. He would now like me to find a tool to remove the DNA reads.

My research is in metagenomics, using RNA rather than DNA.
As I have no reference with which to filter out DNA that may come from different species, could anyone give me a hand?

Are there any tools that remove DNA from RNA data with higher accuracy than removing the reads whose BLAST results indicate DNA features?
hellsingwyk is offline   Reply With Quote
Old 09-26-2012, 08:43 AM   #2
Devon Ryan
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Perhaps someone else will chime in with a bright idea, but I seriously doubt you can distinguish reads coming from DNA and RNA. Just randomly blasting stuff and saying, "Gee, this read falls in the middle of nowhere in all genomes to which it matches...must just be DNA contamination", seems like a really bad idea. I certainly hope that your instructor/demonstrator/whatever did something vastly more clever than that, but I suspect not.

Next time, just tell whoever is preparing the samples to DNase-treat them.
dpryan is offline   Reply With Quote
Old 09-26-2012, 09:30 AM   #3
Senior Member
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

Perhaps hellsingwyk is looking to retain reads that match rRNA and discard the rest.

If that is so, look into getting the appropriate sequence databases ( or for the search.

Last edited by GenoMax; 09-26-2012 at 09:33 AM.
GenoMax is offline   Reply With Quote
Old 09-26-2012, 05:48 PM   #4
Location: HK

Join Date: Sep 2012
Posts: 18

Thank you very much for answering my question.

However, I am focusing on all kinds of RNA from microbes and viruses in order to search for novel information.

That's why I cannot easily find a reference to filter out the DNA.
hellsingwyk is offline   Reply With Quote
Old 09-26-2012, 05:58 PM   #5
Location: Sydney, Australia

Join Date: Jan 2012
Posts: 61

Overall, if you are looking for all kinds of RNA then you need to trust your library preps and assume that all of the sequences you are seeing are RNAs and not DNAs (and if stuff is mapping to things that have not previously been known to be transcribed, that doesn't mean it's DNA; it means that perhaps there is more transcription in your system than previously characterized). BLASTing to the genome and finding that something hasn't been shown to be transcribed before is not evidence of contamination (although it could be, I suppose; hard to say without knowing more of your experimental setup).

In eukaryotes, a "rough" way of checking for DNA contamination would be to look for evidence of unspliced transcripts and, if their numbers are extremely(!) high, to go back and redo the library preps.
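
As a rough illustration of the unspliced-transcript check, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name `intronic_fraction`, the half-open `(start, end)` coordinates, and the idea that you already have read alignments and an exon annotation for a gene. It only counts reads that fall inside a gene but touch no annotated exon; what counts as "extremely high" is not something the code can decide for you.

```python
def intronic_fraction(reads, gene_span, exons):
    """Fraction of reads inside gene_span that overlap no annotated exon.

    reads, exons: lists of half-open (start, end) intervals on one chromosome.
    gene_span: a single half-open (start, end) interval.
    All names and the coordinate convention are hypothetical.
    """
    gene_start, gene_end = gene_span
    in_gene = 0
    intronic = 0
    for start, end in reads:
        # Ignore reads that are not fully contained in the gene.
        if start < gene_start or end > gene_end:
            continue
        in_gene += 1
        # A read is "intronic" here if it overlaps none of the exons.
        overlaps_exon = any(start < e_end and end > e_start
                            for e_start, e_end in exons)
        if not overlaps_exon:
            intronic += 1
    return intronic / in_gene if in_gene else 0.0


# Toy example with made-up coordinates.
exons = [(0, 100), (500, 600)]
reads = [(10, 60), (200, 250), (550, 600), (700, 750), (1100, 1150)]
fraction = intronic_fraction(reads, (0, 1000), exons)
```

Some intronic signal is expected from pre-mRNA, so a number like this is only useful when compared against other libraries prepared the same way.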
dvanic is offline   Reply With Quote
Old 09-28-2012, 04:41 AM   #6
Senior Member
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,315

I basically agree with dvanic. There is not going to be any reliable way of distinguishing DNA reads from RNA reads when they are mixed together in an RNA library. If this distinction is important to your analysis, then you need to verify that efforts were made during library construction to remove the DNA that will contaminate any RNA prep. In most cases this should include a DNase treatment of the purified RNA.

If the RNA library is strand-specific, you could reassure yourself that the DNA was removed by making sure that strand-specificity is reflected in the data. Not sure what a reasonable ratio of +strand to -strand is in a pure RNA library, but it should be pretty high, overall.
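
The strand check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sense_fraction` is made up, and it assumes you have already paired each read's mapped strand with the strand of the annotated gene it overlaps. Note that some stranded protocols (dUTP-based ones, for example) give reads on the strand opposite to the gene, so a value near 0 or near 1 both point to a stranded library, while a value near 0.5 is what unstranded material such as DNA would produce.

```python
def sense_fraction(pairs):
    """Fraction of reads mapped to the same strand as their gene.

    pairs: iterable of (read_strand, gene_strand), each '+' or '-'.
    The name and input format are hypothetical.
    """
    sense = 0
    antisense = 0
    for read_strand, gene_strand in pairs:
        if read_strand == gene_strand:
            sense += 1
        else:
            antisense += 1
    total = sense + antisense
    return sense / total if total else 0.0


# Toy example: three of four reads agree with the gene strand.
pairs = [('+', '+'), ('+', '+'), ('-', '+'), ('-', '-')]
fraction = sense_fraction(pairs)
```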

pmiguel is offline   Reply With Quote
