SEQanswers

Old 09-26-2012, 07:30 AM   #1
hellsingwyk
Member
 
Location: HK

Join Date: Sep 2012
Posts: 18
How can I remove DNA from RNA dataset by bioinformatics?

Dear all,

I am new to bioinformatics research, and I am now facing a big problem.

My demonstrator BLASTed my whole RNA dataset and told me that it contains DNA. He would now like me to find a tool to remove the DNA reads.

My research is in metagenomics, using RNA rather than DNA.
Since I have no reference with which to filter out DNA that may come from different species, could anyone give me a hand?

Is there any tool that removes DNA from RNA data with higher accuracy than discarding the reads whose BLAST results indicate DNA features?
Old 09-26-2012, 08:43 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Perhaps someone else will chime in with a bright idea, but I seriously doubt you can distinguish reads coming from DNA from reads coming from RNA. Just randomly BLASTing stuff and saying, "Gee, this read falls in the middle of nowhere in every genome it matches... it must just be DNA contamination", seems like a really bad idea. I certainly hope that your instructor/demonstrator/whatever did something vastly more clever than that, but I suspect not.

Next time, just tell whoever is preparing the samples to DNase-treat them.
Old 09-26-2012, 09:30 AM   #3
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

Perhaps hellsingwyk is looking to retain reads that match rRNA and discard the rest.

If so, look into getting the appropriate sequence databases (http://www.arb-silva.de/ or http://rdp.cme.msu.edu/) for the search.
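
As a rough illustration of that kind of filtering, here is a minimal Python sketch. It assumes the reads have already been searched against an rRNA database with BLAST and the hits saved in tabular form (-outfmt 6, where column 1 is the query ID and column 11 is the e-value). The function names, the e-value cutoff, and the example data are all made up for illustration.

```python
def ids_with_hits(blast_tab, max_evalue=1e-5):
    """Collect query IDs with at least one hit at or below max_evalue.

    Expects BLAST tabular output (-outfmt 6): tab-separated columns,
    query ID in column 1 and e-value in column 11.
    """
    keep = set()
    for line in blast_tab.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            keep.add(fields[0])
    return keep


def filter_fasta(fasta_text, keep_ids):
    """Return only the FASTA records whose ID (first word of the header) is in keep_ids."""
    out = []
    writing = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].split()
            writing = bool(header) and header[0] in keep_ids
        if writing:
            out.append(line)
    return "\n".join(out)


# Made-up example: read1 has a strong hit, read3 only a weak one, read2 none.
hits = (
    "read1\tSILVA_x\t98.0\t100\t2\t0\t1\t100\t50\t149\t1e-40\t180\n"
    "read3\tSILVA_y\t80.0\t30\t6\t0\t1\t30\t10\t39\t0.5\t30\n"
)
reads = ">read1\nACGT\n>read2\nGGCC\n>read3\nTTAA\n"

rrna_ids = ids_with_hits(hits)
print(filter_fasta(reads, rrna_ids))
```

The cutoff is a judgment call; a stricter or looser threshold changes which reads count as rRNA matches.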

Last edited by GenoMax; 09-26-2012 at 09:33 AM.
Old 09-26-2012, 05:48 PM   #4
hellsingwyk
Member
 
Location: HK

Join Date: Sep 2012
Posts: 18

Thank you very much for answering my question.

However, I am now looking at all kinds of RNA from microbes and viruses in order to search for novel information.

That's why I cannot easily find a reference to filter out DNA.
Old 09-26-2012, 05:58 PM   #5
dvanic
Member
 
Location: Sydney, Australia

Join Date: Jan 2012
Posts: 61

Overall, if you are looking for all kinds of RNA, then you need to trust your library preps and assume that all of the sequences you are seeing are RNA and not DNA. If reads map to regions that were not previously known to be transcribed, that doesn't mean they are DNA; it may mean that there is more transcription in your system than previously characterized. BLASTing against the genome and finding that something hasn't been shown to be transcribed before is not evidence of contamination (although it could be contamination, I suppose; hard to say without knowing more about your experimental setup).

In eukaryotes, a "rough" way of checking would be to look for evidence of unspliced transcripts and, if their numbers are extremely(!) high, to go back and redo the library preps.
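
One way such a rough check could be sketched in Python is below. It assumes the reads are already aligned and reduced to lists of (start, end) half-open blocks on one transcript's coordinates, with exons given the same way. The function names, the 10-base overlap threshold, and the example coordinates are hypothetical. Some intronic signal is expected from unspliced pre-mRNA, so only a very high fraction would be suspicious.

```python
def introns_from_exons(exons):
    """Given one transcript's exons as (start, end) half-open intervals,
    return the introns as the gaps between consecutive exons."""
    exons = sorted(exons)
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
        if next_start > prev_end
    ]


def overlap(a, b):
    """Number of bases shared by two half-open intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def intronic_fraction(reads, exons, min_overlap=10):
    """Fraction of reads with an aligned block covering at least
    min_overlap intronic bases.

    Each read is a list of aligned (start, end) blocks; a spliced read
    has more than one block.
    """
    if not reads:
        return 0.0
    introns = introns_from_exons(exons)
    intronic = sum(
        1
        for blocks in reads
        if any(overlap(block, intron) >= min_overlap
               for block in blocks for intron in introns)
    )
    return intronic / len(reads)


# Made-up example: two exons separated by one intron at 200-300.
exons = [(100, 200), (300, 400)]
reads = [
    [(120, 170)],              # entirely exonic
    [(180, 200), (300, 330)],  # spliced across the intron
    [(210, 260)],              # entirely intronic
    [(190, 240)],              # runs from the exon into the intron
]
print(intronic_fraction(reads, exons))
```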
Old 09-28-2012, 04:41 AM   #6
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,315

I basically agree with dvanic. There is not going to be any reliable way of distinguishing DNA reads from RNA reads when they are mixed together in an RNA library. If this distinction is important to your analysis, then you need to verify that steps were taken during library construction to remove the DNA that will contaminate any RNA prep. In most cases this should include a DNase treatment of the purified RNA.

If the RNA library is strand-specific, you could reassure yourself that the DNA was removed by checking that the strand-specificity is reflected in the data. I'm not sure what a reasonable ratio of +strand to -strand reads is in a pure RNA library, but it should be pretty high overall.
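
A minimal Python sketch of that strand check, assuming each read has already been assigned to an annotated gene and you know the strand of both. The function name and the example data are hypothetical. Depending on the library protocol, reads may come out antisense to the gene, so a fraction near 0 or near 1 both suggest a strand-specific signal, while something close to 0.5 is what double-stranded DNA would give.

```python
def sense_fraction(assignments):
    """Fraction of reads lying on the same strand as their gene.

    assignments: iterable of (read_strand, gene_strand) pairs,
    each strand given as '+' or '-'.
    """
    total = 0
    sense = 0
    for read_strand, gene_strand in assignments:
        total += 1
        if read_strand == gene_strand:
            sense += 1
    return sense / total if total else 0.0


# Made-up example: three of four reads agree with the gene strand.
assignments = [("+", "+"), ("-", "-"), ("+", "+"), ("-", "+")]
print(sense_fraction(assignments))
```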

--
Phillip