SEQanswers

Old 10-01-2012, 09:04 AM   #1
newBioinfo
Member
 
Location: US

Join Date: Mar 2012
Posts: 36
How to deal with 16S rDNA data from Illumina

Hi Everyone,
I am very new to the field of bioinformatics. I have Illumina sequencing data of 16S rDNA from water. I looked at the quality score distribution and it lies within the 30-35 range, which suggests the data is good to start with. I want to know what the next step should be.
I am thinking of doing this:
1) BLAST the data against a ribosomal database

Can anyone give me some idea of how to start with this data?


Thanks for any help!!!!!
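
For reference, the 30-35 range mentioned above is on the Phred scale, where a score Q corresponds to an error probability of 10^(-Q/10). A minimal Python sketch of that conversion follows. The Phred+33 offset and the example quality string are assumptions for illustration (older Illumina pipelines used an offset of 64), not details from the post.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability, P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_quality(qual_string, offset=33):
    """Decode a FASTQ quality string into a list of Phred scores.

    offset=33 is an assumption (Sanger / Illumina 1.8+ encoding).
    """
    return [ord(ch) - offset for ch in qual_string]


def mean_quality(qual_string, offset=33):
    """Arithmetic mean of the Phred scores in one quality string."""
    scores = decode_quality(qual_string, offset)
    return sum(scores) / len(scores)


example = "IIIIFFFF??"  # hypothetical quality string, not real data
print(decode_quality(example))
print(mean_quality(example))
print(phred_to_error_prob(30))  # Q30 is roughly 1 error in 1000 bases
```

So a distribution sitting between Q30 and Q35 corresponds to roughly one expected error per 1,000 to 3,000 bases.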
Old 10-02-2012, 03:50 AM   #2
RickBioinf
Member
 
Location: Leiden, The Netherlands

Join Date: Sep 2012
Posts: 28

How many reads do you have? The more reads you have, the longer BLAST takes (sometimes even 3 months), so it's probably not a good idea. You should check whether you can make your dataset smaller.
Maybe you should also try a program other than BLAST, since it is slower than a lot of other tools, such as: https://github.com/csmiller/EMIRGE
If it still takes a long time, let me know; I can help you search for some other options too.
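
One possible way to make a dataset smaller, sketched here in Python, is to collapse identical reads and keep a count for each unique sequence, so that each distinct sequence only needs to be searched once. This is an illustrative assumption about what "smaller" could mean, not necessarily the approach intended above, and the toy reads are made up.

```python
from collections import Counter


def dereplicate(sequences):
    """Collapse identical sequences.

    Returns a list of (sequence, count) pairs, most abundant first.
    """
    counts = Counter(sequences)
    return counts.most_common()


# Toy reads for illustration only
reads = ["ACGT", "ACGT", "TTGA", "ACGT", "TTGA", "GGCC"]

for seq, n in dereplicate(reads):
    print(seq, n)
```

With amplicon data such as 16S, many reads tend to be exact duplicates, so this kind of collapsing can reduce the number of queries considerably while the counts preserve abundance information.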

Good luck.
Old 10-03-2012, 01:50 PM   #3
wkrhc4mia
Junior Member
 
Location: MD

Join Date: Oct 2011
Posts: 1

Take a look at QIIME (www.qiime.org, and the overview tutorial there) or mothur (www.mothur.org). Those provide standard pipelines for dealing with 16S sequences. Blasting some of the sequences against a database such as RDP or greengenes is usually part of the pipeline.
Old 10-04-2012, 07:37 AM   #4
newBioinfo
Member
 
Location: US

Join Date: Mar 2012
Posts: 36

Thanks RickBioinf for the help.
I have around 78 million reads, which I have filtered down to 77 million, so my data now contains only reads with no 'N'. I tried BLASTing the data against the non-redundant database, but it was taking too long. I will try what you suggested. So, this db has only ribosomal DNA?
Thanks for the help!!!
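
The kind of 'N' filtering described above can be sketched in Python roughly as follows. This assumes plain four-line FASTQ records (no wrapped sequence lines); the function names and toy reads are hypothetical and not the actual script used.

```python
from itertools import islice


def fastq_records(lines):
    """Yield (header, sequence, plus, quality) tuples from FASTQ lines.

    Assumes each record occupies exactly four lines.
    """
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if len(record) < 4:
            return  # end of input (or a truncated final record)
        yield tuple(record)


def drop_reads_with_n(lines):
    """Keep only records whose sequence contains no ambiguous base 'N'."""
    for header, seq, plus, qual in fastq_records(lines):
        if "N" not in seq.upper():
            yield header, seq, plus, qual


# Toy input for illustration only
toy = [
    "@read1", "ACGTACGT", "+", "IIIIIIII",
    "@read2", "ACGNACGT", "+", "IIII#III",
]

kept = list(drop_reads_with_n(toy))
print(len(kept), "of 2 reads kept")
```

For a real file, the same generator can be fed an open file handle instead of a list, so the reads are streamed rather than loaded into memory.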
Old 10-04-2012, 07:39 AM   #5
newBioinfo
Member
 
Location: US

Join Date: Mar 2012
Posts: 36

Thanks wkrhc4mia,
I am thinking of using mothur. So you mean I do not have to BLAST the data against any database separately, because it will be part of the mothur pipeline?

Thanks for the help!!!
Old 10-04-2012, 09:32 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,060

If you run a single BLAST search, it is going to take a long time. This is where you could break up your initial query file into multiple smaller files and then run the searches in parallel (this would work best if you have access to a compute cluster).

There are parallel implementations of BLAST, such as mpiBLAST (http://www.mpiblast.org/), that can be useful. Installing and using mpiBLAST is not trivial though, just a fair warning.
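
The splitting step described above might look something like this in Python. It is a minimal sketch assuming the reads are in FASTA format; the chunk size, output file names, and toy records are placeholders, and the search jobs themselves are left out because they depend on the local setup.

```python
def fasta_records(lines):
    """Yield (header, sequence) pairs from FASTA lines, joining wrapped sequences."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line, []
        else:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def chunked(records, chunk_size):
    """Group records into lists of at most chunk_size records each."""
    chunk = []
    for rec in records:
        chunk.append(rec)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


# Toy input for illustration only
toy = [">r1", "ACGT", ">r2", "TTGA", "CC", ">r3", "GGCC"]

for i, chunk in enumerate(chunked(fasta_records(toy), 2)):
    # In practice each chunk would be written to its own file,
    # e.g. a hypothetical chunk_0.fasta, and searched as a separate job.
    print(f"chunk_{i}.fasta", [header for header, _ in chunk])
```

Because both functions are generators, only one chunk is held in memory at a time, which matters with tens of millions of reads.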

Old 10-27-2012, 06:19 PM   #7
fanyucai1
Member
 
Location: China

Join Date: Jan 2011
Posts: 11

There is a software package named MEGAN (http://ab.inf.uni-tuebingen.de/software/megan/) that you could use. For the reference database, you could choose SILVA, Greengenes, or RDP.
Old 10-28-2012, 06:43 PM   #8
Polecat
Member
 
Location: Australia

Join Date: Aug 2012
Posts: 11

Be careful if using MEGAN that you don't waste time by running your BLAST searches against the wrong database.

MEGAN works with the NCBI taxonomy, so it expects BLASTN, BLASTX or BLASTP comparisons against NCBI-NT, NCBI-NR or genome-specific databases. MEGAN can also parse files generated by the RDP website or by SILVA, as well as files in SAM format.
All times are GMT -8.