SEQanswers

Old 05-01-2015, 04:43 AM   #1
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182
Guidelines for 16S rRNA data

Hello Members and Seniors,

I don't have any experience with 16S rRNA data.
I know there are tools like QIIME and mothur that do taxonomic assignment and are very versatile, bundling many more utilities. I have both tools installed and working fine.

I'm looking for a few guidelines or steps to move ahead.
I have Illumina data.

Is there something similar to an assembly workflow, such as:
Quote:
- get the fastq files (paired-end, mate-pair, or single-end),
- trim your reads (with tools like Trimmomatic, etc.)
- assemble your reads (depending on your organism: SPAdes for a prokaryote, otherwise find a suitable assembler)
- get the QUAST report and check/verify the results.
Amplicon data is barcoded and then demultiplexed, and this is muddying the water for me.

Can somebody please enlighten me on the initial steps/workflow?
The first few steps are trimming, denoising, and chimera removal, if I'm not wrong.

And where do these tools (QIIME, mothur) come into the picture?

Last edited by bio_informatics; 05-01-2015 at 04:55 AM.
Old 05-01-2015, 05:03 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,053

What kind of analysis have you been asked to do?
Old 05-01-2015, 05:12 AM   #3
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182

GenoMax: Thanks for your prompt reply.

Let's begin with taxonomic assignment first.
PS: I may be reinventing the wheel here many times, but then again, how else would I learn? :P
Old 05-01-2015, 05:25 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,053

Qiime: http://nbviewer.ipython.org/github/b...tutorial.ipynb
Old 05-01-2015, 05:32 AM   #5
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182

Thank you for the URL. I went through it yesterday, but only glanced at it.
I shall go through it more carefully now.

Another question (please bear with me, even if my questions are too naive).

Why is there a demultiplexing step (assuming the data isn't demultiplexed already) when dealing with amplicon data? WGS data comes demultiplexed from the sequencer.

Can't the sequencer demultiplex amplicon data?
Old 05-01-2015, 05:53 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,053

Quote:
Originally Posted by bio_informatics View Post
Why is there a demultiplexing step (assuming the data isn't demultiplexed already) when dealing with amplicon data? WGS data comes demultiplexed from the sequencer.

Can't the sequencer demultiplex amplicon data?
I don't use QIIME regularly, so my explanation will be a bit rough.

QIIME started in the 454 world and expects each read ID header to begin with the sample name (unlike the Illumina read header). The sample information also needs to match the "mapping file" (more info here: http://qiime.org/documentation/file_...ing-your-files).
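
As a rough illustration of that header convention, here is a minimal Python sketch. It assumes the "SampleID_counter" label style that QIIME 1 writes in its demultiplexed FASTA output; the function name qiime_style_label, the sample ID "SampleA" and the example read header are made up for illustration and are not part of QIIME.

```python
def qiime_style_label(sample_id, index, original_header):
    """Build a FASTA header that starts with '<sample_id>_<index>'.

    The original Illumina read ID is kept after a space so the read
    can still be traced back to the sequencer output.
    """
    # Illumina headers begin with '@'; keep only the ID part before the space.
    original_id = original_header.lstrip("@").split()[0]
    return ">{}_{} {}".format(sample_id, index, original_id)


# Hypothetical MiSeq-style header, used only as example input.
illumina_header = "@M00001:1:000000000-ABCDE:1:1101:15000:1500 1:N:0:ACGTACGT"
print(qiime_style_label("SampleA", 0, illumina_header))
```

As far as I know the sample name is taken as everything before the underscore, so the sample ID itself should not contain one.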

QIIME has a tool to generate data in this format from fastq files (http://qiime.org/scripts/split_libraries_fastq.html), but it expects the barcodes in a separate fastq file (and not as part of the read header, as Illumina does it). This is not the default way a MiSeq produces data. There are workarounds (involving MiSeq config file edits) that will produce the data in two separate files (sequences and barcodes).
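
To make the "barcodes in a separate fastq file" idea concrete, here is a small Python sketch that pulls the index sequence out of an Illumina read header and writes it as its own FASTQ record. It assumes the usual "read:filtered:control:index" description field and a single index; the function name barcode_record and the example header are hypothetical. The header carries no quality values for the index read, so the quality line here is only a placeholder, not real data.

```python
def barcode_record(header_line, placeholder_quality="I"):
    """Turn one Illumina FASTQ header into a 4-line barcode FASTQ record."""
    read_id, _, description = header_line.strip().partition(" ")
    # In the 'read:filtered:control:index' description the index is the last field.
    barcode = description.split(":")[-1]
    # ASSUMPTION: no per-base qualities exist for the index in the header,
    # so a constant placeholder quality is written instead of real values.
    quality = placeholder_quality * len(barcode)
    return "\n".join([read_id + " " + description, barcode, "+", quality])


# Hypothetical MiSeq-style header, used only as example input.
header = "@M00001:1:000000000-ABCDE:1:1101:15000:1500 1:N:0:ACGTACGT"
print(barcode_record(header))
```

Applying this to every read, in the same order as the sequence file, would give a barcode file that lines up record for record with the reads. Dual indexes would need extra handling.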

Locally we do not demultiplex the data, so all reads from a MiSeq run go to the "undetermined" file. These files are then processed with a custom script that generates the data in the format QIIME expects. This avoids having to edit MiSeq config files or use the QIIME-supplied demultiplexing tool. You do need to make sure that the data is trimmed and adapters are removed.
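
This is not our actual script, just a sketch of the general idea in Python under some assumptions: the index sequence is the last colon-separated field of each read header, barcodes are matched exactly (no mismatches allowed), and the barcode_to_sample dictionary stands in for the barcode column of the mapping file. All names and example reads are hypothetical, and no trimming or adapter removal is done here.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:
            return
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().strip()
        yield header, sequence, quality


def demultiplex(handle, barcode_to_sample):
    """Return FASTA lines labelled '<sample>_<n>' for reads with a known barcode."""
    counts = {}
    fasta_lines = []
    for header, sequence, _quality in read_fastq(handle):
        barcode = header.split(":")[-1]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            continue  # barcode not in the mapping, so the read is skipped
        n = counts.get(sample, 0)
        counts[sample] = n + 1
        read_id = header.lstrip("@").split()[0]
        fasta_lines.append(">{}_{} {}".format(sample, n, read_id))
        fasta_lines.append(sequence)
    return fasta_lines


# Two made-up reads standing in for an "undetermined" FASTQ file.
undetermined = io.StringIO(
    "@M00001:1:000000000-ABCDE:1:1101:15000:1500 1:N:0:ACGTACGT\n"
    "TACGGAGGGTGCAAGCGTTA\n+\nIIIIIIIIIIIIIIIIIIII\n"
    "@M00001:1:000000000-ABCDE:1:1101:15001:1501 1:N:0:TTTTTTTT\n"
    "TACGTAGGGTGCGAGCGTTA\n+\nIIIIIIIIIIIIIIIIIIII\n"
)
mapping = {"ACGTACGT": "SampleA"}
for line in demultiplex(undetermined, mapping):
    print(line)
```

Only the first read is written out, because the second barcode is not in the mapping. The per-sample counter is just one way to keep the labels unique.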

Last edited by GenoMax; 05-01-2015 at 05:59 AM.
Old 05-01-2015, 06:05 AM   #7
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182

Many thanks for your detailed reply.

Quote:
Originally Posted by GenoMax View Post
I don't do Qiime regularly so my explanation will be a bit rough.
Do you use mothur instead?

Thanks very much for pointing me to the right URLs. I could have gone through them a number of times and still not have figured out their purpose, or how and when to use them.

Quote:
Originally Posted by GenoMax View Post
Locally we leave the data non-multiplexed so the reads all go to the "undetermined" file.
Why is that? This is the answer I am desperately looking for.
Why is the data left non-demultiplexed? Does this have something to do with cost effectiveness?

Thanks again for your time, and replies.

Last edited by bio_informatics; 05-01-2015 at 06:10 AM.
Old 05-01-2015, 06:30 AM   #8
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,053

Quote:
Originally Posted by bio_informatics View Post
Why is that? This is the answer I am desperately looking for.
Why is the data left non-demultiplexed? Does this have something to do with cost effectiveness?

Thanks again for your time, and replies.
Manipulating data files is much easier than having to edit MiSeq config files every time you want to run 16S data for QIIME (most people won't have access to the MiSeq to do this anyway). You also deal with a single "undetermined" pool file instead of multiple sample files.

That said, if you get demultiplexed files, adjust your processing accordingly to get them into QIIME format.
Old 05-01-2015, 06:40 AM   #9
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182

Quote:
Originally Posted by GenoMax View Post
Manipulating data files is much easier than having to edit MiSeq config files every time you want to run 16S data for QIIME (most people won't have access to the MiSeq to do this anyway). You also deal with a single "undetermined" pool file instead of multiple sample files.
Ahan.
That makes a lot of sense, and clears up the whole fog around demultiplexing, barcodes, blah, blah.

I shall now play around with the data and tools.

Thanks very much for your patience and extensive help.
Merci!

Last edited by bio_informatics; 05-01-2015 at 06:42 AM.

Tags
16srna, mothur, qiime

All times are GMT -8.

