  • zhaopeihua
    Member
    • Aug 2013
    • 18

    How to use QIIME to process MiSeq 16S data?

    Hi:

    I'm totally new to this area and I've got some MiSeq 16S data. Is there a tutorial like the 454 Overview Tutorial (http://qiime.org/tutorials/tutorial.html) on the QIIME website?

    Thanks in advance.
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2


    DNA sequencing continues to decrease in cost with the Illumina HiSeq2000 generating up to 600 Gb of paired-end 100 base reads in a ten-day run. Here we present a protocol for community amplicon sequencing on the HiSeq2000 and MiSeq Illumina ...


    • zhaopeihua
      Member
      • Aug 2013
      • 18

      #3
      Thanks for your reply.

      I have checked this tutorial. It is for single-end reads, but I want to process paired-end data.
      Do you know of another tutorial or approach for this? Thanks again.


      • fanyucai1
        Member
        • Jan 2011
        • 11

        #4
        QIIME cannot handle paired-end data directly, but if you used the MiSeq platform, the two reads of each pair may overlap. In that case you can assemble (merge) the read pairs first and then use the assembled reads in QIIME.



        • csquared
          Member
          • May 2008
          • 67

          #5
          Very easy to use PANDASeq to assemble the paired reads and then use the assembled file in QIIME.

          If you can use MacQIIME, I have a simple python script and some BASH scripts that make processing hundreds of samples very simple using default settings. Perfect for getting to know the software tools and then tweaking as you get more familiar with the tools and settings.
          HudsonAlpha Institute for Biotechnology
          http://www.hudsonalpha.org/gsl


          • Vesperholly
            Junior Member
            • Jul 2013
            • 1

            #6
            Hi there,
            We are looking to combine several runs of 16S MiSeq data into one large data set for analysis, 400 to 500 samples all told. If you have any ideas about streamlining QIIME analysis for a set this size, or if there are any specific trouble spots we should keep an eye out for, I'd love to hear about them!


            • rhinoceros
              Senior Member
              • Apr 2013
              • 372

              #7
              I've made a pipeline for just that, but it's work related and I can't post it. Anyway, it's a rather simple bash script that anyone could write. Everything starts from a map file, from which the relevant information is parsed and passed on to mothur for denoising. Then comes some header editing so that every sequence is guaranteed a unique ID (which is nice if you want to combine samples later on). Then it's back to QIIME for open-reference OTU picking. Nothing complicated. Almost everything QIIME related works just fine with default settings. However, in my experience more memory should be allocated to the RDP classifier, or you risk it hanging.

              Edit: My pipeline is for 454 data. Anyway, I'm going to adapt it for MiSeq data someday soon. I don't think much needs to be changed, just the preprocessing steps.
              Last edited by rhinoceros; 01-20-2014, 01:15 PM.
              savetherhino.org


              • bstamps
                Member
                • Oct 2012
                • 40

                #8
                You know there is a pre-existing way to combine multiple runs (454, Illumina, whatever) within QIIME, right? Just add -n followed by an integer to split_libraries.py or split_libraries_fastq.py, where the integer exceeds the number of reads in the libraries already processed. For example, for three Illumina runs with overlapping barcodes and 10 million reads per library:

                split_libraries_fastq.py -i run1_reads.fastq -b run1_barcodes.fastq -m run1mapping.txt -o Split1Out/

                split_libraries_fastq.py -i run2_reads.fastq -b run2_barcodes.fastq -m run2mapping.txt -n 10000001 -o Split2Out/

                split_libraries_fastq.py -i run3_reads.fastq -b run3_barcodes.fastq -m run3mapping.txt -n 20000002 -o Split3Out/

                Note that -n increases with each library you add, to an arbitrary number that is larger than the total possible number of reads in all previous libraries combined. It sets the starting number for the sequences in each output seqs.fna, so that no read names overlap between the files.
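The offset arithmetic above can be sketched in shell. This is only a dry run under stated assumptions: the helper name start_ids is hypothetical, 10 million is the assumed upper bound on reads per run taken from the example, and the file names follow the run1/run2/run3 pattern used above. It prints the commands instead of running them, and passing -n 0 for the first run is assumed to match the script's default start ID.

```shell
# Hypothetical helper: print "runN <start ID>" for each run.
# $1 = assumed maximum number of reads in any single run
# $2 = number of runs
start_ids() {
    offset=0
    i=1
    while [ "$i" -le "$2" ]; do
        echo "run$i $offset"
        # Step past the largest ID the previous run could have used.
        offset=$((offset + $1 + 1))
        i=$((i + 1))
    done
}

# Dry run: print the commands rather than executing them.
start_ids 10000000 3 | while read -r run offset; do
    echo "split_libraries_fastq.py -i ${run}_reads.fastq -b ${run}_barcodes.fastq" \
         "-m ${run}mapping.txt -n ${offset} -o Split${run#run}Out/"
done
```

With these inputs the start IDs come out as 0, 10000001 and 20000002, the same values used in the three commands above.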

                Afterwards you would take the seqs.fna output from each SplitOut folder and concatenate them into a single seqs.fna:

                cat Split1Out/seqs.fna Split2Out/seqs.fna Split3Out/seqs.fna > seqs.fna
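Before moving on, it may be worth confirming that the concatenated file really has no repeated sequence IDs. The following is a minimal sketch, assuming the ID is the first space-separated word of each FASTA header line. The helper name dup_ids and the demo records are made up for illustration.

```shell
# Hypothetical helper: print every sequence ID (first word of a FASTA
# header line) that occurs more than once in the given file.
dup_ids() {
    grep '^>' "$1" | cut -d ' ' -f 1 | sort | uniq -d
}

# Tiny made-up demo file; the IDs and reads are illustrative only.
demo=$(mktemp)
printf '>S1_0 readA\nACGT\n>S2_1 readB\nACGT\n>S1_0 readC\nACGT\n' > "$demo"
dup_ids "$demo"
```

On the real data you would call dup_ids seqs.fna. No output means every ID is unique; any line printed points to a -n offset that was too small.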

                After this, make a new mapping file with all your samples in it.
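One possible way to build that combined mapping file is sketched below. It assumes all per-run mapping files are tab-delimited with identical columns, that the first line of each is the #SampleID header, and that sample IDs are already unique across runs. The helper name merge_maps and the demo contents (sample IDs, barcodes, primer) are hypothetical.

```shell
# Hypothetical helper: merge mapping files that share identical columns.
# Keeps the header line of the first file and drops every other '#' line.
merge_maps() {
    awk 'NR == 1 { print; next } /^#/ { next } { print }' "$@"
}

# Made-up two-run demo; sample IDs and barcodes are illustrative only.
tmp=$(mktemp -d)
printf '#SampleID\tBarcodeSequence\tLinkerPrimerSequence\tDescription\nR1.A\tACGT\tGTGCCAGC\trun1\n' > "$tmp/run1mapping.txt"
printf '#SampleID\tBarcodeSequence\tLinkerPrimerSequence\tDescription\nR2.A\tACGT\tGTGCCAGC\trun2\n' > "$tmp/run2mapping.txt"
merge_maps "$tmp/run1mapping.txt" "$tmp/run2mapping.txt" > "$tmp/combined_mapping.txt"
cat "$tmp/combined_mapping.txt"
```

The demo deliberately reuses one barcode in both runs, which mirrors the overlapping-barcode situation described above.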

                Then run the remainder of your QIIME workflow as usual. You'll get errors about duplicate barcodes, but once split_libraries has been run these are irrelevant.

                See http://qiime.org/scripts/split_libraries_fastq.html for more guidance for this command, or here http://qiime.org/tutorials/denoising_454_data.html and here http://qiime.org/scripts/split_libraries.html if you have conventional or 454 libraries.



                • Brajbio
                  Member
                  • Jun 2010
                  • 20

                  #9
                  Check out this tutorial for Pre-processing Paired end Illumina data for QIIME
