  • allo
    Member
    • Jul 2009
    • 15

    How to Demultiplex a Nextera paired-end MiSeq run

    Has anybody been able to successfully demultiplex a Nextera paired-end MiSeq run?
    The current MiSeq Reporter cannot demultiplex and produce individual fastq.gz files for each dual-indexed sample.
    So I thought I'd give CASAVA a try, but I keep getting error after error. I faked the sample sheet to look like the examples in the CASAVA UG, but now I get a “DemultiplexedBustardConfig.xml” error.
    Anybody there with some advice for a frustrated Biologist?
    Thank You,
    Alfredo Lopez
  • kentk
    Member
    • Dec 2011
    • 17

    #2
    Yes, I've done it with Python. Basically, the I1, I2, R1 and R2 fastq.gz files correspond to each other positionally, line by line. That is, the first record in I1 comes from the same cluster as the first record in I2, in R1 and in R2.

    What I did was read the fastq file record by record, parse each header with a regexp, and write each read into its own demultiplexed fastq. The last part of the read header looks something like this:

    1:N:0:1
    The first number is the read (or index read) number, 1 or 2.
    The last number is the sample classification, following the order you provided in your run sample sheet.
    In this case it means that this is read 1 and it comes from sample 1.

    Sometimes you get something like this:
    1:N:0:0
    This means that CASAVA was not able to classify the read because the raw sequence of one of the indexes is too ambiguous, degenerate, or full of useless N's to be binned.

    So if you read each header of the raw multiplexed fastq, you can classify each read and write it into separate files.
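A minimal sketch of that header-based approach, assuming the header ends in a field like 1:N:0:1 where the last number is the sample number (0 for unclassified reads). The function names, the regexp and the output file naming are illustrative assumptions, not anything produced by CASAVA or MiSeq Reporter.

```python
import gzip
import re
from itertools import islice

# Trailing header field "read:filtered:control:sample", e.g. "1:N:0:1".
HEADER_TAIL = re.compile(r"(\d+):([YN]):(\d+):(\d+)$")


def sample_number(header):
    """Return the sample number at the end of a FASTQ header, or None."""
    match = HEADER_TAIL.search(header.strip())
    return int(match.group(4)) if match else None


def demultiplex(fastq_gz, out_prefix):
    """Split one fastq.gz into one fastq.gz per sample number.

    Returns a dict mapping sample number to the number of reads written.
    Sample 0 collects the reads that could not be classified.
    """
    outputs = {}
    counts = {}
    try:
        with gzip.open(fastq_gz, "rt") as handle:
            while True:
                # A FASTQ record is 4 lines: header, sequence, '+', qualities.
                record = list(islice(handle, 4))
                if len(record) < 4:
                    break
                sample = sample_number(record[0])
                if sample is None:
                    raise ValueError(f"unrecognised header: {record[0]!r}")
                if sample not in outputs:
                    path = f"{out_prefix}_sample{sample}.fastq.gz"
                    outputs[sample] = gzip.open(path, "wt")
                outputs[sample].writelines(record)
                counts[sample] = counts.get(sample, 0) + 1
    finally:
        for out in outputs.values():
            out.close()
    return counts
```

Calling demultiplex once on the R1 file and once on the R2 file keeps the pairs in the same order, since each output preserves the input order of its reads.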
    Hope this helps.


    • zherbert
      Junior Member
      • Dec 2009
      • 4

      #3
      Another low-tech way to demultiplex is to point each indexed sample to a different Genome Folder on the MiSeq sample sheet and run MiSeq Reporter (MSR). This will trick MSR into demultiplexing for you.


      • allo
        Member
        • Jul 2009
        • 15

        #4
        Solved!

        Hi kentk and zherbert,
        Thank you very much for your replies. After a few emails with Illumina's customer support I got CASAVA to run. It turns out that CASAVA is very picky about the project and sample names on the sample sheet, and you cannot have any of these characters: ? ( ) [ ] / \ = +. < > : ; " ' , * ^ | &
        So, once I had a bona fide CASAVA-style sample sheet, the program produced the expected demultiplexed individual fastq files.
        >FCID,Lane,Sample_ID,SampleRef,Index,Description,Control,Recipe,Operator,SampleProject
        The flow cell ID can be obtained from the SAV. It is the number on top of every SAV graph and has the following format: 000000000-A0???
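A hedged sketch of a pre-flight check for such a sample sheet: it writes the header row quoted above and rejects sample or project names containing any of the listed characters. The function names are made up for illustration, the period is treated as forbidden on the assumption that the "+." in the list means two separate characters, and any values passed in are placeholders.

```python
import csv
import io

# Column names exactly as quoted in the post above.
HEADER = ["FCID", "Lane", "Sample_ID", "SampleRef", "Index", "Description",
          "Control", "Recipe", "Operator", "SampleProject"]

# Characters listed in the post; "." is included as an assumption.
FORBIDDEN = set('?()[]/\\=+.<>:;"\',*^|&')


def bad_characters(name):
    """Return the sorted forbidden characters found in a name."""
    return sorted(set(name) & FORBIDDEN)


def build_sample_sheet(rows):
    """Validate names, then return the sample sheet as CSV text.

    rows is a list of dicts keyed by HEADER names; missing keys become
    empty fields.
    """
    for row in rows:
        for field in ("Sample_ID", "SampleProject"):
            bad = bad_characters(row.get(field, ""))
            if bad:
                raise ValueError(
                    f"{field} {row[field]!r} contains forbidden characters: {bad}"
                )
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HEADER, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Running the check before launching CASAVA surfaces a bad name immediately, rather than through an unrelated-looking configuration error later in the run.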
        The advantage of using CASAVA is that I can control the mismatch policy. So I can require perfect index matches, or allow a single mismatch in either or both indexes.
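To illustrate what such a mismatch policy means (this is not CASAVA's own implementation, only a sketch), the function below assigns a dual-indexed read to a sample when both observed indexes are within a chosen number of mismatches of the expected ones. The function names and the shape of the samples mapping are assumptions.

```python
def mismatches(observed, expected):
    """Count positions where two equal-length index sequences differ."""
    if len(observed) != len(expected):
        raise ValueError("index sequences must have the same length")
    return sum(a != b for a, b in zip(observed, expected))


def assign_sample(index1, index2, samples, max_mismatches=1):
    """Return the sample whose index pair matches, or None.

    samples maps a sample name to its (index1, index2) pair. A read is
    assigned only when exactly one sample is within max_mismatches on
    each index; ambiguous or unmatched reads return None.
    """
    hits = [
        name
        for name, (expected1, expected2) in samples.items()
        if mismatches(index1, expected1) <= max_mismatches
        and mismatches(index2, expected2) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None
```

With max_mismatches=0 only perfect index pairs are accepted; with max_mismatches=1 a single error is tolerated in either index or in both.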
        Thanks!


        • kentk
          Member
          • Dec 2011
          • 17

          #5
          Originally posted by zherbert
          Another low-tech way to demultiplex is to point each indexed sample to a different Genome Folder on the Miseq sample sheet and run MiSeq Reporter. This will trick MSR into demultiplexing for you.
          What do you mean by "point each indexed sample to a different folder"? Do you do this in IEM when creating a sample sheet before the run?


          • zherbert
            Junior Member
            • Dec 2009
            • 4

            #6
            Originally posted by kentk
            What do you mean by "point each indexed sample to a different folder"? Do you do this in IEM when creating a sample sheet before the run?
            Yes, you can do this in IEM, but you can also edit the sample sheet later and rerun MSR. One way to set this up is by creating a few subdirectories in the Genomes location. I tested this by putting 12 copies of the phiX genome folder, in subdirectories named 1-12, within a Demultiplex directory:

            Path/To/Genomes/Demultiplex/1
            Path/To/Genomes/Demultiplex/2
            Path/To/Genomes/Demultiplex/3

            I would only recommend doing this for low-plexity runs. The output gets a bit messy (e.g. 8 very similarly named fastq files for each sample, all output to the same location, in a dual-index run), but it works well enough for small numbers of pooled samples.
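The directory layout above could be set up with a short script like the one below. This is only a sketch: the function name is made up, the paths in any call are placeholders, and it assumes the existing phiX genome folder already contains whatever MSR needs.

```python
import shutil
from pathlib import Path


def make_genome_copies(reference_dir, genomes_root, n_samples, group="Demultiplex"):
    """Copy one genome folder into numbered subdirectories 1..n_samples.

    Creates genomes_root/group/1, genomes_root/group/2, and so on, and
    returns the list of created paths. Raises an error if a target
    directory already exists, so nothing is overwritten.
    """
    targets = []
    for number in range(1, n_samples + 1):
        target = Path(genomes_root) / group / str(number)
        # copytree creates the missing parent directories itself.
        shutil.copytree(reference_dir, target)
        targets.append(target)
    return targets
```

Each sample's row on the sample sheet would then point at its own numbered folder.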

            Hope this helps.


            • kentk
              Member
              • Dec 2011
              • 17

              #7
              Thanks, zherbert. I'll try this out. It's still a hack, though. I wish the MiSeq would just store those sequences instead.

