
  • How to Demultiplex a Nextera paired-end MiSeq run

    Has anybody been able to successfully demultiplex a Nextera paired-end MiSeq run?
    The current MiSeq Reporter cannot demultiplex and produce individual fastq.gz files for each dual-indexed sample.
    So I thought I'd give CASAVA a try, but I keep getting error after error. I faked the sample sheet to look like the examples in the CASAVA user guide, but now I get a “DemultiplexedBustardConfig.xml” error.
    Does anybody have some advice for a frustrated biologist?
    Thank You,
    Alfredo Lopez

  • #2
    Yes, I've done it with Python. Basically, the I1, I2, R1 and R2 fastq.gz files are related to each other positionally, line by line. That is, the first line of I1 corresponds to the same cluster as the first line of I2, of R1 and of R2.
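To illustrate the positional relationship, here is a minimal sketch that walks the four files in lockstep, one cluster at a time. The function names `fastq_records` and `clusters` are made up for this example, and it assumes plain 4-line FASTQ records in gzipped files. If the files have different lengths, iteration stops at the shortest one.

```python
import gzip
from itertools import islice


def fastq_records(handle):
    """Yield 4-line FASTQ records (header, sequence, '+', quality) from a text handle."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:
            return
        yield record


def clusters(i1_path, i2_path, r1_path, r2_path):
    """Yield (I1, I2, R1, R2) record tuples; the n-th tuple belongs to the n-th cluster."""
    handles = [gzip.open(p, "rt") for p in (i1_path, i2_path, r1_path, r2_path)]
    try:
        # zip pairs up the n-th record of every file.
        yield from zip(*(fastq_records(h) for h in handles))
    finally:
        for handle in handles:
            handle.close()
```

Each tuple gives you both index reads next to both sequence reads, so the index sequences can be inspected before deciding where the R1 and R2 records should go.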

    What I did was parse the fastq file read by read with a regexp and then write each read into the demultiplexed fastq for its sample. You can read the last part of the read header, which looks something like this:

    1:N:0:1
    The first number is the read/index number (1 or 2).
    The last number is the sample number, following the order of the samples in your run sample sheet.
    In this case it means that this is read 1 and that it comes from sample 1.

    Sometimes you get something like this:
    1:N:0:0
    This means that CASAVA was not able to classify it because the raw read of one of the indexes is too vague, degenerate, or full of useless N's to be binned.

    So if you read each header of the raw multiplexed fastq, you can classify each read and write it into separate files.
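A minimal sketch of that approach, not the exact script I used: it assumes the header ends in a field like `1:N:0:1` whose last number is the sample number, and it writes reads without that field to the sample 0 (unclassified) file. The function names and the output naming scheme are made up for this example.

```python
import gzip
import re
from itertools import islice

# Trailing header field "read:filtered:control:sample", e.g. " 1:N:0:1".
HEADER_TAIL = re.compile(r"\s(\d):([YN]):(\d+):(\d+)\s*$")


def sample_number(header):
    """Return the sample number from a FASTQ header line, or 0 if it is missing."""
    match = HEADER_TAIL.search(header)
    return int(match.group(4)) if match else 0


def demultiplex(fastq_gz, out_prefix):
    """Split one gzipped FASTQ into one gzipped file per sample number.

    Returns a dict mapping sample number to the number of reads written.
    """
    counts = {}
    outputs = {}
    try:
        with gzip.open(fastq_gz, "rt") as src:
            while True:
                record = list(islice(src, 4))  # one FASTQ record = 4 lines
                if len(record) < 4:
                    break
                sample = sample_number(record[0])
                if sample not in outputs:
                    name = f"{out_prefix}_sample{sample}.fastq.gz"
                    outputs[sample] = gzip.open(name, "wt")
                outputs[sample].writelines(record)
                counts[sample] = counts.get(sample, 0) + 1
    finally:
        for handle in outputs.values():
            handle.close()
    return counts
```

Run it once on R1 and once on R2 with different prefixes. Because reads are written in their original order, the per-sample R1 and R2 files stay in sync.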
    Hope this helps.



    • #3
      Another low-tech way to demultiplex is to point each indexed sample to a different Genome Folder on the MiSeq sample sheet and run MiSeq Reporter. This will trick MSR into demultiplexing for you.



      • #4
        Solved!

        Hi kentk and zherbert,
        Thank you very much for your replies. After a few emails with Illumina's customer support I got CASAVA to run. It turns out that CASAVA is very picky about the project and sample names on the sample sheet, and you cannot have any of these characters: ? ( ) [ ] / \ = +. < > : ; " ' , * ^ | &
        So, once I had a bona fide CASAVA-style sample sheet, the program produced the expected demultiplexed individual fastq files.
        >FCID,Lane,Sample_ID,SampleRef,Index,Description,Control,Recipe,Operator,SampleProject
        The flow cell ID can be obtained from the SAV. It is the number on top of every SAV graph and has the following format: 000000000-A0???
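For anyone who wants to check a sheet before running CASAVA, here is a minimal sketch that flags the characters listed above in the Sample_ID and SampleProject columns. It is not part of CASAVA. It assumes the period after the + in my list is itself one of the forbidden characters, and the function names are made up for this example.

```python
import csv
import io

# Characters from the list above; the "." is assumed to be part of the list.
FORBIDDEN = set('?()[]/\\=+.<>:;"\',*^|&')


def bad_characters(name):
    """Return the forbidden characters found in a name, sorted."""
    return sorted(set(name) & FORBIDDEN)


def check_sample_sheet(text):
    """Return (row_number, column, bad_chars) for each offending name.

    `text` is the CSV content of a CASAVA-style sample sheet whose first
    row is the header; row numbers count the header as row 1.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=2):
        for column in ("Sample_ID", "SampleProject"):
            bad = bad_characters(row.get(column) or "")
            if bad:
                problems.append((row_number, column, bad))
    return problems
```

An empty list means no forbidden characters were found in those two columns. It does not check anything else about the sheet.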
        The advantage of using CASAVA is that I can control the mismatch policy. So I can require perfect index matches, or allow a single mismatch in either or both indexes.
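To show what a mismatch policy means in practice, here is a minimal sketch of dual-index matching with a per-index mismatch limit. This is an illustration only, not how CASAVA implements it. The function names and the sample dictionary layout are made up for this example.

```python
def hamming(a, b):
    """Count the positions at which two equal-length index sequences differ."""
    if len(a) != len(b):
        raise ValueError("index sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index1, index2, samples, max_mismatches=0):
    """Return the sample whose two indexes both match within max_mismatches.

    `samples` maps a sample name to its (index1, index2) pair. Returns None
    when no sample matches or when more than one does.
    """
    hits = [
        name
        for name, (expected1, expected2) in samples.items()
        if hamming(index1, expected1) <= max_mismatches
        and hamming(index2, expected2) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None
```

With `max_mismatches=0` only perfect index pairs are assigned. With `max_mismatches=1` a read is still assigned when either index, or both, is off by one base.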
        Thanks!



        • #5
          Originally posted by zherbert
          Another low-tech way to demultiplex is to point each indexed sample to a different Genome Folder on the MiSeq sample sheet and run MiSeq Reporter. This will trick MSR into demultiplexing for you.
          What do you mean by "point each indexed sample to a different folder"? Do you do this in IEM when creating a sample sheet before the run?



          • #6
            Originally posted by kentk
            What do you mean by "point each indexed sample to a different folder"? Do you do this in IEM when creating a sample sheet before the run?
            Yes, you can do this in IEM, but you can also edit the sample sheet later and rerun MSR. One way to set this up is to create a few subdirectories in the Genomes location. I tested this by putting 12 copies of phiX in subdirectories named 1-12 within a Demultiplex directory:

            Path/To/Genomes/Demultiplex/1
            Path/To/Genomes/Demultiplex/2
            Path/To/Genomes/Demultiplex/3

            I would only recommend doing this for low-plexity runs. The output gets a bit messy (i.e. 8 very similarly named fastq files for each sample, all output to the same location in a dual-index run), but it works well enough for small numbers of pooled samples.
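If you would rather script the folder setup than copy by hand, here is a minimal sketch. The function name and both path arguments are placeholders for this example: `phix_dir` stands for wherever your phiX genome folder lives, and `demultiplex_dir` for the Demultiplex directory under your Genomes location.

```python
import shutil
from pathlib import Path


def make_genome_copies(phix_dir, demultiplex_dir, n_samples):
    """Copy the phiX folder into demultiplex_dir/1 .. demultiplex_dir/n_samples.

    Existing numbered folders are left alone. Returns the list of target paths.
    """
    targets = []
    for number in range(1, n_samples + 1):
        target = Path(demultiplex_dir) / str(number)
        if not target.exists():
            # copytree also creates any missing parent directories.
            shutil.copytree(phix_dir, target)
        targets.append(target)
    return targets
```

Calling it with `n_samples=12` reproduces the layout above. Each sample on the sample sheet then gets its own numbered folder as its Genome Folder.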

            Hope this helps.



            • #7
              Thanks zherbert. I'll try this out. Still a hack though. I wish MiSeq would just store those sequences instead.
