  • Etherella
    Member
    • Aug 2012
    • 20

    CLC Genomics Workbench and Illumina demultiplexing

    Hi there!
    After a MiSeq Nextera XT run we got a lot of undetermined data (undetermined barcode sequences with one or more mismatches). We would rather not throw away so much data, and are looking for a way to demultiplex sequences with one or more mismatches in the barcode.
    Does CLC Genomics Workbench have this function? There is an option to process tagged sequences, but can mismatched barcodes be processed?

    Thank you for any answers!
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    I am not sure CLC can help, since MiSeq Reporter apparently does not add the tags to the "undetermined" reads file it produces. I am going by the info provided by dsobral in a recent thread that is in the list below.

    In cases such as this you will need to demultiplex the MiSeq data using the "bcl2fastq" software that is available here: http://support.illumina.com/download...tware_184.ilmn. If you are not comfortable using command-line tools, you will need to find someone who is reasonably proficient with Linux and has access to a Linux server.

    You will need:

    1. Full data folder from your MiSeq run
    2. Working install of bcl2fastq (in addition to the illumina link above look at this thread http://seqanswers.com/forums/showthread.php?t=34844) You can allow up to 2 mismatches per tag read.
    3. An example of the SampleSheet.csv you will need to create to run bcl2fastq is in post #14 in this thread.
    Bridged amplification & clustering followed by sequencing by synthesis. (Genome Analyzer / HiSeq / MiSeq)


    NOTE: If this run was over-clustered (density > 1300-1400 clusters/mm^2 for v.3 reagents) then chances of recovering useful data are slim.


    • JackieBadger
      Senior Member
      • Mar 2009
      • 385

      #3
      Are we talking about an index or an inline barcode?
      For indices, use CASAVA by Illumina.
      For inline barcodes, use jMHC.


      • sklages
        Senior Member
        • May 2008
        • 628

        #4
        Originally posted by GenoMax View Post
        2. Working install of bcl2fastq (in addition to the illumina link above look at this thread http://seqanswers.com/forums/showthread.php?t=34844) You can allow up to 2 mismatches per tag read.
        bcl2fastq, just like CASAVA before it, allows either zero or one mismatch in index recognition.


        • sklages
          Senior Member
          • May 2008
          • 628

          #5
          Originally posted by Etherella View Post
          Hi there!
          After a MiSeq Nextera XT run we got a lot of undetermined data (undetermined barcode sequences with one or more mismatches). We would rather not throw away so much data, and are looking for a way to demultiplex sequences with one or more mismatches in the barcode.
          Does CLC Genomics Workbench have this function? There is an option to process tagged sequences, but can mismatched barcodes be processed?

          Thank you for any answers!
          As GenoMax has already pointed out, it is possible to get the "undetermined indices" when demultiplexing with CASAVA/bcl2fastq (no idea why Illumina does not write the index sequences in the header of the MiSeq undetermined files).

          But maybe it is enough to just ask your sequence provider to rerun the demultiplexing with one mismatch allowed?


          • luc
            Senior Member
            • Dec 2010
            • 469

            #6
            To my knowledge, CLC does demultiplexing only for inline barcodes, not for barcodes in separate barcode reads. CLC assumes that such demultiplexing is being done by the Illumina system software. It is relatively easy to demultiplex with scripts that tolerate one mismatch (examples are already mentioned) or more (there are certainly better options, but we have a quick and dirty script if desired).
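For readers wondering what such a mismatch-tolerant script boils down to, here is a minimal Python sketch of the core step: comparing an index read against a list of known barcodes by Hamming distance. It is not luc's script; the function names (`hamming`, `assign_sample`) and the example barcodes are made up for illustration, and it assumes the index sequence is already available for each read.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, barcodes, max_mismatches=1):
    """Return the sample with the closest barcode, or None.

    None is returned when no barcode is within max_mismatches, or when
    two barcodes are equally close (the read is ambiguous).
    """
    best, best_dist, tie = None, max_mismatches + 1, False
    for sample, bc in barcodes.items():
        if len(index_read) < len(bc):
            continue  # index read too short to compare
        d = hamming(index_read[:len(bc)], bc)
        if d < best_dist:
            best, best_dist, tie = sample, d, False
        elif d == best_dist and d <= max_mismatches:
            tie = True
    return None if tie else best


# Hypothetical barcodes, for illustration only
barcodes = {"sample1": "ACGTAC", "sample2": "TTGGCA"}
print(assign_sample("ACGTAT", barcodes))  # one mismatch to sample1
```

Allowing more mismatches is only safe if the barcodes in the run differ from each other at enough positions; otherwise more reads become ambiguous and are rejected by the tie check.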
            Last edited by luc; 02-25-2014, 01:55 PM.


            • Bioinform
              Member
              • May 2013
              • 17

              #7
              Hi,

              Does anyone have a Perl or Python script that can pull out reads (forward) from the R1 file and the corresponding mates (reverse) from the R2 file? CLC Workbench does give paired sequences, but as luc mentioned, it looks for inline barcodes.
              I would like a script that works similarly and tolerates some mismatches. Ideally it would also look for the barcode in separate barcode reads.

              Many thanks
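A rough Python sketch of what is being asked for here, under some assumptions: the index reads are available as a third FASTQ file, all three files list the reads in the same order, and every record spans exactly four lines. The function names and the "undetermined" bin are illustrative, not taken from any existing tool.

```python
def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a 4-line FASTQ handle."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header, seq, qual


def mismatches(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex_pairs(r1, r2, index, barcodes, max_mismatches=1):
    """Group (R1, R2) record pairs by sample using a separate index file.

    r1, r2 and index are open text handles. Pairs whose index read
    matches no barcode, or more than one, go to "undetermined".
    """
    bins = {sample: [] for sample in barcodes}
    bins["undetermined"] = []
    records = zip(read_fastq(r1), read_fastq(r2), read_fastq(index))
    for rec1, rec2, idx in records:
        tag = idx[1]
        hits = [s for s, bc in barcodes.items()
                if len(tag) >= len(bc)
                and mismatches(tag[:len(bc)], bc) <= max_mismatches]
        key = hits[0] if len(hits) == 1 else "undetermined"
        bins[key].append((rec1, rec2))
    return bins
```

The sketch keeps everything in memory to stay short; for a real run you would more likely write each pair straight to per-sample output files as it is read.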


              • luc
                Senior Member
                • Dec 2010
                • 469

                #8
                Hi Bioinform,

                The allPrep-8.py script from the barcode-tools set will do what you want and more.
                When used with the "-D" option, it will only demultiplex (and not do adapter or quality trimming).


                • oligoelemento
                  Junior Member
                  • Jan 2014
                  • 1

                  #9
                  More options:
                  jMHC
                  fastx_barcode_splitter

                  But I have a question: is there any software that also detects insertions/deletions in the barcodes? I want to use something to repair Ion Torrent barcodes, but the software above only detects mismatches.
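Not an answer naming a specific tool, but the general idea can be sketched in Python: replace the mismatch count with an edit distance, so that insertions and deletions are scored as well as substitutions. The sketch assumes the barcode is inline at the very start of the read; the function names and example barcodes are hypothetical. Because an indel shifts where the barcode ends, the barcode is compared against every prefix of the read start and the best score is kept.

```python
def prefix_edit_distance(barcode, read_start):
    """Smallest edit distance between barcode and any prefix of read_start.

    Substitutions, insertions and deletions each cost 1.
    """
    # prev[j] = distance between the barcode consumed so far and read_start[:j]
    prev = list(range(len(read_start) + 1))
    for i, b in enumerate(barcode, 1):
        cur = [i]
        for j, r in enumerate(read_start, 1):
            cur.append(min(prev[j] + 1,              # barcode base missing from read
                           cur[j - 1] + 1,           # extra base in read
                           prev[j - 1] + (b != r)))  # match or substitution
        prev = cur
    return min(prev)


def assign_with_indels(read, barcodes, max_edits=1):
    """Return the sample whose barcode is uniquely closest, or None."""
    scored = sorted(
        (prefix_edit_distance(bc, read[:len(bc) + max_edits]), sample)
        for sample, bc in barcodes.items()
    )
    if not scored or scored[0][0] > max_edits:
        return None
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None  # ambiguous: two barcodes are equally close
    return scored[0][1]


# Hypothetical barcodes; the read has lost the 'G' of the first barcode
barcodes = {"sample1": "ACGTAC", "sample2": "TTGGCA"}
print(assign_with_indels("ACTACGGGG", barcodes))
```

This treats all edits equally; it does not model the homopolymer-length errors typical of Ion Torrent data, which a dedicated tool might weight differently.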
