
  • Demultiplexing and CASAVA 1.7

    Hello,
    I am a novice bioinformatician. I am looking for a tool (script) to demultiplex
    sequences from the GAIIx before running CASAVA (1.7). We do not use the index provided by Illumina. Thank you in advance for your help.

  • #2
    Originally posted by tonio100680 View Post
    Hello,
    I am a novice bioinformatician. I am looking for a tool (script) to demultiplex
    sequences from the GAIIx before running CASAVA (1.7). We do not use the index provided by Illumina. Thank you in advance for your help.
    If you don't use an index read, CASAVA is of no use for demultiplexing.
    Have a look at 'sabre' (https://github.com/ucdavis-bioinformatics/sabre) or FastX-Toolkit (http://hannonlab.cshl.edu/fastx_tool...splitter_usage) amongst others.

    hth, Sven



    • #3
      You could try Novobarcode. It's included in the Novoalign download at www.novocraft.com. It is free to use; no license is required.

      Colin



      • #4
        Thank you for the help !

        After conversion, my files are in QSEQ format. To use Novobarcode, do I have to convert them to FASTQ? If so, do you have a simple solution?



        • #5
          You might want to read the following thread as a starting point: http://seqanswers.com/forums/showthread.php?t=1801

          hth, Sven



          • #6
            I really am a bioinformatics novice! I just want a simple converter... I'm panicking, and the biologists keep hounding me.



            • #7
              'bfast' (BFAST, a short-read mapper), as mentioned in the thread, has its own converter.


              Download the archive, untar it, and look in the 'scripts' directory; there you'll find a perl script called 'ill2fastq.pl'. I have never used it, but it should do the job.
              There are probably a lot more tools... maybe you could have a look at GALAXY, but I am not sure if they provide qseq-to-fastq conversion.
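
For readers who only need the conversion itself, here is a minimal Python sketch (not the bfast script, and not from any poster in this thread). It assumes the standard 11-column, tab-separated qseq layout (machine, run, lane, tile, x, y, index, read number, sequence, quality, filter flag). The function name `qseq_to_fastq` and the example record are made up for illustration. Quality characters are copied unchanged, so the output keeps the Illumina Phred+64 encoding, and the filter flag is ignored.

```python
def qseq_to_fastq(line: str) -> str:
    """Convert one tab-separated qseq record into a four-line FASTQ record."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 11:
        raise ValueError(f"expected 11 qseq fields, got {len(fields)}")
    machine, run, lane, tile, x, y, index, read_no, seq, qual, _flag = fields
    # qseq writes uncalled bases as '.', FASTQ conventionally uses 'N'.
    seq = seq.replace(".", "N")
    # Read name in the pre-1.8 Illumina style: machine_run:lane:tile:x:y#index/read
    name = f"{machine}_{run}:{lane}:{tile}:{x}:{y}#{index}/{read_no}"
    return f"@{name}\n{seq}\n+\n{qual}\n"


# Hypothetical record, for illustration only.
example = "HWUSI-EAS100\t7\t1\t1\t100\t200\t0\t1\tAC.GT\thhhhh\t1\n"
print(qseq_to_fastq(example), end="")
```

To convert a whole file, one could call `qseq_to_fastq` on each line of the input and write the results to a new file.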

              When I read your original post again, it seems you want to use CASAVA 1.7 for mapping? Be aware that 1.7 needs qseq format for input files, however the fresh 1.8 takes fastq as input ...

              hth, Sven



              • #8
                Originally posted by tonio100680 View Post
                Thank you for the help !

                After conversion, my files are in QSEQ format. To use Novobarcode, do I have to convert them to FASTQ? If so, do you have a simple solution?
                Recent versions of Novoalign will process qseq files. The latest version will accept 3 read files, with the index tag in its own read file. Output is then qseq.

                Earlier versions could only accept 2 qseq files and would write the demuxed files in fastq format. In this case you run novobarcode twice: once for read1 and the index read, then again for read2 and the index read. You can still do this with the latest version if you want qseq to fastq conversion.

                Colin



                • #9
                  That's right, I want to use CASAVA 1.7 because 1.8 is not available in France...
                  I cannot use novobarcode because its output format is not compatible with CASAVA's input format, so I'm looking for a "simple" alternative demultiplexer...



                  • #10
                    CASAVA 1.8 is already available at iCom; if you have your tags directly attached to your (first) read, you'll probably need to write your own demultiplexer to write qseq again (shouldn't be too hard if you know someone familiar with e.g. perl).

                    Or a simple one-liner, assuming the barcode sequence 'ACGTACGT' (not removing it),
                    Code:
                    perl -lane 'print if($F[8]=~/^ACGTACGT/)' SampleABC_qseq.txt > SampleA_NewQseq.txt
                    You could then use the 'qseq-mask' (USE_BASES) option of GERALD to skip these bases.

                    Or just a starting point (not tested thoroughly), a simple script looking for all seqs starting with $barcode and removing it from seq and quals:
                    Code:
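                    #!/usr/bin/env perl

                    use warnings;
                    use strict;

                    # First argument: barcode; remaining arguments: qseq files (read via <>).
                    my $barcode = shift or die "Usage: $0 BARCODE qseq_file [...]\n";
                    my $length  = length($barcode);

                    while (<>) {
                        chomp;
                        my @line = split /\t/;    # qseq: 11 tab-separated fields

                        # Keep only reads whose sequence (field 9) starts with the barcode.
                        next unless $line[8] =~ /^\Q$barcode\E/;

                        # Trim the barcode from the sequence and the quality string.
                        my $s = substr($line[8], $length);
                        my $q = substr($line[9], $length);

                        print join("\t", @line[0..7], $s, $q, $line[10]), "\n";
                    }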
                    Run it with the barcode sequence as the first argument and a bunch of qseq files as the following arguments. The script just dumps to the terminal, so redirect the output to a new file once you are happy with it.

                    E.g.
                    Code:
                    scriptName.pl ACGTACG *qseq.txt > newQseqFile_ACGTACG.txt
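
The same idea can be extended to several barcodes in one pass. The following is a rough Python sketch under stated assumptions, not a tested pipeline: the input is 11-column tab-separated qseq, the barcode sits at the start of the read (field 9), matching is exact, and no barcode is a prefix of another. The function name `demultiplex` and the example records are hypothetical.

```python
from typing import Dict, Iterable, List


def demultiplex(lines: Iterable[str], barcodes: Iterable[str]) -> Dict[str, List[str]]:
    """Group qseq lines by leading barcode, trimming it from sequence and quality."""
    barcodes = list(barcodes)
    bins: Dict[str, List[str]] = {bc: [] for bc in barcodes}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 11:
            continue  # skip blank or malformed lines
        for bc in barcodes:
            if fields[8].startswith(bc):
                # Remove the barcode from both the bases and the qualities.
                fields[8] = fields[8][len(bc):]
                fields[9] = fields[9][len(bc):]
                bins[bc].append("\t".join(fields))
                break  # exact match found; reads matching no barcode are dropped
    return bins


# Hypothetical records, for illustration only.
records = [
    "M\t1\t1\t1\t0\t0\t0\t1\tACGTTTGA\tabcdefgh\t1",
    "M\t1\t1\t1\t0\t1\t0\t1\tGGCCTTGA\tabcdefgh\t1",
    "M\t1\t1\t1\t0\t2\t0\t1\tTTTTTTGA\tabcdefgh\t1",
]
for barcode, reads in demultiplex(records, ["ACGT", "GGCC"]).items():
    print(barcode, len(reads))
```

Each list in the result still holds qseq-formatted lines, so writing one list per file gives per-sample qseq files.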
                    hth, Sven
                    Last edited by sklages; 06-08-2011, 01:09 AM.



                    • #11
                      btw, as of CASAVA version 1.8 there is a script called "configureQseqToFastq.pl" to convert a whole folder of qseqs to fastq.



                      • #12
                        Originally posted by tonio100680 View Post
                        That's right, I want to use CASAVA 1.7 because 1.8 is not available in France...
                        I cannot use novobarcode because its output format is not compatible with CASAVA's input format, so I'm looking for a "simple" alternative demultiplexer...
                        Let me look to see if I can get novobarcode to do qseq in & out when index tag is embedded in the read.

                        Novobarcode does allow some mismatches in the index tag so it may classify more reads than a perl script.
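
To illustrate why tolerating mismatches classifies more reads than an exact-prefix match, here is a small Python sketch. It is not Novobarcode's algorithm, which is not described in this thread. It is a plain Hamming-distance comparison, and `hamming`, `assign_barcode`, and `max_mismatches` are names chosen for this example. A read is assigned to the closest barcode within the allowed number of mismatches, and ambiguous ties are rejected.

```python
from typing import Iterable, Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(read: str, barcodes: Iterable[str],
                   max_mismatches: int = 1) -> Optional[str]:
    """Return the barcode closest to the start of the read, or None if no
    barcode is within max_mismatches or the best match is ambiguous."""
    best, best_dist, tie = None, max_mismatches + 1, False
    for bc in barcodes:
        prefix = read[:len(bc)]
        if len(prefix) < len(bc):
            continue  # read is shorter than the barcode
        dist = hamming(prefix, bc)
        if dist < best_dist:
            best, best_dist, tie = bc, dist, False
        elif dist == best_dist and best is not None:
            tie = True  # two barcodes are equally close
    return None if tie else best


# Hypothetical read with one sequencing error inside the barcode.
print(assign_barcode("ACGAACGTTTGA", ["ACGTACGT", "TTGGCCAA"]))
print(assign_barcode("ACGAACGTTTGA", ["ACGTACGT", "TTGGCCAA"], max_mismatches=0))
```

With `max_mismatches=0` the function behaves like the exact-match perl script, so the read with a single error in its barcode goes unassigned.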
                        Last edited by sparks; 06-09-2011, 12:33 AM.



                        • #13
                          I've modded novobarcode so that it can write out QSEQ when input is in QSEQ. If you'd like to try it send an email to support (at) novocraft ....
                          Last edited by sparks; 06-09-2011, 12:32 AM.



                          • #14
                            Originally posted by sklages View Post
                            If you don't use an index read, CASAVA is of no use for demultiplexing.
                            Have a look at 'sabre' (https://github.com/ucdavis-bioinformatics/sabre) or FastX-Toolkit (http://hannonlab.cshl.edu/fastx_tool...splitter_usage) amongst others.

                            hth, Sven

                            We actually use CASAVA v1.7 to demultiplex our in-house designed barcodes routinely. We don't use the Illumina index read kit; we just set our detection cycles out past the barcode and use CASAVA to demultiplex on the barcode. It's just a matter of properly formatting the sample sheet. CASAVA will make a folder for each barcode and demultiplex the _qseq.txt files into each folder; then you can run GERALD to create the .fastq files.

                            Take a look at the demultiplex.pl script in CASAVA 1.7; there is an example of the sample sheet format to use on pg 14 of the manual.
                            Christine Brennan
                            UM DNA Sequencing Core
                            Ann Arbor, MI 48109




                            • #15
                              Originally posted by cbrennan View Post
                              We actually use CASAVA v1.7 to demultiplex our in-house designed barcodes routinely. We don't use the Illumina index read kit; we just set our detection cycles out past the barcode and use CASAVA to demultiplex on the barcode. It's just a matter of properly formatting the sample sheet. CASAVA will make a folder for each barcode and demultiplex the _qseq.txt files into each folder; then you can run GERALD to create the .fastq files.

                              Take a look at the demultiplex.pl script in CASAVA 1.7; there is an example of the sample sheet format to use on pg 14 of the manual.
                              Ah, OK. But you are actually using an index... I had understood that the OP's indices were part of the construct being sequenced, like NimbleGen adaptors or similar.
