SEQanswers

Old 12-24-2014, 02:08 AM   #1
ClemBuntu
Member
 
Location: Lyon

Join Date: Dec 2014
Posts: 37
Question: Demultiplexing software

Hello everyone,

I used to demultiplex using bcl2fastq.pl, provided by CASAVA.
I want to try new tools because I ran into some trouble at the demultiplexing step.
So I searched on Google and found several tools such as TagGD, Flexbar, Sabre, deML, etc., but all of them take a FASTQ file as input... And the only way I can get a FASTQ file is by demultiplexing with bcl2fastq.pl; on top of that, this script removes the index sequence from all reads...
So how can I demultiplex something that is already demultiplexed and where the index sequences have been removed?

I think the only thing I can do is give an empty SampleSheet to CASAVA; I suppose it will then put all reads together as undetermined reads. But maybe there is a cleaner way to do it?
Old 12-24-2014, 06:50 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978
Default

Can you specify what kind of "trouble" you ran into with bcl2fastq?

Please keep in mind that if there was a problem with the index reads (multiple N's in the sequence) then no demultiplexing tool is going to help you. This run may have to be repeated.
Old 12-24-2014, 06:55 AM   #3
ClemBuntu
Member
 
Location: Lyon

Join Date: Dec 2014
Posts: 37
Default

Well, I got more undetermined reads than I expected.

I'm sure the main problem comes from the run, but anyway, what can I do if I want to compare these tools?
Old 12-24-2014, 07:18 AM   #4
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104
Default

Quote:
Originally Posted by ClemBuntu View Post
Well, I got more undetermined reads than I expected.

I'm sure the main problem comes from the run, but anyway, what can I do if I want to compare these tools?
I suspect (not having used the tools you mentioned) that they are focused on demultiplexing user-generated barcodes rather than the Illumina indexes, and thus you will be disappointed in them. As you said, you could force everything into undetermined reads, but that won't give you the Illumina indexes.

If you have more undetermined reads than expected then, like GenoMax, I think that you have problems with your indexing reads. You can alleviate some of the problems via the 'use-bases-mask' and 'num_of_mismatches' parameters to Casava.
Old 12-24-2014, 07:27 AM   #5
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978
Default

Quote:
Originally Posted by ClemBuntu View Post
Well, I got more undetermined reads than I expected.

I'm sure the main problem comes from the run, but anyway, what can I do if I want to compare these tools?
What fraction of the reads are in the undetermined file?

Are you recovering all the samples you expect from the run? A common problem is specifying an incorrect barcode for a sample which sends the reads for that sample(s) to the undetermined file.
Old 12-24-2014, 07:33 AM   #6
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104
Default

GenoMax has, as usual, great suggestions.

His "common problem" of specifying an incorrect barcode (which is indeed something we do often enough) should be easy to spot because one of your samples will have close to zero reads.
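
One way to check for that near-zero sample is to count the reads in each demultiplexed FASTQ file. The following is only a minimal sketch, assuming standard 4-line FASTQ records; the function name `reads_per_fastq` is made up for illustration and is not part of CASAVA.

```shell
#!/usr/bin/env bash
# Print "<read count><TAB><file>" for each FASTQ given (plain or gzipped).
# Assumption: standard 4-line FASTQ records, so reads = lines / 4.
reads_per_fastq() {
    local fq lines
    for fq in "$@"; do
        # gzip -cdf decompresses gzipped input and passes plain text through.
        lines=$(gzip -cdf < "$fq" | wc -l)
        printf '%d\t%s\n' "$((lines / 4))" "$fq"
    done
}
```

Calling it on all R1 files of a lane and piping the result through `sort -n` should put a sample with a mistyped barcode at the top of the list.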

The harder problem is if many of the index reads overall have poor quality bases in them. If the problem is base-specific (e.g., occurs only at the 2nd index base) then Casava's 'use-bases-mask' parameter can be used to ignore the specific base. Otherwise 'num_of_mismatches' needs to be used. Sometimes I use both if the reads are really poor.
Old 12-24-2014, 07:37 AM   #7
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978
Default

Use the following command to find out which tags are represented in your "undetermined" file. You will generally see a great deal of variation. The thing to look for is whether some tags are heavily over-represented. Sometimes people make mistakes when making libraries or pooling them, and you don't quite get the result you expect from a pool.

Replace each ? with the specific lane number.

Code:
$ zcat lane?_Undetermined_L00?_R1_001.fastq.gz | grep @HWI |cut -d: -f10 | sort | uniq -c | sort -r -n -k1
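
The command above relies on the read headers starting with @HWI. A variant that does not depend on the instrument name could look like the sketch below. It assumes 4-line FASTQ records and CASAVA 1.8-style headers that end in ":<index>"; the function name `count_index_tags` is hypothetical.

```shell
#!/usr/bin/env bash
# Count index tags in a FASTQ file (plain or gzipped), most frequent first.
# Assumptions: 4-line records and headers whose last colon-separated
# field is the index, e.g.
#   @HWI-ST1234:100:C0ABCACXX:1:1101:1234:2000 1:N:0:ACGTAC
count_index_tags() {
    local fastq="$1"
    # gzip -cdf decompresses gzipped input and passes plain text through.
    gzip -cdf < "$fastq" |
        awk 'NR % 4 == 1 { n = split($0, f, ":"); print f[n] }' |
        sort | uniq -c | sort -k1,1nr
}
```

Taking every fourth line instead of grepping for a prefix also avoids counting quality lines that happen to start with "@".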
Old 01-01-2015, 03:38 AM   #8
ClemBuntu
Member
 
Location: Lyon

Join Date: Dec 2014
Posts: 37
Default

The fraction of undetermined reads is not too high, so don't worry about that, guys, but thanks anyway.


My question was about demultiplexing tools that take FASTQ files as input. I'm just wondering why they exist, since Illumina software like bcl2fastq demultiplexes and creates the FASTQ files. It seems nobody here uses these tools, which is very strange!

As westerman said, maybe these tools are only useful when you have a custom barcode...
Old 01-01-2015, 06:12 AM   #9
Michael.Ante
Senior Member
 
Location: Vienna

Join Date: Oct 2011
Posts: 121
Default

Long before index reads were introduced, barcodes were inserted into the read sequence itself (inline barcoding). You had to demultiplex the FASTQ file and split it into the different samples according to the first n bases. Various tools were implemented for that purpose.
Now, in the era of index reads, these tools are more or less outdated.
I hope that answers your question.
Cheers,
Michael
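
To illustrate inline barcoding, here is a rough sketch of splitting a FASTQ file by the first n bases of each read. It is not how TagGD, Flexbar, Sabre or deML work internally. The function name `demux_inline`, the tab-separated barcode file format and the output file names are all assumptions made for this example, and it only does exact matching with barcodes of a single common length.

```shell
#!/usr/bin/env bash
# Split a FASTQ by inline barcode (the first n bases of each read).
# Usage: demux_inline <barcodes.tsv> <reads.fastq[.gz]> <outdir>
# barcodes.tsv (hypothetical format): "<barcode><TAB><sample>" per line,
# non-empty, all barcodes of the same length. Exact matches only.
demux_inline() {
    local barcodes="$1" fastq="$2" outdir="$3"
    mkdir -p "$outdir"
    gzip -cdf < "$fastq" | awk -v out="$outdir" '
        BEGIN { FS = "\t" }
        # First input: load the barcode -> sample map.
        NR == FNR { sample[$1] = $2; n = length($1); next }
        # Second input (stdin): 4-line FASTQ records.
        FNR % 4 == 1 { header = $0; next }
        FNR % 4 == 2 { seq = $0; next }
        FNR % 4 == 3 { next }
        {
            qual = $0
            bc = substr(seq, 1, n)
            if (bc in sample) {
                file = out "/" sample[bc] ".fastq"
                # Trim the barcode from the sequence and the quality string.
                seq = substr(seq, n + 1)
                qual = substr(qual, n + 1)
            } else {
                file = out "/undetermined.fastq"
            }
            printf "%s\n%s\n+\n%s\n", header, seq, qual > file
        }
    ' "$barcodes" -
}
```

Real tools add mismatch tolerance and paired-end handling on top of this; the sketch only shows the basic idea of routing reads by their leading bases.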
Old 01-02-2015, 01:11 AM   #10
ClemBuntu
Member
 
Location: Lyon

Join Date: Dec 2014
Posts: 37
Default

Oh, I see. Now that makes sense, thanks!
Tags
demultiplexing