SEQanswers

SEQanswers (http://seqanswers.com/forums/index.php)
-   Illumina/Solexa (http://seqanswers.com/forums/forumdisplay.php?f=6)
-   -   bcl2fastq fail to demultiplexing, Barcode collision (http://seqanswers.com/forums/showthread.php?t=89531)

lingbl 05-26-2019 10:04 PM

bcl2fastq fails to demultiplex: barcode collision
 
bcl2fastq --barcode-mismatches 1 -o ./test --tiles s_6 --sample-sheet SampleSheet_L006_8index.csv



2019-05-27 13:33:58 [28b0880] ERROR: bcl2fastq::common::Exception: 2019-May-27 13:33:58: Success (0): /TeamCityBuildAgent/work/556afd631a5b66d8/src/cxx/lib/layout/BarcodeCollisionDetector.cpp(187): Throw in function void bcl2fastq::layout::BarcodeCollisionDetector::handleCollision(const value_type&, const value_type&)
Dynamic exception type: boost::exception_detail::clone_impl<bcl2fastq::layout::BarcodeCollisionError>
std::exception::what: Barcode collision for barcodes: GACCTGAT, CAGCTGAT
By default, bcl2fastq allows 1 mismatch in each barcode. Barcodes with too few mismatches are ambiguous ( less than 2 times the number of mismatches plus 1). To reduce the number of allowed mismatches, use the command line option: '--barcode-mismatches'. Note that particularly for barcodes with only 1 mismatch, there is the danger that some reads will be written to the wrong sample due to errors in the barcode sequence.



What's wrong with bcl2fastq? The indexes GACCTGAT and CAGCTGAT differ at two bases, so I cannot see any collision between them.

r.rosati 05-27-2019 02:24 AM

The confusion comes from the use of "mismatch" for both "sequencing error" and "difference between barcodes".
If you allow one sequencing error, then the number of differences between any two barcodes must be at least (2 * sequencing errors + 1) = 3. Otherwise, for example, if the sequencer reads CACCTGAT, the software can't tell whether this is the first barcode with one sequencing error or the second barcode with one sequencing error.

In your case, you should allow zero mismatches (sequencing errors), i.e. run with --barcode-mismatches 0, because your barcodes differ by only two mismatches (differences).
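
For anyone who wants to check a pair of indexes before running the conversion, here is a minimal Python sketch of the rule above. It is not bcl2fastq's own code; the function names hamming and max_safe_mismatches are made up for illustration, and it only counts substitutions between equal-length barcodes.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def max_safe_mismatches(a: str, b: str) -> int:
    """Largest m satisfying hamming(a, b) >= 2*m + 1.

    Returns -1 for identical barcodes, which can never be told apart.
    """
    return (hamming(a, b) - 1) // 2


# The two barcodes from the error message (hypothetical helper names above).
distance = hamming("GACCTGAT", "CAGCTGAT")
allowed = max_safe_mismatches("GACCTGAT", "CAGCTGAT")
print(distance, allowed)
```

With the two barcodes from this thread the distance is 2, so the largest safe value for the allowed sequencing errors is 0.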

lingbl 05-27-2019 02:52 AM

CAGCTGAT
CACCTGAT

GACCTGAT
CACCTGAT
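
The same comparison can be done exhaustively. The sketch below, assuming only A/C/G/T substitutions (no N calls, no indels) and using a hypothetical helper named neighbors, lists every read that sits within one mismatch of both barcodes.

```python
def neighbors(barcode: str, alphabet: str = "ACGT") -> set:
    """All sequences within one substitution of the barcode, itself included."""
    found = {barcode}
    for i, base in enumerate(barcode):
        for alt in alphabet:
            if alt != base:
                found.add(barcode[:i] + alt + barcode[i + 1:])
    return found


# Reads that could be either barcode with exactly one sequencing error.
ambiguous = sorted(neighbors("GACCTGAT") & neighbors("CAGCTGAT"))
print(ambiguous)
```

Any read in that overlap cannot be assigned to one sample, which is why the collision is reported when one mismatch is allowed.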

Great help, thank you!

r.rosati 05-27-2019 03:51 AM

Glad to be of help!
As an afterthought: one can't blame the software for calling both "mismatches". Neither the sequencer nor the software knows the "truth", so they can't tell whether a base is a sequencing error or not. For the software, a mismatch is a mismatch; if it were known that a called base was an error, it wouldn't have been called. Perhaps I should have phrased the two as (1) "mismatch between the read sequence and the barcode sequences" and (2) "mismatch between the two expected barcodes".

