10-25-2011, 01:05 PM   #1
Junior Member
Location: OR, United States

Join Date: Aug 2011
Posts: 5
Illumina1.8 Paired-End Barcode Splitting?

Hi everyone,

Hopefully this is a pretty simple question with a pretty simple answer. I've been using a pipeline to pre-process Illumina1.8 data fresh off the sequencer. We have a list of 5bp barcodes denoting different samples, and the pipeline sorts each sequence into a separate file according to its first 5bp, allowing however many mismatches we choose. We run it first on the /1 half of the data and then on the /2 half, and end up with a pair of files for each barcode, e.g. AACCC_1.fq and AACCC_2.fq.
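To make the matching step concrete, here is a minimal sketch of what I mean, not the actual pipeline code. The function names (`hamming`, `match_barcode`) and the example barcode list are made up for illustration, and I'm assuming a read is only assigned when exactly one barcode is within the allowed number of mismatches.

```python
def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def match_barcode(seq, barcodes, max_mismatches=0):
    """Return the one barcode matching the start of seq, or None.

    Hypothetical helper: a read is assigned only if exactly one barcode
    lies within max_mismatches of its prefix; ambiguous reads get None.
    """
    hits = []
    for bc in barcodes:
        prefix = seq[:len(bc)]
        # Skip reads shorter than the barcode itself.
        if len(prefix) == len(bc) and hamming(prefix, bc) <= max_mismatches:
            hits.append(bc)
    return hits[0] if len(hits) == 1 else None


# Example barcodes (illustrative only).
barcodes = ["AACCC", "GGTTA"]
exact = match_barcode("AACCCTTGGA", barcodes)
one_off = match_barcode("AACCGTTGGA", barcodes, max_mismatches=1)
no_hit = match_barcode("AACCGTTGGA", barcodes, max_mismatches=0)
```

Each read whose result is not `None` would then be written to the file for that barcode, with `_1` or `_2` depending on which half is being processed.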

My question concerns the case where a sequence from the /1 side matches a barcode in my list exactly, but the /2 side starts with something different that doesn't match any barcode. Since the /1 and /2 reads share the same sequence ID from Illumina and are supposedly read from opposite ends of the same molecule, can I safely put the /2 read into the same barcode file as the /1?

My biology background isn't very strong, so sorry ahead of time if this is a simple question. Right now, if only one side matches a given barcode, the other side is just thrown out. If my assumption above is safe, that is a terrible waste of PE data. Let me know what you think.
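For reference, this is roughly the change I have in mind, written as a sketch that only makes sense if the assumption turns out to be safe. The names `closest_barcode` and `assign_pair` are hypothetical, and dropping pairs whose two reads match different barcodes is my own guess at a sensible rule, not something the current pipeline does.

```python
def closest_barcode(seq, barcodes, max_mismatches=0):
    """Return the single barcode within max_mismatches of seq's prefix, else None."""
    hits = [
        bc for bc in barcodes
        if len(seq) >= len(bc)
        and sum(x != y for x, y in zip(seq, bc)) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


def assign_pair(seq1, seq2, barcodes, max_mismatches=0):
    """Pick one barcode for a /1 and /2 read pair sharing a sequence ID.

    Assumption (the open question here): if only one mate matches a
    barcode, the other mate can go into the same barcode's file.
    """
    bc1 = closest_barcode(seq1, barcodes, max_mismatches)
    bc2 = closest_barcode(seq2, barcodes, max_mismatches)
    if bc1 and bc2:
        # Both mates matched: keep the pair only if they agree.
        return bc1 if bc1 == bc2 else None
    # At most one mate matched: use that barcode (None if neither did).
    return bc1 or bc2


barcodes = ["AACCC", "GGTTA"]  # illustrative only
rescued = assign_pair("AACCCTTGGA", "TTTTTACGTA", barcodes)
conflict = assign_pair("AACCCTTGGA", "GGTTAACGTA", barcodes)
unmatched = assign_pair("TTTTTTTTTT", "CCCCCCCCCC", barcodes)
```

With this, the pair in `rescued` would be kept and both reads written under `AACCC`, instead of the /2 read being discarded.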
pbatzel