I have 125 bp paired-end HiSeq data with dual Nextera indexes (8 bp each). The data come from a single lane of the flowcell, and I have that lane's .bcl and .cpos files, which I currently de-multiplex with Picard.
However, I would like to remove any human reads from the whole data set before I ever de-multiplex the samples. That way, human reads would only be identified on a group-wide basis, never per sample, and would be filtered out as early as possible in my downstream analysis pipeline.
Can anyone suggest a way to do this? Is it feasible?
The plan below seems like it would do it, if I could get Step 3 to work:
Step 1: Completely skip all barcode bases in the reads and generate my PE fastq files
Step 2: Tag human reads from these de-identified fastq files using some tool
Step 3: Somehow use Step 2's tags to remove human data from the raw .bcl and .cpos files
Step 4: De-multiplex again using the barcode
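To make Steps 2 and 3 concrete, here is a minimal sketch of a read-name based variant. Note that this is a swapped-in approach, not the .bcl editing described in Step 3: it assumes the human reads flagged in Step 2 are available as a set of read IDs, and that the same read names appear in whichever FASTQ files are filtered afterwards. The function names (`read_fastq`, `read_id`, `drop_flagged_pairs`) are hypothetical, and the raw .bcl and .cpos files are left untouched.

```python
from typing import Iterable, Iterator, Set, Tuple

Record = Tuple[str, str, str, str]  # header, sequence, separator, qualities


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield 4-line FASTQ records from an iterable of lines."""
    it = (line.rstrip("\n") for line in lines)
    # zip over the same iterator four times groups lines into records
    for header, seq, plus, qual in zip(it, it, it, it):
        yield header, seq, plus, qual


def read_id(header: str) -> str:
    """Read name without the leading '@', the comment, or a /1 or /2 suffix."""
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def drop_flagged_pairs(
    r1: Iterable[Record], r2: Iterable[Record], flagged: Set[str]
) -> Iterator[Tuple[Record, Record]]:
    """Yield read pairs whose ID is NOT in the flagged (human) set."""
    for rec1, rec2 in zip(r1, r2):
        if read_id(rec1[0]) != read_id(rec2[0]):
            raise ValueError("R1 and R2 are out of sync")
        if read_id(rec1[0]) not in flagged:
            yield rec1, rec2


# Toy example with made-up read names; "readB" stands in for a human read.
r1_lines = ["@readA/1", "ACGT", "+", "IIII", "@readB/1", "GGCC", "+", "IIII"]
r2_lines = ["@readA/2", "TTAA", "+", "IIII", "@readB/2", "CCGG", "+", "IIII"]
kept = list(drop_flagged_pairs(read_fastq(r1_lines), read_fastq(r2_lines), {"readB"}))
```

With real data the two inputs would be open file handles for the R1 and R2 files, and the flagged set would come from whatever tool does the tagging in Step 2.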
Also, does anyone have a favorite tool for filtering human reads from HiSeq data (and a justification for it)?
Has anyone compared Picard for de-multiplexing against Illumina's bcl2fastq or other de-multiplexers? It seems to me that a de-multiplexer that takes single indels into account would be nice.
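To illustrate what I mean by taking single indels into account, here is a rough sketch, not how Picard or bcl2fastq actually work. It assumes a fixed-length index read, so a deletion pulls in one extra trailing base and an insertion pushes the last barcode base off the end. The names `levenshtein`, `indel_aware_distance` and `assign_barcode`, and the two example barcodes, are made up for illustration.

```python
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion from a
                           cur[j - 1] + 1,              # insertion into a
                           prev[j - 1] + (ca != cb)))   # match / substitution
        prev = cur
    return prev[-1]


def indel_aware_distance(observed: str, barcode: str) -> int:
    """Smallest distance under three scenarios for a fixed-length index read."""
    return min(
        levenshtein(observed, barcode),        # exact match or substitution
        levenshtein(observed[:-1], barcode),   # deletion: last observed base is extra
        levenshtein(observed, barcode[:-1]),   # insertion: last barcode base was lost
    )


def assign_barcode(observed: str, barcodes: List[str], max_dist: int = 1) -> Optional[str]:
    """Return the single closest barcode within max_dist, or None if none or tied."""
    scored = sorted((indel_aware_distance(observed, bc), bc) for bc in barcodes)
    if not scored or scored[0][0] > max_dist:
        return None
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None  # ambiguous: two barcodes are equally close
    return scored[0][1]


barcodes = ["ACGTACGT", "TTGGCCAA"]
deletion_hit = assign_barcode("ACGACGTA", barcodes)   # T deleted, extra base A read
insertion_hit = assign_barcode("ACGGTACG", barcodes)  # G inserted, final T lost
no_hit = assign_barcode("GGGGGGGG", barcodes)
```

With dual indexes, each 8 bp index would be matched separately against its own barcode list. Whether this is safe depends on how far apart the barcodes in the set are, since allowing indels makes collisions between similar barcodes more likely.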