Dear SEQanswers experts - I have a set of 24 samples for RNA-seq. The analysis I need to do is pairwise differential gene expression (i.e. 12 pairs of samples). I need to run them on a single HiSeq 2500 flowcell (this is all we have money for).
I could run three samples per lane across the flowcell, which is fine, but it means that some pairs would be compared across lanes. This feels like it could cause problems in the data later on, with any lane effects adding noise to some of the pairs.
A long time ago an Illumina tech guy suggested pooling all libraries and running the pool across all lanes as a flexible way of reducing any lane bias. It would also mean I could cross-compare any sample to any other without lane bias.
So my questions are:
- If I pooled all my samples into a single pool and ran this pool across all 8 lanes, does that seem reasonable to you?
- Do many people do this cross-lane pooling regularly?
- And if so, at which stage of data processing do you recombine the data from multiple FASTQ files into a single data set per sample? Can this be done during the initial deconvolution (demultiplexing)?
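To make the last question concrete, here is a minimal sketch of what I imagine the recombining step might look like if it is done after demultiplexing. It assumes the per-lane files are gzipped FASTQ named like `<sample>_L001_R1_001.fastq.gz`; that naming pattern and the function names `group_by_sample` and `merge_lanes` are just my assumptions for illustration, not anything taken from a particular tool. It relies on the fact that gzip files can be concatenated byte-for-byte and still decompress as one stream.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed (hypothetical) naming: <sample>_L<lane>_R<read>_001.fastq.gz
LANE_PATTERN = re.compile(
    r"^(?P<sample>.+)_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq\.gz$"
)


def group_by_sample(fastq_paths):
    """Group per-lane files by (sample, read), sorted by lane number."""
    groups = defaultdict(list)
    for path in map(Path, fastq_paths):
        match = LANE_PATTERN.match(path.name)
        if match is None:
            continue  # skip files that do not follow the assumed naming
        key = (match["sample"], match["read"])
        groups[key].append((int(match["lane"]), path))
    # Sorting by lane keeps R1 and R2 in the same order, so pairs stay in sync.
    return {key: [p for _, p in sorted(files)] for key, files in groups.items()}


def merge_lanes(fastq_paths, out_dir):
    """Concatenate the per-lane gzip files of each sample into one file per read."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for (sample, read), lane_files in group_by_sample(fastq_paths).items():
        target = out_dir / f"{sample}_R{read}.fastq.gz"
        with open(target, "wb") as out_handle:
            for lane_file in lane_files:
                with open(lane_file, "rb") as in_handle:
                    # Raw byte copy: concatenated gzip members form a valid gzip file.
                    shutil.copyfileobj(in_handle, out_handle)
        merged[(sample, read)] = target
    return merged
```

With 24 samples on 8 lanes this would turn 8 files per sample and read into one, which could then go into the aligner as usual. Whether merging here is better than merging at the alignment or count stage is exactly what I am unsure about.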
I plan to use a TopHat2/htseq-count/DESeq2 pipeline for analysis.
Thanks for any input.
Matt