Hi All,
I'm relatively new to whole genome sequencing, and have a background in 16S-based typing of bacteria using NGS. I am collecting data on a set of closely related organisms using Illumina sequencing, identifying SNPs, and clustering the strains into a phylogenetic tree based on the differences between them. I am wondering whether I need to worry about a 'batch effect' in these samples (from the extraction batch, sequencing run, library prep, or another source). It's not really feasible for us to sequence each sample multiple times, so if batch effects are a concern, what would be the best way to, first, identify whether they are present and, second, account for them in the analysis?
Thanks!