Does anyone know if Illumina sequencing is still expected to be biased against GC-rich sequences?
In doing some bacterial genome resequencing, it is clear that our data do not give Poisson coverage of the genome: the variance is much higher than the mean (variance/mean of ~35 for some runs). We found that about half of this excess dispersion is due to a bias against GC-rich sequences. Local GC content (within 10-20 bp) is the strongest determinant of the differences in sequencing coverage, and its influence falls off with distance, until at about 100 bp from the center base GC content matters little, if at all.
The problem is that I have seen data from other sequencing centers that show no GC effect and have much lower dispersion (variance/mean around 3.5). I would love to get data like this, but can't figure out what is different about our two attempts. Does anyone have any insight? Did they change their machines or protocols to avoid this?
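For anyone who wants to run the same check on their own data, here is a rough Python sketch of the two quantities mentioned above: the variance/mean ratio of per-base coverage (about 1 if coverage were Poisson) and mean coverage grouped by local GC fraction. The function names, the 10 bp half-window, and the simple binning are my own illustrative choices, not the exact analysis behind the numbers in this post.

```python
from statistics import mean, pvariance


def dispersion_index(coverage):
    """Variance-to-mean ratio of per-base coverage (about 1 for Poisson)."""
    return pvariance(coverage) / mean(coverage)


def local_gc(seq, center, half_window=10):
    """GC fraction in a window around `center`, clipped at the sequence ends."""
    start = max(0, center - half_window)
    end = min(len(seq), center + half_window + 1)
    window = seq[start:end].upper()
    return sum(base in "GC" for base in window) / len(window)


def mean_coverage_by_gc(seq, coverage, half_window=10, n_bins=5):
    """Mean per-base coverage for each local-GC bin (None for empty bins)."""
    bins = [[] for _ in range(n_bins)]
    for i, depth in enumerate(coverage):
        gc = local_gc(seq, i, half_window)
        # A GC fraction of exactly 1.0 goes into the last bin.
        idx = min(int(gc * n_bins), n_bins - 1)
        bins[idx].append(depth)
    return [mean(b) if b else None for b in bins]
```

If coverage drops steadily from the low-GC bins to the high-GC bins, that is the kind of bias described above. Rerunning with larger `half_window` values is one way to see how far from the center base the effect extends.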