byou678 10-30-2012 08:02 AM

How to check whether HCV genomic sequences differ between two samples?
I am looking to determine whether the Hepatitis C Virus genomic sequences (variant population structure) in two samples differ to a statistically significant degree.

If the nucleotide substitution or error rate is 1 in 1000 nucleotides and my target is 1665 nucleotides in length, could I detect a difference between the two populations by sampling 100 clones from each population using traditional Sanger sequencing?
Or would we need to use next-generation sequencing (NGS)?
If we need NGS, then what depth of coverage would we need to have sufficient power to detect a difference?
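
For scale, the background implied by those numbers can be worked out directly. This is only a rough sketch: it assumes the 1-in-1000 rate applies independently at every position, and the function names are made up for illustration rather than taken from any library.

```python
def expected_substitutions(rate_per_nt, length_nt, n_clones=1):
    """Expected number of substitutions summed over n_clones sequences."""
    return rate_per_nt * length_nt * n_clones


def prob_clone_has_substitution(rate_per_nt, length_nt):
    """Probability that one clone carries at least one substitution,
    assuming each position is hit independently (an assumption)."""
    return 1.0 - (1.0 - rate_per_nt) ** length_nt


RATE = 1 / 1000   # substitution or error rate from the question
LENGTH = 1665     # target length in nucleotides
CLONES = 100      # clones sampled per population

per_clone = expected_substitutions(RATE, LENGTH)
per_population = expected_substitutions(RATE, LENGTH, CLONES)
p_any = prob_clone_has_substitution(RATE, LENGTH)

print(f"Expected substitutions per clone: {per_clone:.3f}")
print(f"Expected substitutions across {CLONES} clones: {per_population:.1f}")
print(f"Chance a clone has at least one substitution: {p_any:.2f}")
```

So under these assumptions most clones would carry at least one background change, and any real difference between the two populations has to stand out against that.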

I really appreciate any response!!

swbarnes2 10-30-2012 01:53 PM

A 1665 bp sequence needs 2, maybe 3 Sanger sequencing primers to cover completely.

200 samples at 3 sequencing reactions each is a pretty modest project. You use a multi-channel pipette, set up two plates of PCR reactions, and send all that out with your three Sanger primers.

With 200 samples on an Illumina, you only want about 1500 reads per sample to get enough coverage, maybe fewer (provided the reads are distributed evenly, which can be tricky with a small PCR product). Every one of those 200 libraries needs to be prepped separately, so that it can have its own barcode.

You could try pooling the 100 clones from one population all together, but I'm not sure I would trust that you could tell the difference between an SNV found in a small percentage of your population and a PCR artifact.

A typical Illumina run produces at least a billion reads per flow cell, so you need only a tiny bit of real estate on a flow cell.
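
To put rough numbers on that, here is a back-of-the-envelope sketch. The 100 nt read length is my assumption (it is not stated above), the billion-read figure is the round number from this post, and the helper name is just illustrative.

```python
def mean_depth(reads, read_length_nt, amplicon_length_nt):
    """Average per-base depth if reads tile the amplicon evenly."""
    return reads * read_length_nt / amplicon_length_nt


SAMPLES = 200                    # 100 clones from each of two populations
READS_PER_SAMPLE = 1500          # target reads per barcoded library
AMPLICON_LENGTH = 1665           # target length in nucleotides
READ_LENGTH = 100                # ASSUMPTION: single-end 100 nt reads
FLOW_CELL_READS = 1_000_000_000  # round figure for one flow cell

total_reads = SAMPLES * READS_PER_SAMPLE
fraction = total_reads / FLOW_CELL_READS
depth = mean_depth(READS_PER_SAMPLE, READ_LENGTH, AMPLICON_LENGTH)

print(f"Total reads needed: {total_reads:,}")
print(f"Fraction of one flow cell: {fraction:.4%}")
print(f"Mean depth per sample: about {depth:.0f}x")
```

With these inputs the whole project needs 300,000 reads, which is 0.03% of a billion-read flow cell, at roughly 90x mean depth per sample if coverage is even.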

Personally, I think Illumina might be overkill. Sanger sequencing is pretty feasible for a few hundred samples across such a small region.

