
  • Optimal coverage for sequencing microsatellites with Illumina

    My lab is planning a phylogeography study on several different groups of lizards using microsatellites. We are interested in pooling the microsatellite amplicons for all of our individuals and then sequencing that library on an Illumina HiSeq 2500 machine (perhaps not the optimal machine for this project, but the one we have access to).

    Right now, we are trying to figure out the logistics of our protocol, and one thing that we are stuck on is how much coverage we want per microsatellite locus per individual (which determines how many we could pool, etc.). I imagine that one would want more coverage than for a RAD protocol, since there are more potential variants one could be detecting, but I am really not sure. We haven't developed our microsatellites yet, so we don't know how much allelic variation we will be dealing with.

    Has anyone else done a similar protocol with microsatellites? Does anyone have any advice? The few papers I found reported wildly different amounts of coverage (one had ~13x, which the authors determined was not enough, and the other had 2000x, which seems excessive).

    Just starting out, any thoughts would be appreciated!

  • #2
    Amplicons will often have very different read depths given differences in amplicon length and GC content. Different samples will also have different total read counts. So you will want to oversequence to get sufficient depth for your worst-performing samples and worst-performing amplicons. If you can't fit it all in, then you'll have to decide to do fewer samples or be OK with not all amplicons returning data.

    At low read depths, sampling probability rules. Let's say two alleles are present at a locus and they amplify equally well. At 10X read depth there is a (1/2)^10, or about 0.1%, chance of not sampling a given allele (not too bad). But let's say one allele is a slightly longer amplicon and the read balance is 7 to 3. Now there is a 0.7^10, or about 3%, chance of not getting a single read from the worse-performing allele. Now imagine you want at least 3 reads to call the allele... the chance that you won't achieve that is actually quite high (roughly 38% at 10X with that 7 to 3 balance).
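
    The numbers above can be checked with a small binomial calculation. This is only a sketch: it assumes each read is an independent draw with a fixed chance of coming from the minor allele, and the function name `prob_fewer_than` is made up for illustration.

```python
from math import comb

def prob_fewer_than(depth, allele_fraction, min_reads):
    """Probability of seeing fewer than `min_reads` reads of an allele.

    Assumes reads are independent draws, so the read count for the allele
    follows Binomial(depth, allele_fraction). This is a simplification;
    real amplicon data is usually noisier.
    """
    return sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_reads)
    )

# Balanced alleles at 10X: chance of zero reads for one allele, (1/2)^10
p_even_missed = prob_fewer_than(10, 0.5, 1)

# 7:3 read balance at 10X: chance of zero reads for the weaker allele, 0.7^10
p_skew_missed = prob_fewer_than(10, 0.3, 1)

# 7:3 read balance at 10X: chance of fewer than 3 reads for the weaker allele
p_skew_under_3 = prob_fewer_than(10, 0.3, 3)

print(f"{p_even_missed:.4f} {p_skew_missed:.4f} {p_skew_under_3:.4f}")
```

    Rerunning it with a larger `depth` shows how quickly the failure probability drops as coverage goes up.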

    I'd pick some number, like 20X depth, then add more for different reasons... let's say 50% of the library is off-target amplification, so double the reads needed. Now predict you have a 4-fold variation in read count between samples and you want good coverage of the low ones... multiply by 4. If there is a 10-fold variation in locus coverage, that's 10X more. Now it seems super high, but you can decide to drop the very worst loci and multiply by 5 instead of 10. Anyway, that's the process!
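
    That multiplication can be written out as a rough budget. Again just a sketch: the function names are made up, and the lane output and locus count in the example are placeholder numbers, not specs for any particular machine or project.

```python
def reads_per_locus_budget(base_depth=20, on_target_fraction=0.5,
                           sample_variation=4, locus_variation=10):
    """Average reads to budget per locus per individual.

    Starts from the depth wanted on the worst locus of the worst sample,
    then scales up for off-target reads and for uneven coverage across
    samples and loci. Defaults are the example figures from the post.
    """
    return base_depth / on_target_fraction * sample_variation * locus_variation

def max_individuals(total_reads, n_loci, per_locus_budget):
    """How many individuals fit in a run with the given read budget."""
    return int(total_reads // (n_loci * per_locus_budget))

full_budget = reads_per_locus_budget()                       # keep all loci
trimmed_budget = reads_per_locus_budget(locus_variation=5)   # drop the worst loci

# Placeholder inputs (hypothetical): 100 million usable reads, 20 loci
individuals = max_individuals(100_000_000, 20, trimmed_budget)

print(full_budget, trimmed_budget, individuals)
```

    With the example figures this gives 1600 reads per locus per individual, or 800 after dropping the worst loci, which is how you end up somewhere between the 13x and 2000x seen in the papers.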
    Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com
