
  • 50bp SE vs 125bp PE RADseq study design

    Hi everyone,
    I am preparing to do a ddRADseq study using a HiSeq 2000 to 1) look for local adaptation to an environmental gradient and 2) look for population structuring at neutral loci. My species (a frog) doesn't have a reference genome and is not very closely related to annotated Xenopus. Despite this, being able to align to the Xenopus genome is a goal.

    Given my budget and the setup of the sequencing facility that I use, I have to choose between sequencing 480 individuals using 50bp SE reads or 384 individuals using 125bp PE reads. I will be running 96 individuals as a pilot study. If 50bp SE reads are sufficient to address my question, I'd rather use that approach so I can include more individuals/populations in the study.

    My question: if I use 125bp PE reads in the pilot study, is it possible to downsample them to simulate what the results would have looked like with 50bp SE reads?

    I appreciate any advice, including anything general concerning read length and PE vs. SE for this type of study. Also, please let me know if more information would be helpful. Thanks!

  • #2
    Yes, it'd be straightforward to downsample to 50bp reads (for instance, using skewer with the flag "-L 50" to trim the R1s down to 50bp and just ignoring the R2s).
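
    For anyone who would rather not depend on a particular trimmer, the same idea can be sketched in plain Python. This is only an illustration, assuming standard 4-line FASTQ records; the function name `truncate_fastq` and the demo read are made up for the example.

```python
import io


def truncate_fastq(src, dst, length=50):
    """Copy 4-line FASTQ records from src to dst, keeping only the
    first `length` bases (and quality values) of each read.
    Returns the number of records written."""
    n = 0
    while True:
        header = src.readline()
        if not header:
            break  # end of file
        seq = src.readline().rstrip("\n")
        plus = src.readline()
        qual = src.readline().rstrip("\n")
        dst.write(header)
        dst.write(seq[:length] + "\n")
        dst.write(plus)
        dst.write(qual[:length] + "\n")
        n += 1
    return n


# Tiny in-memory demo: one hypothetical 120bp R1 read cut down to 50bp.
demo_in = io.StringIO("@read1\n" + "ACGT" * 30 + "\n+\n" + "I" * 120 + "\n")
demo_out = io.StringIO()
n_records = truncate_fastq(demo_in, demo_out, length=50)
trimmed = demo_out.getvalue().splitlines()
```

    With real data you would open the R1 file with `gzip.open(path, "rt")` instead of the in-memory buffers, and simply leave the R2 file out of the analysis.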

    Do you have any expectations regarding the level of polymorphism in your populations? For many of your pop gen analyses you may be sampling a single SNP from each RAD locus to try to account for linkage, so if the polymorphism rate is high you could be better served by having more individuals. If polymorphism is low, you might benefit from longer RAD loci, which give you a greater chance of recovering a polymorphic site within a locus.
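
    To make the "single SNP from each RAD locus" step concrete, here is a minimal sketch. It assumes the SNP calls are already available as (locus ID, position) pairs; the function name and the toy data are hypothetical, and real pipelines usually expose their own option for this.

```python
import random
from collections import defaultdict


def one_snp_per_locus(snps, seed=0):
    """Given (locus_id, position) pairs, return a dict mapping each
    locus to one randomly chosen SNP position. Loci without any SNP
    never appear in the input, so they contribute nothing."""
    by_locus = defaultdict(list)
    for locus_id, pos in snps:
        by_locus[locus_id].append(pos)
    rng = random.Random(seed)  # fixed seed so the choice is repeatable
    return {
        locus: rng.choice(sorted(positions))
        for locus, positions in sorted(by_locus.items())
    }


# Toy input: three loci with 2, 1 and 3 SNPs respectively.
snps = [
    ("locus1", 12), ("locus1", 40),
    ("locus2", 7),
    ("locus3", 3), ("locus3", 88), ("locus3", 95),
]
chosen = one_snp_per_locus(snps)
```

    However many SNPs a locus carries, it ends up contributing one marker, which is why the number of variable loci matters more than the raw SNP count for these analyses.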

    The last time I looked, most of the off-the-shelf RADseq data analysis pipelines didn't perfectly account for the linkage between R1s and R2s from paired end reads and, for instance, treated them as separate loci. It's possible that has changed now. Another approach is to size select your fragments so that the read pairs overlap (meaning inserts shorter than about 240bp in your case) and to merge each pair into a single longer read before analysis. If you can map to Xenopus then this won't be an issue, because you likely won't be using the RAD pipelines. But I think you'll likely get a very poor mapping rate to Xenopus--that's been our experience with Rana.
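
    The "about 240bp" figure follows from simple arithmetic: two 125bp reads cover 250bp, and the pair needs some minimum overlap to be merged. A small sketch, where the 10bp minimum overlap is an assumption for illustration (read-merging tools use different defaults):

```python
def max_mergeable_insert(read_len, min_overlap):
    """Longest insert whose two reads still overlap by at least
    `min_overlap` bases, assuming both reads are `read_len` long."""
    return 2 * read_len - min_overlap


def can_merge(insert_len, read_len=125, min_overlap=10):
    """True if a pair from an insert of this length overlaps enough
    to be merged under the assumed minimum overlap."""
    return insert_len <= max_mergeable_insert(read_len, min_overlap)


limit_125pe = max_mergeable_insert(125, 10)  # 2 * 125 - 10 = 240
```

    Inserts longer than this limit leave a gap or too short an overlap between R1 and R2, so those pairs cannot be merged into a single longer read.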
    Last edited by atcghelix; 09-14-2016, 09:46 AM.



    • #3
      Thanks for the reply, atcghelix. I appreciate the tip on using skewer to perform the downsampling.

      Unfortunately, I don't have any expectations for polymorphism because I can't find any prior sequencing work on my study species. Your explanation of why it would be helpful makes a lot of sense though. I'm not sure how much polymorphism rates generally differ between genera, but with my species in Lithobates, I'd be grateful for any insights your work on Rana might provide.

      I'll have to look into how various pipelines deal with linkage in PE data. Our lab isn't set up for precise size selection so I don't think overlapping the reads would work, which makes me lean more toward the 50bp SE reads for the pilot study.



      • #4
        Your expectations for polymorphism will likely depend on the spatial scale of your study. We found that roughly 70% of our RAD loci contained at least one variable site (and 30% were invariant) across ~600 miles of habitat for about 90 individuals for one species. If we had instead genotyped 90 individuals from a small area we'd expect a much higher percentage of invariant loci. I think we saw around 0.2% of sites being polymorphic within individuals.

        Of course, there are a lot of things that can influence that (for instance, if you're working in a region that is all post-glacial recolonization).

        We were using 100bp SE reads. If we had used 50bp reads, we'd have had more invariant loci. I think we'd expect something like halving the number of variable loci for 50bp SE.
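
        As a rough back-of-envelope check on that estimate (not taken from the data described above), one can assume every site in a locus is variable independently with the same probability p, so that a locus of length L is variable with probability 1 - (1 - p)^L. The function names below are made up for the sketch, and the model ignores barcodes, cut sites, and the clustering of variable sites.

```python
def per_site_rate(frac, locus_len):
    """Per-site probability p such that a locus of `locus_len` sites
    is variable with probability `frac`, i.e.
    1 - (1 - p) ** locus_len == frac."""
    return 1.0 - (1.0 - frac) ** (1.0 / locus_len)


def frac_variable(p, locus_len):
    """Expected fraction of loci with at least one variable site."""
    return 1.0 - (1.0 - p) ** locus_len


# Calibrate on the figure quoted above: ~70% of 100bp loci variable.
p = per_site_rate(0.70, 100)
f50 = frac_variable(p, 50)  # expected fraction for 50bp loci
ratio = f50 / 0.70          # variable loci at 50bp relative to 100bp
```

        Under these assumptions about 45% of 50bp loci would be variable, roughly two thirds of the 100bp number, so the loss is in the same ballpark as "something like halving" but somewhat milder. Real data can deviate in either direction, which is a good reason to check it empirically in the pilot.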



        • #5
          Of course, something like 100bp SE reads could be an intermediate solution.
