
**JackieBadger** · 07-24-2013, 02:07 PM

"Does it make sense to genotype all (1000+) of our samples using RAD (or ddRAD?)"
Not only does it not make sense, but it will be VERY expensive!

You can get a good estimate of genome wide diversity in a population with 20-30 individuals

"or should we use RADseq on a subset of individuals (96 individuals total from “pure” populations?), ID informative SNPs, and then screen the rest of our samples using some genotyping assay (e.g. Sequenom)?"

This is counter to the rationale behind RADseq. RADseq negates the need for costly, time-consuming SNP assays. RADseq is genotyping by sequencing. You ID the SNPs and genotype at the same time. Assays not needed.

"Many projects seem to go this direction, but the problem I see is that it requires that there be enough flanking sequence (to the SNP) to develop oligos for Sequenom. "

Unless you want to develop an assay that will be used a lot, it's not worth it. You need to think carefully about the costs of each. For what you want to do (detect introgression between species) you could probably do this with a minimum of 15 samples from each species. We are!

I would advise doing RAD on a fraction of your samples, and then looking at neutral markers (micro-sats, mtDNA) in the remainder to get the whole picture.

**TKC** · 07-25-2013, 08:07 AM

Hi,

Thank you for the response, I'll take all the help I can get!

"You can get a good estimate of genome wide diversity in a population with 20-30 individuals"
-I guess I should clarify, this is a species complex for which we have samples representing multiple populations per species, and the 1000+ samples is 20-30 individuals per population per species. So we were looking for the most affordable way of looking at every population (as work with msats and mtDNA has already shown that they should be managed as distinct units)... We also know from previous work that hybridization is likely much more/less extensive in some populations than others.

As to cost, what would be a good estimate (rough, ballpark, etc) of cost per individual for RADseq? Would you recommend attempting the library prep in house, or farming that out with the sequencing? We have talked with Floragenex to some extent, and still don't have any concrete quote for total cost per individual...

"This is counter to the rationale behind RADseq. RADseq negates the need for costly, time-consuming SNP assays. RADseq is genotyping by sequencing. You ID the SNPs and genotype at the same time. Assays not needed."
-I'm not 100% sure how much RADseq will cost us per individual, but Sequenom should cost us (after some start up costs) $4-5 per ~30 SNPs per individual (quoted from a commercial outfit- does anyone else know better??). So assuming we need 150 SNPs to have good diagnostic power that should cost us $20-25 per individual to assay. Since RADseq will give us more information than we really need, and if Sequenom is cheaper, it made sense to me that we could identify the SNPs (via RAD) that are useful for diagnosing hybrids, and then run the rest of our samples through the assay for genotyping.
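The Sequenom arithmetic above can be sketched roughly as follows. The function name and parameters are hypothetical, the figures are the ones quoted in this post ($4-5 per ~30-SNP panel, 150 SNPs), and start-up costs are left out:

```python
import math

def sequenom_cost_per_individual(n_snps, snps_per_panel=30, cost_per_panel=(4.0, 5.0)):
    """Return the (low, high) assay cost per individual, excluding start-up costs.

    Assumes SNPs are run in panels of ~30 and each panel is billed
    at the quoted $4-5 range, whether or not it is completely filled.
    """
    panels = math.ceil(n_snps / snps_per_panel)
    low, high = cost_per_panel
    return panels * low, panels * high

# 150 diagnostic SNPs -> 5 panels per individual
print(sequenom_cost_per_individual(150))  # (20.0, 25.0)
```

Multiplying by the number of individuals gives the assay total, which can then be compared against whatever per-sample RADseq quote comes back.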

Again, thanks for taking the time to respond!

**SNPsaurus** · 07-25-2013, 09:18 AM

Hi TKC,

In the "old" days of a few years ago, lots of people did what you suggest here: use RAD-Seq to identify SNPs in a subset of a population, then convert a subset of those SNPs to a high-throughput genotyping platform. People used RAD PE contigs to get 300-500 bp, or overlap PE. And as you mention, a MiSeq run would now also serve the same purpose.

But as sequencing costs drop, the population size where that strategy is appropriate keeps getting higher. It is an investment to set up the genotyping, and you are then dealing with only previously known SNPs, so you lose the ability to discern new alleles that may be of interest.

It sounds like the first question is if you need to get information on all 1000 individuals. When someone contacts SNPsaurus or my lab about a genotyping project (disclosure: my academic lab developed RAD-Seq, I have equity in Floragenex which offers RAD-Seq, I founded SNPsaurus which offers nextRAD), we ask how many markers are needed, do you need perfect information at each locus (this is a little tricky, some applications such as a genetic map prefer to have high quality genotypes that reliably call each allele of a heterozygote, others want good quality calls but missing alleles are OK), and what is the SNP rate in the population if known (i.e. how many sequenced loci will have a SNP in the population).

From that, you can design the experiment. In your case, since cost is a factor for a large population, if you can get away with it you'd hope to get by with fewer markers and lower coverage. So a project assaying 30,000 tags per genome (10,000 markers if 1/3 of tags have SNPs) at 5x coverage (you will have good quality calls but miss a portion of heterozygous alleles) can fit >600 samples per lane. Usually the project is constrained by index availability at that kind of multiplexing (for nextRAD we use dual indexing and typically multiplex at 192 samples per lane), in which case you need 6 lanes of sequencing, and will get higher coverage than planned because you aren't multiplexing as much as possible.

For this kind of low coverage sequencing, the library cost will dominate, then. Most people peg the cost of materials at $15-20 per sample. I think the biggest unplanned-for cost is labor (again, I'm an outsourcing service provider so I'll make that argument, but in my academic lab we help people with RAD projects and sometimes it drags on for months with run failure after run failure, so we see the ugly side as well!).

Oops, I went back and saw your 20X coverage... it actually still fits in 6 lanes, so the project would be the same.
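The lane count can be sanity-checked with a quick sketch. The lane yield of 150 million usable reads is an assumption (it is not stated in the thread and varies by instrument and run), as are the function names and the round figure of 1000 samples. The sketch also assumes every read lands on a tag and that reads are spread evenly across samples and loci:

```python
import math

READS_PER_LANE = 150_000_000  # ASSUMPTION: usable reads per lane, not from the thread

def samples_per_lane(tags, coverage, reads_per_lane=READS_PER_LANE):
    """How many samples fit in one lane if limited only by read depth."""
    return reads_per_lane // (tags * coverage)

def lanes_needed(n_samples, tags, coverage, max_multiplex=192,
                 reads_per_lane=READS_PER_LANE):
    """Lanes required when multiplexing is capped by available indexes."""
    per_lane = min(samples_per_lane(tags, coverage, reads_per_lane), max_multiplex)
    if per_lane < 1:
        raise ValueError("one sample needs more reads than a lane provides")
    return math.ceil(n_samples / per_lane)

for cov in (5, 20):
    print(cov, samples_per_lane(30_000, cov), lanes_needed(1000, 30_000, cov))
# 5 1000 6
# 20 250 6
```

Under that assumed yield, the 192-sample index cap is the binding limit at both 5x and 20x, which is why the lane count does not change.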

**JackieBadger** · 07-25-2013, 09:44 AM

5x coverage is way too low in my opinion. 20-30x minimum.

If cost is no issue, go with one of the Oregon providers (they ain't cheap!..but will deliver SNPs hassle free).... if you do not have 10's/100's of thousands of dollars to spend, I would suggest collaborating with a lab which has expertise.

**SNPsaurus** · 07-25-2013, 12:17 PM

JackieBadger, 5X may be low, but it really depends on the application. GBS (the Elshire method) is designed to sample many loci at sub-1X coverage, for example. If TKC just wants to see if introgression is happening, and there are hundreds of strain-specific SNPs, then having some missing data won't be a problem. But, you are right that it is better to be conservative about it!
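To put a rough number on the coverage trade-off discussed here: at a heterozygous site, the chance that every read comes from the same allele (so the site looks homozygous) is 2 × 0.5^depth. This is a simplified sketch that assumes both alleles are sampled with equal probability, no sequencing error, and exactly `depth` reads at the locus. Real depth scatters around the mean, so the miss rate at "5x average" will be somewhat higher. The function name is hypothetical:

```python
def prob_miss_het_allele(depth):
    """Chance that all `depth` reads at a heterozygous site show only one allele.

    Simplified model: each read is an independent 50/50 draw between
    the two alleles, with no sequencing or mapping error.
    """
    if depth < 1:
        return 1.0  # no reads at all, so the heterozygote cannot be seen
    return 2 * 0.5 ** depth

for depth in (1, 5, 10, 20):
    print(depth, prob_miss_het_allele(depth))
```

At exactly 5 reads this gives 0.0625, i.e. about 6% of heterozygous calls would be miscalled as homozygous, while at 20 reads the figure is around two in a million. Whether the 5x figure matters depends on the application, as both posters note.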

Novice help: RADseq strategy for population level study

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News