SEQanswers

Go Back   SEQanswers > Applications Forums > Sample Prep / Library Generation



Old 02-27-2014, 03:26 PM   #1
atcghelix
Member
 
Location: CA

Join Date: Jul 2013
Posts: 74
Default ddRAD with frogs/large genomes

Hi Everyone,

We're trying to do some RAD sequencing on frogs that have large genomes (9-10 Gb). We've tried some samples with single-digest RAD using ApeKI (another lab was prepping some samples with ApeKI and threw some of our frogs in). We found almost no SNPs between samples, I believe because we had too many fragments and really low coverage. So we're now considering ddRAD with a rare-cutting enzyme.

Does anyone have any suggestions of restriction enzymes to use for amphibians with large genomes? Does anyone have advice for RAD with large-genome amphibians? The species is Rana boylii (and a few other closely related species), and we plan on doing size selection using either gel excision or SPRI beads. We want to put about 120 individuals on a HiSeq lane, so we could handle a relatively large number of fragments.

Thanks very much!
Old 02-27-2014, 03:37 PM   #2
JackieBadger
Senior Member
 
Location: Halifax, Nova Scotia

Join Date: Mar 2009
Posts: 381
Default

Without answering your question directly, I can tell you that 120 individuals in one lane (especially with such large genomes) is out of the question. With such a large genome, maybe 5-10 individuals, 20 at most. Plenty of knowledgeable RAD people on here, so I'm sure you will get help.
Old 02-27-2014, 08:55 PM   #3
SNPsaurus
Registered Vendor
 
Location: Eugene, OR

Join Date: May 2013
Posts: 422
Default

ApeKI is going to give fragment sizes of around 500 bp, and with reads off both ends (I'm guessing you were following the Cornell GBS protocol rather than the "single digest and shear" RAD protocol), you'd expect 40M different sites to be sampled in a 10 Gb genome, so you are right about low coverage and too many fragments. This isn't really a fault of GBS, as it was designed to do exactly that: lightly skim-sequence the possible sites and impute the missing genotypes from the many reference genomes available, which probably aren't available for your frog!
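As a sanity check, that ~40M figure can be reproduced with a quick back-of-envelope calculation. ApeKI recognizes GCWGC (W = A or T); assuming random base composition (a deliberate simplification, real genomes are not random), the expected site count falls right out:

```python
# Expected number of ApeKI cut sites in a 10 Gb genome, assuming
# random base composition (a rough sanity check, not a real digest).
genome_size = 10e9          # 10 Gb frog genome
p_site = (1/4)**4 * (1/2)   # GCWGC: four fixed bases, one 2-fold degenerate

cut_sites = genome_size * p_site   # expected recognition sites, ~19.5M
tag_sites = 2 * cut_sites          # each cut yields two sequenceable ends

print(f"cut sites: {cut_sites:,.0f}")
print(f"tag sites: {tag_sites:,.0f}")   # ~39M, matching the ~40M above
```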

I would be very cautious about using ddRAD without a Pippin to do the size selection. Let's say you use a combination of restriction enzymes to produce 50,000 fragments in a 50 bp size range. Even a 5 bp shift in the size selection will remove the 5,000 fragments at the bottom of the range and add 5,000 fragments at the top of the range. So it would be very difficult to get consistency between libraries. You could do all the samples in one library, if you never plan to compare the data to a future library. If the polymorphism rate is high between the species or samples, then the use of a frequently-cutting enzyme will cause a high rate of locus drop out (there will be dozens of sites one mutation away from becoming the frequent cutter in every fragment).
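The size-selection drift problem above is easy to see in a toy simulation. This sketch assumes 50,000 fragments spread uniformly over a 50 bp window (real fragment-length distributions are not uniform, so treat the numbers as illustrative):

```python
import random

random.seed(0)
# 50,000 ddRAD fragments, sizes uniform over a 50 bp window (toy model)
n_frags = 50_000
sizes = [random.uniform(300, 350) for _ in range(n_frags)]

def selected(lo, hi):
    """Indices of fragments recovered by a size cut from lo to hi bp."""
    return {i for i, s in enumerate(sizes) if lo <= s <= hi}

lib1 = selected(300, 350)   # library 1: nominal 300-350 bp cut
lib2 = selected(305, 355)   # library 2: the gel/bead cut drifts up by 5 bp

lost = lib1 - lib2          # loci in library 1 missing from library 2
print(f"shared loci: {len(lib1 & lib2):,}")
print(f"dropped by a 5 bp shift: {len(lost):,}")   # roughly 5,000 (~10%)
```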

I'm not sure what I would recommend. If you had access to a Pippin you could try to get the number of fragments down to 20,000 or so by taking a very tight size slice, in which case you probably could multiplex 100+ samples and get sufficient depth to call heterozygous loci. Even with the Pippin, at that tiny size range even a few bp shift would cause problems between libraries, so you would have to pool all the samples in one library and not expect to compare to future libraries.
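The depth arithmetic behind "100+ samples on 20,000 loci" looks roughly like this. The per-lane read yield is an assumption (HiSeq lanes of that era gave very roughly 150M reads); plug in your own number:

```python
# Rough depth budget for the tight-size-slice scenario above.
reads_per_lane = 150e6   # assumed HiSeq lane yield; adjust to your machine
n_samples = 120
n_loci = 20_000          # fragments surviving a very tight Pippin slice

depth = reads_per_lane / (n_samples * n_loci)
print(f"mean depth per locus per sample: ~{depth:.0f}x")
```

At ~60x mean depth per locus you have comfortable room to call heterozygotes even after uneven amplification eats into the budget.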

Using traditional RAD-Seq (digest and shear), you could use SbfI, isolate long RAD tags (sheared and ligated) and destroy many of the fragments with a combo of 4-cutters to reduce below the 300,000 read sites you would expect to get. My lab has done that and it worked well. But the use of 4-cutters would also cause many fragments to gain or lose 4-cutter sites in polymorphic samples as in the ddRAD above, causing locus drop-outs.
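The ~300,000 figure for SbfI follows from the same back-of-envelope logic as for ApeKI, again assuming random base composition. SbfI recognizes the 8-base site CCTGCAGG:

```python
# Expected SbfI (CCTGCAGG) tag sites in a 10 Gb genome, assuming
# random base composition.
genome_size = 10e9
p_site = (1/4)**8                 # eight fixed bases

cut_sites = genome_size * p_site  # ~153,000 recognition sites
tag_sites = 2 * cut_sites         # digest-and-shear RAD reads both flanks
print(f"tag sites: {tag_sites:,.0f}")   # ~305,000, matching the figure above
```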

If you just want to find SNPs, you could do any of these approaches on pools of individuals and identify high-frequency alleles after sequencing the pool to 100-300X. If you want individual data, you could sequence to lower read depth and downsample to a single allele (see "Genomic Evidence for Island Population Conversion Resolves Conflicting Theories of Polar Bear Evolution", http://www.plosgenetics.org/article/...l.pgen.1003345) and still calculate Tajima's D, identify admixture, phylogeography, etc. I'd suggest low coverage RAD (of some type) rather than low coverage whole genome shotgun, to simplify the informatics and force more overlap of sequence between samples.
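The single-allele downsampling idea amounts to: at each locus, instead of calling a genotype from thin coverage, randomly sample one observed allele. A minimal sketch (the `pileup` mapping here is hypothetical, standing in for whatever your variant caller or pileup parser produces):

```python
import random

random.seed(1)
# Hypothetical per-locus pileups: locus -> alleles observed across reads.
pileup = {
    "locus1": ["A", "A", "G"],   # het site covered by 3 reads
    "locus2": ["T"],             # covered by a single read
    "locus3": ["C", "C"],        # hom site covered by 2 reads
}

# Pseudo-haploid call: draw one allele per locus, ignoring depth entirely.
haploid_calls = {locus: random.choice(alleles)
                 for locus, alleles in pileup.items()}
print(haploid_calls)
```

Because every sample contributes exactly one allele per locus, downstream allele-frequency statistics stay unbiased by depth differences between samples.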

But spending more would ease some of the pressure that your population size and budget put on those parameters!
__________________
Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com