SEQanswers

Old 02-10-2011, 01:27 AM   #1
henry.wood
Member
 
Location: Leeds, UK

Join Date: Apr 2010
Posts: 63
Default Why 30X

Hello
I'm in the process of organising sequencing of a number of tumours and matched blood samples, and I have two related questions regarding coverage. I often hear of 30X coverage as if it is some kind of magic number, but I'm curious to know where it has come from. Is it realistic to ask for 20X for my blood samples, since I am only interested in them as a control for my tumours? Or is it some kind of logarithmic scale, whereby 20X will only get me 10% of the information but spending lots of money on 40X will only add another 2%? My second question is about the tumours, which are likely to be a mixed population of cells. It seems to me that the heterogeneous nature of the samples will mean 30X will not be enough, but what will be enough? Is 50X OK, or 60X if it's very mixed? I realise that these questions are a bit open-ended and that the answers will depend on what I want to do with the data, but I would appreciate any feedback or pointers to papers where this has been calculated. We're doing it on a HiSeq, in case error rates come into the calculations.
Cheers
Henry
Old 02-10-2011, 02:10 AM   #2
Geneus
Member
 
Location: New Jersey

Join Date: Dec 2010
Posts: 61
Default

I too would love to see answers to these same questions and several others. Is there a point of diminishing return on depth of coverage for cancer samples in detecting mutations? Can there also be a point at which your coverage is so high that the mutations one is detecting could actually be artifacts? Does the type of cancer (and whether the sample is from a cell line or actual tumor sample) have any bearing on the coverage one would use when sequencing for mutation detection?
Old 02-10-2011, 04:51 AM   #3
krobison
Senior Member
 
Location: Boston area

Join Date: Nov 2007
Posts: 747
Default

I've wondered about these too. One approach to answer it would be to take a genome at much higher coverage & then randomly sample the reads from the BAM file at different coverages & re-call variants. Obviously a bit compute intensive, but perhaps well worth it.
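
The downsampling idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the 60X starting depth, the target depths, the seed and the synthetic read names are all made up, and the helper names `sampling_fraction` and `keep_read` are hypothetical. The keep/drop decision is taken from a hash of the read name, so both mates of a pair would share the same fate; the variant re-calling step is not shown.

```python
import hashlib

def sampling_fraction(current_depth, target_depth):
    """Fraction of reads to keep to go from the current to the target mean depth."""
    if target_depth > current_depth:
        raise ValueError("cannot downsample to a higher depth")
    return target_depth / current_depth

def keep_read(read_name, fraction, seed=0):
    """Decide from the read name alone, so both mates of a pair are kept or dropped together."""
    digest = hashlib.md5(f"{seed}:{read_name}".encode()).digest()
    # Map the first 6 bytes of the hash to a number in [0, 1).
    u = int.from_bytes(digest[:6], "big") / 2**48
    return u < fraction

# Hypothetical read names standing in for the records of a deep BAM file.
names = [f"read{i}" for i in range(100_000)]
for target in (10, 20, 30, 40):
    frac = sampling_fraction(60, target)
    kept = sum(keep_read(name, frac, seed=42) for name in names)
    print(f"{target}X: keep fraction {frac:.3f}, kept {kept} of {len(names)} reads")
```

In practice a tool such as `samtools view -s` can do this kind of subsampling directly on the BAM file; each subsample would then go through the same variant caller, and the calls would be compared with those from the full-depth data.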

I would guess sample purity is the main determinant for tumors. Once you have some tumor-specific mutations you can start estimating purity. Of course, the tumor itself is heterogeneous. This complicates things further. So one probably needs to state a goal clearly ("have 95% chance of detecting a mutation present in 80% of the tumor in a sample which is 55% tumor") and then compute out target coverage from there.
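
A goal like the one above can be turned into a depth estimate with a simple binomial model. The sketch below makes several assumptions that are not from this thread: the mutation is heterozygous in a diploid region with no copy-number change, sequencing errors are ignored, depth at the site is fixed, and a variant counts as "detected" once a chosen minimum number of reads carry it. The function names and the thresholds 3, 5 and 8 are illustrative only.

```python
from math import comb

def expected_vaf(purity, clonal_fraction, alt_copies=1, ploidy=2):
    """Expected fraction of reads carrying the mutation at a diploid locus."""
    return purity * clonal_fraction * alt_copies / ploidy

def detection_probability(depth, vaf, min_alt_reads):
    """P(at least min_alt_reads variant reads) under Binomial(depth, vaf)."""
    p_below = sum(comb(depth, k) * vaf**k * (1 - vaf)**(depth - k)
                  for k in range(min_alt_reads))
    return 1.0 - p_below

def depth_for_power(vaf, min_alt_reads, power=0.95, max_depth=1000):
    """Smallest per-site depth at which the detection probability reaches `power`."""
    for depth in range(min_alt_reads, max_depth + 1):
        if detection_probability(depth, vaf, min_alt_reads) >= power:
            return depth
    return None  # not reachable within max_depth

# 55% tumour in the sample, mutation present in 80% of the tumour cells.
vaf = expected_vaf(purity=0.55, clonal_fraction=0.80)
print(f"expected variant allele fraction: {vaf:.2f}")
for k in (3, 5, 8):
    print(f"need >= {k} variant reads: per-site depth {depth_for_power(vaf, k)}")
```

The result is the depth needed at a given site, not the mean coverage to order; because depth varies across the genome, the mean would have to be somewhat higher to reach that depth at most sites.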

Some tumors will be more heterogeneous than others; a first guess is that those with higher environmental mutation load OR those which have more DNA repair defects will be more heterogeneous.
Old 02-10-2011, 06:16 AM   #4
Geneus
Member
 
Location: New Jersey

Join Date: Dec 2010
Posts: 61
Default

Quote:
Originally Posted by krobison View Post
One approach to answer it would be to take a genome at much higher coverage & then randomly sample the reads from the BAM file at different coverages & re-call variants. Obviously a bit compute intensive, but perhaps well worth it.
That is the approach someone else had suggested to me. So an N=2 makes me feel better about doing just that.
Old 02-10-2011, 06:33 AM   #5
henry.wood
Member
 
Location: Leeds, UK

Join Date: Apr 2010
Posts: 63
Default

That seems a pretty sensible idea to me too. Whether I can get it to work by Monday when I send my samples off is another thing. I think my suppliers are quite flexible in letting me do 20-30X now and then spend more money later to bump it up if need be.
PS, what time do you get up, Geneus? I'm in Europe so I'm supposed to be up and about at that time, but I make your first reply about 6.15am in your part of the world.
Old 02-10-2011, 07:23 AM   #6
Geneus
Member
 
Location: New Jersey

Join Date: Dec 2010
Posts: 61
Default

Quote:
Originally Posted by henry.wood View Post
PS, what time do you get up, Geneus?
Henry...let's just say early...and leave it at that.
Old 02-10-2011, 10:09 AM   #7
kopi-o
Senior Member
 
Location: Stockholm, Sweden

Join Date: Feb 2008
Posts: 319
Default

Quote:
I've wondered about these too. One approach to answer it would be to take a genome at much higher coverage & then randomly sample the reads from the BAM file at different coverages & re-call variants. Obviously a bit compute intensive, but perhaps well worth it.
What a coincidence (or maybe not) - I just started to consider doing this today (before reading this thread). I'll report back if I ever get around to it.
Old 02-10-2011, 10:19 AM   #8
anoopmandaher
Junior Member
 
Location: California

Join Date: Nov 2010
Posts: 5
Default

For 'why 30X', check out the following URL and corresponding reference. There's certainly more to say on the topic, but I thought I'd shoot off this quick reply first:

http://www.nature.com/nature/journal...e07517_F5.html
Bentley et al (2008) Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53-59
doi:10.1038/nature07517
Old 02-10-2011, 12:43 PM   #9
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by krobison View Post
I've wondered about these too. One approach to answer it would be to take a genome at much higher coverage & then randomly sample the reads from the BAM file at different coverages & re-call variants. Obviously a bit compute intensive, but perhaps well worth it.
One of the first WGS papers out of BGI did exactly this in the supplemental materials.
Old 02-10-2011, 05:16 PM   #10
Geneus
Member
 
Location: New Jersey

Join Date: Dec 2010
Posts: 61
Default

Quote:
Originally Posted by nilshomer View Post
One of the first WGS papers out of BGI did exactly this in the supplemental materials.
Do you have the details of which journal this appeared in?

Thanks.
Old 02-10-2011, 08:49 PM   #11
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by Geneus View Post
Do you have the details of which journal this appeared in?

Thanks.
Hopefully you tried searching yourself, but see the section "Depth effect..." in http://www.nature.com/nature/journal/v456/n7218/full/nature07484.html.
Old 02-11-2011, 03:42 AM   #12
Geneus
Member
 
Location: New Jersey

Join Date: Dec 2010
Posts: 61
Default

Quote:
Originally Posted by nilshomer View Post
Hopefully you tried searching yourself
Thank you very much.

By the way, hope is not a strategy.
Old 02-11-2011, 05:55 AM   #13
henry.wood
Member
 
Location: Leeds, UK

Join Date: Apr 2010
Posts: 63
Default

Thanks all for the input. It seems that I can save a bit of money on sequencing my control samples, since the extra 10X of coverage going from 20X to 30X only catches an extra 1-2% of the SNPs. This will let me spend more on my tumours, which is what I'm really interested in. If it ever sees the light of day, I'll try to remember to mention the forum in the acknowledgments, even if it's just to get on the list of the "Greatest papers in the world".
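
The shrinking return from extra depth can also be illustrated with an idealised model. The sketch below assumes per-site depth follows a Poisson distribution around the mean coverage, which is an assumption made here and not something taken from the papers linked in this thread; real libraries tend to be less uniform, so real gains from extra depth may be larger. The function name and the per-site thresholds of 8, 12 and 15 reads are illustrative.

```python
from math import exp

def fraction_covered(mean_depth, min_depth):
    """P(depth >= min_depth) when per-site depth ~ Poisson(mean_depth)."""
    term = exp(-mean_depth)  # P(depth = 0)
    below = 0.0
    for k in range(min_depth):
        below += term
        term *= mean_depth / (k + 1)  # P(depth = k + 1) from P(depth = k)
    return 1.0 - below

for mean_depth in (10, 20, 30, 40):
    row = ", ".join(f">={k} reads: {fraction_covered(mean_depth, k):.4f}"
                    for k in (8, 12, 15))
    print(f"{mean_depth}X  {row}")
```

Under this model each further 10X adds less than the previous one, which matches the general shape of the argument for spending less on the controls.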