SEQanswers

12-07-2011, 05:45 PM   #1
smandape1
Junior Member
 
Location: Indianapolis

Join Date: Oct 2011
Posts: 1
Finding genome coverage using random reads

Thank you for looking at my question. I am trying to solve this homework question.

Consider the problem of sequencing a genome by random reads. If G is the length of the entire sequence, L is the length of a read, and n is the number of reads, then coverage is defined as nL/G. Now, if we want 50% of the original long sequence to be covered by at least one fragment, how much coverage do we need?

I read about the Lander-Waterman model (http://www.genetics.wustl.edu/bio548...005/Lander.htm) to understand the concept, but I didn't quite get how to solve this problem. My idea was to treat the given 50% as the probability, set y to 1 (the y from the Poisson distribution), and solve for lambda (that is, the coverage). But I don't think I am on the right track. I chose y = 1 because the question asks for 50% of the original long sequence to be covered by at least one fragment, which means that those bases are sequenced at least once.

I may be wrong.

Experts, can you guide me, please?

Thank you.
12-07-2011, 10:32 PM   #2
Heisman
Senior Member
 
Location: St. Louis

Join Date: Dec 2010
Posts: 535

I think you are on the right track, actually. Search for the following sentence in the link you provided and you'll see it is doing what you describe here:

"Given 10x coverage, if we want to calculate the probability that a base is sequenced three times, we just need to substitute 3 for y and 10 for C into the formula:"

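The formula itself did not come through in the quote above. Under the Poisson model used in the Lander-Waterman setting, the probability that a base is sequenced exactly y times at coverage C is P(Y = y) = C^y e^(-C) / y!. Below is a minimal Python sketch of that calculation, assuming the Poisson model holds; the function names are made up for illustration and do not come from the linked page. One detail worth noting: "covered by at least one fragment" corresponds to 1 - P(Y = 0), not to P(Y = 1), so the fraction of bases covered is 1 - e^(-C).

```python
import math


def poisson_pmf(y, c):
    """Probability that a base is sequenced exactly y times at coverage c."""
    return c ** y * math.exp(-c) / math.factorial(y)


def fraction_covered(c):
    """Probability that a base is sequenced at least once: 1 - P(Y = 0)."""
    return 1.0 - poisson_pmf(0, c)


def coverage_for_fraction(f):
    """Coverage c at which a fraction f of bases is covered at least once.

    Solves 1 - exp(-c) = f for c, which gives c = -ln(1 - f).
    """
    if not 0.0 <= f < 1.0:
        raise ValueError("f must be in [0, 1)")
    return -math.log(1.0 - f)


# The example from the quoted sentence: y = 3 at 10x coverage (about 0.0076).
print(poisson_pmf(3, 10))

# Coverage needed so that half of the bases are covered at least once
# (ln 2, about 0.693).
print(coverage_for_fraction(0.5))
```

In terms of the original definition, that coverage would translate to roughly n = 0.693 * G / L reads, again only under the random-placement assumption.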

Also, you won't get much homework help here as this forum is not designed for that purpose. Your question is actually a very basic statistics question and would be more properly asked at some sort of math forum.

Also, in real life the coverage may not be close to being randomly distributed due to varying GC content affecting coverage to different degrees (depending on the sequencing method and preparation).
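For the idealized case the homework question assumes (reads placed uniformly at random, no GC bias), a small simulation can be used to sanity-check the Poisson estimate. This is only a sketch under stated assumptions: the genome is treated as circular to avoid edge effects, the genome and read lengths are arbitrary illustrative values, and `simulate_covered_fraction` is a hypothetical helper name. It says nothing about how real, biased data would behave.

```python
import math
import random


def simulate_covered_fraction(genome_len, read_len, n_reads, seed=0):
    """Place n_reads uniformly at random on a circular genome and
    return the fraction of bases covered by at least one read."""
    rng = random.Random(seed)
    covered = bytearray(genome_len)  # 0 = uncovered, 1 = covered
    for _ in range(n_reads):
        start = rng.randrange(genome_len)
        end = start + read_len
        if end <= genome_len:
            covered[start:end] = b"\x01" * read_len
        else:
            # Read runs past the end: wrap around to the beginning.
            covered[start:] = b"\x01" * (genome_len - start)
            covered[:end - genome_len] = b"\x01" * (end - genome_len)
    return sum(covered) / genome_len


G = 1_000_000                       # genome length (illustrative)
L = 100                             # read length (illustrative)
n = round(math.log(2) * G / L)      # number of reads giving coverage of about ln 2

# Under the random-placement assumption this should come out close to 0.5.
print(simulate_covered_fraction(G, L, n))
```

With GC-dependent bias, read starts would no longer be uniform, and the covered fraction at the same nominal coverage would generally be lower than this idealized figure.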
Tags
algorithms, bioinformatics, coverage calculation
