SEQanswers

Old 03-09-2015, 01:24 AM   #1
priya
Member
 
Location: sweden

Join Date: Apr 2013
Posts: 57
Genome size and corrected genome size

Hi,
I came across the term "corrected genome size" while reading a paper. Is there any difference between genome size and corrected genome size? If so, what needs to be corrected in the genome size?
Old 03-09-2015, 03:51 AM   #2
amitm
Member
 
Location: Manchester, UK

Join Date: Feb 2011
Posts: 52

hi priya,
You haven't given any specific details. Was the term used in the context of NGS analysis? Was it referring to a reference genome or a de novo assembled one?
Old 03-09-2015, 04:00 AM   #3
priya

I came across this term in a paper describing the normalization of ChIP-seq reads.
To make it clearer, I have attached a screenshot of the relevant lines from the paper.
Attached Images
File Type: png article.png (140.9 KB, 13 views)

Last edited by priya; 03-09-2015 at 04:06 AM.
Old 03-09-2015, 04:06 AM   #4
amitm

hi,
It seems that the corrected genome size is the portion of the genome covered by all the ChIP-seq reads in that sample.
So, if sample A has 20M reads and they cover 2 Gb of hg19, then the corrected genome size is 2 Gb.
Old 03-09-2015, 05:01 AM   #5
priya

Hi amitm,
Thank you for your reply!

Could you please explain how to calculate the genome coverage from a sequencing experiment?
For the number of reads per sample, I can simply check the alignment logs (e.g. the Bowtie log files), which clearly report how many reads were mapped in each sample.
Old 03-09-2015, 07:42 AM   #6
amitm

hi,
Once you have mapped the reads, convert the resulting BAM file to a BED file with bedtools bamtobed:
http://bedtools.readthedocs.org/en/l.../bamtobed.html

The coordinates returned will overlap, so you need to merge them into "unique" regions/coordinate intervals with bedtools merge (the input has to be sorted by chromosome and start position first):
http://bedtools.readthedocs.org/en/l...ols/merge.html

Once that is done, add up the lengths of all the merged intervals. That sum is the portion of the genome covered, i.e. the corrected genome size.
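
To make the merge-and-sum idea concrete, here is a minimal Python sketch that does not call bedtools at all. It assumes BED-style coordinates (0-based start, end not included), and the function names and the four example reads are made up for illustration only. Like the default behaviour of bedtools merge, it also joins intervals that merely touch.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching (chrom, start, end) intervals per chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    merged = {}
    for chrom, spans in by_chrom.items():
        spans.sort()  # sort by start (then end) before sweeping
        out = [list(spans[0])]
        for start, end in spans[1:]:
            if start <= out[-1][1]:
                # Overlaps or touches the previous interval: extend it.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = [tuple(span) for span in out]
    return merged


def covered_bases(intervals):
    """Total number of bases covered by at least one interval."""
    merged = merge_intervals(intervals)
    return sum(end - start for spans in merged.values() for start, end in spans)


# Hypothetical reads: two overlapping on chr1, one isolated on chr1, one on chr2.
reads = [
    ("chr1", 100, 150),
    ("chr1", 120, 170),
    ("chr1", 300, 350),
    ("chr2", 0, 50),
]
print(covered_bases(reads))  # 170, whereas the four read lengths add up to 200
```

The difference between 170 and 200 is exactly the 30 bases shared by the two overlapping chr1 reads, which is why the merge has to happen before the lengths are added up.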
Old 03-09-2015, 08:14 AM   #7
priya

Hi amitm,
Thanks a lot for your clear explanation. I will try it out.
Old 03-25-2015, 02:13 PM   #8
AlexReynolds
Member
 
Location: Seattle, WA

Join Date: Feb 2013
Posts: 45

You can use BEDOPS bam2bed to convert from BAM to BED, pipe to bedops to merge overlapping elements, and pipe to bedmap to generate a list of lengths per merged element to sum with awk:

$ bam2bed < foo.bam | bedops --merge - | bedmap --echo-overlap-size - | awk '{s += $1;} END {print s;}' > answer.txt

In this case, bedmap is mapping merged elements against themselves. Merged elements coming out of bedops are guaranteed to be disjoint, so --echo-overlap-size is guaranteed to report the unique length of each merged element.
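
As a rough equivalent of the final summing step, the lengths of already-merged BED intervals could also be added up in Python. This is only a sketch: the function name and the three example lines are hypothetical, and it assumes the input is disjoint (for example the output of a merge step), since overlapping intervals would be counted twice.

```python
import io


def sum_bed_lengths(handle):
    """Sum end - start over the BED lines of a file-like object.

    BED coordinates are 0-based with an exclusive end, so the length of
    an interval is simply end - start.
    """
    total = 0
    for line in handle:
        fields = line.split()
        # Skip blank lines, comments and track/browser header lines.
        if len(fields) < 3 or line.startswith(("#", "track", "browser")):
            continue
        total += int(fields[2]) - int(fields[1])
    return total


# Hypothetical merged intervals, standing in for the output of a merge step.
merged_bed = io.StringIO("chr1\t100\t170\nchr1\t300\t350\nchr2\t0\t50\n")
print(sum_bed_lengths(merged_bed))  # 170
```

Passing sys.stdin instead of the StringIO object would let the function read from a pipe.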