  • Genome size and corrected genome size

    Hi,
    I came across the term "corrected genome size" while reading a paper. Is there any difference between genome size and corrected genome size? If so, what is being corrected?

  • #2
    hi priya,
    You haven't given any specific details. Was the term used in the context of NGS analysis? Was it referring to a reference genome or a de novo assembled one?


    • #3
      I came across this term in a paper describing normalization of ChIP-seq reads.
      To make this clearer, I have attached a screenshot of the relevant lines from the paper.
      Last edited by priya; 03-09-2015, 03:06 AM.


      • #4
        hi,
        It seems that the corrected genome size is the portion of the genome covered by all the ChIP-seq reads in that sample.
        So, if sample A has 20M reads and they cover 2 Gb of hg19, then the corrected genome size is 2 Gb.
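As a rough sanity check on the example above: the covered area can never exceed the number of reads times the read length, and it only reaches that bound if no two reads overlap. The sketch below assumes a read length of 100 bp, which is a hypothetical value and not something stated in this thread.

```shell
#!/bin/sh
# Upper bound on the covered area: reads x read length.
reads=20000000    # 20M reads, as in the example above
read_len=100      # ASSUMPTION: hypothetical read length in bp

max_covered_bp=$((reads * read_len))
echo "$max_covered_bp"
```

Under that assumption the bound works out to 2 Gb, so a real sample in which reads pile up on the same regions would be expected to cover less than this.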


        • #5
          Hi amitm,
          Thank you for your reply!

          Can you please clarify how to calculate the genome coverage from a sequencing experiment?
          For the number of reads per sample, I can easily check the alignment logs (e.g. Bowtie log files), which clearly state the number of reads mapped per sample.


          • #6
            hi,
            Once you have mapped the reads, use the resulting BAM file to create a BED file.
            Use bedtools -


            The coordinates returned will overlap, so you need to merge them to create "unique" regions/coordinate intervals.
            Use -


            Once that is done, add up the lengths of all intervals; that total is the portion of the genome covered, i.e. the corrected genome size.
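The merge-and-sum part of the steps above can be sketched with nothing but sort and awk. This is only an illustration and not the commands from the post, which are not shown here. It assumes tab-delimited BED input with chrom, start and end in the first three columns; the function name `covered_bp` and the toy intervals are made up for the example. With bedtools, the corresponding subcommands would presumably be `bedtools bamtobed` and `bedtools merge` (which expects sorted input), but that is my assumption.

```shell
#!/bin/sh
# covered_bp: read BED intervals (chrom, start, end; tab-delimited) on stdin
# and print the number of bases covered by at least one interval.
covered_bp() {
    sort -k1,1 -k2,2n | awk '
        # First line, new chromosome, or a gap: close the running interval
        # and start a new one.
        NR == 1 || $1 != chrom || $2 + 0 > end {
            if (NR > 1) total += end - start
            chrom = $1; start = $2 + 0; end = $3 + 0
            next
        }
        # Otherwise this interval overlaps (or abuts) the running one: extend it.
        $3 + 0 > end { end = $3 + 0 }
        END {
            if (NR > 0) total += end - start
            # %.0f avoids scientific notation for genome-sized totals.
            printf "%.0f\n", total
        }'
}

# Toy input (hypothetical): two overlapping reads on chr1, one read on chr2.
printf 'chr1\t100\t200\nchr1\t150\t250\nchr2\t0\t50\n' | covered_bp
```

For the toy input the naive sum of interval lengths is 250, while the merged total is 200, because the 50 bp shared by the two chr1 reads is counted only once.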


            • #7
              Hi amitm,
              Thanks a lot for your clear explanation. I will try it out.


              • #8
                You can use BEDOPS bam2bed to convert from BAM to BED, pipe to bedops to merge overlapping elements, and pipe to bedmap to generate a list of lengths per merged element to sum with awk:

                $ bam2bed < foo.bam | bedops --merge - | bedmap --echo-overlap-size - | awk '{s += $1;} END {print s;}' > answer.txt

                In this case, bedmap is mapping merged elements against themselves. Merged elements coming out of bedops are guaranteed to be disjoint, so --echo-overlap-size is guaranteed to report the unique length of each merged element.
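Since the merged elements are disjoint, the same total should also be obtainable by summing end minus start for each element directly. A minimal sketch, assuming three-column, tab-delimited BED on stdin; the helper name `sum_bed_lengths` and the toy intervals are hypothetical:

```shell
#!/bin/sh
# sum_bed_lengths: sum (end - start) over BED intervals on stdin.
# The result is a covered-base count only if the intervals are already disjoint.
sum_bed_lengths() {
    # %.0f keeps large totals as plain integers; some awk implementations
    # switch to scientific notation when a big number is printed with print.
    awk '{ s += $3 - $2 } END { printf "%.0f\n", s }'
}

# Toy input (hypothetical): two already-merged elements.
printf 'chr1\t100\t250\nchr2\t0\t50\n' | sum_bed_lengths
```

In the pipeline above, this would take the place of the bedmap and awk stages, e.g. `bam2bed < foo.bam | bedops --merge - | sum_bed_lengths`.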
