  • Counting read depth using bedtools

    I have sequenced a cell line exposed to UV and would like to know whether any genes have been deleted compared to the ancestor. I extracted the CDS regions from the annotation file into annotation.bed and ran coverageBed to find the read depth of each exon:

    coverageBed -abam DXB11.bam -b annotation.bed > depth.txt

    The output for one particular region was:
    NW_003614442.1 464809 465646 158 837 837 1.0000000
    So 100% of the region 464809-465646 had a depth of 158 and that entire region of 837 bp was that depth, correct?

    That is very high, as the theoretical depth should be 35. So I looked at the depth at every position of the genome:
    /BEDTools-Version-2.16.2/genomeCoverageBed -d -ibam DXB11.bam > DXB11.coverage

    and looked at the same region (464809-465646). Done this way, it had a median depth of 18, which is much more realistic.
    Can you see what I did wrong, or can you advise me on an easier way of getting the median depth of each exon in the genome from a BAM file?

  • #2
    Hi Kaas,
    I think 158 is the number of features (reads) that overlapped the interval, not the fold coverage. I would recommend that you read the bedtools manual.

    Default Output:
    After each entry in B, reports:
    1) The number of features in A that overlapped the B interval.
    2) The number of bases in B that had non-zero coverage.
    3) The length of the entry in B.
    4) The fraction of bases in B that had non-zero coverage.
    You may want to try -d or -hist options.

    Hope this helps.
    Last edited by rnaeye; 02-21-2014, 08:00 AM. Reason: additional information
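To make the column meanings above concrete, here is a minimal Python sketch (not part of bedtools) that labels the four default coverageBed columns for the line quoted in the first post and turns the read count into a rough mean depth. The function names and the 100 bp read length are assumptions for illustration only; the thread does not state the actual read length, and reads that only partly overlap the region make this estimate an upper bound.

```python
def parse_default_coverage(line, n_bed_cols=3):
    """Label the fields of one default coverageBed output line.

    n_bed_cols is the number of columns in the original BED entry
    (3 for a plain chrom/start/end file, which is assumed here).
    """
    fields = line.split()
    n_reads, covered, length, fraction = fields[n_bed_cols:n_bed_cols + 4]
    return {
        "chrom": fields[0],
        "start": int(fields[1]),
        "end": int(fields[2]),
        "overlapping_reads": int(n_reads),    # 1) features in A overlapping B
        "covered_bases": int(covered),        # 2) bases in B with non-zero coverage
        "length": int(length),                # 3) length of the entry in B
        "fraction_covered": float(fraction),  # 4) fraction of B with non-zero coverage
    }


def rough_mean_depth(record, read_length):
    """Approximate mean depth as reads * read length / region length."""
    return record["overlapping_reads"] * read_length / record["length"]


rec = parse_default_coverage("NW_003614442.1 464809 465646 158 837 837 1.0000000")
print(rec["overlapping_reads"], rec["length"])  # 158 reads over 837 bp
# read_length=100 is a hypothetical value, not taken from the thread
print(round(rough_mean_depth(rec, read_length=100), 1))
```

Under that assumed read length the estimate comes out near 19x, which is in the same range as the per-base median of 18 reported in the first post.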



    • #3
      Hi rnaeye

      Thank you for your answer. I tried going through the descriptions of genomeCoverageBed (http://bedtools.readthedocs.org/en/l.../coverage.html) and genomecov (http://bedtools.readthedocs.org/en/l...genomecov.html), but I had a hard time translating their bioinformatic terms into conclusions I can draw from my own data.

      "The number of features in A that overlapped the B interval" = the number of reads found in the exon region I specify. But then you would expect at least some correlation between the length of a given region and the number coverageBed reports, right? I do not see any such correlation, which is why I find this a bit fishy.

      OK, I will use -d and extract the median from there.
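A minimal sketch of that "extract the median" step, assuming the per-base file has three whitespace-separated columns (chromosome, 1-based position, depth), as in the genomeCoverageBed -d output from the first post, and that the regions come from a BED file (0-based start, end not included). The function name and the toy data are made up for illustration. The loop checks every region for every position, so on a whole genome it would be slow; it is only meant to show the coordinate handling.

```python
from statistics import median


def median_depth_per_region(depth_lines, regions):
    """Return {region: median depth} for BED-style regions.

    depth_lines: iterable of 'chrom pos depth' strings, pos is 1-based.
    regions: list of (chrom, start, end) tuples in BED coordinates,
             so the region covers 1-based positions start+1 .. end.
    """
    depths = {region: [] for region in regions}
    for line in depth_lines:
        chrom, pos, depth = line.split()
        pos, depth = int(pos), int(depth)
        for region in regions:
            r_chrom, start, end = region
            if chrom == r_chrom and start < pos <= end:
                depths[region].append(depth)
    # Regions with no positions in the input are left out of the result.
    return {region: median(values) for region, values in depths.items() if values}


# Toy data (made up), standing in for lines read from DXB11.coverage
toy_depths = [
    "chr1\t1\t0",
    "chr1\t2\t10",
    "chr1\t3\t20",
    "chr1\t4\t30",
    "chr1\t5\t40",
    "chr1\t6\t50",
]
toy_regions = [("chr1", 1, 4), ("chr1", 4, 6)]
result = median_depth_per_region(toy_depths, toy_regions)
print(result[("chr1", 1, 4)])  # positions 2-4 -> depths 10, 20, 30
print(result[("chr1", 4, 6)])  # positions 5-6 -> depths 40, 50
```

For real data the list of lines could be replaced by an open file handle, since the function only iterates over it once.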



      • #4
        Hi,
        Try googling the search term "The BEDTools manual PDF".
        You can download a PDF version of the user manual; I think it explains things better. I guess you should calculate the coverage per base and draw your conclusion from there. Have fun, best.
        Last edited by rnaeye; 02-21-2014, 10:30 AM.



        • #5
          Read the help for bedtools coverage.
          I think you can use the -hist or -d option.
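If the -hist route is taken, the median can be read off the histogram without expanding it to one value per base. The sketch below assumes that each -hist line holds the original BED columns followed by depth, number of bases at that depth, feature length and fraction, and that summary lines start with "all"; please check this against the help text of your bedtools version. The function names and the toy lines are made up for illustration.

```python
from collections import defaultdict


def median_from_histogram(hist):
    """Lower median depth from a {depth: number_of_bases} histogram."""
    total = sum(hist.values())
    midpoint = (total + 1) // 2  # rank of the (lower) median base
    running = 0
    for depth in sorted(hist):
        running += hist[depth]
        if running >= midpoint:
            return depth
    return None  # empty histogram


def median_depth_from_hist_lines(lines, n_bed_cols=3):
    """Group histogram lines by feature and return {feature: median depth}."""
    per_feature = defaultdict(dict)
    for line in lines:
        fields = line.split()
        if fields[0] == "all":  # skip the assumed genome-wide summary lines
            continue
        key = tuple(fields[:n_bed_cols])
        depth = int(fields[n_bed_cols])
        n_bases = int(fields[n_bed_cols + 1])
        per_feature[key][depth] = n_bases
    return {key: median_from_histogram(h) for key, h in per_feature.items()}


# Toy histogram (made up): a 10 bp feature with 2, 5 and 3 bases at depths 5, 18 and 30
toy_hist_lines = [
    "chr1\t0\t10\t5\t2\t10\t0.2000000",
    "chr1\t0\t10\t18\t5\t10\t0.5000000",
    "chr1\t0\t10\t30\t3\t10\t0.3000000",
    "all\t18\t5\t10\t0.5000000",
]
medians = median_depth_from_hist_lines(toy_hist_lines)
print(medians[("chr1", "0", "10")])
```

For an even number of bases this returns the lower of the two middle values rather than their average, which keeps the result an actual observed depth.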
