**fengqi** · 06-25-2013, 09:49 AM

The first output (without -hist) only gives you the fraction of each exon's bases that have non-zero coverage from your reads. It carries no information about the depth of coverage.

The second output (with -hist) gives you, for each exon, the fraction of bases covered at each depth of coverage.

At the end of the second output, you can see a histogram summarizing the coverage across all exons.
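As a rough sketch of how that closing histogram could be summarized: the function below assumes the summary rows of the -hist output start with `all` and hold, tab-separated, the depth, the number of bases at that depth, the total exon length, and the fraction of bases at that depth. The name `bases_at_depth` is made up for illustration, and the result is a fraction of exon bases, not of exons.

```shell
# Sum the per-depth fractions on the "all" rows for depths >= a minimum.
# Assumed row layout: all <depth> <bases at depth> <total bases> <fraction>
bases_at_depth() {
    min_depth=$1; shift
    awk -F'\t' -v d="$min_depth" '
        $1 == "all" && $2 >= d { frac += $5 }
        END { printf "%.1f%% of exon bases at >= %dX\n", 100 * frac, d }
    ' "$@"
}

# Toy input standing in for the tail of a real -hist output.
printf 'all\t0\t50\t200\t0.25\nall\t1\t50\t200\t0.25\nall\t5\t60\t200\t0.30\nall\t10\t40\t200\t0.20\n' |
    bases_at_depth 5
```

On real data the function would read the saved -hist output, for example `bases_at_depth 10 exons_hist.txt`, where `exons_hist.txt` is a hypothetical file name.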

**meher** · 06-25-2013, 09:52 AM

Can we get a single value from the output, such as "x% of exons are covered"?

**fengqi** · 06-25-2013, 09:58 AM

The default output reports 4 columns after each interval in B (that is, your exon file).

The first of those columns is the number of reads in your BAM file that overlapped the exon interval.

You can easily pick out the exons where that value is zero. That tells you how many exons are covered, and from that you can get x%.

Hope I understood correctly what you want.
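A minimal sketch of that counting step, assuming tab-separated coverageBed output without -hist. The function name `pct_exons_hit` is made up for illustration. The read count is taken as the fourth column from the end, so the position does not depend on how many columns the exon file has.

```shell
# Percent of exons overlapped by at least one read.
# In the default output the last four columns are: read count, covered bases,
# exon length, covered fraction; the read count is therefore $(NF-3).
pct_exons_hit() {
    awk -F'\t' '
        { total++; if ($(NF-3) > 0) hit++ }
        END { if (total) printf "%d of %d exons (%.1f%%)\n", hit, total, 100 * hit / total }
    ' "$@"
}

# Toy input: one exon with 5 overlapping reads, one with none.
printf 'chr1\t100\t200\t5\t80\t100\t0.8000000\nchr1\t300\t400\t0\t0\t100\t0.0000000\n' |
    pct_exons_hit
```

This counts an exon as covered as soon as a single read touches it, which is a much weaker criterion than requiring most of its bases to be covered.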

**meher** · 06-25-2013, 10:01 AM

Sorry, I couldn't follow. What do you mean by default output? Is it from

samtools view -b <BAM> | coverageBed -abam stdin -b exons.bed

or

samtools view -b <BAM> | coverageBed -abam stdin -b exons.bed -hist

Both of these outputs have more than 4 columns. So which one are you referring to?

**fengqi** · 06-25-2013, 10:44 AM

The first one, without -hist.

**meher** · 06-25-2013, 10:48 AM

Okay. But there are seven columns in the output. The first three give the chromosome, start and end position. The next two columns are zeros, the sixth is the length of the exon, and the last is zero again.

So how can we get the x% out of this?

**sdriscoll** · 06-25-2013, 02:55 PM

It sounds like you first want to turn the coverage of each exon into a binary call (i.e. either covered or not covered). Then you can just count the number that are covered versus the total.

The last column of the coverageBed output is the fraction (from 0 to 1) of the exon feature in that row that is covered. Let's say you want to call 95% coverage "covered" and anything below that "not covered". Also consider that if you're working with a typical alternatively spliced species, your list of exons may have redundant entries for exons that are shared between isoforms. So to put it together, this might work:

First, count all of the unique lines in the exons file:

Code:

sort -k1,1 -k2,2n -k3,3n exons.bed | uniq > exons_unique.bed
wc -l exons_unique.bed

Now you have a total count of exons. Next, run coverageBed with the unique list and count the rows whose last column is at or above your "covered" threshold (the last column is $7 when the exon file has three columns):

Code:

coverageBed -abam <bam_file> -b exons_unique.bed > exons_coverage.bed
awk '$7 >= 0.95' exons_coverage.bed | wc -l

That should do it. Realize, of course, that if you are working with an alternatively spliced transcriptome this result is a little odd: we don't know which isoform the coverage actually belongs to, so it gets assigned to every exon it intersects.
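The two counts can also be folded into a single pass that prints the percentage directly. This is only a sketch: the function name `pct_exons_covered` is made up for illustration, the input is assumed to be tab-separated coverageBed output without -hist, and an exon sitting exactly at the threshold is treated as covered.

```shell
# Percent of exons whose covered fraction (last column) is >= a threshold.
# Usage: pct_exons_covered <threshold> [coverage_file]   (reads stdin if no file)
pct_exons_covered() {
    threshold=$1; shift
    awk -F'\t' -v t="$threshold" '
        { total++; if ($NF >= t) covered++ }
        END { if (total) printf "%d/%d exons covered (%.1f%%)\n", covered, total, 100 * covered / total }
    ' "$@"
}

# Toy input: covered fractions of 1.0, 0.95 and 0.5.
printf 'chr1\t0\t100\t9\t100\t100\t1.0000000\nchr1\t200\t300\t4\t95\t100\t0.9500000\nchr1\t400\t500\t2\t50\t100\t0.5000000\n' |
    pct_exons_covered 0.95
```

Using `$NF` rather than `$7` keeps the sketch independent of how many columns the exon file has.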

**meher** · 06-26-2013, 03:31 AM

Thanks, it worked. But is it possible to find out from the coverageBed output what percentage of the exons are covered by at least 1X/5X/10X reads?

Percentage of exons covered
