Seqanswers Leaderboard Ad

**kmcarr** · 01-31-2012, 06:01 AM

Originally posted by nilmot13 View Post

Hi,

Currently I'm trying to solve a simple problem for bioinformatician, but being a experimental biologist it's proving to be very tricky for me.

At the moment I've obtained mapped reads in SAM format. I want to determine the sequencing coverage in term of number of times sequenced per base. I know I can use pileup function in samtool however it does not give strand specific information.

Are there any way of counting the bases per strand?

Or perhaps handling the pileup dataset in Galaxy to obtain strand specific count?

Many help would be great thank you.

The better tool for this job is the BEDTools Suite, a collection of programs for manipulating genomic range/alignment data. For your particular problem the tool to use is genomeCoverageBed. You provide as input your sorted BAM file and a genome file. The genome file is simply a text file listing the name of each chromosome and its size in bp (see the documentation). The program has options to calculate either global coverage parameters or per base depth. It also has an option to calculate strand specific coverage.

**nilmot13** · 01-31-2012, 06:04 AM

Dear kmcarr,

That's sound exactly like what I need. Thank you every so much, I'll try it out!!

**nilmot13** · 02-01-2012, 03:03 AM

Originally posted by kmcarr View Post

The better tool for this job is the BEDTools Suite, a collection of programs for manipulating genomic range/alignment data. For your particular problem the tool to use is genomeCoverageBed. You provide as input your sorted BAM file and a genome file. The genome file is simply a text file listing the name of each chromosome and its size in bp (see the documentation). The program has options to calculate either global coverage parameters or per base depth. It also has an option to calculate strand specific coverage.

Hi kmcarr,

I've managed to get it to work.

If you have the time do you mind helping me with a couple of more question with regards to BEDTools?

I've setup test file to run the genomeCoverageBed script. Is it possible to output the report as a file? At the moment I'm using this command.

$ genomeCoverageBed -ibam <test file.bam> -bga -strand -g <genome file.genome>

Many thanks!

**nilmot13** · 02-01-2012, 03:46 AM

Other confusions

I think i've worked it out:

by doing "command as before" > filename.file

However, the coverage information is slighly confusing?
I was expecting count per base with strand specificity, but I think it's giving me coverage percentage with respect to entire genome?

**kmcarr** · 02-01-2012, 06:21 AM

When using the -strand option you must follow that with either a '+' or '-' to indicate which strand to report. The output data will be limited to coverage on the strand you indicated so you will need to run the command twice, once for each strand.

Using the -bga option would not report fractions. It outputs a standard bedGraph format file, reporting intervals of coverage at a particular depth with a new interval started when the depth changes. For example:

Code:

The columns are:
Chrom  startPos endPos depth

Chr1	3658	3684	1
Chr1	3684	3693	2
Chr1	3693	3733	3
Chr1	3733	3759	2
Chr1	3759	3763	1
Chr1	3763	3768	3
Chr1	3768	3773	2
Chr1	3773	3838	3
Chr1	3838	3848	1
Chr1	4008	4055	1
Chr1	4055	4077	2
Chr1	4077	4083	3
Chr1	4083	4119	2
Chr1	4119	4125	3
Chr1	4125	4130	4
Chr1	4130	4139	3
Chr1	4139	4144	4
Chr1	4144	4152	5
Chr1	4152	4154	4

If you want to report the depth at each and every position of the genome you should use the -d option instead of -bg(a).

**nilmot13** · 02-01-2012, 07:02 AM

Thank, I've just setup a test file. It works fine now with the +/- sign after the -strand command.

Thanks every so much.

**nilmot13** · 02-03-2012, 03:54 AM

Hi,

A very quick question I don't know if you have encountered this before

during the run of the genomeCoverageBed scipt the output resulted in this error message

BGZF ERROR: read block failed - could not read data from block

I have sorted my bam file beforehand and have check the total chromosome length is correct.
Can anyone suggest potential cause of error?

Thank you very much

Topics	Statistics	Last Post
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 23 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 24 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM
Novel Diagnostic Assay Enhances Ovarian Cancer Detection by seqadmin Started by seqadmin, 04-10-2024, 09:21 AM	0 responses 21 views 0 likes	Last Post by seqadmin 04-10-2024, 09:21 AM
Evolutionary Dynamics of Centromeres: A Comparative Genomic Analysis by seqadmin Started by seqadmin, 04-04-2024, 09:00 AM	0 responses 52 views 0 likes	Last Post by seqadmin 04-04-2024, 09:00 AM

Seqanswers Leaderboard Ad

Announcement

Pileup from BAM or SAM

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News