**Wallysb01** · 07-22-2014, 07:31 AM

I suppose what you might be looking for is IGV. It does a very nice job of displaying coverage from an aligned BAM file, and it can show you things like SNPs/indels in your reads too.

**westerman** · 07-22-2014, 07:46 AM

Since Genomics101 can already look at the data visually via Artemis (ACT) I am not sure if IGV would add much to his knowledge.

I suggest exploring the use of the '-D' (and required '-u/-g') option in 'samtools mpileup'. Piped to bcftools, that should give you the depth of coverage for each sample at each base. With a bit of parsing into spreadsheet form you could do further manipulations in Excel, etc.
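For readers on a current samtools release, `samtools depth` may be a shorter route to the same kind of table. A minimal sketch, assuming samtools 1.x is installed; the wrapper name `depth_table` and the file names in the usage line are placeholders, not something from the thread:

```shell
# Sketch: per-base depth for several BAM files in one table.
# Output columns: chromosome, position, then one depth column per input file.
# -a also reports positions with zero coverage.
depth_table() {
    samtools depth -a "$@"
}
```

It would be called as `depth_table file1.sorted.bam file2.sorted.bam file3.sorted.bam > results.tsv`, with coordinate-sorted BAM files as input.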

**kokonech** · 07-22-2014, 08:32 AM

You should check out QualiMap, which is useful for getting an overview of a BAM file:

Qualimap: Evaluating next generation sequencing alignment data

http://qualimap.bioinfo.cipf.es/

It computes plenty of useful quality metrics, such as the mean and standard deviation of coverage, coverage per chromosome, and others. It also produces a number of plots, including a coverage histogram and coverage across the reference. On request it can output per-base coverage.

Here is an example report:

http://qualimap.bioinfo.cipf.es/samples/ERR089819_result/qualimapReport.html

**rhinoceros** · 07-22-2014, 09:02 AM

You could:

1. Map your reads to the reference with bwa
2. Convert the SAM file to a BAM file with samtools
3. Visualize with Tablet
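The three steps above could be sketched roughly as follows. This assumes current bwa and samtools releases (the 2014 syntax differed), single-end reads, and a hypothetical function name and file names chosen only for illustration:

```shell
# Sketch only: map reads, then produce a sorted, indexed BAM for viewing.
# $1 = reference FASTA, $2 = reads FASTQ, $3 = output prefix (all placeholders)
map_to_sorted_bam() {
    # 1. Index the reference, then align the reads with bwa
    bwa index "$1" || return 1
    bwa mem "$1" "$2" > "$3.sam" || return 1
    # 2. Convert the SAM output to a coordinate-sorted BAM and index it
    samtools sort -o "$3.sorted.bam" "$3.sam" || return 1
    samtools index "$3.sorted.bam" || return 1
    # 3. Open "$3.sorted.bam" together with the reference in Tablet
}
```

A call such as `map_to_sorted_bam ref.fa reads.fq sample` would leave `sample.sorted.bam` and its index ready to load into a viewer.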

**GenoMax** · 07-22-2014, 09:28 AM

It appears that Genomics101 is interested in getting quantitative/qualitative differential coverage data for samples (not just a visual inspection). Trying to think if there is anything available that can do this without custom code.

**westerman** · 07-22-2014, 09:43 AM

I suspect some custom code will have to be written, but the code, aside from samtools and bcftools, should be all Unix tools. I suspect that 'grep', 'cut' and perhaps 'paste' will be the tools required. Nothing that even a rookie bioinformatics person should find difficult. While a bit awkward, I think that the following will tease out the depths into a TSV file.

Code:

# Get chromosome positions and per-sample coverage (depth).
# Syntax as of samtools/bcftools 0.1.x; newer releases changed these options.
samtools mpileup -D -u file1.sorted.bam file2.sorted.bam file3.sorted.bam | bcftools view - | grep -v '^#' | cut -f 1,2,10-12 > bam.tmp

# Just extract chromosome positions
cut -f 1,2 bam.tmp > position.tmp

# Get the depth part of each sample (columns 3-5 of bam.tmp);
# this takes the second ':'-separated subfield of each sample column
for i in $(seq 3 5)
do
  cut -f "$i" bam.tmp | cut -f 2 -d ':' > depth_"$i".tmp
done

# Put it together into a spreadsheet
paste position.tmp depth_*.tmp > results.tsv

The results I got on my test file (and I expect high coverage from this experiment) look like:

Code:

S1      1       1802    1264    2665
S1      2       1811    1267    2666
S1      3       1812    1267    2667
S1      4       1812    1267    2668
S1      5       1812    1268    2668
S1      6       1811    1266    2661
S1      7       1812    1267    2667
S1      8       1811    1265    2665
S1      9       1811    1266    2669
S1      10      1810    1263    2662
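Since the samples clearly differ in overall depth, one possible next step is to scale each sample by its own mean before comparing positions. The following is a sketch under assumptions, not part of the original post: it uses only awk, takes the first three rows of the table above as a stand-in input named `example.tsv`, and the output file names are made up for illustration.

```shell
# Stand-in input: chromosome, position, then one depth column per sample
printf 'S1\t1\t1802\t1264\t2665\nS1\t2\t1811\t1267\t2666\nS1\t3\t1812\t1267\t2667\n' > example.tsv

# Mean depth per sample column
awk -F'\t' '
    { for (i = 3; i <= NF; i++) sum[i] += $i; cols = NF }
    END {
        for (i = 3; i <= cols; i++)
            printf "sample%d\t%.1f\n", i - 2, sum[i] / NR
    }
' example.tsv > mean_depth.tsv

# Depth divided by that sample's mean, so samples become comparable.
# The file is read twice: pass 1 collects sums, pass 2 prints the ratios.
awk -F'\t' -v OFS='\t' '
    NR == FNR {
        for (i = 3; i <= NF; i++) sum[i] += $i
        n++
        next
    }
    {
        out = $1 OFS $2
        for (i = 3; i <= NF; i++)
            out = out OFS sprintf("%.3f", $i / (sum[i] / n))
        print out
    }
' example.tsv example.tsv > normalized.tsv
```

Positions where one sample's ratio sits well away from 1 while the others stay close to 1 would be candidates for differential coverage; what counts as "well away" is a judgment call that depends on the experiment.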

**Matt Kearse** · 07-22-2014, 03:31 PM

Geneious (which is commercial software) has nice coverage visualization. You can also run the high/low coverage finder on each sample to annotate regions of high/low coverage. Then run the compare annotations tool to compare the results between samples which will annotate regions in one sample that have high/low coverage not present in other samples. The video at https://www.youtube.com/watch?v=IOGmxjK3f_4 demonstrates some of this starting from around 30 seconds into it.

**wokai001** · 07-24-2014, 02:21 AM

Although I did not intend it to cover whole chromosomes, you may want to look at my 'rbamtools' package.
It has an alignDepth function that calculates coverage as numerical values, which you can then plot (there's a new plot function in the next release...)

**jwfoley** · 07-24-2014, 07:20 AM

What you're describing is basically the same problem as ChIP-seq peak calling from multiple samples. I wrote a program called UniPeak that does this very straightforwardly (doi:10.1186/1471-2164-14-720). It uses Epanechnikov kernel smoothing rather than base coverage, but if you think about it, coverage is mathematically equivalent to kernel smoothing with a rectangular kernel. So you could try it with Epanechnikov smoothing (if anything this might actually give you better results), or change the kernel function to a rectangle and recompile (PM me if you need help with that).
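The equivalence mentioned above can be written out as follows. This is an illustration only, under simplifying assumptions that are not in the post: every read has the same length $L$, positions are treated as continuous, and read $i$ is represented by its midpoint $m_i$. It is not a description of how UniPeak itself is implemented.

```latex
% Coverage at position x: the number of reads overlapping x
c(x) = \sum_{i} \mathbf{1}\!\left[\, |x - m_i| \le \tfrac{L}{2} \,\right]
     = \sum_{i} K_{\mathrm{rect}}\!\left( \frac{x - m_i}{h} \right),
\qquad h = \tfrac{L}{2},
\qquad K_{\mathrm{rect}}(u) = \mathbf{1}\!\left[\, |u| \le 1 \,\right]

% Epanechnikov kernel, same support but down-weighting the read ends
K_{\mathrm{E}}(u) = \tfrac{3}{4}\,(1 - u^{2})\,\mathbf{1}\!\left[\, |u| \le 1 \,\right]
```

Here $K_{\mathrm{rect}}$ is left unnormalized so that the sum counts reads directly; a normalized rectangular kernel would differ only by a constant factor. Swapping $K_{\mathrm{rect}}$ for $K_{\mathrm{E}}$ keeps the same window but gives more weight to reads centered near $x$.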

Best programs/methods for looking at mapped coverage? (Mapping whole genome data)
