![]() |
|
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
Merging reads mapped on genome and CDS (SOLID data) | jmandel | Bioinformatics | 1 | 01-04-2012 07:34 AM |
Programs for 454 data sequence alignment and mapping | Abishai3911 | 454 Pyrosequencing | 0 | 06-30-2011 03:12 PM |
PubMed: Optical mapping of DNA: Single-molecule-based methods for mapping genomes. | Newsbot! | Literature Watch | 0 | 01-06-2011 03:00 AM |
Why do we use mapping programs instead of blast for mapping to a reference? | thsuk1 | Bioinformatics | 6 | 08-27-2010 09:54 AM |
![]() |
|
Thread Tools |
![]() |
#1 |
Member
Location: Maryland, USA Join Date: May 2012
Posts: 60
|
![]()
Greetings.
I'm interested in looking, in detail, at the mapped coverage for NGS short read data onto a reference, using whole genome data. I'd like a good way to examine and quantify the coverage along a chromosome and compare this coverage among samples to look for areas with usually high- and low-coverage, and to see if these high- and low-coverage areas are common among samples. I know how to get basic mapped read data using SAMtools, and I can look at the data visually using Atermis, but I don't know how to do anything more sophisticated. Any help is much appreciated, cheers! |
![]() |
![]() |
![]() |
#2 |
Senior Member
Location: San Francisco, CA Join Date: Feb 2011
Posts: 286
|
![]()
I suppose what you might be looking for is IGV, it does a very nice job of displaying coverage coming from an aligned bam file. It can show you things like SNPs/indels in your reads too.
|
![]() |
![]() |
![]() |
#3 |
Rick Westerman
Location: Purdue University, Indiana, USA Join Date: Jun 2008
Posts: 1,104
|
![]()
Since Genomics101 can already look at the data visually via Artemis (ACT) I am not sure if IGV would add much to his knowledge.
I suggest exploring the use of the '-D' (and required '-u/-g') option in 'samtools mpileup'. Piped to bcftools that should give you the depth of coverage for each sample at each base. A bit of parsing into spreadsheet form and you could do further manipulations in Excel, et.al. |
![]() |
![]() |
![]() |
#4 |
Curious Character
Location: Berlin, Germany Join Date: Sep 2010
Posts: 13
|
![]()
You should check out QualiMap, which is useful to get an overview of BAM file:
http://qualimap.bioinfo.cipf.es/ It computes plenty of useful quality metrics such as mean and std coverage, coverage per chromosome and others. Additionally it produces a number of plots including coverage histogram and coverage across reference. By request it can output per-base coverage. Here is an example report: http://qualimap.bioinfo.cipf.es/samp...mapReport.html |
![]() |
![]() |
![]() |
#5 |
Senior Member
Location: sub-surface moon base Join Date: Apr 2013
Posts: 372
|
![]()
You could:
1. Map your reads to the reference with bwa 2. Turn the sam file into a bam file with samtools 3. Visualize with tablet
__________________
savetherhino.org |
![]() |
![]() |
![]() |
#6 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,088
|
![]()
It appears that Genomics101 is interested in getting quantitative/qualitative differential coverage data for samples (not just a visual inspection). Trying to think if there is anything available that can do this without custom code.
|
![]() |
![]() |
![]() |
#7 |
Rick Westerman
Location: Purdue University, Indiana, USA Join Date: Jun 2008
Posts: 1,104
|
![]()
I suspect some custom code will have to be written but the code, aside from samtools and bcf, should be all unix tools. I suspect that 'grep,' 'cut' and perhaps 'paste' will be the tools required. Nothing that a even a rookie bioinformatics person should find difficult. While a bit awkward I think that the following will tease out the depths into a TSV file.
Code:
# Get chromosome positions and per-sample coverage (depth) samtools mpileup -D -u file1.sorted.bam file2.sorted.bam file3.sorted.bam | bcftools view - | grep -v '#' | cut -f 1,2,10-12 > bam.tmp # Just extract chromosome positions cut -f 1,2 bam.tmp > position.tmp # Get the depth part of the samples for i in `seq 3 5` do cut -f $i bam.tmp | cut -f 2 -d ':' > depth_$i.tmp done # Put it together into a spreadsheet paste position.tmp depth_*.tmp > results.tsv Code:
S1 1 1802 1264 2665 S1 2 1811 1267 2666 S1 3 1812 1267 2667 S1 4 1812 1267 2668 S1 5 1812 1268 2668 S1 6 1811 1266 2661 S1 7 1812 1267 2667 S1 8 1811 1265 2665 S1 9 1811 1266 2669 S1 10 1810 1263 2662 Last edited by westerman; 07-22-2014 at 10:48 AM. Reason: Ah, first try did not work. Need a 'for' (bash). Remember you are getting what you pay for. :-) |
![]() |
![]() |
![]() |
#8 |
Member
Location: New Zealand Join Date: Mar 2014
Posts: 20
|
![]()
Geneious (which is commercial software) has nice coverage visualization. You can also run the high/low coverage finder on each sample to annotate regions of high/low coverage. Then run the compare annotations tool to compare the results between samples which will annotate regions in one sample that have high/low coverage not present in other samples. The video at https://www.youtube.com/watch?v=IOGmxjK3f_4 demonstrates some of this starting from around 30 seconds into it.
|
![]() |
![]() |
![]() |
#9 |
Member
Location: Düsseldorf Join Date: Nov 2010
Posts: 20
|
![]()
Altough I did not intent do cover whole chromosomes, you may have a look at my 'rbamtools' package.
There is a alignDepth function with which you can calculate coverage values as numerical values and may plot the values (There's a new plot function in the next release...) |
![]() |
![]() |
![]() |
#10 |
Senior Member
Location: Stanford Join Date: Jun 2009
Posts: 181
|
![]()
What you're describing is basically the same problem as ChIP-seq peak calling from multiple samples. I wrote a program called UniPeak that does this very straightforwardly (doi:10.1186/1471-2164-14-720). It uses Epanechnikov kernel smoothing rather than base coverage, but if you think about it, coverage is mathematically equivalent to kernel smoothing with a rectangular kernel. So you could try it with Epanechnikov smoothing (if anything this might actually give you better results), or change the kernel function to a rectangle and recompile (PM me if you need help with that).
|
![]() |
![]() |
![]() |
Tags |
mapping coverage |
Thread Tools | |
|
|