SEQanswers


JMFA 01-10-2013 03:56 AM

Extremely high Coverage in some regions
 
Dear all,

I am currently working with NGS data for one individual sequenced using Illumina HiSeq 2000 (13-16x coverage).
I followed a standard pipeline to align the reads and prepare the data for variant calling (BWA, Picard and GATK). I am not interested in exome analysis, but I used this how-to to prepare the BAM files: http://seqanswers.com/wiki/How-to/exome_analysis

When I was ready for the calling step, I decided to check the coverage of this individual using BEDTools. The average coverage looks good (14-15x), but in some regions I get extremely high values, such as 9200x or higher!
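For anyone wanting to list such regions systematically, here is a minimal sketch. It assumes the coverage has been exported as bedGraph-style text (chrom, start, end, depth per line, half-open coordinates); the function name `flag_high_coverage` and the `fold` cutoff are made up for illustration and are not part of BEDTools.

```python
def flag_high_coverage(lines, fold=10.0):
    """Return (mean_depth, flagged_intervals) from bedGraph-style lines.

    Each data line is expected to hold: chrom, start, end, depth.
    The mean is weighted by interval length; an interval is flagged
    when its depth exceeds fold * mean. The fold value is an arbitrary
    illustrative cutoff, not a recommended threshold.
    """
    intervals = []
    total_bases = 0
    total_depth = 0.0
    for line in lines:
        line = line.strip()
        # Skip blank lines and common header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, depth = line.split()[:4]
        start, end, depth = int(start), int(end), float(depth)
        length = end - start
        if length <= 0:
            continue
        intervals.append((chrom, start, end, depth))
        total_bases += length
        total_depth += depth * length
    if total_bases == 0:
        return 0.0, []
    mean_depth = total_depth / total_bases
    flagged = [iv for iv in intervals if iv[3] > fold * mean_depth]
    return mean_depth, flagged


if __name__ == "__main__":
    # Toy data only; real input would be read from a coverage file.
    example = [
        "chr1\t0\t1000\t14",
        "chr1\t1000\t1010\t9200",
        "chr1\t1010\t2000\t15",
    ]
    mean_depth, flagged = flag_high_coverage(example)
    print(mean_depth, flagged)
```

Note that a handful of extreme peaks inflates the mean itself, so a median-based cutoff may be a more robust choice on real data.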

I don't think that this is normal and I believe that these values are not OK even for repetitive regions...

Do you have any idea of what might be wrong?
Thank you all in advance,
J

cwhelan 01-10-2013 10:30 PM

This is normal, although it should only happen in a few regions. This paper suggests that it is caused by misassembled regions of the reference: where the genome really contains many copies of a repeat, the assembly contains only one, so reads from all the copies pile up on it:

http://bioinformatics.oxfordjournals...rmatics.btr354

You could check the BED files linked from that paper to see whether they match up with your high peaks.
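A rough sketch of that check, assuming the peaks are already available as (chrom, start, end) tuples and the BED file uses the usual half-open coordinates. The names `parse_bed` and `peaks_in_regions` are hypothetical helpers, not functions from any existing tool.

```python
from collections import defaultdict


def parse_bed(lines):
    """Collect (start, end) intervals per chromosome from BED-style lines."""
    regions = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip blank, comment, and header lines.
        if len(fields) < 3 or line.startswith("#") or fields[0] in ("track", "browser"):
            continue
        regions[fields[0]].append((int(fields[1]), int(fields[2])))
    return regions


def peaks_in_regions(peaks, regions):
    """Split peaks into (explained, unexplained) by overlap with known regions.

    Two half-open intervals overlap when each one starts before the
    other one ends.
    """
    explained, unexplained = [], []
    for chrom, start, end in peaks:
        hit = any(
            start < r_end and r_start < end
            for r_start, r_end in regions.get(chrom, [])
        )
        (explained if hit else unexplained).append((chrom, start, end))
    return explained, unexplained


if __name__ == "__main__":
    # Toy data only; real regions would come from the published BED files.
    bed_lines = ["chr1\t900\t1100\thigh_coverage_region"]
    peaks = [("chr1", 1000, 1010), ("chr2", 5, 10)]
    print(peaks_in_regions(peaks, parse_bed(bed_lines)))
```

This is a linear scan per peak, which is fine for a few peaks. Chromosome names must use the same convention in both inputs (for example "chr1" versus "1"), otherwise nothing will match.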

JMFA 01-11-2013 03:42 AM

Dear cwhelan,

Thanks for the reply!
Yes, it is probably normal then, because I only get a few strange peaks.

Thanks again,
J

