SEQanswers

Old 03-30-2012, 09:43 AM   #1
bwubb
Member
 
Location: Philadelphia

Join Date: Jan 2012
Posts: 58
GATK DepthofCoverage Error

I'm trying to use GATK DepthOfCoverage, but I'm getting an error that I don't know how to resolve.

My command is:
Code:
java -Xmx4g -jar ~/GenomeAnalysisTK-1.3/GenomeAnalysisTK.jar \
-T DepthOfCoverage \
-R ~/b37_genomes/human_g1k_v37.fasta \
-I PE.merged.sorted.recal.bam \
-L ~/BED/exons.b37.bed \
-o test.GATK.out
But then I get:

##### ERROR MESSAGE: File associated with name ~/BED/exons.b37.bed is malformed: Interval file could not be parsed in any supported format. caused by BED files must be parsed through Tribble; parsing them as intervals through the GATK engine is no longer supported
##### ERROR ------------------------------------------------------------------------------------------

The GATK wiki does not have many examples of what can be passed as an interval file. This is whole-exome sequencing data, so naturally I want to determine the coverage over all exon intervals.

I have no clue what this Tribble thing is, and I don't see how they can not support BED format...

Thanks.
Old 03-30-2012, 08:06 PM   #2
Heisman
Senior Member
 
Location: St. Louis

Join Date: Dec 2010
Posts: 535

Could you post the header lines of your file and then the first few intervals as well?
Old 04-09-2012, 10:16 AM   #3
bwubb
Member
 
Location: Philadelphia

Join Date: Jan 2012
Posts: 58

Wow. Sorry I missed the reply to this thread. Thanks for responding.

Here are the first few header lines of the BAM file:
Code:
@HD	VN:1.0	GO:none	SO:coordinate
@SQ	SN:1	LN:249250621
@SQ	SN:2	LN:243199373
@SQ	SN:3	LN:198022430
@SQ	SN:4	LN:191154276
@SQ	SN:5	LN:180915260
@SQ	SN:6	LN:171115067
@SQ	SN:7	LN:159138663
@SQ	SN:8	LN:146364022
@SQ	SN:9	LN:141213431
Is this sufficient, or did you want the alignments as well? Thanks again for the help. This is still an issue for me.
Old 04-09-2012, 10:36 AM   #4
Heisman
Senior Member
 
Location: St. Louis

Join Date: Dec 2010
Posts: 535

The error message indicates that the BED file is the problem. Could you post the first 50 lines or so of that file?
Old 04-10-2012, 11:20 AM   #5
bwubb
Member
 
Location: Philadelphia

Join Date: Jan 2012
Posts: 58

Code:
1	14467	14587
1	14639	14883
1	14943	15064
1	15671	15990
1	16591	16719
1	16750	17074
1	17178	17420
1	17443	18108
1	18203	18448
1	19049	19170
1	20603	20723
1	24448	24915
1	29267	29389
1	30275	30431
1	35095	35215
1	35245	35366
1	35668	35788
1	62983	63703
1	69069	70029
1	112710	112830
1	120754	120966
1	129018	129259
1	133356	133597
1	135235	135355
1	135688	136176
1	137366	137726
1	173754	173874
1	228233	228711
1	259027	259147
1	267075	267287
1	326408	326768
1	327177	327665
1	327998	328118
1	329752	329993
1	334092	334333
1	342357	342569
1	350498	350618
1	367647	368608
1	470971	471330
1	621084	622045
1	639075	639195
1	647124	647336
1	655375	655616
1	659720	659961
1	661601	661721
1	662054	662542
1	662854	663214
1	709552	709672
1	717360	717480
1	721338	721640
The coordinates and contig names are b37, and I have used b37 consistently throughout. This is not the BED file for my capture design, but rather "all" exons. I wouldn't think that is the issue, though. If an interval has no overlapping reads, shouldn't it just report zero coverage?
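
To rule out a naming or formatting mismatch, the contig names in the BED file can be compared with the @SQ lines of the BAM header (for example the output of `samtools view -H`). Below is a minimal Python sketch of such a check. The function names `parse_sq_lines` and `check_bed_lines` are made up for illustration and are not part of GATK or samtools, and the check only covers the basics (tab-separated columns, integer coordinates, known contigs, intervals within the contig length).

```python
def parse_sq_lines(header_lines):
    """Map contig name -> length using the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] != "@SQ":
            continue
        # Each tag looks like "SN:1" or "UR:file:hg19.fa"; split on the first colon only.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def check_bed_lines(bed_lines, contigs):
    """Return (line_number, problem) pairs for BED records that look wrong."""
    problems = []
    for n, line in enumerate(bed_lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and the optional BED comment/track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append((n, "fewer than 3 tab-separated columns"))
            continue
        chrom, start, end = fields[0], fields[1], fields[2]
        if not (start.isdigit() and end.isdigit()):
            problems.append((n, "start/end are not non-negative integers"))
            continue
        start, end = int(start), int(end)
        if chrom not in contigs:
            problems.append((n, f"contig {chrom!r} not in BAM header"))
        elif not (start < end <= contigs[chrom]):
            problems.append((n, "interval is empty or runs past the end of the contig"))
    return problems


# Tiny made-up example: one good record, one "chr"-prefixed contig,
# and one record separated by spaces instead of tabs.
header = ["@HD\tVN:1.0\tGO:none\tSO:coordinate", "@SQ\tSN:1\tLN:249250621"]
bed = ["1\t14467\t14587", "chr1\t14639\t14883", "1 14943 15064"]
for line_number, problem in check_bed_lines(bed, parse_sq_lines(header)):
    print(line_number, problem)
```

An empty result only says the file passes these basic checks. It does not prove that a particular GATK version will accept the file.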
Old 04-10-2012, 11:29 AM   #6
Heisman
Senior Member
 
Location: St. Louis

Join Date: Dec 2010
Posts: 535

I use GATK version 1.0.2885 for DepthOfCoverage, with interval files that look like this:

Code:
@HD     VN:1.0  SO:unsorted
@SQ     SN:chr1 LN:249250621    UR:file:hg19.fa       M5:1b22b98cdeb4a9304cb5d48026a85128
@SQ     SN:chr2 LN:243199373    UR:file:hg19.fa       M5:a0d9851da00400dec1098a9255ac712e
@SQ     SN:chr3 LN:198022430    UR:file:hg19.fa       M5:641e4338fa8d52a5b781bd2a2c08d3c3
@SQ     SN:chr4 LN:191154276    UR:file:hg19.fa       M5:23dccd106897542ad87d2765d28a19a1
@SQ     SN:chr5 LN:180915260    UR:file:hg19.fa       M5:0740173db9ffd264d728f32784845cd7
@SQ     SN:chr6 LN:171115067    UR:file:hg19.fa       M5:1d3a93a248d92a729ee764823acbbc6b
@SQ     SN:chr7 LN:159138663    UR:file:hg19.fa       M5:618366e953d6aaad97dbe4777c29375e
@SQ     SN:chr8 LN:146364022    UR:file:hg19.fa       M5:96f514a9929e410c6651697bded59aec
@SQ     SN:chr9 LN:141213431    UR:file:hg19.fa       M5:3e273117f15e0a400f01055d9f393768
@SQ     SN:chr10        LN:135534747    UR:file:hg19.fa       M5:988c28e000e84c26d552359af1ea2e1d
@SQ     SN:chr11        LN:135006516    UR:file:hg19.fa       M5:98c59049a2df285c76ffb1c6db8f8b96
@SQ     SN:chr12        LN:133851895    UR:file:hg19.fa       M5:51851ac0e1a115847ad36449b0015864
@SQ     SN:chr13        LN:115169878    UR:file:hg19.fa       M5:283f8d7892baa81b510a015719ca7b0b
@SQ     SN:chr14        LN:107349540    UR:file:hg19.fa       M5:98f3cae32b2a2e9524bc19813927542e
@SQ     SN:chr15        LN:102531392    UR:file:hg19.fa       M5:e5645a794a8238215b2cd77acb95a078
@SQ     SN:chr16        LN:90354753     UR:file:hg19.fa       M5:fc9b1a7b42b97a864f56b348b06095e6
@SQ     SN:chr17        LN:81195210     UR:file:hg19.fa       M5:351f64d4f4f9ddd45b35336ad97aa6de
@SQ     SN:chr18        LN:78077248     UR:file:hg19.fa       M5:b15d4b2d29dde9d3e4f93d1d0f2cbc9c
@SQ     SN:chr19        LN:59128983     UR:file:hg19.fa       M5:1aacd71f30db8e561810913e0b72636d
@SQ     SN:chr20        LN:63025520     UR:file:hg19.fa       M5:0dec9660ec1efaaf33281c0d5ea2560f
@SQ     SN:chr21        LN:48129895     UR:file:hg19.fa       M5:2979a6085bfe28e3ad6f552f361ed74d
@SQ     SN:chr22        LN:51304566     UR:file:hg19.fa       M5:a718acaa6135fdca8357d5bfe94211dd
@SQ     SN:chrM LN:16571        UR:file:hg19.fa       M5:d2ed829b8a1628d16cbeee88e88e39eb
@SQ     SN:chrX LN:155270560    UR:file:hg19.fa       M5:7e0e2e580297b7764e31dbc80c2540dd
@SQ     SN:chrY LN:59373566     UR:file:hg19.fa       M5:1e86411d73e6f00a10590f976be01623
chr1    2985721 2985854 +       target_1
chr1    3102668 3103058 +       target_2
chr1    3160630 3160721 +       target_3
with the header created as described here: http://www.broadinstitute.org/gsa/wi...ference_genome

Maybe try something like that?
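
For anyone wanting to try this, here is a rough Python sketch of turning a plain BED file into that kind of interval file: a SAM-style header followed by contig, start, end, strand and name columns. The function name `bed_to_interval_list` and the `target_N` fallback names are my own choices for illustration. It also assumes the header lines come from somewhere like the reference sequence dictionary, and that the result is saved with an `.interval_list` extension. I have not tested the output against GATK 1.3, so treat it as a starting point.

```python
def bed_to_interval_list(header_lines, bed_lines):
    """Yield the lines of a Picard-style interval list.

    header_lines: SAM header lines (@HD, @SQ, ...), e.g. from the reference .dict
    bed_lines:    BED records (0-based start, end-exclusive)
    """
    for line in header_lines:
        line = line.rstrip("\n")
        if line.startswith("@"):
            yield line

    count = 0
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and the optional BED comment/track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        count += 1
        # Use the BED name (column 4) and strand (column 6) when present.
        name = fields[3] if len(fields) > 3 else f"target_{count}"
        strand = fields[5] if len(fields) > 5 and fields[5] in ("+", "-") else "+"
        # BED starts are 0-based; interval lists are 1-based and inclusive,
        # so only the start coordinate shifts by one.
        yield "\t".join([chrom, str(start + 1), str(end), strand, name])


# Made-up two-record example using b37-style contig names.
header = ["@HD\tVN:1.0\tSO:unsorted", "@SQ\tSN:1\tLN:249250621"]
bed = ["1\t14467\t14587", "1\t14639\t14883"]
print("\n".join(bed_to_interval_list(header, bed)))
```

The contig names in the header have to match the ones in the BED file and the BAM (here "1", not "chr1"), otherwise the intervals will not line up with the reference.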
Tags
depth of coverage, depthofcoverage, gatk, tribble

All times are GMT -8.