SEQanswers

Old 04-19-2013, 06:06 AM   #1
Lilach
Member
 
Location: Israel

Join Date: Sep 2011
Posts: 20
Error with the reference file for GATK DepthOfCoverage

Hi,
I am trying to run GATK DepthOfCoverage on chrX exome data.
I received the BAM files from the lab that ran the NGS, but I don't have the human reference genome FASTA file they used. I tried supplying my own human genome FASTA as the reference, but I got the error below.
Is there a way to work around it without getting their FASTA file? (I'm not sure I can get it quickly, and this is urgent.)
I only need to check the chrX reads, although the data were aligned to the whole genome.

Thanks!

##### ERROR MESSAGE: Input files reads and reference have incompatible contigs: Found contigs with the same name but different lengths:
##### ERROR contig reads = chrM / 16569
##### ERROR contig reference = chrM / 16571.
##### ERROR reads contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chrX, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr20, chrY, chr19, chr22, chr21, chrM]
##### ERROR reference contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM, chr1_gl000191_random, chr1_gl000192_random, chr4_ctg9_hap1, chr4_gl000193_random, chr4_gl000194_random, chr7_gl000195_random, chr8_gl000196_random, chr8_gl000197_random, chr9_gl000198_random, chr9_gl000199_random, chr9_gl000200_random, chr9_gl000201_random, chr11_gl000202_random, chr17_ctg5_hap1, chr17_gl000203_random, chr17_gl000204_random, chr17_gl000205_random, chr17_gl000206_random, chr18_gl000207_random, chr19_gl000208_random, chr19_gl000209_random, chr21_gl000210_random, chrUn_gl000211, chrUn_gl000212, chrUn_gl000213, chrUn_gl000214, chrUn_gl000215, chrUn_gl000216, chrUn_gl000217, chrUn_gl000218, chrUn_gl000219, chrUn_gl000220, chrUn_gl000221, chrUn_gl000222, chrUn_gl000223, chrUn_gl000224, chrUn_gl000225, chrUn_gl000226, chrUn_gl000227, chrUn_gl000228, chrUn_gl000229, chrUn_gl000230, chrUn_gl000231, chrUn_gl000232, chrUn_gl000233, chrUn_gl000234, chrUn_gl000235, chrUn_gl000236, chrUn_gl000237, chrUn_gl000238, chrUn_gl000239, chrUn_gl000240, chrUn_gl000241, chrUn_gl000242, chrUn_gl000243, chrUn_gl000244, chrUn_gl000245, chrUn_gl000246, chrUn_gl000247, chrUn_gl000248, chrUn_gl000249]
##### ERROR ------------------------------------------------------------------------------------------
Old 04-19-2013, 06:15 AM   #2
Zaag
Senior Member
 
Location: Amsterdam

Join Date: Nov 2009
Posts: 112

https://code.google.com/p/vcfsorter/

should do the trick
Old 04-19-2013, 10:43 AM   #3
Lilach
Member
 
Location: Israel

Join Date: Sep 2011
Posts: 20

Hi Zaag,
Thank you for the answer, but as far as I know I should give the BAM file, not the VCF, as input to GATK DepthOfCoverage, since I want to know which targets were not covered (a target can be highly covered and still have no variants).
My BAM file is already sorted. I guess the problem is that its header differs from the FASTA file's header, because the supplier used a different FASTA file as the reference.
Any other ideas?
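
One way to check the guess above is to compare the contig names and lengths recorded in the BAM header with those of the reference. The sketch below is only an illustration: it assumes the header text was saved from `samtools view -H` and the reference index was made with `samtools faidx`. The function names and the sample strings are hypothetical; only the chrM lengths come from the error message.

```python
def parse_sam_header_contigs(header_text):
    """Return [(name, length), ...] from the @SQ lines of a SAM/BAM header, in order."""
    contigs = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value, e.g. SN:chrM
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs.append((fields["SN"], int(fields["LN"])))
    return contigs


def parse_fai_contigs(fai_text):
    """Return [(name, length), ...] from the first two columns of a .fai index."""
    contigs = []
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            contigs.append((name, int(length)))
    return contigs


def length_mismatches(read_contigs, ref_contigs):
    """List (name, reads_length, reference_length) for shared names whose lengths differ."""
    ref = dict(ref_contigs)
    return [(name, ln, ref[name])
            for name, ln in read_contigs
            if name in ref and ref[name] != ln]


# Illustrative input only; real text would come from the BAM header and the .fai file.
header = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:chrX\tLN:155270560\n@SQ\tSN:chrM\tLN:16569\n"
fai = "chrX\t155270560\t6\t50\t51\nchrM\t16571\t158375985\t50\t51\n"
print(length_mismatches(parse_sam_header_contigs(header), parse_fai_contigs(fai)))
```

Any contig reported here has the same name but a different sequence length in the two files, which is the condition the GATK error describes for chrM.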
Old 04-20-2013, 02:28 PM   #4
Lilach
Member
 
Location: Israel

Join Date: Sep 2011
Posts: 20

I can create a new human reference genome without chrM and the other contigs, with only chr1-22. Is there a way to tell GATK DepthOfCoverage to ignore chrM?
Old 04-21-2013, 11:55 PM   #5
Zaag
Senior Member
 
Location: Amsterdam

Join Date: Nov 2009
Posts: 112

No, you have chrM in your reads. I would recommend downloading the hg19 reference (chromFa.tar.gz) from here:
http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/
Old 04-22-2013, 08:59 AM   #6
Lilach
Member
 
Location: Israel

Join Date: Sep 2011
Posts: 20

My reference genome is indeed from the hg19 chromFa.tar.gz, and the chrM.fa file in it is the 16571 bp version.

I found a solution, and I'm writing it here for the community.
The 16569 bp chrM FASTA can be downloaded from the NCBI Nucleotide database: NC_012920

I solved all the other problems by using cat to build a new reference file containing only the chromosomes.
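
The concatenation step could look roughly like the sketch below. It is not the exact procedure used in this thread. It assumes one FASTA file per chromosome (as in chromFa.tar.gz), writes them in a caller-supplied order, and renames each header line so that a sequence downloaded from NCBI ends up named chrM. The function name, the `overrides` parameter, and the file names in the comments are hypothetical.

```python
from pathlib import Path


def build_reference(contig_order, fasta_dir, out_path, overrides=None):
    """Concatenate single-sequence FASTA files into one reference, in the given order.

    contig_order: contig names in the desired order, e.g. the order in the BAM header.
    fasta_dir:    directory holding <name>.fa files (assumed layout).
    overrides:    optional {name: path} for contigs taken from another file,
                  e.g. a chrM sequence downloaded separately.
    """
    overrides = overrides or {}
    fasta_dir = Path(fasta_dir)
    with open(out_path, "w") as out:
        for name in contig_order:
            src = Path(overrides.get(name, fasta_dir / f"{name}.fa"))
            with open(src) as fh:
                first = fh.readline()
                if not first.startswith(">"):
                    raise ValueError(f"{src} does not look like a FASTA file")
                # Replace the original header so the contig name matches the BAM.
                out.write(f">{name}\n")
                for line in fh:
                    if line.startswith(">"):
                        raise ValueError(f"{src} holds more than one sequence")
                    out.write(line if line.endswith("\n") else line + "\n")


# Hypothetical usage, with the order taken from the "reads contigs" list in the error:
# order = ["chr1", "chr2", "chr3", "chr4", "chr5", "chr6", "chr7", "chrX", "chr8"]  # and so on
# build_reference(order, "chromFa", "reference.fa", overrides={"chrM": "NC_012920.fa"})
```

Two points to keep in mind, both stated as assumptions: GATK of that era is generally expected to want the reference contigs in the same order as the BAM header, not only with the same lengths, and the new FASTA needs a fresh index and sequence dictionary (for example with `samtools faidx` and Picard CreateSequenceDictionary) before GATK will accept it.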

Thank you, Zaag, for the help!