SEQanswers

Old 08-22-2012, 07:47 AM   #1
palc
Senior Scientist
 
Location: Auckland, New Zealand

Join Date: Jun 2012
Posts: 6
Artemis can't read my indel vcf file

I called raw SNPs and INDELs into separate VCF files using GATK. I then tried to view them in Artemis after loading my EMBL and FASTA reference files. Artemis read the raw SNP file and displayed it with nice graphics in the top section of the window. But when I load the INDEL VCF file, nothing is displayed, even though the Artemis log says the VCF file is visible. The same happens when I filter my raw SNP VCF file and try to load the resulting new VCF: Artemis can no longer read it.
Old 08-25-2012, 11:32 PM   #2
Torst
Senior Member
 
Location: The University of Melbourne, AUSTRALIA

Join Date: Apr 2008
Posts: 275

I recall someone saying that GATK produces illegal VCF files for indels. That is, it doesn't include a context base (e.g. "AG"=>"A") and instead just writes "G"=>"-". I suspect Artemis is expecting a valid VCF file.
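
If you want to check a file for this yourself, here is a rough Python sketch. It is not anything Artemis or GATK ships; the function names are made up for illustration. It assumes plain tab-delimited VCF lines, skips symbolic alleles such as <DEL>, and only tests whether REF and ALT of an indel start with the same base (it does not handle the special case of an event at position 1).

```python
def has_context_base(ref, alt):
    """Return True if REF and ALT both start with the same anchoring base."""
    if not ref or not alt or "-" in ref or "-" in alt:
        return False
    return ref[0].upper() == alt[0].upper()


def indel_problems(vcf_lines):
    """Collect (POS, REF, ALT) for indel records that lack a context base.

    Hypothetical helper: expects tab-delimited VCF data lines and ignores
    header lines starting with '#'.
    """
    problems = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        pos, ref = fields[1], fields[3]
        for alt in fields[4].split(","):
            if alt.startswith("<"):
                continue  # symbolic allele, not checked in this sketch
            looks_like_indel = len(ref) != len(alt) or "-" in ref + alt
            if looks_like_indel and not has_context_base(ref, alt):
                problems.append((pos, ref, alt))
    return problems
```

Run over the lines of an uncompressed VCF (e.g. indel_problems(open("file.vcf"))), an empty list means every indel record carries its context base.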

Does it follow this format?
http://www.1000genomes.org/wiki/Anal...mat-version-41

You should send a message to the author Tim Carver on the Artemis Users mailing list:

http://www.mail-archive.com/artemis-...c.uk/info.html
Old 08-25-2012, 11:41 PM   #3
palc
Senior Scientist
 
Location: Auckland, New Zealand

Join Date: Jun 2012
Posts: 6

I think it does follow that format. Here it is:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SOD
NC_0077121_SODALIS_GLOSSINIDIUS_STR_MORSITANS_CHROMOSOME 26515 . AACCAGC A . . AC=69,69;DP=88;MM=0.13043478,0.36842105;MQ=32.565216,55.105263;NQSBQ=37.71994,36.090908;NQSMM=0.0014662757,0.0;SC=39,30,9,10 GT 0/1
NC_0077121_SODALIS_GLOSSINIDIUS_STR_MORSITANS_CHROMOSOME 87337 . C CAAT . . AC=10,10;DP=10;MM=4.0,0.0;MQ=29.0,0.0;NQSBQ=37.96,0.0;NQSMM=0.1,0.0;SC=3,7,0,0 GT 0/1
NC_0077121_SODALIS_GLOSSINIDIUS_STR_MORSITANS_CHROMOSOME 89973 . CAGGCCGAAATAGG C . . AC=34,34;DP=54;MM=0.029411765,0.2;MQ=32.617645,46.45;NQSBQ=37.64072,37.62434;NQSMM=0.0,0.0;SC=14,20,6,14 GT 0/1
NC_0077121_SODALIS_GLOSSINIDIUS_STR_MORSITANS_CHROMOSOME 95838 . G GA . . AC=62,62;DP=64;MM=0.37096775,0.5;MQ=58.48387,60.0;NQSBQ=36.989933,39.916668;NQSMM=0.0016778524,0.083333336;SC=48,14,1,1 GT 0/1
NC_0077121_SODALIS_GLOSSINIDIUS_STR_MORSITANS_CHROMOSOME 102366 . T TCATCAG . . AC=45,49;DP=80;MM=0.06666667,0.41935483;MQ=35.77778,43.612904;NQSBQ=37.70068,35.731617;NQSMM=0.0,0.014705882;SC=16,29,19,12 GT 0/1
NC_0077121_SODALIS_GLOSSINIDIUS_STR_MORSITANS_CHROMOSOME 104553 . TAAAAGGGCA T . . AC=41,41;DP=62;MM=0.09756097,0.23809524;MQ=37.0,47.666668;NQSBQ=37.430695,35.70051;NQSMM=0.0024752475,0.0;SC=22,19,11,10 GT 0/1
NC_0077121_SODALIS_GLOSSINIDIUS_STR_MORSITANS_CHROMOSOME 107398 . GA G . . AC=75,75;DP=80;MM=0.29333332,1.4;MQ=18.466667,17.0;NQSBQ=38.039085,38.822224;NQSMM=0.0013477089,0.022222223;SC=14,61,5,0 GT 0/1

I will send an e-mail to Tim then.
Old 08-26-2012, 03:07 PM   #4
Torst
Senior Member
 
Location: The University of Melbourne, AUSTRALIA

Join Date: Apr 2008
Posts: 275

Yes, it does look like it is representing indels correctly. Hmmm.
Old 08-28-2012, 06:13 AM   #5
palc
Senior Scientist
 
Location: Auckland, New Zealand

Join Date: Jun 2012
Posts: 6
It's solved

The SNP and INDEL files I produced using GATK were error-free; my mistake was in the compression and indexing steps. The VCF files need to be compressed with bgzip and indexed with tabix, as below:

e.g. bgzip file.vcf (will create file.vcf.gz)
e.g. tabix -p vcf file.vcf.gz (will create file.vcf.gz.tbi)

I hadn't used the -p vcf flag, so Artemis didn't recognize the VCF file and couldn't display it. I had a chat with Tim Carver, the author of Artemis, who guided me all the way. Thanks Tim.
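
For anyone preparing several files, the two steps can be wrapped in a small Python sketch so the -p vcf flag is never left out. This is only an illustration: build_commands and compress_and_index are hypothetical names, and it assumes bgzip and tabix are installed and on your PATH. The commands themselves are exactly the two shown above.

```python
import subprocess
from pathlib import Path


def build_commands(vcf_path):
    """Return the bgzip and tabix command lines for one VCF file."""
    vcf = Path(vcf_path)
    gz = vcf.with_name(vcf.name + ".gz")
    return [
        ["bgzip", str(vcf)],                # writes file.vcf.gz
        ["tabix", "-p", "vcf", str(gz)],    # writes file.vcf.gz.tbi
    ]


def compress_and_index(vcf_path):
    """Run both steps in order, stopping if either command fails."""
    for cmd in build_commands(vcf_path):
        subprocess.run(cmd, check=True)
    gz = Path(str(vcf_path) + ".gz")
    return gz, Path(str(gz) + ".tbi")
```

Calling compress_and_index("file.vcf") should leave file.vcf.gz and file.vcf.gz.tbi side by side, which is the pair that worked for me in Artemis.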

Tags
artemis, snps indels, vcftools.
