Hi,
I am trying to create a vcf from a bam file using samtools and bcftools
like that:
I would like to have an uncompressed vcf with only the variants in it.
The command runs without errors. When I try to open the vcf file in an editor (e.g. TextWrangler) it opens without a problem
but when I try to open it in the terminal I get this strange ascii characters:
But when I am using the bcftools command I can see the file
I have found this as I was trying to upload my vcf files into IGV anf got the following error
As you can see the CROM line is there, but somehow IGV donesn't recognize the file.
Am I doing something wrong in the conversion?
What is the right way to convert a bam file to vcf.
thanks,
Assa
I am trying to create a vcf from a bam file using samtools and bcftools
like that:
Code:
samtools mpileup -ugf ~/genomes/Mus_musculus/GRCm38.p2/mm_ref_GRCm38.p2.fa bamfile | bcftools call -vmO z -o file.vcf
The command runs without errors. When I try to open the vcf file in an editor (e.g. TextWrangler) it opens without a problem
Code:
##fileformat=VCFv4.2 ##FILTER=<ID=PASS,Description="All filters passed"> ##samtoolsVersion=1.2+htslib-1.2.1 ##samtoolsCommand=samtools mpileup -ugf /home/yeroslaviz/genomes/Mus_musculus/Ensembl/NCBIM37/Sequence/WholeGenomeFasta/genome.fa ../bamFiles_trimmedPlusgtf/Q_trimmedPlusgtf.bam ../bamFiles_trimmedPlusgtf/R_trimmedPlusgtf.bam ../bamFiles_trimmedPlusgtf/S_trimmedPlusgtf.bam ../bamFiles_trimmedPlusgtf/T_trimmedPlusgtf.bam ##reference=file:///home/yeroslaviz/genomes/Mus_musculus/Ensembl/NCBIM37/Sequence/WholeGenomeFasta/genome.fa ##contig=<ID=1,length=197195432> ##contig=<ID=10,length=129993255> ##contig=<ID=11,length=121843856> ##contig=<ID=12,length=121257530> ##contig=<ID=13,length=120284312> ##contig=<ID=14,length=125194864> ##contig=<ID=15,length=103494974> ##contig=<ID=16,length=98319150> ##contig=<ID=17,length=95272651> ##contig=<ID=18,length=90772031> ##contig=<ID=19,length=61342430> ... ##INFO=<ID=DP4,Number=4,Type=Integer,Description="Number of high-quality ref-forward , ref-reverse, alt-forward and alt-reverse bases"> ##INFO=<ID=MQ,Number=1,Type=Integer,Description="Average mapping quality"> ##bcftools_callVersion=1.2+htslib-1.2.1 ##bcftools_callCommand=call -vmO z -o FR.Ileum.vcf #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT ../bamFiles_trimmedPlusgtf/Q_trimmedPlusgtf.bam ../bamFiles_trimmedPlusgtf/R_trimmedPlusgtf.bam ../bamFiles_trimmedPlusgtf/S_trimmedPlusgtf.bam ../bamFiles_trimmedPlusgtf/T_trimmedPlusgtf.bam 1 3044386 . G A 59 . DP=2;VDB=0.1;SGB=-0.314421;MQ0F=0;AC=4;AN=4;DP4=0,0,2,0;MQ=50 GT:PL ./.:0,0,0 ./.:0,0,0 1/1:41,3,0 1/1:41,3,0 1 3044390 . T A 58 . DP=2;VDB=0.1;SGB=-0.314421;MQ0F=0;AC=4;AN=4;DP4=0,0,2,0;MQ=50 GT:PL ./.:0,0,0 ./.:0,0,0 1/1:40,3,0 1/1:41,3,0
Code:
>head C.vcf ??ɮ?"?=?rɑ?W f^?X?+밖??@#Z?c$x???$Hb,????f?? ????D?Ǘ?????)?w?[??n??????㫯?ŷ???o????|??Z>?}y{??\|??????k?d????????>|???z???jy???ɫ?O/^J????ߖ??O????ry??|?Z/??'????????s?????>???ӓ???1?]~Z??c?h????I1?؇????g????ry?y??;?()??u???w4??*iAo?!Z0 ????? ??C5ww?j?}?;?TN9?s?Nw8+?㰃??7?0B߾??1?ɝ?As???i3??5[743?3Vn??L?? ??8?????!O??Hk??<?f?3????|9?wh?X[W????50ᶷ??f????r??<~{???]~?]?????xqy??\?q???????v????_-o?/?N<yy???k???????ɫoWg?d?????????͇???x?a?^??Yk|[??/n/???»??l???o??<<???z?yy??엋????v5????o>?o???????ׯ7?????????r??/?o=????b???O????????1?]???p?q?u?e? ?=??0??ʼn|v? ??????j?L$?Bp??????tu?^????'? q?}5=]-?9??X/?|??%y?\?W???I???M?9?????ӿ?X_/?m?뿦?{K?}s??Єzt;@o?{?????\|?J?????????????-V˻A{?????}5~??E?~?w?G?~??|???? ???q|?D?a?=vęz?:K??x????"³????o_N?z?)?yω??"?˛/?ˏ???KD?8??2??????ŗ?????ǟϷp?iĉ?|?y|?-?1?/???.?i?f7?O?.>\ ??UB??bq??r????????&B??ſ?~?J??? g:K`?=s=??b?????<́-????D?r?PBIs9??????K\?(?6?_C;?7k??f??????mҬ??U??_?'???????z??????????v???t??\?Kt??xџ???y9f?S????r|z???????G?߸/I??????????OoƳ???????o_??y?n???????G???_?6?e??? ?ftW!??$S?X;????p???ߞ?9?O?????4?ZxJ??=?Å?p㟱 <????3?WEv?O?zlt>??_F? ??T'?%p?!???PSÃN%???? ?i9???V?I0????6? Xa~䅪?(F?#Q????-??4??????}??DJg:Pg?M??F?c؝?f?Y??#?N?_?(? ??qh???L??3????Ì??H?7??x??"~d?}???s??c1????i1A???,??3܃??H?s??Fj???<?u?2o?YT?#8=???????k?7T????? 2???>s?hs??j?;???)?[k??L????#4d1'?4??~???icIO???!8??Ƶ???.x ??P??R?4>?!???ө????Z?Y ?;??k/?Dp?8x?:?? V?5??Q:pW????A)a? ??c?Ӂ ?p?@3Jɸ??M\bd?4L$? "c9ܖ?
Code:
bcftools view C.vcf | head ##fileformat=VCFv4.2 ##FILTER=<ID=PASS,Description="All filters passed"> ##samtoolsVersion=1.2+htslib-1.2.1 ##samtoolsCommand=samtools mpileup -ugf /home/yeroslaviz/genomes/Mus_musculus/Ensembl/NCBIM37/Sequence/WholeGenomeFasta/genome.fa ../bamFiles_trimmedPlusgtf/C_trimmedPlusgtf.bam ##reference=file:///home/yeroslaviz/genomes/Mus_musculus/Ensembl/NCBIM37/Sequence/WholeGenomeFasta/genome.fa ##contig=<ID=1,length=197195432> ##contig=<ID=10,length=129993255> ##contig=<ID=11,length=121843856> ...
I have found this as I was trying to upload my vcf files into IGV anf got the following error
Code:
Error loading /home/VariantCalling/C.vcf: Unable to parse header with error: Your input file has a malformed header: We never saw the required CHROM header line (starting with one #) for the input VCF file, for input source: /home/VariantCalling/C.vcf
Am I doing something wrong in the conversion?
What is the right way to convert a bam file to vcf.
thanks,
Assa
Comment