Hi all,
I am new to SAMtools and bcftools. I used HISAT2 for alignment, and samtools for sorting, indexing and SNP calling. Here is the script I used to call SNPs and generate the VCF file. However, I could not get the GT (genotype) field in the VCF files (I have also pasted the VCF content below, in italics). Do you have any suggestions for getting the GT field?
module load SAMtools
samtools mpileup -g -f genomic.fna B12.sorted.bam > B12.sorted.bam.raw.bcf
module load BCFtools/1.9-foss-2016b
bcftools view B12.sorted.bam.raw.bcf | vcfutils.pl varFilter - > B12.sorted.bam.raw.vcf
_##contig=<ID=ONZH01031391.1,length=44893>
##ALT=<ID=*,Description="Represents allele(s) other than observed.">
##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
##INFO=<ID=IDV,Number=1,Type=Integer,Description="Maximum number of reads supporting an indel">
##INFO=<ID=IMF,Number=1,Type=Float,Description="Maximum fraction of reads supporting an indel">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">
##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias for filtering splice-site artefacts in RNA-seq data (bigger is better)",Version="3">
##INFO=<ID=RPB,Number=1,Type=Float,Description="Mann-Whitney U test of Read Position Bias (bigger is better)">
##INFO=<ID=MQB,Number=1,Type=Float,Description="Mann-Whitney U test of Mapping Quality Bias (bigger is better)">
##INFO=<ID=BQB,Number=1,Type=Float,Description="Mann-Whitney U test of Base Quality Bias (bigger is better)">
##INFO=<ID=MQSB,Number=1,Type=Float,Description="Mann-Whitney U test of Mapping Quality vs Strand Bias (bigger is better)">
##INFO=<ID=SGB,Number=1,Type=Float,Description="Segregation based metric.">
##INFO=<ID=MQ0F,Number=1,Type=Float,Description="Fraction of MQ0 reads (smaller is better)">
##INFO=<ID=I16,Number=16,Type=Float,Description="Auxiliary tag used for calling, see description of bcf_callret1_t in bam2bcf.h">
##INFO=<ID=QS,Number=R,Type=Float,Description="Auxiliary tag used for calling">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##bcftools_viewVersion=1.9+htslib-1.9
##bcftools_viewCommand=view W9.raw.bcf; Date=Fri Apr 17 14:08:37 2020
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT /home/fastdir/a1674603/msGBS_analysis/03_Aligning/combined/sorted/W9.sorted.bam
ONZH01000001.1 14661 . A <*> 0 . DP=2;I16=2,0,0,0,64,2048,0,0,120,7200,0,0,0,0,0,0;QS=1,0;MQ0F=0 PL 0,6,58
ONZH01000001.1 14672 . T <*> 0 . DP=3;I16=3,0,0,0,119,4731,0,0,180,10800,0,0,33,363,0,0;QS=1,0;MQ0F=0 PL 0,9,100
ONZH01000001.1 14683 . A <*> 0 ._
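To double-check that GT is really missing from every record, and not just from the lines pasted above, the FORMAT column can be listed on its own. This is only a minimal sketch: it assumes the VCF is tab-separated as in the standard format, and the function name format_keys is made up for illustration.

```shell
# List the distinct values of the FORMAT column (column 9) in a VCF read from stdin.
# Assumption: fields are tab-separated; header lines start with '#'.
format_keys() {
    awk -F'\t' '!/^#/ && NF >= 9 { print $9 }' | sort -u
}

# Usage (file name taken from the script above):
#   format_keys < B12.sorted.bam.raw.vcf
```

If the only value printed is PL, then no record carries a GT entry.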
Thanks,
Jia