Hi everyone,
I'm learning how to use bcftools and experimenting with its parameters to better understand what they really do. I ran two tests on the same paired-end BAM file, one with a minimum base quality of 20 and the other with a minimum of 2 reads for a site, and used bcftools stats to compare them (output below).
My question is: does anyone know where I can find documentation about this output and how it is computed? I thought the "2" id referred to the intersection between the two files, but the number of SNPs found (22393 for Q20, 275806 for m2 and 1764922 for both) doesn't make sense to me: the third number is bigger than the other two, and it doesn't correspond to the union either (22393 + 275806 = 298199).
Thanks in advance!
# This file was produced by bcftools stats (1.2+htslib-1.2.1) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats S37.uQ20.vcf.bgzip S37.um2.vcf.bgzip
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 S37.uQ20.vcf.bgzip
ID 1 S37.um2.vcf.bgzip
ID 2 S37.uQ20.vcf.bgzip S37.um2.vcf.bgzip
# SN, Summary numbers:
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 1 number of samples: 1
SN 0 number of records: 388196
SN 0 number of SNPs: 22393
SN 0 number of MNPs: 0
SN 0 number of indels: 112390
SN 0 number of others: 0
SN 0 number of multiallelic sites: 22439
SN 0 number of multiallelic SNP sites: 22393
SN 1 number of records: 277113
SN 1 number of SNPs: 275806
SN 1 number of MNPs: 0
SN 1 number of indels: 1307
SN 1 number of others: 0
SN 1 number of multiallelic sites: 276058
SN 1 number of multiallelic SNP sites: 275806
SN 2 number of records: 257769020
SN 2 number of SNPs: 1764922
SN 2 number of MNPs: 0
SN 2 number of indels: 38776
SN 2 number of others: 0
SN 2 number of multiallelic sites: 1773536
SN 2 number of multiallelic SNP sites: 1764922
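To make the comparison easier to read, here is a minimal Python sketch that groups the SN lines by set id so the counts can be viewed side by side. The helper name `parse_sn` and the regular expression are my own assumptions based only on the lines pasted above, not part of bcftools; it accepts either tabs or spaces between fields.

```python
import re
from collections import defaultdict

# Matches lines such as "SN 0 number of SNPs: 22393" (tabs or spaces between fields).
# Header lines starting with "#" do not match and are skipped.
SN_LINE = re.compile(r"^SN\s+(\d+)\s+(.+?):\s+(\d+)\s*$")


def parse_sn(text):
    """Return {set_id: {key: value}} built from the SN lines of the pasted output."""
    table = defaultdict(dict)
    for line in text.splitlines():
        match = SN_LINE.match(line)
        if match:
            set_id, key, value = match.groups()
            table[int(set_id)][key] = int(value)
    return dict(table)


# A few lines copied from the output above.
sample = """\
# SN [2]id [3]key [4]value
SN 0 number of records: 388196
SN 0 number of SNPs: 22393
SN 1 number of records: 277113
SN 1 number of SNPs: 275806
SN 2 number of records: 257769020
SN 2 number of SNPs: 1764922
"""

stats = parse_sn(sample)
for set_id in sorted(stats):
    print(set_id, stats[set_id]["number of records"], stats[set_id]["number of SNPs"])
```

This only tabulates the numbers; it says nothing about how bcftools defines each set.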