I'm trying to estimate the number of reads representing each allele across multiple samples in RNA-seq data.
I've identified SNPs with samtools mpileup and bcftools call, but allelic depth is not provided in the VCF.
I gather that GATK VariantAnnotator (-A DepthPerAlleleBySample) might be able to extract this info from the VCF, but GATK rejects my VCF header as malformed (error below).
Does anyone have other suggestions for getting per-sample, per-allele counts? Or a suggestion to get around this formatting issue?
Code:
Your input file has a malformed header: unexpected tag count 5 in line <ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias for filtering splice-site artefacts in RNA-seq data (bigger is better)",Version="3">
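One workaround that might be worth trying, sketched here under assumptions: the error counts 5 tags on that header line (ID, Number, Type, Description, Version), so the extra Version="3" attribute may be what GATK objects to. The Python sketch below rewrites the VCF with that attribute stripped from ##INFO and ##FORMAT header lines. The function names and file paths are hypothetical, and I have not confirmed that GATK accepts the result.

```python
import re

# Matches a trailing ,Version="..." attribute just before the closing ">"
# of a VCF meta-information line.
VERSION_ATTR = re.compile(r',Version="[^"]*"(?=>\s*$)')


def strip_version_attr(line):
    """Remove the Version="..." attribute from ##INFO/##FORMAT header lines.

    All other lines (other headers, the #CHROM line, variant records)
    are returned unchanged.
    """
    if line.startswith("##INFO=") or line.startswith("##FORMAT="):
        return VERSION_ATTR.sub("", line)
    return line


def fix_vcf_header(in_path, out_path):
    """Copy an uncompressed VCF, stripping Version attributes from its header."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(strip_version_attr(line))
```

Usage would be something like fix_vcf_header("calls.vcf", "calls.fixed.vcf") (placeholder file names), then pointing GATK at the fixed file. This only handles plain-text VCFs; a bgzipped file would need to be decompressed first.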
Thanks!