  • Great BAMs, good .fai so why empty VCFs?

    Greetings. I am using BWA to map reads from the Illumina 1.5 pipeline to contigs assembled with Velvet, and SAMtools to call variants.

    Code:
    samtools mpileup -uf contigs_ref.fasta Sample_to_contigs_rmdup.bam | bcftools view -bvcg - > Sample_to_contigs_rmdup.bam.bcf
    
    bcftools view Sample_to_contigs_rmdup.bam.bcf | vcfutils.pl varFilter > Sample_to_contigs_rmdup.bam.vcf
    In a genome browser the .bam looks perfect: I can see nice SNPs, about as many as I would expect. But the VCFs come out empty.

    When I look at the raw BCF/VCF with all the positions (not just variants) there are only 34 positions from the middle of one contig (somewhere in the middle of the reference file) and no variants, rather than the 951462 positions that should be there with plenty of variants.

    It looks like this:

    Code:
    ##CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	Sample_to_contigs_rmdup.bam
    Contig_14090_length_986	852	.	A	.	35.9	.	DP=2;AF1=0;AC1=0;DP4=2,0,0,0;MQ=14;FQ=-33	PL	0
    Contig_14090_length_986	853	.	T	.	32.9	.	DP=3;AF1=0;AC1=0;DP4=2,0,1,0;MQ=12;FQ=-30.1;PV4=1,1,0.33,1	PL	0
    Contig_14090_length_986	854	.	A	.	32.9	.	DP=4;VDB=0.0768;AF1=0;AC1=0;DP4=2,0,1,0;MQ=12;FQ=-30.1;PV4=1,1,0.33,0	PL	0
    Contig_14090_length_986	855	.	T	.	17.3	.	DP=8;VDB=0.0004;AF1=0.6089;AC1=1;DP4=0,2,3,0;MQ=11;FQ=-23.6;PV4=0.1,1,0.47,1	PL	12
    Contig_14090_length_986	856	.	T	.	48	.	DP=11;AF1=0;AC1=0;DP4=4,2,0,0;MQ=13;FQ=-45	PL	0
    Contig_14090_length_986	857	.	A	.	48	.	DP=14;AF1=0;AC1=0;DP4=4,2,0,0;MQ=13;FQ=-45	PL	0
    Contig_14090_length_986	858	.	T	.	48	.	DP=15;AF1=0;AC1=0;DP4=4,2,0,0;MQ=13;FQ=-45	PL	0
    Contig_14090_length_986	859	.	A	.	48	.	DP=17;AF1=0;AC1=0;DP4=4,2,0,0;MQ=13;FQ=-45	PL	0
    Contig_14090_length_986	860	.	T	.	48	.	DP=18;AF1=0;AC1=0;DP4=4,2,0,0;MQ=13;FQ=-45	PL	0
    Contig_14090_length_986	861	.	A	.	45	.	DP=19;AF1=0;AC1=0;DP4=4,1,0,0;MQ=14;FQ=-42	PL	0
    Contig_14090_length_986	862	.	T	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	863	.	A	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	864	.	T	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	865	.	A	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	866	.	T	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	867	.	A	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	868	.	T	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	869	.	A	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	870	.	T	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	871	.	A	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	872	.	T	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	873	.	A	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	874	.	T	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	875	.	A	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	876	.	T	.	42	.	DP=19;AF1=0;AC1=0;DP4=4,0,0,0;MQ=14;FQ=-39	PL	0
    Contig_14090_length_986	877	.	A	.	35.9	.	DP=16;AF1=0;AC1=0;DP4=2,0,0,0;MQ=14;FQ=-33	PL	0
    Contig_14090_length_986	878	.	A	.	10.4	.	DP=15;VDB=0.0015;AF1=1;AC1=2;DP4=0,0,1,0;MQ=20;FQ=-30	PL	20
    Contig_14090_length_986	879	.	A	.	33	.	DP=15;AF1=0;AC1=0;DP4=1,0,0,0;MQ=20;FQ=-30	PL	0
    Contig_14090_length_986	880	.	T	.	33	.	DP=14;AF1=0;AC1=0;DP4=1,0,0,0;MQ=20;FQ=-30	PL	0
    Contig_14090_length_986	881	.	A	.	33	.	DP=13;AF1=0;AC1=0;DP4=1,0,0,0;MQ=20;FQ=-30	PL	0
    Contig_14090_length_986	882	.	T	.	33	.	DP=9;AF1=0;AC1=0;DP4=1,0,0,0;MQ=20;FQ=-30	PL	0
    Contig_14090_length_986	883	.	T	.	28.2	.	DP=6;VDB=0.0300;;AC1=2;FQ=-30	PL	0
    Contig_14090_length_986	884	.	A	.	28.2	.	DP=4;;AC1=2;FQ=-30	PL	0
    Contig_14090_length_986	885	.	A	.	28.2	.	DP=1;;AC1=2;FQ=-30	PL	0
    The coverage here is also ridiculously low; most positions in the BAM are closer to 50-100X.
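    To put a number on the depth and mapping quality in a listing like the one above, the DP and MQ tags in the INFO column can be averaged. This is only a minimal sketch: it assumes uncompressed VCF text on stdin, and the function name `vcf_dp_mq_summary` is made up for illustration.

```shell
# Summarise DP and MQ from the INFO column (column 8) of VCF text read on stdin.
# Prints the record count plus the mean DP and mean MQ over records carrying each tag.
# NOTE: vcf_dp_mq_summary is a hypothetical helper name, not part of samtools/bcftools.
vcf_dp_mq_summary() {
    awk -F'\t' '
        /^#/ { next }                        # skip header lines
        {
            n++
            m = split($8, tags, ";")         # INFO is a ;-separated list of KEY=VALUE
            for (i = 1; i <= m; i++) {
                # "^DP=" does not match DP4=..., so only the depth tag is counted
                if (tags[i] ~ /^DP=/) { dp += substr(tags[i], 4); ndp++ }
                if (tags[i] ~ /^MQ=/) { mq += substr(tags[i], 4); nmq++ }
            }
        }
        END {
            if (n == 0) { print "records=0"; exit }
            printf "records=%d mean_DP=%.1f mean_MQ=%.1f\n", n, (ndp ? dp / ndp : 0), (nmq ? mq / nmq : 0)
        }'
}
```

    With the commands from this post it could be fed as `bcftools view Sample_to_contigs_rmdup.bam.bcf | vcf_dp_mq_summary`. In the records shown, MQ sits between 11 and 20, so a low mean MQ would be no surprise.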

    I have had the problem before when the fasta index (.fai) file was not made properly, but this time it's perfect.
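    A quick sanity check along these lines: the second column of a .fai is the sequence length, so summing it gives the number of reference positions, which can be compared with the records that actually reach the VCF. The sketch below assumes a tab-separated .fai as written by `samtools faidx` and uncompressed VCF text; both function names are hypothetical.

```shell
# Total reference length: sum of column 2 (sequence length) of a .fai index file.
# NOTE: fai_total_length and vcf_records_per_contig are made-up helper names.
fai_total_length() {
    awk -F'\t' '{ total += $2 } END { printf "%.0f\n", total }' "$1"
}

# Number of records per contig in VCF text read on stdin (header lines skipped).
# Output: contig name, a tab, then the record count, sorted by contig name.
vcf_records_per_contig() {
    awk -F'\t' '!/^#/ { count[$1]++ } END { for (c in count) print c "\t" count[c] }' | sort
}
```

    For example, `fai_total_length contigs_ref.fasta.fai` gives the total to expect from an all-positions pileup, and `bcftools view Sample_to_contigs_rmdup.bam.bcf | vcf_records_per_contig` shows which contigs contributed any records at all.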

    What could be going on here?
    Last edited by Genomics101; 06-12-2013, 04:28 AM.

  • #2
    When you looked at the BAM file, did you check the mapping qualities? I had this happen when, for some reason (settings in Bowtie), the mapping quality was 0 for the whole sample. All variants were then counted as insignificant.
    Try getting a "raw VCF" (without applying vcfutils.pl varFilter) and check whether that is empty, too.
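    One way to check the mapping qualities is to tabulate column 5 (MAPQ) of the SAM records. This is a rough sketch rather than a definitive recipe: it reads SAM text on stdin, and the function name `mapq_histogram` is invented for this example.

```shell
# Tabulate MAPQ (column 5 of a SAM record) from SAM text read on stdin.
# Output columns: MAPQ value, number of reads, percentage of all reads.
# NOTE: mapq_histogram is a hypothetical helper name.
mapq_histogram() {
    awk -F'\t' '
        /^@/ { next }                  # skip SAM header lines, if present
        { hist[$5]++; n++ }            # count reads per MAPQ value
        END {
            for (q in hist) printf "%d\t%d\t%.1f%%\n", q, hist[q], 100 * hist[q] / n
        }' | sort -n
}
```

    Usage would be something like `samtools view Sample_to_contigs_rmdup.bam | mapq_histogram`. If nearly all reads land in the 0 row, the mapping step is the place to look.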



    • #3
      @sBeier

      I should add that the VCF I posted up there is the raw VCF; I'll edit the post to reflect this.
