
  • Missing chromosomes in vcf output

    **UPDATE**

    Problem solved. The local copy of the .fai index file I was using contained only chromosome 1. The version on the server, which is the one I kept checking by mistake, was fine. Whoops.
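
    For anyone who lands here with the same symptom, one quick way to catch a truncated index is to list the sequences that appear in the reference FASTA but not in its .fai file. The sketch below assumes an uncompressed FASTA and a standard tab-separated .fai (sequence name in column 1); `missing_from_index` is a made-up helper name, not part of samtools.

    ```shell
    # Hypothetical helper: print FASTA sequence names that have no entry in the .fai.
    # $1 = uncompressed FASTA, $2 = .fai index (column 1 = sequence name).
    missing_from_index() {
        awk '
            # First file (the .fai): remember every indexed sequence name.
            FILENAME == ARGV[1] { indexed[$1] = 1; next }
            # Second file (the FASTA): a header line starts with ">";
            # the name is the text after ">" up to the first whitespace.
            /^>/ {
                name = substr($1, 2)
                if (!(name in indexed)) print name
            }
        ' "$2" "$1"
    }

    # Example call (file names are placeholders):
    #   missing_from_index hg19.fa hg19.fa.fai
    ```

    Any output means the index does not cover the whole reference, so it is probably worth rebuilding it or copying over the complete one.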

    Hello everyone!

    I have bam files for several histone ChIP-seq and RNA-seq experiments and I am attempting to call SNPs. I used the following command successfully with the ChIP-seq data:

    Code:
    samtools mpileup -uD -f hg19.fa INPUT.bam | bcftools view -bcvg - > INPUT_raw.bcf; 
    bcftools view INPUT_raw.bcf | vcfutils.pl varFilter -D100 > INPUT.vcf;
    For the RNA-seq data, the sequencing center changed their pipeline and I needed a different ENSEMBL build for mpileup:

    Code:
    samtools mpileup -uD -f Homo_sapiens.GRCh37.72.dna_rm.toplevel.fa.gz RNA.bam | bcftools view -bcvg - > RNA_raw.bcf; 
    bcftools view RNA_raw.bcf | vcfutils.pl varFilter -D100 > RNA.vcf;
    Everything seemed to run smoothly until the final step. I get no errors and no indication of a problem.
    The output VCF starts out just like all the others but doesn't have entries for any chromosomes except for chromosome 1.
    If I take out the vcfutils.pl varFilter step and run:

    Code:
    samtools mpileup -uD -f Homo_sapiens.GRCh37.72.dna_rm.toplevel.fa.gz RNA.bam | bcftools view -bcvg - > RNA_raw.bcf; 
    bcftools view RNA_raw.bcf > RNA.vcf;
    I end up with a MASSIVE VCF file with SNPs called for all chromosomes and no errors. The problem seems to be in the vcfutils.pl step, but I have used this script successfully with other files. I'm not sure what else to try or where the problem might be. Any advice? Please let me know if any additional information would be helpful! *Note*: I have only limited familiarity with this pipeline and am doing this as a favor for a lab mate who is out of town.
    Last edited by spyf89; 08-13-2014, 07:01 AM.

  • #2
    First thing to check: I wonder if one file is sorted chr1, chr2, chr3, ... and the other chr1, chr10, chr11, ...

    That discrepancy might explain why the file stops at chr1.
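
    A rough way to check this is to print the sequence names in the order the BAM header declares them and compare that with the order in the reference index. The sketch below assumes the header is piped in as SAM text (for example from `samtools view -H`); `sq_order` is a made-up helper name.

    ```shell
    # Hypothetical helper: read SAM header text on stdin and print the
    # @SQ sequence names (SN: tags) in the order they appear.
    sq_order() {
        awk -F'\t' '
            $1 == "@SQ" {
                for (i = 2; i <= NF; i++)
                    if ($i ~ /^SN:/) print substr($i, 4)
            }
        '
    }

    # Example calls (file names are placeholders):
    #   samtools view -H RNA.bam | sq_order    # order in the BAM header
    #   cut -f1 reference.fa.fai               # order in the reference index
    ```

    If the two lists differ in order or in naming (chr1 versus 1), that mismatch is worth ruling out first.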



    • #3
      I'll definitely check that out. I sorted the bam to see if that fixed the problem (no luck) but perhaps they are sorted differently.

      The only reason I think this might not fix things is that, without the vcfutils.pl varFilter step, there are SNPs called across all chromosomes. The information seems to be in order in the .bcf file; it's just not getting through the filtering step.
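
      To pin down where the chromosomes disappear, it may help to count records per chromosome in the unfiltered and filtered VCFs and compare the two. This is only a sketch: `vcf_chrom_counts` is a made-up helper name, and it assumes plain-text VCF with the chromosome in column 1.

      ```shell
      # Hypothetical helper: print "chromosome<TAB>record count" for a VCF,
      # in order of first appearance. Reads the named file(s), or stdin if none.
      vcf_chrom_counts() {
          awk -F'\t' '
              # Skip header lines (starting with "#") and blank lines.
              !/^#/ && NF {
                  if (!($1 in count)) order[++k] = $1
                  count[$1]++
              }
              END {
                  for (i = 1; i <= k; i++) print order[i] "\t" count[order[i]]
              }
          ' "$@"
      }

      # Example calls (file names are placeholders):
      #   bcftools view RNA_raw.bcf | vcf_chrom_counts   # before filtering
      #   vcf_chrom_counts RNA.vcf                       # after filtering
      ```

      Chromosomes that show up in the first listing but not the second are being lost in the filtering step.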

      Thank you for the response!
