  • Allelic Depth from VCF

    I'm trying to estimate the # of reads representing each allele across multiple samples in RNA-seq data.

    I've identified SNPs with samtools mpileup and vcftools call, but allelic depth is not provided in the vcf.

    I gather that GATK VariantAnnotator (-A DepthPerAlleleBySample) might be able to extract this info from the VCF, but GATK seems unhappy with my VCF format:

    Code:
    Your input file has a malformed header: unexpected tag count 5 in line <ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias for filtering splice-site artefacts in RNA-seq data (bigger is better)",Version="3">
    Does anyone have other suggestions for getting per-sample, per-allele counts? Or a suggestion for getting around this formatting issue?

    Thanks!

  • #2
    I seem to remember that with bcftools, you can have this information in the bcf file...

    • #3
      Oops, I meant "bcftools call", not vcftools.

      • #4
        If you request the DPR tag at the samtools mpileup step (before bcftools call), the output includes the per-allele depth (I also request DP for total read depth):
        samtools mpileup -gu -t DP,DPR

        The per-sample output is 0/1:153,0,125:61:20,41,0,0
        0/1 genotype call (GT)
        153,0,125 Phred-scaled genotype likelihoods (PL)
        61 total read depth (DP)
        20,41,0,0 depth of each allele, reference first (DPR)
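
        To pull the per-allele counts out of a sample field like the one above, something along these lines might work. This is only a rough sketch: it assumes the FORMAT column reads GT:PL:DP:DPR (inferred from the order of the values above, not confirmed), the function name parse_sample_field is made up for illustration, and missing values (".") are not handled. In later samtools/bcftools releases the DPR tag appears to have been superseded by AD, so the key may need changing.

        ```python
        def parse_sample_field(format_keys: str, sample: str) -> dict:
            """Split one VCF sample column into typed values.

            Assumes the FORMAT keys are GT, PL, DP and DPR (an assumption
            based on the example output in this post).
            """
            # Pair each FORMAT key with its colon-separated value.
            record = dict(zip(format_keys.split(":"), sample.split(":")))
            return {
                "GT": record["GT"],                                 # genotype call, e.g. 0/1
                "PL": [int(x) for x in record["PL"].split(",")],    # genotype likelihoods
                "DP": int(record["DP"]),                            # total read depth
                "DPR": [int(x) for x in record["DPR"].split(",")],  # per-allele depth, reference first
            }


        fields = parse_sample_field("GT:PL:DP:DPR", "0/1:153,0,125:61:20,41,0,0")
        ref_depth = fields["DPR"][0]    # reads supporting the reference allele
        alt_depths = fields["DPR"][1:]  # reads supporting each alternate allele
        print(ref_depth, alt_depths)    # 20 [41, 0, 0]
        ```

        For this record the per-allele depths (20 + 41) add up to the total depth of 61, which is a quick sanity check that the fields were read in the right order.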

        • #5
          Allelic Depth

          Excellent! I'll give it a shot!
