
  • Missing genotype data in .vcf file

    Hi all,
    I'm trying to gather genotype data from .vcf files generated using GATK (as part of Illumina's MiSeq Reporter software), but I'm getting odd results.
    I'm looking at one variant site, and for individual sample vcf files, I get data when there is a variant (i.e. not homozygous reference) so that's fine.
    I'd like to be able to output the data for the hom ref samples - i.e. depth at the site so I can quickly confirm whether the genotype is real (is it really homozygous reference, or a no call?).
    I'm looking at the genome VCF (gVCF) files to get this data. For some sites I get a 0/0 call, but for others there is just a ".", even though the coverage in IGV is almost the same at both sites. So I'm wondering how I can get the info that I want, and why I would get a 0/0 genotype with all the associated info (GQ, MQ, DP, DPF, etc.) for some positions and not for others.
    I've had a look through previous threads, but can't find any info on this.
    Apologies also for my ignorance; I am a complete bioinformatics novice.

  • #2
    Hello favwiz,
    Can you please post an example where the genotype is "0/0" or "."?

    fin swimmer



    • #3
      Hi fin swimmer,
      Thanks for taking a look at this. I've uploaded the combined gvcf that I'm looking at but have had to remove a few lines to reduce the file size. Let me know if you need more info.
      Thanks.


      • #4
        Hi,
        I haven't worked with gVCF files until now. If I understand correctly, you're trying to merge the gVCF files of your samples with vcf-merge. As far as I know, vcf-merge cannot handle gVCF files because it doesn't take the block size into account.

        Since your gVCF files were produced by GATK, CombineGVCFs should be the right tool for merging.

        fin swimmer
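
To illustrate why the block size matters, here is a minimal Python sketch of looking up a single position in a per-sample gVCF. It assumes that reference blocks carry an `END=<position>` entry in the INFO column, so a hom-ref site may sit inside a record whose POS is well before the site of interest. The function names (`parse_sample_fields`, `block_end`, `lookup_site`) and the example records are hypothetical and made up for illustration; they are not taken from the attached file or from any tool mentioned above.

```python
def parse_sample_fields(format_col, sample_col):
    """Pair FORMAT keys (e.g. GT:DP) with one sample's values."""
    return dict(zip(format_col.split(":"), sample_col.split(":")))


def block_end(pos, info_col):
    """Return the last position covered by a record.

    Assumption: reference blocks carry END=<pos> in INFO. Records
    without END are treated as covering only their own POS (indel
    spans are ignored in this sketch).
    """
    for entry in info_col.split(";"):
        if entry.startswith("END="):
            return int(entry[4:])
    return pos


def lookup_site(gvcf_lines, chrom, pos, sample_index=0):
    """Return the sample fields of the first record covering chrom:pos.

    Returns None when no record or block covers the position.
    """
    for line in gvcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and empty lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10 or cols[0] != chrom:
            continue  # not a sample-bearing record on this contig
        start = int(cols[1])
        if start <= pos <= block_end(start, cols[7]):
            return parse_sample_fields(cols[8], cols[9 + sample_index])
    return None


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
    "chr1\t100\t.\tA\t.\t.\tPASS\tEND=150\tGT:GQX:DP:DPF\t0/0:90:35:1",
    "chr1\t151\t.\tG\tT\t200\tPASS\tDP=40\tGT:GQ:DP\t0/1:99:40",
]

hom_ref = lookup_site(example, "chr1", 120)   # falls inside the 100-150 block
variant = lookup_site(example, "chr1", 151)   # ordinary variant record
uncovered = lookup_site(example, "chr1", 500)  # no record covers this site
```

A tool that only compares POS values would report nothing for position 120 here, which is one way a merged file can end up with "." at a site that actually has a 0/0 block call. Values reported for a block summarise the whole block, so check the gVCF header to see how fields such as DP are defined for blocks.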
