  • No mutations in BAM (IGV) but a mutation in final VCF?

    I use GATK 4.0 for my variant-calling pipeline. The steps are MarkDuplicates, base quality score recalibration (BaseRecalibrator, then ApplyBQSR) and HaplotypeCaller. At one locus, IGV shows no mutation in the original BAM file, but the final VCF contains one, and the bamout from HaplotypeCaller also appears to show it. I then ran Sanger sequencing, which showed that there is actually no mutation. So the original BAM file is correct and the bamout shows a false mutation.

    How can I overcome this problem? It is a serious issue and has occurred several times. Thanks in advance.

  • #2
    Any repeats at the locus?

    Since GATK HaplotypeCaller reassembles the reads, at repetitive loci it may pull in reads from different areas.
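A quick way to answer the repeat question is to intersect the suspicious VCF positions with a repeat annotation in BED format. The following is a minimal sketch only, assuming you already have such a BED file (for example a repeat track exported from a genome browser); the function names `load_bed` and `in_repeat` are hypothetical. BED intervals are 0-based and half-open, while VCF positions are 1-based, so the position is shifted before the comparison.

```python
from collections import defaultdict


def load_bed(lines):
    """Parse BED lines into {chrom: [(start, end), ...]} (0-based, half-open)."""
    intervals = defaultdict(list)
    for line in lines:
        # Skip blank lines and the optional BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals[chrom].append((int(start), int(end)))
    return intervals


def in_repeat(intervals, chrom, pos):
    """Return True if the 1-based position (as written in a VCF) is inside any interval."""
    zero_based = pos - 1
    # Linear scan: simple and adequate for checking a handful of loci.
    return any(start <= zero_based < end for start, end in intervals.get(chrom, []))


if __name__ == "__main__":
    # Made-up intervals, for illustration only.
    repeats = load_bed(["chr1\t100\t200\trepA", "chr1\t500\t600\trepB"])
    for site in [("chr1", 150), ("chr1", 300)]:
        print(site, "in repeat" if in_repeat(repeats, *site) else "not in repeat")
```

For whole-genome call sets a dedicated interval tool would be faster; the linear scan here is only meant for spot-checking a few calls.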

    There may be multiple causes for the problem: repeats, structural variants (SVs), allelic variation and mapping/assembly artifacts.

    If you do QC with Sanger sequencing, make sure your primers are NOT allele-specific; otherwise they would amplify only the reference allele...

    If your dataset is PCR-free and has low GC bias, try looking for any signs of CNV (copy number variation).
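One crude sign of a CNV is read depth at the locus that differs clearly from the depth in comparable background regions. Below is a rough sketch of that comparison, assuming per-base depths have already been extracted (for example with a depth tool of your choice). The function names and the 0.7/1.3 cut-offs are illustrative assumptions for a diploid sample, not established thresholds.

```python
from statistics import median


def depth_ratio(region_depths, background_depths):
    """Median depth in the region divided by median depth in the background."""
    background = median(background_depths)
    if background == 0:
        raise ValueError("background median depth is zero")
    return median(region_depths) / background


def cnv_hint(ratio, low=0.7, high=1.3):
    """Very rough reading of a depth ratio; the cut-offs are illustrative assumptions."""
    if ratio > high:
        return "possible gain"
    if ratio < low:
        return "possible loss"
    return "no obvious change"


if __name__ == "__main__":
    # Made-up depths, for illustration only.
    ratio = depth_ratio([45, 46, 44], [30, 31, 29, 30])
    print(ratio, cnv_hint(ratio))
```

In a diploid sample one extra copy would be expected to give a ratio near 1.5 and a single-copy loss a ratio near 0.5, but GC content and mappability can shift depth as well, so this is a hint rather than a call.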

    PS: Ideally (if funding is not limited), I would also try to sequence the affected samples using a third-generation sequencing technology (ONT/PacBio) and assemble them de novo, or at least get some 10X allele-phasing data for the samples...

    Given the existing dataset and no new funding/experiments:

    1. Determine the areas affected by repeats and mask them for the time being.
    2. Try to trace the origin of the reads in the affected area of the BAM file. If they are unmapped in the original BAM file, and the coverage there is half of what it is in the final BAM file, your other allele is probably too divergent from the reference to be mapped successfully by your mapper.
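Step 2 above can be sketched roughly as follows. This is a minimal example that works on SAM text (for instance the output of `samtools view`) rather than on BAM files directly. It assumes that read names and the first/second-in-pair flag bits are the same in the original BAM and in the bamout; the function names are hypothetical. For every read covering the site in the bamout, it reports where that read sat in the original alignment.

```python
import re
from collections import namedtuple

Read = namedtuple("Read", "name flag chrom pos mapq cigar")

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = {"M", "D", "N", "=", "X"}  # CIGAR operations that advance along the reference


def parse_sam(lines):
    """Yield primary alignment records from SAM text, skipping header lines."""
    for line in lines:
        if line.startswith("@") or not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        read = Read(f[0], int(f[1]), f[2], int(f[3]), int(f[4]), f[5])
        if read.flag & 0x900:  # secondary (0x100) or supplementary (0x800)
            continue
        yield read


def read_key(read):
    """Identify a read by name plus its first/second-in-pair bits (0x40, 0x80)."""
    return (read.name, read.flag & 0xC0)


def covers(read, chrom, pos):
    """True if a mapped read's alignment spans the 1-based position."""
    if read.flag & 0x4 or read.cigar == "*" or read.chrom != chrom:
        return False
    span = sum(int(n) for n, op in CIGAR_OP.findall(read.cigar) if op in REF_CONSUMING)
    return read.pos <= pos <= read.pos + span - 1


def trace_origin(bamout_lines, original_lines, chrom, pos):
    """Map each bamout read covering the site to its placement in the original alignment."""
    original = {read_key(r): r for r in parse_sam(original_lines)}
    report = {}
    for r in parse_sam(bamout_lines):
        if not covers(r, chrom, pos):
            continue
        src = original.get(read_key(r))
        if src is None:
            report[read_key(r)] = "absent from original"
        elif src.flag & 0x4:
            report[read_key(r)] = "unmapped in original"
        else:
            report[read_key(r)] = (src.chrom, src.pos, src.mapq)
    return report
```

Reads reported as absent from the original deserve a closer look: a bamout can also contain the artificial haplotype records that HaplotypeCaller writes for visualization, and those never existed in the input. A low original MAPQ or a different original position is the kind of signal the step above is looking for.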

    You may also try making a custom reference: include the divergent version of the region as a separate sequence and retry the mapping.
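Building such a custom reference amounts to appending one extra FASTA record. Here is a minimal sketch, assuming the divergent sequence is already known (for example from an assembly); the contig name used in the example and the function names are hypothetical. It operates on FASTA text so that it stays self-contained.

```python
def fasta_record(name, sequence, width=60):
    """Format one FASTA record with fixed-width sequence lines."""
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(chunks) + "\n"


def append_alt_contig(reference_fasta_text, alt_name, alt_sequence, width=60):
    """Return the reference FASTA text with the divergent region added as its own record."""
    existing = {
        line[1:].split()[0]
        for line in reference_fasta_text.splitlines()
        if line.startswith(">") and line[1:].split()
    }
    if alt_name in existing:
        raise ValueError(f"sequence name already present: {alt_name}")
    if reference_fasta_text and not reference_fasta_text.endswith("\n"):
        reference_fasta_text += "\n"
    return reference_fasta_text + fasta_record(alt_name, alt_sequence, width)


if __name__ == "__main__":
    # Tiny made-up reference and alternate sequence, for illustration only.
    print(append_alt_contig(">chr1\nACGT\n", "chr1_divergent_region", "acgtacgtac", width=4))
```

The extended reference has to be re-indexed for the mapper and for GATK before remapping. Reads that fit both versions of the region about equally well will probably get low mapping quality, so it is worth checking MAPQ at the locus afterwards.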
