  • freebayes calls SNP instead of INDEL

    Hello,
    I would like to draw your attention to a problem which I have already reported at the freebayes git repository.

    In my reference sequence there is a pattern of 11 times TG followed by 7 times T. I have a sample which I already sequenced by Sanger, resulting in one allele with 11xTG/5xT and one with 12xTG/7xT.

    Looking at the alignment file from my NGS analysis, I can infer the same result. But freebayes reports 11/12xTG (which is correct) and a SNP T>G, which is not the same as a 2 bp deletion. In contrast, GATK's HaplotypeCaller gives the correct result.

    I don't understand what freebayes is doing here. Are there any parameters with which I can influence the result?

    On the freebayes git site you can see the relevant position in IGV. What further information can I provide?

    Thanks a lot.

    fin swimmer

  • #2
    It's possible that this is a result of the aligner, rather than the variant-caller, particularly in light of the fact that GATK can realign reads near possible indels. What aligner are you using? And, have you looked at the locations in IGV to see if the alignments clearly indicate an indel?


    • #3
      There's a known incompatibility between FreeBayes and the CIGAR format of SAM/BAM v1.4, which results in no indel calls. If that's true for your FreeBayes VCF, reformat your alignment to v1.3 and rerun FreeBayes.
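
A minimal sketch of how one might check which SAM format version an alignment declares. The version is stored in the `VN` tag of the `@HD` header line, and the header text can be obtained with `samtools view -H`. The example header and CIGAR strings below are made up for illustration. The second helper looks for the `=` and `X` CIGAR operators, which as far as I know were introduced with v1.4; whether these operators are what the incompatibility above refers to is an assumption, the post does not say.

```python
import re

def sam_format_version(header_text):
    """Return the VN value of the @HD line, or None if there is no @HD line."""
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "@HD":
            for field in fields[1:]:
                if field.startswith("VN:"):
                    return field[3:]
    return None

def uses_extended_cigar(cigar):
    """True if the CIGAR string contains the '=' or 'X' operators."""
    return bool(re.search(r"\d+[=X]", cigar))

# Hypothetical header text, as it might be printed by `samtools view -H`.
example_header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n"

print(sam_format_version(example_header))   # 1.4
print(uses_extended_cigar("50M2D48M"))      # False
print(uses_extended_cigar("50=1X49="))      # True
```

If the header has no `@HD` line at all, the function returns `None`, and the version cannot be read from the file this way.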


      • #4
        Hello there,

        Originally posted by Brian Bushnell
        It's possible that this is a result of the aligner, rather than the variant-caller, particularly in light of the fact that GATK can realign reads near possible indels.
        As I understand it, GATK's HaplotypeCaller and freebayes discard the alignment information in indel regions and perform a de novo assembly, so a realignment shouldn't be necessary.

        Just as a test, I ran an indel realignment with GATK and then called variants with freebayes. It ended up calling the insertion of TG, but neither the real deletion of TT nor the previously found T>G is reported anymore.

        Originally posted by Brian Bushnell
        What aligner are you using?
        I'm using bwa mem.

        Originally posted by Brian Bushnell
        And, have you looked at the locations in IGV to see if the alignments clearly indicate an indel?
        Yes, I did, and the deletion is clearly visible (103 reads with T and 42 reads with the deletion). But several reads do not span the whole repeat region, so 27 reads are aligned with a mismatch instead of a deletion. Even though these are fewer reads, maybe freebayes favors them over the deletion? Can I influence this in any way?

        Originally posted by HESmith
        There's a known incompatibility between FreeBayes and the CIGAR format of SAM/BAM v1.4, which results in no indel calls. If that's true for your FreeBayes VCF, reformat your alignment to v1.3 and rerun FreeBayes.
        How can I find out which version is used by bwa?

        fin swimmer
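
To make the repeat structure concrete, here is a small sketch of how a read that ends inside the repeat region can be placed with a single T>G mismatch instead of an indel. Only the repeat counts (11xTG/7xT in the reference, 11xTG/5xT and 12xTG/7xT in the sample) come from the thread. The flanking sequences are invented, and the ungapped placement is a simplification, not what bwa mem or freebayes actually compute.

```python
# Hypothetical flanks; only the repeat structure is taken from the thread.
LEFT_FLANK = "ACGATCCA"
RIGHT_FLANK = "GACCTA"

def build_allele(tg_units, t_run):
    """Assemble flank + (TG)n + (T)m + flank."""
    return LEFT_FLANK + "TG" * tg_units + "T" * t_run + RIGHT_FLANK

REFERENCE = build_allele(11, 7)
DELETION_ALLELE = build_allele(11, 5)    # 2 bp deletion of TT
INSERTION_ALLELE = build_allele(12, 7)   # insertion of one TG unit

def ungapped_mismatches(reference, read, offset=0):
    """List (position, ref_base, read_base) for an ungapped placement."""
    window = reference[offset:offset + len(read)]
    return [(offset + i, r, q)
            for i, (r, q) in enumerate(zip(window, read)) if r != q]

# Read from the deletion allele that ends one base into the right flank.
del_read = DELETION_ALLELE[:len(LEFT_FLANK) + 22 + 5 + 1]
# Read from the insertion allele that ends right after the 12th TG unit.
ins_read = INSERTION_ALLELE[:len(LEFT_FLANK) + 24]

print(ungapped_mismatches(REFERENCE, del_read))  # [(35, 'T', 'G')]
print(ungapped_mismatches(REFERENCE, ins_read))  # [(31, 'T', 'G')]
```

In both cases the truncated read fits the reference with one T>G mismatch inside the T run, which an aligner may score better than opening a gap so close to the read end. Whether this is what happens with the 27 reads described above would have to be checked on the real data.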
