  • Does a reliable consensus mean more reliable SNPs?

    Dear All,

    I'm relatively new to WGS analysis so please excuse any naivety on my part.

    Before getting the WGS sequences, I confirmed the presence or absence of certain oligonucleotides in various bacterial DNA samples, so I know which of these sequences I should see in the final consensus sequence.

    Is it true that if I can produce a more reliable consensus sequence, then the SNP calls are also likely to be more reliable? I appreciate that many SNP quality filters will also be applied, and that these can lead to differences between a consensus and a SNP call, but I just wanted to get an idea of the overall correlation between the consensus and the SNPs.

    If there is a high correlation between the two, then surely if I make my consensus sequences as reliable as possible, the SNPs I call from the same mapped reads will be more reliable too?

    Apologies if I'm totally wrong about this.

    Best wishes lg36

  • #2
    Hi lg36,

    you're not wrong about this at all -- this is in fact a pretty important factor in SNP discovery.

    Your SNPs can only ever be as good as your reference and your mapping. If your reference contains errors, these will propagate right through into your SNP calls, and similarly, if you mismap lots of reads, you will also increase your false positive SNP rate.

    I routinely map the reads from the individual used to make the reference back to the reference before I do any mapping of other individuals onto that reference for SNP discovery. I then call SNPs on that mapping first, and I always get SNPs here.

    In a homozygous or haploid organism this will give you a list of positions where the reference most likely contains errors -- in an ideal case there should be zero SNPs when I map the reads back onto the reference that was made from the same reads. I don't know what you work with, but I am fortunate in that I do a lot of work with cultivated barley, which is essentially homozygous, and that obviously simplifies matters.

    I then subtract the list of SNPs called there from any list of SNPs generated with reads from a different individual -- it's essentially a way of removing background noise. I guess if you have a heterozygous organism and it's well curated you could probably use a public, curated list of SNPs instead.
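
    A minimal sketch of that subtraction step, assuming both SNP lists are available as VCF-style tab-separated lines with the contig in column 1 and the position in column 2. The function names (`read_sites`, `subtract_background`) and the example records are hypothetical, and matching on position alone (ignoring the alternate allele) is a simplifying assumption:

```python
from typing import Iterable, List, Set, Tuple

# A SNP site is identified here by (contig name, 1-based position) only.
# Assumption: alleles are ignored, so any call at a background position is dropped.
Site = Tuple[str, int]


def read_sites(vcf_lines: Iterable[str]) -> List[Site]:
    """Extract (CHROM, POS) pairs from VCF-style lines, skipping headers and blanks."""
    sites: List[Site] = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.rstrip("\n").split("\t")[:2]
        sites.append((chrom, int(pos)))
    return sites


def subtract_background(sample: Iterable[Site], background: Iterable[Site]) -> List[Site]:
    """Return the sample sites that are absent from the background list, in order."""
    noise: Set[Site] = set(background)
    return [site for site in sample if site not in noise]


# Hypothetical calls from mapping the reference individual's own reads back
# onto the reference: these are treated as likely reference errors.
self_calls = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT",
    "contig1\t1200\t.\tA\tG",
    "contig2\t88\t.\tC\tT",
]

# Hypothetical calls from a different individual mapped onto the same reference.
sample_calls = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "contig1\t1200\t.\tA\tG",
    "contig1\t3400\t.\tT\tC",
    "contig2\t88\t.\tC\tT",
    "contig3\t15\t.\tG\tA",
]

clean_sites = subtract_background(read_sites(sample_calls), read_sites(self_calls))
print(clean_sites)
```

    With the made-up records above, only the two positions not seen in the self-mapping survive. A stricter variant could also compare the alternate allele before discarding a site.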

    This gives you much cleaner SNP sets and reduces the false positive rate but the caveat is that potentially you may be increasing your false negative rate (I don't have any data on this yet). It all depends on what your SNPs are for - if reliability is key, then this works well. You may also want to remove duplicates from the mapping -- that also reduces your FP rate.
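
    To illustrate the idea behind duplicate removal, here is a simplified sketch, assuming single-end reads described by a hypothetical `Read` record. Reads that share contig, start position and strand are treated as duplicates and only the one with the highest mapping quality is kept. In practice, established tools such as `samtools markdup` or Picard MarkDuplicates work on the BAM file directly and use more careful criteria (paired-end information, base qualities), so this is only a toy model of the principle:

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Read(NamedTuple):
    """Hypothetical minimal description of a mapped single-end read."""
    name: str
    chrom: str
    start: int   # leftmost mapped position
    strand: str  # "+" or "-"
    mapq: int    # mapping quality


def drop_duplicates(reads: Iterable[Read]) -> List[Read]:
    """Keep one read per (chrom, start, strand), preferring the highest MAPQ.

    On ties, the read encountered first is kept.
    """
    best: Dict[Tuple[str, int, str], Read] = {}
    for read in reads:
        key = (read.chrom, read.start, read.strand)
        kept = best.get(key)
        if kept is None or read.mapq > kept.mapq:
            best[key] = read
    return sorted(best.values(), key=lambda r: (r.chrom, r.start, r.strand))


# Made-up reads: r1, r2 and r3 pile up at the same position and strand.
reads = [
    Read("r1", "contig1", 100, "+", 30),
    Read("r2", "contig1", 100, "+", 60),
    Read("r3", "contig1", 100, "+", 60),
    Read("r4", "contig1", 100, "-", 20),
    Read("r5", "contig2", 5, "+", 40),
]

unique_reads = drop_duplicates(reads)
print([r.name for r in unique_reads])
```

    Collapsing such pile-ups means a single PCR-amplified fragment carrying an error counts once rather than several times when SNPs are called.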

    cheers

    Micha
