  • lg36
    Member
    • Mar 2012
    • 12

    Does a reliable consensus mean more reliable SNPs?

    Dear All,

    I'm relatively new to WGS analysis so please excuse any naivety on my part.

    Before getting the WGS sequences, I confirmed the presence or absence of certain oligonucleotides in various bacterial DNA samples, so I know I should see these sequences in the final consensus sequence.

    Is it true that if I can produce a more reliable consensus sequence, then the SNP calls are also likely to be more reliable? I appreciate that many SNP quality filters will also be applied, which can lead to differences between a consensus and a SNP call, but I just wanted to get an idea of the overall correlation between the consensus and the SNPs.

    If there is a high correlation between the two, then surely if I make my consensus sequences as reliable as possible, the SNPs I call from the same mapped reads will be more reliable too?

    Apologies if I'm totally wrong about this.

    Best wishes lg36
  • mbayer
    Member
    • Mar 2009
    • 31

    #2
    Hi lg36,

    you're not wrong about this at all -- this is in fact a pretty important factor in SNP discovery.

    Your SNPs can only ever be as good as your reference and your mapping. If your reference contains errors, these will propagate right through into your SNP calls, and similarly, if you mismap lots of reads, you will also increase your false positive SNP rate.

    I routinely map the reads from the individual used to make the reference back to the reference before I do any mapping of other individuals onto that reference for SNP discovery. I then call SNPs on that mapping first, and I always get SNPs here.

    In a homozygous or haploid organism this will give you a list of positions where the reference most likely contains errors -- in an ideal case there should be zero SNPs when I map the reads back onto the reference that was made from the same reads. I don't know what you work with, but I am fortunate in that I do a lot of work with cultivated barley, which is essentially homozygous, and that obviously simplifies matters.

    I then subtract the list of SNPs called there from any list of SNPs generated with reads from a different individual -- it's essentially a way of removing background noise. I guess if you have a heterozygous organism whose genome is well curated, you could probably use a public, curated list of SNPs instead.
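
To make the subtraction step concrete, here is a minimal Python sketch. It is only an illustration, not the actual pipeline used for the barley work: the `Snp` record, the `subtract_background` helper, the chromosome names and the positions are all made up for the example, and it assumes that a candidate SNP should be dropped whenever its position appears in the self-mapping list, regardless of which allele was called there.

```python
from typing import Iterable, List, NamedTuple


class Snp(NamedTuple):
    """Hypothetical minimal SNP record (not a real VCF parser)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def subtract_background(candidates: Iterable[Snp],
                        background: Iterable[Snp]) -> List[Snp]:
    """Drop every candidate SNP whose site was also called in the
    self-mapping (reference reads mapped back onto the reference)."""
    # Assumption: match on (chromosome, position) only, because a call in
    # the self-mapping marks the site itself as suspect.
    background_sites = {(s.chrom, s.pos) for s in background}
    return [s for s in candidates if (s.chrom, s.pos) not in background_sites]


# SNPs called when the reference individual's reads are mapped to the reference
self_mapping_snps = [
    Snp("chr1H", 1042, "A", "G"),
    Snp("chr2H", 77, "C", "T"),
]

# SNPs called from a different individual mapped to the same reference
other_individual_snps = [
    Snp("chr1H", 1042, "A", "G"),   # also seen in the self-mapping, removed
    Snp("chr1H", 5000, "G", "T"),
    Snp("chr2H", 78, "C", "A"),     # neighbouring site, kept
]

filtered_snps = subtract_background(other_individual_snps, self_mapping_snps)
for snp in filtered_snps:
    print(snp.chrom, snp.pos, snp.ref, snp.alt)
```

Matching on position alone is the stricter choice. Matching on position plus allele would keep more calls, at the cost of trusting sites that are already known to be noisy.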

    This gives you much cleaner SNP sets and reduces the false positive rate, but the caveat is that you may be increasing your false negative rate (I don't have any data on this yet). It all depends on what your SNPs are for -- if reliability is key, then this works well. You may also want to remove duplicates from the mapping -- that also reduces your false positive rate.
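
For readers unfamiliar with duplicate removal, the idea can be sketched in a few lines of Python. This is a deliberately simplified toy, not what dedicated tools such as samtools markdup or Picard MarkDuplicates do internally: the `Read` record and `remove_duplicates` helper are hypothetical, reads are treated as single-end, and two reads are assumed to be duplicates if they share chromosome, start position and strand, with the highest mapping quality kept.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Read(NamedTuple):
    """Hypothetical, simplified single-end alignment record."""
    name: str
    chrom: str
    start: int
    strand: str
    mapq: int


def remove_duplicates(reads: Iterable[Read]) -> List[Read]:
    """Keep one read per (chrom, start, strand), preferring the higher MAPQ."""
    best: Dict[Tuple[str, int, str], Read] = {}
    for read in reads:
        key = (read.chrom, read.start, read.strand)
        # Replace the stored read only if this one maps more confidently.
        if key not in best or read.mapq > best[key].mapq:
            best[key] = read
    return list(best.values())


mapped_reads = [
    Read("r1", "chr1H", 100, "+", 30),
    Read("r2", "chr1H", 100, "+", 60),  # same site and strand as r1
    Read("r3", "chr1H", 100, "-", 40),  # opposite strand, not a duplicate
    Read("r4", "chr1H", 250, "+", 20),
]

deduplicated = remove_duplicates(mapped_reads)
print([read.name for read in deduplicated])
```

Without this step, PCR copies of a single fragment that carries an error can look like several independent reads supporting a SNP.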

    cheers

    Micha
