SEQanswers

Old 12-16-2012, 09:32 AM   #1
Jane M
Senior Member
 
Location: Paris

Join Date: Aug 2011
Posts: 239
Which method to confirm SNPs and indels?

Dear all,

I have analyzed my DNA-seq data in the usual way (alignment, de-duplication, realignment, recalibration and variant calling) to get SNPs and indels. Now, we want to confirm these variants experimentally. I am wondering what the best way to do it is.
I haven't found a topic about it on the forum, which is why I am opening this discussion.

There is Sanger sequencing, but it could give false negatives depending on the material used (if I understood well).
It is also possible to use Ion Torrent.
I am not a biologist and I am not familiar with these 2 methods. In my institute, we are just starting to work with DNA-seq data.

Could you please let me know what you do, and which difficulties you met with these methods? Which traps should we avoid?

Thanks in advance,
Jane
Old 12-16-2012, 04:44 PM   #2
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912
Default

Sanger sequencing is going to be the gold standard for accuracy in DNA sequence. It's kind of overkill when you only care about a single base, but you'll have the added context of a few hundred bases around your SNP.
Old 12-18-2012, 08:23 AM   #3
Jane M
Senior Member
 
Location: Paris

Join Date: Aug 2011
Posts: 239
Default

Thank you swbarnes2 for your answer. I heard that with Sanger sequencing, we can get false negatives because of the Taq (DNA polymerase). I would like to ask people who are used to doing Sanger sequencing how frequent these cases are.

Finally, in my study, the biologists intend to use Ion Torrent to save time. What do you think about this method? Do you have any feedback?
Old 12-18-2012, 11:36 AM   #4
brofallon
Member
 
Location: United States

Join Date: May 2011
Posts: 26
Sanger

We Sanger suspected variants pretty frequently here and we don't get too many false negatives. Ion Torrent is another next-gen method and is both expensive and probably not much more reliable than other NGS methods.
Old 12-18-2012, 01:13 PM   #5
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912
Default

Do you understand how Sanger sequencing works? As long as your primers are really sitting down where you believe they should be, and nowhere else, you will get the right sequence. It might not be pretty, but there's no bias towards the letter in your reference. How would the enzyme know what is the reference letter, and what is the variant you expect to see?
Old 12-20-2012, 02:19 AM   #6
Jane M
Senior Member
 
Location: Paris

Join Date: Aug 2011
Posts: 239
Default

Thank you both for your answers!

brofallon, when you get or suspect false positives, what do you do? Do you resequence with another enzyme or by changing something else? Or do you not study this SNP further?

swbarnes2, I know a bit how Sanger sequencing works. I understand that there is no bias towards the letter in the reference. I just need an idea about how frequent the false positives are with Sanger sequencing.
My question is: are we always able to detect when Sanger sequencing doesn't work? In that case, if we do not validate a SNP, we cannot conclude that there is no SNP; we can only say that Sanger sequencing neither validated nor invalidated this variant?

If I understand brofallon's point of view correctly, he doesn't recommend using Ion Torrent to validate variants. I would like to know if this technology has already been used in this case (instead of Sanger sequencing). Have you read any papers about it?
Old 12-20-2012, 05:42 AM   #7
gsgs
Senior Member
 
Location: germany

Join Date: Oct 2009
Posts: 140
Default

As I understand it, you get multiple partial sequences of length ~500 that overlap, and each SNP position is covered, say, 10 times. Now you can have one nucleotide there, say, 8 times and another one 2 times.
Assume one read gives us 90% probability; then 8:2 should give 99.99996%, if I calculated correctly.
This is provided that the reads are independent and there is no systematic error, some constellation that makes the error more likely ...
...............
I don't know. There are probably lots of statistics available already, and papers written about it.
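
For anyone who wants to redo that arithmetic, here is a minimal sketch in Python. It is only one possible reading of the estimate and rests on assumptions that are not in the post: the site is homozygous, only the two observed bases are candidates, both are equally likely beforehand, reads are independent, and a miscalled read always shows the other candidate base. The function name `posterior_major_base` is made up for illustration.

```python
def posterior_major_base(n_major, n_minor, read_accuracy):
    """Posterior probability that the majority base is the true base.

    Assumed model (not from the thread): two candidate bases with equal
    priors, independent reads, and every miscall shows the other candidate.
    """
    p = read_accuracy          # chance a read reports the true base
    q = 1.0 - read_accuracy    # chance a read reports the other base
    like_major = p ** n_major * q ** n_minor   # data if majority base is true
    like_minor = q ** n_major * p ** n_minor   # data if minority base is true
    return like_major / (like_major + like_minor)


post = posterior_major_base(n_major=8, n_minor=2, read_accuracy=0.9)
print(f"{post:.5%}")
```

Under this particular model the 8:2 case comes out at roughly 99.9998%, so the exact figure depends on the error model chosen. A heterozygous site, where an 8:2 split could be real, is not covered by this sketch.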
Old 12-20-2012, 09:25 AM   #8
Jane M
Senior Member
 
Location: Paris

Join Date: Aug 2011
Posts: 239
Default

I guess this answer was not intended for this topic ...?
Old 12-20-2012, 09:35 AM   #9
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912
Default

Quote:
Originally Posted by Jane M View Post
Thank you both for your answers!

brofallon, when you get or suspect false positives, what do you do? Do you resequence with another enzyme or by changing something else? Or do you not study this SNP further?

swbarnes2, I know a bit how Sanger sequencing works. I understand that there is no bias towards the letter in the reference. I just need an idea about how frequent the false positives are with Sanger sequencing.

Okay, but you did ask repeatedly about the false negative rate. But no, if your primers are sitting down where they are supposed to be, the only way for you to get the wrong sequence would be a freak polymerase error. The odds of that happening exactly on the base that you care about are tiny. It's not going to happen.

Now, you might get poor quality data, but traces come with quality scores. If your sequence BLASTs to where it ought to, and the quality around your base is good, then that's what your letter is. If you are trying to make mixed calls, that's trickier, but people have been dealing with the quality of Sanger trace files for more than 10 years. It's ground that is well trod.
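
To make "the quality around your base is good" concrete, here is a rough sketch in Python. It assumes the per-base qualities have already been extracted from the trace as Phred scores, where Q = -10 * log10(error probability). The function names, the example scores, the flank of 3 bases and the Q20 cutoff are all illustrative assumptions, not a standard.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to the estimated error probability."""
    return 10 ** (-q / 10)


def base_is_trustworthy(quals, pos, flank=5, min_q=20):
    """True if the base at `pos` and its neighbours all reach `min_q`.

    `flank` and `min_q` are illustrative defaults, not recommended values.
    """
    start = max(0, pos - flank)              # clip the window at the read start
    window = quals[start:pos + flank + 1]    # slicing clips at the read end
    return all(q >= min_q for q in window)


# Hypothetical Phred scores for a short stretch of a Sanger read.
quals = [12, 18, 35, 40, 42, 45, 44, 41, 39, 38, 40, 15]

ok_middle = base_is_trustworthy(quals, pos=6, flank=3)  # clean region
ok_edge = base_is_trustworthy(quals, pos=1, flank=3)    # noisy read start
print(ok_middle, ok_edge, phred_to_error_prob(20))
```

A check like this only says whether the call at that position can be trusted. It does not address mixed (heterozygous) calls, which need the trace peaks themselves.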
Old 12-20-2012, 10:50 AM   #10
brofallon
Member
 
Location: United States

Join Date: May 2011
Posts: 26
Default

For our part, if we find a questionable variant in NGS data, and then Sanger sequence the area and find that the variant is not there, we typically assume the variant is a false positive and go no further. As swbarnes2 mentioned, if we have reason to believe the Sanger wasn't of good quality, we might redesign the primers and try it again, but as long as things don't seem too suspicious we trust the Sanger results.
Old 12-20-2012, 03:31 PM   #11
gsgs
Senior Member
 
Location: germany

Join Date: Oct 2009
Posts: 140
Default

What's your statistical error rate?
You or others must have measured that ...

Where can one find a list of labs with their error rates?
All times are GMT -8.