SEQanswers

Old 03-25-2014, 04:07 AM   #1
gmarco
Member
 
Location: Spain

Join Date: Oct 2012
Posts: 36
Question Variant Calling outside Torrent Suite and TVC

Hello,

I have spent weeks trying to build Torrent Suite 4 on Red Hat Linux without success (the programs compiled, but they do not work on my Linux architecture), and I have explored all the tools they "offer" inside their analysis suite.

I'm scared.

They use their own variant caller (tvc), a tool that supports hotspots, FreeBayes and normalized measurements. In my opinion it is an in-house variant caller built to deal with the homopolymer problem that we all know.

They also use what I think is their own modified version of GATK, which includes the IndelAssembly toolkit (intended to find large indels). I can't find any information about IndelAssembly outside Life's own realm.

I would like to analyze some Ion Torrent data, but I don't like this approach. I would like to know which tools you are using for variant calling outside the standard route from Life and their one-click Torrent Suite analyzer: tools supported by the whole scientific community. For me, a gold-standard version of GATK would be fine.

My purpose at this moment is to analyze AmpliSeq exomes.

Thanks !

Last edited by gmarco; 03-25-2014 at 04:17 AM.
Old 03-25-2014, 06:00 AM   #2
TiborNagy
Senior Member
 
Location: Budapest

Join Date: Mar 2010
Posts: 329
Default

If you analyse human data, GATK is a very good tool.
Old 03-25-2014, 06:01 AM   #3
gmarco
Member
 
Location: Spain

Join Date: Oct 2012
Posts: 36
Default

I'm using GATK as I do with Illumina data, but variant calling with UnifiedGenotyper is very slow.
Old 03-29-2014, 10:02 PM   #4
snetmcom
Senior Member
 
Location: USA

Join Date: Oct 2008
Posts: 158
Default

Could you ask your service provider to analyze the data on their Torrent Suite? It is extremely simple on the server.
Old 04-04-2014, 06:47 AM   #5
c_ro87
Member
 
Location: Buenos Aires

Join Date: Feb 2012
Posts: 12
Default

Hello, I'm analyzing AmpliSeq runs on custom gene panels, using a 316 chip.

I also wanted to try something different, so I tried GATK. After the MarkDuplicates step, however, I found that because my amplicon reads start and end at the same locations in the alignment, they are marked as PCR duplicates and removed from the following steps.

So I can't run this step. If I follow the rest of the best practices guidelines and apply hard filtering, I get a lot more variants than with the Ion variantCaller pipeline.

I mean approximately 30 variants per barcode with the Ion pipeline, and approximately 150 with GATK after the hard filtering step.

I don't know how worrying this disagreement is.

What experience do you have?
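
One way to look at a disagreement like 30 versus 150 calls is to check which variants the two pipelines share and which are unique to each. The following is a minimal Python sketch, not part of either pipeline. The helper names `variant_keys` and `compare_call_sets` are made up for illustration, and it assumes both VCFs use the same reference and the same representation for each variant. Indels in particular may need normalization first, which this sketch does not do.

```python
def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from the data lines of a VCF."""
    keys = set()
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # A multi-allelic record contributes one key per ALT allele.
        for alt in alts.split(","):
            keys.add((chrom, pos, ref, alt))
    return keys


def compare_call_sets(vcf_a, vcf_b):
    """Split two call sets into shared calls and calls unique to each."""
    a, b = variant_keys(vcf_a), variant_keys(vcf_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}
```

With two files on disk, something like `compare_call_sets(open("ion.vcf"), open("gatk.vcf"))` (file names are placeholders) gives the three sets. The calls unique to one pipeline are the ones worth inspecting by eye, for example in homopolymer regions.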
Old 04-04-2014, 12:44 PM   #6
snetmcom
Senior Member
 
Location: USA

Join Date: Oct 2008
Posts: 158
Default

An AmpliSeq library will be mostly duplicates by design. You should not be filtering these reads out.

GATK is likely calling false positives because it does not have any Ion-specific rules.
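
To see why position-based duplicate marking removes most amplicon reads, here is a small Python sketch. It is a simplification and not how Picard MarkDuplicates is actually implemented: it just treats reads with the same chromosome, start and strand as one group and keeps one read per group. The function name `duplicate_fraction` is made up for illustration.

```python
from collections import Counter


def duplicate_fraction(reads):
    """Fraction of reads flagged when only one read per (chrom, start, strand) is kept."""
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every read beyond the first in each group counts as a duplicate.
    return (total - len(counts)) / total


# Two amplicons with five reads each: all reads of an amplicon share one start.
amplicon_reads = [("chr1", 1000, "+")] * 5 + [("chr1", 5000, "+")] * 5
print(duplicate_fraction(amplicon_reads))  # 8 of 10 reads are flagged
```

With real amplicon depth of hundreds of reads per amplicon, the flagged fraction approaches 100%, which is why the step is skipped for this kind of library.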
Old 04-10-2014, 01:10 AM   #7
arnaud83
Junior Member
 
Location: France Marseille

Join Date: Apr 2014
Posts: 6
Default

I'm glad to see people who think like me about the Torrent Server Suite and its Tool Box.
I'm used to dealing with PGM data, and here is my pipeline:

- Alignment is done with bwasw. I tested several alignment programs (tmap, novoalign), and even though the percentage of mapped reads is 4-5% lower than with tmap, indels and mismatches come out better.

- For an exome I use MarkDuplicates.jar from Picard. For targeted sequencing it is not recommended, because 80-90% of the reads will be marked.

- Then I use FreeBayes to call SNPs and UnifiedGenotyper to call indels. I prefer FreeBayes for SNPs because I can set the minimum variant frequency and be more sensitive than UnifiedGenotyper, but it requires additional filtering (strand bias, quality, etc.).
For more specificity without too much work, I suggest UnifiedGenotyper for both SNPs and indels.
But to be honest, you have a better chance of calling a true indel by flipping a coin. There are too many false positives, and for an exome it's a misery.
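
The minimum variant frequency mentioned above can also be checked after calling, which makes it easy to try several cutoffs on the same VCF. This is a rough Python sketch that assumes the FreeBayes INFO fields `AO` (alternate observations, one value per ALT allele) and `DP` (depth) are present. The helper names and the 0.05 default are placeholders, not recommended settings.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict (flag entries map to True)."""
    out = {}
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            out[key] = value
        elif item:
            out[item] = True
    return out


def alt_fractions(vcf_line):
    """Return AO/DP for each ALT allele of one VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = parse_info(fields[7])
    depth = int(info["DP"])
    if depth == 0:
        return [0.0 for _ in info["AO"].split(",")]
    return [int(ao) / depth for ao in info["AO"].split(",")]


def passes_min_fraction(vcf_line, min_fraction=0.05):
    """Keep the record if any ALT allele reaches the minimum fraction."""
    return any(f >= min_fraction for f in alt_fractions(vcf_line))
```

Lowering `min_fraction` increases sensitivity, at the cost of more calls that then need the strand bias and quality filtering described above.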

Last edited by arnaud83; 04-10-2014 at 11:52 PM.
Old 04-11-2014, 07:00 AM   #8
c_ro87
Member
 
Location: Buenos Aires

Join Date: Feb 2012
Posts: 12
Default

@arnaud83: how do you do the strand bias filtering?
Old 04-11-2014, 10:26 AM   #9
arnaud83
Junior Member
 
Location: France Marseille

Join Date: Apr 2014
Posts: 6
Default

Quote:
Originally Posted by c_ro87 View Post
@arnaud83: how do you do the strand bias filtering?

For strand bias based on Fisher's exact test (the Phred-scaled FS value from UnifiedGenotyper), I use a threshold of 60 (p = 0.000001). Variants below this threshold are kept.
FreeBayes doesn't include this strand bias score in the VCF output, but you can easily compute it with some programming skills.
Old 04-13-2014, 03:29 AM   #10
IonTom
Member
 
Location: Germany

Join Date: Apr 2014
Posts: 32
Default

I also kind of gave up on the Ion Torrent Suite.

Currently I am using NextGenMap for alignment.

For variant calling I use Platypus. The principle is kind of similar to FreeBayes,
but the QC statistics and filters are much more complete. As in everything you can think of.

One additional filter I use is implemented in the Bioconductor VariantTools package.
It tells you at how many different positions within the reads a variant was found.
This is kind of important, as removing PCR duplicates is not really possible for
amplicon data.
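
The idea behind the read-position filter can be sketched in a few lines of Python. This is only an illustration of the concept and not the VariantTools implementation: the function names and the `(read_start, read_end, strand)` tuples are hypothetical, and the offset calculation ignores indels and soft clipping.

```python
def offset_in_read(variant_pos, read_start, read_end, strand):
    """Distance from the read's 5' end to the variant position."""
    if strand == "+":
        return variant_pos - read_start
    return read_end - variant_pos


def count_distinct_read_positions(variant_pos, supporting_reads):
    """Count the distinct in-read offsets at which the variant is observed."""
    return len({
        offset_in_read(variant_pos, start, end, strand)
        for start, end, strand in supporting_reads
    })


# Fifty identical amplicon reads support the variant at a single in-read position.
amplicon = [(100, 250, "+")] * 50
print(count_distinct_read_positions(150, amplicon))
```

A variant seen at only one in-read position despite high depth is supported by what is effectively a single observation, so how low a count is acceptable depends on how many overlapping amplicons cover the site.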

Last edited by IonTom; 04-24-2014 at 11:54 AM.
Old 04-13-2014, 10:57 PM   #11
arnaud83
Junior Member
 
Location: France Marseille

Join Date: Apr 2014
Posts: 6
Default

I did not know these tools. Thank you.
I will test them.
Old 04-24-2014, 12:00 PM   #12
IonTom
Member
 
Location: Germany

Join Date: Apr 2014
Posts: 32
Default

@arnaud83: How did they work for you ?


There is a nice paper discussing the use of aligners on Ion Torrent data:
http://www.biomedcentral.com/1471-2164/15/264/

Last edited by IonTom; 04-24-2014 at 01:09 PM.
Old 04-26-2014, 07:45 AM   #13
wolfpack14
Member
 
Location: Raleigh, NC

Join Date: Jan 2014
Posts: 12
Default

The homopolymer issue in Ion Torrent data can be partially mitigated by setting frequency thresholds based on mixture fractions. The solution can be applied in post-processing or integrated into one of these variant caller applications. We're working on a paper right now that demonstrates the methodology in a production lab environment (vs. the academic environment you see in most papers).
Old 04-29-2014, 04:54 AM   #14
arnaud83
Junior Member
 
Location: France Marseille

Join Date: Apr 2014
Posts: 6
Default

Quote:
Originally Posted by IonTom View Post
@arnaud83: How did they work for you ?


There is a nice paper discussing the topic of using aligners on ion torrent data:
http://www.biomedcentral.com/1471-2164/15/264/

Well, to be honest, I'm a little disappointed by MOSAIK. The paper mentioned above shows promising results, but I obtained worse results than with bwa or tmap.
Old 06-09-2014, 12:32 AM   #15
gmarco
Member
 
Location: Spain

Join Date: Oct 2012
Posts: 36
Default

I'm very happy to see that this topic has received so many answers. I'm willing to try all these tools.

Quote:
Originally Posted by wolfpack14 View Post
The homopolymer issue in Ion Torrent data can be partially mitigated by setting frequency thresholds based on mixture fractions. The solution can be applied in post-processing or integrated into one of these variant caller applications. We're working on a paper right now that demonstrates the methodology in a production lab environment (vs. the academic environment you see in most papers).

Hello wolfpack, do you have an ETA?

I have experienced very, very slow GATK UnifiedGenotyper runs when calling variants on Ion Torrent exomes. Has anyone else had this issue?

Ion Torrent data has 2 major issues:
1 - Dealing with the homopolymer problem (how the hell are we supposed to filter those reads, or deal with them?)
2 - Setting up correct variant calling settings.

Last edited by gmarco; 06-09-2014 at 12:39 AM.