SEQanswers

11-22-2012, 11:59 PM   #1
lpalacios
Junior Member
 
Location: Bilbao (Spain)

Join Date: Apr 2011
Posts: 5
Indel detection in high coverage amplicons sequenced by MiSeq

Hello everybody,
I have started working with the MiSeq and I am trying to sequence custom amplicons with the 500-cycle kit (250 nucleotides each for R1 and R2). I have not managed to detect all the indels, especially long ones (longer than about 18-40 nucleotides). My library contains 48 amplicons of 300 nucleotides on average, each at high coverage (around 10,000 reads). As the MiSeq Reporter results are not accurate, I am using bwa to generate the .bam files, then samtools to sort and index them, and finally GATK to detect variants. I have tried both -glm BOTH and -glm INDEL, but both failed.
When I load the .bam in IGV I can see the insertion or deletion at the expected position, and all the SNPs are shown correctly. But the .vcf produced by GATK contains a lot of false positives and false negatives, and no long indels.
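One way to put a number on what IGV shows is to tally the indels recorded in the alignments themselves, before any caller is involved. The following is only a minimal sketch, assuming the POS and CIGAR columns have already been pulled out of the SAM records; the read values, the function names and the 18-nucleotide threshold are made up for illustration.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_events(pos, cigar):
    """Yield (kind, ref_pos, length) for each insertion/deletion in one read.

    pos is the 1-based leftmost mapping position (SAM POS column).
    """
    ref_pos = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "I":
            # Insertions consume no reference bases; report the position
            # of the next reference base.
            yield ("INS", ref_pos, length)
        elif op == "D":
            yield ("DEL", ref_pos, length)
            ref_pos += length
        elif op in "MN=X":
            ref_pos += length
        # S, H and P consume no reference bases.


def tally_long_indels(reads, min_len=18):
    """Count reads supporting each indel of at least min_len nucleotides."""
    counts = Counter()
    for pos, cigar in reads:
        for event in indel_events(pos, cigar):
            if event[2] >= min_len:
                counts[event] += 1
    return counts


# Hypothetical reads as (POS, CIGAR); not taken from the data in this thread.
reads = [
    (101, "120M25D105M"),
    (101, "120M25D105M"),
    (101, "250M"),
    (131, "90M2I158M"),
]

for event, n in tally_long_indels(reads).items():
    print(event, n)
```

If a long indel has many supporting reads in a tally like this but is absent from the .vcf, the aligner has placed it and the loss is happening at the calling step.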
I have also tried Pindel, but it did not detect the long indels either, perhaps because the R1 and R2 reads overlap.
Could anybody tell me how I can improve my variant detection, especially for the indels?

Thanks in advance!

Lourdes

Last edited by lpalacios; 11-23-2012 at 12:25 AM.
11-23-2012, 01:15 AM   #2
rlong
Junior Member
 
Location: St.Louis, MO USA

Join Date: Oct 2012
Posts: 2

For Pindel, you might improve results by running samtools fixmate on your BAM. Pindel will recover some unmapped read mates, but only if they are flagged as unmapped and placed alongside the mapped mate. So if you applied stringent mapping quality restrictions during alignment, you might find more support for larger indels this way.
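To check how many reads in a BAM are in that state, the SAM flag bits can be tested directly. This is a rough sketch of one reading of "flagged as such, and located alongside the mapped mate", based on the SAM specification and not on Pindel's own code; the function name and the example records are hypothetical.

```python
# Flag bits defined by the SAM specification.
PAIRED = 0x1
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def is_anchored_unmapped(flag, rname, pos):
    """True for an unmapped read of a pair that is placed at its mapped mate.

    By SAM convention such a read carries its mate's RNAME and POS
    rather than '*' and 0.
    """
    paired = bool(flag & PAIRED)
    unmapped = bool(flag & UNMAPPED)
    mate_mapped = not (flag & MATE_UNMAPPED)
    placed = rname != "*" and pos > 0
    return paired and unmapped and mate_mapped and placed


# Hypothetical records as (QNAME, FLAG, RNAME, POS).
records = [
    ("read1", 73, "amplicon07", 101),   # 0x1 + 0x8 + 0x40: mapped, mate unmapped
    ("read1", 133, "amplicon07", 101),  # 0x1 + 0x4 + 0x80: unmapped, placed at mate
    ("read2", 141, "*", 0),             # 0x1 + 0x4 + 0x8 + 0x80: both mates unmapped
]

for qname, flag, rname, pos in records:
    print(qname, flag, is_anchored_unmapped(flag, rname, pos))
```

Note that samtools fixmate expects its input grouped by read name, so a coordinate-sorted BAM has to be name-sorted first and re-sorted by coordinate afterwards.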

You could also use samtools to call variants and see what sort of results you get there.
02-24-2015, 07:04 AM   #3
alexholman
Junior Member
 
Location: Boston, MA

Join Date: Feb 2015
Posts: 6

This thread is good and stale, but I'll add my solution for posterity.

Try the package freebayes "Bayesian haplotype-based polymorphism discovery and genotyping"
https://github.com/ekg/freebayes

The caller seems to work well on amplicon data, and it produced the cleanest and most complete VCF file (with ref and alt allele frequencies).
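For the allele frequencies, freebayes writes reference and alternate observation counts into the INFO column as RO and AO. Below is a small sketch of turning those into per-allele fractions; the VCF record is invented for illustration and `allele_fractions` is a hypothetical helper, not part of freebayes.

```python
def allele_fractions(vcf_line):
    """Return {allele: fraction} from the RO/AO INFO tags of one VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alts = fields[3], fields[4].split(",")
    # INFO is a ';'-separated list; keep only the key=value entries.
    info = dict(item.split("=", 1) for item in fields[7].split(";") if "=" in item)
    ref_obs = int(info["RO"])
    # AO holds one count per ALT allele, comma-separated.
    alt_obs = [int(x) for x in info["AO"].split(",")]
    # The denominator is the sum of counted observations, which can be
    # lower than DP if some reads support neither allele.
    total = ref_obs + sum(alt_obs)
    if total == 0:
        return {allele: 0.0 for allele in [ref] + alts}
    fractions = {ref: ref_obs / total}
    for allele, count in zip(alts, alt_obs):
        fractions[allele] = count / total
    return fractions


# Hypothetical record with a 4-nucleotide deletion; all values are made up.
example = "\t".join(
    ["amplicon07", "221", ".", "ATTTG", "A", "5000", ".", "DP=10000;RO=6000;AO=4000"]
)

for allele, fraction in allele_fractions(example).items():
    print(allele, round(fraction, 3))
```

With amplicons at around 10,000 reads each, fractions like these make it easy to separate low-level noise from indels that sit near the expected heterozygous or homozygous levels.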
Tags
amplicon sequencing, indel analysis

All times are GMT -8.

