SEQanswers

04-08-2012, 11:02 AM   #21
Medo
Junior Member
Location: germany
Join Date: Mar 2012
Posts: 3
mpileup and GATK command for haploid genomes

Hi,
I would like to ask about the samtools mpileup and GATK commands for a haploid bacterial genome.
I have tried them many times, but they always hang.
I did my alignment with Bowtie 2, which allows gapped alignments.
For instance, this is my mpileup command:

samtools mpileup -uf NC_008596.1.fasta mt1sortfilter.bam ->snp/pileup/mt1.pileup

I don't know what's wrong, but it freezes and produces nothing for hours.

Thanks
04-16-2012, 11:32 AM   #22
vv85
Member
Location: Mexico
Join Date: Feb 2011
Posts: 17

Quote:
Originally Posted by Medo
Hi,
I would like to ask about the samtools mpileup and GATK commands for a haploid bacterial genome.
I have tried them many times, but they always hang.
I did my alignment with Bowtie 2, which allows gapped alignments.
For instance, this is my mpileup command:

samtools mpileup -uf NC_008596.1.fasta mt1sortfilter.bam ->snp/pileup/mt1.pileup

I don't know what's wrong, but it freezes and produces nothing for hours.

Thanks

The - before the > might be the problem.
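
A likely explanation, assuming samtools follows the common convention of treating a lone - as standard input: in "->", the shell takes ">snp/pileup/mt1.pileup" as the redirection and passes "-" to samtools as an extra input file, so mpileup sits waiting for data from the terminal. The sketch below first shows the same convention with cat, then the posted command with the stray - removed. The demo file names and the guard around samtools are additions for illustration only.

```shell
# Many command-line tools read a lone "-" argument as "standard input".
# cat follows that convention, so it makes a safe demonstration:
printf 'from file\n' > demo.txt
printf 'from stdin\n' | cat demo.txt - > demo.out   # "-" consumes the piped line
cat demo.out

# The posted command without the stray "-" (file names exactly as in the post).
# The guard only skips the call when samtools or the input files are absent.
if command -v samtools >/dev/null 2>&1 \
   && [ -f NC_008596.1.fasta ] && [ -f mt1sortfilter.bam ]; then
    mkdir -p snp/pileup
    samtools mpileup -uf NC_008596.1.fasta mt1sortfilter.bam > snp/pileup/mt1.pileup
fi
```

Note that with -u, the samtools releases of that period write uncompressed BCF rather than a text pileup, so the output is normally piped on to bcftools instead of being read directly.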
04-17-2012, 03:21 AM   #23
Medo
Junior Member
Location: germany
Join Date: Mar 2012
Posts: 3

Hi vv85,
Thanks a lot, that was the reason.
But do you know whether samtools pileup and GATK are really applicable to haploid genomes, or will I get false-positive variants?

Thanks a lot
04-18-2012, 07:47 AM   #24
vv85
Member
Location: Mexico
Join Date: Feb 2011
Posts: 17

Like another poster mentioned, I prefer using samtools on haploid genomes. False-positive variants are always possible, depending on the initial sequencing data you're using and specific features of your genome.
05-25-2012, 11:05 AM   #25
zhiwei
Junior Member
Location: usa
Join Date: Jul 2011
Posts: 3

You may try this recent program, SNVer.
http://snver.sourceforge.net/
Its model accounts for the number of haploids, so it is applicable to haploid genomes too.


Quote:
Originally Posted by d17
Does anyone have any thoughts on calling SNPs from short read data (e.g. Illumina) in haploid genomes? It seems that many SNP calling programs are set up to deal only with diploid genomes (e.g. GATK's UnifiedGenotyper).

I found the program FreeBayes from the Marth Lab which allows you to specify the ploidy. This looks like a good candidate and I will definitely try it. It appears to be unpublished.

Does anyone have any experience with calling SNPs in haploid genomes using FreeBayes or another program?

Thanks!
07-11-2012, 01:54 PM   #26
ragowthaman
Member
Location: Seattle, USA
Join Date: Nov 2009
Posts: 12

@Kasycas and @jgibbons1,
It's quite possible you have already written or found a script to map your SNPs onto genes (or to identify synonymous and non-synonymous mutations).
I use the snpEff program for that. All you need is your VCF file and gene annotations in GFF format.

http://snpeff.sourceforge.net/

I shamefully admit that I wrote an (inferior) script to do it myself before finding this one.
Gowthaman
10-01-2012, 06:00 AM   #27
ekg
Member
Location: Boston, MA
Join Date: Apr 2010
Posts: 36

Quote:
Originally Posted by garwuf
I gave Freebayes quite an extensive try recently, and wouldn't recommend it in its current state. I tried it on several bacterial datasets (4 - 6 Mb in size) that had previously been evaluated with Gigabayes, Samtools and GATK, and found that Freebayes reports nonexistent SNPs while missing well-defined ones. In fact, not a single SNP was correctly predicted, no matter which parameters were used.

Then, after reading the above post by d17, I decided to try Freebayes on a smaller reference. I generated two artificial sets of reads against a 128 kb template with 10 variant sites of different complexity. One set provided 50x coverage, the other 400x, and the alignment was performed with bwa. On these alignments, Freebayes generated sane VCF output: no false positives, and several SNPs were detected correctly. Still, the sensitivity was quite low: for the 50x dataset it never reported more than 3 variants out of 10, and for the 400x dataset it reported 4-5 depending on settings. For comparison, Samtools 0.1.18 detected all 10 variants even on the 50x dataset.

To my mind, Freebayes may have some problem with handling cached sequence data, which would explain why it works with kb-sized references but fails on Mb-sized ones. On the other hand, it's still being developed. Maybe these bugs will eventually be fixed.

I'm the author of freebayes.

Did you submit bug reports about these issues? We have been using freebayes for haploid detection without issue.

When you say that freebayes was reporting many false SNPs, was this before or after you filtered the output on the QUAL field? It is our expectation that users filter the output data, and the output will include many SNPs with very low reported quality so as to allow filtering at any desired level.

The test setup you are describing is very similar to one we use during development, but your results are dramatically different.

Also, I am not aware of any existing issues with larger genomes, as we typically work with human samples, but again, I will be able to resolve anything with a bug report.

It's likely that if other users reported the same issues they have been resolved in the time since you tested.
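
The QUAL filtering mentioned above can be sketched with plain awk, assuming a tab-separated VCF in which QUAL is the sixth column. The records, the file names, and the threshold of 20 are made up for illustration; they are not freebayes output or a recommended cutoff.

```shell
# Toy VCF standing in for variant-caller output (all records are invented).
{
    printf '##fileformat=VCFv4.1\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'chr1\t101\t.\tA\tG\t0.02\t.\t.\n'
    printf 'chr1\t250\t.\tC\tT\t87.5\t.\t.\n'
    printf 'chr1\t733\t.\tG\tA\t19.9\t.\t.\n'
    printf 'chr1\t900\t.\tT\tC\t240\t.\t.\n'
} > calls.vcf

# Keep header lines (starting with "#") plus records whose QUAL (column 6)
# is at least MIN_QUAL. MIN_QUAL is a hypothetical threshold.
MIN_QUAL=20
awk -F'\t' -v min="$MIN_QUAL" '/^#/ || $6 >= min' calls.vcf > calls.filtered.vcf

# Count the records that survive the filter.
grep -vc '^#' calls.filtered.vcf
```

Because the unfiltered file is kept, the same calls can be re-filtered at a different threshold without rerunning the caller.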
10-01-2012, 06:06 AM   #28
ekg
Member
Location: Boston, MA
Join Date: Apr 2010
Posts: 36

To answer the original post, simply running

% freebayes -p 1 -f reference.fasta alignments.bam

is sufficient to generate haploid SNP, indel, and complex allele calls using freebayes. The method is described in arXiv:1207.3907, "Haplotype-based variant detection from short-read sequencing."

If anyone has issues with this method, please report them to me (via email) or to the freebayes mailing list.

Happy variant detecting.

Tags
haploid, snps

All times are GMT -8.