**swbarnes2** · 06-25-2012, 08:18 AM

I use samtools on bacteria all the time, and it's not really a problem. Remember that just because your organism is haploid doesn't mean that you have been given a clonal culture to analyze.

**dictseq** · 12-17-2012, 03:32 PM

We found that the following workflow works well for variant calling in haploids using samtools. To generate a consensus sequence from the realigned BAM file in FASTQ format, we used:

samtools mpileup -uf ref.fasta realigned.bam | bcftools view -cg -s sample.txt - | perl vcfutils.pl vcf2fq > consensus.fq

where sample.txt is a plain text file with the sample name in the first column and the ploidy level in the second column. For example,

sample_name 1

The columns should be tab-delimited.
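
A minimal sketch of writing such a file from Python, so that the tab delimiter is explicit rather than dependent on the text editor. The function name `write_ploidy_file` and the output path are illustrative assumptions and are not part of samtools or bcftools.

```python
import os
import tempfile


def write_ploidy_file(path, ploidy_by_sample):
    """Write one 'sample<TAB>ploidy' line per sample."""
    with open(path, "w", newline="\n") as handle:
        for sample, ploidy in ploidy_by_sample.items():
            # A literal tab separates the sample name from the ploidy level.
            handle.write(f"{sample}\t{ploidy}\n")


if __name__ == "__main__":
    # Hypothetical example: one haploid sample, written to a temporary directory.
    out_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
    write_ploidy_file(out_path, {"sample_name": 1})
    with open(out_path) as handle:
        print(repr(handle.read()))
```

Printing the `repr` makes the `\t` visible, which is a quick way to confirm that the file really contains a tab and not spaces.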

**swbarnes2** · 12-18-2012, 09:23 AM

Keep in mind that vcfutils vcf2fq will not handle indels. It just creates a small window of lowercase letters around the putative indel.
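
A rough sketch for locating those lowercase windows in a consensus sequence so they can be reviewed by hand. It assumes the sequence has already been read from the FASTQ into a Python string; the function name `lowercase_runs` is hypothetical. Lowercase bases may also mark positions masked for other reasons, so treat the hits as candidates rather than confirmed indels.

```python
import re


def lowercase_runs(sequence):
    """Return 0-based, half-open (start, end) intervals of lowercase runs."""
    return [(m.start(), m.end()) for m in re.finditer(r"[a-z]+", sequence)]


# Toy sequence for illustration only.
print(lowercase_runs("ACGTacgtACna"))  # [(4, 8), (10, 12)]
```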

**ekg** · 12-19-2012, 05:11 AM

Why not use freebayes? It's designed to support haploids and polyploids (use the --ploidy option).

**ekg** · 12-19-2012, 05:15 AM

Also, you can use vcfgeno2haplo (https://github.com/ekg/vcflib/blob/m...geno2haplo.cpp) with an arbitrarily large window size to generate a consensus fasta sequence from the output.

If you input a file with one sample, the recorded consensus sequence will be the ALT in the single VCF record in the output.

It works for indels as well as SNPs. It does not generate quality estimates.

**Zam** · 12-19-2012, 05:04 PM

This paper (apologies for self-publicity) might be of interest to you:

High-throughput microbial population genomics using the Cortex variation assembler. Z Iqbal, I Turner, G McVean, Bioinformatics 2012

http://bioinformatics.oxfordjournals.org/content/early/2012/11/19/bioinformatics.bts673.full.pdf+html

Fast highly specific and sensitive variant discovery and genotyping for microbial genomes by multi-sample de novo assembly, allowing direct genome comparison without using a reference. The paper contains examples with short and long reads, reproducing results from published papers (drug-resistant/susceptible S.aureus, in-host evolution) with a fraction of the time and effort.

**ekg** · 12-20-2012, 12:44 AM

@Zam, this looks great! I'm looking forward to reading it.

**kjm** · 02-27-2014, 02:09 PM

Hi,

I've been reading through a number of threads and manuals about this and still don't understand how to identify if SNPs are homozygous or heterozygous when viewing the vcf file. Has anyone seen any issues with using the -s parameter in the bcftools view for setting the ploidy as mentioned above? Not sure if it matters or not, but my samples aren't bacteria, but are haploid males. Thanks!

Oh and yes, we are looking into Freebayes too.

**kjm** · 02-28-2014, 01:32 PM

Hi again. I decided to test a merged sample (merged from three runs of the same sample) and got an error I'm not sure about.

Here is the command I used: samtools mpileup -Dsuf ref.fa Sample.bam | bcftools view -bcgv -s Sample.txt - > Sample.raw.bcf

Then got this:
[mpileup] 1 samples in 1 input file
<mpileup> set max per-file depth to 8000
<bcf_hdr_subsam> 1 samples in the list but not in BCF.

So something is wrong with the text file I used for the -s parameter in bcftools for ploidy (I think?). The text file seems like it should be pretty straightforward, but I don't know what could cause the "samples in the list but not in BCF" error. Any ideas? Thanks.

**ZLounsberry** · 05-11-2014, 01:26 PM

Hey kjm,
Assuming you haven't answered this on your own yet (or assuming someone else is having a similar error and needs the thread to not die at your question), I think I have an answer.

If you are piping it the way you are in your post, making the "Sample.txt" file simply a dash, tab, and 1 should cover it. That way it reads your file name as the piped filename and continues as usual. Got rid of my "<bcf_hdr_subsam> 1 samples in the list but not in BCF." error anyway... Hope that helps someone.

Sample.txt should read:
- 1
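
The error text says that names in the -s list were not found among the sample names in the BCF. Here is a minimal sketch for checking that before running the pipeline. It assumes the sample names come from the SM tags on the @RG lines of the BAM header (as printed by `samtools view -H`); if the BAM has no @RG lines, the name may be derived from the input file name instead, which this sketch does not cover. The function names and the header text are made up for illustration.

```python
def header_sample_names(sam_header_text):
    """Collect the SM values from the @RG lines of a SAM header."""
    names = set()
    for line in sam_header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SM:"):
                names.add(field[3:])
    return names


def listed_sample_names(sample_file_text):
    """Take the first whitespace-separated column of each non-empty line."""
    return [line.split()[0] for line in sample_file_text.splitlines() if line.strip()]


def missing_samples(sample_file_text, sam_header_text):
    """Return listed names that do not appear as SM tags in the header."""
    known = header_sample_names(sam_header_text)
    return [name for name in listed_sample_names(sample_file_text) if name not in known]


# Hypothetical header: two read groups that share one sample name.
header = "@HD\tVN:1.4\tSO:coordinate\n@RG\tID:run1\tSM:Sample\n@RG\tID:run2\tSM:Sample\n"
print(missing_samples("Sample\t1\n", header))      # []
print(missing_samples("Sample.bam\t1\n", header))  # ['Sample.bam']
```

An empty list means every name in the file has a matching SM tag; anything returned is a name worth checking against the header.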

**polpol** · 01-13-2015, 01:28 AM

variant calling in SAMtools

Dear ZLounsberry,

I saw your answer, but I didn't understand what the text file should look like.

I am running the following command:
samtools mpileup -uf ucsc.hg19.fasta child.bam father.bam mother.bam | bcftools view -bvcgT trioauto -s family.txt - > femvar.vcf &

My txt file is:

child.bam
father.bam
mother.bam

I get an error:
[mpileup] 3 samples in 3 input files
<mpileup> Set max per-file depth to 2666
<bcf_hdr_subsam> 3 samples in the list but not in BCF.

Do you have any suggestion how to fix the problem?
Thank you for your help,

**ZLounsberry** · 01-13-2015, 08:14 AM

Hello polpol,
Try changing your "family.txt" file to read:

- 1

instead of:

child.bam
father.bam
mother.bam

If that does not work, try:

- 3

(so a dash, a tab character, and a 1 or, if that doesn't work, maybe a 3). It's a bit off topic for this thread, so feel free to email me (my username AT gmail DOT com) and let me know if you need a hand.

haploid variant calling in SAMtools