02-07-2012, 10:00 AM   #1
Junior Member
Location: Mobile, AL

Join Date: Feb 2012
Posts: 2
post-assembly genome analysis workflow question

Hi All,

We want to extract sequence information (for various genes) from a number of genome assemblies and generate consensus sequences for comparison between genomes representing different experiments.

What we have been doing is using samtools to extract regions from the genomic BAM file, then trying to convert those into FASTA format using bam2fastq. Everything we've extracted so far has been groups of overlapping short reads; we have not been successful at obtaining consensus sequences.

Is there an alternative workflow that would be more efficient/better? Are there suggestions for tools we should be using instead of/in addition to samtools and bam2fastq?

(Note: We have tried using the samtools programs (mpileup and bcftools view) to generate a consensus sequence. Unfortunately, running 'bcftools view', whether pipelined or not, generates the following error: [bcf_sync] incorrect number of fields (0 != 5) at 0:0)
tom_mlvs
02-07-2012, 10:29 AM   #2
Senior Member
Location: San Diego

Join Date: May 2008
Posts: 912

You are trying to make a VCF with bcftools? That should work; there's probably something off about your input file.

Do you mean genome assemblies, or alignments? I assume alignments, since you have bam files?

I suppose you could try using samtools view to pull sections of your .bam, and then you could put those .bams through velvet, to assemble a consensus for that sample at that region.
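For the region-extraction step, a minimal sketch follows. It assumes a coordinate-sorted BAM and samtools on the PATH; `extract_region` is a hypothetical helper name, not a samtools command, and the file names are only placeholders.

```shell
# Sketch only: assumes a coordinate-sorted BAM and samtools on PATH.
# extract_region is a hypothetical helper name, not part of samtools.
extract_region() {
    bam=$1      # input alignments, e.g. exome_input.bam
    region=$2   # region string, e.g. 12 or 12:START-END
    out=$3      # output BAM holding only reads from that region

    # Region queries need an index next to the BAM (<bam>.bai).
    if [ ! -e "$bam.bai" ]; then
        samtools index "$bam" || return 1
    fi

    # -b writes BAM; the region name must match a sequence name
    # in the BAM header (for example 12 versus chr12).
    samtools view -b "$bam" "$region" > "$out"
}
```

Calling `extract_region exome_input.bam 12 chr12.bam` would pull one chromosome; a gene-sized region would use the `name:START-END` form with real coordinates in place of START and END.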

But I think getting the vcf files for those regions is the way to go. You just aren't doing it right.
swbarnes2
02-07-2012, 11:52 AM   #3
Junior Member
Location: Mobile, AL

Join Date: Feb 2012
Posts: 2

We've been trying to pull out a chromosome and go at it that way. Should we be doing it gene by gene instead, or is there something obviously wrong here:

samtools view -b exome_input.bam 12 > chr12.bam
samtools mpileup -f chr12.fa chr12.bam > chr12_pileup
bcftools view -cg chr12_pileup > chr12_vcf

It seems we're not the only ones to get the following error:
[bcf_sync] incorrect number of fields (0 != 5) at 0:0

Maybe the suggestions here (nohup) will fix it:
Will update when we've tried.
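
One possible cause of the [bcf_sync] error, offered as a reading of the commands above rather than a confirmed diagnosis: in samtools 0.1.x, `mpileup -f` without `-u` or `-g` writes plain-text pileup, while `bcftools view` reads BCF by default. The sketch below assumes the 0.1.x command set used in this thread; `call_variants` and `call_consensus` are hypothetical helper names.

```shell
# Sketch only, for the samtools/bcftools 0.1.x commands used in this thread.
# call_variants and call_consensus are hypothetical helper names.

call_variants() {
    ref=$1   # reference FASTA, e.g. chr12.fa
    bam=$2   # alignments, e.g. chr12.bam
    vcf=$3   # output VCF

    # -u makes mpileup emit uncompressed BCF instead of text pileup,
    # which is the format 'bcftools view' reads by default.
    samtools mpileup -uf "$ref" "$bam" | bcftools view -cg - > "$vcf"
}

call_consensus() {
    ref=$1   # reference FASTA
    bam=$2   # alignments
    fq=$3    # output consensus FASTQ

    # vcfutils.pl ships with samtools 0.1.x; vcf2fq turns the calls
    # into a consensus sequence in FASTQ format.
    samtools mpileup -uf "$ref" "$bam" | bcftools view -cg - \
        | vcfutils.pl vcf2fq > "$fq"
}
```

For example, `call_consensus chr12.fa chr12.bam chr12_cns.fq` would write a consensus for the extracted chromosome, provided the sequence name in chr12.fa matches the name used in the BAM (here 12). Newer bcftools releases moved variant calling to `bcftools call`, so the `view -cg` form applies only to the older versions.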


Last edited by tom_mlvs; 02-07-2012 at 12:17 PM.
tom_mlvs
