
Old 08-11-2015, 02:35 PM   #1
mpileup to STRUCTURE

Hi all,
I thought I would see if anyone has a different (better!) pipeline for collecting a set of SNPs from many BAM files for use with STRUCTURE or other population genetics software. The end goal is a matrix with SNPs as rows and samples as columns:

         x   y   z
Site 1   T   T   G
Site 2   G   A   C
Site 3   A   G   G

My current pipeline:
1. Create mpileup file
2. Call SNPs (with your favorite software) for each sample
3. Create a merged SNP list, with the end goal of having each position where there is a SNP in at least one sample
4. Create consensus fasta file from mpileup for each sample
5. Extract consensus nucleotide from each position (#3) from consensus fasta (#4)
6. Merge files

One problem is that some SNP sites found in particular samples may not pass quality filters in other samples. I would love to hear of other pipelines.
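
For steps 3, 5 and 6, here is a minimal Python sketch rather than a tested pipeline. It assumes the per-sample SNP positions and consensus sequences have already been loaded into dictionaries. The names snps_by_sample, consensus_by_sample, merged_snp_sites and build_matrix are made up for illustration, and the toy data is not real.

```python
def merged_snp_sites(snps_by_sample):
    """Step 3: union of (chrom, pos) sites with a SNP in at least one sample."""
    sites = set()
    for sites_in_sample in snps_by_sample.values():
        sites.update(sites_in_sample)
    return sorted(sites)


def build_matrix(snps_by_sample, consensus_by_sample):
    """Steps 5-6: look up every sample's consensus base at each merged site."""
    samples = sorted(consensus_by_sample)
    rows = []
    for chrom, pos in merged_snp_sites(snps_by_sample):
        # pos is assumed to be 1-based, so subtract 1 to index the string
        bases = [consensus_by_sample[s][chrom][pos - 1] for s in samples]
        rows.append((chrom, pos, bases))
    return samples, rows


# Toy input (hypothetical): SNP positions called per sample ...
snps_by_sample = {
    "x": {("chr1", 2)},
    "y": {("chr1", 2), ("chr1", 5)},
    "z": {("chr1", 7)},
}
# ... and one consensus sequence per sample and chromosome
consensus_by_sample = {
    "x": {"chr1": "ATCGGAA"},
    "y": {"chr1": "AGCGTAA"},
    "z": {"chr1": "ATCGGAG"},
}

samples, rows = build_matrix(snps_by_sample, consensus_by_sample)
print("chrom", "pos", *samples, sep="\t")
for chrom, pos, bases in rows:
    print(chrom, pos, *bases, sep="\t")
```

Every sample is looked up at every merged site, so a site that failed the filters in one sample still receives that sample's consensus base. This sketch does nothing to flag such low-confidence calls.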

jgibbons1
Old 08-11-2015, 08:27 PM   #2
Registered Vendor
Have you tried just sending the mpileup to bcftools call?
samtools mpileup -gu -t DP -f ref.fasta -b bam_file_list.txt | bcftools call -cv - > genotypes.vcf
You'll need to sort the BAM files first, with samtools sort for instance.

Or the equivalent with freebayes:
freebayes -f $ref_fasta_file bam_string > vcf_name
You'll need to add read groups to the BAMs with bamaddrg -b bamfile and then index them with bamtools index.

Either output should then be filtered with vcftools or vcflib, for example to keep only the SNPs that are called in a minimum fraction of the samples, counting only calls that reach a minimum depth in each sample.
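
To make the filtering idea concrete, here is a rough Python sketch of the logic. It is not a replacement for vcftools or vcflib. It assumes a multi-sample VCF whose FORMAT column carries GT and DP. The thresholds (min_depth=8, min_presence=0.8), the function names and the toy records are all made up for illustration.

```python
def sample_call(format_keys, sample_field, min_depth):
    """Return the GT string if the call is present and deep enough, else None."""
    values = dict(zip(format_keys, sample_field.split(":")))
    gt = values.get("GT", ".")
    dp = values.get("DP", ".")
    if "." in gt or not dp.isdigit() or int(dp) < min_depth:
        return None
    return gt


def filter_vcf(lines, min_depth=8, min_presence=0.8):
    """Keep biallelic SNPs called at >= min_depth in >= min_presence of samples."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        ref, alt = fields[3], fields[4]
        # Drop indels, multi-allelic sites and sites with no ALT allele
        if len(ref) != 1 or len(alt) != 1 or alt == ".":
            continue
        format_keys = fields[8].split(":")
        calls = [sample_call(format_keys, f, min_depth) for f in fields[9:]]
        present = sum(c is not None for c in calls)
        if calls and present / len(calls) >= min_presence:
            kept.append((fields[0], int(fields[1]), calls))
    return kept


# Toy records (hypothetical), three samples x, y, z
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tx\ty\tz",
    "chr1\t2\t.\tT\tG\t50\t.\t.\tGT:DP\t0/0:12\t1/1:10\t0/0:9",
    "chr1\t5\t.\tG\tT\t40\t.\t.\tGT:DP\t0/0:3\t1/1:15\t./.:0",
    "chr1\t7\t.\tA\tG,C\t60\t.\t.\tGT:DP\t0/0:20\t0/0:20\t1/2:20",
]

kept = filter_vcf(vcf_lines, min_depth=8, min_presence=0.8)
for chrom, pos, calls in kept:
    print(chrom, pos, calls)
```

In the toy data only the first site survives: the second has too few samples with enough depth, and the third is multi-allelic.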
Providing nextRAD genotyping and PacBio sequencing services.

Last edited by SNPsaurus; 08-11-2015 at 08:30 PM.
SNPsaurus
Old 08-12-2015, 10:42 AM   #3
Thanks for these suggestions. I actually haven't explored bcftools or freebayes, so I should have a busy few days.

I will post an update once I come to a satisfactory solution.
jgibbons1
