SEQanswers

09-03-2019, 06:08 AM   #1
vl250
Extracting a panel of SNPs from a whole genome sequence

I'd like to include whole genomes we already have in a GWAS, alongside genotyping data from a SNP array.

What is the best way to extract the relevant SNPs from my whole genomes? I have a .bed file set up with all my SNP locations, and have been Googling like mad trying to find ways to do this. Would it be best to extract them from my aligned .bam file using something like:

Code:
$ bedmap --echo --fraction-map 1 <(bam2bed < reads.bam) intervals.bed > answer.bed
or

Code:
$ samtools mpileup -ugf ref.fa -l intervals.bed sample1.bam | bcftools call -vmO z -o answer.vcf.gz
Or should I just restrict the final combined VCF for the whole genomes with something like this:

Code:
$ java -jar GenomeAnalysisTK.jar -T SelectVariants -R ref.fa -V all_samples.vcf -L intervals.bed -sn Sample1 -sn Sample2 -sn Sample3
I'm not sure what the advantages and disadvantages of doing it at each stage would be, or how best to get as close to a SNP array output as possible (i.e. with the variant present at each location).

Googling usually answers all my questions but not today!
09-04-2019, 12:18 AM   #2
Gopo

I would call SNPs with your preferred variant caller (I prefer CallVariants from BBTools/BBMap), then run bedtools intersect (-a study.vcf.gz -b bed-file-of-snps); the output is a VCF file of the SNPs that intersect your panel. Note that some indels may remain after bedtools intersect, and you would need to filter those out with bcftools or vcftools.
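
A rough sketch of that intersect-then-filter step, assuming bedtools and bcftools are both on the PATH. The function name and the file names are placeholders invented for illustration. -header keeps the VCF header lines (bedtools intersect drops them by default), and -u reports each VCF record at most once even if the BED file lists a position twice.

```shell
# Hypothetical helper: keep only the SNP records of a VCF that overlap a BED panel.
# Usage: panel_snps_from_vcf study.vcf.gz snps.bed panel.vcf.gz
panel_snps_from_vcf() {
    vcf=$1
    bed=$2
    out=$3
    # Step 1: keep VCF records overlapping any BED interval, header included.
    # Step 2: drop indels and other non-SNP records, write bgzipped VCF.
    bedtools intersect -header -u -a "$vcf" -b "$bed" |
        bcftools view -v snps -Oz -o "$out" -
}
```

Because this starts from an already-called VCF, panel positions where no variant was called will not appear in the output.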

If you want to use mpileup, note that producing VCF/BCF output with samtools mpileup is deprecated, so use bcftools mpileup instead.

Code:
bcftools mpileup -Ou -f <ref.fa> <sample1.bam> <sample2.bam> <sample3.bam> | bcftools call -vmO z -o <study.vcf.gz>
The example command above is from http://www.htslib.org/workflow/
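
To get closer to array-style output, one option (my own assumption, not something from the htslib workflow page) is to restrict the pileup to the panel positions and leave -v off bcftools call, since -v outputs variant sites only. A minimal sketch; the function name and file names are hypothetical.

```shell
# Hypothetical helper: genotype only the panel positions, keeping non-variant sites.
# Usage: call_panel_sites ref.fa snps.bed panel.vcf.gz sample1.bam [sample2.bam ...]
call_panel_sites() {
    ref=$1
    bed=$2
    out=$3
    shift 3
    # -R limits the pileup to the BED regions (the BAM files must be indexed).
    # Omitting -v from "bcftools call" retains homozygous-reference sites too.
    bcftools mpileup -Ou -f "$ref" -R "$bed" "$@" |
        bcftools call -m -Oz -o "$out"
}
```

Positions with no read coverage may still be missing from the result, so it is worth comparing the output against the .bed file before merging with the array data.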