SEQanswers

07-10-2012, 11:13 AM   #1
Gavin_Sherlock
Junior Member
 
Location: Stanford

Join Date: Oct 2011
Posts: 3
Retrieving reads with SNPs

Hi,

Does anyone know of a straightforward way to retrieve all reads (from resequencing) that contain a SNP? We're generating VCF files using BWA + GATK, and would then like to know all reads that contain a SNP listed in the VCF file. I know that I can write a wrapper that uses samtools view on a region containing a SNP and then looks at each read that maps to that location individually (the SNPs will be heterozygous). However, I wondered whether there is something faster, as I'd likely implement the wrapper in Perl, which will be slow for thousands of SNPs x many strains.

Cheers,
Gavin
08-14-2012, 07:25 PM   #2
RockChalkJayhawk
Senior Member
 
Location: Rochester, MN

Join Date: Mar 2009
Posts: 191

Check the fields in the GATK VCF. It already tells you how many reference and alternate reads there are.
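For anyone who wants those counts programmatically, here is a minimal Python sketch. It assumes the VCF has a per-sample FORMAT column containing an `AD` (allelic depths) key, which GATK-produced VCFs typically include, though that depends on the GATK version and settings used. The function name `allelic_depths` and the example record are made up for illustration.

```python
def allelic_depths(vcf_line, sample_index=0):
    """Return (ref_depth, [alt_depths]) from the AD FORMAT key, or None if absent.

    Assumes a tab-separated VCF data line with FORMAT in column 9 and
    sample columns from column 10 onwards.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    if len(cols) < 10 + sample_index:
        return None  # no FORMAT/sample columns for this record
    keys = cols[8].split(":")
    values = cols[9 + sample_index].split(":")
    if "AD" not in keys:
        return None
    idx = keys.index("AD")
    if idx >= len(values):
        return None  # trailing fields may be dropped in some records
    # Treat a missing value (".") as zero depth
    depths = [0 if x == "." else int(x) for x in values[idx].split(",")]
    return depths[0], depths[1:]


# Hypothetical heterozygous record, not taken from real data
example = "chr1\t12345\t.\tA\tG\t50.0\tPASS\tDP=20\tGT:AD:DP:GQ:PL\t0/1:11,9:20:99:255,0,255"
print(allelic_depths(example))
```

This only gives the counts, not the reads themselves.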
10-10-2012, 08:54 AM   #3
cpendlebury
Junior Member
 
Location: Melbourne, Australia

Join Date: May 2011
Posts: 3

I believe he wants the actual sequences of the reads that carry the SNPs, rather than just the read counts.

If so, I'd like to figure out how to do the same thing.
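
A rough sketch of the wrapper approach described in the first post, in Python rather than Perl: take SAM-format lines for reads overlapping a SNP (for example, text output of samtools view on a region), walk each read's CIGAR string to find the base aligned to the SNP position, and keep the reads that carry the alternate allele. The function names `base_at` and `reads_with_allele` are hypothetical, and this makes no claim to be faster than other approaches. It handles only single-base substitutions.

```python
import re

# One CIGAR operation: a length followed by an op code from the SAM spec
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def base_at(pos, cigar, seq, ref_pos):
    """Return the read base aligned to 1-based ref_pos, or None if the read
    has no base there (deletion, skipped region, or outside the alignment)."""
    ref = pos    # 1-based reference coordinate of the next aligned base
    query = 0    # 0-based index into seq
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":          # consumes both reference and read
            if ref <= ref_pos < ref + length:
                return seq[query + (ref_pos - ref)]
            ref += length
            query += length
        elif op in "IS":         # consumes read only
            query += length
        elif op in "DN":         # consumes reference only
            if ref <= ref_pos < ref + length:
                return None
            ref += length
        # H and P consume neither
    return None


def reads_with_allele(sam_lines, chrom, ref_pos, alt):
    """Return (read_name, sequence) for mapped reads showing `alt` at ref_pos."""
    hits = []
    for line in sam_lines:
        if line.startswith("@"):
            continue             # header line
        f = line.rstrip("\n").split("\t")
        if len(f) < 11:
            continue             # not a full SAM record
        flag, cigar, seq = int(f[1]), f[5], f[9]
        if flag & 4 or f[2] != chrom or cigar == "*" or seq == "*":
            continue             # unmapped, wrong contig, or no alignment/sequence
        base = base_at(int(f[3]), cigar, seq, ref_pos)
        if base is not None and base.upper() == alt.upper():
            hits.append((f[0], seq))
    return hits
```

To use it, loop over the records in the VCF and pass each CHROM, POS, and ALT to `reads_with_allele` together with the SAM lines for that region. Reading the BAM through a library binding instead of parsing text would likely be quicker for thousands of SNPs, but that is not shown here.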