SEQanswers

Old 12-15-2010, 03:39 AM   #1
gfmgfm
Member
 
Location: il

Join Date: Jun 2010
Posts: 64
Default SNP filtering

I ran samtools pileup and varfilter on human capture Illumina data.
Does anyone have suggestions on how to filter the output in order to get reliable homozygous and heterozygous SNPs?
Old 12-15-2010, 03:51 AM   #2
nickloman
Senior Member
 
Location: Birmingham, UK

Join Date: Jul 2009
Posts: 356
Default

I'd suggest looking at VarScan and playing with the settings.

http://varscan.sourceforge.net/
Old 12-15-2010, 07:08 AM   #3
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323
Default

pileup is deprecated; use mpileup instead. It computes BAQ (base alignment quality; check same link) and uses it for SNP calling. BAQ seems to help reduce false positives (check the alignment example at the bottom of the link above).
__________________
-drd
Old 12-15-2010, 09:36 PM   #4
gfmgfm
Member
 
Location: il

Join Date: Jun 2010
Posts: 64
Default

Thanks a lot for the suggestion. I already tried VarScan. The problem is that it does not give a column with the total coverage for the position, so it is difficult to decide whether a call is a reliable heterozygous or homozygous SNP. For example, here are two lines from VarScan output:
1. chrM 16236 C A 0 35 100% 0 1 0 93 0.98
2. chrM 16236 C T 0 275 100% 0 2 0 93 0.98

I thought that in order to distinguish between reliable homozygous and heterozygous SNPs, I need to know the ratio of the "A" coverage (line 1) to the total coverage for that position. I would also like to know the ratio between the most frequent nucleotide and the second most frequent nucleotide.
Without this information, how can I tell, for example in line 1, whether it is a reliable SNP and what kind of SNP it is?
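
The ratio described above can be approximated from the VarScan rows themselves. This is only a minimal sketch: it assumes the columns are Chrom, Position, Ref, Var, Reads1, Reads2, VarFreq, ... (an assumption based on how the two quoted rows read), and it treats "total coverage" as the reference reads plus the reads of every reported variant allele at that position, which ignores any reads VarScan filtered out. The function name `allele_fractions` is made up for illustration.

```python
from collections import defaultdict


def allele_fractions(lines):
    """Group VarScan-style rows by position and return each variant
    allele's share of the summed read count at that position."""
    ref_reads = {}                  # (chrom, pos) -> reads supporting the reference
    var_reads = defaultdict(dict)   # (chrom, pos) -> {allele: supporting reads}
    for line in lines:
        fields = line.split()
        # Assumed column order: Chrom Position Ref Var Reads1 Reads2 ...
        chrom, pos, var = fields[0], int(fields[1]), fields[3]
        reads1, reads2 = int(fields[4]), int(fields[5])
        key = (chrom, pos)
        ref_reads[key] = max(ref_reads.get(key, 0), reads1)
        var_reads[key][var] = reads2

    result = {}
    for key, variants in var_reads.items():
        total = ref_reads[key] + sum(variants.values())
        if total == 0:
            continue  # nothing to divide by at this position
        result[key] = {allele: count / total for allele, count in variants.items()}
    return result


# The two rows quoted above, without the leading "1." / "2." labels.
rows = [
    "chrM 16236 C A 0 35 100% 0 1 0 93 0.98",
    "chrM 16236 C T 0 275 100% 0 2 0 93 0.98",
]
fractions = allele_fractions(rows)
for allele, frac in fractions[("chrM", 16236)].items():
    print(f"{allele}: {frac:.1%}")
# A: 11.3%
# T: 88.7%
```

Under these assumptions the "A" call is supported by 35 of 310 counted reads and "T" by 275 of 310, which is the kind of comparison between the most frequent and second most frequent nucleotide asked about above.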
Old 12-15-2010, 09:44 PM   #5
gfmgfm
Member
 
Location: il

Join Date: Jun 2010
Posts: 64
Default

drio, thanks a lot for the reply. I looked at mpileup. If I already have results and conclusions based on pileup output, is it OK to use them?
Old 12-16-2010, 06:20 AM   #6
drio
Senior Member
 
Location: 41°17'49"N / 2°4'42"E

Join Date: Oct 2008
Posts: 323
Default

As long as you used reasonable filters (check the protocol in the FAQ for a starting point), yes. pileup has been used with a great deal of success in various papers.
__________________
-drd
Old 12-16-2010, 06:51 AM   #7
gfmgfm
Member
 
Location: il

Join Date: Jun 2010
Posts: 64
Default

ok, thanks.
Old 05-17-2012, 10:33 AM   #8
bioman1
Member
 
Location: US

Join Date: May 2012
Posts: 80
Default Varscan SNV

Dear all,
I am new to NGS analysis. I used bowtie (ver: bowtie-0.12.7) to align two paired-end files of Illumina reads (fastq format) to a reference sequence (fastq format). Then I used SAMtools (ver: samtools-0.1.18) to make an 'mpileup' file. Then I used VarScan (ver: v2.2.11) for variant calling. I used the 'pileup2snp' command (with default parameters) to determine SNVs and their heterozygosity and homozygosity.

1. The output is given in columns; I have listed it below as rows for easy reading.
Output:
Chrom:gi|53564564|gb|JH556356.3
Position:1781287
Ref:T
Cons:Y
Reads1:7
Reads2:2
VarFreq:22.22%
Strands1:2
Strands2:2
Qual1:27
Qual2:26
Pvalue:0.98
MapQual1:1
MapQual2:1
Reads1Plus:5
Reads1Minus:2
Reads2Plus:1
Reads2Minus:1
VarAllele:c


2. Can anyone tell me how to identify SNVs (and how many of them are heterozygous or homozygous) from the above output? I have searched this forum but could not find any help.
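
One possible way to read a record like the one above, offered as a sketch rather than as VarScan's own logic: VarFreq appears to be Reads2 / (Reads1 + Reads2), since 2 / (7 + 2) gives the reported 22.22%, and the Cons value "Y" is the IUPAC ambiguity code for C or T, which suggests a heterozygous call. The helper names and the rule "ambiguity code means heterozygous, plain non-reference base means homozygous" are assumptions made for illustration.

```python
# Two-base IUPAC ambiguity codes and the bases they stand for.
IUPAC = {"R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}
BASES = {"A", "C", "G", "T"}


def var_freq(reads1, reads2):
    """Fraction of reads supporting the variant allele."""
    total = reads1 + reads2
    return reads2 / total if total else 0.0


def zygosity_from_consensus(ref, cons):
    """Guess zygosity from the consensus code (assumed interpretation)."""
    ref, cons = ref.upper(), cons.upper()
    if cons in IUPAC:
        return "heterozygous"          # consensus covers two different bases
    if cons in BASES and cons != ref:
        return "homozygous variant"    # consensus is a single non-reference base
    return "reference or uncallable"


# Values taken from the record listed above.
record = {"Ref": "T", "Cons": "Y", "Reads1": 7, "Reads2": 2}
freq = var_freq(record["Reads1"], record["Reads2"])
call = zygosity_from_consensus(record["Ref"], record["Cons"])
print(f"VarFreq = {freq:.2%}, call = {call}")
# VarFreq = 22.22%, call = heterozygous
```

Counting heterozygous and homozygous SNVs would then be a matter of applying such a function to every row and tallying the results. With only 9 reads at this position, any such call deserves caution.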
Old 02-03-2013, 10:01 AM   #9
soban
Junior Member
 
Location: sweden

Join Date: Nov 2011
Posts: 5
Default SNP Filtering

Dear Fellows,
I am new to NGS technologies. I ran BWA to map my reads, then used GATK for SNP calling. Now I want to filter the SNPs, but I don't know how to proceed. Please name some tools for filtering SNPs and explain how to use them.

Last edited by soban; 02-07-2013 at 02:49 AM.