Hi everyone,
I found in the literature that SNP detection in human exome data is expected to yield approximately 20,000 SNPs.
We did exome sequencing with SeqCap NimbleGen v2 at 30x coverage on an Illumina HiSeq and got 100 bp paired-end reads.
I analysed our exome data with CLC Bio (mapping + variant calling) and found 190,000 SNPs (through the Genomics Gateway - SNP Detection tool).
These were the parameters I used for SNP detection:
• Quality
o Window length: 11
o Maximum number of gaps and mismatches: 2
o Minimum average quality of surrounding bases: 15
o Minimum quality of central base: 20
• Significance
o Non-specific and low-quality matches are ignored during SNP detection
o Minimum coverage: 4
o Minimum variant frequency: 35%
o Minimum paired coverage: 0
o Maximum coverage: 25
o Minimum variant count: required 1, sufficient 5
• Ploidy
o Maximum expected variation: 2
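To make the Significance thresholds concrete, here is a minimal Python sketch of how I understand them to combine at a single position. This is only my reading of the settings, not CLC's actual implementation: the names `SnpThresholds` and `passes_thresholds` are hypothetical, and I am assuming that "sufficient" means a variant count at or above that value is accepted even when the frequency threshold is not met.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpThresholds:
    """Significance settings from the list above (hypothetical container)."""
    min_coverage: int = 4
    max_coverage: int = 25
    min_variant_frequency: float = 0.35
    required_variant_count: int = 1
    sufficient_variant_count: int = 5


def passes_thresholds(coverage: int, variant_count: int,
                      t: SnpThresholds = SnpThresholds()) -> bool:
    """Return True if a candidate site would pass the thresholds.

    Assumed logic, not taken from the CLC documentation:
    coverage must lie within [min_coverage, max_coverage]; the variant must
    be seen at least `required_variant_count` times; it is then accepted if
    its count reaches `sufficient_variant_count` or its frequency reaches
    `min_variant_frequency`.
    """
    if coverage < t.min_coverage or coverage > t.max_coverage:
        return False
    if variant_count < t.required_variant_count:
        return False
    if variant_count >= t.sufficient_variant_count:
        return True
    return variant_count / coverage >= t.min_variant_frequency


if __name__ == "__main__":
    # (coverage, variant_count) pairs chosen only for illustration.
    for cov, count in [(10, 4), (10, 3), (20, 5), (30, 15), (3, 3)]:
        print(cov, count, passes_thresholds(cov, count))
```

Under these assumptions, a position covered by 20 reads with 5 variant reads passes even though its frequency is only 25%, while a position covered by 30 reads is skipped because it exceeds the maximum coverage of 25.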
I'm fairly new to exome data analysis, so I was wondering whether someone could explain this huge number of SNPs, or point me to examples in the literature where the same problem occurred?