**Kashliks** · 03-12-2012, 12:44 AM

Hey austic

I almost started a new thread about SNP filtering parameters, but I might as well join this one.
I am also using the Lasergene package for evaluation.

According to publications, there should be around 20-25k SNPs in a human exome. I am not sure how much these numbers vary between individuals, but I suppose that's the target.

So I set the parameters to:
show All SNPs,
show Coding SNPs only (as I am interested in exome),
Q call 40,
P not ref - 90%,
and SNP percent filter 50-100.

The last one is the most arguable, because it says to show only SNPs that have been seen in at least half (50%) of the reads. I think it's too stringent, but it is the only setting that lets me arrive at 23k SNPs (which is supposedly what is expected) and gives me a transition/transversion (Ti/Tv) ratio of 3.05 (exome is supposed to be 3-3.5, random is 0.5, and the genome on average is 2.0-2.1). If I relax any of the parameters above, this "check" fails.
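For anyone who wants to reproduce this check outside Lasergene, here is a minimal Python sketch of the same idea: apply the thresholds, then compute Ti/Tv on what survives. The record fields (`coding`, `q_call`, `p_not_ref`, `snp_percent`) and the function names are hypothetical, not Lasergene's export format, and treating every threshold as an inclusive "at least" cut-off is my assumption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def passes_filters(rec, min_q=40, min_p_not_ref=90.0, pct_range=(50.0, 100.0)):
    """Return True if a SNP record passes the thresholds from the post.

    `rec` is a dict with hypothetical keys: coding (bool), q_call,
    p_not_ref (percent) and snp_percent (percent of reads with the SNP).
    """
    low, high = pct_range
    return bool(
        rec["coding"]
        and rec["q_call"] >= min_q
        and rec["p_not_ref"] >= min_p_not_ref
        and low <= rec["snp_percent"] <= high
    )


def is_transition(ref, alt):
    """A transition swaps purine<->purine or pyrimidine<->pyrimidine."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ti_tv_ratio(snps):
    """Compute Ti/Tv from an iterable of (ref, alt) single-base pairs."""
    ti = tv = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        # Skip anything that is not a clean single-base substitution.
        if ref == alt or ref not in BASES or alt not in BASES:
            continue
        if is_transition(ref, alt):
            ti += 1
        else:
            tv += 1
    return ti / tv if tv else float("inf")
```

Usage would be something like `ti_tv_ratio((r["ref"], r["alt"]) for r in records if passes_filters(r))`. The "random is 0.5" figure falls out of the definition: each base has one possible transition and two possible transversions, so if all twelve substitutions were equally likely the ratio would be 4/8 = 0.5.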

Any thoughts and comments on these parameters?

Now comes the toughest part: narrowing things down to find the cause of a rare syndrome.
I saw a scheme in this forum showing how the numbers shrink through a pipeline (like 20k->10k->700->120 and then 4-6), but there was no explanation of how it was done.
It would be nice if someone would share how to get from 23k to 1000 or fewer, so that it's possible to look at them manually.

Some more points about my data: I am surprised that about half of the SNPs are not annotated (isn't that too many NOVEL ones?). Also, half are supposed to be amino acid changing SNPs (also too many for what I know about genetics), and there are 169 STOP codon SNPs, which is supposedly possible.

I am thinking of looking at the whole trio at once in order to catch new SNPs in the child, which are most likely fakes.
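A rough sketch of what I mean by the trio comparison, assuming each sample's SNP calls can be exported as (chromosome, position, ref, alt) tuples; that key layout and the function name are my own illustration, not something Lasergene provides.

```python
def de_novo_candidates(child, mother, father):
    """Return child SNPs that were called in neither parent.

    Each argument is an iterable of (chrom, pos, ref, alt) tuples.
    A call missing from a parent may just mean low coverage there,
    so the result is a list of candidates to inspect, not confirmed
    de novo variants.
    """
    return sorted(set(child) - set(mother) - set(father))
```

Anything this returns still needs the parents' read depth checked at that position before it is taken seriously.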

Does anyone know how to check SNP conservation in evolution (I mean checking all 23k at once), and the same for amino acid changing SNPs (11k at once in PolyPhen and/or similar software)?

It would be nice to share filtering parameters and the numbers that come out, so it's possible to compare.
