
  • #16
    Hey austic

    I almost started a new thread about SNP filtering parameters, but I might as well join this one.
    I am also using the Lasergene package for evaluation.

    According to publications, there should be around 20-25k SNPs in a human exome. I am not sure how much these numbers vary between individuals, but I suppose that's the target.

    So I set parameters to:
    show All SNPs,
    show Coding SNPs only (as I am interested in exome),
    Q call 40,
    P not ref - 90%,
    and SNP percent filter 50-100.

    The last one is the most arguable, because it shows only SNPs that have been seen in at least half (50%) of the reads. I think that is too stringent, but it is the only setting that lets me arrive at 23k SNPs (the supposedly expected number) and gives me a transition/transversion (Ti/Tv) ratio of 3.05 (the exome is supposed to be 3-3.5, random is 0.5, and the genome average is 2.0-2.1). If I relax any of the parameters above, this "check" fails.
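For anyone who wants to reproduce this check outside Lasergene, here is a minimal sketch of the two steps described above: keep SNPs whose variant-read fraction is within 50-100%, then compute Ti/Tv on what is left. The function names, the tuple layout, and the example calls are all hypothetical; this is not how Lasergene computes its filter.

```python
# Transitions stay within one base class (purine<->purine or
# pyrimidine<->pyrimidine); everything else is a transversion.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for A<->G or C<->T. Assumes ref != alt (a real SNP)."""
    pair = {ref.upper(), alt.upper()}
    return pair <= PURINES or pair <= PYRIMIDINES


def passes_percent_filter(alt_reads, total_reads, lo=0.5, hi=1.0):
    """Keep a SNP if the fraction of reads showing the variant is in [lo, hi]."""
    if total_reads == 0:
        return False
    return lo <= alt_reads / total_reads <= hi


def ti_tv_ratio(snps):
    """Ti/Tv for an iterable of (ref, alt) pairs."""
    snps = list(snps)
    ti = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ti
    return ti / tv if tv else float("inf")


# Hypothetical calls: (ref, alt, reads with variant, total reads)
calls = [
    ("A", "G", 12, 20),
    ("C", "T", 30, 30),
    ("G", "A", 9, 18),
    ("A", "C", 15, 20),
    ("T", "C", 3, 25),   # 12% of reads, dropped by the 50-100 filter
    ("G", "T", 2, 40),   # 5% of reads, dropped by the 50-100 filter
]

kept = [(r, a) for r, a, alt_n, tot in calls if passes_percent_filter(alt_n, tot)]
print(len(kept), ti_tv_ratio(kept))
```

Rerunning this with a lower `lo` value shows how relaxing the percent filter changes both the SNP count and the Ti/Tv ratio, which is the comparison the check relies on.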

    Any thoughts and comments on these parameters?

    Now comes the toughest part - narrowing the list down in order to find the cause of a rare syndrome.
    I saw a scheme in this forum showing the numbers through a narrowing pipeline (like 20k->10k->700->120 and then 4-6), but there was no explanation of how it was done.
    It would be nice if someone shared how to get from 23k to 1000 or fewer, so that it is possible to look at them manually.


    Some more points about my data: I am surprised that about half of the SNPs are not annotated (isn't that too many NOVEL ones?). Also, half are supposed to be amino-acid-changing SNPs (also too many from what I know about genetics), and there are 169 STOP codon SNPs - OK, supposedly possible.

    I am thinking of looking at the whole trio at once in order to catch new SNPs in the child - most likely fakes.
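A rough sketch of that trio comparison, assuming each sample's SNP list has been exported as (chromosome, position, ref, alt) tuples. The export format, the function name, and the example variants are hypothetical. A variant missing from a parent's list may only reflect low coverage at that position in the parent, so the output is a list of candidates to inspect, not confirmed new mutations.

```python
def candidate_new_snps(child, mother, father):
    """Return SNPs called in the child but in neither parent, sorted."""
    return sorted(set(child) - set(mother) - set(father))


# Hypothetical SNP calls: (chromosome, position, ref, alt)
child = {
    ("chr1", 1000, "A", "G"),
    ("chr2", 500, "C", "T"),
    ("chr3", 42, "G", "A"),
}
mother = {("chr1", 1000, "A", "G")}
father = {("chr2", 500, "C", "T")}

new_in_child = candidate_new_snps(child, mother, father)
print(new_in_child)
```

Matching on the full tuple means a parent carrying a different alternate base at the same position does not mask the child's call.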

    Does anyone know how to check SNP conservation in evolution (I mean checking all 23k at once), and the same for amino-acid-changing SNPs (11k at once in PolyPhen and/or similar software)?

    It would be nice to share filtering parameters and the numbers that come out, so that it is possible to compare.
