  • How to estimate tumor purity?

    Typically, tumor samples contain normal cells to a certain extent. Some programs that deal with somatic mutations take tumor purity percentage as an input, e.g. ExomeCNV, VarScan, etc.

    I noticed that ExomeCNV's LOH calling algorithm can supposedly be used to estimate the percentage of contamination by normal cells, but this is not well documented in the User Guide. Can anyone show me how to do this?

    Or are there other programs that can estimate tumor purity?

    Thanks in advance.

  • #2
    Succinctly, in a pure tumor sample you expect heterozygous somatic SNVs to be present at a frequency around 50% (and homozygous changes at 100%). If your tumor is impure, these fractions drop in proportion to the tumor fraction. If your sample is 10% normal cells, your numbers would be about 45% and 90%. Regions with LOH or single copy number loss provide larger numbers of somatic mutations that should occur at nearly 100%, making this estimation easier.
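
The arithmetic above can be sketched in R. This is a minimal illustration, not part of any of the tools mentioned: the function names `expected_vaf` and `purity_from_het_vaf` and their parameters are hypothetical, and the sketch assumes that normal cells are diploid and that every tumor cell carries the mutation.

```r
# Expected variant allele frequency (VAF) of a somatic mutation in a mixed sample.
#   purity     : fraction of cells in the sample that are tumor cells
#   mut_copies : mutated copies per tumor cell at the locus
#   tumor_cn   : total copies per tumor cell at the locus
# Assumption: normal cells are diploid and carry no mutated copies.
expected_vaf <- function(purity, mut_copies = 1, tumor_cn = 2) {
  (purity * mut_copies) / (purity * tumor_cn + (1 - purity) * 2)
}

het_vaf  <- expected_vaf(0.9, mut_copies = 1, tumor_cn = 2)  # heterozygous, copy-neutral
hom_vaf  <- expected_vaf(0.9, mut_copies = 2, tumor_cn = 2)  # homozygous (copy-neutral LOH)
loss_vaf <- expected_vaf(0.9, mut_copies = 1, tumor_cn = 1)  # single-copy loss

# Inverting the copy-neutral heterozygous case: purity is about twice the VAF.
purity_from_het_vaf <- function(vaf) 2 * vaf

print(c(het = het_vaf, hom = hom_vaf, loss = loss_vaf))
print(purity_from_het_vaf(het_vaf))
```

Under these assumptions a 90% pure sample gives 0.45 and 0.90 for the copy-neutral heterozygous and homozygous cases, matching the figures above. For a single-copy loss the same formula gives roughly 0.82 rather than 0.90, because each contaminating normal cell contributes two copies while each tumor cell contributes one.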



    • #3
      Is this what you are looking for?

      http://genomebiology.com/2013/14/7/R80/abstract



      • #4
        Originally posted by davidblaney:
        Is this what you are looking for?
        http://genomebiology.com/2013/14/7/R80/abstract

        Looks good. Thanks a lot!



        • #5
          This is the function in ExomeCNV that estimates the fraction of normal cells in a tumor sample. It takes two arguments. The first is a vector of logR values (log2 ratios, judging by the base-2 exponentiation in the code), one per region, indexed from 1 to length(logR). The second is a vector of indices into the logR vector marking the regions determined to be LOH.

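          ========================================
          guesstimate.contamination <-
          function(logR, region.idx=NULL) {
            # Default: use every region when no LOH indices are given
            if (is.null(region.idx)) region.idx = 1:length(logR)
            # Median logR over the selected (LOH) regions, ignoring NAs
            med.logR = median(logR[region.idx], na.rm=TRUE)
            # rho is 0.5 for a negative median logR and 1.5 otherwise
            if (med.logR < 0) { rho = 0.5 } else { rho = 1.5 }
            return((2**med.logR - rho)/(1 - rho))
          }
          ========================================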

          From the look of it, it takes the median of the logR values in the LOH regions and then applies the following formula.

          Let logR_m be the median logR among the LOH regions:

          if logR_m >= 0, contamination = 3 - 2^(logR_m + 1)
          if logR_m < 0, contamination = 2^(logR_m + 1) - 1

          It seems to me that the contamination estimate can theoretically fall outside the bounds of 0 and 1. Do you think this function is correct?
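
One way to read the function: if a fraction c of the cells is normal (coverage ratio 1) and the rest carry a region whose tumor-to-normal copy ratio is rho, the observed ratio is 2^logR = c + (1 - c)*rho, which solves to c = (2^logR - rho)/(1 - rho). Under that reading (an interpretation, not something stated in the documentation quoted here), rho = 0.5 would correspond to a single-copy loss and rho = 1.5 to a single-copy gain. The sketch below re-implements the same arithmetic under a hypothetical name, `contamination_from_logR`, to show where the result leaves the 0 to 1 range.

```r
# Same arithmetic as the function quoted above, vectorized over median logR values.
# The name contamination_from_logR is hypothetical and used only for illustration.
contamination_from_logR <- function(med_logR) {
  rho <- ifelse(med_logR < 0, 0.5, 1.5)
  (2^med_logR - rho) / (1 - rho)
}

# A small grid of median logR values, from a strong loss to a strong gain
med_logR <- c(-1.5, -1, -0.5, 0, 0.5, 1)
est <- contamination_from_logR(med_logR)

print(data.frame(med_logR = med_logR,
                 contamination = round(est, 3),
                 negative = est < 0))
```

On this grid the estimate is negative for a median logR of -1.5 and of 1, and lies between 0 and 1 for the values in between. More generally, the two branches stay within 0 and 1 only when the median logR is between -1 and log2(1.5) (about 0.585). The estimate can go below 0 but never above 1.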



          • #6
            According to the formula described in their paper, the estimate is

            c = 1 - 2*average(|BAF_LOH - 0.5|)

            This formula seems to make more sense. It also won't let c go out of bounds.

            So which formula is correct?
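
The formula from the paper can be sketched in R as follows. The function name `contamination_from_baf` and the simulated values are hypothetical. The simulation assumes copy-neutral LOH, where a site that is homozygous in the tumor cells and heterozygous in the normal cells has a B-allele frequency (BAF) of 1 - c/2 or c/2 in the mixture.

```r
# Contamination estimate from B-allele frequencies at sites inside LOH regions:
#   c = 1 - 2 * mean(|BAF - 0.5|)
# The name contamination_from_baf is hypothetical and used only for illustration.
contamination_from_baf <- function(baf_loh) {
  stopifnot(all(baf_loh >= 0 & baf_loh <= 1, na.rm = TRUE))
  1 - 2 * mean(abs(baf_loh - 0.5), na.rm = TRUE)
}

# Simulated, noise-free BAFs for copy-neutral LOH with 30% normal cells:
# half the sites retain the B allele, half retain the A allele.
true_c <- 0.3
baf <- c(rep(1 - true_c / 2, 5), rep(true_c / 2, 5))

print(contamination_from_baf(baf))
```

Because any BAF lies between 0 and 1, |BAF - 0.5| is at most 0.5, so this estimate always falls between 0 and 1. If the LOH comes from a single-copy loss instead, the same calculation gives 2c/(1 + c) rather than c, so the two formulas in this thread appear to rest on different assumptions about the LOH regions.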
