  • ymc
    Senior Member
    • Mar 2010
    • 496

    tumor purity and edgeR

    I am currently using a tophat2-htseq-edgeR pipeline for my tumor-normal pair analysis. I basically followed the example in the edgeR manual.

    Nothing in the example required me to input an estimate of tumor purity/contamination. I believe that, in theory, tumor purity should affect the estimate of the true gene expression level in tumor cells. So is it wrong that edgeR doesn't take that into account? Or is this effect somehow corrected through the estimation of the BCV?

    Thanks.
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    edgeR isn't only for looking at tumor-normal pairs, so it doesn't make any assumptions about the type of experiment you're doing. If you want to account for tumor purity, you'll need to put that in your experimental model.


    • ymc
      Senior Member
      • Mar 2010
      • 496

      #3
      Thank you for your reply. What do you mean by "put that in your experimental model"? How would I do that?


      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
        edgeR and the other standard tools take a dataframe that describes the various experimental manipulations or confounders that you want included in a model fit. In the typical examples, the components of this dataframe are factors (genotype, treatment, etc.), but they don't have to be; you can also use continuous explanatory variables here. I believe there have been a few discussions on the Bioconductor email list among people needing to account for age or other continuous data in their models, so you might have a read through those. This assumes that you have some numeric estimate of purity, of course. If you just have a categorical estimate (low, medium, high, etc.), then you could also just use that as a factor.
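
As a rough illustration of a continuous explanatory variable in a design, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the tumor fractions and expression values are made up, `fit_purity_model` is a hypothetical helper, and ordinary least squares on log2 values stands in for the negative binomial GLM that edgeR fits. It only shows what the coefficient on a continuous covariate means: the log2 change per unit of that covariate.

```python
import numpy as np


def fit_purity_model(tumor_fraction, log2_expr):
    """Fit log2_expr ~ intercept + tumor_fraction by ordinary least squares.

    Hypothetical helper: a stand-in for a real GLM fit, for illustration only.
    """
    tumor_fraction = np.asarray(tumor_fraction, dtype=float)
    # Design matrix: a column of ones (intercept) plus the continuous covariate.
    design = np.column_stack([np.ones_like(tumor_fraction), tumor_fraction])
    coef, *_ = np.linalg.lstsq(
        design, np.asarray(log2_expr, dtype=float), rcond=None
    )
    return design, coef


# Made-up samples: tumor fraction is 0 for normals, an estimated purity for tumors.
tumor_fraction = [0.0, 0.0, 0.0, 0.6, 0.8, 0.9]
# Made-up log2 expression for one gene, built to rise by 2 per unit tumor fraction.
log2_expr = [5.0 + 2.0 * f for f in tumor_fraction]

design, (intercept, slope) = fit_purity_model(tumor_fraction, log2_expr)
print(f"intercept={intercept:.3f}, log2 change per unit tumor fraction={slope:.3f}")
```

With a categorical purity estimate, the single numeric column would instead be replaced by indicator columns for the factor levels.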

        FYI, here's one email thread about the subject, which is convenient since it makes explicit what logFC would mean in such situations.
        Last edited by dpryan; 08-20-2013, 06:22 AM. Reason: Fix some wording to better fit the "real" terms for things


        • jparsons
          Member
          • Feb 2012
          • 62

          #5
          There are also a few papers/programs out there that attempt to make numerical estimates of the purity of a 'mixture' sample. I can't speak to their efficacy, as the only sample I'm interested in is a ternary system and the programs are quite limited in scope. However, the general method is sound.

          PMID: 23737925
          Last edited by jparsons; 08-20-2013, 01:55 PM. Reason: wrong paper in second ref


          • ymc
            Senior Member
            • Mar 2010
            • 496

            #6
            Can you adjust the tumor hit counts by this method?

            Let c be the fraction of tumor sample contaminated by normal cells (probably determined experimentally or by other means)

            true tumor count = (tumor_count - c*normal_count) / (1-c)

            Will this work?
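
Written out as code, the proposed adjustment looks like the sketch below. It assumes the observed tumor count is a simple linear mix, observed = (1 - c) * true_tumor + c * normal, with both counts already on the same scale; whether that assumption holds for RNA-seq is exactly the open question. The function name `adjust_tumor_count` and all numbers are hypothetical.

```python
def adjust_tumor_count(tumor_count, normal_count, c):
    """Apply: true tumor count = (tumor_count - c * normal_count) / (1 - c).

    c is the assumed fraction of the tumor sample made up of normal cells.
    """
    if not 0.0 <= c < 1.0:
        raise ValueError("c must be in [0, 1)")
    return (tumor_count - c * normal_count) / (1.0 - c)


# If the linear mix holds exactly, the formula recovers the true value:
# observed = 0.75 * 200 + 0.25 * 100 = 175
print(adjust_tumor_count(175, 100, 0.25))  # 200.0

# Nothing stops the result from going negative when the observed tumor
# count is smaller than c * normal_count:
print(adjust_tumor_count(10, 100, 0.25))  # -20.0
```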


            • dpryan
              Devon Ryan
              • Jul 2011
              • 3478

              #7
              No, because the relative contribution from the two sources to the measured count will vary by gene. Some genes may be more tumor-specific while others more normal-specific. Because of how RNA-seq works, separating these two sources is actually quite difficult.
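
A toy calculation of how the tumor share of a measured count can differ between genes at the same contamination level. This is only a sketch under an assumed linear mixing model; `tumor_share` and the per-gene expression values are hypothetical.

```python
def tumor_share(tumor_expr, normal_expr, c):
    """Fraction of the measured signal that comes from tumor cells.

    Assumes measured = (1 - c) * tumor_expr + c * normal_expr, where c is
    the fraction of normal cells in the tumor sample.
    """
    from_tumor = (1.0 - c) * tumor_expr
    total = from_tumor + c * normal_expr
    if total == 0:
        return float("nan")  # gene not expressed in either source
    return from_tumor / total


c = 0.3
genes = {
    "tumor_specific": (1000.0, 0.0),
    "normal_specific": (0.0, 1000.0),
    "unchanged": (500.0, 500.0),
}
for name, (t, n) in genes.items():
    print(name, round(tumor_share(t, n, c), 3))
```

At the same c, the tumor-derived share runs from all of the signal to none of it depending on the gene.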


              • Dario1984
                Senior Member
                • Jun 2011
                • 166

                #8
                Originally posted by dpryan
                No, because the relative contribution from the two sources to the measured count will vary by gene.
                Yes, many genes would not be differentially expressed between the healthy sample and the cancer sample. If you applied a scaling factor, you would be altering their fold changes away from 1 and introducing new problems. I have not seen any journal articles account for this problem in differential expression analysis, and I haven't even seen anyone do a spike-in study with healthy cells and a varying proportion of a cancer cell line to convincingly demonstrate that methods such as ESTIMATE work well. I would also be interested to know what kind of purity percentages these estimation methods give on a single-cell RNA-seq dataset.
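
To put a number on the scaling-factor problem, here is a small sketch. It assumes "scaling factor" means dividing every tumor count by (1 - c), which is one possible reading; `scaled_fold_change` and the counts are hypothetical.

```python
def scaled_fold_change(tumor_count, normal_count, c):
    """Tumor/normal fold change after dividing the tumor count by (1 - c)."""
    if not 0.0 <= c < 1.0:
        raise ValueError("c must be in [0, 1)")
    return (tumor_count / (1.0 - c)) / normal_count


# A gene that is not differentially expressed: 100 counts in both samples.
before = 100 / 100
after = scaled_fold_change(100, 100, 0.3)
print(f"fold change before={before:.2f}, after uniform scaling={after:.2f}")
```

The unchanged gene ends up with a fold change of 1 / (1 - c), so every non-DE gene would look upregulated in the tumor.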

