  • tumor purity and edgeR

    I am currently using a tophat2-htseq-edgeR pipeline for my tumor-normal pair analysis. I basically followed the example in the edgeR manual.

    Nowhere in that example was I asked to supply an estimate of tumor purity/contamination. In theory, tumor purity should affect the measured gene expression level in the tumor cells. So is it wrong that edgeR doesn't take that into account? Or is this effect somehow accounted for through the estimation of the BCV?

    Thanks.

  • #2
    edgeR isn't only for looking at tumor-normal pairs, so it doesn't make any assumptions about the type of experiment you're doing. If you want to account for tumor purity, you'll need to put that in your experimental model.



    • #3
      Thank you for your reply. What do you mean by "put that in your experimental model"? How would I do that?



      • #4
        edgeR and the other standard tools take a dataframe that describes the various experimental manipulations or confounders that you want included in a model fit. In the typical examples, the components of this dataframe are factors (genotype, treatment, etc.), but they don't have to be; you can also use continuous explanatory variables here. I believe there have been a few discussions on the Bioconductor email list between people needing to account for age or other continuous data in their models, so you might have a read through those. This assumes that you have some numeric estimate of purity, of course. If you just have a categorical estimate (low, medium, high, etc.), then you could also just use that as a factor.
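As a rough illustration of what "a continuous explanatory variable in the model" could look like, here is a minimal sketch in plain Python that builds a design matrix with one column per patient (for pairing) and one numeric purity column. Everything in it is hypothetical: the sample names, the purity values, the `build_design` helper, and in particular the choice to code normal samples as purity 0. In R, the analogous matrix would typically come from something like `model.matrix(~ patient + purity)`; this is only meant to show the shape of the input, not a recommended model.

```python
# Hypothetical sample sheet: names, patients, and purity values are made up.
# Assumption: normal samples are coded as purity 0.0 (no tumor cells).
samples = [
    {"name": "P1_normal", "patient": "P1", "purity": 0.0},
    {"name": "P1_tumor",  "patient": "P1", "purity": 0.8},
    {"name": "P2_normal", "patient": "P2", "purity": 0.0},
    {"name": "P2_tumor",  "patient": "P2", "purity": 0.6},
    {"name": "P3_normal", "patient": "P3", "purity": 0.0},
    {"name": "P3_tumor",  "patient": "P3", "purity": 0.9},
]


def build_design(samples):
    """Return (column_names, rows) for an intercept + patient + purity design.

    The first patient (alphabetically) is the baseline and gets no column,
    which mirrors treatment-contrast coding of a factor.
    """
    patients = sorted({s["patient"] for s in samples})
    others = patients[1:]  # all patients except the baseline
    columns = ["intercept"] + [f"patient_{p}" for p in others] + ["purity"]
    rows = []
    for s in samples:
        row = [1.0]  # intercept
        row += [1.0 if s["patient"] == p else 0.0 for p in others]
        row.append(float(s["purity"]))  # continuous covariate, not a factor
        rows.append(row)
    return columns, rows


columns, design = build_design(samples)
print(columns)
for s, row in zip(samples, design):
    print(s["name"], row)
```

With a log-link model such as the one edgeR fits, the coefficient on a numeric column like `purity` would be read as a log fold change per unit of that variable, rather than as a tumor-versus-normal contrast.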

        FYI, here's one email thread about the subject, which is convenient since it makes explicit what logFC would mean in such situations.
        Last edited by dpryan; 08-20-2013, 06:22 AM. Reason: Fix some wording to better fit the "real" terms for things



        • #5
          There are also a few papers/programs out there that attempt to make numerical estimates of the purity of a 'mixture' sample. I can't speak to the efficacy of them, as the only sample I'm interested in is a ternary system and the programs are quite limited in scope. However, the general method is sound.

          PMID: 23737925
          Last edited by jparsons; 08-20-2013, 01:55 PM. Reason: wrong paper in second ref



          • #6
            Can you adjust the tumor hit counts by this method?

            Let c be the fraction of the tumor sample that consists of contaminating normal cells (probably determined experimentally or by other means).

            true tumor count = (tumor_count - c*normal_count) / (1-c)

            Will this work?
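To make the proposed adjustment concrete, here is a small sketch that simply evaluates the formula above for a few made-up counts. The function name `corrected_tumor_count` and all numbers are hypothetical, and the sketch assumes the tumor and normal counts are already on a comparable scale. It only shows the arithmetic; it does not show that the adjustment is statistically sound.

```python
def corrected_tumor_count(tumor_count, normal_count, c):
    """Evaluate (tumor_count - c * normal_count) / (1 - c) for one gene.

    c is the assumed fraction of the tumor sample that is normal cells.
    """
    if not 0.0 <= c < 1.0:
        raise ValueError("c must satisfy 0 <= c < 1")
    return (tumor_count - c * normal_count) / (1.0 - c)


c = 0.25  # made-up contamination fraction

# Gene higher in the tumor sample: (400 - 25) / 0.75
up = corrected_tumor_count(400, 100, c)
print(up)  # 500.0

# Gene with equal counts in both samples: (100 - 25) / 0.75
same = corrected_tumor_count(100, 100, c)
print(same)  # 100.0

# Gene much lower in the tumor sample: (10 - 25) / 0.75
down = corrected_tumor_count(10, 100, c)
print(down)  # -20.0, a negative "count"
```

Note that whenever `tumor_count` is smaller than `c * normal_count`, the formula returns a negative value, which cannot be used as a count.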



            • #7
              No, because the relative contribution from the two sources to the measured count will vary by gene. Some genes may be more tumor-specific while others more normal-specific. Because of how RNAseq works, separating these two sources is actually quite difficult.



              • #8
                Originally posted by dpryan:
                "No, because the relative contribution from the two sources to the measured count will vary by gene."

                Yes, many genes would not be differentially expressed between the healthy sample and the cancer sample. If you applied a scaling factor, you would be altering their fold changes away from 1 and introducing new problems. I have not seen any journal articles account for this problem in differential expression analysis, and I haven't even seen anyone do a spike-in study with healthy cells and a varying proportion of a cancer cell line to convincingly demonstrate that methods such as ESTIMATE work well. I would also be interested to know what kind of percentages purity estimation methods give on a single-cell RNA-seq dataset.
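The point about fold changes can be checked with a few lines of arithmetic. The sketch below reads "scaling factor" as a single multiplier, 1 / (1 - c), applied to every gene's tumor count; that reading, the variable names, and the counts are all assumptions made for illustration.

```python
def fold_change(tumor_count, normal_count):
    """Plain ratio of tumor to normal counts for one gene."""
    return tumor_count / normal_count


c = 0.25                 # made-up fraction of normal cells in the tumor sample
scale = 1.0 / (1.0 - c)  # one global multiplier for all tumor counts

# A gene that is NOT differentially expressed: same count in both samples.
normal_count = 200
tumor_count = 200

before = fold_change(tumor_count, normal_count)
after = fold_change(tumor_count * scale, normal_count)

print(before)  # 1.0
print(after)   # about 1.33, no longer 1
```

Under this reading, a gene with identical expression in both samples ends up with an apparent fold change of 1 / (1 - c) purely because of the scaling.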
