  • tumor purity and edgeR

    I am currently using a TopHat2-HTSeq-edgeR pipeline for my tumor-normal pair analysis. I basically followed the example in the edgeR manual.

    Nowhere in that example was I asked to supply an estimate of tumor purity/contamination. In theory, tumor purity should affect the expression level estimated for the tumor cells. So is it wrong that edgeR doesn't take it into account? Or is this effect somehow absorbed by the estimation of the biological coefficient of variation (BCV)?

    Thanks.

  • #2
    edgeR isn't only for looking at tumor-normal pairs, so it doesn't make any assumptions about the type of experiment you're doing. If you want to account for tumor purity, you'll need to put that in your experimental model.

    • #3
      Thank you for your reply. What do you mean by "put that in your experimental model"? How would I do that?

      • #4
        edgeR and the other standard tools take a data frame that describes the experimental manipulations or confounders you want included in the model fit. In the typical examples, the components of this data frame are factors (genotype, treatment, etc.), but they don't have to be; you can also use continuous explanatory variables. I believe there have been a few discussions on the Bioconductor mailing list among people needing to account for age or other continuous data in their models, so you might have a read through those. This assumes that you have some numeric estimate of purity, of course. If you only have a categorical estimate (low, medium, high, etc.), you could use that as a factor instead.
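To make the idea of a continuous covariate concrete, here is a minimal sketch in Python (not edgeR code) of what a design matrix for a paired design plus a purity column could look like. The sample sheet, the purity values, and the helper `build_design` are all hypothetical. Coding the normal samples as a tumor fraction of 0.0 is an assumption made for illustration, not something prescribed in this thread.

```python
import numpy as np

# Hypothetical sample sheet: three patients, each with a normal and a tumor sample.
patients = ["P1", "P1", "P2", "P2", "P3", "P3"]
# Assumed tumor-cell fraction per sample (0.0 for the normal samples).
tumor_fraction = np.array([0.0, 0.8, 0.0, 0.6, 0.0, 0.9])


def build_design(patients, tumor_fraction):
    """Return (design matrix, column names) for a model of the form
    intercept + patient (factor) + tumor_fraction (continuous)."""
    levels = sorted(set(patients))
    columns = [np.ones(len(patients))]  # intercept; baseline is the first patient
    names = ["(Intercept)"]
    for level in levels[1:]:
        # One indicator column per non-baseline patient (the pairing term).
        columns.append(np.array([1.0 if p == level else 0.0 for p in patients]))
        names.append(f"patient{level}")
    # The continuous covariate enters the matrix as-is, not as indicator columns.
    columns.append(np.asarray(tumor_fraction, dtype=float))
    names.append("tumor_fraction")
    return np.column_stack(columns), names


design, names = build_design(patients, tumor_fraction)
print(names)
print(design)
```

With a log-link model, the coefficient on a continuous column such as `tumor_fraction` is the log fold change per unit increase in that covariate, here going from a fraction of 0 to a fraction of 1. Whether a log-linear dependence on purity is a reasonable approximation for a given dataset is a separate question.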

        FYI, here's one email thread about the subject, which is convenient since it makes explicit what logFC would mean in such situations.
        Last edited by dpryan; 08-20-2013, 06:22 AM. Reason: Fix some wording to better fit the "real" terms for things

        • #5
          There are also a few papers/programs out there that attempt to estimate the purity of a 'mixture' sample numerically. I can't speak to their efficacy, as the only sample I'm interested in is a ternary system and the programs are quite limited in scope. However, the general method is sound.

          PMID: 23737925
          Last edited by jparsons; 08-20-2013, 01:55 PM. Reason: wrong paper in second ref

          • #6
            Could you adjust the tumor read counts with this method?

            Let c be the fraction of the tumor sample made up of contaminating normal cells (determined experimentally or by other means).

            true tumor count = (tumor_count - c*normal_count) / (1-c)

            Will this work?
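A minimal sketch of the proposed correction, for readers who want to try the arithmetic. The function name `corrected_tumor_count` and the example numbers are hypothetical. The sketch assumes both counts are already on the same normalized scale and that the contaminating normal cells express each gene exactly as the matched normal sample does. Both are strong assumptions.

```python
def corrected_tumor_count(tumor_count, normal_count, c):
    """Apply: true tumor count = (tumor_count - c*normal_count) / (1 - c).

    c is the assumed fraction of the tumor sample made up of normal cells.
    """
    if not 0.0 <= c < 1.0:
        raise ValueError("c must satisfy 0 <= c < 1")
    return (tumor_count - c * normal_count) / (1.0 - c)


# Idealized case: the measured tumor count is an exact linear mixture of
# pure tumor (1000) and normal (100) expression with c = 0.25.
measured = 0.75 * 1000 + 0.25 * 100
recovered = corrected_tumor_count(measured, 100, 0.25)
print(recovered)

# A gene that is low in the tumor sample and high in the normal sample:
# the subtraction can go below zero, which is not a valid count.
negative = corrected_tumor_count(10, 100, 0.25)
print(negative)
```

The first call recovers the pure-tumor value only because the input was constructed as an exact mixture. The second shows that with real, noisy counts the subtraction can produce negative values.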

            • #7
              No, because the relative contribution of the two sources to the measured count will vary by gene. Some genes may be more tumor-specific while others are more normal-specific. Because of how RNA-seq works, separating these two sources is actually quite difficult.
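The point about per-gene contributions can be illustrated with a small sketch under a simple linear mixture assumption. The function `normal_share` and the expression values are hypothetical and only show how the share of signal coming from normal cells can differ between genes at the same contamination level.

```python
def normal_share(tumor_expr, normal_expr, c):
    """Fraction of a gene's measured signal that comes from normal cells,
    assuming the sample is a linear mixture with normal-cell fraction c."""
    from_normal = c * normal_expr
    from_tumor = (1.0 - c) * tumor_expr
    return from_normal / (from_normal + from_tumor)


c = 0.3  # assumed normal-cell fraction, identical for every gene

# (pure-tumor expression, pure-normal expression) per hypothetical gene
genes = {
    "tumor_specific": (1000.0, 10.0),
    "normal_specific": (10.0, 1000.0),
    "unchanged": (100.0, 100.0),
}

shares = {name: normal_share(t, n, c) for name, (t, n) in genes.items()}
for name, share in shares.items():
    print(f"{name}: {share:.3f} of the measured signal comes from normal cells")
```

Only the unchanged gene has a normal-cell share equal to c. For the other two genes the share is far from c, and in a real experiment the pure-tumor and pure-normal levels used here are exactly the unknowns.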

              • #8
                Originally posted by dpryan:
                "No, because the relative contribution from the two sources to the measured count will vary by gene."

                Yes, many genes would not be differentially expressed between the healthy sample and the cancer sample. If you applied a scaling factor, you would be altering their fold changes away from 1 and introducing new problems. I have not seen any journal articles account for this problem in differential expression analysis. I haven't even seen anyone do a spike-in study with healthy cells and a varying proportion of a cancer cell line to convincingly demonstrate that methods such as ESTIMATE work well. I would also be interested to know what percentages purity estimation methods give on a single-cell RNA-seq dataset.
