  • Help understanding my DESeq data

    Hello everybody!
    I hope somone can help me... I am really bad in this bioinformatics stuff.
    I've made an RNAseq experiment, among other goals, to get target genes regulated by a chromatin protein complex:
    - 1 control, 3 different mutants, 3 biological replicates.
    - Illumina sequencing > reads mapped with TopHat (by a statistician)
    - Differential gene expression analysis with DESeq and edgeR (by a bioinformatician). I got 2 lists (DESeq + edgeR) of differentially expressed genes for each mutant compared to the control.
    - Because of the small number of significantly deregulated genes (according to adjusted p value or FDR), I've set a rather loose threshold for the p value (10%) and the log2 fold change (log2 FC < -1.5 or log2 FC > 1.5) in the two lists. Then I merged them, ending up with lists of about 100-400 deregulated genes per list.
    - I am now validating some interesting targets by qPCR. Further biological experiments are planned.

    My questions:
    1. DESeq data: Is it normal that the "base mean" and the "base mean A" (corresponding to the control sample, which was the same for the 3 treatments) are not the same in the 3 lists (for the 3 mutants) of differentially expressed genes?
    2. qPCR validation: What is the relationship between the normalized expression ratio of the "2^-deltadeltaCt" method and the fold change of the DESeq/edgeR analysis? Should there be a quantitative correlation?

    Thank you for any help!
    Last edited by Axolotl; 11-15-2013, 01:40 AM.
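
On question 2, a minimal sketch of how the two quantities relate, assuming the standard 2^-deltadeltaCt calculation with roughly 100% primer efficiency. Under that assumption the log2 of the qPCR ratio is simply -deltadeltaCt, which is on the same scale as the log2 fold change reported by DESeq/edgeR. The function name and the Ct values below are made up for illustration; they are not from the experiment described above.

```python
def ddct_log2_fold_change(ct_target_mut, ct_ref_mut, ct_target_ctrl, ct_ref_ctrl):
    """Return the log2 fold change (mutant vs. control) from Ct values.

    Assumes ~100% amplification efficiency, i.e. one cycle = a factor of 2.
    """
    # Normalize the target gene to the reference gene within each condition.
    dct_mut = ct_target_mut - ct_ref_mut
    dct_ctrl = ct_target_ctrl - ct_ref_ctrl
    # Difference of the normalized values between conditions.
    ddct = dct_mut - dct_ctrl
    # ratio = 2 ** (-ddct), hence log2(ratio) = -ddct.
    return -ddct


# Hypothetical Ct values: the target comes up 2 cycles earlier in the mutant.
log2_fc = ddct_log2_fold_change(
    ct_target_mut=24.0, ct_ref_mut=18.0,
    ct_target_ctrl=26.0, ct_ref_ctrl=18.0,
)
ratio = 2 ** log2_fc
print(f"log2 fold change = {log2_fc}, expression ratio = {ratio}")
```

So a gene with a DESeq log2 fold change near 2 would be expected to show a qPCR ratio near 4 if both measurements agree, although with low counts or new samples the agreement is often only approximate.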

  • #2
    Are you doing the qPCR validation on the same samples? If so, this is pointless -- you will just find what you already know. If not, you will soon learn why using raw instead of adjusted p values is not permitted.


    • #3
      No, I'm validating on new samples, treated the same way though... So you are saying I could have picked the putative target genes blindly, by chance? Oh God...!
      What about the base mean values in my control sample? I can't figure out why these should vary...
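
One possible explanation for the varying control base means, offered as an assumption since the thread does not say how the analysis was run: if each mutant was compared to the control in a separate DESeq run, the size factors are estimated from the samples in that run only, so the normalized control counts (and hence "base mean A") can differ between runs. The sketch below is a plain-Python version of median-of-ratios normalization on toy counts; the sample names and numbers are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a dict: sample name -> list of gene counts."""
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    # Log geometric mean per gene, skipping genes with a zero in any sample.
    log_geo_means = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][g]) - lg
            for g, lg in enumerate(log_geo_means)
            if lg is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Toy counts (hypothetical): mutant_a was sequenced deeper, mutant_b shallower.
control = [100, 200, 300, 400, 500]
mutant_a = [2 * c for c in control]
mutant_b = [c // 2 for c in control]

sf_a = size_factors({"control": control, "mutant_a": mutant_a})
sf_b = size_factors({"control": control, "mutant_b": mutant_b})

# The same raw control counts give different normalized values in each run.
base_mean_a_run1 = control[0] / sf_a["control"]
base_mean_a_run2 = control[0] / sf_b["control"]
print(base_mean_a_run1, base_mean_a_run2)
```

The plain "base mean" is expected to differ in any case, because it averages the normalized counts over all samples in the comparison, including the mutant.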


      • #4
        You do know what a p value is, don't you? You wouldn't decide that a 10% threshold on a p value is a good idea without knowing what a p value is, or would you?

        Ok, seriously: Imagine you perform an RNA-Seq experiment, comparing samples treated with some drug with control samples, but some evil colleague has swapped your drug with water. How many of your organism's 20,000 genes will nevertheless have a p value below 10%?

        Answer: 10%, i.e., 2000 genes.

        This is the definition of "p value": even if the treatment had no effect, about x% of the genes will have a p value below x%.

        And this is why you must never use raw p values in high-throughput experiments.

        (And sorry that I sound annoyed but I explain this basic fact about once every week, and I cannot fathom why so many people don't know this, because -- it's somewhat important to know this, isn't it?)
        Last edited by Simon Anders; 11-15-2013, 06:18 AM.
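
The swapped-drug thought experiment in #4 can be checked with a small simulation. This is a sketch under simplifying assumptions, not how DESeq or edgeR compute p values: each "gene" gets 3 control and 3 treated values drawn from the same normal distribution (so there is no true effect), and a two-sided z-test with known variance gives the p value. The function name and the seed are arbitrary.

```python
import math
import random


def null_p_values(n_genes=20000, n_reps=3, seed=1):
    """Simulate p values for genes where the treatment has no effect at all."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_genes):
        # Both groups come from the same distribution: the "drug" is water.
        control = [rng.gauss(0.0, 1.0) for _ in range(n_reps)]
        treated = [rng.gauss(0.0, 1.0) for _ in range(n_reps)]
        diff = sum(treated) / n_reps - sum(control) / n_reps
        # Difference of two means of n_reps unit-variance values has variance 2/n_reps.
        z = diff / math.sqrt(2.0 / n_reps)
        # Two-sided p value of a standard normal test statistic.
        p_values.append(math.erfc(abs(z) / math.sqrt(2.0)))
    return p_values


p_values = null_p_values()
n_hits = sum(p < 0.10 for p in p_values)
print(f"{n_hits} of {len(p_values)} genes have p < 0.10 although nothing changed")
```

The count comes out close to 2000, which is the point of the post: a 10% cut on raw p values yields roughly that many hits even when there is nothing to find.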


        • #5
          I am very sorry to annoy you. Yes, I knew what the p value is, although I don't know how the adjusted one is estimated. But shouldn't the adjusted p value be somehow dependent on the raw value? And what about the fact that these genes appear in both DESeq and edgeR? Is this likely? And why is "my" bioinformatician not screaming?
          Again, I'm really sorry about my "stupid" questions, I already realized that my intuition for probabilities systematically goes against the laws.
          Thank you for being patient.
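
On how the adjusted value is estimated: to my knowledge both DESeq (padj column) and edgeR (FDR column) use the Benjamini-Hochberg procedure by default, so the adjusted p value does depend on the raw one, but also on its rank among all tested genes. A minimal plain-Python sketch follows; the function name and the four example p values are invented for illustration.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    # Indices of the p values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
adj = bh_adjust(raw)
print(adj)
```

With 20,000 genes instead of 4, the factor m / rank becomes large for all but the top-ranked genes, which is why a raw p value of a few percent usually ends up with an adjusted value far above 10%.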
