
  • How to deal with normalized RNA seq data?

    Hi everyone!
    I have TCGA prostate cancer data that has already been normalized. Now I want to analyze it and find differentially expressed genes.
    My question is: what methods can I use other than limma and t.test?
    Thank you!

  • #2
    Starting with normalized data, you can use LIMMA, T-Tests or ANOVA. Of course, without biological replicates, you cannot do any statistical analysis of differential gene expression at all, so I'm assuming you have at least 2 biological replicates for each condition (and really, 3 should be the bare minimum acceptable for any reliable stats).
    Michael Black, Ph.D.
    ScitoVation LLC. RTP, N.C.



    • #3
      Thank you for your reply!
      The data I have consists of 180 samples: 141 tumor samples and 39 normal samples. I have analyzed the data with LIMMA and a t-test. With LIMMA, when I chose 3 tumor and 3 normal samples, I found about 80 differentially expressed genes (diffgenes); when I chose 10 tumor and 10 normal samples, I got about 2,000 diffgenes; finally, when I used all the tumor and normal samples, I got, inexplicably, 10,000 diffgenes.
      I don't know why this happens.
      Should I do some clustering or PCA before the analysis? (Actually, I have tried these methods, but clustering and PCA didn't work well: I couldn't cluster all 180 samples perfectly.) Is there any other method for this?
      I'm a junior, thank you for your help!



      • #4
        Any time that you increase the number of replicates, you will likely detect more significant results. That is the whole point of replication. The more replicates, the more precise is your estimate of the population mean and variance, and hence the smaller the change you can now detect as significant (i.e. unlikely to be observed by chance). Statistical significance is all about your ability to estimate population mean and variance, and the more replicates you have, the better your estimates of those parameters.

        This is the very reason why biological replicates are so important in detecting differential gene expression. The more replicates, the greater your ability to detect ever more subtle changes in gene expression.

        Unless you have some rational reason to reject a sample, you should always use all of your biological replicates when testing for differential gene expression - you have the most power to discriminate differences then.

        When selecting differentially expressed genes though, you should NOT rely purely on statistical significance. Many published studies have clearly shown that you will get your most reliable results if you simultaneously use both a statistical threshold (e.g. FDR<0.05 is common), AND a magnitude threshold (e.g. absolute value of estimated fold change >1.5, or >2.0 are common cutoffs).

        Genes selected by simultaneously applying a statistical, AND a magnitude threshold are the most likely to validate via an independent method such as RT-qPCR.
        Michael Black, Ph.D.
        ScitoVation LLC. RTP, N.C.
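The combined filter described in #4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names, p-values and log2 fold changes are made up, the function names are hypothetical, and Benjamini-Hochberg is assumed as the FDR procedure (the post does not say which one to use). In practice the p-values would come from limma, a t-test, or a similar method.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_genes(genes, pvalues, log2_fold_changes,
                 fdr_cutoff=0.05, fc_cutoff=1.5):
    """Keep genes that pass BOTH the FDR and the fold-change threshold."""
    fdr = benjamini_hochberg(pvalues)
    min_abs_log2fc = math.log2(fc_cutoff)  # fold change 1.5 on the log2 scale
    return [
        gene
        for gene, q, lfc in zip(genes, fdr, log2_fold_changes)
        if q < fdr_cutoff and abs(lfc) > min_abs_log2fc
    ]


# Made-up example values, for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
pvalues = [0.0001, 0.0004, 0.03, 0.60]
log2_fold_changes = [2.1, 0.2, -1.3, 3.0]

selected = select_genes(genes, pvalues, log2_fold_changes)
print(selected)
```

In this toy input, GENE_B is highly significant but has a tiny fold change, and GENE_D has a large fold change but is not significant, so neither passes the combined filter.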



        • #5
          The more samples you use, the more power you have. That's a completely expected result. With enough samples you'd likely find almost everything to be at least slightly different between the two groups.

          "PCA didn't work well" is meaningless. Perhaps the samples clustered by group or perhaps not, but in neither case can one say that the PCA itself didn't work well.
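The relationship between sample size and power discussed in #4 and #5 can be seen in a small simulation. The sketch below is not the limma procedure: it assumes a single gene with normally distributed expression, a known standard deviation of 1, a true shift of 0.5 between groups, and a simple two-sided z-test. All of these numbers and the function name are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean


def simulated_power(n_per_group, effect=0.5, sd=1.0, alpha=0.05,
                    n_sim=2000, seed=0):
    """Fraction of simulated experiments in which a true difference is detected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    # Standard error of the difference in means, assuming sd is known.
    se = sd * (2 / n_per_group) ** 0.5
    hits = 0
    for _ in range(n_sim):
        normal = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        tumor = [rng.gauss(effect, sd) for _ in range(n_per_group)]
        z = (mean(tumor) - mean(normal)) / se
        if abs(z) > z_crit:
            hits += 1
    return hits / n_sim


# Group sizes loosely mirroring the thread: 3, 10, and 39 per group.
powers = {n: simulated_power(n) for n in (3, 10, 39)}
for n, power in powers.items():
    print(f"n per group = {n:2d}: detected in {power:.0%} of simulations")
```

The same modest shift is detected far more often as the number of samples per group grows, which is the pattern behind the jump from about 80 to about 10,000 significant genes reported in #3.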



          • #6
            Thanks to all! I think I now understand the importance of biological replicates, and I will use a magnitude threshold, or some biological methods, to find the genes I am interested in.
            On the other hand, I want to get the raw counts for prostate cancer from TCGA and then use DESeq and edgeR to find differentially expressed genes, but I don't know whether I can get the raw data from TCGA.
            I have a new question: my 180 samples from TCGA come from different batches. Does that have an impact on the analysis?



            • #7
              Yes, batch effects can be quite large on occasion (or quite small, you never know). Have a look at the sva package on Bioconductor.
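Before applying any batch correction, it may be worth checking how batches and conditions overlap, since a batch that contains only tumor or only normal samples gives no within-batch comparison between the two. The sketch below is a hypothetical example in plain Python: the sample sheet, batch labels and function names are invented and do not reflect the actual TCGA annotation.

```python
from collections import Counter

# Hypothetical sample sheet: (sample_id, batch, condition).
samples = [
    ("S1", "batch1", "tumor"),
    ("S2", "batch1", "normal"),
    ("S3", "batch2", "tumor"),
    ("S4", "batch2", "tumor"),
    ("S5", "batch3", "normal"),
]


def batch_by_condition(samples):
    """Count samples for every (batch, condition) combination."""
    return Counter((batch, condition) for _, batch, condition in samples)


def single_condition_batches(samples):
    """Return batches that contain samples from only one condition."""
    conditions_per_batch = {}
    for _, batch, condition in samples:
        conditions_per_batch.setdefault(batch, set()).add(condition)
    return sorted(
        batch
        for batch, conditions in conditions_per_batch.items()
        if len(conditions) == 1
    )


table = batch_by_condition(samples)
confounded = single_condition_batches(samples)

for (batch, condition), count in sorted(table.items()):
    print(batch, condition, count)
print("Batches with a single condition:", confounded)
```

If many batches show up in the second list, batch and condition are at least partly confounded, and any correction method should be interpreted with that in mind.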
