
  • GO enrichment RNASEQ: nothing!

    Hello,

    I used the goseq package to do GO enrichment of RNA-seq data. Neither the up-regulated nor the down-regulated genes are enriched in any category when I apply the BH (FDR) correction. Is that a plausible outcome, or is it more likely that I made a mistake? If the latter, what should I look for?

    Thanks!

  • #2
    GO enrichment

    How did you choose the genes?
    Are you looking at the differentially expressed genes that show a significant increase and a significant decrease? Is this output from Cuffdiff? DESeq?

    How many genes are in your sublists that you are testing?


    • #3
      I used edgeR to determine the list of up-regulated and down-regulated genes. I have 3 samples, so in total I have 6 lists. The number of genes for each list is:

      388 edger_1_down.genelist
      503 edger_1_up.genelist
      316 edger_2_down.genelist
      274 edger_2_up.genelist
      249 edger_3_down.genelist
      187 edger_3_up.genelist

      In none of these cases is any category significant after FDR correction at the 0.05 level.
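Whether an empty result is plausible can be checked with a small base-R experiment. The sketch below uses invented numbers (one category with a raw p-value of 1e-4 among 5000 tested categories; both values are assumptions chosen for illustration and are not taken from this thread) to show how BH adjustment can leave nothing below 0.05 even when a raw p-value looks small.

```r
# Hypothetical raw p-values: one promising category among 5000 tested.
# All numbers here are made up for illustration.
n_categories <- 5000
raw_p <- c(1e-4, rep(0.5, n_categories - 1))

# Benjamini-Hochberg adjustment via base R's p.adjust().
adj_p <- p.adjust(raw_p, method = "BH")

# For the top-ranked category the adjusted value is about raw_p * n_categories.
min_adj <- min(adj_p)
n_significant <- sum(adj_p < 0.05)

print(min_adj)
print(n_significant)
```

With many categories and only a few hundred genes per list, an empty result after correction is therefore not necessarily a sign of a mistake.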


      • #4
        edgeR

        Two thoughts.

        How stringent are your parameters for edgeR?

        Why are you separating the up and down into separate lists?
        Here is an example of why you wouldn't want to separate them.
        Consider a metabolic pathway with multiple branch points that lead to different end products. To go farther along the pathway, some of the proteins at a branch point must be down-regulated while the other proteins at that branch point must be up-regulated.

        This is a simplified example but you should combine those lists for the GO enrichment analysis.

        The individual lists might still be useful for cis-regulatory element identification.
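Combining the lists could look something like the base-R sketch below. The gene IDs are hypothetical placeholders; in practice the up and down lists would come from the edgeR results and `all_genes` would be every gene that was assayed. The result is a named 0/1 vector, which, as far as I know, is the input form goseq expects.

```r
# Hypothetical gene IDs; real lists would come from the edgeR results.
all_genes  <- c("geneA", "geneB", "geneC", "geneD", "geneE", "geneF")
up_genes   <- c("geneA", "geneC")
down_genes <- c("geneE")

# Combine both directions into one set of differentially expressed genes.
de_genes <- union(up_genes, down_genes)

# Named 0/1 vector over every assayed gene: 1 = DE in either direction.
de_vector <- as.integer(all_genes %in% de_genes)
names(de_vector) <- all_genes

print(de_vector)
```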

        Best,

        Andrew


        • #5
          Thank you Andrew for the suggestion. I will try to combine the list.


          • #6
            Hello, my problem with goseq enrichment analysis is quite different.
            I ran the goseq analysis and obtained the following result:

            > head(GO_Seq_SM_0.001)
                   category over_represented_pvalue under_represented_pvalue
            631  GO:0005509            5.304893e-08                1.0000000
            1456 GO:0030001            1.439800e-07                1.0000000
            20   GO:0000151            4.262968e-04                0.9999608
            1318 GO:0016762            5.216655e-04                0.9999842
            70   GO:0003677            7.635085e-04                0.9997061
            775  GO:0006571            7.907911e-04                0.9999957


            Then I ran the enrichment step to find the significantly enriched GO categories, but the output just repeats the first category many times:

            >enriched.prova=GO_Seq_SM_0.001$category[p.adjust(GO_Seq_SM_0.001$over_represented_pvalue,method="BH")]
            > head(enriched.prova)
            [1] "GO:0005509" "GO:0005509" "GO:0005509" "GO:0005509" "GO:0005509" "GO:0005509"

            Can someone help me find the error?
            I tried on other datasets, but the result is the same.

            Thanks!


            • #7
              Hi Raffaella,
              try adding a comparison, such as '< 0.05', after the closing parenthesis of p.adjust().
              As written, the syntax is wrong: you need a vector of TRUE or FALSE values to subset the GO vector correctly.

              i.e., try:

              enriched.prova=GO_Seq_SM_0.001$category[p.adjust(GO_Seq_SM_0.001$over_represented_pvalue,method="BH") < 0.05]
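The reason the first category gets repeated can be seen in a small base-R sketch. The category names and adjusted p-values below are hypothetical and only illustrate the indexing behaviour: when a numeric vector is used as an index, R truncates each value toward zero, so adjusted p-values below 1 become index 0 (which selects nothing) and every adjusted p-value equal to 1 selects element 1.

```r
# Hypothetical categories and adjusted p-values, for illustration only.
categories <- c("GO:A", "GO:B", "GO:C", "GO:D")
adj_p <- c(0.001, 0.03, 1, 1)

# Numeric index: R truncates each value toward zero, so 0.001 and 0.03
# become 0 (dropped) and each 1 selects the first element.
wrong <- categories[adj_p]

# Logical index: keeps the categories whose adjusted p-value is below 0.05.
right <- categories[adj_p < 0.05]

print(wrong)
print(right)
```

Here `wrong` contains the first category twice, while `right` contains only the two categories that pass the 0.05 cutoff.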

              pbseq


              • #8
                Thanks, pbseq. It was a simple error in my printed copy of the manual: the last part of the command was missing.
                Now it works well.

                Just one more question. I have a non-native genome, and I tried to limit the results to a specific class of GO terms (such as "GO:CC", "GO:BP" and "GO:MF"), but I always get the full set of results, not just the GO terms belonging to the selected class.
                Is this function available with non-native data?

                Thanks for sharing your expertise.


                • #9
                  I have a follow-up question regarding the comment "Why are you separating the up and down into separate lists?".
                  Is there a way to see which GO terms are enriched in the up-regulated genes and which in the down-regulated genes without separating the lists? I have RNA-seq data from two very different environments, and I am specifically looking at genes that are used to grow in one environment (and not the other). I therefore want a list of GO terms that are enriched among the genes used to grow in that one environment, but I have only been able to get it by splitting the "up" and "down" regulated genes. Is there a problem with splitting the lists like this, other than what Andrew described? And if so, is there a way to keep the list together in goseq but have the output show which terms are enriched for the increased and decreased gene groups?

                  Thanks in advance for your response/feedback.


                  • #10
                    I'm wondering the same thing as vpp605.
                    I am separating the significant genes into up- and down-regulated as well, because I have two very different groups.
