
  • Which R package can do this

    Hi, I have a dataset listing gene names and the number of hits for each gene. I would like to find over-represented genes in R.

    I was wondering which R package can do this.

    Can anyone recommend some good R packages for analyzing and plotting metagenomics data (best for microbes).

    Thanks.

  • #2
    Bioconductor should do the trick http://www.bioconductor.org



    • #3
      Originally posted by SDPA_Pet
      Hi, I have a dataset listing gene names and the number of hits for each gene. I would like to find over-represented genes in R.

      I was wondering which R package can do this.
      I think what you want is to apply the hypergeometric test, which in R is implemented in the function phyper:

      Code:
      phyper(q, m, n, k, ...)
      
      x, q 	vector of quantiles representing the number of white balls drawn without replacement from an urn which contains both black and white balls.
      m 	the number of white balls in the urn.
      n 	the number of black balls in the urn.
      k 	the number of balls drawn from the urn.
      ...
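A minimal sketch of how the urn arguments might map onto hit counts, assuming the "white balls" are hits to one gene of interest and the "black balls" are hits to all other genes. All numbers and variable names here are hypothetical and chosen only for illustration; they are not taken from the poster's data.

```r
# Hypothetical counts, for illustration only.
total_hits     <- 10000                     # all hits in the background set
gene_in_ref    <- 200                       # hits to the gene of interest: white balls (m)
other_in_ref   <- total_hits - gene_in_ref  # hits to every other gene: black balls (n)
sample_hits    <- 500                       # hits in the sample of interest: balls drawn (k)
gene_in_sample <- 25                        # hits to the gene in that sample (q)

# Over-representation asks for P(X >= gene_in_sample).
# With lower.tail = FALSE, phyper returns P(X > q), so pass q - 1.
p_over <- phyper(gene_in_sample - 1, gene_in_ref, other_in_ref,
                 sample_hits, lower.tail = FALSE)
print(p_over)
```

Forgetting the `- 1` gives P(X > q) instead of P(X >= q), which is a common slip with the upper tail.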



      • #4
        Originally posted by JackieBadger
        Bioconductor should do the trick http://www.bioconductor.org
        Hi, I checked Bioconductor, which includes lots of packages. Can you tell me which one can help me find over-represented genes?

        Also, which package can help me draw a heat map?

        Thank you.



        • #5
          Originally posted by SDPA_Pet
          Hi, I checked Bioconductor, which includes lots of packages. Can you tell me which one can help me find over-represented genes?

          Also, which package can help me draw a heat map?

          Thank you.
          For all those people who find it more convenient to bother you with their question rather than to Google it for themselves.


          However, you're probably better off with MEV



          • #6
            Maybe if you explained more about what kind of data you have, you would get more helpful responses. "Metagenomics data" could be anything from a bunch of FASTQ files to a list of species.



            • #7
              Hi Simon,

              Sorry about the confusion. I have a table generated from metagenomic data. For each sample, I have two columns: one with the gene name and the other with the number of hits. In total, I have 10 samples. I would like to find out which genes are over-represented.

              That is it.



              • #8
                In that case, dariober's suggestion of the hypergeometric test is appropriate.



                • #9
                  Hi Blahah, I am a newbie. If I want to do a hypergeometric test, which package should I use? Can you give me the R package name? That's all I want to know.

                  Someone just told me to use Bioconductor, but it includes hundreds of packages.



                  • #10
                    @SDPA_Pet read @dariober's post above... he tells you the R function is phyper. You don't need a package - it's in the R base installation.



                    • #11
                      OK, Thanks.



                      • #12
                        BTW, at what functional level should I do the analysis? I can use the COG function level (the lowest) or COG categories (the highest).

                        If I use the lowest level, there will be thousands of functional genes.



                        • #13
                          Hi, I still don't know how to find over-represented genes with phyper (which someone recommended) or another R package. I attached a CSV file as an example. Can anyone write an R script for me using my dataset?

                          In my dataset, the first row contains my sample names. The first column is the COG category ID. The numbers are gene counts.

                          Thank you.
                          Attached Files
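A table laid out as described (sample names in the first row, COG category IDs in the first column, counts elsewhere) could be loaded roughly as follows. This is only a sketch: the attached file is not reproduced here, so the inline CSV text, the counts, and the sample names `Sample_B` and `Sample_C` are hypothetical stand-ins. With a real file, `read.csv("your_file.csv", ...)` would take the place of the `text =` argument.

```r
# Hypothetical stand-in for the attached CSV; the counts are made up.
csv_text <- "COG,OSP_8 100 Spring Plain,Sample_B,Sample_C
C,120,80,95
E,300,310,290
G,45,150,140
J,210,200,190"

# row.names = 1 uses the COG category column as row names;
# check.names = FALSE keeps sample names that contain spaces unchanged.
counts <- read.csv(text = csv_text, row.names = 1, check.names = FALSE)

print(counts)
print(colSums(counts))  # total hits per sample
```

Comparing the column totals is a quick check on whether the samples have similar sequencing depth, which matters for any later comparison of raw counts.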



                          • #14
                            Most people here seemed to have jumped to the conclusion that you want to do an enrichment test, and there, in fact, the hypergeometric test (also known as Fisher's exact test) is the customary thing to do, usually with the R function 'fisher.test', which internally calls 'phyper'.
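The link between the two functions can be checked directly. The sketch below, with hypothetical counts, builds the 2x2 table that corresponds to the urn model and compares the one-sided `fisher.test` p-value with the upper tail from `phyper`.

```r
# Hypothetical counts: m hits to the gene overall, n hits to other genes,
# k hits in the sample of interest, q of which hit the gene.
m <- 200; n <- 9800; k <- 500; q <- 25

# Upper tail P(X >= q) from the hypergeometric distribution.
p_hyper <- phyper(q - 1, m, n, k, lower.tail = FALSE)

# Equivalent 2x2 table: rows = in sample / not in sample,
# columns = gene of interest / other genes.
tab <- matrix(c(q, m - q, k - q, n - (k - q)), nrow = 2,
              dimnames = list(c("in_sample", "not_in_sample"),
                              c("gene", "other")))

# One-sided Fisher's exact test for enrichment (odds ratio > 1).
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(c(phyper = p_hyper, fisher = p_fisher))
print(isTRUE(all.equal(p_hyper, p_fisher)))
```

Note that the default `fisher.test` is two-sided, so the match with the `phyper` upper tail only holds when `alternative = "greater"` is set.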

                            I really don't see how this applies here. Please explain your setting again: By number of "hits" in your table, you mean the number of sequencing reads that mapped to this gene, right?

                            Now, what do you mean by "overrepresented"? Are you looking for genes which appear more often in one kind of sample than in the other? (E.g.: You have 5 samples from shallow water, 5 from deep water: Which genes differ in their abundance between these two types?)

                            What kind of samples are we talking about?



                            • #15
                              Hi Simon,

                              I am sorry I didn't explain it clearly.

                              By the number of "hits" in my table, I mean the number of sequencing reads that mapped to this gene (you are right).
                              In the file that I attached, I am interested in the 2nd column (OSP_8 100 Spring Plain). I want to compare the 2nd column to the 3rd and 4th columns.

                              "Over-represented": I want to find which genes in the sample OSP_8 100 Spring Plain are more abundant (or different) than in the other 2 samples.

                              Do you know how to write the code?
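One possible sketch of such a comparison, under stated assumptions: the counts below are hypothetical, the two other samples are simply pooled, and each COG category is tested with a one-sided Fisher's exact test against all remaining categories. This treats every read as an independent draw and ignores sample-to-sample variability, so with single samples and no replicates the p-values should be read as a rough screen, not as a definitive answer to the question.

```r
# Hypothetical counts; 'target' stands in for the OSP_8 100 Spring Plain column,
# 'other_1' and 'other_2' for the 3rd and 4th columns.
counts <- data.frame(
  target  = c(120, 300, 45, 210),
  other_1 = c(80, 310, 150, 200),
  other_2 = c(95, 290, 140, 190),
  row.names = c("C", "E", "G", "J")
)

target <- counts$target
others <- counts$other_1 + counts$other_2  # pool the two comparison samples

# For each category, build a 2x2 table:
#   column 1 = target sample, column 2 = pooled others
#   row 1 = this category,    row 2 = all remaining categories
p_values <- sapply(seq_along(target), function(i) {
  tab <- matrix(c(target[i], sum(target) - target[i],
                  others[i], sum(others) - others[i]), nrow = 2)
  fisher.test(tab, alternative = "greater")$p.value
})
names(p_values) <- rownames(counts)

# Adjust for testing many categories (Benjamini-Hochberg).
p_adjusted <- p.adjust(p_values, method = "BH")

print(data.frame(p_value = p_values, p_adjusted = p_adjusted))
```

Switching to `alternative = "two.sided"` would flag categories that are different in either direction instead of only the more abundant ones.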

