  • Correlation value in scatterplots using cummeRbund?

    Hi,
    I'm running cummeRbund and made a scatterplot of my data, but I'd like to see the correlation value for that plot. Is that possible? Is there a command that shows the R value of the correlation? Thanks!

  • #2
    You might have to use cor(), a base R function. However, I don't know exactly how to subset a CuffData object so that you get the proper input for cor().

    • #3
      That's my big problem right now. I'd like to know how I can use cor() with CuffData.

      • #4
        Originally posted by Marcos Lancia
        That's my big problem right now. I'd like to know how I can use cor() with CuffData.
        It's been a while since I've used cummeRbund, but I remember that you can use the fpkmMatrix() function (can't remember the exact usage) to get a matrix that plays nicely with the base R functions.

        • #5
          Thanks for the tip cmbetts. It looks like you want to execute
          Code:
          m <- fpkmMatrix(genes(cuffdiff_output))
          cor(m[, 1], m[, 2]) # or whatever columns you need
          Keep in mind that csScatter() might discard some of the data points. You can verify this by running
          Code:
          csScatter(genes(cuffdiff_output))
          # compare to
          plot(m[, 1], m[, 2])

          • #6
            I'm progressing, now a new problem

            Thanks for writing, everyone! I can finally see the correlation value, but I've run into a new problem. One of the r values is close to 0, yet the plot looks very good, as if r were close to 1. I'm pretty sure the data plotted are the same data I analyzed. Has anybody seen something similar? What did you do? Thanks!

            • #7
              Marcos,

              Did you ever sort this out? I would double check that you're supplying the right vectors to cor(). It might also have to do with how csScatter doesn't plot all of the data points. It's hard for me to think of anything else since I don't have the data to play with myself.

              • #8
                Hi, again

                For example, I want to compare TRAP_Sm_rep1 with TRAP_Sm_rep2. First I list the samples:

                > samples(cuff)

                sample_index sample_name sample_name parameter value
                1 1 SN16K_mock_rep2 <NA> <NA> <NA>
                2 2 SN16K_mock_rep1 <NA> <NA> <NA>
                3 3 SN16K_Sm_rep2 <NA> <NA> <NA>
                4 4 SN16K_Sm_rep1 <NA> <NA> <NA>
                5 5 TRAP_mock_rep1 <NA> <NA> <NA>
                6 6 TRAP_mock_rep2 <NA> <NA> <NA>
                7 7 TRAP_Sm_rep1 <NA> <NA> <NA>
                8 8 TRAP_Sm_rep2 <NA> <NA> <NA>

                Then I write:

                > cor(m[, 7], m[, 8])

                Is that right?

                • #9
                  I have a couple of questions for you. Are you following some sort of published analysis? Also, does each row of samples(cuff) have a column in m? In other words, what does
                  Code:
                  all(colnames(m) %in% samples(cuff)[, 2])
                  say when you enter it into your R session?

                  Edit: I realize that the code above isn't exactly what I meant to ask for. Can you paste what R prints for
                  Code:
                  colnames(m)
                  Last edited by blakeoft; 05-04-2015, 10:36 AM.
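The check being asked for here can be tried on a toy matrix. The following is only a sketch: `m` is a made-up stand-in for the matrix returned by fpkmMatrix() (the values are invented, only the column names come from this thread), and it uses base R alone. It shows one way to confirm that positions 7 and 8 really are the two samples of interest, and that selecting columns by name avoids relying on position at all.

```r
# Toy stand-in for the fpkmMatrix() result; the numbers are invented.
sample_names <- c("SN16K_mock_rep2", "SN16K_mock_rep1",
                  "SN16K_Sm_rep2", "SN16K_Sm_rep1",
                  "TRAP_mock_rep1", "TRAP_mock_rep2",
                  "TRAP_Sm_rep1", "TRAP_Sm_rep2")
m <- matrix(seq_len(5 * length(sample_names)), nrow = 5,
            dimnames = list(paste0("gene", 1:5), sample_names))

# Look up the column positions of the two samples by name.
idx <- match(c("TRAP_Sm_rep1", "TRAP_Sm_rep2"), colnames(m))
print(idx)

# Selecting by name returns the same columns as selecting by position.
by_name <- m[, c("TRAP_Sm_rep1", "TRAP_Sm_rep2")]
by_pos <- m[, idx]
print(identical(by_name, by_pos))
```

With a real matrix, `cor(m[, "TRAP_Sm_rep1"], m[, "TRAP_Sm_rep2"])` would then not depend on the column order.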

                  • #10
                    all(colnames(m) %in% samples(cuff)[, 2])
                    [1] TRUE

                    I'm working on my own; I'm not following any published analysis. Do you know of any? The R help isn't much use either.

                    • #11
                      colnames(m)

                      [1] "SN16K_mock_rep2" "SN16K_mock_rep1" "SN16K_Sm_rep2" "SN16K_Sm_rep1"
                      [5] "TRAP_mock_rep1" "TRAP_mock_rep2" "TRAP_Sm_rep1" "TRAP_Sm_rep2"

                      • #12
                        I saw the prefix "TRAP", and it made me think of Trapnell, as in Cole Trapnell. This is why I was curious if you were working with some kind of sample data set or something.

                        Your code should be correct though,
                        Code:
                        cor(m[, 7], m[, 8])
                        should give you what you want. Have you tried checking the correlation between both of these columns with all of the others?

                        You might also compare the following values:

                        Code:
                        sum(m[, 7] == 0)
                        sum(m[, 8] == 0)
                        sum(m[, 7] == 0 & m[, 8] == 0)
                        to see if you have many more zeros in one of the columns or if they don't share many of the same zeros.
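As a rough illustration of the zero counts suggested above, here is a small base R sketch. The vectors `rep1` and `rep2` are hypothetical and hold invented FPKM values; with real data they would be the two columns of `m`. The 2 x 2 table is an addition that summarizes the same three counts in one place.

```r
# Invented FPKM values for two replicates (hypothetical names).
rep1 <- c(0, 0, 0, 12.5, 3.1, 0, 88.0, 0.4)
rep2 <- c(0, 4.2, 0, 10.9, 0, 0, 91.3, 0.6)

zeros_rep1 <- sum(rep1 == 0)              # zero in the first replicate
zeros_rep2 <- sum(rep2 == 0)              # zero in the second replicate
zeros_both <- sum(rep1 == 0 & rep2 == 0)  # zero in both replicates

# Cross-tabulation: rows = zero in rep1, columns = zero in rep2.
zero_table <- table(rep1_zero = rep1 == 0, rep2_zero = rep2 == 0)
print(zero_table)
```

Large off-diagonal counts in the table would mean that many genes are zero in one replicate but expressed in the other.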

                        My suggestions are shots in the dark, so I apologize if nothing enlightening happens.

                        • #13
                          No, I'm not working with Cole Trapnell's datasets; these data are mine.
                          Question: How can I be sure that the data plotted are the same data used for the correlation? By building some kind of matrix, maybe?
                          Thanks so much for writing, you've been very helpful.

                          • #14
                            After reading in some of my old cufflinks data, I've realized that my second post in this discussion has some incorrect code. Please plot these two figures for comparison:

                            Code:
                            csScatter(genes(cuff), "TRAP_Sm_rep1", "TRAP_Sm_rep2")
                            # compare to
                            plot(log(m[, 7] + 1), log(m[, 8] + 1))
                            These plots should look pretty similar. You should be able to tell that you're passing the right vectors into cor().
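The log transform in the plot() call above also suggests one possible reason why cor() on raw FPKM values can be close to 0 while a log-scale scatterplot looks tight: Pearson correlation on the raw scale can be dominated by a single extreme value. The sketch below uses invented numbers, not data from this thread, and base R only. Whether this is what happened in the original data is an assumption.

```r
# Invented data: 200 genes that agree perfectly between two replicates,
# plus one gene with an extreme value in the first replicate only.
x <- c(1:200, 50000)
y <- c(1:200, 100)

r_raw <- cor(x, y)                        # Pearson on raw values
r_log <- cor(log(x + 1), log(y + 1))      # Pearson on the log scale
r_rank <- cor(x, y, method = "spearman")  # rank-based alternative

# r_raw ends up close to 0, while r_log and r_rank stay high.
print(c(raw = r_raw, log = r_log, spearman = r_rank))
```

The Spearman value is included only as a second, scale-free point of comparison; it was not part of the original suggestion.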

                            • #15
                              Hi,
                              Your suggestion of plotting log(m[, ] + 1) gave me an idea. I ran cor(log(m[, ] + 1)) and it worked! The cor() values are all high, up to 0.95. Thanks for your help; mission accomplished, for now.
                              Do you know how I can label the differentially expressed genes? I tried labels=T in the scatterplots, but it didn't work.
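For reference, calling cor() on a whole log-transformed matrix, as described in this post, returns every pairwise correlation at once. A minimal sketch with an invented three-column matrix (hypothetical names `rep1` to `rep3`, base R only):

```r
# Invented FPKM-like values for three replicates.
m <- cbind(rep1 = c(0, 5, 20, 150, 900),
           rep2 = c(0.5, 4, 25, 130, 1000),
           rep3 = c(0, 7, 18, 160, 850))

# Pairwise Pearson correlations on the log(x + 1) scale.
r <- cor(log(m + 1))
print(round(r, 3))
```

The result is a symmetric matrix with 1 on the diagonal, so each replicate pair can be read off by row and column name, for example `r["rep1", "rep2"]`.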
