Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • #16
    Hi all,

    I'm very new to R and also RNA-seq in general so I hope my question is not too daft and that you might be able to help me out.

    I have an experiment set up where I have a group of 8 samples. 2 x infected at 4hrs, 2 x mock infected at 4hrs, 2 x infected at 7hrs and 2 x mock infected at 7hrs.

    I've been trying to use Deseq to analyse the read counts and by following the instructions from the PDF (which btw is very well written and can even be followed by a novice like me!) and I get to the point where I can look for differentially expressed genes. However, the instuctions only go into comparisons between two conditions. This is fine for looking at the infected and uninfected states but I would also like to examine whether the time course of infection has any effect on gene transcription. Is there any way of running a 3-way analysis in Deseq?
    Also, when I come to display the results of the most significantly differentially expressed genes the table generated only produces the top 6 genes. Is there a way of exporting the whole table created so that i can introduce a cut-off myself based on p-values?

    Thanks in advance for any help!

    Emma

    Comment


    • #17
      Dear Emma

      Originally posted by emma_n View Post
      I have an experiment set up where I have a group of 8 samples. 2 x infected at 4hrs, 2 x mock infected at 4hrs, 2 x infected at 7hrs and 2 x mock infected at 7hrs.

      I've been trying to use Deseq to analyse the read counts and by following the instructions from the PDF (which btw is very well written and can even be followed by a novice like me!) and I get to the point where I can look for differentially expressed genes. However, the instuctions only go into comparisons between two conditions. This is fine for looking at the infected and uninfected states but I would also like to examine whether the time course of infection has any effect on gene transcription. Is there any way of running a 3-way analysis in Deseq?
      I take it that, by "3-way analysis", you mean that you want to test an interaction contrast, i.e., to see whether the effect of the combination of later time and infection is significantly different from the sum of the effect of the later time only and the effect of the infection only.

      This is not yet possible with DESeq. The stress is on "not yet": I've found a way to do this (you are not the first to ask for it), and it works on my computer; I only need to add it to DESeq, and I hope that I'll be able to do so within the next wto weeks. You are welcome to be one of the first to try this new feature.

      If you want to try something right now, you can also use the 'getVarianceStabilizedData' to transform your data to a homoscedastic scale and the perform the analysis with limma, an R package for microarray analysis that is designed for such tasks and might be able to work with sequencing data as well, provided you first do a variance-stabilizing transform.

      Also, when I come to display the results of the most significantly differentially expressed genes the table generated only produces the top 6 genes. Is there a way of exporting the whole table created so that i can introduce a cut-off myself based on p-values?
      The 'head' function used in the vignette instructs R to only display the first 10 or so lines of a data frame. Write 'head( ..., 100 )' to get 100 lines, or omit the 'head' altogether to get all lines.

      If you want to look at the data with your favorite spreadsheet program (Excel or the like), just save the result data frame with the 'write.csv' function: write.csv( res, file="myresults.txt" )

      Simon

      Comment


      • #18
        Thanks for your reply Simon, i'm now generating the tables I'm after!

        As for the 3-way analysis, yes you're right that I want to be able to find out if the combination of infection plus time has and affect on the differentially transcribed genes. If you are developing this as an add in for Deseq then I'm more than happy to hang on for that. I would much prefer to do the analysis altogether within one package and if Deseq will be able to offer this in the near future that would be great for me.

        Thanks again for your help

        Emma

        Comment


        • #19
          Hi Everyone

          In my Analysis, i am working without any relicates......I trying to find differentially expressed genes using DESeq.
          After running > res <- nbinomTest(cds, "T", "N") I got following results
          > head(res)
          id baseMean baseMeanA baseMeanB foldChange log2FoldChange pval
          1 1 62.5678206 81.639584 43.4960569 0.5327815 -0.9083842 0.3533266
          2 2 123.5275329 217.192101 29.8629644 0.1374956 -2.8625423 0.1169151
          3 3 0.3245974 0.000000 0.6491949 Inf Inf 1.0000000
          4 4 125.0737757 207.949884 42.1976671 0.2029223 -2.3010007 0.1651501
          5 5 5.2703034 9.242217 1.2983898 0.1404847 -2.8315155 0.1715208
          6 6 359.4632766 485.216397 233.7101564 0.4816617 -1.0539079 0.3340572
          padj resVarA resVarB
          1 1 NA NA
          2 1 NA NA
          3 1 NA NA
          4 1 NA NA
          5 1 NA NA
          6 1 NA NA
          > head(res)
          id baseMean baseMeanA baseMeanB foldChange log2FoldChange pval
          1 1 62.5678206 81.639584 43.4960569 0.5327815 -0.9083842 0.3533266
          2 2 123.5275329 217.192101 29.8629644 0.1374956 -2.8625423 0.1169151
          3 3 0.3245974 0.000000 0.6491949 Inf Inf 1.0000000
          4 4 125.0737757 207.949884 42.1976671 0.2029223 -2.3010007 0.1651501
          5 5 5.2703034 9.242217 1.2983898 0.1404847 -2.8315155 0.1715208
          6 6 359.4632766 485.216397 233.7101564 0.4816617 -1.0539079 0.3340572
          padj resVarA resVarB
          1 1 NA NA
          2 1 NA NA
          3 1 NA NA
          4 1 NA NA
          5 1 NA NA
          6 1 NA NA
          As a next step to get a plot I tried following command and got the error
          > plot(
          + res$baseMean,
          + res$log2FoldChange
          + log="x", pch=20, cex=.1,
          Error: unexpected symbol in:
          "res$log2FoldChange
          log"

          Could you please help in this situation. how can i get reed of this errorand will get MvA plot.

          Regards
          Aniket

          Comment


          • #20
            Hi Aniket,

            your call to plot contains a syntax error, you need a comma between res$log2FoldChange and log="x".

            Have a look at http://www.r-project.org -> Manuals -> An Introduction to R if you are unsure about using R.

            Best wishes
            Wolfgang
            Wolfgang Huber
            EMBL

            Comment


            • #21
              Thank you Wolfgang Huber for you help, now the plot command is working but still I didnt get a proper plot.
              Now I got a blank plot, I am not understanding y I got a blank plot instead of MvA plot
              IS it because of the presence of "-Inf" in log2FoldChange column?

              Comment


              • #22
                Variance outliers? Columns resVarA or resVarB

                Hello Simon and Wolfgang -

                Could any of you elaborate on the presence of variance outliers? I've done several DE tests and I've sometimes noticed values in resVarA and resVarB differ a lot for one specific gene (8x, 20x). The manual (pg 10) suggests to excercise caution during interpretation, but is there any rule-of-thumb to determine one of these values is large compared to the other?

                Thanks in advance,
                Daniel

                Comment


                • #23
                  Hi,
                  I have installed DESeq and all seems right:

                  > biocLite("DESeq")
                  Using R version 2.11.1, biocinstall version 2.6.9.
                  Installing Bioconductor version 2.6 packages:
                  [1] "DESeq"
                  Please wait...

                  Warning: unable to access index for repository http://brainarray.mbni.med.umich.edu...d/contrib/2.11
                  trying URL 'http://www.bioconductor.org/packages/2.6/bioc/bin/macosx/leopard/contrib/2.11/DESeq_1.0.6.tgz'
                  Content type 'application/x-gzip' length 1533404 bytes (1.5 Mb)
                  opened URL
                  ==================================================
                  downloaded 1.5 Mb


                  but when I try to use
                  > cds <- newCountDataSet( countsTable, conds )
                  Error: could not find function "newCountDataSet"

                  what seems to be the problem?

                  Comment


                  • #24
                    Afadda,

                    did you type
                    library("DESeq")
                    to load the library?

                    Also, you're using an old version of R and - what's more of a problem - a rather old version of DESeq. I'd recommend to update.

                    Wolfgang
                    Wolfgang Huber
                    EMBL

                    Comment


                    • #25
                      ok. I'm ignorant in R...
                      I don't seem to be able to get the newCountDataSet to work. I get this error:
                      Error in round(countData) : Non-numeric argument to mathematical function
                      I checked, and non of the cells in the table contain alphabets. But, the class of one of the columns is "factor". I don't know why is it so and how to get around this.
                      thanks,

                      Comment


                      • #26
                        how about this error

                        > cds= newCountDataSet( table, conds )
                        Error in if (any(round(countData) != countData)) stop("The countData is not integer.") :
                        missing value where TRUE/FALSE needed


                        here all the table columns are of class integer. if anyone would just give me a clue on what's wrong with my data

                        Comment


                        • #27
                          Please send the output of the commands
                          Code:
                          head(table)
                          and
                          Code:
                          str(table)
                          . They will show you what exactly is in your "table" variable.

                          (By the way, "table" is not a good name for a variable, as it shadows the function "table" that you may need later.)

                          On the long run, it may also pay to learn a bit about R. Maybe read a few pages into the Introduction to R.

                          Comment


                          • #28
                            > head(table)
                            WT1 WT2 T
                            Tb927.1.10 0 0 0
                            Tb927.1.30 0 0 0
                            Tb927.1.40 0 0 0
                            Tb927.1.110 2260 2914 4825
                            Tb927.1.270 214 278 359
                            Tb927.1.280 205 331 589

                            > str(table)
                            'data.frame': 7882 obs. of 3 variables:
                            $ WT1: int 0 0 0 2260 214 205 628 26 379 142 ...
                            $ WT2: int 0 0 0 2914 278 331 724 58 591 261 ...
                            $ T : int 0 0 0 4825 359 589 1027 27 757 1380 ...

                            Comment


                            • #29
                              Sorry, I didn't notice that you have missing values, according to your error message.

                              So, use the command

                              Code:
                              which( is.na(table), arr.ind=TRUE )
                              to find out which rows contain missing values ("NA"). To see, e.g., the content of rows 7, 12, and 17, write

                              Code:
                              table[ c(7, 12, 17), ]
                              (Note the extra comma!) This should give you a hint what went wrong.

                              Of course, you could simply remove the offending rows with:
                              Code:
                              cleanTable <- table[ -which( is.na(table), arr.ind=TRUE )[,"row"], ]
                              Simon

                              Comment


                              • #30
                                thanks Simon. it's working now!

                                Comment

                                Latest Articles

                                Collapse

                                • seqadmin
                                  Current Approaches to Protein Sequencing
                                  by seqadmin


                                  Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
                                  04-04-2024, 04:25 PM
                                • seqadmin
                                  Strategies for Sequencing Challenging Samples
                                  by seqadmin


                                  Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
                                  03-22-2024, 06:39 AM

                                ad_right_rmr

                                Collapse

                                News

                                Collapse

                                Topics Statistics Last Post
                                Started by seqadmin, 04-11-2024, 12:08 PM
                                0 responses
                                30 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 04-10-2024, 10:19 PM
                                0 responses
                                32 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 04-10-2024, 09:21 AM
                                0 responses
                                28 views
                                0 likes
                                Last Post seqadmin  
                                Started by seqadmin, 04-04-2024, 09:00 AM
                                0 responses
                                53 views
                                0 likes
                                Last Post seqadmin  
                                Working...
                                X