Seqanswers Leaderboard Ad

**emma_n** · 06-24-2010, 03:35 AM

Hi all,

I'm very new to R and also RNA-seq in general so I hope my question is not too daft and that you might be able to help me out.

I have an experiment set up where I have a group of 8 samples. 2 x infected at 4hrs, 2 x mock infected at 4hrs, 2 x infected at 7hrs and 2 x mock infected at 7hrs.

I've been trying to use Deseq to analyse the read counts and by following the instructions from the PDF (which btw is very well written and can even be followed by a novice like me!) and I get to the point where I can look for differentially expressed genes. However, the instuctions only go into comparisons between two conditions. This is fine for looking at the infected and uninfected states but I would also like to examine whether the time course of infection has any effect on gene transcription. Is there any way of running a 3-way analysis in Deseq?
Also, when I come to display the results of the most significantly differentially expressed genes the table generated only produces the top 6 genes. Is there a way of exporting the whole table created so that i can introduce a cut-off myself based on p-values?

Thanks in advance for any help!

Emma

**Simon Anders** · 06-29-2010, 01:07 AM

Dear Emma

Originally posted by emma_n View Post

I have an experiment set up where I have a group of 8 samples. 2 x infected at 4hrs, 2 x mock infected at 4hrs, 2 x infected at 7hrs and 2 x mock infected at 7hrs.

I've been trying to use Deseq to analyse the read counts and by following the instructions from the PDF (which btw is very well written and can even be followed by a novice like me!) and I get to the point where I can look for differentially expressed genes. However, the instuctions only go into comparisons between two conditions. This is fine for looking at the infected and uninfected states but I would also like to examine whether the time course of infection has any effect on gene transcription. Is there any way of running a 3-way analysis in Deseq?

I take it that, by "3-way analysis", you mean that you want to test an interaction contrast, i.e., to see whether the effect of the combination of later time and infection is significantly different from the sum of the effect of the later time only and the effect of the infection only.

This is not yet possible with DESeq. The stress is on "not yet": I've found a way to do this (you are not the first to ask for it), and it works on my computer; I only need to add it to DESeq, and I hope that I'll be able to do so within the next wto weeks. You are welcome to be one of the first to try this new feature.

If you want to try something right now, you can also use the 'getVarianceStabilizedData' to transform your data to a homoscedastic scale and the perform the analysis with limma, an R package for microarray analysis that is designed for such tasks and might be able to work with sequencing data as well, provided you first do a variance-stabilizing transform.

Also, when I come to display the results of the most significantly differentially expressed genes the table generated only produces the top 6 genes. Is there a way of exporting the whole table created so that i can introduce a cut-off myself based on p-values?

The 'head' function used in the vignette instructs R to only display the first 10 or so lines of a data frame. Write 'head( ..., 100 )' to get 100 lines, or omit the 'head' altogether to get all lines.

If you want to look at the data with your favorite spreadsheet program (Excel or the like), just save the result data frame with the 'write.csv' function: write.csv( res, file="myresults.txt" )

Simon

**emma_n** · 06-29-2010, 03:54 AM

Thanks for your reply Simon, i'm now generating the tables I'm after!

As for the 3-way analysis, yes you're right that I want to be able to find out if the combination of infection plus time has and affect on the differentially transcribed genes. If you are developing this as an add in for Deseq then I'm more than happy to hang on for that. I would much prefer to do the analysis altogether within one package and if Deseq will be able to offer this in the near future that would be great for me.

Thanks again for your help

Emma

**avm970** · 07-31-2010, 11:23 AM

Hi Everyone

In my Analysis, i am working without any relicates......I trying to find differentially expressed genes using DESeq.
After running > res <- nbinomTest(cds, "T", "N") I got following results
> head(res)
id baseMean baseMeanA baseMeanB foldChange log2FoldChange pval
1 1 62.5678206 81.639584 43.4960569 0.5327815 -0.9083842 0.3533266
2 2 123.5275329 217.192101 29.8629644 0.1374956 -2.8625423 0.1169151
3 3 0.3245974 0.000000 0.6491949 Inf Inf 1.0000000
4 4 125.0737757 207.949884 42.1976671 0.2029223 -2.3010007 0.1651501
5 5 5.2703034 9.242217 1.2983898 0.1404847 -2.8315155 0.1715208
6 6 359.4632766 485.216397 233.7101564 0.4816617 -1.0539079 0.3340572
padj resVarA resVarB
1 1 NA NA
2 1 NA NA
3 1 NA NA
4 1 NA NA
5 1 NA NA
6 1 NA NA
> head(res)
id baseMean baseMeanA baseMeanB foldChange log2FoldChange pval
1 1 62.5678206 81.639584 43.4960569 0.5327815 -0.9083842 0.3533266
2 2 123.5275329 217.192101 29.8629644 0.1374956 -2.8625423 0.1169151
3 3 0.3245974 0.000000 0.6491949 Inf Inf 1.0000000
4 4 125.0737757 207.949884 42.1976671 0.2029223 -2.3010007 0.1651501
5 5 5.2703034 9.242217 1.2983898 0.1404847 -2.8315155 0.1715208
6 6 359.4632766 485.216397 233.7101564 0.4816617 -1.0539079 0.3340572
padj resVarA resVarB
1 1 NA NA
2 1 NA NA
3 1 NA NA
4 1 NA NA
5 1 NA NA
6 1 NA NA
As a next step to get a plot I tried following command and got the error
> plot(
+ res$baseMean,
+ res$log2FoldChange
+ log="x", pch=20, cex=.1,
Error: unexpected symbol in:
"res$log2FoldChange
log"

Could you please help in this situation. how can i get reed of this errorand will get MvA plot.

Regards
Aniket

**Wolfgang Huber** · 07-31-2010, 12:48 PM

Hi Aniket,

your call to plot contains a syntax error, you need a comma between res$log2FoldChange and log="x".

Have a look at http://www.r-project.org -> Manuals -> An Introduction to R if you are unsure about using R.

Best wishes
Wolfgang

**avm970** · 07-31-2010, 01:52 PM

Thank you Wolfgang Huber for you help, now the plot command is working but still I didnt get a proper plot.
Now I got a blank plot, I am not understanding y I got a blank plot instead of MvA plot
IS it because of the presence of "-Inf" in log2FoldChange column?

**dgu** · 08-11-2010, 02:32 PM

Variance outliers? Columns resVarA or resVarB

Hello Simon and Wolfgang -

Could any of you elaborate on the presence of variance outliers? I've done several DE tests and I've sometimes noticed values in resVarA and resVarB differ a lot for one specific gene (8x, 20x). The manual (pg 10) suggests to excercise caution during interpretation, but is there any rule-of-thumb to determine one of these values is large compared to the other?

Thanks in advance,
Daniel

**afadda** · 02-09-2011, 05:55 AM

Hi,
I have installed DESeq and all seems right:

> biocLite("DESeq")
Using R version 2.11.1, biocinstall version 2.6.9.
Installing Bioconductor version 2.6 packages:
[1] "DESeq"
Please wait...

Warning: unable to access index for repository http://brainarray.mbni.med.umich.edu...d/contrib/2.11
trying URL 'http://www.bioconductor.org/packages/2.6/bioc/bin/macosx/leopard/contrib/2.11/DESeq_1.0.6.tgz'
Content type 'application/x-gzip' length 1533404 bytes (1.5 Mb)
opened URL
==================================================
downloaded 1.5 Mb

but when I try to use
> cds <- newCountDataSet( countsTable, conds )
Error: could not find function "newCountDataSet"

what seems to be the problem?

**Wolfgang Huber** · 02-09-2011, 06:01 AM

Afadda,

did you type
library("DESeq")
to load the library?

Also, you're using an old version of R and - what's more of a problem - a rather old version of DESeq. I'd recommend to update.

Wolfgang

**afadda** · 02-09-2011, 10:17 AM

ok. I'm ignorant in R...
I don't seem to be able to get the newCountDataSet to work. I get this error:
Error in round(countData) : Non-numeric argument to mathematical function
I checked, and non of the cells in the table contain alphabets. But, the class of one of the columns is "factor". I don't know why is it so and how to get around this.
thanks,

**afadda** · 02-10-2011, 07:51 AM

how about this error

> cds= newCountDataSet( table, conds )
Error in if (any(round(countData) != countData)) stop("The countData is not integer.") :
missing value where TRUE/FALSE needed

here all the table columns are of class integer. if anyone would just give me a clue on what's wrong with my data

**Simon Anders** · 02-10-2011, 07:58 AM

Please send the output of the commands

Code:

head(table)

and

Code:

str(table)

. They will show you what exactly is in your "table" variable.

(By the way, "table" is not a good name for a variable, as it shadows the function "table" that you may need later.)

On the long run, it may also pay to learn a bit about R. Maybe read a few pages into the Introduction to R.

**afadda** · 02-10-2011, 09:35 AM

> head(table)
WT1 WT2 T
Tb927.1.10 0 0 0
Tb927.1.30 0 0 0
Tb927.1.40 0 0 0
Tb927.1.110 2260 2914 4825
Tb927.1.270 214 278 359
Tb927.1.280 205 331 589

> str(table)
'data.frame': 7882 obs. of 3 variables:
$ WT1: int 0 0 0 2260 214 205 628 26 379 142 ...
$ WT2: int 0 0 0 2914 278 331 724 58 591 261 ...
$ T : int 0 0 0 4825 359 589 1027 27 757 1380 ...

**Simon Anders** · 02-11-2011, 02:53 AM

Sorry, I didn't notice that you have missing values, according to your error message.

So, use the command

Code:

which( is.na(table), arr.ind=TRUE )

to find out which rows contain missing values ("NA"). To see, e.g., the content of rows 7, 12, and 17, write

Code:

table[ c(7, 12, 17), ]

(Note the extra comma!) This should give you a hint what went wrong.

Of course, you could simply remove the offending rows with:

Code:

cleanTable <- table[ -which( is.na(table), arr.ind=TRUE )[,"row"], ]

Simon

**afadda** · 02-11-2011, 07:03 AM

thanks Simon. it's working now!

Topics	Statistics	Last Post
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 30 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 32 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM
Novel Diagnostic Assay Enhances Ovarian Cancer Detection by seqadmin Started by seqadmin, 04-10-2024, 09:21 AM	0 responses 28 views 0 likes	Last Post by seqadmin 04-10-2024, 09:21 AM
Evolutionary Dynamics of Centromeres: A Comparative Genomic Analysis by seqadmin Started by seqadmin, 04-04-2024, 09:00 AM	0 responses 53 views 0 likes	Last Post by seqadmin 04-04-2024, 09:00 AM

Seqanswers Leaderboard Ad

Announcement

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News