KellerMac 06-10-2011 07:49 AM

DESeq DiffExpress Vs. Fold Change
I don't understand the logic behind the data set DESeq provides when I run the differential expression command. The numbers in every column appear to be in random order; no single column gives the table an obvious numerical order. What is the basis for the list of genes I am receiving? Fold change, on the other hand, makes more sense, be it up- or down-regulation, and the numbers in the columns have an obvious order. Is the fold change shown relative to each other, or from the perspective of the first condition over the second?

mbblack 06-10-2011 10:04 AM

The first column is just a row id taken from your input file. The data is simply presented in the same order as the input data. If you don't want to sort the data in R, just write it out to a tab-delimited table and parse it in Excel (the headers will be offset by one, since that first column is not data, it's just the row number relative to the input data order).

> res <- nbinomTest(cds, "Control", "Treatment")
> write.table(res, file = "nBinom_results.txt", sep = "\t")

Open nBinom_results.txt in Excel, parse it by tabs, and sort it by whichever column is of interest. If set up as above, fold change will be for Treatment relative to Control (i.e. a positive log2FoldChange means up-regulated in Treatment, and a negative one means down-regulated in Treatment).
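If you would rather do the sorting in R than in Excel, the same idea can be sketched as below. This is only an illustration: the small data frame stands in for the output of nbinomTest() and its values are made up, the column names are the ones nbinomTest() normally returns, and the output file name and the use of tempdir() are my own assumptions so the snippet runs anywhere.

```r
# Toy stand-in for the data frame returned by nbinomTest(); all values are made up.
res <- data.frame(
  id        = c("geneA", "geneB", "geneC", "geneD"),
  baseMeanA = c(200, 100, 10, 50),   # mean for the first condition (Control)
  baseMeanB = c(190, 400, 10, 25),   # mean for the second condition (Treatment)
  pval      = c(0.60, 0.001, 0.90, 0.04),
  padj      = c(0.80, 0.004, 0.90, 0.08),
  stringsAsFactors = FALSE
)

# Fold change is second condition over first, i.e. Treatment / Control
res$foldChange     <- res$baseMeanB / res$baseMeanA
res$log2FoldChange <- log2(res$foldChange)

# Sort by adjusted p-value, smallest first
resSorted <- res[order(res$padj), ]

# row.names = FALSE drops the row-number column, so the headers line up in Excel
out_file <- file.path(tempdir(), "nBinom_results_sorted.txt")  # hypothetical file name
write.table(resSorted, file = out_file, sep = "\t",
            row.names = FALSE, quote = FALSE)
```

Swapping `res$padj` for `res$pval` or `res$log2FoldChange` inside order() sorts by that column instead.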

KellerMac 06-10-2011 10:27 AM

Thanks for the quick reply, Mr. Black.

OK, I understand that the data can be sorted manually in Excel, but I don't see what correlation there is among the genes that are marked as differentially expressed. I would assume it would be greatest to least, or vice versa, for at least one column of numbers, but I am not seeing that relation. I'm sure it's there... it's just not immediately obvious.

P.S. Please excuse my spelling errors. I cannot spell well, and this Mac won't let me use spell check, or rather I don't care enough to make it work or learn how.

mbblack 06-10-2011 10:49 AM

The default is to list the data in the same gene/transcript order as the input data. The only way to get the order you want is to sort the data yourself, by pValue, adjPvalue, fold change, or whatever you wish. But the default is just to list it by input order.

Note too that if you use the two commands I wrote, you get the results back for all the genes in your input, not just significant genes. You would then have to sort that table and apply whatever statistical cutoffs you wish (e.g. sort by adjPvalue, then select only those less than 0.01).

You could do something like:

> resSig <- res[ res$padj < 0.1, ]
> head(resSig[ order(resSig$padj), ])

to see the most significant genes sorted by adjPvalue. But again, if you wrote those out to a file:

> write.table(resSig, file = "nBinom_results_lessthan_.1.txt", sep = "\t")

That file would only include those genes with an adjPvalue less than 0.1, but the table would still be sorted initially simply by the row input order of the genes. So, you would still need to sort it manually to apply whatever order you wished to see in it.
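To have the file come out already ordered, you can filter and sort before writing. A rough sketch follows, again on a made-up stand-in for the nbinomTest() result; the gene ids, values, and file name are hypothetical. It also assumes padj may contain NA for some genes, in which case a plain `res$padj < 0.1` subset would leave all-NA rows in the table, so which() is used to drop them.

```r
# Toy stand-in for the nbinomTest() output; values are made up, one padj is NA.
res <- data.frame(
  id             = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  log2FoldChange = c(-0.1, 2, 0, -1, -3),
  padj           = c(0.80, 0.004, NA, 0.08, 0.03),
  stringsAsFactors = FALSE
)

# which() keeps only rows where the test is TRUE, silently dropping NA rows
resSig <- res[which(res$padj < 0.1), ]

# Order the significant genes by adjusted p-value, smallest first
resSigByPadj <- resSig[order(resSig$padj), ]

# Alternative: order by size of the change, largest absolute log2 fold change first
resSigByFC <- resSig[order(-abs(resSig$log2FoldChange)), ]

# Write the sorted table; the file keeps the order of the data frame passed in
out_file <- file.path(tempdir(), "nBinom_results_lessthan_.1_sorted.txt")  # hypothetical name
write.table(resSigByPadj, file = out_file, sep = "\t",
            row.names = FALSE, quote = FALSE)
```

Whichever data frame is passed to write.table() determines the row order in the file, so no further sorting in Excel should be needed.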

All times are GMT -8.