Thread: DESeq problems
Old 08-05-2013, 01:57 AM   #11
vd4mindia
Member
 
Location: Milan

Join Date: May 2013
Posts: 40
Help with DESeq analysis and filtering the DEGs from its output

I have just started using DESeq and am trying to compare my DEG results between cuffdiff, DESeq and RankProd. I have a few questions, because I am confused about what to do once the analysis is done.

I am comparing 2 tumor conditions with 5 samples in total: 3 samples for peripheries giving tumor (PGT) and 2 for peripheries not giving tumor (PDGT). Following the DESeq documentation, I built a matrix of raw fragment counts for the two conditions, since DESeq works only with raw counts, and rounded the values to the nearest integer, since the package only accepts integers. Then I ran the standard DESeq commands to get my DEGs, but the output does not contain only the DEGs; it lists all the genes. Can you tell me where I am going wrong, and whether I should do any pre-filtering or post-filtering to extract the list of DEGs from the output? I am attaching the output file, and my script is below.

Another problem is that padj, the corrected p-value, does not look right either, so I cannot use it as a cutoff and then split my DEGs into up- and down-regulated by log2 fold change. The padj values are all either 1 or NA. The baseMean fields are also sometimes 0, and for those rows the log2FoldChange shows up in Excel as #NAME?, which I take to mean that Excel cannot interpret the value: when one of the two baseMean values is 0, the fold change is 0 or undefined and the log2 fold change cannot be calculated.
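To show the zero-count problem separately from my real script, here is a minimal toy sketch in base R. The counts and gene names are made up, and the fold change is computed from plain raw means rather than DESeq's normalized baseMean values, so it only illustrates the arithmetic. It assumes the kind of pre-filtering I am asking about would be to drop genes with zero counts in every sample.

```r
# Toy integer counts (made up for illustration, NOT my real data)
toy <- data.frame(
  PGT1  = c(10L, 0L, 5L),
  PGT2  = c(12L, 0L, 7L),
  PGT3  = c( 8L, 0L, 6L),
  PDGT1 = c(20L, 0L, 0L),
  PDGT2 = c(20L, 0L, 0L),
  row.names = c("geneA", "geneB", "geneC")
)

# Assumed pre-filter: keep only genes with at least one count somewhere
keep <- rowSums(toy) > 0
toy_filtered <- toy[keep, , drop = FALSE]

# Per-condition means and log2 fold change (PDGT relative to PGT)
mean_pgt  <- rowMeans(toy_filtered[, c("PGT1", "PGT2", "PGT3")])
mean_pdgt <- rowMeans(toy_filtered[, c("PDGT1", "PDGT2")])
log2fc <- log2(mean_pdgt / mean_pgt)

print(log2fc)
```

In this toy case geneB is removed by the filter, while geneC still has a mean of 0 in one condition, so its log2 fold change is infinite in R. I suspect this is the kind of value that Excel then shows as #NAME?.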

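library(DESeq)

# Read the raw fragment-count matrix (first column holds the gene IDs)
dat1 <- read.table("/Users/vdas/Documents/RNA-Seq_Smaples_Udine_08032013/GBM_29052013/UD_RP_25072013/RP_matrix_RF_PGTvsPDGT.txt", sep="", header=TRUE, stringsAsFactors=FALSE)

# Round every count column to the nearest integer, as DESeq requires integer counts
dat1[,-1] <- lapply(lapply(dat1[,-1], round), as.integer)

# Write without R's automatic row numbers, so that the gene ID column
# (not the row numbers) becomes the row names when the file is read back
write.table(dat1, "/Users/vdas/Documents/RNA-Seq_Smaples_Udine_08032013/GBM_29052013/UD_RP_25072013/rev_RF_PGTvsPDGT.txt", sep="\t", row.names=FALSE)

count_table <- read.table("/Users/vdas/Documents/RNA-Seq_Smaples_Udine_08032013/GBM_29052013/UD_RP_25072013/rev_RF_PGTvsPDGT.txt", header=TRUE, sep="\t", row.names=1)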

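# One condition label per sample, in the same order as the columns of count_table
expt_design <- data.frame(row.names = colnames(count_table),
                          condition = c("PGT","PGT","PGT","PDGT","PDGT"))

expt_design

conditions <- factor(expt_design$condition)

conditions

# Build the DESeq count data set
data <- newCountDataSet(count_table, conditions)

head(counts(data))

# Normalize for sequencing depth
data <- estimateSizeFactors(data)

sizeFactors(data)

# Estimate per-gene dispersions
data <- estimateDispersions(data)

# Negative binomial test of PGT against PDGT; returns one row per gene
results <- nbinomTest(data, "PGT", "PDGT")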

Is there anything wrong with the analysis script? Please let me know, and also whether I need to add some post-filtering. I can provide more information if needed.
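For reference, the kind of post-filtering I have in mind would look roughly like the sketch below. It runs on a small made-up data frame that only mimics some of the columns nbinomTest returns (id, baseMeanA, baseMeanB, log2FoldChange, padj), not on my real results, and the 0.05 cutoff on padj is just an assumed threshold.

```r
# Mock results table mimicking some nbinomTest columns (values are made up)
res <- data.frame(
  id             = c("g1", "g2", "g3", "g4"),
  baseMeanA      = c(100, 50, 0, 20),
  baseMeanB      = c(400, 48, 10, 2),
  log2FoldChange = c(2, log2(48 / 50), Inf, log2(2 / 20)),
  padj           = c(0.01, 0.9, NA, 0.03),
  stringsAsFactors = FALSE
)

alpha <- 0.05  # assumed significance cutoff on the adjusted p-value

# Keep genes with a usable adjusted p-value below the cutoff
# and a finite log2 fold change (drops NA padj and Inf/-Inf rows)
sig <- res[!is.na(res$padj) & res$padj < alpha & is.finite(res$log2FoldChange), ]

# Split the significant genes by direction of change
up   <- sig[sig$log2FoldChange > 0, ]
down <- sig[sig$log2FoldChange < 0, ]

print(up$id)
print(down$id)
```

With my real output this would currently return nothing, since all my padj values are 1 or NA, which is why I would like to know whether the problem is in the script or in the data.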