Old 07-19-2017, 08:47 PM   #1
lagrace
Junior Member
 
Location: Mississippi

Join Date: May 2017
Posts: 4
NA's in data (using TopHat-HTSeq-DESeq)

Hello all,

I have some data retrieved from GEO and am analyzing it with the TopHat-HTSeq-DESeq pipeline. Running R on Linux, I ran into an issue with NA's in the results, and I am trying to determine why there are so many. I was told to view the annotations. Is that possible in R? If so, how?
I am not sure whether there is some other reason. Below is the code I used:

samTab=read.table(file="samTab.txt",sep="\t")
samTab
df=newCountDataSetFromHTSeqCount(samTab)
df

cds<-estimateSizeFactors(df)
sizeFactors(cds)
head(counts(cds,normalized=TRUE))

cds=estimateDispersions(cds)
head(cds)
str(fitInfo(cds))
head(fData(cds))

res=nbinomTest(cds,"Nor","Can")
head(res)
id baseMean baseMeanA baseMeanB foldChange log2FoldChange pval padj
A1 138.33103 129.97950 146.68255 1.1285052 0.17441313 0.9057131 1


resSig = res[res$padj < 0.1, ]
head(resSig[order(resSig$pval), ])
id baseMean baseMeanA baseMeanB foldChange log2FoldChange pval padj
NA <NA> NA NA NA NA NA NA NA
NA.1 <NA> NA NA NA NA NA NA NA
NA.2 <NA> NA NA NA NA NA NA NA
NA.3 <NA> NA NA NA NA NA NA NA
NA.4 <NA> NA NA NA NA NA NA NA
NA.5 <NA> NA NA NA NA NA NA NA

I don't understand why the padj is 1, and I suppose that is the reason for the NA's. Is there a way to view the annotations from within R? Any suggestions would be appreciated.
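
A possible explanation for the all-NA rows, sketched on a toy data frame rather than the real nbinomTest() output: when a logical index contains NA, R returns one all-NA row for each NA instead of dropping it. If padj is NA for some genes (an assumption here; it would need to be checked with sum(is.na(res$padj)) on the real results), then res[res$padj < 0.1, ] would produce rows like the ones shown above, independently of any row whose padj is 1. The gene ids and values below are invented for illustration only.

```r
# Toy stand-in for the nbinomTest() result; all values are made up.
res <- data.frame(
  id   = c("g1", "g2", "g3", "g4"),
  pval = c(0.001, 0.9, NA, NA),
  padj = c(0.004, 1, NA, NA)
)

# How many adjusted p-values are missing?
n_missing <- sum(is.na(res$padj))

# Logical indexing: each NA in the condition becomes an all-NA row.
bad <- res[res$padj < 0.1, ]

# which() returns only the positions where the condition is TRUE,
# so rows with NA padj are dropped instead of turned into NA rows.
good <- res[which(res$padj < 0.1), ]

print(n_missing)
print(bad)
print(good)
```

If this is what is happening, using which() (or adding !is.na(res$padj) to the condition) in the resSig line should leave only the genes that actually pass the 0.1 threshold, which may be none at all if every non-missing padj is close to 1.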