SEQanswers

Old 05-11-2015, 05:36 PM   #1
bjackson
Junior Member
 
Location: Denver, CO

Join Date: May 2015
Posts: 6
CummeRbund shows 95% NAs for gene name lists

Code:
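library(cummeRbund)

# Load the cuffdiff output together with the merged annotation
cuff_data = readCufflinks(dir=paste0(getwd(),"/../cuffdiff/"),
                          gtfFile = paste0(getwd(),"/../cuffmerge/merged.gtf"),
                          genome="hg19", rebuild=FALSE)

# Get the IDs of the significantly differentially expressed genes
diffGeneIDs = getSig(cuff_data, level="genes", alpha=0.05)

# Pull the corresponding gene set
diffGenes = getGenes(cuff_data, diffGeneIDs)

# Show the first 41 tracking IDs and gene names
featureNames(diffGenes)[1:41,]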
tracking_id gene_short_name
1 XLOC_000265 CROCC
2 XLOC_000328 ALPL
3 XLOC_000443 SYTL1
4 XLOC_000679 ARTN
5 XLOC_000938 RNU6-387P
6 XLOC_001003 FAM73A,RNA5SP21
7 XLOC_001244 GSTM1,GSTM2
8 XLOC_001530 ADAMTSL4,AL356356.1,MIR4257
9 XLOC_001735 FCER1A
10 XLOC_001988 AXDND1
11 XLOC_002089 RGS1
12 XLOC_002211 NFASC,RP11-494K3.2
13 XLOC_002282 HHAT,KCNH1
14 XLOC_002522 RYR2
15 XLOC_002961 PLA2G2A
16 XLOC_003294 SLC2A1
17 XLOC_003772 MIR137HG
18 XLOC_004252 ASH1L,ASH1L-IT1,MIR555
19 XLOC_004467 RP1-117P20.3,SELE
20 XLOC_004799 CR1L
21 XLOC_004802 CD34
22 XLOC_004843 <NA>
23 XLOC_005255 <NA>
24 XLOC_005256 <NA>
25 XLOC_005257 <NA>
26 XLOC_005263 <NA>
27 XLOC_005264 <NA>
28 XLOC_005265 <NA>
29 XLOC_005273 <NA>
30 XLOC_005276 <NA>
31 XLOC_005277 <NA>
32 XLOC_005288 <NA>
33 XLOC_005291 <NA>
34 XLOC_005292 <NA>
35 XLOC_005293 <NA>
36 XLOC_005294 <NA>
37 XLOC_005322 <NA>
38 XLOC_005331 <NA>
39 XLOC_005332 <NA>
40 XLOC_005333 <NA>
41 XLOC_005336 <NA>

Code:
# Fraction of significant genes without a gene name
mean(is.na(featureNames(diffGenes)$gene_short_name))
[1] 0.9691279
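
To make the tally above easier to follow, here is a minimal base-R sketch on a toy data frame. The data frame `feature_names` is a hand-built stand-in for `featureNames(diffGenes)` (five rows copied from the listing above), and the variable names are my own, not part of cummeRbund. It also splits the comma-separated entries, since one XLOC locus can carry several gene names.

```r
# Toy stand-in for featureNames(diffGenes); rows copied from the listing above.
feature_names <- data.frame(
  tracking_id = c("XLOC_000265", "XLOC_001244", "XLOC_004802",
                  "XLOC_004843", "XLOC_005255"),
  gene_short_name = c("CROCC", "GSTM1,GSTM2", "CD34", NA, NA),
  stringsAsFactors = FALSE
)

# TRUE where the locus has no gene name attached.
unnamed <- is.na(feature_names$gene_short_name)

# Fraction of unnamed loci and count of named loci.
na_fraction <- mean(unnamed)
named_count <- sum(!unnamed)

# A locus can list several comma-separated names; split them to count genes.
named_genes <- unlist(strsplit(feature_names$gene_short_name[!unnamed],
                               ",", fixed = TRUE))

print(na_fraction)
print(named_count)
print(named_genes)
```

On the real object the same three quantities would come from `featureNames(diffGenes)` in place of the toy data frame.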

This is human data, and I used the same GTF and genome files at every step, so why are so few genes labeled? Only about 3% of the 6284 differentially expressed genes have names.
I followed the protocol from the Nature paper (tophat -> cufflinks -> cuffmerge -> cuffdiff).

Thanks

Last edited by bjackson; 05-11-2015 at 05:55 PM.
Old 05-12-2015, 09:09 AM   #2
bjackson
Junior Member
 
Location: Denver, CO

Join Date: May 2015
Posts: 6

I'm still looking through this data set. I have 194 named differentially expressed genes. What is a 'normal' number of differentially expressed genes to expect? I know it can vary a lot depending on how different the conditions are, but is this within the plausible range? Many of the unnamed features are small fragments or exons. Does this sound like it might be a true result rather than an annotation error?
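
One check I could run to separate those two possibilities is to see how many records in merged.gtf carry a gene_name attribute at all. Below is a rough base-R sketch of that idea. The two attribute strings are made up in the general style of a cuffmerge GTF (the TCONS and oId values are invented), and `extract_gene_name` is a hypothetical helper, not a cummeRbund or Cufflinks function.

```r
# Hypothetical attribute strings (GTF column 9); not taken from my real file.
attrs <- c(
  'gene_id "XLOC_000265"; transcript_id "TCONS_00000001"; exon_number "1"; gene_name "CROCC"; oId "CUFF.1.1";',
  'gene_id "XLOC_004843"; transcript_id "TCONS_00000002"; exon_number "1"; oId "CUFF.2.1";'
)

# Hypothetical helper: return the gene_name value, or NA when it is absent.
extract_gene_name <- function(x) {
  hits <- regmatches(x, regexec('gene_name "([^"]+)"', x))
  vapply(hits,
         function(m) if (length(m) == 2) m[2] else NA_character_,
         character(1))
}

gene_names <- extract_gene_name(attrs)

# Fraction of records without a gene_name attribute.
missing_fraction <- mean(is.na(gene_names))

print(gene_names)
print(missing_fraction)
```

If the fraction in the real merged.gtf were close to the fraction of NAs in cummeRbund, that would suggest the names are already missing upstream rather than being lost at load time.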

All times are GMT -8.