SEQanswers

Old 04-02-2014, 04:44 AM   #1
willemate
Junior Member
 
Location: Berlin

Join Date: Mar 2014
Posts: 9
annotate ID after using DESeq

Hi everyone!

I have my RNA-seq topTable below, and I would like to add columns with an alias code and a predicted function to my data. The ID in the topTable is the ORF code. Does anyone know the best way of doing this?

> write.csv( res.combined, file="rnaseq.topTable.csv" )
> res.combined[1:5,]
ID mean.d0 mean.d1 mean.d3 mean.d6 FC.d1-d0 FC.d3-d0 FC.d6-d0
1 An01g00010 22.56114 91.74978 91.6181 101.3439 4.0667164 4.0608797 4.4919670
2 An01g00020 59.12810 157.12747 186.8015 195.0879 2.6574079 3.1592673 3.2994108
3 An01g00030 2496.49429 877.02756 1232.1749 1349.5057 0.3513037 0.4935621 0.5405603
4 An01g00040 1697.34332 1329.64879 1256.3100 1336.8125 0.7833706 0.7401626 0.7875911
5 An01g00050 8234.36186 940.63154 1322.0219 1385.3169 0.1142325 0.1605494 0.1682361
logFC.d1-d0 logFC.d3-d0 logFC.d6-d0 pval.d1-d0 pval.d3-d0 pval.d6-d0 qval.d1-d0
1 2.0238644 2.0217923 2.1673473 5.947051e-07 6.311527e-07 7.022009e-08 1.547561e-06
2 1.4100197 1.6595900 1.7222084 4.491603e-06 5.141473e-08 1.880176e-08 1.079539e-05
3 -1.5092095 -1.0186965 -0.8874725 1.772232e-15 5.512175e-08 1.864615e-06 8.577393e-15
4 -0.3522332 -0.4340859 -0.3444813 6.577599e-02 2.044133e-02 6.936071e-02 9.494206e-02
5 -3.1299552 -2.6389108 -2.5714408 4.471197e-59 4.230865e-44 5.274404e-42 1.569486e-57
qval.d3-d0 qval.d6-d0
1 1.815415e-06 2.052575e-07
2 1.632608e-07 5.782136e-08
3 1.746078e-07 4.783620e-06
4 3.341204e-02 9.971601e-02
5 9.841654e-43 1.024172e-40


Excel file with needed info from an article:

ORF code    Alias Code  Predicted function
An01g00010  An00g03020  hypothetical protein [truncated ORF]
An01g00020  An00g13235  weak similarity to nucleotide binding protein phnN - Escherichia coli
An01g00030  An00g08601  strong similarity to Hgh1 - Saccharomyces cerevisiae
An01g00040  An00g07028  strong similarity to alpha subunit of transcription initiation factor TFIIF Tfg1 - Saccharomyces cerevisiae [truncated ORF]
An01g00050  An00g04204  similarity to fatty-acyl-CoA synthase beta chain Fas1 - Saccharomyces cerevisiae [truncated ORF]


Thank you!
Old 04-02-2014, 04:52 AM   #2
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Presumably you have access to information relating your ORFs to their predicted functions (otherwise, you get to make such a file). You can then load that and use "match()" to line up the IDs and descriptions. Technically, you match the IDs in both tables, and the result is a vector of indices that you can use to reorder the descriptions so that they line up with your ORFs.
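As a minimal sketch of what match() returns, here is a toy example. The object names "ids" and "anno" and the column names are made up for illustration; only the ID values come from the tables in the first post, and An01g00099 is a hypothetical ID that is deliberately missing from the annotation.

```r
# Toy data: IDs as they might appear in the results table (hypothetical order)
ids <- c("An01g00030", "An01g00010", "An01g00099")

# Hypothetical annotation table with assumed column names
anno <- data.frame(
  ORF.code   = c("An01g00010", "An01g00020", "An01g00030"),
  Alias.Code = c("An00g03020", "An00g13235", "An00g08601"),
  stringsAsFactors = FALSE
)

# For each element of ids, match() gives its row position in anno (NA if absent)
m <- match(ids, anno$ORF.code)

# Indexing with m reorders the annotation so that it lines up with ids
alias <- anno$Alias.Code[m]
print(data.frame(ID = ids, Alias = alias))
```

IDs with no entry in the annotation get NA rather than being dropped, so the result always has the same length and order as the IDs you started with.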
Old 04-02-2014, 06:12 AM   #3
willemate
Junior Member
 
Location: Berlin

Join Date: Mar 2014
Posts: 9

Thanks, I think I understand what you are saying, but I already have a file that relates the ORF to the function (see the Excel file above). I need to link that file to the ORFs in the other file... But I don't really know much about the match function. I guess I use the base package?
Old 04-02-2014, 06:26 AM   #4
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Ah, I guess I hadn't fully realised that that was an Excel file that you already had. Assuming the Excel file has been saved as a tsv (tab-separated values) file called "foo.tsv":

Code:
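anno <- read.delim("foo.tsv", header=TRUE, stringsAsFactors=FALSE)
# read.delim() turns spaces in headers into dots, so "ORF code" becomes ORF.code
m <- match(res.combined$ID, anno$ORF.code) # row of anno for each ID (NA if absent)
res.combined$AliasCode <- anno$Alias.Code[m] # index by m so the rows line up
res.combined$PredFunc <- anno$Predicted.function[m]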
Something like that should work. The match() function is pretty handy.
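
If you are unsure what the columns end up being called, you can check before matching. The sketch below is only an illustration: it reads a two-row excerpt of the table from the first post out of a string (the "tsv" object stands in for the real file) and shows the column names with and without R's default name clean-up.

```r
# Two rows of the annotation table from the first post, as tab-separated text
tsv <- paste(
  "ORF code\tAlias Code\tPredicted function",
  "An01g00010\tAn00g03020\thypothetical protein [truncated ORF]",
  "An01g00020\tAn00g13235\tweak similarity to nucleotide binding protein phnN - Escherichia coli",
  sep = "\n"
)

# Default behaviour: headers are made syntactically valid (spaces become dots)
anno <- read.delim(text = tsv, header = TRUE, stringsAsFactors = FALSE)
print(names(anno))

# With check.names = FALSE the headers are kept exactly as in the file
anno_raw <- read.delim(text = tsv, header = TRUE, stringsAsFactors = FALSE,
                       check.names = FALSE)
print(names(anno_raw))

# Columns whose names contain spaces need [[ ]] or backticks instead of a plain $
print(anno_raw[["ORF code"]])
```

It is also worth running sum(is.na(m)) after match() to see how many IDs found no annotation.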
Old 04-02-2014, 07:15 AM   #5
willemate
Junior Member
 
Location: Berlin

Join Date: Mar 2014
Posts: 9

I think that should work, yes, thanks. But I can't get hold of that file. It is an xls file, but I can't save it as tab-delimited or download it as anything other than a worksheet. I also tried to make my own file...

You can find the file here. http://www.biomedcentral.com/1471-21...380/additional
additional file 1
Old 04-02-2014, 07:26 AM   #6
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

I've attached the information from additional file 1 as a tab-separated values text file (it's gzipped so the forum will allow it).
Attached Files
File Type: gz 1471-2164-13-380-s1.tsv.gz (244.9 KB, 2 views)
Old 04-02-2014, 08:31 PM   #7
Jeremy
Senior Member
 
Location: Pathum Thani, Thailand

Join Date: Nov 2009
Posts: 190

Let's assume that the table with the annotation is called "annot" and that its column with the An01g00010 entries is also called "ID". Then you want:
Code:
New <- merge(res.combined, annot, by="ID", all.x=TRUE, sort=FALSE)
edit:
Oh, I see it is called "ORF code".
Code:
# use by.y="ORF.code" instead if annot was read with read.delim() defaults
New <- merge(res.combined, annot, by.x="ID", by.y="ORF code", all.x=TRUE, sort=FALSE)
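
One thing to watch with merge(): its help page describes the row order as unspecified when sort=FALSE, so if the original order of the results matters it is safer to restore it explicitly. Below is a small sketch on toy data; the "res" and "annot" tables, the "row" helper column and the ID An01g00099 are made up for illustration, and the annotation column is assumed to be named "ORF.code".

```r
# Toy results table (hypothetical values) and a partial annotation table
res <- data.frame(
  ID    = c("An01g00030", "An01g00010", "An01g00099"),
  logFC = c(-1.5, 2.0, 0.3),
  stringsAsFactors = FALSE
)
annot <- data.frame(
  ORF.code   = c("An01g00010", "An01g00030"),
  Alias.Code = c("An00g03020", "An00g08601"),
  stringsAsFactors = FALSE
)

# Remember the original row order in a helper column
res$row <- seq_len(nrow(res))

# all.x = TRUE keeps every row of res, filling NA where there is no annotation
New <- merge(res, annot, by.x = "ID", by.y = "ORF.code", all.x = TRUE)

# Put the rows back in their original order and drop the helper column
New <- New[order(New$row), ]
New$row <- NULL
print(New)
```

Compared with match(), merge() brings in all annotation columns in one step, at the cost of having to look after the row order yourself.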

Last edited by Jeremy; 04-02-2014 at 08:33 PM.