  • annotate ID after using DESeq

    Hi everyone!

    I have my RNA-seq topTable below, and I would like to add columns with an alias code and a predicted function. The ID in the topTable is the ORF code. Does anyone know the best way to do this?

    > write.csv( res.combined, file="rnaseq.topTable.csv" )
    > res.combined[1:5,]
    ID mean.d0 mean.d1 mean.d3 mean.d6 FC.d1-d0 FC.d3-d0 FC.d6-d0
    1 An01g00010 22.56114 91.74978 91.6181 101.3439 4.0667164 4.0608797 4.4919670
    2 An01g00020 59.12810 157.12747 186.8015 195.0879 2.6574079 3.1592673 3.2994108
    3 An01g00030 2496.49429 877.02756 1232.1749 1349.5057 0.3513037 0.4935621 0.5405603
    4 An01g00040 1697.34332 1329.64879 1256.3100 1336.8125 0.7833706 0.7401626 0.7875911
    5 An01g00050 8234.36186 940.63154 1322.0219 1385.3169 0.1142325 0.1605494 0.1682361
    logFC.d1-d0 logFC.d3-d0 logFC.d6-d0 pval.d1-d0 pval.d3-d0 pval.d6-d0 qval.d1-d0
    1 2.0238644 2.0217923 2.1673473 5.947051e-07 6.311527e-07 7.022009e-08 1.547561e-06
    2 1.4100197 1.6595900 1.7222084 4.491603e-06 5.141473e-08 1.880176e-08 1.079539e-05
    3 -1.5092095 -1.0186965 -0.8874725 1.772232e-15 5.512175e-08 1.864615e-06 8.577393e-15
    4 -0.3522332 -0.4340859 -0.3444813 6.577599e-02 2.044133e-02 6.936071e-02 9.494206e-02
    5 -3.1299552 -2.6389108 -2.5714408 4.471197e-59 4.230865e-44 5.274404e-42 1.569486e-57
    qval.d3-d0 qval.d6-d0
    1 1.815415e-06 2.052575e-07
    2 1.632608e-07 5.782136e-08
    3 1.746078e-07 4.783620e-06
    4 3.341204e-02 9.971601e-02
    5 9.841654e-43 1.024172e-40


    Excel file with the needed info, from an article:

    ORF code    Alias Code  Predicted function
    An01g00010  An00g03020  hypothetical protein [truncated ORF]
    An01g00020  An00g13235  weak similarity to nucleotide binding protein phnN - Escherichia coli
    An01g00030  An00g08601  strong similarity to Hgh1 - Saccharomyces cerevisiae
    An01g00040  An00g07028  strong similarity to alpha subunit of transcription initiation factor TFIIF Tfg1 - Saccharomyces cerevisiae [truncated ORF]
    An01g00050  An00g04204  similarity to fatty-acyl-CoA synthase beta chain Fas1 - Saccharomyces cerevisiae [truncated ORF]


    Thank you!

  • #2
    Presumably you have access to information relating your ORFs to their predicted functions (otherwise, you get to make such a file). You can then load that and use "match()" to match the IDs and descriptions. Technically, you match the IDs in both tables, and the result is a vector of indices that you can use to reorder the descriptions so that they line up with your ORFs.
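
To make the index vector concrete, here is a minimal sketch on toy data. The names `ids`, `anno`, and `aligned` are placeholders for illustration, and the ID An01g00999 is invented to show what happens when an ID has no annotation.

```r
# Toy stand-ins for the two tables (IDs and alias codes copied from the first rows of the thread)
ids <- c("An01g00030", "An01g00010", "An01g00999")  # An01g00999 is made up, to show a miss
anno <- data.frame(
  ORF   = c("An01g00010", "An01g00020", "An01g00030"),
  Alias = c("An00g03020", "An00g13235", "An00g08601"),
  stringsAsFactors = FALSE
)

# match() returns, for each element of ids, its position in anno$ORF (NA if absent)
m <- match(ids, anno$ORF)
print(m)

# Indexing with m reorders the annotation so that it lines up with ids
aligned <- anno$Alias[m]
print(aligned)
```

Because `m` has one entry per element of `ids`, the reordered column always has the same length and order as the table being annotated.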



    • #3
      Thanks, I think I understand what you are saying, but I already have a file that relates the ORF to the function (see the Excel file above). I need to link that file to the ORFs in the other file, but I don't really know much about the match function. I guess it is in the base package?



      • #4
        Ah, I guess I hadn't fully realised that that was an Excel file that you already had. Assuming the Excel file has been saved as a TSV (tab-separated values) file called "foo.tsv":

        Code:
        anno <- read.delim("foo.tsv", header=TRUE) # spaces in header names become dots
        m <- match(res.combined$ID, anno$ORF.code) # row of anno for each ID, NA if absent
        res.combined$AliasCode <- anno$Alias.Code[m] #I think it'll be called that
        res.combined$PredFunc <- anno$Predicted.function[m] #Or whatever it gets called; check names(anno)
        Something like that should work. The match() function is pretty handy.
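
A quick sanity check after matching is to see which IDs found no annotation row. The sketch below uses toy data frames. `res` stands in for `res.combined`, the column names assume what `read.delim()` would produce from the header shown earlier in the thread, and An01g00999 is an invented ID.

```r
# Toy stand-ins; res plays the role of res.combined
res <- data.frame(ID = c("An01g00010", "An01g00020", "An01g00999"),  # last ID is invented
                  stringsAsFactors = FALSE)
anno <- data.frame(ORF.code   = c("An01g00020", "An01g00010"),
                   Alias.Code = c("An00g13235", "An00g03020"),
                   stringsAsFactors = FALSE)

# Confirm the column names before using them
print(names(anno))

m <- match(res$ID, anno$ORF.code)
res$AliasCode <- anno$Alias.Code[m]

# IDs without an annotation row get NA; list them instead of losing them silently
unmatched <- res$ID[is.na(m)]
cat(length(unmatched), "of", nrow(res), "IDs had no annotation\n")
print(res)
```

If many IDs come back unmatched, the usual suspects are stray whitespace or a different ID column than expected.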



        • #5
          I think that should work, yes, thanks. But I can't get hold of that file. It is an xls file, but I can't save it as tab-delimited or download it as anything other than a worksheet. I also tried to make my own file.

          You can find the file here. http://www.biomedcentral.com/1471-21...380/additional
          additional file 1



          • #6
            I've attached the information from supplemental figure 1 as a tab separated value text file (it's gzipped so the forum will allow it).
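
A gzipped TSV can be read in R without unpacking it first. The sketch below writes a small gzipped file to a temporary path so that it is self-contained. The path and the two data rows are stand-ins, not the actual attachment.

```r
# Build a small gzipped TSV in a temporary location (stand-in for the attachment)
path <- tempfile(fileext = ".tsv.gz")
con <- gzfile(path, "w")
writeLines(c("ORF code\tAlias Code\tPredicted function",
             "An01g00010\tAn00g03020\thypothetical protein [truncated ORF]",
             "An01g00030\tAn00g08601\tstrong similarity to Hgh1 - Saccharomyces cerevisiae"),
           con)
close(con)

# Passing a gzfile() connection makes the decompression explicit
anno <- read.delim(gzfile(path), header = TRUE, stringsAsFactors = FALSE)

# With the default check.names = TRUE, spaces in the header become dots
print(names(anno))
```

The printed names are the ones to use with `$` or in `match()` afterwards.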


            • #7
              Let's assume that the table with the annotation is called "annot" and that its column with the An01g00010 entries is also called "ID". Then you want:
              Code:
              New <- merge(res.combined,annot,by="ID",all.x=TRUE,sort=FALSE)
              edit:
              Oh, I see it is called "ORF code"
              Code:
              New <- merge(res.combined,annot,by.x="ID",by.y="ORF code",all.x=TRUE,sort=FALSE) # by.y must match names(annot); read.delim() defaults would give "ORF.code"
              Last edited by Jeremy; 04-02-2014, 07:33 PM.
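
As a rough illustration of the merge() approach, here is a sketch on toy data. `res` stands in for `res.combined`, the logFC values are rounded toy numbers, An01g00999 is an invented ID, and the annotation headers are kept literal, as they would be if the file were read with `check.names = FALSE`.

```r
# Toy stand-ins; res plays the role of res.combined
res <- data.frame(ID    = c("An01g00030", "An01g00010", "An01g00999"),  # last ID is invented
                  logFC = c(-1.51, 2.02, 0.10),                         # rounded toy values
                  stringsAsFactors = FALSE)
annot <- data.frame(c("An01g00010", "An01g00030"),
                    c("An00g03020", "An00g08601"),
                    stringsAsFactors = FALSE)
names(annot) <- c("ORF code", "Alias Code")  # literal headers, as with check.names = FALSE

# all.x = TRUE keeps every row of res, filling NA where no annotation exists
New <- merge(res, annot, by.x = "ID", by.y = "ORF code", all.x = TRUE, sort = FALSE)
print(New)
```

One difference from match() is worth knowing: with `sort = FALSE`, merge() leaves the rows in an unspecified order, so the result is not guaranteed to follow the original row order of the results table.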
