
  • annotate ID after using DESeq

    Hi everyone!

    I have my RNA-seq topTable below, and I would like to add columns with an alias code and a predicted function. The ID in the topTable is the ORF code. Does anyone know the best way of doing this?

    > write.csv( res.combined, file="rnaseq.topTable.csv" )
    > res.combined[1:5,]
    ID mean.d0 mean.d1 mean.d3 mean.d6 FC.d1-d0 FC.d3-d0 FC.d6-d0
    1 An01g00010 22.56114 91.74978 91.6181 101.3439 4.0667164 4.0608797 4.4919670
    2 An01g00020 59.12810 157.12747 186.8015 195.0879 2.6574079 3.1592673 3.2994108
    3 An01g00030 2496.49429 877.02756 1232.1749 1349.5057 0.3513037 0.4935621 0.5405603
    4 An01g00040 1697.34332 1329.64879 1256.3100 1336.8125 0.7833706 0.7401626 0.7875911
    5 An01g00050 8234.36186 940.63154 1322.0219 1385.3169 0.1142325 0.1605494 0.1682361
    logFC.d1-d0 logFC.d3-d0 logFC.d6-d0 pval.d1-d0 pval.d3-d0 pval.d6-d0 qval.d1-d0
    1 2.0238644 2.0217923 2.1673473 5.947051e-07 6.311527e-07 7.022009e-08 1.547561e-06
    2 1.4100197 1.6595900 1.7222084 4.491603e-06 5.141473e-08 1.880176e-08 1.079539e-05
    3 -1.5092095 -1.0186965 -0.8874725 1.772232e-15 5.512175e-08 1.864615e-06 8.577393e-15
    4 -0.3522332 -0.4340859 -0.3444813 6.577599e-02 2.044133e-02 6.936071e-02 9.494206e-02
    5 -3.1299552 -2.6389108 -2.5714408 4.471197e-59 4.230865e-44 5.274404e-42 1.569486e-57
    qval.d3-d0 qval.d6-d0
    1 1.815415e-06 2.052575e-07
    2 1.632608e-07 5.782136e-08
    3 1.746078e-07 4.783620e-06
    4 3.341204e-02 9.971601e-02
    5 9.841654e-43 1.024172e-40


    Excel file with needed info from an article:

    ORF code Alias Code Predicted function
    An01g00010 An00g03020 hypothetical protein [truncated ORF]
    An01g00020 An00g13235 weak similarity to nucleotide binding protein phnN - Escherichia coli
    An01g00030 An00g08601 strong similarity to Hgh1 - Saccharomyces cerevisiae
    An01g00040 An00g07028 strong similarity to alpha subunit of transcription initiation factor TFIIF Tfg1 - Saccharomyces cerevisiae [truncated ORF]
    An01g00050 An00g04204 similarity to fatty-acyl-CoA synthase beta chain Fas1 - Saccharomyces cerevisiae [truncated ORF]


    Thank you!

  • #2
    Presumably you have access to information relating your ORFs to their predicted functions (otherwise, you get to make such a file). You can then load that and use "match()" to match the IDs and descriptions (technically, you match the IDs in both and the result is a vector of indices that you can use to resort the descriptions to then match your ORFs).
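A minimal sketch of how match() lines up two ID vectors, using toy stand-ins rather than the real tables. The data frames `res` and `anno` and the ID An01g00999 are hypothetical, included only to show what happens when an ID has no annotation.

```r
# Toy stand-ins for the real tables (An01g00999 is a made-up, unannotated ID).
res <- data.frame(ID = c("An01g00020", "An01g00010", "An01g00999"),
                  stringsAsFactors = FALSE)
anno <- data.frame(ORF = c("An01g00010", "An01g00020"),
                   Alias = c("An00g03020", "An00g13235"),
                   stringsAsFactors = FALSE)

# For each ID in res, the row number in anno that holds it (NA if absent).
m <- match(res$ID, anno$ORF)

# Indexing with m re-sorts the annotation so it lines up with res row by row.
res$Alias <- anno$Alias[m]
print(res)
```

The key point is that the index vector has one entry per row of the results table, so the annotation can be in any order and can be missing entries.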


    • #3
      Thanks, I think I understand what you are saying, but I already have a file that relates the ORF to the function (see the Excel file above). I need to link that file to the ORFs in the other file. But I don't really know much about the match function. I guess I use the base package?


      • #4
        Ah, I guess I hadn't fully realised that that was an Excel file that you already had. Assuming the excel file has been saved as a tsv (tab separated values) file called "foo.tsv":

        Code:
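        anno <- read.delim("foo.tsv", header=TRUE)
        # For each ID in res.combined, find the matching row of anno
        m <- match(res.combined$ID, anno$ORF.code) #read.delim should turn "ORF code" into ORF.code
        res.combined$AliasCode <- anno$Alias.Code[m] #I think it'll be called that
        res.combined$PredFunc <- anno$Predicted.function[m] #Or whatever it gets called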
        Something like that should work. The match() function is pretty handy.


        • #5
          I think that should work, yes, thanks. But I can't get hold of that file. It is an xls file, but I can't save it as tab-delimited or download it as anything other than a worksheet. I also tried to make my own file...

          You can find the file here. http://www.biomedcentral.com/1471-21...380/additional
          additional file 1


          • #6
            I've attached the information from supplemental figure 1 as a tab separated value text file (it's gzipped so the forum will allow it).

            • #7
              Let's assume that the table with the annotation is called "annot" and that its column with the An01g00010 entries is also called "ID". Then you want:
              Code:
              New <- merge(res.combined,annot,by="ID",all.x=T,sort=F)
              edit:
              Oh I see it is called "ORF code"
              Code:
              New <- merge(res.combined,annot,by.x="ID",by.y="ORF code",all.x=T,sort=F)
              Last edited by Jeremy; 04-02-2014, 07:33 PM.
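A small sketch of the merge() approach on toy data, assuming the annotation really does have a column literally named "ORF code". The data frames, the logFC values, and the ID An01g00999 are hypothetical stand-ins. If the annotation was read with read.delim() and its default check.names=TRUE, the column would instead be named "ORF.code", and by.y would need to be changed accordingly.

```r
# Toy results table; An01g00999 is a made-up ID with no annotation.
res <- data.frame(ID = c("An01g00020", "An01g00010", "An01g00999"),
                  logFC = c(1.41, 2.02, 0),
                  stringsAsFactors = FALSE)

# Toy annotation table with the header names used in the spreadsheet.
annot <- data.frame(c("An01g00010", "An01g00020"),
                    c("An00g03020", "An00g13235"),
                    stringsAsFactors = FALSE)
names(annot) <- c("ORF code", "Alias Code")

# all.x = TRUE keeps every row of res, filling NA where no annotation exists.
New <- merge(res, annot, by.x = "ID", by.y = "ORF code",
             all.x = TRUE, sort = FALSE)
print(New)
```

With all.x=TRUE, unannotated genes stay in the result instead of being dropped silently, which is usually what you want for a results table. The row order of the merged table is not guaranteed to match the input, so look rows up by ID rather than by position.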
