
  • annotate ID after using DESeq

    Hi everyone!

    I have my RNA-seq topTable below. The ID column holds the ORF code, and I would like to add columns with the alias code and predicted function for each ORF. Does anyone know the best way to do this?

    > write.csv( res.combined, file="rnaseq.topTable.csv" )
    > res.combined[1:5,]
    ID mean.d0 mean.d1 mean.d3 mean.d6 FC.d1-d0 FC.d3-d0 FC.d6-d0
    1 An01g00010 22.56114 91.74978 91.6181 101.3439 4.0667164 4.0608797 4.4919670
    2 An01g00020 59.12810 157.12747 186.8015 195.0879 2.6574079 3.1592673 3.2994108
    3 An01g00030 2496.49429 877.02756 1232.1749 1349.5057 0.3513037 0.4935621 0.5405603
    4 An01g00040 1697.34332 1329.64879 1256.3100 1336.8125 0.7833706 0.7401626 0.7875911
    5 An01g00050 8234.36186 940.63154 1322.0219 1385.3169 0.1142325 0.1605494 0.1682361
    logFC.d1-d0 logFC.d3-d0 logFC.d6-d0 pval.d1-d0 pval.d3-d0 pval.d6-d0 qval.d1-d0
    1 2.0238644 2.0217923 2.1673473 5.947051e-07 6.311527e-07 7.022009e-08 1.547561e-06
    2 1.4100197 1.6595900 1.7222084 4.491603e-06 5.141473e-08 1.880176e-08 1.079539e-05
    3 -1.5092095 -1.0186965 -0.8874725 1.772232e-15 5.512175e-08 1.864615e-06 8.577393e-15
    4 -0.3522332 -0.4340859 -0.3444813 6.577599e-02 2.044133e-02 6.936071e-02 9.494206e-02
    5 -3.1299552 -2.6389108 -2.5714408 4.471197e-59 4.230865e-44 5.274404e-42 1.569486e-57
    qval.d3-d0 qval.d6-d0
    1 1.815415e-06 2.052575e-07
    2 1.632608e-07 5.782136e-08
    3 1.746078e-07 4.783620e-06
    4 3.341204e-02 9.971601e-02
    5 9.841654e-43 1.024172e-40


    Excel file with needed info from an article:

    ORF code Alias Code Predicted function
    An01g00010 An00g03020 hypothetical protein [truncated ORF]
    An01g00020 An00g13235 weak similarity to nucleotide binding protein phnN - Escherichia coli
    An01g00030 An00g08601 strong similarity to Hgh1 - Saccharomyces cerevisiae
    An01g00040 An00g07028 strong similarity to alpha subunit of transcription initiation factor TFIIF Tfg1 - Saccharomyces cerevisiae [truncated ORF]
    An01g00050 An00g04204 similarity to fatty-acyl-CoA synthase beta chain Fas1 - Saccharomyces cerevisiae [truncated ORF]


    Thank you!

  • #2
    Presumably you have access to information relating your ORFs to their predicted functions (otherwise, you get to make such a file). You can then load that and use match() to line up the IDs and descriptions. Technically, you match the IDs in the two tables, and the result is a vector of indices that you can use to re-sort the descriptions so that they match your ORFs.
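
As a minimal sketch of what match() returns, assuming toy data: the IDs and alias codes below are copied from the question, while the column names `ORF` and `Alias` and the ID `An99g99999` are made up for illustration.

```r
# Toy data: IDs copied from the thread; the annotation rows are deliberately shuffled.
ids  <- c("An01g00010", "An01g00020", "An01g00030", "An99g99999")  # last ID is made up
anno <- data.frame(
  ORF   = c("An01g00030", "An01g00010", "An01g00020"),
  Alias = c("An00g08601", "An00g03020", "An00g13235"),
  stringsAsFactors = FALSE
)

# For each element of ids, match() returns its row position in anno$ORF (NA if absent).
m <- match(ids, anno$ORF)
print(m)

# Indexing with m re-sorts the annotation into the order of ids.
alias <- anno$Alias[m]
print(data.frame(ID = ids, Alias = alias))
```

An ID with no annotation simply gets NA, so the result always has one entry per ID, in the original order.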

    • #3
      Thanks, I think I understand what you are saying, but I already have a file that relates the ORFs to their functions (see the Excel file above). I need to link that file to the ORFs in the other file, but I don't really know much about the match function. I guess it is in the base package?

      • #4
        Ah, I guess I hadn't fully realised that that was an Excel file that you already had. Assuming the Excel file has been saved as a TSV (tab-separated values) file called "foo.tsv":

        Code:
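        anno <- read.delim("foo.tsv", header=TRUE)
        # For each ID, find the row of anno holding the same ORF code (NA if there is none)
        m <- match(res.combined$ID, anno$ORF.code) #read.delim turns "ORF code" into ORF.code
        res.combined$AliasCode <- anno$Alias.Code[m] #I think it'll be called that
        res.combined$PredFunc <- anno$Predicted.function[m] #Or whatever it gets called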
        Something like that should work. The match() function is pretty handy.

        • #5
          I think that should work, yes, thanks. But I can't get hold of that file. It is an xls file, but I can't save it as tab-delimited or download it as anything other than a worksheet. I also tried to make my own file.

          You can find the file here. http://www.biomedcentral.com/1471-21...380/additional
          additional file 1

          • #6
            I've attached the information from supplemental figure 1 as a tab separated value text file (it's gzipped so the forum will allow it).
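
A gzipped, tab-separated file like this can be read without unpacking it first. The sketch below is only an illustration: it writes a small stand-in file to a temporary path (the real attachment's name and full contents are not shown in the thread) and reads it back, which also shows how the default `check.names = TRUE` rewrites the headers.

```r
# Hypothetical stand-in for the attachment: a tiny gzipped TSV with the thread's header.
path <- tempfile(fileext = ".tsv.gz")
con  <- gzfile(path, "w")
writeLines(c(
  "ORF code\tAlias Code\tPredicted function",
  "An01g00010\tAn00g03020\thypothetical protein [truncated ORF]",
  "An01g00020\tAn00g13235\tweak similarity to nucleotide binding protein phnN - Escherichia coli"
), con)
close(con)

# read.delim() opens gzip-compressed files directly.
anno <- read.delim(path, stringsAsFactors = FALSE)
print(names(anno))      # spaces in the headers are replaced by dots

# check.names = FALSE keeps the headers exactly as written in the file.
anno_raw <- read.delim(path, stringsAsFactors = FALSE, check.names = FALSE)
print(names(anno_raw))
```

Which form of the column names you end up with decides whether later code should refer to `ORF.code` or to "ORF code".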
            • #7
              Let's assume that the table with the annotation is called "annot" and that its column with the An01g00010 entries is also called "ID". Then you want:
              Code:
              New <- merge(res.combined,annot,by="ID",all.x=T,sort=F)
              edit:
              Oh I see it is called "ORF code"
              Code:
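              #by.y="ORF code" needs annot read with check.names=FALSE; with the read.delim default use by.y="ORF.code"
              New <- merge(res.combined,annot,by.x="ID",by.y="ORF code",all.x=T,sort=F)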
              Last edited by Jeremy; 04-02-2014, 07:33 PM.
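
For comparison, here is a rough sketch of the merge() and match() approaches side by side on toy data. The `mean.d0` values are rounded from the question, `An99g99999` is a made-up ID with no annotation, and the column names `ORF.code` and `Alias.Code` assume the annotation was read with the default `check.names = TRUE`.

```r
# Toy stand-ins: three IDs from the thread plus one made-up ID with no annotation.
res.combined <- data.frame(
  ID      = c("An01g00030", "An99g99999", "An01g00010", "An01g00020"),
  mean.d0 = c(2496.5, 1.0, 22.6, 59.1),
  stringsAsFactors = FALSE
)
annot <- data.frame(
  ORF.code   = c("An01g00010", "An01g00020", "An01g00030"),
  Alias.Code = c("An00g03020", "An00g13235", "An00g08601"),
  stringsAsFactors = FALSE
)

# merge(): all.x = TRUE keeps every row of res.combined; row order is not guaranteed.
New <- merge(res.combined, annot, by.x = "ID", by.y = "ORF.code",
             all.x = TRUE, sort = FALSE)

# match(): adds the column in place, so the original row order is preserved.
res.combined$Alias.Code <- annot$Alias.Code[match(res.combined$ID, annot$ORF.code)]

print(New)
print(res.combined)
```

Both routes give NA for the unannotated ID. The practical difference is that merge() returns a new data frame whose rows may be reordered, while the match() route leaves the rows of res.combined where they were.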
