  • kelseyca
    Member
    • May 2013
    • 12

    Help with cuffdiff and cummeRbund?

    Hi all!

    Sorry to bother you with a simple question. I have read through all the cummeRbund posts and tutorials but I seem to be stuck right at the start!

    I have run RNA-seq analyses on Galaxy online: tophat, cufflinks, cuffmerge, and cuffdiff. I would now like to visualize the results in cummeRbund. I downloaded the cuffdiff files (11 each for two groups) from Galaxy and they are in .tabular format. I installed cummeRbund and ran the following. It does not work. Could the issue be that the files should be in .db format? I don't know where the cuffData.db file came from; it appeared before I had even downloaded the cuffdiff files.

    > source("http://bioconductor.org/biocLite.R")
    > biocLite("cummeRbund")
    > getwd()
    > setwd("C:/Users/caetano1/Downloads/SEDENTARYDFF")
    > list.files()
    > library(cummeRbund)
    > cuff= readCufflinks (dbFile = "cuffData.db",
    + geneFPKM = "CuffdiffSEDENTARY__gene_FPKM_tracking.tabular",
    + geneDiff = "CuffdiffSEDENTARY__gene_differential_expression_testing.tabular",
    + isoformFPKM = "CuffdiffSEDENTARY__transcript_FPKM_tracking.tabular",
    + isoformDiff = "CuffdiffSEDENTARY__transcript_differential_expression_testing.tabular",
    + TSSFPKM = "CuffdiffSEDENTARY__TSS_groups_FPKM_tracking.tabular",
    + TSSDiff = "CuffdiffSEDENTARY__TSS_groups_differential_expression_testing.tabular",
    + CDSFPKM = "CuffdiffSEDENTARY__CDS_FPKM_tracking.tabular",
    + CDSExpDiff = "CuffdiffSEDENTARY__CDS_FPKM_differential_expression_testing.tabular",
    + CDSDiff = "CuffdiffSEDENTARY__CDS_overloading_diffential_expression_testing.tabular",
    + promoterFile = "CuffdiffSEDENTARY__promoters_differential_expression_testing.tabular",
    + splicingFile = "CuffdiffSEDENTARY__splicing_differential_expression_testing.tabular",
    + rebuild = T)

    I think I'm missing something really obvious here!

    Thank you so much!

    Kelsey
  • muthu545
    Member
    • Jul 2011
    • 32

    #2
    Hi kelseyca,

    cuffData.db is the database file that cummeRbund creates to store all the cuffdiff results in an easy-to-access format for its R commands.

    So if you run the readCufflinks(dbFile = "cuffData.db", ...) command before copying the cuffdiff files into the directory, a default cuffData.db file will still be created.

    Hope this helps

    Thanks
    --
    Muthu


    • kelseyca
      Member
      • May 2013
      • 12

      #3
      Muthu,

      Thanks for your reply! So, how can I get R to read my cuffdiff files? Are they in the wrong format?

      Kelsey


      • sazz
        Member
        • Oct 2012
        • 28

        #4
        Output files should look like this:

        genes.read_group_tracking
        genes.fpkm_tracking
        genes.count_tracking
        gene_exp.diff

        I guess you should delete the ".tabular" part and rename the files to match these names.
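
The renaming suggested above could be scripted. This is only a minimal sketch in base R: the mapping from Galaxy download suffixes to default cuffdiff file names is an assumption based on the file names listed in this thread, and `galaxy_to_cuffdiff` and `copy_galaxy_outputs` are hypothetical names, not part of cummeRbund. Please check the mapping against your own downloads before relying on it.

```r
# Assumed mapping from the Galaxy download suffixes seen in this thread
# to the default cuffdiff output names (verify against your own files).
galaxy_to_cuffdiff <- c(
  "gene_FPKM_tracking"                            = "genes.fpkm_tracking",
  "transcript_FPKM_tracking"                      = "isoforms.fpkm_tracking",
  "TSS_groups_FPKM_tracking"                      = "tss_groups.fpkm_tracking",
  "CDS_FPKM_tracking"                             = "cds.fpkm_tracking",
  "gene_differential_expression_testing"          = "gene_exp.diff",
  "transcript_differential_expression_testing"    = "isoform_exp.diff",
  "TSS_groups_differential_expression_testing"    = "tss_group_exp.diff",
  "CDS_FPKM_differential_expression_testing"      = "cds_exp.diff",
  "CDS_overloading_diffential_expression_testing" = "cds.diff",
  "promoters_differential_expression_testing"     = "promoters.diff",
  "splicing_differential_expression_testing"      = "splicing.diff"
)

# Copy each "<prefix>__<suffix>.tabular" file in `src` to `dest` under its
# cuffdiff-style name. Files are copied, not renamed, so the downloads stay intact.
copy_galaxy_outputs <- function(src, dest, mapping = galaxy_to_cuffdiff) {
  dir.create(dest, showWarnings = FALSE, recursive = TRUE)
  files <- list.files(src, pattern = "\\.tabular$")
  # Drop everything up to the double underscore, then the .tabular extension.
  suffix <- sub("\\.tabular$", "", sub("^.*__", "", files))
  known <- suffix %in% names(mapping)
  targets <- unname(mapping[suffix[known]])
  ok <- file.copy(file.path(src, files[known]), file.path(dest, targets),
                  overwrite = TRUE)
  # Report what was copied so unmatched files are easy to spot.
  data.frame(from = files[known], to = targets, copied = ok,
             stringsAsFactors = FALSE)
}
```

Files whose suffix is not in the mapping are skipped rather than guessed at, so the returned table shows exactly which downloads were recognized.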


        • muthu545
          Member
          • Jul 2011
          • 32

          #5
          Kelsey,

          As Sazz mentioned, the output files from cuffdiff do not have the .tabular extension.
          Please check your cuffdiff output files: if their names do not match the names given in the readCufflinks command, R will not recognize them.

          The simplest approach is to copy all the cuffdiff output files into one directory and run the following command.

          cuff= readCufflinks (dbFile = "cuffData.db",dir="C:/Users/caetano1/Downloads/SEDENTARYDFF",
          gtfFile='DIRPATH/gtffile', genome='genomename',rebuild = T)

          This command recognizes all the files required to build the database. You need not specify them individually.

          A GTF file is needed for some visualization commands in cummeRbund.
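
Before running a directory-based load like the one above, it may help to check which inputs are actually present. The sketch below uses only base R; the list of default cuffdiff file names is an assumption (newer cuffdiff versions also write count and read-group files), and `expected_files` and `missing_inputs` are hypothetical helper names.

```r
# Default cuffdiff output names assumed for a directory-based load.
expected_files <- c(
  "genes.fpkm_tracking", "isoforms.fpkm_tracking",
  "tss_groups.fpkm_tracking", "cds.fpkm_tracking",
  "gene_exp.diff", "isoform_exp.diff", "tss_group_exp.diff", "cds_exp.diff",
  "promoters.diff", "splicing.diff", "cds.diff"
)

# Return the paths of expected files (and the GTF, if one is given)
# that do not exist on disk.
missing_inputs <- function(dir, gtf = NULL, expected = expected_files) {
  paths <- file.path(dir, expected)
  if (!is.null(gtf)) paths <- c(paths, gtf)
  paths[!file.exists(paths)]
}

# Example call with the paths used in this thread:
# missing_inputs("C:/Users/caetano1/Downloads/SEDENTARYDFF",
#                gtf = "C:/Users/caetano1/Downloads/SEDENTARYDFF/mm10.gtf")
```

An empty result means every expected file was found; anything listed is a file to rename or locate before rebuilding the database.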

          Hope this is helpful

          Thanks
          --
          Muthu


          • kelseyca
            Member
            • May 2013
            • 12

            #6
            Hi Muthu,

            One last question. Sorry if I am missing something very obvious here and wasting your time. Thank you so much for being so patient and for all of your help.

            I cannot figure out how to export the cuffdiff files from Galaxy online in any format other than .tabular. I am just clicking "download" under the cuffdiff run. All the manuals and FAQs I have been reading are about running the Tuxedo suite offline.

            Also, R cannot find the function "readCufflinks".

            Kelsey


            • kelseyca
              Member
              • May 2013
              • 12

              #7
              > cuff= readCufflinks (dbFile = "cuffData.db",dir="C:/Users/caetano1/Downloads/SEDENTARYDFF",
              + gtfFile='DIRPATH/gtffile', genome='genomename',rebuild = T)
              Creating database C:/Users/caetano1/Downloads/SEDENTARYDFF/cuffData.db
              Reading GTF file
              Error in import(FileForFormat(con), ...) :
              error in evaluating the argument 'con' in selecting a method for function 'import': Error in FileForFormat(con) : Format 'DIRPATH/gtffile' unsupported
              >


              • muthu545
                Member
                • Jul 2011
                • 32

                #8
                Hi Kelsey,

                Not a problem.

                If that's the case (Galaxy's output is .tabular), then you could rename the files after you download them to remove the .tabular extension.

                If R cannot find the function 'readCufflinks', it means the 'cummeRbund' library is not loaded in the current R session.
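
A quick way to tell "not installed" apart from "installed but not loaded" is sketched below in base R. `is_installed` and `is_attached` are hypothetical helper names, not cummeRbund functions.

```r
# TRUE when the package is installed in one of the current library paths.
# system.file() returns "" for a package that cannot be found.
is_installed <- function(pkg) {
  nzchar(system.file(package = pkg))
}

# TRUE when the package is attached to the search path, i.e. library()
# has already been called for it in this session.
is_attached <- function(pkg) {
  paste0("package:", pkg) %in% search()
}

# Example for this thread:
# is_installed("cummeRbund")  # FALSE means the install step did not succeed
# is_attached("cummeRbund")   # FALSE means library(cummeRbund) is still needed
```

If the package is installed but not attached, calling library(cummeRbund) in the same session should make readCufflinks available.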

                Thanks
                --
                Muthu


                • muthu545
                  Member
                  • Jul 2011
                  • 32

                  #9
                  Kelsey,

                  Right now it's throwing an error because it cannot find the directory 'DIRPATH' or the GTF file.

                  I wrote 'DIRPATH' as a placeholder for the directory that holds the .gtf file you used to run cufflinks.
                  You could copy the XXX.gtf file to the same working directory, 'C:/Users/caetano1/Downloads/SEDENTARYDFF', then replace 'DIRPATH/gtffile' in the command with 'C:/Users/caetano1/Downloads/SEDENTARYDFF/XXX.gtf', and replace 'genomename' with the name of the genome you are working with, e.g. 'hg19', 'hg18', 'pt03', 'mm9', 'mm10', etc.

                  Your readCufflinks command should then run without any error.

                  thanks
                  --
                  Muthu


                  • kelseyca
                    Member
                    • May 2013
                    • 12

                    #10
                    > source("http://bioconductor.org/biocLite.R")
                    Bioconductor version 2.12 (BiocInstaller 1.10.2), ?biocLite for help
                    > biocLite("cummeRbund")
                    BioC_mirror: http://bioconductor.org
                    Using Bioconductor version 2.12 (BiocInstaller 1.10.2), R version 3.0.1.
                    Installing package(s) 'cummeRbund'
                    trying URL 'http://bioconductor.org/packages/2.12/bioc/bin/windows/contrib/3.0/cummeRbund_2.2.0.zip'
                    Content type 'application/zip' length 2600163 bytes (2.5 Mb)
                    opened URL
                    downloaded 2.5 Mb

                    package ‘cummeRbund’ successfully unpacked and MD5 sums checked

                    The downloaded binary packages are in
                    C:\Users\caetano1\AppData\Local\Temp\RtmpQTqdVW\downloaded_packages
                    Warning message:
                    installed directory not writable, cannot update packages 'class', 'foreign',
                    'MASS', 'mgcv', 'nnet', 'spatial'
                    > getwd()
                    [1] "\\\\ansci-alpha/Homes/Grads/caetano1/Documents"
                    > setwd("C:/Users/caetano1/Downloads/SEDENTARYDFF")
                    > list.files()
                    [1] "cuffData.db"
                    [2] "CuffdiffSEDENTARY__CDS_FPKM_differential_expression_testing.tabular"
                    [3] "CuffdiffSEDENTARY__CDS_FPKM_tracking.tabular"
                    [4] "CuffdiffSEDENTARY__CDS_overloading_diffential_expression_testing.tabular"
                    [5] "CuffdiffSEDENTARY__gene_differential_expression_testing.tabular"
                    [6] "CuffdiffSEDENTARY__gene_FPKM_tracking.tabular"
                    [7] "CuffdiffSEDENTARY__promoters_differential_expression_testing.tabular"
                    [8] "CuffdiffSEDENTARY__splicing_differential_expression_testing.tabular"
                    [9] "CuffdiffSEDENTARY__transcript_differential_expression_testing.tabular"
                    [10] "CuffdiffSEDENTARY__transcript_FPKM_tracking.tabular"
                    [11] "CuffdiffSEDENTARY__TSS_groups_differential_expression_testing.tabular"
                    [12] "CuffdiffSEDENTARY__TSS_groups_FPKM_tracking.tabular"
                    [13] "mm10.gtf"
                    > library(cummeRbund)
                    Loading required package: BiocGenerics
                    Loading required package: parallel

                    Attaching package: ‘BiocGenerics’

                    The following objects are masked from ‘package:parallel’:

                    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
                    clusterExport, clusterMap, parApply, parCapply, parLapply,
                    parLapplyLB, parRapply, parSapply, parSapplyLB

                    The following object is masked from ‘package:stats’:

                    xtabs

                    The following objects are masked from ‘package:base’:

                    anyDuplicated, as.data.frame, cbind, colnames, duplicated, eval,
                    Filter, Find, get, intersect, lapply, Map, mapply, match, mget,
                    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
                    rbind, Reduce, rep.int, rownames, sapply, setdiff, sort, table,
                    tapply, union, unique, unlist

                    Loading required package: RSQLite
                    Loading required package: DBI
                    Loading required package: ggplot2
                    Loading required package: reshape2
                    Loading required package: fastcluster

                    Attaching package: ‘fastcluster’

                    The following object is masked from ‘package:stats’:

                    hclust

                    Loading required package: rtracklayer
                    Loading required package: GenomicRanges
                    Loading required package: IRanges
                    Loading required package: Gviz
                    Loading required package: grid

                    Attaching package: ‘cummeRbund’

                    The following object is masked from ‘package:GenomicRanges’:

                    promoters

                    The following object is masked from ‘package:IRanges’:

                    promoters

                    > cuff= readCufflinks (dbFile = "cuffData.db",dir="C:/Users/caetano1/Downloads/SEDENTARYDFF",
                    + gtfFile="C:/Users/caetano1/Downloads/SEDENTARYDFF/mm10.gtf", genome='mm10',rebuild = T)
                    Creating database C:/Users/caetano1/Downloads/SEDENTARYDFF/cuffData.db
                    Reading GTF file
                    Error in .parse_attrCol(attrCol, file, colnames) :
                    Some attributes do not conform to 'tag value' format
                    >
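
One way to investigate the "tag value" error above is to scan the ninth (attribute) column of the GTF for entries that are not written as `tag value;` pairs. The sketch below is a rough approximation in base R, not rtracklayer's actual check, and `bad_gtf_attributes` is a hypothetical helper name. It splits attributes on every semicolon, so a semicolon inside a quoted value would be misreported. One possible cause it can reveal is a file that uses GFF3-style `tag=value` attributes despite the .gtf extension.

```r
# List lines whose attribute column contains a piece that does not look
# like "tag value". Comment lines (starting with "#") and blank lines are skipped.
bad_gtf_attributes <- function(path, n = -1L) {
  lines <- readLines(path, n = n, warn = FALSE)
  keep <- which(nzchar(lines) & !startsWith(lines, "#"))
  fields <- strsplit(lines[keep], "\t", fixed = TRUE)
  conforms <- vapply(fields, function(f) {
    # A GTF record needs nine tab-separated columns.
    if (length(f) < 9L) return(FALSE)
    pieces <- trimws(strsplit(f[9L], ";", fixed = TRUE)[[1L]])
    pieces <- pieces[nzchar(pieces)]
    # Each piece must be a tag, whitespace, then a non-empty value.
    length(pieces) > 0L && all(grepl("^\\S+\\s+\\S", pieces, perl = TRUE))
  }, logical(1))
  data.frame(line = keep[!conforms], text = lines[keep][!conforms],
             stringsAsFactors = FALSE)
}

# Example: inspect only the first 1000 lines of the file used in this thread.
# bad_gtf_attributes("C:/Users/caetano1/Downloads/SEDENTARYDFF/mm10.gtf", n = 1000)
```

The `line` column gives the line numbers to look at in a text editor; an empty result only means this approximate check found nothing, not that the file is guaranteed to import.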


                    • jp.
                      Senior Member
                      • Jul 2013
                      • 142

                      #11
                      Please try this simpler call.
                      Note: keep your "diff_out" folder inside the cuff_data folder,
                      and change the working directory to cuff_data.
                      > cuff_data<- readCufflinks('diff_out',rebuild=T)
