  • Error in CummeRbund while creating database

    Hello

    I'm trying to explore my Tophat 2.05 / Bowtie 2.0 aligned BAM file (plus its .bai index file) and my cuffdiff 2.02 outputs using the latest version of cummeRbund in R version 2.15.2.

    While generating the database with the command cuff <- readCufflinks(rebuild=T), I encountered an error (full session below). Does anyone know what could have caused it?

    Any feedback gratefully received.

    [dr_richard_barker@vm142-21 cuffdiff_out]$ R

    R version 2.15.2 (2012-10-26) -- "Trick or Treat"
    Copyright (C) 2012 The R Foundation for Statistical Computing
    ISBN 3-900051-07-0
    Platform: x86_64-redhat-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.
    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.
    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    > library (cummeRbund)
    Loading required package: BiocGenerics
    Attaching package: 'BiocGenerics'
    The following object(s) are masked from 'package:stats':
    xtabs
    The following object(s) are masked from 'package:base':
    Filter, Find, Map, Position, Reduce, anyDuplicated, cbind,
    colnames, duplicated, eval, get, intersect, lapply, mapply, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, rbind, rep.int,
    rownames, sapply, setdiff, table, tapply, union, unique

    Loading required package: RSQLite
    Loading required package: DBI
    Loading required package: ggplot2
    Loading required package: reshape2
    Loading required package: fastcluster

    Attaching package: 'fastcluster'

    The following object(s) are masked from 'package:stats':
    hclust

    Loading required package: rtracklayer
    Loading required package: GenomicRanges
    Loading required package: IRanges
    Loading required package: Gviz
    Loading required package: grid
    > cuff <- readCufflinks()
    > cuff <- readCufflinks(rebuild=T)
    Creating database /mydata2/0_GS_atmosphere_test/cuffdiff_out/cuffData.db
    Reading Run Info File /mydata2/0_GS_atmosphere_test/cuffdiff_out/run.info
    Writing runInfo Table
    Reading Read Group Info /mydata2/0_GS_atmosphere_test/cuffdiff_out/read_groups.info
    Writing replicates Table
    Error in sqliteExecStatement(con, statement, bind.data) :
    RS-DBI driver: (RS_SQLite_exec: could not execute: column rep_name is not unique)

    >
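
The final message in the log above is SQLite refusing to insert a second row with the same rep_name. As a minimal sketch of that failure mode only, the snippet below uses Python's built-in sqlite3 module and a toy table; the table layout and the helper name insert_replicates are assumptions made for illustration, not cummeRbund's actual schema or code.

```python
import sqlite3


def insert_replicates(rep_names):
    """Insert names into a toy table whose rep_name column must be unique.

    Returns None on success, or the SQLite error text if a name repeats.
    """
    con = sqlite3.connect(":memory:")
    # Hypothetical, simplified table: NOT the real cummeRbund schema.
    con.execute("CREATE TABLE replicates (rep_name TEXT PRIMARY KEY)")
    try:
        with con:  # commits on success, rolls back on error
            con.executemany(
                "INSERT INTO replicates (rep_name) VALUES (?)",
                [(name,) for name in rep_names],
            )
        return None
    except sqlite3.IntegrityError as err:
        return str(err)
    finally:
        con.close()


print(insert_replicates(["ctrl_0", "ctrl_1"]))  # distinct names are accepted
print(insert_replicates(["ctrl_0", "ctrl_0"]))  # a repeated name is rejected
```

The exact wording of the error text depends on the SQLite version, so it is not reproduced here. The point is only that two replicates ending up with the same name is enough to abort the database build.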

  • #2
    Hi Richard,

    I am having the exact same problem, but from searching Google it doesn't seem to be a common one.

    Did you figure out what was going on? Any advice would be very much appreciated!

    Thanks,

    Heather


    • #3
      Dear Heather

      I didn't solve that issue in the end, so I found a couple of ways around it. The first thing I did was set up a free account with the iPlant Collaborative (http://www.iplantcollaborative.org/). They have integrated the Tuxedo protocol into a graphical user interface (GUI) that makes the whole system really easy to use. It runs on supercomputers in Texas and/or Arizona, so it's also really fast. You have to upload your data to their cloud computing system, but their data stores are limitless and their security tight. If you want to use the Tuxedo protocol I would strongly recommend their system.

      After analyzing my samples with Tuxedo I wasn't happy with my results (one of my biological repeats had double the reads of the others, and I don't think the normalisation methods within Tuxedo are the greatest), so I investigated some other packages. I used the Max Planck's RobiNA package (http://mapman.gabipd.org/web/guest/robin) and was very impressed! It can take your data as raw sequencing files, BAM files or even as counts per gene. It has a really easy to use GUI and, best of all, incorporates two normalisation packages, DESeq and edgeR, which gave me some really interesting lists of genes to chase up...

      I hope this helps...

      Best wishes, Richard
      Last edited by Richard Barker; 07-15-2013, 06:38 AM. Reason: I wanted to add a web link


      • #4
        TorontoHeather and Richard,
        Re-run cufflinks/cuffdiff and change the sample names (sample1...control...treatment...).


        • #5
          Dear Richard and JP,

          Thank you both for your responses. I've been trying to get the iPlant Cloud services working but was encountering a different error when I tried running my data there.

          I just recently heard from someone who was able to solve both errors. The one I posted on here was solved by replacing "-" with "neg" and "+" with "pos" in the sample names. I guess re-running cufflinks/cuffdiff and renaming the samples would have the same effect!
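
The renaming described above can be sketched as a small Python helper that is run before cuffdiff. This is only an illustration: the function names and the sample names "WT-" and "WT+" are hypothetical, and the thread does not say why these characters trigger the error, so the uniqueness check is an added precaution rather than part of the reported fix.

```python
def sanitize_label(label):
    """Replace '-' with 'neg' and '+' with 'pos', as described in the post above."""
    return label.replace("-", "neg").replace("+", "pos")


def sanitize_labels(labels):
    """Sanitize every label and fail early if two labels become identical."""
    cleaned = [sanitize_label(label) for label in labels]
    if len(set(cleaned)) != len(cleaned):
        raise ValueError(f"labels are not unique after renaming: {cleaned}")
    return cleaned


# Hypothetical sample names, for illustration only.
labels = sanitize_labels(["WT-", "WT+"])
print(",".join(labels))  # prints "WTneg,WTpos"
```

The comma-separated string is the form that cuffdiff's -L/--labels option takes, so the cleaned names can be passed there when re-running cuffdiff, after which the cummeRbund database is rebuilt with readCufflinks(rebuild=T).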

          Now I am working through another error: the cummeRbund getSig function does not return anything, even though I can see that there are many significant results. If I do not get it sorted soon I will write a new post with more details.

          Thanks again!

          Heather
