Hi,
I have a dataset with 4 conditions and 2 biological replicates each. I'm following the Nature Protocols paper; however, when I try to create the density or box plots I get the following error:
Error in dat$fpkm + pseudocount : non-numeric argument to binary operator
Here is my basic code:
Code:
cuff<-readCufflinks()
dens<-csDensity(genes(cuff))
b<-csBoxplot(genes(cuff))
I think something is seriously wrong with either the data coming from our core lab, my implementation of the TopHat pipeline, or the TopHat pipeline itself. Using CuffDiff on this data I'm getting 623 FAILED tests, 145024 NOTESTs, and 8829 OK. I also have another data set with similar issues, described in another thread, where I tested different CuffDiff parameters to see whether the data is the problem.
Here is my code to generate the data mentioned in this thread. It's generalized to remove sample names, etc., so it won't run as is, but you can see what I did:
Code:
datapath='/path/to/data/'
genomepath='/path/to/genome/'
gtfpath='/path/to/gtf/'
# This is done for each of the 8 samples in separate PBS scripts
tophat2 -p 12 -o ${datapath}results -G ${gtfpath}genes.gtf --transcriptome-index ${gtfpath}known ${genomepath}genome ${datapath}data.txt.gz
cufflinks -p 12 -o ${datapath}sample-clout ${datapath}results/accepted_hits.bam
# I create assemblies.txt before this step
cuffmerge -g ${gtfpath}genes.gtf -s ${genomepath}genome.fa -p 12 ${datapath}assemblies.txt -o ${datapath}
cuffdiff -o ${datapath}diff_out -b ${genomepath}genome.fa -p 12 -L Group1,Group2,Group3,Group4 -u ${datapath}merged_asm/merged.gtf \
${datapath}results1/accepted_hits.bam,${datapath}results2/accepted_hits.bam \
${datapath}results3/accepted_hits.bam,${datapath}results4/accepted_hits.bam \
${datapath}results5/accepted_hits.bam,${datapath}results6/accepted_hits.bam \
${datapath}results7/accepted_hits.bam,${datapath}results8/accepted_hits.bam
I appreciate any help! I'm wondering if I need to abandon TopHat, etc., though I would rather not since this whole pipeline is more integrated than anything else I've seen for RNASeq!
Useful information:
- I'm using R 2.15.1, tophat2 2.0.3, cufflinks 2.0.2, and cummeRbund 1.99.2
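In case it helps with diagnosing this, here is a rough sketch of two checks that could be run directly on the cuffdiff output, outside of R. The function names count_status and find_non_numeric_fpkm are made up for illustration and are not part of Cufflinks or cummeRbund. The sketch assumes tab-separated files with a header row, a column named "status" in the .diff file, and per-sample columns whose headers end in "_FPKM" in the tracking file. I am not sure that non-numeric FPKM entries are what triggers the R error; this only looks for them.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting cuffdiff output; the names are illustrative only.

# Tally the values in the column named "status" of a cuffdiff *.diff file.
# The column is found by its header name rather than by a fixed position.
count_status() {
  awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") col = i; next }
    col     { n[$col]++ }
    END     { for (s in n) print s "\t" n[s] }
  ' "$1" | LC_ALL=C sort
}

# Report values in any column whose header ends in "_FPKM" that do not look
# like a plain non-negative decimal number (for example nan, inf, or empty).
# Output columns: first field of the row, column header, offending value.
find_non_numeric_fpkm() {
  awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i ~ /_FPKM$/) cols[i] = $i; next }
    {
      for (i in cols)
        if ($i !~ /^[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$/)
          print $1 "\t" cols[i] "\t" $i
    }
  ' "$1"
}
```

Usage would look something like `count_status diff_out/gene_exp.diff` and `find_non_numeric_fpkm diff_out/genes.fpkm_tracking`, with both paths depending on where cuffdiff wrote its output. If the second command prints nothing, the FPKM columns contain only ordinary numbers and the cause is probably elsewhere.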