I'm using cuffdiff for a differential expression analysis and the cummeRbund R package for follow-up analysis and visualization. I have 5 biological replicates for the control condition and 3 biological replicates for the mutant condition. Based on the cuffdiff documentation, I ran the cuffdiff command shown in the first code block below.
(I forgot to use the -L option to give my groups labels.)
My question is: is this the proper way to specify biological replicates? The documentation seems to suggest that each comma-separated file is a technical replicate of the same sample, but it isn't clear.
Later on, when I went to use cummeRbund to do some visualization (e.g. boxplots, heatmaps), I only got results for two samples, as if it had combined the FPKM values across all the alignments in each comma-separated list.
I would expect to have 8 boxplots, one for each sample, and 8 rows in the heatmap, one for each sample. This makes me think the way I'm specifying replicates is asking cuffdiff to treat each item in the comma-separated list as a technical, not biological, replicate.
Thanks in advance!
Code:
cuffdiff cuffcmp.combined.gtf \
    c1.bam,c2.bam,c3.bam,c4.bam,c5.bam \
    m1.bam,m2.bam,m3.bam
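For reference, this is roughly how the same run would look with the -L option that I left out. The labels "control" and "mutant" are placeholder names I'm assuming here, and the variable names are only for illustration. The snippet prints the assembled command instead of running it, so it can be checked without cuffdiff installed.

```shell
# Hypothetical condition labels, one per comma-separated BAM list, in the same order.
labels="control,mutant"

# One comma-separated list of alignments per condition (file names from my run above).
control_bams="c1.bam,c2.bam,c3.bam,c4.bam,c5.bam"
mutant_bams="m1.bam,m2.bam,m3.bam"

# Assemble the command line; echo it as a dry run rather than executing it.
cmd="cuffdiff -L $labels cuffcmp.combined.gtf $control_bams $mutant_bams"
echo "$cmd"
```

As far as I can tell, the labels only change how the two groups are named in the output. They should not change how the files within each list are treated.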
Code:
library(cummeRbund)
cuff <- readCufflinks()

# make boxplot
csBoxplot(genes(cuff))

# get the top 100 diff expr genes
gene.diff <- diffData(genes(cuff))
gene.diff.top <- gene.diff[order(gene.diff$q_value),][1:100,]

# gene ids of top 100 diff expr genes
myGeneIds <- gene.diff.top$gene_id

# get genes
myGenes <- getGenes(cuff, myGeneIds)

# make a heatmap
csHeatmap(myGenes, cluster="both")