#1 |
Junior Member
Location: Ottawa, Ontario Join Date: Sep 2014
Posts: 9
Hi everyone,
I'm analyzing some RNA-seq data for a colleague, and some of the results and diagnostic plots have raised caution flags in my mind. I'll include the code I used to avoid going back and forth.

CliffsNotes: I'm getting a large number of differentially expressed genes (which may or may not be normal?), large log2FCs, and diagnostic plots that don't quite look typical.

Experimental design: RNA-seq before and after inducing differentiation (GNP -> GC). Two biological replicates for each condition.

Alignment & counts (done for each of the four samples):

Code:
tophat -p 8 -G Mus.musculus.NCBIM37.65.gtf -o tophat_dir --no-novel-juncs mm9 GC2_R1.fastq GC2_R2.fastq
samtools sort -n accepted_hits.bam sorted.GC2
htseq-count -f bam sorted.GC2 Mus.musculus.NCBIM37.65.norRNA.gtf > GC2.counts.txt ##annotation lacks rRNA just in case

Code:
  sampleName       fileName         condition
1 GC2.counts.txt   GC2.counts.txt   GC
2 GC3.counts.txt   GC3.counts.txt   GC
3 GNP2.counts.txt  GNP2.counts.txt  GNP
4 GNP3.counts.txt  GNP3.counts.txt  GNP

> dds
class: DESeqDataSet
dim: 37651 4
exptData(0):
assays(3): counts mu cooks
rownames(37651): ENSMUSG00000000001 ENSMUSG00000000003 ... ENSMUSG00000093788 ENSMUSG00000093789
rowData metadata column names(27): baseMean baseVar ... deviance maxCooks
colnames(4): GC2.counts.txt GC3.counts.txt GNP2.counts.txt GNP3.counts.txt
colData names(2): condition sizeFactor

Code:
dds <- DESeq(dds)
res <- results(dds)

> summary(res)
out of 24178 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)     : 5752, 24%   ##~1000 LFC > 2
LFC < 0 (down)   : 5786, 24%   ##~1000 LFC < -2
outliers [1]     : 0, 0%
low counts [2]   : 4687, 19%
(mean count < 1.7)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

##MA plot
plotMA(res, main="DESeq2", ylim=c(-10,10))
##Large number of significant genes with high mean exp value

##P value dist.
hist(res$pval, breaks=100)
##Very few genes with high p-value (looks similar with padj)

##Dispersion plot
plotDispEsts(dds)
##Not really sure if there's anything strange here

##Per gene standard deviation plots
#Script exactly as presented in DESeq2 vignette: shifted logarithm log2(n + 1) (left), the regularized log transformation (center), and the variance stabilizing transformation (right)
#Some really high SDs

##Euclidean distances
distsRL <- dist(t(assay(rld)))

EDIT: Just realized I mentioned cuffdiff/cummeRbund in the title but didn't include anything from it. The volcano plot from there showed results consistent with the MA plot here.

David Cook
MSc. Candidate, University of Ottawa
Last edited by DPCook; 03-12-2015 at 09:57 AM. Reason: Deceiving title, sorry
#2 |
Senior Member
Location: Boston Join Date: Jul 2013
Posts: 333
A PCA plot is also nice for seeing sample distances. From the distance plot, these conditions look very distinct. Why don't you consider testing at a higher threshold than |LFC| > 0, as this seems to be achieved by many genes? See the lfcThreshold argument of ?results. The reasoning is described in the "Specifying minimum effect size" section of the paper.
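For reference, a thresholded test could look roughly like the sketch below. It runs on simulated counts, not the data from this thread: `makeExampleDESeqDataSet` stands in for the real `dds`, and the cutoff values (1.5 on the log2 scale, alpha = 0.05) are illustrative assumptions only.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for the real dataset: 2000 genes, 4 samples (2 per condition).
# betaSD > 0 gives the simulated genes non-zero true log2 fold changes.
dds <- makeExampleDESeqDataSet(n = 2000, m = 4, betaSD = 2)
dds <- DESeq(dds)

# Default test: the null hypothesis is LFC == 0
res_any <- results(dds, alpha = 0.05)

# Thresholded test: the null hypothesis is |LFC| <= 1.5 (log2 scale)
res_thr <- results(dds, lfcThreshold = 1.5,
                   altHypothesis = "greaterAbs", alpha = 0.05)

summary(res_any)
summary(res_thr)
```

Note that this tests |LFC| > 1.5 directly, which is not the same as testing against zero and then filtering the gene list by fold change afterwards.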
#3 |
Junior Member
Location: Ottawa, Ontario Join Date: Sep 2014
Posts: 9
Thanks for the reply Michael. I adjusted the lfc threshold and it certainly made the list a bit more manageable (LFC > 1.5, FDR=0.05 yielded about 800 DEGs in both directions). I guess I just wasn't expecting such large differences between the two conditions and was concerned that I was missing some obvious artefact that could cause it.
I also ran the PCA following the regularized log transformation to look at distances. PC1 is apparently capturing 100% of the variance and splits the two conditions. I'm no expert, so correct me if I'm wrong, but I suppose this supports the idea that the results are just the product of very distinct conditions, because the variability between biological replicates is negligible (at least relative to the differences between conditions). Thanks!
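On the "100% variance" point, a toy sketch in base R may help show how PC1 can end up near 100% with only four samples. Everything here is simulated: the gene count, effect sizes, and noise level are made-up assumptions, and the sample names are borrowed from this thread only as labels. Also worth keeping in mind: DESeq2's `plotPCA` uses the 500 most variable genes by default (`ntop = 500`), and as far as I know the axis label is a rounded percentage, so "100%" most likely means "above 99.5%" on that gene subset.

```r
set.seed(1)
n_genes <- 500

# Simulated log2-scale expression (assumed values, not real data)
base  <- rnorm(n_genes, mean = 8, sd = 2)     # per-gene baseline
shift <- rnorm(n_genes, mean = 0, sd = 3)     # per-gene condition effect
noise <- function() rnorm(n_genes, sd = 0.1)  # small replicate-level noise

mat <- cbind(GNP2 = base + noise(),
             GNP3 = base + noise(),
             GC2  = base + shift + noise(),
             GC3  = base + shift + noise())

# PCA with samples as rows; prcomp centers each gene by default
pca <- prcomp(t(mat))
percent_var <- pca$sdev^2 / sum(pca$sdev^2)

# Rounded percentages per component, similar to what a PCA axis label shows
print(round(100 * percent_var))
```

When the replicate noise is tiny compared with the condition effect, nearly all of the variance falls on the single axis separating the two groups, which is consistent with the interpretation above. Raising the `sd` in `noise()` moves variance onto the later components.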