SEQanswers

Old 01-04-2013, 06:36 AM   #1
RocheKermit
Member
 
Location: Luxembourg

Join Date: Nov 2011
Posts: 15
No significant DE genes in cuffdiff

Dear all,

I want to find significantly differentially expressed (SDE) known genes between 2 conditions (Tumor vs Normal), with 8 replicates per condition.
The pipeline is as follows:
1. (16x) tophat:
Code:
tophat2 -p 12 -G /path/to/Ensembl/Homo_sapiens.GRCh37.69.gtf -o $outcache /path/to/hg19/bowtie2index/Ensembl/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome /path/to/_1.merged.fastq /path/to/_2.merged.fastq
2. (1x) cuffdiff (v2.0.2):
Code:
cuffdiff -p 12 -L ADD-Tumor,ADD-Normal -o /path/to/AllADD-Tumor-vs-Normal -b /path/to/hg19/bowtie2index/Ensembl/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome.fa /path/to/Ensembl/Homo_sapiens.GRCh37.69.gtf /tumor/1/accepted_hits.bam,..,/tumor/n/accepted_hits.bam /normal/1/accepted_hits.bam,..,/normal/n/accepted_hits.bam
The cuffdiff results show no SDE genes at all (no q-value < 5%), which is very surprising biologically: all q-values are equal to 1.
As for p-values, only 2 genes have p < 0.01 and 16 genes have p < 0.05, which are very low numbers.

Looking at a "positive" control, the SPP1 gene, here are its numbers:
- in gene_exp.diff:
Code:
test_id	gene_id	gene	locus	sample_1	sample_2	status	value_1	value_2	log2(fold_change)	test_stat	p_value	q_value	significant
ENSG00000118785	ENSG00000118785	SPP1	4:88896818-88904562	ADD-Tumor	ADD-Normal	OK	124.125	1.81571	-6.09512	0.158608	0.873978	1	no
Note that the status is OK and the log2 fold change is large (-6.1), but the p-value and q-value are poor.

- in genes.read_group_tracking:
Code:
tracking_id	condition	replicate	raw_frags	internal_scaled_frags	external_scaled_frags	FPKM	effective_length	status
ENSG00000118785	ADD-Tumor	1	30290	25617.2	25752.3	314.011	-	OK
ENSG00000118785	ADD-Tumor	0	5660	5864.01	5894.94	72.4812	-	OK
ENSG00000118785	ADD-Tumor	2	3096	3420.64	3438.68	42.9502	-	OK
ENSG00000118785	ADD-Tumor	3	5706	2969.05	2984.71	36.866	-	OK
ENSG00000118785	ADD-Tumor	4	32526	30257	30416.6	369.57	-	OK
ENSG00000118785	ADD-Tumor	5	594	803.644	807.883	10.3996	-	OK
ENSG00000118785	ADD-Tumor	6	6095	7016.68	7053.69	87.6907	-	OK
ENSG00000118785	ADD-Tumor	7	6180	5622.82	5652.48	68.0664	-	OK
ENSG00000118785	ADD-Normal	1	165	94.5041	91.8415	1.10194	-	OK
ENSG00000118785	ADD-Normal	0	12	18.5537	18.031	0.277918	-	OK
ENSG00000118785	ADD-Normal	2	33	32.5715	31.6538	0.489188	-	OK
ENSG00000118785	ADD-Normal	3	155	182.537	177.394	2.15486	-	OK
ENSG00000118785	ADD-Normal	4	243	230.643	224.144	2.70818	-	OK
ENSG00000118785	ADD-Normal	5	117	142.523	138.508	1.93188	-	OK
ENSG00000118785	ADD-Normal	6	712	466.519	453.375	5.51449	-	OK
ENSG00000118785	ADD-Normal	7	69	63.7359	61.9401	0.743175	-	OK
Note that, visually, there is a clear difference between the ADD-Tumor and ADD-Normal groups, e.g. in FPKM or raw frags.
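
As a rough sanity check on the per-replicate FPKM values above, the following Python sketch compares the two groups with a rank-based (Mann-Whitney) statistic and recomputes the log2 fold change from the value_1 and value_2 columns of gene_exp.diff. This is not the test cuffdiff performs; the function name is made up for illustration, and the shortcut used for the exact p-value is only valid because the two groups turn out to be fully separated with no ties.

```python
from math import comb, log2

# Per-replicate FPKM for ENSG00000118785 (SPP1), copied from genes.read_group_tracking.
tumor_fpkm = [314.011, 72.4812, 42.9502, 36.866, 369.57, 10.3996, 87.6907, 68.0664]
normal_fpkm = [1.10194, 0.277918, 0.489188, 2.15486, 2.70818, 1.93188, 5.51449, 0.743175]


def mann_whitney_u(x, y):
    """Count the (x_i, y_j) pairs with x_i > y_j; ties count as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


u_stat = mann_whitney_u(tumor_fpkm, normal_fpkm)
max_u = len(tumor_fpkm) * len(normal_fpkm)

# U equals its maximum only when every tumor value exceeds every normal value.
fully_separated = u_stat == max_u

# With full separation and no ties, the exact two-sided p-value is 2 / C(n1 + n2, n1).
exact_p = 2 / comb(len(tumor_fpkm) + len(normal_fpkm), len(tumor_fpkm))

# log2 fold change recomputed from value_1 (tumor) and value_2 (normal) in gene_exp.diff.
log2_fc = log2(1.81571 / 124.125)

print(f"U = {u_stat:.0f} of {max_u}, fully separated: {fully_separated}")
print(f"exact two-sided p = {exact_p:.2e}, log2 fold change = {log2_fc:.3f}")
```

The lowest tumor FPKM (10.3996) is above the highest normal FPKM (5.51449), so a simple nonparametric test agrees with the visual impression even though cuffdiff reports p = 0.87.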

Why is SPP1 not detected as significant, by either p-value or q-value?
How can the parameters be tuned to be less stringent?
I'm thinking of adding:
-u/--multi-read-correct
-c/--min-alignment-count 10
-F/--min-outlier-p 0.05
-N/--upper-quartile-norm (instead of --geometric-norm)
--emit-count-tables (for tracking reasons)
--max-frag-multihits 10
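
For reference, a small Python sketch of how these extra options could be appended to the cuffdiff call from step 2. The flags are copied exactly as listed above and are not checked against any particular cuffdiff release, so compare them with the usage text of the installed version first. The variable names are illustrative, and the output directory, GTF and BAM arguments are left out on purpose.

```python
import shlex

# Start of the cuffdiff call from step 2 (paths and BAM lists omitted here).
base_command = ["cuffdiff", "-p", "12", "-L", "ADD-Tumor,ADD-Normal"]

# Extra options exactly as proposed in the list above (unverified for any given version).
extra_options = [
    "-u",                          # --multi-read-correct
    "-c", "10",                    # --min-alignment-count
    "-F", "0.05",                  # --min-outlier-p, as named in the list above
    "-N",                          # --upper-quartile-norm
    "--emit-count-tables",
    "--max-frag-multihits", "10",
]

command = base_command + extra_options

# Print a shell-safe command line for review; nothing is executed.
print(shlex.join(command))
```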

Many thanks for your help!

Happy new year!!
Old 01-07-2013, 05:20 AM   #2
RocheKermit
Member
 
Location: Luxembourg

Join Date: Nov 2011
Posts: 15

Just a bump, only one, I promise!
Old 01-10-2013, 08:03 AM   #3
sqcrft
Member
 
Location: boston

Join Date: May 2012
Posts: 29

I have a number of experiments with no DEGs in the cuffdiff results.
It seems to me that cuffdiff v2 is too stringent.
Old 01-10-2013, 08:05 AM   #4
sqcrft
Member
 
Location: boston

Join Date: May 2012
Posts: 29

I would suggest using p-value < 0.05 as the cutoff instead. It is not that statistically sound, but it should work.
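
To illustrate why a handful of small p-values can still give q = 1 for every gene, here is a minimal Python sketch of the Benjamini-Hochberg adjustment, assuming that is how the q_value column is derived (as the Cufflinks manual describes). The p-values are synthetic: only the counts of 2 genes below 0.01 and 16 below 0.05 come from the first post, while the total of 20000 tested genes and the evenly spread remaining p-values are assumptions made for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values capped at 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


m = 20000                               # assumed number of tested genes
small = [0.005] * 2 + [0.03] * 14       # 2 genes with p < 0.01, 16 in total with p < 0.05
n_rest = m - len(small)
# Remaining genes: p-values spread evenly over (0.05, 1], mimicking "no signal".
rest = [0.05 + 0.95 * (k + 1) / n_rest for k in range(n_rest)]

qvalues = benjamini_hochberg(small + rest)
print("smallest q-value:", min(qvalues))
```

Under these assumptions every q-value ends up at, or numerically very close to, 1. Filtering on raw p < 0.05 would instead pass the 16 genes, against roughly 1000 expected by chance alone among 20000 tests, which is why that cutoff is not statistically sound.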
Old 01-11-2013, 01:30 AM   #5
RocheKermit
Member
 
Location: Luxembourg

Join Date: Nov 2011
Posts: 15

Yes, that's a parameter I will change (-F 0.05); the run is in progress. This is still OK, statistically speaking.
Of note, the FPKM/counts from the cuffdiff output can be used in, e.g., edgeR, which is able to find significant genes (approx. 1,000) in my dataset.
So yes, cuffdiff v2 looks too stringent, and for the moment I can't see how to make it less so.
Cuffdiff does look quite powerful for abundance estimation.
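
One possible way to get the counts out of cuffdiff for a tool such as edgeR is to reshape genes.read_group_tracking into a gene-by-sample table of raw_frags. The Python sketch below assumes the tab-separated layout shown in the first post; the function name and the condition_replicate sample labels are made up for illustration, and the two example rows are copied from that post.

```python
import csv
import io
from collections import defaultdict


def load_raw_frags(handle):
    """Map tracking_id -> {"<condition>_<replicate>": raw_frags} from a read_group_tracking file."""
    counts = defaultdict(dict)
    for row in csv.DictReader(handle, delimiter="\t"):
        sample = f"{row['condition']}_{row['replicate']}"
        counts[row["tracking_id"]][sample] = float(row["raw_frags"])
    return dict(counts)


# Two rows from the first post, used here in place of the real file.
example = (
    "tracking_id\tcondition\treplicate\traw_frags\tinternal_scaled_frags\t"
    "external_scaled_frags\tFPKM\teffective_length\tstatus\n"
    "ENSG00000118785\tADD-Tumor\t1\t30290\t25617.2\t25752.3\t314.011\t-\tOK\n"
    "ENSG00000118785\tADD-Normal\t1\t165\t94.5041\t91.8415\t1.10194\t-\tOK\n"
)

matrix = load_raw_frags(io.StringIO(example))
for gene, samples in matrix.items():
    print(gene, samples)
```

With a real run, the same function can be given an open file handle on genes.read_group_tracking, and the resulting table written out as a tab-separated file for the downstream tool.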
Old 01-12-2014, 08:12 AM   #6
sindrle
Senior Member
 
Location: Norway

Join Date: Aug 2013
Posts: 266

Hi!
This has happened to me in two separate experiments, even with up to 154 biological replicates.

There is just no way this result is correct.


What I did was upgrade from Cuffdiff 2.0.2 to 2.1.1, because a lot of improvements were made between these versions.

2.1.1 finds a lot of changed genes, which I later validated with RT-qPCR.

So, upgrade and re-run!
Old 01-13-2014, 02:53 AM   #7
RocheKermit
Member
 
Location: Luxembourg

Join Date: Nov 2011
Posts: 15

Many thanks for the tip!
I'll test cuffdiff 2.1.1 versus edgeR with our upcoming RNA-seq results.

Of note, up to now I have been using tophat for alignment,
then cufflinks -G genome.gtf accepted_hits.bam to estimate transcript abundance (raw reads and FPKM),
then edgeR to evaluate significance.

Happy sequencing in 2014!
Old 01-13-2014, 04:02 AM   #8
sindrle
Senior Member
 
Location: Norway

Join Date: Aug 2013
Posts: 266

That's good.

Personally, I have tested some genes with RT-qPCR to evaluate edgeR vs DESeq2 vs Cuffdiff 2.1.1.

My conclusion is that Cuffdiff 2.1.1 is much worse for gene-level analysis. Of course it's better for transcript analysis. I personally exclude Cuffdiff 2.1.1 from everything other than transcript analysis.

Actually, edgeR was the best one, the most sensitive. DESeq2 is maybe too conservative. Cuffdiff 2.1.1 is just wrong in a lot of cases.

So, my current approach is: edgeR as the main gene-level analyser, with DESeq2 as a supplement. Cuffdiff 2.1.1 for novel discoveries, isoforms, etc., but not gene level.

Please update us on your results!