  • No significant DE genes in cuffdiff

    Dear all,

    I want to find significantly differentially expressed (SDE) known genes between 2 conditions (Tumor vs Normal), with 8 replicates in each condition.
    The pipeline used is the following:
    1. (16x)tophat:
    Code:
    tophat2 -p 12 -G /path/to/Ensembl/Homo_sapiens.GRCh37.69.gtf -o $outcache /path/to/hg19/bowtie2index/Ensembl/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome /path/to/_1.merged.fastq /path/to/_2.merged.fastq
    2. (1x)cuffdiff:
    Code:
    # cuffdiff v2.0.2
    cuffdiff -p 12 -L ADD-Tumor,ADD-Normal -o /path/to/AllADD-Tumor-vs-Normal -b /path/to/hg19/bowtie2index/Ensembl/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/genome.fa /path/to/Ensembl/Homo_sapiens.GRCh37.69.gtf /tumor/1/accepted_hits.bam,..,/tumor/n/accepted_hits.bam /normal/1/accepted_hits.bam,..,/normal/n/accepted_hits.bam
    The cuffdiff results show NO SDE genes at all (no q-value < 5%), which is very surprising, biologically speaking: all q-values are equal to 1.
    As for the p-values, 2 genes have p < 0.01 and 16 genes have p < 0.05; those are small numbers.

    Looking at a "positive" control, the SPP1 gene, here are its numbers:
    - in gene_exp.diff:
    Code:
    test_id	gene_id	gene	locus	sample_1	sample_2	status	value_1	value_2	log2(fold_change)	test_stat	p_value	q_value	significant
    ENSG00000118785	ENSG00000118785	SPP1	4:88896818-88904562	ADD-Tumor	ADD-Normal	OK	124.125	1.81571	-6.09512	0.158608	0.873978	1	no
    Note that the status is OK and the logFC is big (-6.1), but the p-Val and q-Val are bad.

    - in genes.read_group_tracking:
    Code:
    tracking_id	condition	replicate	raw_frags	internal_scaled_frags	external_scaled_frags	FPKM	effective_length	status
    ENSG00000118785	ADD-Tumor	1	30290	25617.2	25752.3	314.011	-	OK
    ENSG00000118785	ADD-Tumor	0	5660	5864.01	5894.94	72.4812	-	OK
    ENSG00000118785	ADD-Tumor	2	3096	3420.64	3438.68	42.9502	-	OK
    ENSG00000118785	ADD-Tumor	3	5706	2969.05	2984.71	36.866	-	OK
    ENSG00000118785	ADD-Tumor	4	32526	30257	30416.6	369.57	-	OK
    ENSG00000118785	ADD-Tumor	5	594	803.644	807.883	10.3996	-	OK
    ENSG00000118785	ADD-Tumor	6	6095	7016.68	7053.69	87.6907	-	OK
    ENSG00000118785	ADD-Tumor	7	6180	5622.82	5652.48	68.0664	-	OK
    ENSG00000118785	ADD-Normal	1	165	94.5041	91.8415	1.10194	-	OK
    ENSG00000118785	ADD-Normal	0	12	18.5537	18.031	0.277918	-	OK
    ENSG00000118785	ADD-Normal	2	33	32.5715	31.6538	0.489188	-	OK
    ENSG00000118785	ADD-Normal	3	155	182.537	177.394	2.15486	-	OK
    ENSG00000118785	ADD-Normal	4	243	230.643	224.144	2.70818	-	OK
    ENSG00000118785	ADD-Normal	5	117	142.523	138.508	1.93188	-	OK
    ENSG00000118785	ADD-Normal	6	712	466.519	453.375	5.51449	-	OK
    ENSG00000118785	ADD-Normal	7	69	63.7359	61.9401	0.743175	-	OK
    Note that, visually, there is a clear difference between the ADD-Tumor and ADD-Normal groups, e.g. in FPKM or raw frags.
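The numbers quoted above can be sanity-checked with a few lines of Python. This is only a rough sketch and not how cuffdiff computes its test statistic: it recomputes the log2 fold change from value_1 and value_2 in gene_exp.diff, then counts how many tumor/normal replicate pairs are ordered the same way (a Mann-Whitney-style U statistic) using the FPKM column of genes.read_group_tracking. The variable and function names are illustrative.

```python
from math import comb, log2

# value_1 (ADD-Tumor) and value_2 (ADD-Normal) from the gene_exp.diff row for SPP1
value_1, value_2 = 124.125, 1.81571
log2_fc = log2(value_2 / value_1)  # cuffdiff reports log2(sample_2 / sample_1)

# Per-replicate FPKM values copied from genes.read_group_tracking, in file order
tumor_fpkm = [314.011, 72.4812, 42.9502, 36.866, 369.57, 10.3996, 87.6907, 68.0664]
normal_fpkm = [1.10194, 0.277918, 0.489188, 2.15486, 2.70818, 1.93188, 5.51449, 0.743175]


def mann_whitney_u(a, b):
    """Count pairs (x in a, y in b) with x > y; ties count as half a pair."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


u = mann_whitney_u(tumor_fpkm, normal_fpkm)
n1, n2 = len(tumor_fpkm), len(normal_fpkm)

# Complete separation: every tumor replicate is above every normal replicate
completely_separated = (u == n1 * n2)

# Under the rank-sum null, only 2 of the C(n1+n2, n1) equally likely group
# assignments are this extreme, which gives the exact two-sided p-value.
p_exact = 2 / comb(n1 + n2, n1) if completely_separated else None

print(f"log2 fold change: {log2_fc:.5f}")
print(f"U = {u} of {n1 * n2} pairs, separated = {completely_separated}, p = {p_exact}")
```

A rank test ignores the within-group variance that cuffdiff models, so this only confirms that the two groups do not overlap for this gene; it says nothing about why cuffdiff's own test returns a large p-value.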

    Why is the SPP1 gene not caught as significant by either p-value or q-value?
    How can the parameters be tuned to be less stringent?
    I'm thinking of adding:
    -u/--multi-read-correct
    -c/--min-alignment-count 10
    -F/--min-outlier-p 0.05
    -N/--upper-quartile-norm (instead of --geometric-norm)
    --emit-count-tables (for tracking reasons)
    --max-frag-multihits 10

    Many thanks for your help!

    Happy new year!!

  • #2
    Just an up, just one I promise!

    • #3
      I have a number of experiments with no DEGs in the cuffdiff results.
      It seems to me that cuffdiff v2 is too stringent.

      • #4
        I would suggest using p-value < 0.05. It is not that statistically sound, but it should work.
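To see why a handful of small p-values can still leave every q-value at 1, here is a minimal sketch of a Benjamini-Hochberg adjustment, assuming that is the kind of FDR correction behind the q_value column. The total of 20,000 tested genes and the individual p-values are hypothetical; only the counts (2 genes below 0.01, 16 below 0.05) come from the first post.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Small worked example
small = bh_adjust([0.01, 0.04, 0.03, 0.005])

# Hypothetical scenario: 20,000 tests, 2 with p = 0.009, 14 with p = 0.04,
# and all remaining tests with p = 1.0 (made-up values, for illustration only)
m_tests = 20000
p_vals = [0.009] * 2 + [0.04] * 14 + [1.0] * (m_tests - 16)
q_vals = bh_adjust(p_vals)

print(small)
print(max(q_vals), min(q_vals))
```

In the hypothetical scenario the best-ranked gene gets 0.009 * 20000 / 2 = 90 before capping, so all q-values end up at 1. Cutting on raw p < 0.05 instead means that, with this many tests, around 1,000 genes would be expected to pass by chance alone.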

        • #5
          Yes, that's a parameter I will change (-F 0.05) and it's now running. This is still OK statistically speaking.
          Of note, the FPKM/counts from the cuffdiff output can be used in, e.g., edgeR, which is able to find significant genes (approx. 1,000) in my dataset.
          So yes, Cuffdiff v2 looks too stringent, and for the moment I can't see how to make it less so.
          Cuffdiff does look quite powerful for abundance estimation.
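As a rough illustration of reusing the cuffdiff counts elsewhere, the sketch below pivots genes.read_group_tracking into a gene-by-sample matrix of raw_frags, the shape a count-based tool such as edgeR expects. The column names come from the header shown in the first post; the function name, the sample naming scheme (condition_replicate) and the rounding to integers are assumptions made for this example.

```python
import csv
import io
from collections import defaultdict

# Four rows copied from the genes.read_group_tracking excerpt in the first post
TRACKING_TSV = (
    "tracking_id\tcondition\treplicate\traw_frags\tinternal_scaled_frags\t"
    "external_scaled_frags\tFPKM\teffective_length\tstatus\n"
    "ENSG00000118785\tADD-Tumor\t1\t30290\t25617.2\t25752.3\t314.011\t-\tOK\n"
    "ENSG00000118785\tADD-Tumor\t0\t5660\t5864.01\t5894.94\t72.4812\t-\tOK\n"
    "ENSG00000118785\tADD-Normal\t1\t165\t94.5041\t91.8415\t1.10194\t-\tOK\n"
    "ENSG00000118785\tADD-Normal\t0\t12\t18.5537\t18.031\t0.277918\t-\tOK\n"
)


def count_matrix(handle):
    """Return (sample_names, {gene_id: [raw_frags per sample]})."""
    counts = defaultdict(dict)
    for row in csv.DictReader(handle, delimiter="\t"):
        sample = f"{row['condition']}_{row['replicate']}"  # assumed naming scheme
        counts[row["tracking_id"]][sample] = int(round(float(row["raw_frags"])))
    samples = sorted({s for per_gene in counts.values() for s in per_gene})
    matrix = {gene: [per_gene.get(s, 0) for s in samples]
              for gene, per_gene in counts.items()}
    return samples, matrix


samples, matrix = count_matrix(io.StringIO(TRACKING_TSV))
print(samples)
print(matrix)
```

For a real run, the in-memory string would be replaced by an open handle on the genes.read_group_tracking file, and the matrix written out as a tab-separated table for the downstream tool.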

          • #6
            Hi!
            This has happened to me in two separate experiments, even with up to 154 biological replicates.

            There is just no way this result is correct.


            What I did was upgrade from Cuffdiff 2.0.2 to 2.1.1, because a lot of improvements were made in this transition.

            2.1.1 finds a lot of changed genes, which I later validated with RT-qPCR.

            So, upgrade and re-run!

            • #7
              Many thanks for the tip!
              I'll test cuffdiff 2.1.1 versus edgeR with our upcoming RNA-seq results.

              Of note, so far I have been using tophat for alignment,
              then cufflinks -G genome.gtf accepted_hits.bam to estimate transcript abundance (raw reads and FPKM),
              then edgeR to evaluate significance.

              Happy sequencing in 2014!

              • #8
                That's good.

                Personally, I have tested some genes with RT-qPCR to evaluate edgeR vs DESeq2 vs Cuffdiff 2.1.1.

                My conclusion is that Cuffdiff 2.1.1 is much worse for GENE-level analysis. Of course it's better for transcript analysis. I personally exclude Cuffdiff 2.1.1 from everything other than transcript analysis.

                Actually, edgeR was the best one, the most sensitive. DESeq2 is maybe too conservative. Cuffdiff 2.1.1 is just wrong in a lot of cases.


                So, my current approach is: edgeR as the main gene-level analyser, with DESeq2 as a supplement. Cuffdiff 2.1.1 for novel discoveries, isoforms etc., but not at gene level.

                Please update us on your results!
