SEQanswers

04-25-2014, 04:59 AM   #1
rdsqc22
Junior Member
 
Location: Rochester

Join Date: Nov 2013
Posts: 7
Odd statistical differences in Cuffdiff output?

Hi,

I've been aligning and counting some RNA-seq reads with SHRiMP and Cuffdiff, doing the same analysis with both an older genome assembly and a newer one, and I found an interesting possible discrepancy in my Cuffdiff output. If anyone could help explain it, it would be much appreciated.

Basically, I noticed a number of genes where the expression levels were similar between the two assemblies, yet Cuffdiff reported wildly different significance results for the two. For example:

assembly  gene  locus                 sample_1  sample_2  status  value_1  value_2  log2(fold_change)  test_stat  p_value       q_value     significant
Asmb 5    Gfap  10:90763148-90771847  lineA     lineN     OK      484.283  11.1909  -5.43545           1.62926    0.103258      0.394632    no
Asmb 4    Gfap  10:92059880-92068555  lineA     lineN     OK      526.67   12.77    -5.36606           4.09233    4.27058E-005  0.00052085  yes

Both were run with the same cuffdiff binary (Cuffdiff 2.0.2), with the exact same command (adjusted for the appropriate assembly), with an FDR of 0.05. It would stand to reason that the results should be similar between the assemblies: Gfap is much more highly expressed in line A than in line N in both cases, and the only difference I can see that might have an effect is that the size of the gene changed by 24 nucleotides, out of roughly 8,700.

If the gene size, fold change, and FPKM values are so similar, why are the test statistic, p-value, and q-value so wildly different? This does not make sense to me.
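
As a quick sanity check on the two rows quoted above, the gene span and the log2 fold change can be recomputed from the locus and FPKM columns. This is only a sketch: the helper names `gene_length` and `log2_fold_change` are made up for illustration, and the span is taken simply as end minus start.

```shell
#!/usr/bin/env bash
# Recompute gene span and log2 fold change for the two Gfap rows.
# Helper names are hypothetical; values are copied from the table in this post.

# Span of a locus written as "chrom:start-end" (end minus start).
gene_length() {
    echo "$1" | awk -F'[:-]' '{ print $3 - $2 }'
}

# log2(value_2 / value_1), printed to two decimal places.
log2_fold_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", log(b / a) / log(2) }'
}

echo "Asmb 5: span=$(gene_length 10:90763148-90771847) log2FC=$(log2_fold_change 484.283 11.1909)"
echo "Asmb 4: span=$(gene_length 10:92059880-92068555) log2FC=$(log2_fold_change 526.67 12.77)"
```

The spans come out as 8699 and 8675 nucleotides (a difference of 24), and the recomputed fold changes agree with the reported ones to rounding, so whatever separates the two test_stat values is not visible in these columns alone.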

Thanks!

04-25-2014, 08:22 AM   #2
Wallysb01
Senior Member
 
Location: San Francisco, CA

Join Date: Feb 2011
Posts: 286

Agreed, this is odd.

Can you post your commands? That might help us get a little more information.

Also, how much did your annotation change in terms of the number of genes, and what percentage of reads map to each genome?
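
A rough way to get both numbers, assuming samtools is installed and the BAM and GTF files are named as in the thread (swap in your own paths). The function names `pct`, `mapped_pct`, and `count_genes` are invented for this sketch. Reads are counted on primary alignments only, and each mate of a pair counts separately.

```shell
#!/usr/bin/env bash
# Percent of reads mapped per BAM, and number of distinct gene_ids in a GTF.
# File names below are placeholders; function names are hypothetical.

# pct <part> <total>: percentage with one decimal place, or NA if total is 0.
pct() {
    awk -v p="$1" -v t="$2" 'BEGIN { if (t == 0) print "NA"; else printf "%.1f\n", 100 * p / t }'
}

# Mapped percentage for one BAM.
# -F 0x900 drops secondary and supplementary alignments;
# -F 0x904 additionally drops unmapped reads.
mapped_pct() {
    local total mapped
    total=$(samtools view -c -F 0x900 "$1")
    mapped=$(samtools view -c -F 0x904 "$1")
    pct "$mapped" "$total"
}

# Number of distinct gene_id values in a GTF file.
count_genes() {
    grep -v '^#' "$1" | grep -o 'gene_id "[^"]*"' | sort -u | wc -l | tr -d ' '
}

# Only touch files that actually exist.
for bam in lineA.bam lineN.bam; do
    if [ -f "$bam" ]; then
        echo "$bam: $(mapped_pct "$bam")% mapped"
    fi
done
if [ -f genes.gtf ]; then
    echo "genes.gtf: $(count_genes genes.gtf) genes"
fi
```

Running this once per assembly (with that assembly's BAMs and GTF) gives a like-for-like comparison of mapping rate and annotation size.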

You should probably update your version of cufflinks too. Even though both runs used the same version, v2.0.2 is a long way behind the current release.

Have you tried doing this with DESeq2? It might be worth seeing whether this is specific to cufflinks or reflects something more general going on with your new genome and annotation.

04-28-2014, 11:52 AM   #3
rdsqc22
Junior Member
 
Location: Rochester

Join Date: Nov 2013
Posts: 7

My command, used in both cases, is simply:

cuffdiff --FDR 0.05 -u -b genome.fa -p 4 -L lineA,lineN -o cuffdiffout genes.gtf lineA.bam lineN.bam
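
To put the two runs side by side, something like the following could pull one gene's row out of each run's gene_exp.diff. This is a sketch under assumptions: the per-assembly output directories `cuffdiffout_asm4` and `cuffdiffout_asm5` are hypothetical names (the command above writes to `cuffdiffout`), the helper `gene_rows` is invented here, and the gene short name is assumed to sit in the third tab-separated column of gene_exp.diff.

```shell
#!/usr/bin/env bash
# Print the header plus the row(s) for one gene from a Cuffdiff gene_exp.diff file.
# Assumes the gene short name is in column 3 (test_id, gene_id, gene, locus, ...).
gene_rows() {
    awk -F'\t' -v g="$1" 'NR == 1 || $3 == g' "$2"
}

# Hypothetical output directories, one per assembly.
for dir in cuffdiffout_asm4 cuffdiffout_asm5; do
    if [ -f "$dir/gene_exp.diff" ]; then
        echo "== $dir"
        gene_rows Gfap "$dir/gene_exp.diff"
    fi
done
```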

We downgraded to 2.0.2 because we had run into trouble with version 2.1.1, which is what had been installed previously; our sequencing center uses 2.0.2, which is why that version was chosen. I'm currently doing another run with 2.2.0.

This is the older assembly used: http://www.ncbi.nlm.nih.gov/assembly/237618/
And the newer one: http://www.ncbi.nlm.nih.gov/assembly/382928

I'm familiar with Cuffdiff, which is why I used it. I'll try DESeq2, though.

Tags
cuffdiff, cufflinks, rna-seq
