SEQanswers

10-09-2012, 12:36 PM   #1
potato84
Member
Location: USA
Join Date: Jun 2012
Posts: 10
Inconsistency in Cuffdiff results

Hi all,

I use cuffdiff to compare my RNA-Seq samples, and the results I get are inconsistent.

For example, I have three samples: S1, S2, and S3. I first ran cuffdiff on a single pair, S1 vs. S2. Then I ran cuffdiff on all three samples. Since cuffdiff does pairwise comparisons, it reports all pairs: S1 vs. S2, S1 vs. S3, and S2 vs. S3.

The results I got for S1 vs. S2 from these two runs are different. I assumed they would be the same. Did I do something wrong, or does cuffdiff take more factors into account when there are more samples?

Thanks,
Xiaoyu

10-10-2012, 11:54 PM   #2
hlwright
Member
Location: Liverpool, UK
Join Date: Feb 2011
Posts: 30

I have noticed this as well. I also notice that in the new version of Cufflinks (2.0.2), cuffdiff produces a file with the individual RPKM values for replicates. I have 14 disease samples and 6 controls, so when I run cuffdiff I have two conditions with replicates (14 and 6, respectively). If I run the analysis as disease vs. control, I get different individual RPKM values than if I split the disease samples into "drug responder" and "drug non-responder" and re-run cuffdiff with three conditions (responder, non-responder, control). I would expect the individual RPKM values to be the same irrespective of the number of conditions.

Or am I misunderstanding something?

Thanks
Helen

10-11-2012, 07:06 AM   #3
potato84
Member
Location: USA
Join Date: Jun 2012
Posts: 10

I guess my problem is not exactly the same as yours, but it is similar. In your case, after you split the disease samples, the conditions have a different number of replicates than in your first run. In my case, I have exactly the same samples; I just added one sample for the second run.

10-11-2012, 07:13 AM   #4
hlwright
Member
Location: Liverpool, UK
Join Date: Feb 2011
Posts: 30

Xiaoyu

Yes, I have a different number of replicates when I run the analysis the second time, so I might expect the gene RPKM values in the genes.fpkm_tracking file (one RPKM per condition per gene) to be different. However, shouldn't the individual RPKM values for each sample (in the genes.read_group_tracking file) be the same no matter how the analysis was run?

Helen

11-01-2012, 08:14 AM   #5
potato84
Member
Location: USA
Join Date: Jun 2012
Posts: 10

Honestly, I don't know the answer. But if you are checking the Cuffdiff results, I guess they might be different, since your replicates are different and cuffdiff will normalize them differently.

11-01-2012, 08:00 PM   #6
masterpiece
Member
Location: malaysia
Join Date: Mar 2009
Posts: 40

Can you share with us the command you used to run cuffdiff?

11-02-2012, 06:49 AM   #7
potato84
Member
Location: USA
Join Date: Jun 2012
Posts: 10

This is the command I used. Thanks

Code:
cuffdiff -p 8 -o dfout -L S1,S2,S3 merged.gtf ./S1/accepted_hits.bam ./S2_R1/accepted_hits.bam,./S2_R2/accepted_hits.bam ./S3_R1/accepted_hits.bam,./S3_R2/accepted_hits.bam

11-02-2012, 08:49 AM   #8
mbblack
Senior Member
Location: Research Triangle Park, NC
Join Date: Aug 2009
Posts: 139

Keep in mind that in the absence of replicates, cuffdiff pools the conditions to derive its dispersion estimate. So your dispersion estimates may be very different when you run with only a pair of samples versus all three. That will inherently affect your estimates of significance when computing differences between pairs of samples, so you should not expect to get the same significance from the two analyses.
__________________
Michael Black, Ph.D.
The Hamner Institutes for Health Sciences
RTP, N.C.
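
A minimal sketch of the point made in the post above, for readers who want to see the effect in numbers. It is a toy illustration only and not Cuffdiff's actual model: the counts are hypothetical, and the `pooled_dispersion` and `z_score` helpers (a method-of-moments estimate for var = mu + alpha * mu^2, and a simple scaled difference) are assumptions chosen for the example. It treats one library per condition as a pooled set of "replicates" and shows that the pooled estimate depends on which conditions are in the run.

```python
from math import sqrt
from statistics import mean, variance


def pooled_dispersion(values):
    """Method-of-moments dispersion alpha for var = mu + alpha * mu**2.

    Toy estimator (an assumption for this sketch, not Cuffdiff's model):
    all values are treated as replicates of one pooled condition.
    """
    mu = mean(values)
    var = variance(values)  # sample variance, n - 1 denominator
    return max(0.0, (var - mu) / mu ** 2)


def z_score(x1, x2, alpha):
    """Difference between two counts, scaled by their modelled variances."""
    v1 = x1 + alpha * x1 ** 2
    v2 = x2 + alpha * x2 ** 2
    return abs(x2 - x1) / sqrt(v1 + v2)


# Hypothetical normalized counts for one gene, one library per condition.
counts = {"S1": 100, "S2": 140, "S3": 400}

# Run 1: only S1 and S2 are available for pooling.
pair = pooled_dispersion([counts["S1"], counts["S2"]])
# Run 2: S3 joins the pool, so the dispersion estimate changes.
trio = pooled_dispersion([counts["S1"], counts["S2"], counts["S3"]])

z_pair = z_score(counts["S1"], counts["S2"], pair)
z_trio = z_score(counts["S1"], counts["S2"], trio)

print(f"dispersion pooled over S1,S2:    {pair:.3f} -> z = {z_pair:.2f}")
print(f"dispersion pooled over S1,S2,S3: {trio:.3f} -> z = {z_trio:.2f}")
```

In this toy, the S1 and S2 counts are identical in both runs, yet the S1 vs. S2 statistic shrinks once S3 widens the pooled dispersion, which is the kind of run-to-run difference described in the thread.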

11-02-2012, 09:15 AM   #9
potato84
Member
Location: USA
Join Date: Jun 2012
Posts: 10

Thank you for answering my questions, mbblack.

Then should I expect to get the same result for S2 vs. S3? I have replicates for both of those samples.