SEQanswers

Old 01-18-2012, 08:24 PM   #1
peromhc
Senior Member
 
Location: Durham, NH

Join Date: Sep 2009
Posts: 108
Question cufflinks FPKM >>> Cuffdiff FPKM

I cannot understand why the FPKM estimated in cufflinks is SO much larger than that in cuffdiff:

Cufflinks
Code:
cufflinks -p8 -m320 -u -o /media/hd/working/tuco/17Jan12socialcuff -L social \
--upper-quartile-norm --max-mle-iterations 20000 \
/media/hd/working/tuco/b2.social/social.bam

cat transcripts.gtf | grep 'comp14388_c0_seq1'

comp14388_c0_seq1; FPKM "1630419.4581286784";
I merged the .gtf files from each cufflinks run, and fed that to cufflinks
I have 5 biological reps for each group

Cuffdiff
Code:
mkdir /media/hd/working/tuco/17Jan.cuffdiff
cd /media/hd/working/tuco/17Jan.cuffdiff

cuffdiff -p8 -L social,solitary -N -u \
--max-mle-iterations 10000 /media/hd/working/tuco/17Jan12cuffcompare/*gtf \
/media/hd/working/tuco/b2.bams/406A.bam,\
/media/hd/working/tuco/b2.bams/4262.bam,\
/media/hd/working/tuco/b2.bams/2354.bam,\
/media/hd/working/tuco/b2.bams/4241.bam,\
/media/hd/working/tuco/b2.bams/401C.bam \
/media/hd/working/tuco/b2.bams/6236.bam,\
/media/hd/working/tuco/b2.bams/2226.bam,\
/media/hd/working/tuco/b2.bams/5B5C.bam,\
/media/hd/working/tuco/b2.bams/255D.bam,\
/media/hd/working/tuco/b2.bams/4572.bam

cat gene_exp.diff | grep 'comp14388_c0_seq1'

comp14388_c0_seq1:0-1977	social	solitary	10.5437	8.08172

ok... 1630419.4581286784 >>> 10.5437 Why??
Old 01-19-2012, 07:57 AM   #2
peromhc
Senior Member
 
Location: Durham, NH

Join Date: Sep 2009
Posts: 108
Default

I should note that 'social.bam' is just the product of samtools merge over all the individuals in the social treatment. Those BAM files are listed individually in Cuffdiff to indicate that they are biological replicates.

So, in essence, the FPKM from social.bam from cufflinks should be the average value from all the individuals in that group.
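Whether the merged-BAM value equals the plain average depends on sequencing depth. Under the textbook FPKM definition (fragments per kilobase of transcript per million mapped fragments), pooling the reads gives a depth-weighted mean of the per-sample FPKMs, which matches the simple mean only when every replicate has the same number of mapped fragments. A minimal sketch with made-up counts follows; the function name and all replicate numbers are hypothetical, and it ignores the effective-length and normalization adjustments that Cufflinks applies.

```python
def fpkm(fragments, length_nt, total_mapped):
    """Textbook FPKM: fragments per kilobase per million mapped fragments."""
    return fragments * 1e9 / (length_nt * total_mapped)

# Hypothetical replicates: (fragments on the transcript, total mapped fragments).
replicates = [(120, 8_000_000), (300, 25_000_000), (90, 5_000_000)]
length_nt = 1977  # assumed from the locus span 0-1977 quoted in the first post

per_sample = [fpkm(c, length_nt, n) for c, n in replicates]
simple_mean = sum(per_sample) / len(per_sample)

# Pooling the reads (what samtools merge followed by one run would see).
pooled = fpkm(sum(c for c, _ in replicates), length_nt,
              sum(n for _, n in replicates))

# Depth-weighted mean of the per-sample values.
total_depth = sum(n for _, n in replicates)
weighted_mean = sum(f * n for f, (_, n) in zip(per_sample, replicates)) / total_depth

print(f"simple mean   = {simple_mean:.3f}")
print(f"pooled        = {pooled:.3f}")
print(f"weighted mean = {weighted_mean:.3f}")
```

A gap of this kind is small compared with the five orders of magnitude reported above, so unequal depth alone would not explain the discrepancy.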
Old 01-25-2012, 10:09 AM   #3
polyatail
Member
 
Location: New York, NY

Join Date: Dec 2010
Posts: 25
Default

Just at first glance, in your cufflinks run you specify two different parameters that will affect the FPKM calculation.
Code:
--upper-quartile-norm --max-mle-iterations 20000
I would try changing --max-mle-iterations to match cuffdiff, disabling upper-quartile normalization, and running the biological replicates through cufflinks separately to see whether the difference persists. Then I would try cufflinks with the merged BAMs. Internally the same code does the quantification in both cufflinks and cuffdiff.

Also, I noticed you're looking in transcripts.gtf for cufflinks and gene_exp.diff for cuffdiff. It would be better to look in isoforms.fpkm_tracking for both cufflinks and cuffdiff, as gene_exp.diff lists quantification at the locus level while transcripts.gtf is at the isoform level.
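The like-for-like comparison suggested here can be sketched as below. This is only an illustration: it assumes both files are tab-delimited with a `tracking_id` column, that the Cufflinks file has a single `FPKM` column, and that the Cuffdiff file has one `<label>_FPKM` column per condition (for example `social_FPKM`). Check the header line of your own files before relying on those names. The in-memory text stands in for the real files, and its values are invented.

```python
import csv
import io

def fpkm_by_id(handle, fpkm_column):
    """Map tracking_id -> FPKM from a tab-delimited *.fpkm_tracking file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["tracking_id"]: float(row[fpkm_column]) for row in reader}

# Tiny stand-ins for the two isoforms.fpkm_tracking files (invented values).
cufflinks_txt = (
    "tracking_id\tlength\tFPKM\n"
    "comp1_seq1\t1977\t12.5\n"
    "comp2_seq1\t37\t900.0\n"
)
cuffdiff_txt = (
    "tracking_id\tlength\tsocial_FPKM\tsolitary_FPKM\n"
    "comp1_seq1\t1977\t10.0\t8.0\n"
    "comp2_seq1\t37\t450.0\t30.0\n"
)

cufflinks = fpkm_by_id(io.StringIO(cufflinks_txt), "FPKM")
cuffdiff = fpkm_by_id(io.StringIO(cuffdiff_txt), "social_FPKM")

# Ratio per isoform; isoforms missing from Cuffdiff or with FPKM 0 are skipped.
ratios = {tid: cufflinks[tid] / cuffdiff[tid]
          for tid in cufflinks if cuffdiff.get(tid)}

for tid, ratio in sorted(ratios.items()):
    print(f"{tid}\t{cufflinks[tid]:.2f}\t{cuffdiff[tid]:.2f}\t{ratio:.2f}")
```

With real output, replace the `io.StringIO(...)` arguments with open file handles. A roughly constant ratio across isoforms would point at a scaling or normalization difference rather than a per-transcript problem.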
Old 01-25-2012, 07:21 PM   #4
peromhc
Senior Member
 
Location: Durham, NH

Join Date: Sep 2009
Posts: 108
Default

Also, I just realized that log10(1630419.4581286784) is about 6, which is pretty close to 10. I wonder if the difference is this easy.
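A quick check of that arithmetic, as a minimal sketch using only the two values quoted in the first post: the log10 of the Cufflinks value comes out a little above 6, while the Cuffdiff value for the same condition is about 10.5, so a log10 transform on its own would not make the two numbers agree.

```python
import math

cufflinks_fpkm = 1630419.4581286784  # from transcripts.gtf in the first post
cuffdiff_fpkm = 10.5437              # "social" value from gene_exp.diff

log10_cufflinks = math.log10(cufflinks_fpkm)
fold = cufflinks_fpkm / cuffdiff_fpkm

print(f"log10(cufflinks FPKM) = {log10_cufflinks:.2f}")
print(f"cufflinks / cuffdiff  = {fold:.0f}")
```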
Old 04-18-2012, 04:17 AM   #5
sudders
Member
 
Location: Sheffield, UK

Join Date: Dec 2011
Posts: 32
Default

Did you ever find a solution to this? We run into the same problem.

Our pipeline is thus:
We map reads with tophat for each sample
Run cufflinks on each sample to generate a transcriptome assembly

the command looks something like:
Code:
 cufflinks --label tax-Pre-R5 \
           --num-threads 4 \
           --library-type fr-secondstrand \
           --frag-bias-correct /ifs/mirror/genomes/bowtie/hg19.fa \
           --multi-read-correct \
           --upper-quartile-norm \
           /ifs/projects/proj004/rnaseq4/tax-Pre-R5.accepted.bam
Run Cuffmerge and Cuffcompare to generate merged gene sets.

We also run cuff diff to test for differences.

Our cuffdiff commands look like:

Code:
 cuffdiff --output-dir abinitio.cuffdiff.dir \
          --library-type fr-secondstrand \
          --upper-quartile-norm \
          --frag-bias-correct /ifs/mirror/genomes/bowtie/hg19.fa \
          --multi-read-correct \
          --verbose \
          --num-threads 16 \
          --labels Prostate-Pre-agg,Prostate-Post-agg,tax-Pre-agg,tax-Post-agg \
          --FDR 0.050000 \
          abinitio.gtf \
          Prostate-Pre-R7.accepted.bam,Prostate-Pre-R1.accepted.bam,Prostate-Pre-R4.accepted.bam,Prostate-Pre-R2.accepted.bam,Prostate-Pre-R8.accepted.bam,Prostate-Pre-R5.accepted.bam,Prostate-Pre-R3.accepted.bam,Prostate-Pre-R6.accepted.bam \
          Prostate-Post-R7.accepted.bam,Prostate-Post-R8.accepted.bam,Prostate-Post-R6.accepted.bam,Prostate-Post-R3.accepted.bam,Prostate-Post-R5.accepted.bam,Prostate-Post-R2.accepted.bam,Prostate-Post-R4.accepted.bam,Prostate-Post-R1.accepted.bam \
          tax-Pre-R1.accepted.bam,tax-Pre-R3.accepted.bam,tax-Pre-R2.accepted.bam,tax-Pre-R6.accepted.bam,tax-Pre-R4.accepted.bam,tax-Pre-R5.accepted.bam \
          tax-Post-R6.accepted.bam,tax-Post-R1.accepted.bam,tax-Post-R4.accepted.bam,tax-Post-R5.accepted.bam,tax-Post-R2.accepted.bam,tax-Post-R3.accepted.bam
If we compare the FPKMs coming out of cuffcompare and cuffdiff, they are not even within two or three orders of magnitude of each other: the cuffcompare FPKMs are in the millions or tens of millions, while the cuffdiff outputs are in the more sensible range of 0 to several hundred.

We're using cufflinks 1.3.1.
Old 08-01-2012, 07:49 AM   #6
mmanrique
Member
 
Location: Granada, Spain

Join Date: Dec 2009
Posts: 12
Default

Hi,

We had the same problem and tried the new Cufflinks version 2.0.2. With it, the values from Cufflinks and Cuffdiff seem to be the same (I still have to check this more carefully).

these are the commands I used

Code:
cufflinks -o ./Sample001_cufflinks_out_No_N_2.0.2 -u -g ../genes.gtf -p 2 --total-hits-norm ../Sample_001_accepted_hits.bam
Code:
cuffdiff -o ./COMPARISON1_SAMPLE1_SAMPLE1BIS_cuffdiff_out/ -L SAMPLE1,SAMPLE1BIS -p 2 -u -v --emit-count-tables --total-hits-norm ../Sample001_cufflinks_out/transcripts.gtf ../Sample_001_accepted_hits.sam ../Sample_001_bis_accepted_hits.sam
I know it's weird to use cuffdiff to compare one sample to itself but I had no other choice...

HTH

Marina

EDIT: Though the FPKM values from Cufflinks and Cuffdiff are now more similar, I still get unreasonably high FPKM values, especially for very short genes (around 37 nt, regulatory RNAs I guess). Searching for some kind of explanation I found this thread: http://seqanswers.com/forums/showthread.php?t=20702. It is worth reading; it has a good explanation by Cole Trapnell of why you can get extremely high FPKM values for small genes.
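The length effect is easy to see from the textbook FPKM formula alone. The sketch below is not the Cufflinks implementation (which also applies an effective-length correction that is not modeled here); the fragment count and library size are hypothetical, and only the 37 nt length comes from the post above.

```python
def fpkm(fragments, length_nt, total_mapped):
    """Textbook FPKM: fragments per kilobase per million mapped fragments."""
    return fragments * 1e9 / (length_nt * total_mapped)

fragments = 50             # hypothetical: same fragment count for every gene
total_mapped = 20_000_000  # hypothetical library size

# Same evidence, different transcript lengths.
by_length = {length: fpkm(fragments, length, total_mapped)
             for length in (37, 500, 2000)}

for length, value in by_length.items():
    print(f"{length:>5} nt -> FPKM {value:.2f}")
```

Because length sits in the denominator, the 37 nt gene ends up with an FPKM roughly 54 times that of the 2000 nt gene for identical fragment counts.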

Last edited by mmanrique; 08-04-2012 at 07:25 AM.
Old 10-17-2012, 01:07 PM   #7
IBseq
Member
 
Location: uk

Join Date: Jul 2012
Posts: 56
Default

Hi all, I had the same problem and was told to run cuffdiff WITHOUT the "-N" option (upper-quartile normalization).

hope it helps....
ib
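Why the normalization switch could matter this much can be illustrated with a toy calculation. Upper-quartile normalization in general replaces the total number of mapped fragments with the 75th percentile of per-gene counts as the scaling denominator. The sketch below assumes that substitution is made without any rescaling; whether a given Cufflinks or Cuffdiff version rescales afterwards is not established in this thread. All counts are hypothetical and the helper is a simple rank-based estimate, not the tool's own code.

```python
def upper_quartile(values):
    """75th percentile of the nonzero values, using a simple lower-rank estimate."""
    nonzero = sorted(v for v in values if v > 0)
    return nonzero[int(0.75 * (len(nonzero) - 1))]

# Hypothetical per-gene fragment counts for one sample.
counts = [0, 3, 12, 40, 75, 150, 400, 900, 5000, 120000]
gene_fragments, gene_length_nt = 150, 1500

total = sum(counts)          # denominator for total-hits normalization
uq = upper_quartile(counts)  # denominator for upper-quartile normalization

fpkm_total = gene_fragments * 1e9 / (gene_length_nt * total)
fpkm_uq = gene_fragments * 1e9 / (gene_length_nt * uq)

print(f"total = {total}, upper quartile = {uq}")
print(f"FPKM (total hits)     = {fpkm_total:.1f}")
print(f"FPKM (upper quartile) = {fpkm_uq:.1f}")
print(f"inflation factor      = {fpkm_uq / fpkm_total:.1f}")
```

The inflation factor is simply total / upper quartile. In a real library, where the total runs to millions of fragments and the upper quartile is a per-gene count, that factor could plausibly reach several orders of magnitude, so two runs that differ in this setting are not directly comparable.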

Tags
cuffdiff, cufflinks
