**id0** · 03-28-2014, 05:44 AM

According to the cuffnorm documentation:

Cuffnorm will report both FPKM values and normalized estimates for the number of fragments that originate from each gene, transcript, TSS group, and CDS group. Note that because these counts are already normalized to account for differences in library size, they should not be used with downstream differential expression tools that require raw counts as input.

So they specifically warn against using cuffnorm counts for tools that require raw counts.

I actually have a follow-up question. If these are just normalized counts, why can't they be used? When they get re-normalized by another tool, wouldn't they just come out the same as if they had never been normalized? The initial normalization shouldn't lose any information.

**gringer** · 03-29-2014, 11:53 AM

No, this is not an appropriate thing to do for either DESeq or edgeR. They assume raw counts are used as input, and these have a particular distribution that is assumed by the programs. The programs use the assumed distribution to estimate biological variation and determine statistical significance. While your normalised count values may be similar (or the same), the probability calculations will likely be off.
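A small simulation can make the point about the assumed distribution concrete. The sketch below is only an illustration under simplifying assumptions: it uses a Poisson model (DESeq and edgeR actually use a negative binomial, which adds biological dispersion on top), and the scaling factor of 0.25 is a hypothetical library-size correction. For raw Poisson counts the variance is about equal to the mean; after scaling, that relationship no longer holds, which is the kind of mismatch that throws the probability calculations off.

```python
import math
import random
import statistics


def sample_poisson(mu, rng):
    """Draw one Poisson(mu) value with Knuth's multiplication method."""
    threshold = math.exp(-mu)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def variance_to_mean_ratio(values):
    """Sample variance divided by sample mean (close to 1 for Poisson data)."""
    return statistics.variance(values) / statistics.mean(values)


rng = random.Random(0)
raw = [sample_poisson(20, rng) for _ in range(20000)]

# Hypothetical library-size scaling factor, as a normalizer might apply.
scale = 0.25
scaled = [count * scale for count in raw]

raw_ratio = variance_to_mean_ratio(raw)
scaled_ratio = variance_to_mean_ratio(scaled)
print(f"raw counts:    variance/mean = {raw_ratio:.2f}")
print(f"scaled counts: variance/mean = {scaled_ratio:.2f}")
```

The scaled ratio is the raw ratio multiplied by the scaling factor, so a tool that expects the raw mean-variance relationship would misjudge how noisy each gene is.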

**dpryan** · 03-30-2014, 02:22 AM

There seems to be a common misconception that tools like DESeq(2) actually store the normalized counts somewhere. They don't, in fact, which is why trying to input normalized counts will lead to no end of problems.
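To illustrate what gets lost, here is a rough sketch of median-of-ratios size factor estimation, the approach described for DESeq. It is a simplified stand-in written for this thread, not the package's code, and the toy count matrix is made up. Estimating size factors on already-normalized values returns factors of about 1, so the tool can no longer see how deeply each library was sequenced.

```python
import math
import statistics


def size_factors(counts):
    """Median-of-ratios size factors; counts is a list of per-gene rows."""
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(value <= 0 for value in row):
            continue  # genes containing a zero have no usable geometric mean
        log_geo_mean = sum(math.log(value) for value in row) / n_samples
        for j, value in enumerate(row):
            log_ratios[j].append(math.log(value) - log_geo_mean)
    return [math.exp(statistics.median(r)) for r in log_ratios]


# Toy data (hypothetical): sample 2 was sequenced twice as deeply as sample 1.
raw_counts = [[10, 20], [100, 200], [50, 100]]
raw_factors = size_factors(raw_counts)

# Divide each column by its size factor, then estimate the factors again.
normalized = [[v / f for v, f in zip(row, raw_factors)] for row in raw_counts]
renormalized_factors = size_factors(normalized)

print("size factors from raw counts:       ", raw_factors)
print("size factors from normalized values:", renormalized_factors)
```

On the raw counts the factors differ by a factor of two between the samples; on the normalized values both come out at roughly 1, even though the underlying libraries were not the same size.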

**zaki** · 05-20-2014, 12:15 AM

Following up on the original question:

If we were to use cuffnorm with the --library-norm-method parameter set to classic-fpkm, can the count data be used for DESeq/DESeq2?

classic-fpkm - Library size factor is set to 1 - no scaling applied to FPKM values or fragment counts. (default for Cufflinks)

Does this mean that library size normalization was not applied, and that the count data can therefore be treated as raw counts?

**raphael123** · 06-30-2014, 10:08 AM

Do you think there is a way to get the raw counts in a format that DESeq can read?
Or a way to read the binary file? I can't find one!

**dpryan** · 06-30-2014, 11:36 AM

What binary file? If you mean the BAM file, just use featureCounts or htseq-count.

**raphael123** · 06-30-2014, 11:39 AM

No, the raw count table:

Cuffquant writes a single output file, abundances.cxb, into the output directory. CXB files are binary files, and can be passed to Cuffnorm or Cuffdiff for further processing.

I would like to analyse the raw counts with DESeq2.

**dpryan** · 06-30-2014, 11:58 AM

I'm sure it's theoretically possible to read the CXB file, but since its format seems to have never been documented, you'd have to go through the source code and reverse-engineer its format. It'd be faster to just ignore it.

**raphael123** · 06-30-2014, 12:00 PM

Thanks for your answer!
So there is no way to get the read counts from the cuff-tools? Maybe I'm missing something here.

**dpryan** · 06-30-2014, 12:20 PM

Hard to say, there are a lot of undocumented areas of those programs. It's quick enough to just use featureCounts.

**raphael123** · 06-30-2014, 12:23 PM

Oh! So featureCounts is a tool to construct a count table from a SAM/BAM file?
Thank you!

**dpryan** · 06-30-2014, 12:25 PM

Yes, it's similar to htseq-count, though significantly faster.
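For anyone wanting to get from a featureCounts table to a plain count matrix, a minimal sketch follows. It assumes the default tab-delimited output: a leading comment line starting with `#`, then a header with Geneid, Chr, Start, End, Strand and Length followed by one column per BAM file, and integer counts. The helper name `read_featurecounts` and the example rows are made up for illustration.

```python
import csv
import io

# Annotation columns that precede the per-sample counts (assumed default layout).
ANNOTATION_COLUMNS = ("Geneid", "Chr", "Start", "End", "Strand", "Length")


def read_featurecounts(handle):
    """Return (sample_names, {gene_id: [counts]}) from a featureCounts table."""
    data_lines = (line for line in handle if not line.startswith("#"))
    rows = csv.reader(data_lines, delimiter="\t")
    header = next(rows)
    first_sample = len(ANNOTATION_COLUMNS)
    samples = header[first_sample:]
    counts = {}
    for row in rows:
        if not row:
            continue  # skip blank lines
        counts[row[0]] = [int(value) for value in row[first_sample:]]
    return samples, counts


# Hypothetical two-sample table in the assumed layout.
example = (
    "# Program:featureCounts (command line omitted)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\ta.bam\tb.bam\n"
    "geneA\tchr1\t100\t900\t+\t801\t12\t30\n"
    "geneB\tchr2\t50\t450\t-\t401\t0\t7\n"
)
samples, counts = read_featurecounts(io.StringIO(example))

# Print a plain gene-by-sample matrix of raw counts.
print("\t".join(["gene"] + samples))
for gene, values in counts.items():
    print("\t".join([gene] + [str(v) for v in values]))
```

The resulting gene-by-sample matrix of raw integers is the kind of input DESeq2 expects.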

**gringer** · 06-30-2014, 05:22 PM

To repeat myself, you shouldn't be using cufflinks output as input to DESeq2, because DESeq2 expects raw count data and depends on that for its model.

If you want to do isoform-level analysis with a DESeq-like workflow, look at DEXSeq, which has its own method of counting by using raw counts for exon bins.

**shi** · 07-01-2014, 03:10 PM

Another option is to use limma/voom, which accepts fractional counts.
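As a rough illustration of why fractional counts are not a problem there: voom works on log2 counts-per-million rather than on the integers themselves. The sketch below shows only that transform, log2((count + 0.5) / (library size + 1) × 10^6), and leaves out the precision weights that voom also estimates. The function name `log2_cpm` and the example values are assumptions for this post.

```python
import math


def log2_cpm(counts, lib_size=None):
    """log2 counts-per-million with voom-style offsets of 0.5 and 1."""
    if lib_size is None:
        lib_size = sum(counts)  # default: total count for this sample
    return [math.log2((c + 0.5) / (lib_size + 1.0) * 1e6) for c in counts]


# Fractional (estimated) counts for one hypothetical sample.
fractional_counts = [12.3, 0.0, 857.9, 3.4]
values = log2_cpm(fractional_counts)
for count, value in zip(fractional_counts, values):
    print(f"{count:8.1f} -> {value:6.2f} log2-CPM")
```

Nothing in the transform requires integers, which is consistent with fractional or estimated counts being acceptable input.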

cuffquant count data as input for DESeq/DEXseq

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News