**drdna** · 06-08-2012, 06:42 PM

I'm currently following up on this by generating a control dataset containing known transcript abundances. Stay tuned...

**drdna** · 06-09-2012, 11:33 AM

Cufflinks IS flawed

So, using artificially generated control datasets, I find that cufflinks is flawed in two ways:

First, its FPKM values are inflated. The problem is that the magnitude of the inflation varies from gene to gene, so there is no consistency in the error.

Second, the "locus" interval defined in the cuffdiff output is often just plain wrong. In many instances, the reported "locus" spans multiple transcripts and intergenic regions, even though the dataset contains reads from only one transcript. In other words, neither the .gtf file nor the input sequence data support expanding the "locus" to cover multiple genes.

**Portah** · 06-09-2012, 08:04 PM

Hi, I've run into almost the same problem. In addition, in the GTF file for mm9 from the UCSC annotation I have:

chr10 unknown exon 80640798 80640979 . + . gene_id "Eef2"; gene_name "Eef2"; p_id "P7224"; transcript_id "NM_007907"; tss_id "TSS5168";
chr10 unknown CDS 80641426 80641637 . + 2 gene_id "Eef2"; gene_name "Eef2"; p_id "P7224"; transcript_id "NM_007907"; tss_id "TSS5168";
chr10 unknown exon 80641426 80641637 . + . gene_id "Eef2"; gene_name "Eef2"; p_id "P7224"; transcript_id "NM_007907"; tss_id "TSS5168";
chr10 unknown exon 80641706 80641758 . + . gene_id "Snord37"; gene_name "Snord37"; transcript_id "NR_028549"; tss_id "TSS16143";
chr10 unknown CDS 80641826 80642004 . + 0 gene_id "Eef2"; gene_name "Eef2"; p_id "P7224"; transcript_id "NM_007907"; tss_id "TSS5168";
chr10 unknown exon 80641826 80642004 . + . gene_id "Eef2"; gene_name "Eef2"; p_id "P7224"; transcript_id "NM_007907"; tss_id "TSS5168";
chr10 unknown CDS 80642091 80642196 . + 1 gene_id "Eef2"; gene_name "Eef2"; p_id "P7224"; transcript_id "NM_007907"; tss_id "TSS5168";
chr10 unknown exon 80642091 80642196 . + . gene_id "Eef2"; gene_name "Eef2"; p_id "P7224"; transcript_id "NM_007907"; tss_id "TSS5168";
chr10 unknown CDS 80642289 80642541 . + 0 gene_id "Eef2"; gene_name "Eef2"; p_id "P7224"; transcript_id "NM_007907"; tss_id "TSS5168";

The Snord37 gene lies inside Eef2 and its length is just 52, but in the cuffdiff output I get:
Snord37 Snord37 Snord37 chr10:80639375-80645254 Control IL33 OK 0 8173.79 1.79769e+308 1.79769e+308 0.0786496 0.428305 no

The locus size is 5879. Also, cufflinks reports 8173.79 FPKM for Snord37, but there are just 2 reads for it in the BAM file.

I have a couple of other examples. I've tested this with cufflinks versions 1.2.1, 1.3.0 and 2.0.0, and the result is the same.
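For anyone who wants to run the same check on their own genes, here is a minimal Python sketch that compares the exon span of each gene in a GTF excerpt with the locus string reported by cuffdiff. The function names are made up for illustration, and `naive_fpkm` is only the textbook definition (reads per kilobase per million mapped reads), not the statistical model cufflinks actually uses. The total mapped read count in the demo is a hypothetical placeholder, since it was not given in the post. Note that GTF coordinates are 1-based and inclusive, so the Snord37 exon is 53 bp when counted that way (52 is end minus start).

```python
import re


def exon_spans(gtf_lines):
    """Return {gene_id: (chrom, min_start, max_end)} using exon records only."""
    spans = {}
    for line in gtf_lines:
        fields = line.split(None, 8)  # 8 fixed columns + attribute string
        if len(fields) < 9 or fields[2] != "exon":
            continue
        chrom, start, end = fields[0], int(fields[3]), int(fields[4])
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        if match is None:
            continue
        gene = match.group(1)
        if gene in spans:
            _, old_start, old_end = spans[gene]
            start, end = min(old_start, start), max(old_end, end)
        spans[gene] = (chrom, start, end)
    return spans


def parse_locus(locus):
    """Split a cuffdiff locus string such as 'chr10:80639375-80645254'."""
    chrom, coords = locus.split(":")
    start, end = coords.split("-")
    return chrom, int(start), int(end)


def naive_fpkm(reads, length_bp, total_mapped):
    """Textbook FPKM; NOT the cufflinks estimator."""
    return reads * 1e9 / (length_bp * total_mapped)


# Two exon records taken from the excerpt above (attributes shortened).
GTF_LINES = [
    'chr10 unknown exon 80641426 80641637 . + . gene_id "Eef2"; transcript_id "NM_007907";',
    'chr10 unknown exon 80641706 80641758 . + . gene_id "Snord37"; transcript_id "NR_028549";',
]

if __name__ == "__main__":
    spans = exon_spans(GTF_LINES)
    chrom, start, end = spans["Snord37"]
    _, locus_start, locus_end = parse_locus("chr10:80639375-80645254")
    print("annotated length (inclusive):", end - start + 1)
    print("reported locus width (end - start):", locus_end - locus_start)
    # total_mapped is a hypothetical value, chosen only for illustration.
    print("naive FPKM for 2 reads:", naive_fpkm(2, end - start + 1, 20_000_000))
```

If the reported locus is much wider than the annotated span, as it is here, that at least confirms the discrepancy is in the output and not in the annotation.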

**drdna** · 06-10-2012, 05:22 AM

I'm glad to hear that someone else can verify my suspicions. I have contacted the TopHat/Cufflinks support site about this, but I do not expect a reply because they ignored a previous question I submitted about a month ago.

**NicoBxl** · 06-11-2012, 12:59 AM

Is there another tool like cufflinks to compare the results with?

**colindaven** · 06-11-2012, 04:02 AM

I haven't tried cufflinks but have heard others complaining at conferences.

I have been impressed with edgeR and use that in production here.

**NicoBxl** · 06-11-2012, 04:04 AM

Originally posted by colindaven

I haven't tried cufflinks but have heard others complaining at conferences.

I have been impressed with edgeR and use that in production here.

Yes, edgeR and DESeq work pretty well. But is there a tool to perform a reference-based transcriptome assembly (like cufflinks)?

**GenoMax** · 06-11-2012, 04:18 AM

Have you looked at MapSplice? http://www.netlab.uky.edu/p/bioinfo/MapSplice

**pbluescript** · 06-11-2012, 04:26 AM

Originally posted by NicoBxl

Yes, edgeR and DESeq work pretty well. But is there a tool to perform a reference-based transcriptome assembly (like cufflinks)?

Have you tried Scripture?

http://www.broadinstitute.org/software/scripture/

**rboettcher** · 06-11-2012, 05:42 AM

Hi all,
I just finished my first cufflinks run on RNA-seq data, and I also encountered results that make me doubt the validity of cufflinks' and cuffdiff's output.

Therefore I'm also considering switching my analysis pipeline and rerunning the analysis. However, during an Agilent seminar last week it was mentioned that Scripture would be an alternative, but that it is heavyweight and requires serious computational resources to perform the assembly. So my question is: does anybody already have experience with Scripture and, if so, could you recommend the machine specifications needed?

Best regards

**drdna** · 06-11-2012, 04:44 PM

MapSplice is good for gene structure analysis but doesn't do differential expression analysis.

**NicoBxl** · 06-12-2012, 12:40 AM

Originally posted by pbluescript

Have you tried Scripture?
http://www.broadinstitute.org/software/scripture/

Yes, but I have had several problems running it. I'll open a new thread now with my problems.

Edit: here's the thread for Scripture: http://seqanswers.com/forums/showthr...5998#post75998

**drdna** · 06-12-2012, 12:44 PM

Upon quick inspection, it appears to me that Scripture simply assembles transcripts but does not quantify and compare expression levels. Is that the case?

**chadn737** · 06-12-2012, 01:39 PM

Originally posted by NicoBxl

Yes, edgeR and DESeq work pretty well. But is there a tool to perform a reference-based transcriptome assembly (like cufflinks)?

One alternative would be to run cufflinks and then use the transcripts.gtf from cufflinks, or the combined.gtf from cuffcompare, as the annotation input for something like htseq-count. That will give you a list of features with raw read counts, which can then be used in either DESeq or edgeR.

This approach would avoid any potential problems with cufflinks' quantification and differential expression while keeping the advantage of a reference-based transcriptome assembly.
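As a rough illustration of the last step, the per-sample htseq-count outputs (one tab-separated `feature<TAB>count` line per feature) can be merged into a single count matrix before loading it into DESeq or edgeR. The Python sketch below is only one way to do it. The function names and the demo counts are hypothetical, and it assumes the summary rows at the end of each file are named either `no_feature`, `ambiguous`, etc. (older HTSeq releases) or the same names prefixed with `__` (newer releases).

```python
import csv
import io

# Summary rows written by htseq-count after the per-feature counts.
SPECIAL = {
    "no_feature", "ambiguous", "too_low_aQual",
    "not_aligned", "alignment_not_unique",
}


def read_htseq_counts(handle):
    """Parse one htseq-count output into {feature: count}, skipping summary rows."""
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 2:
            continue
        feature, value = row
        if feature.startswith("__") or feature in SPECIAL:
            continue
        counts[feature] = int(value)
    return counts


def build_matrix(samples):
    """Turn {sample: {feature: count}} into (header, rows); missing features get 0."""
    names = sorted(samples)
    features = sorted(set().union(*(samples[name] for name in names)))
    rows = [[f] + [samples[name].get(f, 0) for name in names] for f in features]
    return ["feature"] + names, rows


if __name__ == "__main__":
    # Hypothetical counts, standing in for two real htseq-count output files.
    control = read_htseq_counts(io.StringIO("Eef2\t10\nSnord37\t0\n__no_feature\t5\n"))
    treated = read_htseq_counts(io.StringIO("Eef2\t7\nSnord37\t2\n__no_feature\t3\n"))
    header, rows = build_matrix({"Control": control, "IL33": treated})
    print("\t".join(header))
    for row in rows:
        print("\t".join(str(cell) for cell in row))
```

The resulting table holds raw integer counts, which is what DESeq and edgeR expect as input, rather than FPKM values.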

Is cufflinks fundamentally flawed?