SEQanswers

04-26-2010, 04:16 AM   #1
anna_vt
Junior Member
 
Location: London

Join Date: Jun 2009
Posts: 5
Differential expression analysis workflow in Cufflinks

Hi,

I'm hoping that someone can help me, as I couldn't work out how to do
this from the manual. Would someone be able to give me the steps in a
differential expression analysis?

I have run TopHat with the following command for each of my two Solexa
sequence.txt RNA-seq files separately:

tophat --solexa1.3-quals -p 2 -o 101/100315/tophat/ \
  ~/software/bowtie-0.12.2/indexes/m_musculus_ncbi37 \
  101/100315/s_2_sequence.txt &> 101/100315/tophat/tophat.out &

I would like to get the expression levels for all Ensembl transcripts.
I have downloaded the GTF file from Ensembl
(ftp://ftp.ensembl.org/pub/current_gtf/mus_musculus).

However, when I run

cuffdiff ~/data/gtf/Mus_musculus.NCBIM37.57.gtf \
  101/100315/tophat/accepted_hits.sam 95/100315/tophat/accepted_hits.sam \
  &> cuffdiff.out &

or

cufflinks -G ~/data/gtf/Mus_musculus.NCBIM37.57.gtf \
  101/100315/tophat/accepted_hits.sam \
  &> 101/100315/tophat/cufflinks.out &

I get the following error:

Error: duplicate GFF ID 'ENSMUST00000127664' (or exons too far apart)!

I'm pretty sure I've misunderstood the workflow. If someone could give me
an overview of the steps and tell me which GTF file I should be using,
that would be great.

Many Thanks
Anna

(Cross posted to Bowtie forum)
04-27-2010, 07:26 AM   #2
Boel
Member
 
Location: Stockholm, Sweden

Join Date: Oct 2009
Posts: 62

I get the same error message. If you look at transcript ENSMUST00000127664, it is indeed very long and has an intron of about 4.4 Mb. This is well above the default maximum intron length (300,000 bp), which is why you get this error.
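
For anyone who wants to check which transcripts in an annotation have introns above that limit, here is a minimal sketch. It assumes a standard nine-column, tab-separated GTF with a transcript_id attribute in column 9; the function name long_introns and the file name in the usage line are only illustrative, and this is not something Cufflinks itself provides.

```shell
#!/bin/sh
# Sketch: list transcripts whose largest intron exceeds a threshold.
# Assumes a tab-separated GTF with transcript_id "..." in column 9.
long_introns() {
    gtf=$1
    max_intron=${2:-300000}   # threshold in bp; 300000 is the default quoted in the thread

    # Step 1: keep exon rows and emit transcript_id, start, end.
    awk -F'\t' '$3 == "exon" {
        if (match($9, /transcript_id "[^"]+"/)) {
            # Skip the 15-character prefix and the closing quote.
            id = substr($9, RSTART + 15, RLENGTH - 16)
            print id "\t" $4 "\t" $5
        }
    }' "$gtf" |
    # Step 2: order exons by transcript, then by start coordinate.
    sort -t "$(printf '\t')" -k1,1 -k2,2n |
    # Step 3: measure the gap between consecutive exons of a transcript.
    awk -F'\t' -v max="$max_intron" '
        $1 == prev_id {
            gap = $2 - prev_end - 1
            if (gap > largest[$1]) largest[$1] = gap
        }
        { prev_id = $1; prev_end = $3 }
        END {
            for (id in largest)
                if (largest[id] > max) print id "\t" largest[id]
        }'
}

# Hypothetical usage:
# long_introns Mus_musculus.NCBIM37.57.gtf 300000
```

Each output line is a transcript ID followed by its largest intron in bp, so a transcript like the one discussed here should show up if its intron really is that long.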


Also, make sure that your GTF file contains only exon rows, not CDS or transcript rows as well. Otherwise every record in the GTF is duplicated. Maybe you have done this already, but just in case.
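
One possible way to do that filtering, as a sketch: it assumes the feature type is in column 3 of a tab-separated GTF, and the function name exons_only and the output file name are made up for illustration.

```shell
#!/bin/sh
# Sketch: keep only the exon rows of a GTF (feature type is column 3).
# Rows of any other type (CDS, start_codon, ...) and comment lines are dropped.
exons_only() {
    awk -F'\t' '$3 == "exon"' "$1"
}

# Hypothetical usage:
# exons_only Mus_musculus.NCBIM37.57.gtf > Mus_musculus.NCBIM37.57.exons.gtf
```

The filtered file can then be passed to cufflinks -G or cuffdiff in place of the original annotation.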

Last edited by Boel; 04-27-2010 at 07:35 AM. Reason: one more thing!
05-17-2010, 03:11 AM   #3
Wei-HD
Member
 
Location: Germany

Join Date: Oct 2009
Posts: 59

Hi All,

I hit the same error with transcript ENSMUST00000127664, but when I use Mus_musculus.NCBIM37.56.gtf and the corresponding index, the error does not occur. Is that because the annotation was updated between releases?

Well, I manually deleted the rows containing transcript ENSMUST00000127664 from the annotation file, and that solved the problem. I will try cuffdiff later on. But what if someone is interested in this gene/transcript? Then deleting it might not be a good idea. Hopefully someone can give a good explanation!

Thanks!
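
The manual deletion described above could be scripted roughly as follows. This is only a sketch of the workaround, not a fix for the underlying error; it assumes the attribute is written as transcript_id "ID" in the GTF, and the function name drop_transcript and the output file name are hypothetical.

```shell
#!/bin/sh
# Sketch: write a copy of a GTF without the rows of one transcript.
# $1 = GTF file, $2 = transcript ID to remove.
drop_transcript() {
    gtf=$1
    id=$2
    # -F matches the string literally; the closing quote prevents
    # matching longer IDs that merely start with the same characters.
    grep -v -F "transcript_id \"$id\"" "$gtf"
}

# Hypothetical usage:
# drop_transcript Mus_musculus.NCBIM37.57.gtf ENSMUST00000127664 > filtered.gtf
```

Keeping the original file and writing the result to a new one makes it easy to restore the transcript later if it turns out to be of interest.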

Last edited by Wei-HD; 05-17-2010 at 07:40 AM.
12-19-2010, 02:52 AM   #4
Gangcai
Member
 
Location: Shanghai, China

Join Date: Nov 2009
Posts: 30

Quote:
Originally Posted by Wei-HD
Hi All,

I hit the same error with transcript ENSMUST00000127664, but when I use Mus_musculus.NCBIM37.56.gtf and the corresponding index, the error does not occur. Is that because the annotation was updated between releases?

Well, I manually deleted the rows containing transcript ENSMUST00000127664 from the annotation file, and that solved the problem. I will try cuffdiff later on. But what if someone is interested in this gene/transcript? Then deleting it might not be a good idea. Hopefully someone can give a good explanation!

Thanks!

I ran into the same problem. Did you manage to solve it?
12-19-2010, 03:04 AM   #5
Wei-HD
Member
 
Location: Germany

Join Date: Oct 2009
Posts: 59

Hi Gangcai,

Sorry, I did not figure out a solution for this. I just stuck with the old index and GTF file, since all my samples were analyzed against the old index (NCBIM37.56). I also use the DESeq R package for all the gene expression level analysis.

I wonder what other SEQers think?

Thanks!

Tags
cufflinks, differential expression, rna-seq
