Old 06-09-2011, 06:54 AM   #1
Junior Member
Location: Toulouse, France

Join Date: Mar 2011
Posts: 6
cufflinks: analysis comparison with and without a GTF reference file

I have many questions about cufflinks output. Here is one of them:
First I used tophat to map my RNA-seq reads (100 bp) and obtained an accepted_hits.bam file.
Then I ran cufflinks in two ways:
- simply: cufflinks accepted_hits.bam
- with a GTF file, which is the current annotation of my genome (eucalyptus):
cufflinks -g annotGenome.gtf accepted_hits.bam
Note that I used the -g option and not the -G option.
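
For readers comparing the modes, here is a minimal Python sketch that builds the command line for each one. The flags are the documented Cufflinks options (-g/--GTF-guide assembles novel transcripts guided by the annotation, -G/--GTF only quantifies the annotated transcripts, -o sets the output directory). The function name and the output directory names are assumptions made up for this illustration.

```python
import shlex

def cufflinks_cmd(bam, mode="denovo", gtf=None, out_dir=None):
    """Build a Cufflinks argument list for one of three annotation modes.

    The mode names are hypothetical labels for this sketch:
    'denovo' uses no annotation, 'guided' uses -g, 'reference_only' uses -G.
    """
    flags = {"denovo": None, "guided": "-g", "reference_only": "-G"}
    if mode not in flags:
        raise ValueError(f"unknown mode: {mode}")
    cmd = ["cufflinks"]
    if out_dir:
        # Separate output directories keep one run from overwriting another.
        cmd += ["-o", out_dir]
    flag = flags[mode]
    if flag:
        if gtf is None:
            raise ValueError(f"mode '{mode}' needs a GTF file")
        cmd += [flag, gtf]
    cmd.append(bam)
    return cmd

if __name__ == "__main__":
    runs = [
        ("denovo", None, "cuff_denovo"),
        ("guided", "annotGenome.gtf", "cuff_guided"),
        ("reference_only", "annotGenome.gtf", "cuff_refonly"),
    ]
    for mode, gtf, out_dir in runs:
        print(shlex.join(cufflinks_cmd("accepted_hits.bam", mode, gtf, out_dir)))
```

Each list can be passed to subprocess.run. Writing every run to its own directory matters here because Cufflinks writes transcripts.gtf to the current directory by default, so two runs launched from the same place would overwrite each other.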

One example result:
- without a reference GTF:
one gene / one isoform: 12110-17714
The first part of this isoform, 12-16530, has the same intron/exon structure as the isoforms built with the reference.
It then ends with a last exon, 16530-17714.

- with the reference GTF:
one gene / two isoforms
+ transcript 1: 12024-17350 = exactly the reference transcript
full_read_support "no";
The region corresponding to the last exon of the no-reference run is now split into two exons:
16530 - 16561
16597 - 17350
That is what my reference says, but in my run this intron is covered by mapped reads. No read is split across it. A few reads begin at position 16595, and I've checked that no read ends at 16561.
I think this RNA doesn't exist in my transcriptome.
+ transcript 2: 12024-17714
full_read_support "yes";
last exon: 16530-17714, the same exon as in the no-reference version
Why does transcript 2 contain the 12024-12109 portion, which has no RNA-seq reads mapped to it (whereas the reference, i.e. transcript 1, begins with this sequence)?
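
The manual check described for the 16562-16596 intron (are there split reads, or do reads simply run through it?) can be scripted. Below is a minimal sketch that works on plain SAM lines and needs only the standard library. The function names and the toy reads are invented for illustration; the coordinates come from the exons quoted above.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def blocks_and_introns(pos, cigar):
    """Return (aligned_blocks, introns) as 1-based inclusive intervals."""
    blocks, introns = [], []
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":            # aligned bases
            blocks.append((ref, ref + length - 1))
            ref += length
        elif op == "D":            # deletion: consumes reference only
            ref += length
        elif op == "N":            # skipped region, i.e. a spliced intron
            introns.append((ref, ref + length - 1))
            ref += length
        # I, S, H and P do not consume reference positions
    return blocks, introns

def intron_support(sam_lines, intron_start, intron_end):
    """Count reads spliced exactly over the intron vs. reads aligned inside it."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):   # header line
            continue
        fields = line.rstrip("\n").split("\t")
        flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
        if flag & 4 or cigar == "*":   # unmapped read
            continue
        blocks, introns = blocks_and_introns(pos, cigar)
        if (intron_start, intron_end) in introns:
            counts["spliced"] += 1
        elif any(s <= intron_end and e >= intron_start for s, e in blocks):
            counts["covering"] += 1
    return counts

def toy_read(name, pos, cigar):
    """Build a toy SAM record (made-up data, not from the real BAM)."""
    return "\t".join([name, "0", "scaffold_1", str(pos), "50", cigar,
                      "*", "0", "0", "*", "*"])

if __name__ == "__main__":
    reads = [
        toy_read("r1", 16540, "22M35N78M"),  # spliced over 16562-16596
        toy_read("r2", 16550, "100M"),       # runs straight through the intron
        toy_read("r3", 16595, "100M"),       # starts inside the intron
        toy_read("r4", 17000, "100M"),       # downstream, irrelevant
    ]
    print(dict(intron_support(reads, 16562, 16596)))
```

On real data the lines could come from the output of samtools view on accepted_hits.bam restricted to this region. A result with zero "spliced" reads and several "covering" reads would match the observation above that the annotated intron has no read support.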

For the two isoforms, I have FPKM values (4 for transcript 1, which doesn't seem to exist in my transcriptome, and 13 for transcript 2). How does cufflinks assign those values?

In the run without a GTF reference, I get FPKM=36, which is about double the total from the run with the reference (13+4=17), even though the mapping file is the same.
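
To make this comparison systematic across all genes, the per-gene totals can be computed from each run's transcripts.gtf. This is a minimal sketch that assumes the usual Cufflinks layout, where each "transcript" line carries gene_id and FPKM attributes. The function names, the scaffold name and the toy records are made up; only the FPKM values 4, 13 and 36 come from the runs described above.

```python
import re
from collections import defaultdict

ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

def gene_fpkm(gtf_lines):
    """Sum the FPKM attribute of 'transcript' records per gene_id."""
    totals = defaultdict(float)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "transcript":
            continue  # skip exon lines so nothing is counted twice
        attrs = dict(ATTR_RE.findall(cols[8]))
        if "gene_id" in attrs and "FPKM" in attrs:
            totals[attrs["gene_id"]] += float(attrs["FPKM"])
    return dict(totals)

def toy_transcript(tid, start, end, fpkm):
    """Build a toy transcript record (made-up data for illustration)."""
    attrs = f'gene_id "CUFF.1"; transcript_id "{tid}"; FPKM "{fpkm}";'
    return "\t".join(["scaffold_1", "Cufflinks", "transcript",
                      str(start), str(end), "1000", "+", ".", attrs])

if __name__ == "__main__":
    guided = [toy_transcript("CUFF.1.1", 12024, 17350, 4.0),
              toy_transcript("CUFF.1.2", 12024, 17714, 13.0)]
    denovo = [toy_transcript("CUFF.1.1", 12110, 17714, 36.0)]
    print("guided:", gene_fpkm(guided))
    print("de novo:", gene_fpkm(denovo))
```

With real files, pass an open file handle for each transcripts.gtf. Gene IDs are assigned per run, so matching genes between two runs has to be done by coordinates rather than by ID.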

Finally, note that these transcripts are located on the forward strand of the genome, and that neither the GTF nor the cufflinks results contain anything on the opposite strand at this location.

Many thanks for your suggestion,

sohnic
Old 12-17-2013, 01:45 AM   #2
Stefano Manzini
Junior Member
Location: Milan

Join Date: Dec 2013
Posts: 3

Hello sohnic,

I have no answer to your question, but I would like to ask you one, because you're posting about the very same issue I am wondering about.

Can you give me some hints about the difference between running cufflinks with and without a local reference?
I would like to understand what cufflinks uses to assemble the information contained in the target.bam file when you don't provide a local reference. I'd also like to know whether providing one makes cufflinks name the transcripts with gene names instead of CUFF.65279 and the like.

Thank you for your help.
Old 11-03-2016, 06:28 PM   #3
Junior Member
Location: Boston Ma

Join Date: Oct 2016
Posts: 1
My experience with cufflinks

Cufflinks uses the mapped reads and the reference genome to build a parsimonious assembly (a minimum-spanning-tree-esque structure) that explains the reads. The logic of the algorithm is based on the Burset and Guigo paper (Burset and Guigo, "Evaluation of Gene Structure Prediction Programs", 1996) and their observations about the high false positive rates of gene-finding programs that are guided by annotation. This is my understanding of the matter, based on the past year I have spent working on similar issues.
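
One way to make "parsimonious assembly" concrete: as far as I can tell from the Cufflinks paper (Trapnell et al., 2010), the assembler looks for the smallest set of transcripts that explains all fragments, which it frames as a minimum path cover of a graph of compatible fragments, solved through bipartite matching. The sketch below is a toy version of that idea only, not Cufflinks' actual code. The node numbering, the edge lists and the function name are assumptions for illustration.

```python
def min_path_cover(n, edges):
    """Minimum vertex-disjoint path cover of a DAG with nodes 0..n-1.

    Uses the classic reduction: cover size = n - (maximum bipartite matching),
    where an edge u -> v means fragment v can follow fragment u in a transcript.
    Returns the list of paths.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)

    pred = [-1] * n  # pred[v] = node chosen as the predecessor of v

    def try_augment(u, seen):
        # Kuhn's augmenting-path step: find a successor for u,
        # re-routing earlier choices if needed.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if pred[v] == -1 or try_augment(pred[v], seen):
                pred[v] = u
                return True
        return False

    for u in range(n):
        try_augment(u, set())

    succ = {pred[v]: v for v in range(n) if pred[v] != -1}
    paths = []
    for start in range(n):
        if pred[start] != -1:
            continue  # not the first node of a path
        path, node = [start], start
        while node in succ:
            node = succ[node]
            path.append(node)
        paths.append(path)
    return paths

if __name__ == "__main__":
    # Fragment 0 is compatible with fragments 1 and 2,
    # but 1 and 2 are incompatible with each other.
    print(min_path_cover(3, [(0, 1), (0, 2)]))
    # Three mutually compatible fragments collapse into one path.
    print(min_path_cover(3, [(0, 1), (1, 2), (0, 2)]))
```

In the first toy case two paths are needed, which mirrors two isoforms that share an upstream part but differ downstream. In the second case one transcript is enough to explain every fragment, so a parsimonious assembler would report only one.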
ahalfpen
