SEQanswers

Old 05-09-2011, 11:05 PM   #21
ngs_agd
Junior Member
 
Location: India

Join Date: Feb 2011
Posts: 7
Default

Sorry, in my previous thread I asked whether the cuffcompare file needs to be edited. I just looked at a cuffcompare file, and it seems to contain only annotation information and no FPKM values. So how (or where) is one supposed to combine the FPKM values from different transcripts for a gene and then run cuffdiff?
Old 05-10-2011, 12:22 AM   #22
honey
Senior Member
 
Location: Pittsburgh

Join Date: Feb 2010
Posts: 151
Default Read

It is not clear what you are asking. However, I agree that FPKM per gene is an area of ongoing research.
Old 05-10-2011, 12:59 AM   #23
ngs_agd
Junior Member
 
Location: India

Join Date: Feb 2011
Posts: 7
Default

Hi Honey,
Sorry if I am not being clear. This is what I have done so far, and I am struggling to make sense of the information I am getting:
1. I have 2 .bam files (1 control and 1 disease). I am trying to identify gene expression differences.
2. Using Galaxy, I ran the cufflinks-cuffcompare-cuffdiff workflow.
3. For cufflinks, I took the .bam files and ran cufflinks with the defaults.
4. I ran cuffcompare (with the assembled transcripts file from each sample, along with the reference).
5. I fed the output (transcript file) of cuffcompare, along with the two original .bam files, into cuffdiff.
6. I was looking at the output of cuffdiff and am seeing a few things I don't quite understand:
There is more than one row per gene for most of the genes in the output file (I would have thought that differential expression would be reported at the gene level). I read in some other threads on SEQanswers (including this one) that summing the FPKM values of the transcripts should give me the gene-level value (which is fine). What I don't understand is which output file from the workflow I should perform the operation on:
a) The cufflinks output has the FPKM values, but no gene annotations.
b) The cuffcompare output has the annotations, but not the FPKM values (unless I am missing them).
c) The cuffdiff output has both the FPKM values and the gene annotations, but the "statistical" analysis is already done.
So should I take the cuffdiff output, edit it, and then feed it back into the workflow (and again, at what point)?
This is where my first confusion is coming from.

There is another (possibly related) issue: some of the transcripts in the cuffdiff output have FPKM = 0, so when the differential analysis is run, the fold changes (FC) are ridiculous.
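To illustrate why FPKM = 0 gives extreme fold changes, here is a minimal Python sketch of one common workaround: adding a small pseudocount to both values before taking the log2 ratio. This is an assumption for illustration only, not what cuffdiff does internally; the function name `log2_fold_change` and the default pseudocount of 1.0 are hypothetical choices.

```python
import math


def log2_fold_change(fpkm_a, fpkm_b, pseudocount=1.0):
    """Return log2((fpkm_b + pseudocount) / (fpkm_a + pseudocount)).

    Without the pseudocount, an FPKM of 0 in either condition makes the
    ratio 0 or undefined, so the fold change is infinite or meaningless.
    The pseudocount value is an arbitrary illustrative choice.
    """
    if fpkm_a < 0 or fpkm_b < 0:
        raise ValueError("FPKM values must be non-negative")
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    return math.log2((fpkm_b + pseudocount) / (fpkm_a + pseudocount))


# A transcript absent in the control (FPKM 0) and present in the disease sample.
print(log2_fold_change(0.0, 3.0))
```

The pseudocount damps fold changes for weakly expressed transcripts, so the result depends on the value chosen and should be read with that in mind.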

What makes this all the more frustrating is that I am trying to use published data (from a paper that lists genes differentially expressed between conditions, analyzed using Galaxy) in a bid to educate myself, and I am going in circles.

As you pointed out in one of my other threads, I have a lot of reading to do. But at the risk of sounding like a nag and unbelievably dense, I have been unsuccessful in finding material that might help me understand these things.

Any help from anybody is greatly appreciated.
Old 05-10-2011, 06:49 AM   #24
honey
Senior Member
 
Location: Pittsburgh

Join Date: Feb 2010
Posts: 151
Default

You should look at the cuffdiff output files gene.expr and isoform.expr, which are the diff files, and at the combined GTF file. To get one FPKM per gene, the suggestion is to sum the FPKM values that correspond to the same gene name and the same location. However, as Adam has also suggested, if a gene has more than one location (overlap), it may not be possible to sum those FPKM values. This is an ongoing area of research. I am not convinced that summing the FPKM of all rows per gene is a good idea, though several publications, including a recent one, have reported doing so (http://genome.cshlp.org/content/earl...d-4783a31b68c6). My suggestion is that if you are trying to learn RNA-seq, start with isoform.expr, not the gene level.
Best.
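
The summing described above can be sketched in Python as follows. This is only a rough illustration: the column names `gene`, `locus`, and `FPKM` and the function name `sum_fpkm_per_gene` are assumptions, so adjust them to the header of your actual file. Rows are grouped by gene name and location together, so a gene name that appears at two different loci is kept as two separate entries and is not merged.

```python
import csv
import io
from collections import defaultdict


def sum_fpkm_per_gene(table_text, gene_col="gene", locus_col="locus", fpkm_col="FPKM"):
    """Sum FPKM over rows that share both gene name and locus.

    table_text is the content of a tab-delimited file with a header row.
    The column names are assumptions; pass the ones your file uses.
    """
    totals = defaultdict(float)
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    for row in reader:
        key = (row[gene_col], row[locus_col])
        totals[key] += float(row[fpkm_col])
    return dict(totals)


# Made-up rows for illustration only.
example = (
    "gene\tlocus\tFPKM\n"
    "GENE_A\tchr1:100-900\t2.5\n"
    "GENE_A\tchr1:100-900\t1.5\n"
    "GENE_A\tchr7:50-400\t4.0\n"
    "GENE_B\tchr2:10-80\t0.0\n"
)
totals = sum_fpkm_per_gene(example)
for (gene, locus), fpkm in sorted(totals.items()):
    print(gene, locus, fpkm)
```

Note that this only adds up the point estimates. It does not combine confidence intervals or redo any statistical test, so the summed values are descriptive only.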
Old 06-01-2011, 09:57 PM   #25
edge
Senior Member
 
Location: China

Join Date: Sep 2009
Posts: 199
Default

Hi yjlui,

Have you already figured out the meaning of the "test status" values "OK", "LOWDATA", and "FAIL"?
Should I delete those transcripts before downstream analysis and treat them as poorly assembled transcripts?
Apart from that, do you have any idea what an FPKM of 0 means?
Does it mean that those transcripts are poorly assembled as well?
Thanks in advance.
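
Rather than deleting rows outright, one option is to separate them by test status and inspect each group before deciding. The Python sketch below shows that grouping. The `status` key, the row layout, and the function name `group_by_status` are assumptions for illustration, and the code says nothing about whether a non-OK transcript is poorly assembled.

```python
from collections import defaultdict


def group_by_status(rows, status_key="status"):
    """Group rows (dicts) by their test status, preserving input order."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[status_key]].append(row)
    return dict(groups)


# Made-up rows for illustration only.
rows = [
    {"id": "T1", "status": "OK", "fpkm": 12.0},
    {"id": "T2", "status": "LOWDATA", "fpkm": 0.0},
    {"id": "T3", "status": "OK", "fpkm": 3.4},
    {"id": "T4", "status": "FAIL", "fpkm": 7.1},
]
groups = group_by_status(rows)
for status, members in groups.items():
    print(status, [r["id"] for r in members])
```

Keeping the LOWDATA and FAIL rows in their own groups makes it easy to see how many transcripts are affected before any of them are dropped.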
Old 05-17-2014, 10:10 PM   #26
emanlee
Member
 
Location: Xi'an

Join Date: Apr 2013
Posts: 15
Collapse duplicate FPKMs for a gene

Quote:
Originally Posted by mgogol View Post
I ended up writing a script to sum the FPKMS for a given gene id, which I think is right...

Here's my (unpolished) code (a perl script and a shell script).

This botches the confidence intervals, by the way.

The format of the cufflinks output files (genes.fpkm_tracking) is now different from before. I updated the code written by mgogol and published it on sourceforge.net (https://sourceforge.net/projects/col...?source=navbar). I hope it will facilitate your work.
Old 07-14-2016, 12:54 AM   #27
tedwong
Member
 
Location: Sydney

Join Date: Mar 2015
Posts: 13
Default

I'm using Cufflinks 2.2.1 but am still seeing duplicate genes in the tracking file. Has the issue ever been fixed?