SEQanswers

10-31-2011, 04:41 AM   #1
pettervikman
Member
 
Location: Sweden

Join Date: Nov 2009
Posts: 23
Cufflinks -g option yields many 0 FPKM transcripts and annotates them interestingly

Hi

I've just started using the -g option in the newest version of Cufflinks. I've found that it creates a whole bunch of transcripts with an FPKM of 0, as well as 0 in the high and low confidence FPKM columns. It's easy to filter these out afterwards, but I find it interesting that they are created in the first place.

Second, I've found that in the GENES file the transcripts are tagged as OK or LOWCOVERAGE. Here transcripts with a low-conf FPKM of 0 and any of the other FPKM fields above 0 are flagged as LOWCOVERAGE, while transcripts with FPKM of 0 and all fields > 0 are OK. Once again, it's easy to change the OK to "not expressed" or similar, but it would be nice not to have to do this.
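A minimal sketch of the relabelling described above, assuming a tab-delimited tracking table read as text. The column names `FPKM` and `FPKM_status`, the `NOT_EXPRESSED` label, and the function name `relabel_unexpressed` are all assumptions for illustration; check the header of your own Cufflinks output, since column names may differ between versions.

```python
import csv
import io


def relabel_unexpressed(tracking_text, fpkm_col="FPKM",
                        status_col="FPKM_status", label="NOT_EXPRESSED"):
    """Parse a tab-delimited tracking table and return its rows as dicts.

    Every row whose FPKM is exactly 0 gets its status replaced by `label`.
    The column names are assumptions and can be overridden.
    """
    reader = csv.DictReader(io.StringIO(tracking_text), delimiter="\t")
    rows = []
    for row in reader:
        if float(row[fpkm_col]) == 0.0:
            row[status_col] = label
        rows.append(row)
    return rows


# Made-up example data, not real Cufflinks output.
example = (
    "tracking_id\tFPKM\tFPKM_conf_lo\tFPKM_conf_hi\tFPKM_status\n"
    "T1\t0\t0\t0\tOK\n"
    "T2\t12.5\t8.1\t16.9\tOK\n"
    "T3\t0\t0\t3.2\tOK\n"
)

rows = relabel_unexpressed(example)
# Keep only the transcripts that were not relabelled.
expressed = [r["tracking_id"] for r in rows if r["FPKM_status"] != "NOT_EXPRESSED"]
print(expressed)
```

For a real file, the text could be read with `open(path).read()` and passed in the same way.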

So my questions are: 1, has anybody else seen this, and 2, does anyone have information regarding how "true" the presence of these transcripts is?

Sincerely

/Petter
10-31-2011, 07:51 AM   #2
jhb1980
Junior Member
 
Location: Switzerland

Join Date: Dec 2010
Posts: 7

Hi Petter,

I'm still pretty new to this all, so if anyone spots a wrong statement below, please feel free to point it out.

As for question 1, I think this is an inherent property of the -g (RABT) option, which tiles reference transcripts with faux reads whether the transcript is expressed in your sample or not (see http://cufflinks.cbcb.umd.edu/howitworks#hrga ). If you take your output through Cuffcompare, the "-R" option should remove any reference *.gtf transcript that is not expressed.

As for question 2, I don't think this is a trivial one; it is pretty much a research topic of its own. The best ways (from a biological point of view), in my opinion, would be to see a) whether the transcript expression (and structure) is reproducible between replicates; and b) whether the transcript is still detected when using higher mapping stringency (mismatch allowance, multimapper limit), i.e. "playing around" with the input parameters of the mapper and assembler. The first point, of course, raises your invoice considerably.
11-11-2011, 02:07 AM   #3
pettervikman
Member
 
Location: Sweden

Join Date: Nov 2009
Posts: 23

Hi

Thanks for the reply. I've realised that the output annotation is a bit imprecise, or I don't understand the labelling. For example, transcripts with 0 coverage and 0 in all positions except the high FPKM are labelled OK.

Anyhow, I have 48 samples and I've decided to filter out anything that is not present in at least two samples. After that I'll see what I have left.
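A minimal sketch of such a filter, assuming the per-sample FPKM values have already been loaded into a dictionary and that transcript IDs are comparable across samples (for example, reference IDs or IDs unified in a merge step). The function name `transcripts_in_min_samples`, the sample names, and the values are hypothetical; "present" is taken here to mean FPKM above a threshold, 0 by default.

```python
from collections import Counter


def transcripts_in_min_samples(samples, min_samples=2, min_fpkm=0.0):
    """Return the sorted IDs of transcripts whose FPKM exceeds `min_fpkm`
    in at least `min_samples` samples.

    `samples` maps sample name -> {transcript_id: FPKM}.
    """
    counts = Counter()
    for fpkm_by_transcript in samples.values():
        for transcript_id, fpkm in fpkm_by_transcript.items():
            if fpkm > min_fpkm:
                counts[transcript_id] += 1
    return sorted(t for t, n in counts.items() if n >= min_samples)


# Made-up values for three samples.
samples = {
    "s1": {"T1": 0.0, "T2": 5.0, "T3": 1.2},
    "s2": {"T1": 0.0, "T2": 3.1, "T3": 0.0},
    "s3": {"T1": 0.4, "T2": 0.0, "T3": 0.0},
}

kept = transcripts_in_min_samples(samples)
print(kept)
```

Raising `min_fpkm` above 0 would also drop transcripts that are only marginally supported in each sample.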

/Petter