SEQanswers

Old 10-27-2014, 06:40 AM   #1
ErikFas
Member
 
Location: Sweden

Join Date: Jun 2014
Posts: 86
Default Understanding Cufflinks input/output file structure

I've been mostly doing differential expression with featureCounts + DESeq2 so far, but now I want to calculate the FPKM of my various samples / replicates as well. I was told I should do this with Cufflinks, so I gave it a go. While I can get some results, I'm a bit unsure exactly how I got them and exactly what they represent...

First, the input files, which come last in the command (I'm running from the Terminal in OS X Mavericks). As far as I understand, I can run a separate cufflinks command for each replicate of each sample, but I should also be able to run everything as a single command in some way. I read the documentation and searched around on SEQanswers, but I didn't find any definitive answers. For example, are these different commands valid?

Code:
cufflinks <options> <input1.bam> <input2.bam>
cufflinks <options> <input1.bam>,<input2.bam>
cufflinks <options> <input*.bam>
... and if so, what's the difference? If I have two samples (two different cell lines) with 3 replicates each, how should I run the command(s)?

Secondly, the output. I've gotten the four files specified in the documentation, but the one I'm supposed to be interested in (genes.fpkm_tracking) only has one "FPKM" column. I suppose this makes sense when you run each replicate as a separate command, but having used some trial and error (mostly error, I suppose) with the above commands, I still only get one FPKM column (discounting the "FPKM_conf_lo" etc. columns). Is this how the output should be? Should it be different for a single .bam file and multiple inputs, or am I misunderstanding how the program works in some way?

And, lastly, the --GTF or --GTF-guide options. Which do I use? I'm using Illumina paired-end, stranded data (so I'm using the --library-type fr-firststrand option), and I'm interested in knowing the FPKM for each replicate of each sample. I've thus far used the --GTF option and the same reference annotation as was used in the alignment (Tophat2). Is this the right approach?

Thanks in advance!

Last edited by ErikFas; 10-27-2014 at 06:44 AM.
Old 10-27-2014, 10:48 PM   #2
sdriscoll
I like code
 
Location: San Diego, CA, USA

Join Date: Sep 2009
Posts: 438
Default

I think you have to run cufflinks one time per BAM file. That's how I have always done it and that's how it would handle things internally anyways since each set of alignments would be interpreted independently of the others. If you can supply multiple BAMs in a single command then I'd expect the program to pool them and evaluate them as a single sample.

--GTF is a mode for quantification only. --GTF-guide is a mode for quantification plus de novo assembly of isoforms from the alignments, using a supplied GTF as a guide (like giving it a starting point for de novo assembly). --GTF-guide uses a slightly different assembly strategy than the default (without --GTF or --GTF-guide).
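
For the two cell lines with three replicates each, the one-run-per-BAM approach with -G could be sketched roughly as below. This is only a sketch under assumptions: the annotation name genes.gtf, the cufflinks_out/ output prefix, the BAM file names in the usage line, and the DRY_RUN switch are all hypothetical, and the function name is made up for illustration. Only -o, -G and --library-type are actual cufflinks options here.

Code:
```shell
# Sketch: run cufflinks once per BAM file, quantification only (-G).
# Hypothetical defaults: genes.gtf, cufflinks_out/, DRY_RUN=1 (print only).
run_cufflinks_per_bam() {
    gtf="${GTF:-genes.gtf}"      # same annotation as used for the alignment
    dry_run="${DRY_RUN:-1}"      # set DRY_RUN=0 to actually run cufflinks
    for bam in "$@"; do
        sample="$(basename "$bam" .bam)"
        # One output directory per replicate, so the four output files
        # of one run do not overwrite those of another.
        if [ "$dry_run" = "1" ]; then
            echo "cufflinks -o cufflinks_out/$sample -G $gtf --library-type fr-firststrand $bam"
        else
            cufflinks -o "cufflinks_out/$sample" -G "$gtf" \
                --library-type fr-firststrand "$bam"
        fi
    done
}

# Example (hypothetical file names):
# run_cufflinks_per_bam cellA_rep1.bam cellA_rep2.bam cellA_rep3.bam \
#                       cellB_rep1.bam cellB_rep2.bam cellB_rep3.bam
```

Each replicate then gets its own genes.fpkm_tracking with a single FPKM column, which is consistent with what was described in the first post.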
__________________
/* Shawn Driscoll, Gene Expression Laboratory, Pfaff
Salk Institute for Biological Studies, La Jolla, CA, USA */
Old 10-28-2014, 12:15 AM   #3
ErikFas
Member
 
Location: Sweden

Join Date: Jun 2014
Posts: 86
Default

Okay, then I'll continue doing a single command for every file, with the -G flag. Thanks for the clarification!
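
Since every run produces its own genes.fpkm_tracking with one FPKM column, the per-replicate values have to be pulled out of each file afterwards. A minimal sketch of that step, assuming the file is tab-separated with a header row containing "tracking_id" and "FPKM" columns (the function name and the example path are hypothetical):

Code:
```shell
# Sketch: print tracking_id and FPKM from one genes.fpkm_tracking file.
# Columns are located by header name rather than by fixed position.
extract_fpkm() {
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        { print $(col["tracking_id"]) "\t" $(col["FPKM"]) }
    ' "$1"
}

# Example (hypothetical path):
# extract_fpkm cufflinks_out/cellA_rep1/genes.fpkm_tracking > cellA_rep1.fpkm.tsv
```

Looking the columns up by name keeps the "FPKM_conf_lo" and similar columns out of the result without relying on their order.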