**yueluo** · 04-11-2014, 08:28 AM

Supposedly, Cufflinks 2.2.0 introduced a new workflow. You can now run cuffquant to estimate transcript abundance for each sample before running cuffdiff, which speeds up the process and solves some runtime issues. However, I have encountered some minor issues with the output of cuffnorm. You can check one of my posts about it in this forum (I also posted the problem on the Google Group), but so far there has been no feedback from other users.
If you only care about examining differences between your two groups then it shouldn't be much of a problem.

**adiallo** · 04-11-2014, 08:37 AM

Thanks for the quick reply. I will run cuffquant/cuffnorm and cuffquant/cuffdiff and let you know if everything went well.

Cheers,
Alpha

**Wallysb01** · 04-11-2014, 10:19 AM

In terms of speed, cuffquant made the difference between me being able to use Cufflinks or not. I tried to use cuffdiff a while back on my data and it looked like it would take around a month on 12 cores. Now with cuffquant, it's more like overnight. And once you've run cuffquant, you can rerun cuffdiff very quickly, since you only have to generate those CXB files once.
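
A minimal sketch of the quantify-once, test-many-times workflow described above. The paths (`merged.gtf`, `bams/`, `quant/`), the sample IDs, and the `TEST`/`CONTROL` labels are hypothetical placeholders, and the `abundances.cxb` file name is what I understand cuffquant to write into its output directory, so check it against your installed version. The script only prints the commands so you can review them before running anything.

```shell
#!/bin/sh
# Sketch: quantify each sample once with cuffquant, then reuse the CXB files.
# All paths, sample IDs and labels below are hypothetical placeholders.

gtf="merged.gtf"
threads=12
test_samples="s14 s16 s26"     # hypothetical IDs, first condition
control_samples="s2 s4 s6"     # hypothetical IDs, second condition

# Print one cuffquant command per sample ID given as an argument.
# Each run is expected to write <outdir>/abundances.cxb.
quant_cmds() {
  for s in "$@"; do
    echo "cuffquant -o quant/$s -p $threads $gtf bams/$s.bam"
  done
}

# Join per-sample CXB paths with commas (one comma-separated list per condition).
cxb_list() {
  out=""
  for s in "$@"; do
    out="${out:+$out,}quant/$s/abundances.cxb"
  done
  echo "$out"
}

# Intentionally unquoted so the ID strings split into separate arguments.
quant_cmds $test_samples $control_samples
echo "cuffdiff -o diff_out -p $threads -L TEST,CONTROL $gtf" \
     "$(cxb_list $test_samples)" "$(cxb_list $control_samples)"
```

Because the CXB files persist on disk, only the final cuffdiff line needs to be rerun when you change labels or groupings.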

**jeales** · 04-14-2014, 01:31 PM

Try the --no-diff argument to cuffdiff
http://cufflinks.cbcb.umd.edu/manual.html#cuffdiff
I can't see your original command line
But if you didn't specify labels with -L or comma delimit your list of case and control bams
then it will be doing pairwise DE tests for all against all samples
and this is likely to be the slowest step
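
To put a number on the all-against-all point: assuming, as described above, that cuffdiff treats each space-separated argument as its own condition and tests every pair of conditions, the count of comparisons grows as n(n-1)/2. A small sketch of that arithmetic (the sample names are made-up placeholders):

```shell
#!/bin/sh
# Sketch: count the pairwise comparisons implied by a cuffdiff sample list.
# Each space-separated argument is one condition; commas separate the
# replicates inside a condition.

pairwise_tests() {
  n=$#
  echo $(( n * (n - 1) / 2 ))
}

# 50 BAMs passed as 50 separate arguments: every BAM is its own condition.
pairwise_tests $(seq 1 50)                          # prints 1225

# The same 50 BAMs as two comma-joined groups: a single comparison.
pairwise_tests "a1,a2,...,a25" "b1,b2,...,b25"      # prints 1
```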

**adiallo** · 04-15-2014, 06:24 AM

Hello,
Here is an example of my command line:
I have a lot of bash variables.
$cuffdiff -o ${output_path_diff} -b ${genomeIndex} -p 1 -L TEST_ALL,CONTROL_ALL -u ${merged_gtf} $bam14,$bam16,$bam26,$bam28,$bam30,$bam34,$bam36,$bam40,$bam42,$bam44,$bam46,$bam48,$bam50,$bam52,$bam54,$bam56,$bam58,$bam60,$bam64,$bam66,$bam68,$bam117,$bam118,$bam119,$bam32 $bam2,$bam4,$bam6,$bam70,$bam72,$bam74,$bam76,$bam78,$bam80,$bam82,$bam84,$bam86,$bam90,$bam92,$bam94,$bam96,$bam98,$bam100,$bam102,$bam104,$bam108,$bam110,$bam112,$bam114,$bam116

Cheers,
Alpha

**jeales** · 04-15-2014, 06:48 AM

That looks ok to me
as long as the line breaks are actually spaces

I'd definitely try the new cufflinks workflow to see if it reduces the ram usage by splitting up the tasks
i.e. tophat > cufflinks > cuffmerge > cuffquant > cuffdiff
but you are supplying a huge amount of data, so it's going to need a lot of memory

As a comparator, I've got a cuffdiff running with 32 threads on 72 bams (average size 5GB) and that is using 90GB of RAM

I predict, based on progress from the verbose (-v) output, that it'll take 6 days for my job to finish. That doesn't bode well for your analysis runtime

**jeales** · 04-15-2014, 06:49 AM

Also if you just want expression values per sample then omit the cuffdiff
You can always do your own DE testing in R etc

**jeales** · 04-15-2014, 06:54 AM

New cufflinks workflow compared to old
cuffnorm outputs expression values from the CXB files generated by cuffquant
then you could do your own testing on the output

http://cufflinks.cbcb.umd.edu/
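
A rough sketch of the cuffnorm step mentioned above, for anyone who only wants per-sample expression values to test elsewhere. The file names and labels are hypothetical, and I am assuming cuffnorm takes `-o` and `-L` and the same GTF-plus-CXB-lists argument layout as cuffdiff, so confirm with the manual. The function only prints the command.

```shell
#!/bin/sh
# Sketch: build a cuffnorm command from per-condition CXB lists.
# merged.gtf, norm_out and the CXB paths are hypothetical placeholders.

gtf="merged.gtf"

cuffnorm_cmd() {
  # $1 = output dir, $2 = comma-separated labels,
  # remaining args = one comma-separated CXB list per condition
  outdir=$1
  labels=$2
  shift 2
  echo "cuffnorm -o $outdir -L $labels $gtf $*"
}

cuffnorm_cmd norm_out TEST,CONTROL \
  "quant/s14/abundances.cxb,quant/s16/abundances.cxb" \
  "quant/s2/abundances.cxb,quant/s4/abundances.cxb"
```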

**adiallo** · 04-15-2014, 10:55 AM

Thanks jeales,
I am using the new version of cufflinks and the cuffquant step is done. I am now running the cuffdiff part.
I am testing on the different servers I have access to, to speed up the process.
I will let you know the results, computation time and resources soon.

Cheers,
Alpha

**vishnuamaram** · 06-18-2014, 01:42 AM

That's great news from Alpha.

I do have a suggestion. Since the process runs out of memory and your RAM is limited (64 GB), try creating a tmp folder on your server's hard drive and pointing the analysis at that tmp folder when you run it.

**adiallo** · 06-18-2014, 05:18 AM

Thanks vishnuamaram,
I will try this solution. I am still trying to run cuffdiff with all my datasets. I can only run it with 1 CPU and it's a very long process.
Since I have 100 samples, 4 conditions (25 samples/condition) and the samples in a condition are not replicates, cuffdiff is not the best!!
Do you have any suggestions for that?
For now I am exploring another idea: writing an R script with DESeq and using the cuffnorm results to do my differential expression analysis.

Alpha

**adiallo** · 06-18-2014, 05:36 AM

Hello vishnuamaram,
I realize that the cufflinks programs don't have a parameter for a tmp folder!
How can I make it work?

Alpha

**shangzhong0619** · 07-11-2014, 10:11 AM

Cuffquant takes a long time

Hi all,
I have a problem running cuffquant. When I don't use the option '-b/--frag-bias-correct <genome.fa>', I get results fast. However, if I add that option, it always gets stuck at some processing percentage and seems to take forever.

I also tried the old pipeline, and there cuffdiff also takes forever. I searched online and found that removing the lines of the annotation file whose 3rd column (feature) is 'gene' can increase the speed. I did that, but the speed didn't increase that much. Does anyone know what the possible issue is? Thanks.
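
For reference, the annotation filtering described above can be sketched with awk. This assumes a tab-separated GTF/GFF where column 3 is the feature type. The two-line fragment is made up purely to show the effect.

```shell
#!/bin/sh
# Sketch: drop annotation records whose 3rd column (feature type) is "gene".
# Comment lines starting with '#' have no 3rd tab-separated field, so they are kept.

strip_gene_rows() {
  awk -F '\t' '$3 != "gene"' "$1"
}

# Tiny made-up GTF fragment to show the effect.
tmp=$(mktemp)
printf 'chr1\tsrc\tgene\t1\t100\t.\t+\t.\tgene_id "g1";\n' >  "$tmp"
printf 'chr1\tsrc\texon\t1\t50\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' >> "$tmp"
strip_gene_rows "$tmp"     # prints only the exon line
rm -f "$tmp"
```

On a real file you would redirect the output, e.g. `strip_gene_rows annotation.gtf > annotation.nogene.gtf` (both names hypothetical).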

**adiallo** · 07-11-2014, 10:23 AM

Hello shangzhong0619,

Here are the parameters of cuffquant:

General Options:
-o/--output-dir write all output files to this directory [ default: ./ ]
-M/--mask-file ignore all alignment within transcripts in this file [ default: NULL ]
-b/--frag-bias-correct use bias correction - reference fasta required [ default: NULL ]
-u/--multi-read-correct use 'rescue method' for multi-reads [ default: FALSE ]
-p/--num-threads number of threads used during quantification [ default: 1 ]
--library-type Library prep used for input reads [ default: below ]

Advanced Options:
-m/--frag-len-mean average fragment length (unpaired reads only) [ default: 200 ]
-s/--frag-len-std-dev fragment length std deviation (unpaired reads only) [ default: 80 ]
-c/--min-alignment-count minimum number of alignments in a locus for testing [ default: 10 ]
--max-mle-iterations maximum iterations allowed for MLE calculation [ default: 5000 ]
-v/--verbose log-friendly verbose processing (no progress bar) [ default: FALSE ]
-q/--quiet log-friendly quiet processing (no progress bar) [ default: FALSE ]
--seed value of random number generator seed [ default: 0 ]
--no-update-check do not contact server to check for update availability [ default: FALSE ]
--max-bundle-frags maximum fragments allowed in a bundle before skipping [ default: 500000 ]
--max-frag-multihits Maximum number of alignments allowed per fragment [ default: unlim ]
--no-effective-length-correction No effective length correction [ default: FALSE ]
--no-length-correction No length correction [ default: FALSE ]

I suggest changing some of the default parameters, for example lowering --max-bundle-frags to 50000.

Cheers,
Alpha
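
A small sketch of where those options go on the command line, using only flags from the listing above. The file names, the thread count of 8, and the `quant_out` directory are hypothetical. Printing the command with and without `-b` makes it easy to time both variants on one sample before committing to a full run.

```shell
#!/bin/sh
# Sketch: assemble a cuffquant command with a lower bundle cap and
# optional bias correction. File names are hypothetical placeholders.

cuffquant_cmd() {
  # $1 = GTF, $2 = BAM, $3 = genome FASTA for -b (empty string skips bias correction)
  opts="-p 8 --max-bundle-frags 50000"
  [ -n "$3" ] && opts="$opts -b $3"
  echo "cuffquant $opts -o quant_out $1 $2"
}

cuffquant_cmd merged.gtf sample.bam ""            # without bias correction
cuffquant_cmd merged.gtf sample.bam genome.fa     # with bias correction
```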

How to speedup Cuffdiff ?? It is taking forever !!!