Cufflinks 2.2.1 is taking a really long time. I start with 45 million 100 bp paired-end, rRNA-depleted, stranded reads aligned with STAR; 24 million align uniquely and 15 million are multimappers. The input for Cufflinks is a sorted .bam. The Cufflinks command is in the Code block below.
Things to note:
-p 32: all 32 CPUs are in use for pretty much the entire time. Here's the usage for the past 24 hours from the 32-CPU node I've been using; you can see it going down as threads complete at the end of the Cufflinks run.
The mask file masks out a few very highly expressed genes, which together make up almost 20% of all reads. When I didn't mask these out, Cufflinks got hung up at these loci.
The library type is reversed because I'm following these instructions.
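For readers unfamiliar with the -M option, here is a minimal sketch of how a mask file of this kind could be built, assuming the annotation GTF carries a gene_name attribute (as GENCODE files do). The function name make_mask_gtf and the gene-name list file are hypothetical and only for illustration; the post does not say which genes were masked or how maskFile.gtf was actually made.

```shell
# Hypothetical helper (not from the original post): print only the GTF
# records whose gene_name attribute matches a name listed, one per line,
# in a plain-text file. Assumes a non-empty names file.
make_mask_gtf() {
    # $1 = file with gene names to mask, $2 = annotation GTF
    awk -F'\t' '
        NR == FNR { if ($0 != "") wanted[$0] = 1; next }  # first file: names to mask
        /^#/ { next }                                     # skip GTF header comments
        {
            # column 9 holds the attributes, e.g. gene_name "Xyz";
            if (match($9, /gene_name "[^"]+"/)) {
                name = substr($9, RSTART + 11, RLENGTH - 12)
                if (name in wanted) print
            }
        }
    ' "$1" "$2"
}

# Example call (file names are placeholders):
#   make_mask_gtf genes_to_mask.txt gencode.v2.annotation.gtf > maskFile.gtf
```

The resulting GTF would then be passed to Cufflinks with -M, which ignores reads falling in the masked regions.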
After 3.5 days I think it's just about done (it's at "waiting for 18 threads to complete"). Given that the number of input reads isn't huge (especially once all the masked reads are accounted for) and I'm using 32 CPUs, I'm surprised it's taking so long. It doesn't seem to get hung up at any specific spot, but it does slow down as it goes, until each of the last few percent takes many hours.
Is this runtime normal? Anything I can do to speed it up?
Code:
cufflinks -o outputFolder -p 32 -g gencode.v2.annotation.gtf -M maskFile.gtf -b mm10.fa -u --library-type fr-secondstrand inputSorted.bam