
**GenoMax** · 01-02-2015, 11:36 AM

Originally posted by eb0906

Hi all,

Our small core lab purchased two Dell Precision T7610 Tower Workstations equipped with 1 Intel Xeon E5-2687W v2 Eight-core 3.4 GHz Turbo, 25 MB processor, 64 GB 1866MHz DDR3 RAM, 1GB NVIDIA Quadro K600 Video card, 256 GB Solid-state drive and two 1TB SATA drives, DVD-RW drive, 10Gb Network adapter, and an Nvidia Tesla K20C Computer Processor.

I am a novice user, but some initial thoughts I have are:

1) Do we have enough RAM to support multiple (2-3) RNA-seq analyses? For example, alignments, mapping, differential expression analysis, etc.

2) Do we need an additional CPU? (Assuming we will be analyzing at least 2 RNA-seq experiments at any given time and will have additional users (2-3) logged on and trying to analyze their own data.)

3) It is my understanding that the greatest limiting factor in computational requirements for NGS analysis is I/O. At this point, is there any advantage to having a GPU versus CPU when it comes to NGS analysis?

It is tricky to provide meaningful answers to these kinds of questions, since the actual workflow will vary from time to time and it is hard for outsiders to completely understand how your lab/users operate on a daily basis.

But here goes.

#1. Probably. Depending on memory usage, you may have to limit the number of jobs that can run at a given time. If you work with small genomes it may not be a big problem.

#2. If you do get an additional CPU you should look into getting more RAM (hopefully the RAM slots are not maxed out otherwise you will need to discard some memory sticks to get higher capacity ones), at least for one of the two machines. 2 x 1 TB is not much storage (hopefully you have other storage available over the network). It is not going to be enough to support multiple users.

#3. At this time there is likely no practical benefit to GPU computing in your case, so it is not worth worrying about.
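
As a rough illustration of point #1, the job limit is simple arithmetic once the peak memory of a typical job is known. The sketch below is only a back-of-the-envelope helper: the function name, the 8 GB reserve for the OS, and the per-job figures are assumptions chosen for illustration, not measurements from this thread.

```python
def max_concurrent_jobs(total_ram_gb, per_job_gb, reserve_gb=8.0):
    """Return how many jobs fit in RAM after holding back a reserve for the OS.

    All three inputs are estimates supplied by the user; reserve_gb=8.0 is an
    arbitrary illustrative default, not a recommendation.
    """
    if per_job_gb <= 0:
        raise ValueError("per_job_gb must be positive")
    usable = max(total_ram_gb - reserve_gb, 0.0)
    return int(usable // per_job_gb)


# Hypothetical example: a 64 GB machine and jobs that peak around 16 GB each.
print(max_concurrent_jobs(64, 16))
```

If a single job's peak memory approaches the total RAM, the result drops to zero or one, which is the situation where jobs have to be run one at a time or moved to a machine with more memory.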

**eb0906** · 01-02-2015, 12:09 PM

Thanks, Genomax!

You are right; it is hard to anticipate workflows.

1) It's interesting that you mention we probably have enough RAM. Currently, one of my colleagues is running cuffdiff on 16 C. elegans samples (15M reads/sample), and it looks like it's stalling at the 'Processing Loci' step with 98% of the memory in use. Is this typical? This is our first time using these workstations for RNA-seq analysis, so we are not sure what to expect with processing time.

2) I agree, and yes, we do have additional server space, 20 TB local and 110 TB on the network.

**GenoMax** · 01-02-2015, 12:36 PM

Originally posted by eb0906

1) It's interesting that you mention we probably have enough RAM. Currently, one of my colleagues is running cuffdiff on 16 C. elegans samples (15M reads/sample), and it looks like it's stalling at the 'Processing Loci' step with 98% of the memory in use. Is this typical? This is our first time using these workstations for RNA-seq analysis, so we are not sure what to expect with processing time.

Is there anything else running on the system (what OS are you running, BTW)? On a single server without a job queuing system, you (or a sys admin) are going to have to keep an eye on things, since resource-constrained jobs can slow everything to a crawl or, in the worst case, lead to a hung/non-responsive server.

With newer UNIX/Linux distros, just looking at free memory (in top or a similar tool) is not enough. The OS normally uses otherwise idle RAM for caching and releases it as needed, so a low "free" number by itself is not alarming. If the system starts using a large amount of swap space (how much swap is configured on your machines?), then there may be a problem. Have you looked at the swap usage?
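
One way to check this without eyeballing top is to read `/proc/meminfo` directly. The following is a minimal sketch, assuming a Linux system that exposes the usual `MemFree`, `Buffers`, `Cached`, `SwapTotal` and `SwapFree` fields; the function names are made up for this example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_fraction(info):
    """Fraction of configured swap currently in use (0.0 if no swap)."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - info.get("SwapFree", 0)) / total


def reclaimable_kb(info):
    """Free memory plus buffers and page cache, which the kernel can largely reclaim."""
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live values (Linux only)."""
    with open(path) as fh:
        return parse_meminfo(fh.read())
```

Calling `swap_used_fraction(read_meminfo())` on the workstation gives a single number to watch: a value that keeps climbing while a job runs suggests the job no longer fits in RAM.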

**eb0906** · 01-02-2015, 01:35 PM

The OS is Red Hat Enterprise and it's a single server with no job queuing system (as far as I know as I have not personally run anything yet).

This is what my colleague sent for the current run:
top - 15:26:10 up 3 days, 5:09, 7 users, load average: 19.31, 19.18, 19.27
Tasks: 392 total, 4 running, 388 sleeping, 0 stopped, 0 zombie
Cpu(s): 51.7%us, 26.1%sy, 0.0%ni, 22.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 65919692k total, 65459552k used, 460140k free, 5228k buffers
Swap: 25001980k total, 18036092k used, 6965888k free, 292000k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5673 usr 20 0 78.8g 60g 1960 S 1016.5 96.9 4879:22 cuffdiff

Is the above helpful? This is all new to me.

**cmbetts** · 01-02-2015, 04:41 PM

Originally posted by eb0906

The OS is Red Hat Enterprise and it's a single server with no job queuing system (as far as I know as I have not personally run anything yet).

This is what my colleague sent for the current run:
top - 15:26:10 up 3 days, 5:09, 7 users, load average: 19.31, 19.18, 19.27
Tasks: 392 total, 4 running, 388 sleeping, 0 stopped, 0 zombie
Cpu(s): 51.7%us, 26.1%sy, 0.0%ni, 22.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 65919692k total, 65459552k used, 460140k free, 5228k buffers
Swap: 25001980k total, 18036092k used, 6965888k free, 292000k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5673 usr 20 0 78.8g 60g 1960 S 1016.5 96.9 4879:22 cuffdiff

Is the above helpful? This is all new to me.

You might need to mask rRNA and other abundant RNA species. I've had similar issues with cufflinks hanging at this step when processing human RNA-Seq data on a very similarly built workstation. Building a GTF of rRNA from the UCSC RepeatMasker table to use with the -M flag fixed it right up for me. I couldn't find the original thread where I found the solution, but this one seems pretty similar:

http://seqanswers.com/forums/showthread.php?t=12458
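
For anyone wanting to build such a mask file, here is a minimal sketch of the conversion. It assumes a tab-separated dump of the UCSC `rmsk` table in its standard 17-column layout (starting with the `bin` column) and keeps only rows whose `repClass` is `rRNA`. The function names, the `rmsk` source label, and the way the IDs are built are illustrative choices, not anything prescribed by Cufflinks or UCSC.

```python
import csv

# 0-based column positions in a UCSC rmsk table dump, assuming the standard
# layout: bin, swScore, milliDiv, milliDel, milliIns, genoName, genoStart,
# genoEnd, genoLeft, strand, repName, repClass, ...
GENO_NAME, GENO_START, GENO_END, STRAND = 5, 6, 7, 9
REP_NAME, REP_CLASS = 10, 11


def rmsk_to_gtf(rows, keep_classes=("rRNA",)):
    """Yield GTF lines for rmsk rows whose repClass is in keep_classes."""
    for row in rows:
        # Skip blank lines, the "#bin ..." header, and truncated rows.
        if not row or row[0].startswith("#") or len(row) <= REP_CLASS:
            continue
        if row[REP_CLASS] not in keep_classes:
            continue
        chrom = row[GENO_NAME]
        start = int(row[GENO_START]) + 1  # UCSC starts are 0-based; GTF is 1-based
        end = int(row[GENO_END])
        name = row[REP_NAME]
        # Illustrative IDs: transcript_id is made unique per locus.
        attrs = f'gene_id "{name}"; transcript_id "{name}_{chrom}_{start}";'
        yield "\t".join(
            [chrom, "rmsk", "exon", str(start), str(end), ".", row[STRAND], ".", attrs]
        )


def convert(in_path, out_path, keep_classes=("rRNA",)):
    """Read an rmsk dump from in_path and write the mask GTF to out_path."""
    with open(in_path, newline="") as src, open(out_path, "w") as dst:
        for line in rmsk_to_gtf(csv.reader(src, delimiter="\t"), keep_classes):
            dst.write(line + "\n")
```

The resulting file would then be passed with `-M` (the mask-file option). To mask other abundant species as well, extend `keep_classes`, for example with `"tRNA"`, after checking which class names actually appear in your table.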

**cmbetts** · 01-02-2015, 04:44 PM

Of course Google finds it for me right after I posted. I'm not sure if it's a STAR-specific issue, but I was using STAR as my aligner when I ran into the problem, and I found the solution on their message board:

https://groups.google.com/forum/#!msg/rna-star/X8mjUc7nm1U/RXnlXBr5oHYJ

Hardware for NGS analysis - GPU vs CPU?