**gprakhar** · 10-03-2011, 03:24 AM

Hello,

Please refer to this thread, a similar topic has been discussed before.
http://seqanswers.com/forums/showthread.php?t=13995

Regards,
--
pg

**dalesan** · 10-03-2011, 03:29 AM

Sweet, thanks. I had a feeling a similar post was already out there -- but my crappy searching didn't reveal it. Thanks a bunch!

**maubp** · 10-03-2011, 03:31 AM

To me there is a vital question missing: Will your new sequencing service also be expected to offer some analysis services, or just provide the raw data after basic QC?

e.g. mapping RNA-seq reads onto a reference genome (relatively straightforward and can be automated), or de novo assembly (still very hands-on and demanding in terms of bioinformatician time as well as computational load).

Will you have access to any existing computational resources, e.g. an institute cluster?

Also, what kind of organisms will you be working with? Bacterial and viral genomes, being small, will require fewer computational resources.

I'm sure you'll be thinking about this too, but you will need more staff (a wet lab expert for library preparation and loading the machines, bioinformaticians, and probably a Linux systems admin). In your shoes I would try to head-hunt someone from an existing sequencing center to run it, and try to do this as soon as possible (so they can deal with many of these choices).

Also, I would suggest you sign up to the bioinfo-core mailing list at http://bioinfo-core.org/ and ask their advice too.

**dalesan** · 10-03-2011, 03:45 AM

Originally posted by maubp:

To me there is a vital question missing: Will your new sequencing service also be expected to offer some analysis services, or just provide the raw data after basic QC?

e.g. mapping RNA-seq reads onto a reference genome (relatively straightforward and can be automated), or de novo assembly (still very hands-on and demanding in terms of bioinformatician time as well as computational load).

Will you have access to any existing computational resources, e.g. an institute cluster?

Also, what kind of organisms will you be working with? Bacterial and viral genomes, being small, will require fewer computational resources.

I'm sure you'll be thinking about this too, but you will need more staff (a wet lab expert for library preparation and loading the machines, bioinformaticians, and probably a Linux systems admin). In your shoes I would try to head-hunt someone from an existing sequencing center to run it, and try to do this as soon as possible (so they can deal with many of these choices).

Also, I would suggest you sign up to the bioinfo-core mailing list at http://bioinfo-core.org/ and ask their advice too.

These are excellent points. I am still trying to find out the extent of the services required by our in-house researchers. The majority of the sequencing will be done on eukaryotic organisms; this much I know. As for offering analysis services, I think basic mapping and assembly are a given. Anything beyond that I still do not know, as it depends a lot on the particular research group or individual, as well as on our staffing.

Last I heard, people from the University wanted to use our new (but non-existent) computing resources to run their jobs. I feel like gouging my eyes out with a spoon.

I am working with so little information and it's entirely frustrating. I think I am going to end up sending an institute-wide email with some very pertinent questions to get a handle on things before committing to anything.

**mbblack** · 10-03-2011, 04:54 AM

You really should be in on the discussions of what vendors the PIs plan to look at and the whole setup of the core. The reason I say that is often you can get analytical hardware bundled into a complete system quote for less than buying separately.

When we bought our ABI SOLiD system, we ended up also purchasing a Penguin computing cluster (ABI has partnered with Penguin to provide downstream computing resources for SOLiD customers). The whole deal as quoted to us was a better deal than we could get purchasing a cluster separately. And while the Penguin cluster had BioScope preinstalled, it is just a basic Beowulf cluster at heart, so you are not limited in what else you can do with it.

It has just been my experience that the optimal way to do this, when starting from scratch, is to consider the whole core system as one integrated purchase: sequencers and associated lab equipment, along with computational and storage needs, all discussed together (and keep in mind you may need to look at things like network issues for data transfer, environmentally controlled server/cluster space, power requirements for the hardware, backup and archival storage systems, and so on).

Storage is not a trivial issue and needs to be discussed. How many jobs per week or month do you anticipate? If the core is performing primary and secondary analyses, will the PIs also insist on access to raw data? Who is responsible for final data and results storage and archiving? Will data be archived permanently? If not, for how long (and hence, how much storage do you need)? Do you have the network bandwidth to handle the data, or will you need to upgrade there as well? If the core is not storing data permanently, how is final data to be delivered to the PI?
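A back-of-the-envelope sketch of the storage and bandwidth questions above, in Python. Every number in the example call is a hypothetical placeholder, not a measurement from any sequencer, and the function names and the `derived_fraction` parameter (analysis output expressed as a fraction of the raw data size) are assumptions made for illustration only. Plug in your own vendor figures and retention policy.

```python
def storage_needed_tb(runs_per_month, raw_tb_per_run, derived_fraction,
                      retention_months, copies=2):
    """Rough steady-state storage estimate in terabytes.

    derived_fraction: size of analysis output relative to raw data
                      (1.0 means results are as large as the raw data).
    copies:           number of copies kept (e.g. primary plus backup).
    """
    per_run_tb = raw_tb_per_run * (1 + derived_fraction)
    return runs_per_month * retention_months * per_run_tb * copies


def transfer_hours(size_tb, link_gbit_per_s, efficiency=0.7):
    """Hours needed to move size_tb over a network link.

    efficiency is an assumed fraction of the nominal link speed
    that is actually achieved in practice.
    """
    bits = size_tb * 8e12  # 1 TB = 1e12 bytes = 8e12 bits
    seconds = bits / (link_gbit_per_s * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical inputs: 4 runs a month, 0.5 TB raw per run, results as
    # large as the raw data, everything kept for 24 months, two copies.
    total = storage_needed_tb(4, 0.5, 1.0, 24, copies=2)
    print(f"Storage to budget for: {total:.0f} TB")
    print(f"Hours to move one run over 1 Gbit/s: {transfer_hours(0.5, 1.0):.1f}")
```

The point is less the exact numbers than the fact that retention time and the number of copies multiply everything else, so those policy decisions need to be made before the hardware is quoted.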

There are a whole host of data and analysis issues involved in setting up a core, and they need to be considered upfront, and budgeted appropriately. Far too often, academic cores are set up by PIs who think solely of the data generation aspects. Then, when there is no money left for the bioinformatics resources, the lab resources end up being grossly underused (and thus never recoup their costs) because the downstream support was never anticipated and allowed for. I've been there, seen that (and know of several academic NGS "cores" that have sat largely idle, as their own institution's PIs have farmed their NGS work out, since their in-house cores cannot provide any support for post-sequencing data or analysis).

You need to make it clear to the PIs that you are not talking about a few off-the-shelf desktop computers and a couple of cheap disk drives here. They need to think about data analysis and storage issues right up front, and factor that fully into their initial and long-term plans for a core, including some available bioinformatics expertise to at least guide them in both analysis and interpretation. Otherwise, you don't really have a core, as you cannot offer end-user services.

**ETHANol** · 10-03-2011, 05:02 AM

As previously mentioned, there is really no way to know how much computing power will be needed without knowing how much the machines will be used and what they will be used for. That being said, as a core facility you should have enough computing power to align every run to the human genome. As an institution you will obviously need more.

On actual usage of the machines, my guess is that they will not get much use. It is extremely expensive to run a HiSeq (and even more so on a per-read or per-MB basis with the MiSeq). I don't know the cost of disposables per run, but it is high. I doubt the average lab at your institution in Portugal has the funding to have many samples run. Certainly it will be hard to attract any external users. Your operational costs for disposables will be higher than what heavy users are paying. It is much easier to get funding to 'bring cutting edge genomics' to your institution by buying expensive equipment than it is to get the funding to run it. There is also the issue of human capital. How many people at your institution have any experience with next-generation sequencing? How many are working on projects that use next-generation sequencing? From the sounds of it, not a whole lot, or you would be getting better advice.

The much more logical way to get genomics going at an institution is to come up with projects that use next-generation sequencing, prepare libraries, and send them out to be sequenced somewhere else. When the institutional demand reaches the point where the machine would be running full time, then you can start your own facility. To buy a HiSeq and have it sit idle is a huge waste of money, especially if that money could have been used to sequence samples, actually do cutting-edge genomics at your institution, and publish in high-impact journals. It will be obsolete in less than 2 years.
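To make the "wait until demand justifies it" argument concrete, here is a minimal break-even sketch in Python. It assumes a deliberately simple cost model (a fixed annual cost for owning the machine plus a per-run consumables cost, compared against a flat outsourced price per run). The function name and all figures in the example are hypothetical placeholders, not real prices.

```python
import math


def break_even_runs_per_year(fixed_annual_cost, consumables_per_run,
                             outsourced_price_per_run):
    """Smallest number of runs per year at which in-house sequencing
    costs no more than outsourcing, under a simple linear cost model.

    Returns None if in-house consumables alone cost at least as much as
    the outsourced price, in which case owning the machine never pays off.
    """
    saving_per_run = outsourced_price_per_run - consumables_per_run
    if saving_per_run <= 0:
        return None
    return math.ceil(fixed_annual_cost / saving_per_run)


if __name__ == "__main__":
    # Hypothetical placeholder figures in arbitrary currency units:
    # depreciation + service contract + staff as the fixed annual cost.
    runs = break_even_runs_per_year(
        fixed_annual_cost=100_000,
        consumables_per_run=6_000,
        outsourced_price_per_run=10_000,
    )
    print(f"Break-even at about {runs} runs per year")
```

If the number that comes out is well above what your PIs can realistically fund, that supports sending libraries out for now.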

Genomics is not the machine. Genomics is the experimental design and the analysis.

I hope that didn't sound too harsh. There was a post by a guy in Bulgaria contemplating getting a GAIIx a few days ago. Probably worth a read if you can find it.

I didn't really answer your question, but I think this goes to the reason why no one knows the answer.

**maubp** · 10-03-2011, 05:15 AM

Originally posted by ETHANol:

The much more logical way to get genomics going at an institution is to come up with projects that use next-generation sequencing, prepare libraries, and send them out to be sequenced somewhere else. When the institutional demand reaches the point where the machine would be running full time, then you can start your own facility.

That's pretty much the approach our Institute has taken thus far. Initially we outsourced the library preparation and sequencing, but are now looking to do the library preparation in house. My guess is once that is up and running, we may look at the new "desktop sequencers", but already it seems the bottleneck is bioinformatics staff rather than data generation. Again, some of the analysis can be outsourced (or done via collaborations).

Perhaps you can get your bosses to invite some existing sequencing center managers over to visit, and go through some of these issues with the benefit of their first-hand advice?

Newbie needing advice on required computing power for small-scale NGS facility