SEQanswers

Old 11-24-2014, 03:20 AM   #1
Marcela Uliano
Member
 
Location: Berlin, Germany

Join Date: Apr 2012
Posts: 18
Google Cloud Platform

Hey guys!

I was asked to try out Google Cloud Platform. I have just signed up for the free trial, but I would like to hear the opinions of people who have already used it.
Is it like the Amazon cloud, where you can buy processors and RAM depending on the algorithms you need to run?
Does it have programs installed? Pipelines? Applications?
If you have a computer cluster where you can install everything you need to assemble and annotate your eukaryotic genome, is there any advantage in the cloud at all?

Thank you so much for your time and feedback!
Old 11-24-2014, 04:11 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

Quote:
Originally Posted by Marcela Uliano
Hey guys!
If you have a computer cluster where you can install everything you need to assemble and annotate your eukaryotic genome, is there any advantage in the cloud at all?

Thank you so much for your time and feedback!
You are likely to stumble on a project that cannot be handled by the local cluster ("planet-scale" computing; think thousands of cores for a 100,000-genome project). That is when something like Google Genomics would come in handy. Google has done some interesting work with the 1000 Genomes data, but I can't find a public source that describes the results. There is a brief mention here: https://www.genomeweb.com/informatic...isanal-factory
Old 11-24-2014, 08:03 AM   #3
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

Some thoughts ..
#1 I still really like big servers and nearby Beowulf-style computing.
Typically there's a pipe and that pipe is going to limit what you can do.
If that pipe is always saturated by transferring terabytes back and forth from the cloud, things are going to slow down.
Being able to launch 500 virtual cloud machines isn't really solving the bandwidth problem. Not if 25 other people using the same pipe are doing the same thing.

#2 The internet is bursty. If you enjoy the "circling doughnut" or "please wait ... buffering", then you're ready for the cloud.

#3 This is going to be very expensive.

#4. Google's omnivision is getting a little scary. It's getting to be like ... Jeremy Bentham's Panopticon ... or like George Orwell's 1984, where the TV watches you. I just don't trust the brand anymore. Clever marketing is not going to fix this today.

#5. There are liability and privacy concerns with using patient data. The cloud is a real problem.
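
To put point #1 in numbers, here is a minimal back-of-the-envelope sketch of how long a transfer takes when the pipe is shared. The 10 TB data size, the 1 Gbit/s uplink, and the even split among users are illustrative assumptions, not measurements, and the function name is hypothetical.

```python
def transfer_hours(data_terabytes: float, link_gbps: float, sharers: int = 1) -> float:
    """Hours needed to move data over a link split evenly among `sharers` users."""
    bits = data_terabytes * 1e12 * 8          # decimal terabytes to bits
    per_user_bps = link_gbps * 1e9 / sharers  # each user's fair share of the pipe
    return bits / per_user_bps / 3600


if __name__ == "__main__":
    # Hypothetical figures: 10 TB over a 1 Gbit/s uplink.
    print(round(transfer_hours(10, 1), 1))      # one user on the pipe
    print(round(transfer_hours(10, 1, 25), 1))  # 25 users sharing it
```

Under these assumptions a single user needs roughly 22 hours, and 25 users sharing the same link need about 23 days each, regardless of how many cloud machines are waiting at the other end.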

Someday, somebody is going to get "cloud" right.

Last edited by Richard Finney; 11-24-2014 at 08:13 AM.
Old 11-24-2014, 09:13 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

I am only going to comment on technical thoughts mentioned in Richard's post.

1. Presumably anyone wanting to do a true "planet-scale" project would have a direct peering connection with Google and/or a major internet backbone provider. For less ambitious projects/normal users, the bottleneck will likely be the connection on the user-end (rather than Google's).

2. Google (in all likelihood, I don't know for sure) does not use the public internet for internal connectivity, so once the data is in the Google cloud the circling doughnut may actually indicate a process running at large scale.

3. Costs are advertised here: https://cloud.google.com/genomics/pricing. $22 per TB per month for storage.

5. Privacy and liability are issues that the cloud providers are coming to grips with. There are mechanisms to sign agreements that would cover this aspect (http://googlecloudplatform.blogspot....-entities.html). Since local legal counsel will be involved in setting up such an agreement I doubt Google would use/access user data beyond terms spelled out in the BAA.
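
As a rough illustration of point 3, the quoted storage price can be turned into a yearly figure. This is only a sketch: it uses the $22 per TB per month quoted above, ignores compute, egress, and any tiered pricing, and the 50 TB data set is a made-up example.

```python
PRICE_PER_TB_MONTH = 22.0  # USD, the storage price quoted in this post


def storage_cost(terabytes: float, months: int, price: float = PRICE_PER_TB_MONTH) -> float:
    """Flat-rate storage cost in USD: size x duration x price per TB-month."""
    return terabytes * months * price


if __name__ == "__main__":
    # Hypothetical data set: 50 TB kept for one year.
    print(storage_cost(50, 12))
```

At that flat rate, 50 TB for a year comes to $13,200 for storage alone.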
Old 12-04-2014, 05:26 AM   #5
Geneus
Member
 
Location: New Jersey

Join Date: Dec 2010
Posts: 61

Quote:
Originally Posted by Richard Finney
Some thoughts ..
.

#5. There are liability and privacy concerns with using patient data. The cloud is a real problem.
Get over it already. How's your credit card doing these days at Target?
Old 03-09-2016, 04:56 PM   #6
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

Bumping this thread.

Has anybody used Google cloud genomics?

Any thoughts?

Is Amazon or Microsoft doing any genomics cloud projects?
Old 03-10-2016, 06:25 AM   #7
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

Some folks (I know locally) have been using Google compute very successfully and they are extremely happy about the possibilities. It is not inexpensive but if one keeps things in perspective (e.g. where else can you find resources to analyze 5000 RNAseq samples in a day or so) then the possibilities are only limited by available budget.

Google Cloud Storage is also compelling. We are looking at using it for data backup (BTW: Veritas NetBackup can use Google Cloud Storage as a target).
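
To give a feel for the scale behind "5000 RNAseq samples in a day", here is a small sizing sketch. The two hours of machine time per sample is purely an assumed figure for illustration, as is the one-sample-per-VM model; real numbers depend on the pipeline and the machine type.

```python
import math


def machines_needed(samples: int, hours_per_sample: float, deadline_hours: float) -> int:
    """Number of VMs needed if each VM processes one sample at a time."""
    total_machine_hours = samples * hours_per_sample
    return math.ceil(total_machine_hours / deadline_hours)


if __name__ == "__main__":
    # Assumed: 5000 samples, 2 machine-hours each, finished within 24 hours.
    print(machines_needed(5000, 2.0, 24.0))
```

With those assumptions the job needs a little over 400 machines running in parallel for a day, which is the kind of burst capacity a local cluster rarely has idle.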
Old 03-10-2016, 09:30 AM   #8
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

Well, I am very hazy on this, so bear with me; this is a naive question ...

This is not trivial and would demonstrate a computationally big, practical test case ...

I guess the question(s) I have are ... if the API is an interface to objects, then are the methods to access these objects user-definable?
Can I, for instance, call a method to realign a bwa-aligned hg19 BAM file to an hg38 novoalign BAM file? The user must, of course, do the work of gluing programs together to do this: oldbam->fastq->novoalign->newbam. Can these sorts of programs be run out "in the cloud"?
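
The oldbam->fastq->novoalign->newbam chain could be glued together along these lines on any machine, local or cloud VM, that has the tools installed. This is a minimal sketch, not a tested pipeline: it assumes single-end reads, that `samtools` and `novoalign` are on the PATH, and that the flags shown (`samtools fastq`, `novoalign -d/-f/-o SAM`, `samtools sort -o`) behave as described in those tools' documentation. All file names and the `realign_plan` and `run_plan` helpers are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List, Optional, Tuple

# Each step is (argv, file that captures stdout, or None if the tool writes its own output).
Step = Tuple[List[str], Optional[str]]


def realign_plan(old_bam: str, novo_index: str, new_bam: str) -> List[Step]:
    """Build the three commands: BAM -> FASTQ -> novoalign SAM -> sorted BAM."""
    stem = Path(new_bam).with_suffix("")  # e.g. new.hg38.bam -> new.hg38
    fastq = f"{stem}.fq"
    sam = f"{stem}.sam"
    return [
        (["samtools", "fastq", old_bam], fastq),                            # dump reads
        (["novoalign", "-d", novo_index, "-f", fastq, "-o", "SAM"], sam),   # realign
        (["samtools", "sort", "-o", new_bam, sam], None),                   # sort to BAM
    ]


def run_plan(plan: List[Step]) -> None:
    """Run each step in order, stopping at the first failure."""
    for argv, stdout_path in plan:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


if __name__ == "__main__":
    for argv, target in realign_plan("old.hg19.bam", "hg38.nix", "new.hg38.bam"):
        print(" ".join(argv), f"> {target}" if target else "")
```

Separating the plan from its execution keeps the commands easy to inspect before anything runs, which helps when the same script is later launched on many VMs.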

... and ...

Are Docker or other virtual machine images usable in the Google cloud?

Last edited by Richard Finney; 03-10-2016 at 09:34 AM.
Old 03-10-2016, 11:01 AM   #9
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

I referred to "Google compute" and not "Google genomics" in my post, even though you were asking about Google genomics.

Since the workflows are specific to sites, we are using our own (not the default on genomics) via Google compute with VMs.

Docker containers are usable on Google compute. There is a nice container management system (Kubernetes) available.
Old 03-11-2016, 06:16 PM   #10
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

I happen to be a Google employee now. I do not work directly with the cloud or genomics groups, but I would be happy to relay questions, and will post the responses here.
Old 03-11-2016, 08:12 PM   #11
nucacidhunter
Jafar Jabbari
 
Location: Melbourne

Join Date: Jan 2013
Posts: 1,226

Quote:
Originally Posted by Brian Bushnell
I happen to be a Google employee now. I do not work directly with the cloud or genomics groups, but I would be happy to relay questions, and will post the responses here.
I wonder if you or someone else will be maintaining BBTools.
Old 03-12-2016, 12:34 AM   #12
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

I'm still maintaining it on weekends.
Tags
cloud computing, google cloud

All times are GMT -8.