ashishb
3 days ago
I love Google Cloud Run and highly recommend it as the best option[1]. The Cloud Run GPU, however, is not something I can recommend. It is not cost-effective (instance-based billing is expensive compared to request-based billing), GPU choices are limited, and loading/unloading models (gigabytes) from GPU memory makes it slow to use as serverless.
Once you compare the numbers, a VM + GPU comes out cheaper if your service is utilized for even just 30% of the day.
1 - https://ashishb.net/programming/free-deployment-of-side-proj...
gabe_monroy
3 days ago
google vp here: we appreciate the feedback! i generally agree that if you have a strong understanding of your static capacity needs, pre-provisioning VMs is likely to be more cost efficient with today's pricing. cloud run GPUs are ideal for more bursty workloads -- maybe a new AI app that doesn't yet have PMF, where you really need that scale-to-zero + fast start for more sparse traffic patterns.
jakecodes
3 days ago
Appreciate the thoughtful response! I’m actually right in the ICP you described — I’ve run my own VMs in the past and recently switched to Cloud Run to simplify ops and take advantage of scale-to-zero. In my case, I was running a few inference jobs and expected a ~$100 bill. But due to the instance-based behavior, it stayed up the whole time, and I ended up with a $1,000 charge for relatively little usage.
I’m fairly experienced with GCP, but even then, the billing model here caught me off guard. When you’re dealing with machines that can run up to $64K/month, small missteps get expensive quickly. Predictability is key, and I’d love to see more safeguards or clearer cost modeling tooling around these types of workloads.
gabe_monroy
3 days ago
Apologies for the surprise charge there. It sounds like your workload pattern might be sitting in the middle of the VM vs. Serverless spectrum. Feel free to email me at (first)(last)@google.com and I can get you some better answers.
ashishb
3 days ago
> But due to the instance-based behavior, it stayed up the whole time, and I ended up with a $1,000 charge for relatively little usage.
Indeed. IIRC, if you get a single request every 15 mins (~100 requests a day), you will pay for Cloud Run GPU for the full day.
Sn0wCoder
3 days ago
Has this changed? When I looked pre-GA, the requirement was that you had to pay for the CPU 24x7 to attach a GPU, so it wasn't really scaling to zero... unless that requirement has since been dropped.
ashishb
3 days ago
Speaking from my experience, it does scale to zero, except you pay for 15 minutes after the last request.
So if you get all your requests in a 2-hour window, that's great: it will scale to zero for the rest of the 22 hours.
However, if you get at least one request every 15 minutes, then you will pay for the full 24 hours, and that is ~3x more expensive than an equivalent VM on Google Cloud.
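To make that arithmetic concrete, here is a rough sketch (Python) of how the keep-warm window drives billed time. The 15-minute window is the behavior described above; the hourly rates are made-up placeholders, not real Cloud Run or GCE prices.

    # Back-of-the-envelope model of Cloud Run GPU billing with a keep-warm window.
    # Assumption: an instance stays billable for IDLE_WINDOW minutes after each request.
    IDLE_WINDOW = 15                 # minutes warm after a request (per the thread above)
    CLOUD_RUN_GPU_PER_HOUR = 0.90    # hypothetical $/hour for a Cloud Run GPU instance
    VM_GPU_PER_HOUR = 0.30           # hypothetical $/hour for an equivalent VM + GPU

    def billed_minutes(request_times, idle_window=IDLE_WINDOW):
        """Total billed minutes: the union of [t, t + idle_window] over all requests."""
        billed, warm_until = 0.0, float("-inf")
        for t in sorted(request_times):
            if t >= warm_until:      # instance had scaled to zero; new warm period
                billed += idle_window
            else:                    # still warm; pay only for the extension
                billed += (t + idle_window) - warm_until
            warm_until = t + idle_window
        return billed

    steady = [i * 15 for i in range(96)]     # one request every 15 min, all day
    bursty = [i * 1.25 for i in range(96)]   # same 96 requests packed into ~2 hours

    for name, pattern in (("steady", steady), ("bursty", bursty)):
        hours = billed_minutes(pattern) / 60
        print(f"{name}: ~{hours:.1f} billed hours, "
              f"Cloud Run ${hours * CLOUD_RUN_GPU_PER_HOUR:.2f} vs 24h VM ${24 * VM_GPU_PER_HOUR:.2f}")

At these made-up rates, the steady pattern bills ~24 hours and costs roughly 3x the always-on VM, while the bursty pattern bills only ~2 hours and comes out well ahead.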
Sn0wCoder
a day ago
OK, thanks, I will check out the options again. If it does scale to zero (including the CPU), that makes it more reasonably priced.
krembo
3 days ago
How does that compare to spinning up some EC2 instances with Amazon Trainium chips?
mgraczyk
3 days ago
Depending on your model, you may spend a lot of time trying to get it to work with Trainium
icedchai
3 days ago
Cloud Run is a great service. I find it much easier to work with than AWS's equivalent (ECS/Fargate).
psanford
3 days ago
AWS AppRunner is the closest equivalent to Cloud Run. It's really not close, though: AppRunner is an unloved service at AWS and is missing a lot of the features that make Cloud Run nice.
vrosas
3 days ago
AppRunner was Amazon's answer to AppEngine a full decade+ later. Cloud Run is miles ahead.
romanhn
3 days ago
I agree with the unloved part. It was a great middle ground between Lambda and Fargate (zero cold start, reasonable pricing), but has seemingly been in maintenance mode for quite a while now. Really sad to see.
gabe_monroy
3 days ago
i am biased, but i agree :)
icedchai
3 days ago
Hah, I looked at your comments and saw you were a Google VP! I've migrated some small systems from AWS to GCP for various POCs and prototypes, mostly Lambda and ECS to Cloud Run, and find GCP provides a better developer experience overall.
gabe_monroy
3 days ago
love that you're enjoying the devex. we put a lot of sweat into it, especially in services like cloud run.
ashishb
3 days ago
Yeah, anyone who uses GCP and AWS thoroughly will agree that GCP offers a superior developer experience.
The problem is continuous product churn. This was discussed at length at https://news.ycombinator.com/item?id=41614795
AChampaign
3 days ago
I think Lambda is more or less the AWS equivalent.
icedchai
3 days ago
It's not. Cloud Run can be longer-running: you can have both batch jobs and services. Lambda is closer to Cloud Functions.
ZeroCool2u
3 days ago
I think Cloud Run Functions would be the direct equivalent to Lambda.
hn_throwaway_99
3 days ago
I agree, but in the GCP world, a lot of these things are merging. My understanding is that Cloud Run, Cloud Run Functions (previously known as Cloud Functions Gen2), and even App Engine Flexible all run on the same underlying Cloud Run infrastructure, so the remaining differences are essentially interface differences that now look more like historical legacy/backwards-compatibility choices than meaningful functionality differences (e.g., Functions can now handle multiple concurrent requests).
yegle
3 days ago
FWIW, App Engine Flexible is a different product that runs on GCE VMs.
The other products (App Engine Standard, Cloud Functions gen1, Cloud Run, Cloud Run Functions) share much of the same underlying infrastructure.
hn_throwaway_99
3 days ago
Oh, thanks! I guess I had it backwards - I thought App Engine standard was the one on a different infrastructure.
AChampaign
3 days ago
Oh, you’re probably right.
shiftyck
3 days ago
Eh, I don't know, Cloud Run is much better suited to long-running instances than Lambda. You would use Cloud Functions for Lambda-style workloads in GCP.
weberer
3 days ago
For those who don't know, AWS Lambda functions have a hard limit of 15 minutes.
mountainriver
3 days ago
The problem is you can't reliably get GPU VMs on GCP.
All the major clouds suffer from this. On AWS you can't ever get an 80GB GPU without a long-term reservation, and even then it's wildly expensive. On GCP you sometimes can, but it's also insanely expensive.
These companies claim to be "startup friendly", but they are anything but. All the neo-clouds somehow manage to do this well (RunPod, Nebius, Lambda), while the big clouds are just milking enterprise customers who won't leave and in the process screwing over startups.
This is a massive mistake they are making, which will hurt their long term growth significantly.
covi
3 days ago
To massively increase the reliability of getting GPUs, you can use something like SkyPilot (https://github.com/skypilot-org/skypilot) to fall back across regions, clouds, or GPU choices. E.g.,
$ sky launch --gpus H100
will fall back across GCP regions, AWS, your own clusters, etc. There are options to say "try either H100 or H200 or A100 or <insert>".
Essentially the way you deal with it is to increase the infra search space.
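For reference, roughly the same idea via SkyPilot's Python API. This is only a sketch: it assumes sky.Task, sky.Resources, and sky.launch are exposed as shown, and that set_resources accepts multiple candidate Resources; check your SkyPilot version's docs for the exact "any of these accelerators" syntax.

    # Sketch: let SkyPilot provision whichever of several GPU types it can find,
    # searching across the regions/clouds you have enabled.
    import sky

    task = sky.Task(run="python train.py")   # placeholder workload
    task.set_resources({
        sky.Resources(accelerators="H100:1"),
        sky.Resources(accelerators="H200:1"),
        sky.Resources(accelerators="A100-80GB:1"),
    })
    sky.launch(task, cluster_name="gpu-fallback")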
rendaw
3 days ago
We've run into this a lot lately too, even on AWS. "Elastic" compute, but all the elasticity's gone. It's especially bitter since spreading the cost of spare capacity is the major benefit of scale here...
mountainriver
3 days ago
Enterprises are just gobbling up all the supply with reservations, so the clouds see no need to lower prices.
All the while saying they are "startup friendly".
dconden
3 days ago
Agreed. Pricing is insane and availability generally sucks.
If anyone is curious about these neo-clouds, a YC startup called Shadeform has their availability and pricing in a live database here: https://www.shadeform.ai/instances
They have a platform where you can deploy VMs and bare metal from 20 or so popular providers like Lambda, Nebius, Scaleway, etc.
bodantogat
3 days ago
I had the opposite experience with Cloud Run. Mysterious scale-outs/restarts - I had to buy a paid cloud support subscription to get answers and found none. Moved to self-managed VMs. Maybe things have changed now.
PaulMest
3 days ago
Sadly this is still the case. Cloud Run helped us get off the ground. But we've had two outages where Google Enhanced Support could give us no suggestion other than "increase the maximum instances" (not minimum instances). We were doing something like 13 requests/min on this service at the time, and resource utilization looked just fine, but somehow we had a blip where no containers were available at all - it even dropped below our minimum instance count. The fix was to manually redeploy the latest revision.
We're now investigating moving to Kubernetes, where we will have more control over our destiny. Thankfully a couple of people on the team have experience with this.
Something like this never happened with Fargate in the years my previous team used it.
ajayvk
3 days ago
https://github.com/claceio/clace is a project I am building that gives a Cloud Run-type deployment experience on your own VMs. For each app, it supports scaling down to zero containers (scaling up beyond one is being built).
The authorization and auditing features are designed for internal tools, but otherwise any app can be deployed.
holografix
3 days ago
Have a look at Knative
ajayvk
30 minutes ago
Clace is built to run on a single machine without needing Kubernetes. The plan is to add support for Kubernetes hosting later, but running on one or a few machines should not require Kubernetes.
Clace is built for the use case of deploying internal tools, so it comes out of the box with CI/CD, auditing, OAuth, etc. With Kubernetes, you need to glue together ArgoCD, an IDP, etc. to get the same.
Bombthecat
2 days ago
Knative is amazing!
Bombthecat
2 days ago
You don't go to cloud services because they are cheaper.
You go there because you are already there, or because of contracts, etc.
JoshTriplett
3 days ago
Does Cloud Run still use a fake Linux kernel implemented in Go, rather than a real VM?
Does Cloud Run give you root?
seabrookmx
3 days ago
You're thinking of gVisor. But no, the "gen2" runtime is a microVM a la Firecracker and performs a lot better as a result.
JoshTriplett
3 days ago
Ah, that's great.
And it looks like Cloud Run can do something Lambda can't: https://cloud.google.com/run/docs/create-jobs . "Unlike a Cloud Run service, which listens for and serves requests, a Cloud Run job only runs its tasks and exits when finished. A job does not listen for or serve requests."
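For anyone curious what triggering one looks like programmatically, here is a rough sketch using the Cloud Run Admin API's Python client (google-cloud-run). The project/region/job names are placeholders, and the exact client surface may differ between library versions:

    # Sketch: run an existing Cloud Run job and wait for its tasks to finish.
    from google.cloud import run_v2

    client = run_v2.JobsClient()
    operation = client.run_job(
        name="projects/PROJECT/locations/REGION/jobs/JOB"   # placeholder resource name
    )
    execution = operation.result()   # blocks until the job's tasks exit
    print(execution.name)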
pryz
3 days ago
https://github.com/cloud-hypervisor/cloud-hypervisor or something else?
yencabulator
a day ago
I believe that's an Intel project, not a Google project. I personally think it's more likely Cloud Run is on top of the same proprietary KVM-based code they use for their Compute Engine.
seabrookmx
2 days ago
Possibly? I haven't found any public documentation that says specifically which hypervisor is used.
Google built crosvm, which was the initial inspiration for Firecracker, but Cloud Run runs on top of Borg (that much is publicly documented). Borg is closed source, so it's possible the specific hypervisor they're using is as well.
rpei
3 days ago
We (I work on Cloud Run) are working on root access. If you'd like to know more, you can reach me at rpei@google.com.
JoshTriplett
3 days ago
Awesome! I'll reach out to you, thank you.
dig1
3 days ago
> I love Google Cloud Run and highly recommend it as the best option
I'd love to see the numbers for Cloud Run. It's nice for toy projects, but it's a money sink for anything serious, at least in my experience. On one project, we had a long-standing issue with Google regarding autoscaling - scaling to zero sounds nice on paper, but they don't mention the warmup phases, where Cloud Run can spin up multiple containers for a single request and keep them around for a while. And good luck hunting down inexplicably running containers when there is no apparent CPU or network usage (Google will happily charge you for them).
Additionally, startup time is often abysmal with Java and Python projects (it might perform better with Go/C++/Rust projects, but I don't have experience running those on Cloud Run).
tylertreat
3 days ago
> It's nice for toy projects, but it's a money sink for anything serious, at least from my experience.
This is really not my experience with Cloud Run at all. We've found it to be quite cost-effective for a lot of different types of systems. For example, we ended up helping a customer migrate a ~$5B/year e-commerce platform onto it (mostly Java/Spring and TypeScript services). We originally told them they should target GKE, but they were adamant about serverless, and it ended up being a perfect fit. They were paying something like $5k/month, which is absurdly cheap for a platform generating that kind of revenue.
I guess it depends on the nature of each workload, but for businesses that tend to "follow the sun" I've found it to be a great solution, especially when you consider how little operations overhead there is with it.
ivape
3 days ago
Maybe I just don't know, but I really don't think most people here can point to a cloud GPU deployment serving 1,000 concurrent users that didn't end up with a million-dollar bill.