hackernews client

Forget CDK and AWS's insane costs. Pulumi and DigitalOcean to the rescue

174 points, posted 8 months ago

171 Comments

jmspring

8 months ago

Pulumi is really a royal piece of shit. Why the f*ck am I writing code to do "deployment". In C# --> new Dictionary<string, object> when dealing with a values.yaml for instance. The whole need to figure out when and when not to use Apply.

Give me Terraform (as much as I hate it) any day.

stackskipton

8 months ago

As an SRE dealing with former Pulumi, "Hey, devs can use code to deploy infrastructure" is not the great idea you think it is. I've seen some real ugly conditional behavior where I'm like "Is this or is this not going to run? I honestly can't tell."

hinkley

8 months ago

We had so much conflict with the ops team over their choice of Terraform. The three colors of variable thing is just fucking bonkers. Getting tests wrapped around it that actually did what we thought they meant was a giant pain in the ass.

I won't go as far as to say we burned bridges arguing back and forth about it but they were definitely significantly singed.

Config files simply don't work until they do. And if it's your job to stare at them for hours and hours a day then maybe that's okay with you, but if you expect other people to 'just learn' it you're an idiot or an asshole. Or both. Ain't nobody got time for magic incantations.

I also think it should tell you you're on the wrong path when your app is named after a verb and the data it deals with is all declarative.

darkstar_16

8 months ago

Ever thought that "Ops" needs a different mindset than the devs are used to?

hinkley

8 months ago

And that’s why we don’t delegate that work to devs.

salmo

8 months ago

Honestly, the culture/org structure is a way bigger problem in this story than any proper noun tool.

If you’re ignoring guidance and patterns and getting mad reinventing the wheel, that’s on dev. If “ops” mandates tooling and doesn’t have any skin in the game, that’s on them. And both problems are on your leadership.

If y’all just hate each other and don’t listen or participate, then you can’t be successful. It is ironic that this is the pattern that the devops movement landed us in.

slillibri

8 months ago

Honestly curious, I've been writing terraform for a while but I have never heard of "The three colors of variable thing". Could you expand on that?

skywhopper

8 months ago

They mean var vs local vs from-a-resource. There are some places you can’t use some types of variables. It can be annoying but it’s not really a huge problem if you design your approach with that in mind.

The worst part is that the Terraform team at Hashicorp often excuse not fixing these design issues as “safety measures” which isn’t entirely untrue but when over half of your users want something, sometimes you should get over yourself.

For what it’s worth, OpenTofu is fixing many of these sorts of things that cause people pain.

But my advice is to learn to use the tool. Terraform has such great benefits (in the right use cases). If you’re struggling, either you are missing something or you chose the wrong tool for your particular job. Either way, don’t gripe that this specialized tool for infra management doesn’t work exactly like every other general purpose programming language.

slillibri

8 months ago

That makes sense I guess, I just never considered locals or data resources as variables.

878654Tom

8 months ago

Same, locals are in my head like consts. You define it and it stays that way. A shortcut for a repeated value.

Data resources are you requesting a dynamic value of your environment.

Variables are dynamic values that a user can change.

hinkley

8 months ago

That’s only the case if you spend all day rerunning deployments. If your task is more frequently to transition the cluster config from A -> B then the distinction blurs and you go from a 10:1 delta ratio of the different classes of state to maybe 3:2, at which point it feels like splitting hairs.

Especially if the locals vary between prod and pre-prod, and worse if dev sandboxes end up with per-user instances, which for us was mercifully only needed for people working on the TF scripts, so we could run our tests locally.

878654Tom

8 months ago

We have multiple separate environments per application. For environment specific inputs we use variables.

The distinction is very clear in our team. Locals are used as const (like an application name), variables are for more dynamic user/environment inputs and data is to fetch dynamic information from other resources.

Zero problems. If a local becomes more environment specific a quick refactor fixes that. You can also have locals that use variable or data values if necessary.

One big win we also have is that we stopped using modules except for one big main module. We noticed from previous projects that as soon as we implemented modules everything became a big problem. Modules that are version pinned still required a lot of maintenance to upgrade. Modules that weren't version pinned caused more destruction than we planned. Module outputs and inputs caused a lot of cycle problems, ... Modules always seem too deep or too shallow.

hinkley

8 months ago

So would you go opentofu or pulumi or Sir Not Appearing in This Film?

pjmlp

8 months ago

Seconded, as someone that really does developer / operations, depending on the project assignment, I have learned the hard way that infrastructure configuration code should be as declarative as possible.

Sure "use code to deploy infrastructure" sounds great, and that is why we get stuff like Ant, Gradle, Pulumi, Jenkins Groovy scripts, .NET Aspire,.... until someone has to debug spaghetti code on a broken deployment.

AtlasBarfed

8 months ago

On the flip side, DSL declarative stuff is obfuscated magic that you can't step through or dive into.

a dsl like SQL involves one basic substrate (data organized in tables) that you can compile in your head. But declarative infra as code involves a thousand different things across a dozen different clouds.

Declarative will hold off spaghetti for... A bit. But it devolves to spaghetti as well (think fine grained acls, or places where order of operations, which the dsl does not specify and is magically resolved, becomes ambiguous).

And if you need to go off the reservation (dsl support doesn't exist or is immature for rapidly evolving platforms, need some custom postprocess steps) then you are... What?

Probably writing code and scripts to autoinvoke on the new node, phone home to a central.... Yup that's code.

Finally, declarative code has an implicit execution loop. But for something as complicated as IaC, that execution loop isn't well documented. And some committed changes to declarative code may trigger a destructive pass followed by a possibly broken constructive phase.

It's a tough problem.

Longwelwind

8 months ago

I would agree with you, if HCL wasn't a bad language in itself:

* You can't have variables in an import block (for example, to specify a different "id" value for each workspace)

* There is no explicit way to make a resource conditional based on variables. Only a hacky way to do that using "count = foo ? 1 : 0"

* You can't have variables in the backend configuration, making it impossible to store states in different places depending on the environment.

* You can't have variables in the "ignore_changes" field of a resource, making it impossible to dynamically ignore changes for a field (for example, based on module variables).

* The VSCode extension for HCL is slow and buggy. Using TS with pulumi or TFCDK makes it possible to use all the existing tooling of the language.

breendreams

8 months ago

For Terraform, most of the issues with conditionals can be resolved by creating dictionaries dynamically and looping through it to generate resources.

You get the bonus of controlling the resource id and being able to selectively delete resources without worrying about ordering.
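
A minimal sketch of the difference, in Python rather than HCL. It assumes a toy model where a resource address is either a list position (count-style) or a stable key (for_each-style); `addresses_by_index`, `addresses_by_key`, `plan`, and the rule names are hypothetical and do not reflect how Terraform plans internally.

```python
def addresses_by_index(items):
    """Model count-style addressing: each item is identified by its position."""
    return {f"rule[{i}]": item for i, item in enumerate(items)}


def addresses_by_key(items):
    """Model for_each-style addressing: each item is identified by a stable key."""
    return {f'rule["{item}"]': item for item in items}


def plan(before, after):
    """Return the addresses that would be destroyed, created, or changed in place."""
    destroyed = sorted(set(before) - set(after))
    created = sorted(set(after) - set(before))
    changed = sorted(a for a in set(before) & set(after) if before[a] != after[a])
    return {"destroy": destroyed, "create": created, "change": changed}


old_rules = ["ssh", "http", "https"]
new_rules = ["ssh", "https"]  # "http" removed from the middle

print(plan(addresses_by_index(old_rules), addresses_by_index(new_rules)))
print(plan(addresses_by_key(old_rules), addresses_by_key(new_rules)))
```

In this toy model, removing the middle item by position shifts everything after it, so the plan shows an in-place change to `rule[1]` plus a destroy of `rule[2]`. With keys, only `rule["http"]` is touched.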

cyberpunk

8 months ago

This massively depends on your provider code. Using loops to manage tf stuff can get you into really “fun” scenarios when you want to, e.g., delete an openstack firewall rule from the middle of the array.

I’ve been burned so many times here that I hate all of this stuff with an extreme passion.

Crossplane seems to be a genuinely better way out but there are big gotchas there also like resources that can simply never be deleted

Hawxy

8 months ago

As much as I like it, I find C# to be too inflexible of a language for infrastructure code. I tried with Pulumi for a while but moved to TypeScript as it works so much better. Structural typing makes your life a lot easier.

MrLeap

8 months ago

I bounce back and forth between javascript and C# depending on the nature of the job at hand. I'm curious what things you'd like to do with C# that you can't?

I find that with some handwringing, C# can be forced to do almost anything. between extension methods, dispatch proxies and reflection you can pummel it into basically any shape.

Having to write a little boilerplate to make it happen can be a drag though. I do sometimes wish C# had something from a blank project that let me operate with as much reckless abandon as Object.assign does in js land.

Hawxy

8 months ago

It's not the fault of the language, it's just the nature of infrastructure code that's been ported from terraform. With Pulumi C# you end up with multiple nested objects/dictionaries with a load of `new` object calls that just add noise to your codebase. There's also some pain points with some types being Input<T> which IDEs try to autocomplete when in reality you need to call `new T()`. Typescript permits structural typing that _feels_ a lot better to write and read within this context.

I use C# extensively for most other things I do, but this is the one area where I prefer not to use it.

cruffle_duffle

8 months ago

> Give me Terraform (as much as I hate it) any day

Terraform sure is a quirky little DSL ain’t it? It’s so weirdly verbose.

But at the same time I can create some azure function app, setup my GitHub build pipeline, get auth0 happy and in theory hook up parts of stripe all in one system. All those random diverse API’s plumbed together and somehow it manages to work.

But boy howdy is that language weird.

rwiggins

8 months ago

I haven't used Terraform in years (because I changed jobs, not because of the tech itself), but back in the day v0.12 solved most of my gripes. I have always wished they'd implement a better "if" syntax for blocks, because the language itself pseudo-supports it: https://github.com/hashicorp/terraform/issues/21512

But yeah, at $previous_job, Terraform enabled some really fantastic cross-SaaS integrations. Stuff like standing up a whole stack on AWS and creating a statuspage.io page and configuring Pingdom all at once. Perfect for customers who wanted their own instance of an application in an isolated fashion.

We also built an auto-approver for Terraform plans based on fingerprinting "known-good" (safe to execute) plans, but that's a story for a different day.

raffraffraff

8 months ago

I get around most of the if stuff using "for each" to iterate over a map. That map might be config (usually from the hiera data provider) or the output of another deployment. It's not generally a very flexible "if" that you need most of the time, it's more like "if this thing exists then create an X for it", or "while crafting X turn this doohickey on if that data set has this flag", which can be accomplished by munging together data with a locals var for loop (which supports if statements).

Honestly, I only use terraform with hiera now, so I pretty much only write generic and reusable "wrapper" modules that accept a single block of data from Hiera via var.config. I can use this to wrap any 3rd party module, and even wrote a simple script to wrap any module by pointing at its git project.

That probably scares the shit out of folks who do the right thing, and use a bunch of vars with types and defaults. But it's so extremely flexible and it neutered all of the usual complexity and hassle I had writing terraform. I have single handedly deployed an entire infrastructure via terraform like this, from DNS domains up through networking, k8s clusters, helm charts and monitoring stack (and a heap of other AWS services like API Gateway, SQS, SES etc). The beauty of removing all of the data out to Hiera is that I can deploy new infra to a new region in about 2 hours, or deploy a new environment to an existing region in about 10 minutes. All of that time is just waiting for AWS to spin things up. All I have to do in code is literally "cp -a eu-west-1/production eu-west-2/production" and then let all of the "stacks" under that directory tree deploy. Zero code changes, zero name clashes, one man band.

The hardest part is sticking rigidly to naming conventions and choosing good ones. That might seem hard because cloud resources can have different naming rules or uniqueness requirements. But when you build all of your names from a small collection of hiera vars like "%{product}-%{env}-%{region}-uploads", you end up with something truly reusable across any region, environment and product.

I'm pretty sure there's no chance I'd be able to do this with Pulumi.

stackskipton

8 months ago

Tip for naming, create a naming module where you pass in stuff like product, environment, region, service, have a bunch of locals for each thing like S3 bucket, RDS, EC2, EKS whatever you use then make them all outputs.

So at top of your IaC, you have module naming {variables as inputs} then all other resources are aws_s3 { name = module.naming.s3bucket }
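
The same idea as a rough Python sketch, for anyone not living in HCL. `resource_names`, the suffixes other than `-uploads`, and the example inputs are all hypothetical; the only thing taken from this thread is the `product-env-region-uploads` pattern.

```python
def resource_names(product: str, env: str, region: str) -> dict[str, str]:
    """Derive every resource name from the same three inputs."""
    prefix = f"{product}-{env}-{region}"
    return {
        "uploads_bucket": f"{prefix}-uploads",
        "database": f"{prefix}-db",   # hypothetical suffix
        "cluster": f"{prefix}-k8s",   # hypothetical suffix
    }


names = resource_names("shop", "production", "eu-west-1")
print(names["uploads_bucket"])  # shop-production-eu-west-1-uploads
```

Because nothing else in the code spells out a name by hand, changing the region or environment changes every name consistently.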

grncdr

8 months ago

In pulumi

     regions = [
       "eu-west-1",
    +  "eu-west-2",
     ]

     for region in regions:
         …

raffraffraff

8 months ago

Of course Pulumi can do for loops, you're using a proper programming language.

I meant that I doubt that I could 'cp -a' on a whole deployment tree, and deploy the copy successfully without having to make any code changes.

Although thinking about it, I take it back. It may be possible with Pulumi with the right code structure and naming conventions, and if configuration were separated entirely from the codebase, and if variables were inferred from the directory structure. That is really the thing that allows me to do it.
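
A minimal sketch of the "infer variables from the directory structure" part, in Python. It assumes a `region/env/stack` layout like the `eu-west-1/production` tree mentioned above; `context_from_path` and the `networking` stack name are hypothetical.

```python
from pathlib import PurePosixPath


def context_from_path(stack_dir: str) -> dict[str, str]:
    """Read region, environment, and stack name from the last three path parts."""
    parts = PurePosixPath(stack_dir).parts
    if len(parts) < 3:
        raise ValueError(f"expected region/env/stack, got {stack_dir!r}")
    region, env, stack = parts[-3:]
    return {"region": region, "env": env, "stack": stack}


print(context_from_path("eu-west-1/production/networking"))
print(context_from_path("eu-west-2/production/networking"))
```

Copying the tree to a new region directory changes the inferred `region` without touching any code, which is the property that makes the `cp -a` trick work.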

grncdr

8 months ago

Yes, sorry for the rather pithy response, but separating out the "what changes" vs. "what doesn't" (config vs. code in your terms) is what makes these things possible.

As you also noted, doing this in plain terraform is kind of a pain, so using a tool like Hiera allows you to skip a lot of the work involved in doing it the "right" way. IMO if you're starting greenfield Pulumi (or CDK, anything that lets you use a "real" programming language) allows you to write (or consume!) that config in basically any form, instead of needing to funnel everything through a Terraform data provider.

paulgb

8 months ago

Yeah. I guess maybe terraform makes sense if the people writing it spend enough of their time writing HCL to master it, but I ported our terraform config to Pulumi a few years ago and never looked back. It meant I could spend way less time googling for the HCL way to do something (say, templated resource) and just use the JS primitives I already know.

hinkley

8 months ago

>spend enough of their time writing HCL to master it

Making Terraform changes every six weeks was enough time that we forgot everything and had to refresh our memories. Every time it felt like going into the water in a northern beach and forgetting how goddamned cold the water was, then reproaching yourself for forgetting.

postalrat

8 months ago

Why are people templating yaml for terraform like they templated html in php in 1996?

benatkin

8 months ago

Because it works fine, and is also used for other things like Helm Charts?

https://helm.sh/docs/chart_template_guide/control_structures...

jnsaff2

8 months ago

Helm charts are a horrible example of text based templating.

You have YAML/JSON that k8s API wants, that is fed through helm which is fed through helmsman or whatever newer thing. There might be a layer or two of other templating around. Sometimes companies have built systems so developers/devops don't even have the ability to see what the final compiled version of the template is which is like the mother of all: "works on my laptop" problems.

It's super easy to break text based templating because of some space, tab, string escaping or whatever.

YAML makes it worse as there are lots of gotchas and different ways of doing. JSON, being quite verbose and inflexible at least has strong structure right in your face so it's a bit easier to figure out what went wrong.

With a proper programming language data structure you can be much better with verifying that the things you add or remove or iterate over will produce a valid result, much better refactoring and working as a team independently.

arkh

8 months ago

> Helm charts are a horrible example of text based templating.

Every time I see " | nindent whatever" I'm asking why the fuck the tool cannot manage indentation.

kevincox

8 months ago

And it breaks every time a variable gets a `:` inside of it and now you are producing invalid yaml everywhere you forgot to call `| toYaml`.
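
The failure mode is easy to reproduce outside Helm. A small Python sketch, assuming a hypothetical one-line template; `render_by_text` and `render_by_structure` are illustrative names, and the second relies on JSON strings being valid YAML 1.2 scalars.

```python
import json


def render_by_text(key: str, value: str) -> str:
    """Naive text templating: the value is pasted in without any quoting."""
    return f"{key}: {value}"


def render_by_structure(key: str, value: str) -> str:
    """Serialize the scalar first, so special characters end up inside quotes."""
    return f"{key}: {json.dumps(value)}"


value = "note: restart required"
print(render_by_text("message", value))       # message: note: restart required
print(render_by_structure("message", value))  # message: "note: restart required"
```

The first line is not valid YAML because of the second `: `, while the second line parses as a single string value.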

adhamsalama

8 months ago

I once got a nil pointer exception when I updated a helm chart. I wondered why the hell am I getting a nil pointer exception for updating a YAML file. After some investigation I found an issue on GitHub where the maintainers said the Go team says this is an intended behavior for some case in Go templates.

Wasn't fun.

benatkin

8 months ago

That isn't a typical nil/null exception, like in JavaScript, ruby, and python. That's in a language where a lot of values are non-nullable, and some of the ones that are have zero-values that can be used without getting a nil pointer exception. https://go.dev/tour/moretypes/12

So, there's a good chance it was an error that was really unexpected and it's better to show the error than to risk producing bad output.

smw

8 months ago

Never read anything more true in my life!

jq-r

8 months ago

I’m not sure why nobody invented a way to dynamically update values.yaml based on what you are writing in the template file. And maybe vice-versa. It would be such a time saver. Maybe someone did, but I didn’t find it yet.

dainiusse

8 months ago

This

arkh

8 months ago

Tried Pulumi thinking "it's gonna abstract all the k8s specifics". Welp no, still need to know and understand K8s so I still don't see the value from those kind of tools. In which case why not use something like Pkl to generate my yaml from some sensible code-like structures?

katdork

8 months ago

kubernetes is very complex and therefore any abstraction which completely glosses over the way the underlying systems work would make it very hard to avoid leaking or a bad abstraction to begin with.

the complexity in one way or another must be preserved within the abstraction (in all likelihood) or you will have cases you cannot create in that layer or breakages which now have the total complexity of both the abstraction itself AND kubernetes itself required to fix.

i would not say IaC is going to provide you a magic solution to learning k8s, although the value in using IaC (e.g. Argo CD / Flux CD + Kustomize + ...) in K8s land is that you are no longer imperatively managing your cluster resources and therefore can keep them within a repository, managed like code. the point of the solution is not to make it easier for newcomers, but to make it easier to have teams manage and work together on an established cluster for deployments, ...

in the case of Pulumi, you leverage the single language with typechecking instead of relying upon K8s flavoured YAML, which is itself beneficial in many ways (since you can use your regular developer tooling)

wrt pkl, pretending K8s manifest structure underneath does not help because you will need to know how the keys within a manifest interact with the underlying system regardless, especially to understand functionality, e.g. node selectors, taints and tolerations, node affinity, ...

i previously managed a terraform-based deployment of several k8s clusters and it still required knowledge of those keys and values, alongside knowledge of the underlying resource types.

without those you can't implement things like GPU-based node selection for jobs which require a GPU, ...

rusty-jules

8 months ago

What about pulumi's declarative yaml interface which can be exported from type-safe languages like cue? https://www.pulumi.com/blog/extending-pulumi-languages-with-...

nuker

8 months ago

> Give me Terraform (as much as I hate it) any day.

Just use CloudFormation. Easy to write, declarative, vars (Parameters and Output exports). Trick is not to pile everything in one Stack. Use several.

notyourwork

8 months ago

CDK is much better to express this. Why cfn?

nuker

8 months ago

Less lines, easier to read, declarative (cdk is interactive, less predictable).

And it generates shitty CFN, we can do better ourselves :)

notyourwork

8 months ago

How is cdk interactive? I use cdk and have it auto build and deploy.

nuker

8 months ago

It is "imperative", not interactive, sorry. From Wiki:

"There are generally two approaches to IaC: declarative (functional) vs. imperative (procedural). The difference between the declarative and the imperative approach is essentially 'what' versus 'how'."

https://en.wikipedia.org/wiki/Infrastructure_as_code#Types_o...
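
One way to picture the "what" versus "how" split, as a rough Python sketch. `reconcile` and the resource names are hypothetical; real tools also handle updates, ordering, and dependencies, which this ignores.

```python
def reconcile(current: set[str], desired: set[str]) -> list[str]:
    """Declarative style: the caller states the desired set, the tool derives the steps."""
    actions = [f"create {name}" for name in sorted(desired - current)]
    actions += [f"delete {name}" for name in sorted(current - desired)]
    return actions


current = {"vpc", "old-bucket"}
desired = {"vpc", "db"}
print(reconcile(current, desired))  # ['create db', 'delete old-bucket']
```

In the imperative style you would write the `create` and `delete` steps yourself, in order. In the declarative style you only edit `desired` and the steps fall out of the diff.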

klysm

8 months ago

Apply is really straightforward. The dictionary stuff is very annoying overhead but it’s nice keeping everything in one language.

nothrabannosir

8 months ago

For anyone deliberating between Pulumi and CDK let me recommend what I consider the best of both worlds: CDKTF, Hashicorp’s answer to Pulumi (my quote not theirs).

It’s got everything you want:

- strong type system (TS),

- full expressive power of a real programming language (TS),

- can use every existing terraform provider directly,

- compiles to actual Terraform so you can always use that as an escape hatch to debug any problems or interface with any other tools,

- official backing of Hashicorp so it’s a safe bet

It’s a super power for infra. If you have strong software dev skills and you want to leverage the entire TF ecosystem without the pain of Terraform the language, CDKTF is for you.

(No affiliation)

https://developer.hashicorp.com/terraform/cdktf

ranguna

8 months ago

Cdktf is good, but it's not amazing. You are still constrained by terraform syntax like `count = condition ? 1 : 0`, instead of doing a normal `if` statement. And there's a fairly good amount of times where you need to use terraform iterators instead of doing a normal for/forEach/map/reduce.

But all in all, it works. It's just a bit limited on what you can do with the actual language.

eadmund

8 months ago

> - full expressive power of a real programming language (TS)

I suppose TypeScript does count as a real programming language, in that it’s Turing complete. But I can use Pulumi from (they claim) any programming language. Specifically, I can use it from Go. Why would I add TypeScript to my project when I can live in one language?

> - official backing of Hashicorp so it’s a safe bet

Given the number of folks leaving the Hashicorp platform, I think it’s arguably no longer a ‘safe bet.’

gregwebs

8 months ago

The Go SDK is a lot more verbose for configuration (pulumi.String, etc) and then you have error handling boilerplate as well. Exceptions are a better match for creating resources in Pulumi.

ivantop

8 months ago

How is compiling to terraform a positive? I'd rather debug python than python-compiled-to-terraform.

nothrabannosir

8 months ago

Because you can use that to interface with existing tooling. Terraform has a huge and established ecosystem and it’s an uphill battle to compete with it. It’s risky to bet your infra on a tech that tries to drink the ocean and supplant the entire thing. Meanwhile if you compile down to TF you get to use a different language without having to pay the cost of moving out of the tf ecosystem. And given that the language itself is by far the worst thing about terraform that’s a big win.

It turns out terraform is actually quite acceptable when you slap a decent language on top of it. Passable, even :)

ivantop

8 months ago

Makes sense! Except for one little thing..

We've been migrating off of Terraform at BigCo recently and it has been a tremendous success. The migration has saved countless hours. Before, I was jaded and routinely in the office until 8 or 9 or so manually running terraform deploys for our engineering teams in India. Now, thanks to Pulumi, I'm able to leave the office at 7:30-8 -- and I can tell you single handed that this has saved my relationship with my daughter and maybe even my marriage. I'm running the fastest for loops thanks to Pulumi. We actually compile our Python down to c and use the Pulumi C SDK for insane speed benefits when we loop over our datacenter arrays. Turns out, not having bounds checks shaves off valuable time that I would otherwise be spending with my daughter. Routinely I'd be waking up screaming at 4 in the morning due to Terraform (or, what we would refer to as Tearaform because all of the infra engineers were constantly in tears). Now, I can sleep soundly until 5:30.

djtango

8 months ago

Thanks for sharing your story it sounds like you had a really rough time of Terraform.

I don't have much experience running Terraform at scale. What has Pulumi made easier? Why is looping a bottleneck in infrastructure code?

Based on the info I can glean from this story you may be working at a scale / use case that may be too big or a poor fit for Terraform but I'm not sure...

jpeeler

8 months ago

I think he's kidding... there's no C SDK:

https://www.pulumi.com/docs/iac/languages-sdks/

meekins

8 months ago

In an AWS scenario I can think of:

Pro vs pulumi: you get a declarative template to debug and review

Pro vs CDK: The declarative template is applied via APIs instead of CloudFormation. The CDK CloudFormation abstraction leaks like hell

terminalbraid

8 months ago

Does Typescript offer a strong type system?

spicyusername

8 months ago

Yes

terminalbraid

8 months ago

What's your argument here? For example, Typescript allows lots of operations on objects that cannot be known at compile time because it relies on the user to inform it of types accurately, anything can be coerced into anything without complaint with "as", and it allows for arbitrary operations on an "any" type without complaint.

I've heard it referred to as an "optionally typed" or "gradually typed" system, which, having worked for years in Typescript and other languages like Rust and Kotlin, etc, I agree with.

terandle

8 months ago

Pretty easy to add runtime validation at the edges with Zod https://github.com/colinhacks/zod

Great thing is that the zod schema also doubles as your typescript type so you don't have to write a duplicate/shadow TS type definition.

terminalbraid

8 months ago

That doesn't make Typescript as a language "strongly typed".

turtlebits

8 months ago

I wish CDK was fully baked enough to actually use. It's still missing coverage for some AWS services (sometimes you have to do things in cloudformation, which sucks) and integrating existing infra doesn't work consistently. Oh and it creates cloudformation stacks behind the scenes and makes for troubleshooting hell.

Aeolun

8 months ago

> sometimes you have to do things in cloudformation, which sucks

All of CDK does things in cloudformation, which made the whole thing stillborn as far as I’m concerned.

The CDK team goes to some lengths to make it better, but it’s all lambda based kludges.

liveoneggs

8 months ago

so like every other aws "solution"

roncesvalles

8 months ago

CDK is an abomination and I'm not sure why AWS is pushing it now. A few years ago all their Quick Starts were written in CloudFormation, now it's CDK that compiles to CloudFormation. Truly a bad idea.

Just write CloudFormation directly. Once you get the hang of the declarative style and become aware of the small gotchas, it's pretty comfy.

nuker

8 months ago

> Just write CloudFormation directly. Once you get the hang of the declarative style and become aware of the small gotchas, it's pretty comfy.

Exactly this. And don't make huge templates, split stuff logically to several stacks and pass vars via export/importvalue.

LunaSea

8 months ago

The biggest hurdle I've encountered is cross-stack resource sharing, especially in case of bidirectional dependencies like KMS keys and IAM roles.

8note

8 months ago

The biggest hurdle is when you want to refactor your stacks, and you pretty well just can't, without risk of deleting everything

irjustin

8 months ago

> you pretty well just can't, without risk of deleting everything

This is one hyper annoying area.

It is possible to get around it, but it's ugly, drop to L1 and override logical id:

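   // Inside a Stack constructor; assumes: import * as ec2 from 'aws-cdk-lib/aws-ec2'
   const vpc = new ec2.Vpc(this, 'vpc', { natGateways: 1 })
   // Drop to the L1 (CloudFormation) construct and pin its logical ID
   const cfnVpc = vpc.node.defaultChild as ec2.CfnVPC
   cfnVpc.overrideLogicalId('MainVpc')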

You have to do this literally for every resource that's refactored.

For us, we run 2 stacks. One that basically cannot/should-not be deleted/refactored. VPC, RDS, critical S3 buckets - i.e. critical data.

The 2nd stack runs the software and all those resources can be destroyed, moved whatever w/o any data loss.

wcchoi

8 months ago

+1 CDK refactoring is annoying and ugly

in my experience you'd need to read the CDK source code to find the offending node and call `overrideLogicalId`

there is a library to do it in nicer way: https://github.com/mbonig/cdk-logical-id-mapper

however it does not work in every case

nuker

8 months ago

> we run 2 stacks. One that basically cannot/should-not be deleted/refactored. VPC, RDS, critical S3 buckets

Why, dear god, you put VPC and RDS in one stack? They are much better off as separate CFN stacks.

LunaSea

8 months ago

There are deletion protection flags that can be enabled.

But circular dependencies can also lead to issues here where CDK will prevent you from deleting a resource used or referenced by a different stack.

x0x0

8 months ago

I also had a really rough go with cdk. I personally found the lack of upsert functionality -- you can't use a resource if it exists or create if it doesn't -- to make it way more effort than I felt was useful. Plus a lack of useful error messages... maybe I'm dumb, but I can't recommend it to small companies.

otterley

8 months ago

Upserting resources is an antipattern in cloud resource management. The idiom that works best is to declare all the resources you use and own their lifecycle from cradle to grave.

The problem with upserting is that if the resource already exists, its existing attributes and behavior might be incompatible with the state you're declaring. And it's impossible to devise a general solution that safely transitions an arbitrary resource from state A to state A' in a way that is sure to honor your intent.

x0x0

8 months ago

Hmm.

If you don't mind sharing, suppose (because it's what I was doing) I was trying to create personal dev, staging, and prod environments. I want the usual suspects: templated entries in route53, a load balancer, a database, some Fargate, etc.

What are you meant to do here? Thank you.

otterley

8 months ago

If they're all meant to look alike, you'd deploy the stack (or app, in CDK parlance) into your dev, staging, and prod accounts. You'd get the same results in each.

yieldcrv

8 months ago

Can't use bun to deploy CDK. CDK fails because it looks exclusively for package-lock, yarn-lock, or pnpm's lockfile.

So dumb. Trying to move to SST for only that reason

But if you add cdk to the path, you can still deploy. It's just that your CI/CD and deployment scripts are not all using bun anymore.

Tehnix

8 months ago

Hmm, beyond a bug they had in bun between version 1.0.8 and 1.1.20[0] bun has otherwise worked perfectly fine for me

You have to do a few adjustments which you can see here https://github.com/codetalkio/bun-issue-cdk-repro?tab=readme...

- Change app/cdk.json to use bun instead of ts-node

- Remove package-lock.json + existing node_modules and run bun install

- You can now use bun run cdk as normal

[0]: https://github.com/codetalkio/bun-issue-cdk-repro

hinkley

8 months ago

mmm, I wonder how hard that would be to fix in a PR.

yieldcrv

8 months ago

Actually a good idea, didn't think about it.

petcat

8 months ago

Kubernetes no thanks. Terraform + Kamal [1] on Digital Ocean is the way I deploy/run apps now.

[1] https://kamal-deploy.org/

mati365

8 months ago

Plain Podman systemd integration is way more powerful and secure, as it does not mess with the firewall and allows running rootless containers using services. It's even possible to run healthchecks and enforce building images just before starting a service, which makes on-demand containers using systemd-socket-proxyd possible. Check the example: https://github.com/Mati365/hetzner-podman-bunjs-deploy

petcat

8 months ago

> way more powerful and secure

I don't care about powerful. That's the opposite of what I want. I could just use k8s if I cared about that.

mati365

8 months ago

It looks like you don't even care about opening documentation before pressing reply. Podman is a simple hammer without any moving parts, that used properly can be used to build fancy stuff without much knowledge.

petcat

8 months ago

I'm aware of what Podman and Systemd are. Apparently you are not aware of what Kamal is. Open documentation, then press reply.

woleium

8 months ago

Be nice folks, we are all here to learn :)

ngrilly

8 months ago

Does it support zero downtime deploys?

mati365

8 months ago

Why not? Install Traefik or any other load balancer, set up two services, and restart them one after another.
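The "restart one after another" approach can be sketched in plain Python as below. This is an illustration under stated assumptions, not how Podman, systemd, or Kamal implement it: `rolling_restart` is a hypothetical helper, and the `restart` and `is_healthy` callables stand in for whatever actually restarts a unit and probes its health endpoint.

```python
from typing import Callable, List


def rolling_restart(
    services: List[str],
    restart: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Restart services one at a time, stopping as soon as something looks wrong.

    Returns the services that were restarted and came back healthy.
    """
    restarted = []
    for service in services:
        others = [s for s in services if s != service]
        # Never take a service down unless a peer can keep serving traffic.
        if not any(is_healthy(s) for s in others):
            raise RuntimeError(f"no healthy peer while restarting {service}")
        restart(service)
        # Do not move on to the next service until this one is back.
        if not is_healthy(service):
            raise RuntimeError(f"{service} unhealthy after restart")
        restarted.append(service)
    return restarted


# Toy in-memory stand-ins for a service manager and a health check.
state = {"app-blue": "v1", "app-green": "v1"}


def fake_restart(service: str) -> None:
    state[service] = "v2"


def fake_is_healthy(service: str) -> bool:
    return service in state


result = rolling_restart(list(state), fake_restart, fake_is_healthy)
print(result, state)
```

The load balancer still has to stop routing to the instance being restarted, which is the part that tools with a built-in proxy handle for you.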

striking

8 months ago

https://kamal-deploy.org/docs/configuration/proxy/

I think GP's point was that Kamal has all of these things already, so you don't have to set them up.

ngrilly

8 months ago

Precisely. I've been implementing some kind of blue-green deployment with both systemd and dockerd, but it was an imperfect and incomplete solution. Kamal put much more effort into it and it seems more convenient and reliable (but I haven't tried it yet in production).

FridgeSeal

8 months ago

Ah yes my favourite thing to have to do, rolling my own deploys and rollbacks.

It’s stuff like this that’s just a thousand papercuts that dissuades me from using these “simpler” tools. By the time you’ve rebuilt by hand what you need, you’ve just created a worse version of the “more complex” solution.

I get it if your workload is so simple or low requirement that zero-downtime deploys, rollbacks, health/liveness, automatic volumes, monitoring etc are features you don’t want or need, but “it’s just as good, just DIY all the things” doesn’t make it a viable alternative in my mind.

stackskipton

8 months ago

Sure, but Kamal getting all those features means it strays close to Kubernetes in complexity, and it quickly becomes "Why not Kubernetes? At least that is massively popular with a ton of support."

selcuka

8 months ago

I disagree. An opinionated tool can be as powerful as, but much simpler than a generic tool.

ngrilly

8 months ago

Kamal is doing most of this, but on a single node. This is the limitation that differentiates it from k8s, but also makes it much simpler.

stackskipton

8 months ago

I've looked into Kamal but it feels so "It's as complex as Kubernetes but isn't so support is going to be nightmarish."

Why is this better than Ansible + Docker Compose?

petcat

8 months ago

You could certainly implement Kamal just with Ansible and Docker Compose. It's just an abstraction that does it for you and handles all the edge-cases. (Kamal doesn't use Ansible, it has its own SSH lib).

amzans

8 months ago

Technically, it’s not much different from using Ansible to run Docker on remote hosts.

What it provides is a set of conventions based on what most web apps look like.

Eg. built-in proxy with automatic TLS and zero downtime deployments, first-class support for a DB and cache, encrypted secrets, etc.

It’s definitely not for every use case, but for your typical 3-tier monolith on a handful of servers I found it does the job well.

mplewis

8 months ago

Kamal is simply NIH K8s made by an unreliable company with poor leadership. No thanks, not for my prod infra!

archy_

8 months ago

I don't trust any project with a Discord listed so prominently

Give me a forum (even Discourse will do) , I'm tired of needing 3rd party spyware to interact with developers. That it is all closed off from search engines makes it even worse

thinkindie

8 months ago

Pulumi's genAI-based documentation is trash. I've moved to Terraform and I was able to achieve much better results in a shorter time thanks to Terraform's higher-quality documentation.

tholm

8 months ago

Worth noting that most of the Terraform documentation for classic Pulumi providers (providers built on top of TF providers) is still relevant to Pulumi.

mavdi

8 months ago

Hi everyone,

We've gone through a lot of pain to get this blueprint working since our AWS costs were getting out of hand but we didn't want to part ways with CDK.

We've now got the same stack structure going with Pulumi and Digital Ocean, with the same ease of development and at least a 60% cost reduction.

vundercind

8 months ago

Keep an eye on reachability and performance. I’ve seen DO consistently perform terribly and/or drop connections for months (that is, didn’t look like some brief routing glitch somewhere) for some US and Canadian routes (not, like, Sri Lanka or something) on excellent Internet connections. The fix was moving to AWS, problem gone. It felt like a shitty-peering-agreements issue.

nostrebored

8 months ago

People will pretend that this quality difference doesn’t exist in networking, uptime, server quality.

It’s not a drop in replacement. It might be worth it depending on what you’re doing.

vundercind

8 months ago

Frustratingly, it’s also something that doesn’t meaningfully appear on any features list or comparison sheet.

data_marsupial

8 months ago

How do you monitor the connection quality?

vundercind

8 months ago

From the client side. You can’t know what it should be like without knowing the client.

I’m sure there are lots of DO clients seeing the same things we did, but not realizing it.

We did see it (multiple DCs—we didn’t just not try to fix this before going to AWS) in multiple cases with tens of clients so if there’s good news it’s that if you can monitor like 100 clients distributed over a wide area and all of them behave as expected you may not be experiencing what we did. What we saw was closer to 5% with absurd slowness or frequently-dropped connections than to 0.01%.

And if you are just operating a website and sticking Cloudflare or whatever in front of DO anyway, this doesn’t matter. I expect that’s why it’s not a more widely-reported issue.

skywhopper

8 months ago

Please change the title text unless you add some discussion of the cost differences to the page you linked. However useful your tool is, nothing on this page mentions AWS or costs.

Aeolun

8 months ago

I don’t think Digital Ocean is all that much better for pricing, but using Pulumi over CDK is a pure win as far as I’m concerned.

JamesSwift

8 months ago

Agreed. On the bright side, I was able to migrate managed k8s on DO to managed k8s in GCP with very minimal work since it was managed via pulumi.

CSMastermind

8 months ago

Yeah, I've been really disappointed with Digital Ocean so far. Not just from a pricing perspective but from a customer service perspective.

Anyone using CDK should switch to Pulumi though.

thelittleone

8 months ago

Perhaps Pulumi with Vultr is also worth a look.

fulafel

8 months ago

Why's everyone going away from declarative? Terraform, CloudFormation, AWS Copilot etc have a lot of virtues and are programming language agnostic.

Using a complex programming language (C++ of the browser world) just for this has a big switching cost. Unless you're all in on TS. And/or have already built a huge complex IaC tower of babel where programming-in-the-large virtues justify it.

jnsaff2

8 months ago

> Why's everyone going away from declarative?

If I had to guess it's because

- more imperative background developers need to work with infrastructure and they bring over their mindset and ways of working

- infrastructure is more and more available through API's and it saves a lot of effort to dynamically iterate over cattle than declaratively deal with pets

- things like conditionals, loops and abstractions are very useful for a reason

- in essence the declarative tools are not flexible enough for many use cases or ways of working, using a programming language brings infinite flexibility
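As a small illustration of the "conditionals, loops and abstractions" point, here is a sketch in plain Python with no IaC library involved. `service_resources`, the resource types, and the naming scheme are hypothetical and exist only for this example.

```python
from typing import Dict, List


def service_resources(services: List[str], environment: str) -> List[Dict[str, str]]:
    """Generate resource descriptions for each service in one environment."""
    resources = []
    for svc in services:
        # Loop: every service gets the same kind of resource.
        resources.append(
            {"type": "container_service", "name": f"{environment}-{svc}"}
        )
        # Conditional: only production gets an alarm.
        if environment == "prod":
            resources.append(
                {"type": "alarm", "name": f"{environment}-{svc}-alarm"}
            )
    return resources


dev = service_resources(["api", "worker"], "dev")
prod = service_resources(["api", "worker"], "prod")
print(len(dev), len(prod))
```

The output is still a flat, declarative list of resources. The general-purpose language is only used to generate it, which is roughly the trade-off the tools discussed in this thread make.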

Personally I am more in the declarative camp and see the benefits of it, but there is a certain amount of banging one's head against its rigidity.

1dom

8 months ago

Complex programming languages for infrastructure code get used when people who are more comfortable using complex programming languages to solve their problems are given the problem of infrastructure and ops.

It is classic "every problem is a nail to the person with a hammer". Complex languages - by definition - can solve a wider variety of problems than a simple declarative language but - by definition - are less simple.

Complex languages for infra - IMO - are the wrong tool for the wrong job because of the wrong skills and the wrong person. The only reason why inefficiencies like this are ever allowed to happen is money.

"Why hire a dev and an ops when we can hire a single devops for fractionally less?" - some excited business person or some broken dev manager, probably.

vundercind

8 months ago

Declarative has in-practice meant “programming, but in YAML” more often than not, which is hell. YAML’s not even a good format for static data, and it’s even worse when you try to program in it.

anothernewdude

8 months ago

Terraform isn't really declarative. It's declarative right up until the point at which it isn't, where it falls apart. I need a declarative deployment right up to the application layer, which is where terraform fails.

pjmlp

8 months ago

Because they like to spend endless hours debugging infrastructure builds.

nasmorn

8 months ago

A small CDK project is a lot more readable in my opinion. It doesn’t have a ton of yml files where your config is spread out

fulafel

8 months ago

It seems to me that there's not a big difference in nr of files. You can have a single template in CF or Terraform files and similarly you can split your CDK code in many files, or not.

(For bigger stuff apparently CF has some limits relating to resources per single stack)

fragmede

8 months ago

Because sometimes you just need a for loop in a way that terraform's for_each/other DSL doesn't support

nsonha

8 months ago

declarative does not equate to config files

the property that equates to config files is "being static", which modern deployments are not.

nextworddev

8 months ago

Controversial opinion here: just use CDK. Learn cloud formation for advanced stuff. It’s really not that hard and pays dividends

coredog64

8 months ago

Just learn CloudFormation. It’s not that hard, and if you really want to write code, you can implement custom resources for all the times the service team let you down.

turbobrew

8 months ago

CDK is a second class citizen, it is missing implementations for many services and features. CDK was DOA as it should have been a requirement that when AWS added something to terraform it needed to be added to CDK as well.

ptdorf

8 months ago

In my experience AWS' CloudFormation is more limited in the number of resources and exposed APIs than any of the CDKs.

nextworddev

8 months ago

AWS service teams provide cloud formation support before CDK support in many cases, so eventually CDK users run into situations where they need to look at CF

mythz

8 months ago

Hetzner has been our "expensive AWS cloud costs" saviour

We've also started switching our custom Docker compose + SSL GitHub Action deployments to use Kamal [1] to take advantage of its nicer remote monitoring features

[1] https://kamal-deploy.org

KronisLV

8 months ago

I’ve been pretty happy with something like Docker Compose or Docker Swarm and Portainer, but honestly it’s nice that there are other alternatives that strive for something manageable and not too complex!

jmspring

8 months ago

One thing about managing EKS with Pulumi, Terraform, etc. if you deploy things like Istio that makes changes to infrastructure. Do a Terraform destroy - no luck, you are hunting down maybe some security groups or other assets Istio generated that TF doesn't know about. Good times.

skywhopper

8 months ago

This title text is nowhere on the linked page. Please get rid of the editorialization. DO is not that much cheaper for a baseline instance.

lysace

8 months ago

Pulumi is very neat with straight AWS, too. I suspect this is the primary use case.

giorgioz

8 months ago

CDK APIs in JavaScript are very nice. It's a much, much better developer experience than Pulumi/Terraform and even Serverless Framework. In our monorepo each service is in a separate folder with a folder called /infrastructure inside with a file called Stack.js that defines all the resources needed. When starting a new service we just copy one of the last similar services that we developed. We are able to deploy a new service in hours. Services are getting better and better with the accumulation of nice-to-have features that you wouldn't have time to add to most services.

lazzurs

8 months ago

This doesn’t sound good to me. Would you do the same with some functional code rather than creating an external versioned library?

Terraform or CDK I would want a simple shareable thing that did the boilerplate that I called with any variables I needed to change.

nasmorn

8 months ago

My DO K8S cluster ist bugging me every couple of months to do an upgrade. I am always scared to just run it but moving shit over to a new cluster instead is so much work that I simply gamble on it. AWS ECS is worth over penny

katdork

8 months ago

DO's K8S is more equivalent to AWS's EKS offering, so of course ECS which abstracts away pretty much all of the other parts of K8s is going to require less maintenance. It's sort of a false equivalence to say ECS == that solution.

On EKS, you need to do the same version updates with the same amount of terror.

You do pay the extra for the further management to just run containers somewhere!

(you might want to say "every" instead of over, "is" instead of "ist")

nasmorn

8 months ago

I definitely want to say is instead of ist but it is bugging me every couple of months. You do the upgrade and 6 months later it needs another one. No LTS in sight

wordofx

8 months ago

It’s only “insane costs” if you don’t know what you’re doing.

postalrat

8 months ago

Or need a good amount of ram. Which should be really cheap these days.

hinkley

8 months ago

My life on AWS the last five or so years really would have been a lot simpler if every new generation of EC2 servers didn't have the exact same ratio of RAM to cores.

zokier

8 months ago

At this point the memory:vcpu ratio is the defining characteristic of the main general-purpose C/M/R series; I'd think it would be pretty disruptive to change that significantly now. And they also have the special extra-high-memory X series available. I would say EC2 is pretty flexible in this regard: you have options for 2/4/8/16/32 gigabytes per vcpu. It's mostly a problem if you need even less memory than what the C series provides, or need some special features.
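A quick worked example of what those fixed ratios mean in practice, as a plain-Python sketch. The ratios are the ones quoted in the comment above; the helper names and the sample workload are hypothetical.

```python
from typing import Optional

# GiB of memory per vCPU, as listed in the comment above.
GIB_PER_VCPU = [2, 4, 8, 16, 32]


def smallest_fitting_ratio(vcpus: int, memory_gib: float) -> Optional[int]:
    """Return the smallest memory:vCPU ratio that covers the workload, if any."""
    needed = memory_gib / vcpus
    for ratio in GIB_PER_VCPU:
        if ratio >= needed:
            return ratio
    return None  # needs more memory per vCPU than any listed ratio offers


def unused_memory_gib(vcpus: int, memory_gib: float) -> Optional[float]:
    """Memory paid for but not needed when rounding up to the next ratio."""
    ratio = smallest_fitting_ratio(vcpus, memory_gib)
    if ratio is None:
        return None
    return ratio * vcpus - memory_gib


# Hypothetical workload: 4 vCPUs that need 20 GiB, i.e. 5 GiB per vCPU.
ratio = smallest_fitting_ratio(4, 20)
spare = unused_memory_gib(4, 20)
print(ratio, spare)
```

Because each step doubles the ratio, a workload that slightly outgrows one tier ends up paying for up to twice the memory it had before.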

hinkley

8 months ago

As products age they tend to use more memory. Add in space/time tradeoffs asking to use more. You either get stuck applying the brakes trying to keep the memory creep at bay, or you give in and jump to 2x the memory pool which will disappear too.

The old solution in on-prem was to populate machines with 2/3 to 3/4 of their max addressable memory and push back on the expensive upgrade as long as possible, or at least until memory prices came down for the most expensive modules. Then faster hard drives or new boxes are the next step.

mkesper

8 months ago

RAM in cloud is expensive because it's the only thing still not possible to over-provision performantly afaik.

yieldcrv

8 months ago

and even if you do, it’s usually a system design problem that you’re maintaining

on one hand, I can see how this is an unfalsifiable standard, on the other hand I can see the utility of solving a friction for people that messed up

mise_en_place

8 months ago

EKS has become a clusterf*ck to manage and provision. This looks very useful. Bare metal k8s, even running on EC2, might be another option.

GauntletWizard

8 months ago

You don't choose EKS because it's easy to manage. You choose it because you intend to use the bevy of other AWS hosted services. The clusterfuck of management is directly related to that.

The alternative, which I feel is far too common (and I say this as someone who directly benefits from it): You choose AWS because it's a "Safe" choice and your incubator gets you a bunch of free credits for a year or two. You pay nothing for compute for the first year, but instead pay a devops guy a bunch to do all the setup - In the end it's about a wash because you have to pay a devops guy to handle your CI and deploy anyway, you're just paying a little more in the latter.

trallnag

8 months ago

What's your issue with EKS? I operate several very simple and small single-tenant clusters, and I have to touch the infrastructure only once a year for updates

RoxaneFischer1

8 months ago

I personally love Terraform. It's easy to use, and its rigid framework actually leads to fewer mistakes and way more readable code than Pulumi.

strzibny

8 months ago

You can also simplify Kubernetes to just Kamal and things become instantly easier...

pmarreck

8 months ago

Anyone use Garnix? https://garnix.io/

mplewis

8 months ago

This looks too experimental for me to trust with production deployments.

kristianpaul

8 months ago

Is this an Ad?

nextworddev

8 months ago

GitHub has been littered with developer relations growth hacks recently.

icar

8 months ago

I strongly recommend sst.dev

magamanlegends

8 months ago

[dead]

nixdev

8 months ago

Digital Ocean isn't really a "real" cloud. Maybe use Digital Ocean if you're hosting video game servers, but no serious business should be on it.

Sohcahtoa82

8 months ago

I wouldn't even use DO for that, unless it's like a private server for just your friends.

I won't touch DO after they took my droplet offline for 3 hours because I got DDoS'd by someone that was upset that I banned them from an IRC channel for spamming N-bombs and other racial slurs.

aitchnyu

8 months ago

When was this? Now DO and Linode promise full DDOS protection.

Dylan16807

8 months ago

What's your definition of real cloud?

And can you name a real cloud that charges a half-reasonable price for bandwidth? I consider $10/TB to be half-reasonable.

15155

8 months ago

Ideally one that doesn't have these kinds of issues:

https://news.ycombinator.com/item?id=6983097

Dylan16807

8 months ago

That was more than ten years ago, I don't think that tells us about current quality.

nixdev

8 months ago

While yes, it was more than ten years ago, we can see that such stupidity is woven into their DNA as a company.

TL;DR: where a cloud provider hosts customers for which there are real-world consequences for data leakage, not a single customer can be at risk of data leakage. It's a different line of thinking, almost "a different world", between those who have this line of thinking and those who don't.

"The thing about reputations is you only have one".

By contrast even more than ten years before that, AWS was publishing whitepapers about how all contents of RAM to be used by a VM are initialized before a VM is provisioned, and other efforts to proactively scrub customer data.

I worked at a niche cloud provider a bit over ten years ago. We used Intel QAT for client-side encryption for our network attached pools of SSD. We were able to offer all-SSD at low cost and without security blindspots by crypto key rotation implemented by compartmentalized teams and also physical infrastructure compartmentalization patterns. Which, about half a decade later we found we were second only to AWS and almost second (but ahead of in other ways) to some smaller cloud-style hosting provider.

Dylan16807

8 months ago

> While yes, it was more than ten years ago, we can see that such stupidity is woven into their DNA as a company.

I don't know if it really meets that bar, but I won't argue about that right now. I'm just going to ask again for your definition of "real cloud" and whether you can suggest some that don't price gouge bandwidth (and aren't oracle, I would not consider them worthy of trust either).

nixdev

8 months ago

> I'm just going to ask again for your definition of "real cloud"

Even from all the way over here, I infer that I think we're from so different worlds that what "real cloud" means to my side of the world isn't a part of your world.

What I can tell you is that AWS is the king of cloud, Google Cloud is a very, very distant 2nd place, and Azure is an even more distant 3rd place.

> and aren't oracle, I would not consider them worthy of trust either

Smart man.