hackernews client

Why did containers happen?

208 pointsposted 4 months ago

286 Comments

alphazard

4 months ago

Containers (meaning Docker) happened because CGroups and namespaces were arcane and required lots of specialized knowledge to create what most of us can intuitively understand as a "sandbox".

Cgroups and namespaces were added to Linux in an attempt to add security to a design (UNIX) which has a fundamentally poor approach to security (shared global namespace, users, etc.).

It's really not going all that well, and I hope something like SEL4 can replace Linux for cloud server workloads eventually. Most applications use almost none of the Linux kernel's features. We could have very secure, high performance web servers, which get capabilities to the network stack as initial arguments, and don't have access to anything more.

Drivers for virtual devices are simple, we don't need Linux's vast driver support for cloud VMs. We essentially need a virtual ethernet device driver for SEL4, a network stack that runs on SEL4, and a simple init process that loads the network stack with capabilities for the network device, and loads the application with a capability to the network stack. Make building an image for that as easy as compiling a binary, and you could eliminate maybe 10s of millions of lines of complexity from the deployment of most server applications. No Linux, no docker.

Because SEL4 is actually well designed, you can run a sub kernel as a process on SEL4 relatively easily. Tada, now you can get rid of K8s too.

tliltocatl

4 months ago

Containers and namespaces are not about security. They are about not having singleton objects at the OS level. Would have called it virtualization if the word wasn't so overloaded already. There is a big difference that somehow everyone misses. A bypassable security mechanism is worse than useless. A bypassable virtualization mechanism is useful. It is useful to be able to have a separate root filesystem just for this program - even if a malicious program is still able to detect it's not the true root.

As about SEL4 - it is so elegant because it leaves all the difficult problems to the upper layer (coincidentally making them much more difficult).

topspin

4 months ago

> Containers and namespaces are not about security

True. Yet containers, or more precisely the immutable images endemic to container systems, directly address the hardest part of application security: the supply chain. Between the low effort and risk entailed when revising images to address endlessly emerging vulnerabilities, and enabling systematized auditing of immutable images, container images provide invaluable tools for security processes.

I know about Nix and other such approaches. I also know these are more fragile than the deeply self-contained nature of containers and their images. That's why containers and their image paradigm have won, despite all the well-meaning and admirable alternatives.

> A bypassable security mechanism is worse than useless

Also true. Yet this is orthogonal to the issues of supply chain management. If tomorrow, all the problems of escapable containers were somehow solved, whether by virtual machines on flawless hypervisors, or formally verified microkernels, or any other conceivable isolation mechanism, one would still need some means to manage the "content" of disparate applications, and container systems and the image paradigm would still be applicable.

otabdeveloper4

4 months ago

> I also know these are more fragile than the deeply self-contained nature of containers and their images

Not really. People only use Nix because it doesn't randomly break, bitrot or require arcane system setup.

Unlike containers. You really need k8s or something like it to mould Docker containers into something manageable.

topspin

4 months ago

> People only use Nix because it doesn't randomly break, bitrot or require arcane system setup.

I'll stipulate this, despite knowing and appreciating the much greater value Nix has.

Then, the problem that Nix solves isn't something container users care about. At scale, the bare metal OS hosting containers is among the least of one's problems: typically a host image is some actively maintained, rigorously tested artifact provided by one of a couple different reliable sources. Ideally container users are indifferent to it, and they experience few if any surprises using them, including taking frequent updates to close vulnerabilities.

> Unlike containers.

Containers randomly break or bitrot? I've never encountered that view. They don't do this as far as I'm aware. Container images incorporate layer hashing that ensure integrity: they do not "bitrot." Image immutability delivers highly consistent behavior, as opposed to "randomly break." The self-contained nature of containers delivers high portability, despite differences in "system setup." I fail to find any agreement with these claims. Today, people think nothing of developing images using one set of tools (Docker or what have you) and running these image using entirely distinct runtimes (containerd, cloud service runtimes, etc.) This is taken entirely for granted, and it works well.

> Arcane system setup.

I don't know what is meant by "system setup" here, and "arcane" is subjective. What I do know is that the popular container systems are successfully and routinely used by neophytes, and that this doesn't happen when the "system setup" is too demanding and arcane. The other certainty I have is that whatever cost there is in acquiring the rather minimal knowledge needed to operate containers is vastly smaller than achieving the same ends without containers: the moment a system involves more than 2-3 runtime components, containers start paying off verses running the same components natively.

otabdeveloper4

4 months ago

> Containers randomly break or bitrot?

All the fucking time. Maybe it's possible to control your supply chain properly with containers, but nobody actually does that. 99% of the time they're pulling in some random "latest image" and applying bespoke shell commands on top.

> I don't know what is meant by "system setup" here, and "arcane" is subjective.

Clearly you've never debugged container network problems before.

topspin

4 months ago

> but nobody actually does that

They do. I assure you.

> they're pulling in some random "latest image"

Hardly random. Vendoring validated images from designated publishers into secured private repos is the first step on the supply chain road.

> Clearly you've never debugged container network problems before.

Configuring Traefik ingress to forward TCP connections to pods was literally the last thing I did yesterday. At one time or another I've debugged all the container network problems for every widely used protocol in existence, and a number of not so common ones.

otabdeveloper4

4 months ago

> first step on the supply chain road

99 percent of Docker container users aren't on the supply chain road. They just want to "docker pull", #yolo.

> Configuring Traefik ingress to forward TCP connections to pods was literally the last thing I did yesterday

Docker does crazy insane modifications to your system settings behind the scenes. (Of which turning off the system firewall is the least crazy.)

Have fun when the magic Docker IP addresses happen to conflict with your corporate LAN.

topspin

4 months ago

Feel free to have whatever problems you enjoy with Docker and its users. The discussion was about containers and their security, reliability and usability, and there I haven't found one thing you've written that made any sense. Your conflation of Docker with all of this is a strong clue that your actual knowledge on the topic is limited.

XorNot

4 months ago

Containers don't break in any of those ways, but rebuilding the images with updates does and the same is entirely true of nix.

otabdeveloper4

4 months ago

No, because Nix configuration is declarative and statically checked.

Containers is "run these random shell commands I copy pasted from the internet on top of this random OS image I pulled from the internet, #yolo".

XorNot

4 months ago

Did you inspect the build code of all the nixpkgs you imported? Did you inspect the code of the tarballs they depend on? Sure, the SHA256 is right there...did you look at it?

People copy and paste nix code all the damn time because it's downright unparseable and inscrutable to the majority of users. Just import <module>, set some attrs and hit build. #yolo

otabdeveloper4

4 months ago

Nix code is composable and statically checked for consistency. Docker containers is just a random sequence of shell scripts that sometimes happens to not error out because people mostly only use the same five Ubuntu or Alpine base images and don't layer more than two things at once.

You see the difference?

alphazard

4 months ago

> As about SEL4 - it is so elegant because it leaves all the difficult problems to the upper layer (coincidentally making them much more difficult).

I completely buy this as an explanation for why SEL4 for user environments hasn't (and probably will never) take off. But there's just not that much to do to connect a server application to the network, where it can access all of its resources. I think a better explanation for the lack of server side adoption is poor marketing, lack of good documentation, and no company selling support for it as a best practice.

frumplestlatz

4 months ago

The lack of adoption is because it’s not a complete operating system.

Using sel4 on a server requires complex software development to produce an operating environment in which you can actually do anything.

I’m not speaking ill of sel4; I’m a huge fan, and things like it’s take-grant capability model are extremely interesting and valuable contributions.

It’s just not a usable standalone operating system. It’s a tool kit for purpose-built appliances, or something that you could, with an enormous amount of effort, build a complete operating system on top of.

josephg

4 months ago

Yes. I really hope someone builds a nice, usable OS with SeL4 as a base. If SeL4 is like the linux kernel, we need a userland (GNU). And a distribution that's simple to install and make use of.

I'd love to work on this. It'd be a fun problem!

boredatoms

4 months ago

seL4 needs a ‘the rest of the kernel’ to be like linux

josephg

4 months ago

It needs device drivers for modern x86 hardware. And filesystems, and a TCP stack. All of that code can be done in "SeL4 userland", but yeah - I see your point.

Are there any projects like that going on? It feels like an obvious thing.

frumplestlatz

4 months ago

A lot of deployments essentially virtualize Linux or run portions of NetBSD (e.g. via their "rump" kernel mechanism) to achieve driver support, file systems, etc. That's not really a general-purpose solution, though.

There is work within major consumer product companies building such things (either with sel4, or things based on sel4's ideas), and there's Genode on seL4.

LargoLasskhyfv

4 months ago

Are you aware of https://genode.org ?

tliltocatl

4 months ago

> But there's just not that much to do to connect a server application to the network, where it can access all of its resources.

If you only care to run stateless stuff that never write anything (or at least never read what they wrote) - it's comparatively easy. Still gotta deal with the thousand drivers - even on the server there are a lot of quirky stuff. But then you gotta run the database somewhere. And once you run a database you get all the problems Linus warned about. So you gotta run the database on a separate Linux box (at that point - what do you win vs. using Linux for everything?) or develop a new database tailored for SeL4 (and that's quite a bit more complex than an OS kernel). An elegant solution that only solves a narrow set of cases stands no chance over a crude solution that solves every case.

Also, with the current sexy containerized stacks it's easy to forget, but having same kind of environment on the programmer's workbench and on the sever was once Unix's main selling point. It's kinda expensive to support a separate abstraction stack for a single purpose.

man8alexd

4 months ago

> A bypassable security mechanism is worse than useless

Looks like the Nirvana fallacy.

user

4 months ago

[deleted]

coppsilgold

4 months ago

> Containers and namespaces are not about security

An escape from properly configured container/namespaces is a kernel 0day. Or a 0day in whatever protocol the isolated workload talks to the outside with.

graemep

4 months ago

> Containers and namespaces are not about security

People keep saying that, but I do not get it. If an attack that would work without a container, fails from inside a container (e.g. because it cannot read or write a particular file, or it cannot) it is better security.

> A bypassable security mechanism is worse than useless.

It needs the bypass to exist, and it needs an extra step to actually bypass it.

Any security mechanism (short of air gaps) might have a bypass.

> even if a malicious program is still able to detect it's not the true root.

Also true for security unless it can read or write to the true root.

tliltocatl

4 months ago

You can use containers as a security measure, but I'd argue that if (when) it fails in a spectacular way (see e. g. abstract sockets for an interesting past issue) it's your fault and not a zero-day in the kernel as a sibling comment suggest. To put it a bit less harsh - containers are not just for security and containerization tools have to balance security vs usability.

graemep

4 months ago

Yes, I do not think we disagree much.

I use containers as an extra security measure. i.e. as a way of reducing the chance that a compromise of one process will lead to a compromise of the rest of the system.

That said, I would guess that providers of container hosting must be fairly confident that they can keep them secure. I do not know what extra precautions they take though.

bombcar

4 months ago

Is that why containers started? I seem to recall them taking off because of dependency hell, back in the weird time when easy virtualization wasn't insanely available to everyone.

Trying to get the versions of software you needed to use all running on the same server was an exercise in fiddling.

mbreese

4 months ago

I think there were multiple reasons why containers started to gain traction. If you ask 3 people why they started using containers, you're likely to get 4 answers.

For me, it was avoiding dependencies and making it easier to deploy programs (not services) to different servers w/o needing to install dependencies.

I seem to remember a meetup in SF around 2013 where Docker (was it still dotCloud back then?) was describing a primary use-case was easier deployment of services.

I'm sure for someone else, it was deployment/coordination of related services.

dabockster

4 months ago

The big selling points for me were what you said about simplifying deployments, but also the fact that a container uses significantly less resource overhead than a full blown virtual machine. Containers really only work if your code works in user space and doesn't need anything super low level (eg TCP network stack), but as long as you stay in user space it's amazing.

misnome

4 months ago

The main initial drive for me was that it let me separately run many things without a) trying to manage separate dependency sets, and b) Sharing RAM - Without having to physically allocate large amounts of memory to virtual machines; on an 8GB machine at a couple per VM that doesn’t let you get far.

fragmede

4 months ago

"making it easier to deploy" is a rather... clinical description for fixing the "but it works on my machine!" issue. We could go into detail on how it solved that, but imo it comes down to that.

yen223

4 months ago

There's a classic joke where it turns out the solution to "it works on my machine" was to ship my machine

agumonkey

4 months ago

my view of docker, as a who thought it was a shallow wrapper on linux namespaces, is that it was a good fit for the average IT shop to solve the deployment friction

no more handmade scripts(or worse fully manual operations) stupid simple dockerfile scripts.. any employee would be able to understand and groups can organize around it

docker-compose tying services into their own subnet was really a cool thing though

MonaroVXR

4 months ago

Still not the case, today anyone would understand them. At least in this part of the country where I live

simonjgreen

4 months ago

This matches my recollection. Easily repeatable development and test environments that would save developers headaches with reproduction. That then lead logically to replacement of Ansible etc for the server side with the same methodology.

There were many use cases that rapidly emerged, but this eclipsed the rest.

Docker Hub then made it incredibly easy to find and distribute base images.

Google also made it “cool” by going big with it.

chasd00

4 months ago

iirc full virtualization was expensive ( vmware ) and paravirtualization was pretty heavyweight and slow ( Xen ). I think Docker was like a user friendlier cgroups and everyone loved it. I can't remember the name but there was a "web hosting company in a box" software that relied heavily on LXC and probably was some inspiration for containerization too.

edit: came back in to add reference to LXC, it's been probably 2 decades since i've thought about that.

oblio

4 months ago

LXD?

chasd00

4 months ago

Heh til LXD is not a typo. Thanks :)

ctkhn

4 months ago

On a personal level, that's why I started using them for self hosting. At work, I think the simplicity of scaling from a pool of resources is a huge improvement over having to provision a new device. Currently at an on-prem team and even moving to kubernetes without going to cloud would solve some of the more painful operational problems that send us pages or we have to meet with our prod support team about.

alphazard

4 months ago

Yes, totally agree that's a contributor too. I should expand that by namespaces I mean user, network, and mount table namespaces. The initial contents of those is something you would have to provide when creating the sandbox. Most of it is small enough to be shipped around in a JSON file, but the initial contents of a mount table require filesystem images to be useful.

kace91

4 months ago

There are two answers to “why x happened”.

You’re talking about the needs it solves, but I think others were talking about the developments that made it possible.

My understanding is that Docker brought features to the server and desktop (dependency management, similarity of dev machine and production, etc), by building on top of namespacing capabilities of Linux with a usability layer on top.

Docker couldn’t have existed until those features were in place and once they existed it was an inevitability for them to be leveraged.

DrScientist

4 months ago

And what was the reason for the dependency hell?

Was it always so hard to build the software you needed on a single system?

theamk

4 months ago

Because our computers have global state all over the place, and people like it, as it simplifies a lot of things.

You could see that history repeat itself in Python - "pip install something" is way easier to do that messing with virtualenvs, and even works pretty well as long as number of package is small, so it was a recommendation for a long time. Over time, as number of Python apps on same PC grew, and as the libraries gained incompatible versions, people realized it's a much better idea to keep all things isolated in its own virtualenv, and now there are tools (like "uv" and "pipx") which make it trivial to do.

But there are no default "virtualenvs" for regular OS. Containers get closest. nix tries hard, but it is facing uphill battle - it goes very much "against the grain" of *nix systems, so every build script of every used app needs to be updated to work with it. Docker is just so much easier to use.

Golang has no dynamic code loading, so a lot of times it can be used without containers. But there is still global state (/etc/pki, /etc/timezone, mime.types , /usr/share/, random Linux tools the app might call on, etc...) so some people still package it in docker.

linksnapzz

4 months ago

No. Back before dynamic objects, for instance, it was easier-of course, there were other challenges at the time.

DrScientist

4 months ago

So perhaps the Linux choice of dynamic by default is partly to blame for dependency hell, and thus the rise of cloning entire systems to isolate a single program?

Ironically one of the arguments for dynamic linking is memory efficiency and small exec size ( the other is around ease of centrally updating - say if you needed to eliminate a security bug ).

linksnapzz

4 months ago

See...there's the thing; dynamic linking was originally done by Unixen in the '80s, way before Linux, as a way to cope w/ original X11 on machines that had only 2-4MB of RAM.

X was (in)famous for memory use (see the chapter in the 'Unix-Hater's Handbook'); and shared libs was the consensus as to how to make the best of a difficult situation, see:

http://harmful.cat-v.org/software/dynamic-linking/

DrScientist

4 months ago

According to your link ( great link BTW ) Rob Pike said dynamic linking for X was a net negative on memory and speed and only had a tiny advantage in disk space.

My preference is to bring dependencies in at the source code level and compile them in to the app - stops the library level massive dependency trees ( A need part of B but because some other part of B needs C our dependency tool brings in C, and then D and so on ).

linksnapzz

4 months ago

This seems to have worked out well for the plan9 guys. It's just not a popular approach nowadays.

tptacek

4 months ago

This makes sense if you look at containers as simply a means to an end of setting up a sandbox, but not really much sense at all if you think of containers as a way to make it easy to get an arbitrary application up and running on an arbitrary server without altering host system dependencies.

ianburrell

4 months ago

I suspect that containers would have taken off even without isolation. I think the important innovation of Docker was the image. It let people deploy consistent version of their software or download outside software.

All of the hassle of installing things was in the Dockerfile, and it was run in containers so more reliable.

bostik

4 months ago

I honestly think that Dockerfile was the biggest driver. Containers as a technology are useful, for the many reasons outlined in this thread. But what Dockerfiles achieved was to make the technology accessible to much wider and much less technically deep audience. The syntax is easy to follow, the vocabulary available for the DSL is limited, and the results are immediately usable.

Oh, and the layer caching made iterative development with _very_ rapid cycles possible. That lowered the bar for entry and raised the floor for everyone to get going easier.

But back to Dockerfiles. The configuration language used made it possible for anyone[tm] to build a container image, to ship a container image and to run the container. Fire-and-forget style. (Operating the things in practice and at any scale was left as an exercise for the reader.)

And because Anyone[tm] could do it, pretty much anyone did. For good and ill alike.

tptacek

4 months ago

I agree: I think the container image is what matters. As it turns out, getting more (or less) isolation given that image format is not a very hard problem.

thaumasiotes

4 months ago

> I think the important innovation of Docker was the image. It let people deploy consistent version of their software or download outside software.

What did it let people do that they couldn't already do with static linking?

cschep

4 months ago

I can't tell if this is a genuine question or not but if it is.. deploying a Ruby on Rails app with a pile of gems that have c deps isn't fixed with static linking. This is true for python and node and probably other things I'm not thinking of.

pjc50

4 months ago

- most languages don't really do static linking in the same way as C

- things like "a network port" can also be a dependency, but can't be "linked". And so on for all sorts of software that expects particular files to be in particular places, or requires deploying multiple communicating executables

- Linux requires that you be root to open a port below 1024, a security disaster

- some dependencies really do not like being statically linked (this includes the GNU standard library!), for things like nsswitch

misnome

4 months ago

Because it doesn’t need you to delve deep into the build system of every dependency and application you ever want to package?

thaumasiotes

4 months ago

>> It let people deploy consistent version of their software

oblio

4 months ago

Surprisingly "my software" depends on a lot of other stuff. Python, Ruby, PHP, JS, etc all need tens to hundreds of native libraries that have to be deployed.

misnome

4 months ago

>> …or download outside software.

theamk

4 months ago

I've worked on software that linked to ffmpeg libs.

Good luck making _that_ static.

oblio

4 months ago

What about the ton of languages that don't have static linking?

zellyn

4 months ago

Agreed. There was a point where I thought AMIs would become the unit of open source deployment packaging, and I think docker filled that niche in a cloud-agnostic way

zellyn

4 months ago

ps I still miss the alternate universe where Kenton won the open source deployment battle :-)

lproven

4 months ago

Do tell? Google got me nowhere...

zellyn

4 months ago

https://sandstorm.io

ants_everywhere

4 months ago

> Because SEL4 is actually well designed, you can run a sub kernel as a process on SEL4 relatively easily. Tada, now you can get rid of K8s too.

k8s is about managing clusters of machines as if they were a single resource. Hence the name "borg" of its predecessor.

AFAIK, this isn't a use case handled by SEL4?

alphazard

4 months ago

The K8s master is just a scheduling application. It can run anywhere, and doesn't depend on much (just etcd). The kublet (which runs on each node) is what manages the local resources. It has a plugin architecture, and when you include one of each necessary plugin, it gets very complicated. There are plugins for networking, containerization, storage.

If you are already running SEL4 and you want to spawn an application that is totally isolated, or even an entire sub-kernel it's not different than spawning a process on UNIX. There is no need for the containerization plugins on SEL4. Additionally the isolation for the storage and networking plugins would be much better on SEL4, and wouldn't even really require additional specialized code. A reasonable init system would be all you need to wire up isolated components that provide storage and networking.

Kubernetes is seen as this complicated and impressive piece of software, but it's only impressive given the complexity of the APIs it is built on. Providing K8s functionality on top of SEL4 would be trivial in comparison.

ants_everywhere

4 months ago

I understand what you're saying, and I'm a fan of SEL4. But isolation isn't one of the primary points of k8s.

Containerization is after all, as you mentioned, a plugin. As is network behavior. These are things that k8s doesn't have a strong opinion on beyond compliance with the required interface. You can switch container plugin and barely notice the difference. The job of k8s is to have control loops that manage fleets of resources.

That's why containers are called "containers". They're for shipping services around like containers on boats. Isolation, especially security isolation, isn't (or at least wasn't originally) the main idea.

You manage a fleet of machines and a fleet of apps. k8s is what orchestrates that. SEL4 is a microkernel -- it runs on a single machine. From the point of view of k8s, a single machine is disposable. From the point of view of SEL4, the machine is its whole world.

So while I see your point that SEL4 could be used on k8s nodes, it performs a very different function than k8s.

MrDarcy

4 months ago

The scheduler is the least interesting thing about k8s. The extensible API common to all operating environments is the real value add.

As others mentioned containers aren’t about security either, I think you’re rather missing the whole purpose of the cloud native ecosystem here.

antonvs

4 months ago

> Kubernetes is seen as this complicated and impressive piece of software, but it's only impressive given the complexity of the APIs it is built on.

There are other reasons it's impressive. Its API and core design is incredibly well-designed and general, something many other projects could and should learn from.

But the fact that it's impressive because of the complexity of the APIs it's built on is certainly a big part of its value. It means you can use a common declarative definition to define and deploy entire distributed systems, across large clusters, handling everything from ingress via load balancers to scaling and dynamic provisioning at the node level. It's essentially a high-level abstraction for entire data centers.

seL4 overlaps with that in a pretty minimal way. Would it be better as underlying infrastructure than the Linux kernel? Perhaps, but "providing K8s functionality on top of SEL4" would require reimplementing much of what Linux and various systems on top of it currently provide. Hardly "trivial in comparison".

never_inline

4 months ago

You're just replacing the functionality of CRI, which is already pluggable. Rest of the kubernetes is still needed.

amazingman

4 months ago

You have solved the isolation and some storage problems for a single node. You have not solved for scaling that to 10s, 100s, 1000s of nodes. That's where Kubernetes comes in. You made a lot of good points, but "you no longer need k8s" is not one of them.

orbifold

4 months ago

It would be great if we got "kernel independent" Nvidia drivers. I have some experience with bare-metal development and it really seems like most of what an operating system provides could be provided in a much better way as a set of libraries that make specific pieces of hardware work, plus a very good "build" system.

themafia

4 months ago

> which has a fundamentally poor approach to security

Unix was not designed to be convenient for VPS providers. It was designed to allow a single computer to serve an entire floor of a single company. The security approach is appropriate for the deployment strategy.

As it did with all OSes, the Internet showed up, and promptly ruined everything.

stinkbeetle

4 months ago

cgroups first came from resource management frameworks that IIRC came out of IBM and got into some distro kernels for a time but not upstream.

Namespaces were not an attempt to add security, but just grew out of work to make interfaces more flexible, like bind mounts. And Unix security is fundamentally good, not having namespaces isn't much of a point against it in the first place, but now it does have them.

And it's going pretty well indeed. All applications use many kernel features, and we do have very secure high performance web and other servers.

L4 systems have been around for as long as Linux, and SEL4 in particular for 2 decades. They haven't moved the needle much so I'd say it's not really going all that well for them so far. SEL4 is a great project that has done some important things don't get me wrong, but it doesn't seem to be a unix replacement poised for a coup.

onjectic

4 months ago

> Unix security is fundamentally good

L. Ron Hubbard is fundamentally good!

I kid, but seriously, good how? Because it ensures cybersecurity engineers will always have a job?

seL4 is not the final answer, but something close to it absolutely will be. Capability-based security is an irreducible concept at a mathematical level, meaning you can’t do better than it, at best you can match it, and its certainly not matched by anything else we’ve discovered in this space.

stinkbeetle

4 months ago

> good how?

Good because it is simple both in terms of understanding it and implementing it, and sufficient in a lot of cases.

> seL4 is not the final answer, but something close to it absolutely will be. Capability-based security is an irreducible concept at a mathematical level, meaning you can’t do better than it, at best you can match it, and its certainly not matched by anything else we’ve discovered in this space.

Security is not pure math though, it's systems and people and systems of people.

man8alexd

4 months ago

cgroups are from Google. https://lwn.net/Articles/199643/

stinkbeetle

4 months ago

Yes but I don't see how you are addressing what I wrote. What are you getting at?

stinkbeetle

4 months ago

Oh, maybe my first post was poorly worded. cgroups used ideas about resource management that came from IBM, who had resource maangement before Google was a company and their CKRM was an earlier proposal. For whatever reason, cgroups won. But they also mostly came about due to Google's internal resource management, not so much security.

zozbot234

4 months ago

> Cgroups and namespaces were added to Linux in an attempt to add security to a design (UNIX) which has a fundamentally poor approach to security (shared global namespace, users, etc.)

Namespacing of all resources (no restriction to a shared global namespace) was actually taken directly from plan9. It does enable better security but it's about more than that; it also sets up a principled foundation for distributed compute. You can see this in how containerization enables the low-level layers of something like k8s - setting aside for the sake of argument the whole higher-level adaptive deployment and management that it's actually most well-known for.

lproven

4 months ago

> I hope something like SEL4 can replace Linux for cloud server workloads eventually.

Why not 9front and diskless Linux microVMs, Firecracker/Kata-containers style?

Filesystem and process isolation in one, on an OS that's smaller than K8s?

Keep it simple and Unixy. Keep the existing binaries. Keep plain-text config and repos and images. Just replace the bottom layer of the stack, and migrate stuff to the host OS as and when it's convenient.

pianopatrick

4 months ago

The story I heard was that containers let you use less memory and better share the kernel and CPU compared to Virtual Machines, such that you could run more applications on the same servers. This translates into direct cost savings, which is why large companies with large server farms were willing to pay their engineers to develop the technology and transition to the technology.

In terms of security, I think even more secure than SEL4 or containers or VMs would be having a separate physical server for each application and not sharing CPUs or memory at all. Then you have a security boundary between applications that is based in physics.

Of course, that is too expensive for most business use cases, which is why people do not use it. I think using SEL4 will run into the same problem - you will get worse utilization out of the server compared to containers, so it is more expensive for business use cases and not attractive. If we want something to replace containers that thing would have to be both cheaper and more secure. And I'm not sure what that would be

tombert

4 months ago

> Drivers for virtual devices are simple, we don't need Linux's vast driver support for cloud VMs. We essentially need a virtual ethernet device driver for SEL4, a network stack that runs on SEL4, and a simple init process that loads the network stack with capabilities for the network device, and loads the application with a capability to the network stack. Make building an image for that as easy as compiling a binary, and you could eliminate maybe 10s of millions of lines of complexity from the deployment of most server applications. No Linux, no docker.

Wasn't this what unikernels were attempting a decade ago? I always thought they were neat but they never really took off.

I would totally be onboard with moving to seL4 for most cloud applications. I think Linux would be nearly impossible to get into a formally-verified state like seL4, and as you said most cloud stuff doesn't need most of the features of Linux.

Also seL4 is just cool.

antod

4 months ago

I don't think Docker came about due to cgroups and namespaces being arcane, LXC was already abstracting that away.

Docker's claim to fame was connecting that existing stuff with layered filesystem images and packaging based off that. Docker even started off using LXC to cover those container runtime parts.

lmm

4 months ago

> Containers (meaning Docker) happened because CGroups and namespaces were arcane and required lots of specialized knowledge to create what most of us can intuitively understand as a "sandbox".

That might be why Docker was originally implemented, but why it "happened" is because everyone wanted to deploy Python and pre-uv Python package management sucks so bad that Docker was the least bad way to do that. Even pre-kubernetes, most people using Docker weren't using it for sandboxing, they were using it as fat jars for Python.

procaryote

4 months ago

Not only python, although python is particularly bad.

Even java things wher fatjars exist you at some point end up with os level dependencies like "and this logging thing needs to be set up, and these dirs need these rights, and this user needs to be in place" etc. Nowadays you can shove that into a container

thaumasiotes

4 months ago

> namespaces were added to Linux in an attempt to add security to a design (UNIX) which has a fundamentally poor approach to security (shared global namespace, users, etc.)

If the "fundamentally poor approach to security" is a shared global namespace, why are namespaces not just a fix that means the fundamental approach to security is no longer poor?

noduerme

4 months ago

You say applications and web servers kind of interchangeably. I don't know anything about SEL4. What if your application needs to spawn and manage executables as child processes? Is it Linux-like enough to run those and handle stuff like that so that those of us coding at the application layer don't need to worry about it too much?

Eikon

4 months ago

    make tinyconfig

can get you pretty lean already.

https://archive.kernel.org/oldwiki/tiny.wiki.kernel.org/

m463

4 months ago

seems like all this was part of a long evolution.

I think the whole thing has been levels of abstraction around a runtime environment.

in the beginning we had the filesystem. We had /usr/bin, /usr/local/bin, etc.

then chroot where we could run an environment

then your chgroups/namespaces

then docker build and docker run

then swarm/k8s/etc

I think there was a parallel evolution around administration, like configure/make, then apt/yum/pacman, then ansible/puppet/chef and then finally dockerfile/yaml

man8alexd

4 months ago

The irony is that dockerfile/yaml contains so much ugly bash code nowadays that it feels like we are back at configure/make stage.

theamk

4 months ago

Luckily, no Dockerfile is ever as bad as old "configure" scripts were.

As long I never have to worry about configure snippets that deal with Sun's CC compiler from 1990's, or with gcc-3, I will be happy.

m463

4 months ago

If you are talking about the

  RUN  foo && \
       bar && \
       baz

thing, I completely agree.

I've always wondered if there could be something like:

  LAYER
  RUN foo
  RUN bar
  RUN baz
  LAYER

to accomplish something similar, or maybe:

  RUN foo
  AND bar
  AND baz

man8alexd

4 months ago

There are also health/liveness checks, entry point code, sometimes embedded right in the Helm templates.

otabdeveloper4

4 months ago

> which get capabilities to the network stack as initial arguments, and don't have access to anything more

Systemd does this and it is widely used.

pjmlp

4 months ago

Linux is already being replaced by type 1 hypervisors on cloud server workloads.

Anyone doing deployments in managed languages, regardless of AOT compiled, or using a JIT, the underlying operating system is mostly irrelevant, with exception of some corner cases regarding performance tweeks and such.

Even if those type 1 hypervisors happen to depend on Linux kernel for their implementation, it is pretty much transparent when using something like Vercel, or Lambda.

user

4 months ago

[deleted]

nisegami

4 months ago

Why go a step further and deploy all cloud workloads using webassembly?

chatmasta

4 months ago

My headcanon is that Docker exists because Python packaging and dependency management was so bad that dotCloud had no choice but to invent some porcelain on top of Linux containers, just to provide a pleasant experience for deploying Python apps.

bane

4 months ago

That's basically correct. But the more general problem is that engineers simply lost the ability to succinctly package applications and their dependencies into simple to distribute and run packages. Somehow around the same time Java made .jar files mainstream (just zip all the crap with a manifest), the rest of the world completely forgot how to do the equivalent of statically linking in libraries and that we're all running highly scheduled multithreaded operating systems now.

The "solution" for a long time was to spin up single application Virtual Machines, which was a heavy way to solve it and reduced the overall system resources available to the application making them stupidly inefficient solutions. The modern cloud was invented during this phase, which is why one of the base primitives of all current cloud systems is the VM.

Containers both "solved" the dependency distribution problem as well as the resource allocation problem sort of at once.

procaryote

4 months ago

> engineers simply lost the ability to succinctly package applications and their dependencies into simple to distribute and run packages.

but this is what docker is

If anything, java kinda showed it doesn't have to suck, but as not all things are java, you need something more general

BSVogler

4 months ago

With the difference that with docker you are shipping the runtime to your source code as well.

arccy

4 months ago

which is great when you realize that not all software is updated at the same time.

how managing multiple java runtime versions is supposed to work is still beyond me... it's a different tool at every company, and the instructions never seem to work

oftenwrong

4 months ago

It's less complicated than you might think. A Java Development Kit (JDK) is a filesystem directory, and includes everything necessary to run a Java program. Most of the mysterious installers and version managers are managing a collection of these JDK directories in some fixed location on disk. You can download a JDK directory (tarball), and use the `java` binary within it directly.

There is also a convention of using the `JAVA_HOME` environment variable to allow tools to locate the correct JDK directory. For example, in a unix shell, add `$JAVA_HOME/bin` to your `PATH`.

qcnguy

4 months ago

Java runtimes is just: export JAVA_HOME=/path; ./app.sh

procaryote

4 months ago

But the java runtime needs to be at /path then, and it needs to stay there as long as ./app.sh needs it. And when app2.sh needs a different version you need that to be at /path2

procaryote

4 months ago

You need the runtime though

andoando

4 months ago

And even a java program may need a system wide install of ffmpeg or opencv or libgtk or VC runtime 2019 but not 2025 or some other dependency.

And sometimes you want to ship multiple services together.

In any case 'docker run x' is easier and seemingly less error prone than a single sudo apt get install

bborud

4 months ago

I would argue that the traditional way to install applications (particularly servers) on UNIX wasn’t very compatible with the needs that arose in the 2000s.

The traditional way tends to assume that there will be only one version of something installed on a system. It also assumes that when installing a package you distribute binaries, config files, data files, libraries and whatnot across lots and lots of system directories. I grew up on traditional UNIX. I’ve spent 35+ years using perhaps 15-20 different flavors of UNIX, including some really, really obscure variants. For what I did up until around 2000, this was good enough. I liked learning about new variants. And more importantly: it was familiar to me.

It was around that time I started writing software for huge collections of servers sitting in data centers on a different continent. Out of necessity I had to make my software more robust and easier to manage. It had to coexist with lots of other stuff I had no control over.

It would have to be statically linked, everything I needed had to be in one place so you could easily install and uninstall. (Eventually in all-in-one JAR files when I started writing software in Java). And I couldn’t make too many assumptions about the environment my software was running in.

UNIX could have done with a re-thinking of how you deal with software, but that never happened. I think an important reason for this is that when you ask people to re-imagine something, it becomes more complex. We just can’t help ourselves.

Look at how we reimagined managing services with systemd. Yes, now that it has matured a bit and people are getting used to it, it isn’t terrible. But it also isn’t good. No part of it is simple. No part of it is elegant. Even the command line tools are awkward. Even the naming of the command line tools fail the most basic litmus test (long prefixes that require too many keystrokes to tab-complete says a lot about how people think about usability - or don’t).

Again, systemd isn’t bad. But it certainly isn’t great.

As for blaming Python, well, blame the people who write software for _distribution_ in Python. Python isn’t a language that lends itself to writing software for distribution and the Python community isn’t the kind of community that will fix it.

Point out that it is problematic and you will be pointed to whatever mitigation that is popular at the time (to quote Queen “I've fallen in love for the first time. And this time I know it's for real”), and people will get upset with you, downvote you and call you names.

I’m too old to spend time on this so for me it is much easier to just ban Python from my projects. I’ve tried many times, I’ve been patient, and it always ends up biting me in the ass. Something more substantial has to happen before I’ll waste another minute on it.

lproven

4 months ago

> UNIX could have done with a re-thinking of how you deal with software, but that never happened.

I think it did, but the Unix world has an inherent bad case of "not invented here" syndrome, and a deep cultural reluctance to admit that other systems (OSes, languages, and more) do some things better.

NeXTstep fixed a big swath of issues (in the mid-to-late 1980s). It threw out X and replaced it with Display Postscript. It threw out some of the traditional filesystem layout and replaced it with `.app` bundles: every app in its own directory hierarchy, along with all its dependencies. Isolation and dependency packaging in one.

(NeXT realised this is important but it has to be readable and user-friendly. It replaces the traditional filesystem with something more readable. 15Y later, Nix realised the same lesson, but forgot the 2nd, so it throws out the traditional FHS and replaces it with something less readable, which needs software to manage it. The NeXT way means you can install an app with a single `cp` command or one drag-and-drop operation.)

Some of this filtered back upstream to Ritchie, Thompson and Pike, resulting in Plan 9: bin X, replace it with something simpler and filesystem-based. Virtualise the filesystem, so everything is in a container with a virtual filesystem.

But it wasn't Unixy enough so you couldn't move existing code to it. And it wasn't FOSS, and arrived at the same time as a just-barely-good-enough FOSS Unix for COTS hardware was coming: Linux on x86.

(The BSDs treated x86 as a 2nd class citizen, with grudging limited support and the traditional infighting.)

bborud

4 months ago

I can’t remember NeXTStep all that well anymore, but the way applications are handled in Darwin is a partial departure from the traditional unix way. Partial, because although you can mostly make applications live in their own directory, you still have shared, global directory structures where app developers can inflict chaos. Sometimes necessitating third party solutions for cleaning up after applications.

But people don’t use Darwin for servers to any significant degree. I should have been a bit more specific and narrowed it down to Linux and possibly some BSDs that are used for servers today.

I see the role of Docker as mostly a way to contain the “splatter” style of installing applications. Isolating the mess that is my application from the mess that is the system so I can both fire it up and then dispose of it again cleanly and without damaging my system. (As for isolation in the sense of “security”, not so much)

lproven

4 months ago

> a way to contain the “splatter” style of installing applications

Darwin is one way of looking at it, true. I just referred to the first publicly released version. NeXTstep became Mac OS X Server became OS X became macOS, iOS, iPadOS, watchOS, tvOS, etc. Same code, many generations later.

So, yes, you're right, little presence on servers, but still, the problems aren't limited to servers.

On DOS, classic MacOS, on RISC OS, on DR GEM, on AmigaOS, on OS/2, and later on, on 16-bit Windows, the way that you install an app is that you make a directory, put the app and its dependencies in it, and maybe amend the system path to include that directory.

All single-user OSes, of course, so do what you want with %PATH% or its equivalent.

Unix was a multi-user OS for minicomputers, so the assumption is that the app will be shared. So, break it up into bits, and store those component files into the OS's existing filesystem hierarchy (FSH). Binaries in `/bin`, libraries in `/lib`, config in `/etc`, logs and state in `/var`, and so on -- and you can leave $PATH alone.

Make sense in 1970. By 1980 it was on big shared departmental computers. Still made sense. By 1990 it was on single-user workstations, but they cost as much as minicomputers, so why change?

The thing is, the industry evolved underneath. Unix ended up running on a hundred million times more single-user machines (and VMs and containers) than multiuser shared hosts.

The assumptions of the machine being shared turned out to be wrong. That's the exception, not the rule.

NeXT's insight was to only keep the essential bits of the shared FSH layout, and to embed all the dependencies in a folder tree for each app -- and then to provide OS mechanisms to recognise and manipulate those directory trees as individual entities. That was the key insight.

Plan 9 virtualised the whole FSH. Clever but hard to wrap one's head around. It's all containers all the way down. No "real" FSH.

Docker virtualises it using containers. Also clever but in a cunning-engineer's-hacky-kludge kind of way, IMHO.

I think GoboLinux maybe made the smartest call. Do the NeXT thing, junk the existing hierarchy -- but make a new more-readable one, with the filesystem as the isolation mechanism, and apply it to the OS and its components as well. Then you have much less need for containers.

skydhash

4 months ago

I agree with tou that the issue is packaging. And to have developers trying to package software is the issue IMO. They will come up with the most complicated build system to handle all scenarios, and the end result will be brittle and unwieldy.

There’s also the overly restrictive dependency list, because each deps in turn is happy to break its api every 6 months.

IshKebab

4 months ago

Exactly this, but not just Python. The traditional way most Linux apps work is that they are splayed over your filesystem with hard coded references to absolute paths and they expect you to provide all of their dependencies for them.

Basically the Linux world was actively designed to apps difficult to distribute.

antod

4 months ago

It wasn't about making apps difficult to distribute at all, that's a later side effect. Originally distros were built around making a coherent unified system of package management that made it easier to manage a system due to everything being built on the same base. Back then Linux users were sysadmins and/or C programmers managing (very few) code dependencies via tarballs. With some CPAN around too.

For a sysadmin, distros like Debian were an innovative godsend for installing and patching stuff. Especially compared to the hell that was Windows server sysadmin back in the 90s.

The developer oriented language ecosystem dependency explosion was a more recent thing. When the core distros started, apps were distributed as tarballs of source code. The distros were the next step in distribution - hence the name.

IshKebab

4 months ago

Right but those things are not unrelated. Back in the day if you suggested to the average FOSS developer that maybe it should just be possible to download a zip of binaries, unzip it anywhere and run it with no extra effort (like on Windows), they would say that that is actively bad.

You should be installing it from a distro package!!

What about security updates of dependencies??

And so on. Docker basically overrules these impractical ideas.

skydhash

4 months ago

It’s still actively bad. And security updates for dependencies is easy to do when the dependencies developer is not bundling those with feature changes and actively breaking the API.

graemep

4 months ago

I would say those are good point, not impractical ideas.

You make software harder to distribute (so inconvenient for developers and distributors) but gain better security updates and lower resource usage.

IshKebab

4 months ago

The success of Docker shows that this is a minority view.

graemep

4 months ago

I was replying to a comment comparing the distribution of self-contained binaries to Linux package management. This is a much more straightforward question

Containers are a related (as the GP comment says) thing, but offer a different and varied set of tradeoffs.

Those tradeoffs also depend on what you are using containers for. Scaling by deploying large numbers of containers on a cloud providers? Applications with bundled dependencies on the same physical server? As a way of providing a uniform development environment?

IshKebab

4 months ago

> Those tradeoffs also depend on what you are using containers for. Scaling by deploying large numbers of containers on a cloud providers? Applications with bundled dependencies on the same physical server? As a way of providing a uniform development environment?

Those are all pretty much the same thing. I want to distribute programs and have them work reliably. Think about how they would work if Linux apps were portable as standard:

> Scaling by deploying large numbers of containers on a cloud providers?

You would just rsync your deployment and run it.

> Applications with bundled dependencies on the same physical server?

Just unzip each app in its own folder.

> As a way of providing a uniform development environment?

Just provide a zip with all the required development tools.

graemep

4 months ago

> Those are all pretty much the same thing. I want to distribute programs and have them work reliably.

Yes, they are very similar in someways, but the tradeoffs (compared to using containers) would be very different.

> You would just rsync your deployment and run it.

If you are scaling horizontally and not using containers you are already probably automating provisioning and maintenance of VMs, so you can just use the same tools to automate deployment. You would also be running one application per VM so you do not need to worry about portability.

> Just unzip each app in its own folder.

What is stopping people from doing this? You can use an existing system like Appimage, or write a windows like installer (Komodo used to have one). The main barrier as far as I can see is that users do not like it.

> Just provide a zip with all the required development tools.

vs a container you still have to configure it and isolation can be nice to have in a development environment.

vs installing what you need with a package manager, it would be less hassle in some cases but this is a problem that is largely solved by things like language package managers.

IshKebab

4 months ago

> What is stopping people from doing this?

Most Linux apps do not bundle their dependencies, don't provide binary downloads, and aren't portable (they use absolute paths). Some dependencies are especially awkward like glibc and Python.

It is improving with programs written in Rust and Go which tend to a) be statically linked, and b) are more modern so they are less likely to make the mistake of using absolute paths.

Incidentally this is also the reason Nix has to install everything globally in a single root-owned directory.

> The main barrier as far as I can see is that users do not like it.

I don't think so. They've never been given the option.

graemep

4 months ago

> Most Linux apps do not bundle their dependencies, don't provide binary downloads, and aren't portable (they use absolute paths).

That is because the developers choose not to, and no one else chooses to do it for them. On the other hand lots of people package applications (and libraries) for all the Linux distros out there.

> I don't think so. They've never been given the option.

The options exist. AppImage does exactly what you want. Snap and Flatpak are cross distro, have lots of apps, and are preinstalled by many major distros.

man8alexd

4 months ago

Docker was the tool for those who couldn't create a deb or rpm package.

IshKebab

4 months ago

You mean a deb and an rpm right? And multiple versions of each.

man8alexd

4 months ago

The same as we have docker images with different OS flavours now: "-debian", "-alpine", "-slim", etc.

theamk

4 months ago

deb or rpm packages are a tool for people who did not care about reproducible code and who always lived "at the edge".

man8alexd

4 months ago

So Debian, Ubuntu LTS, RHEL, distributions with multi-year release cycles are living at the edge? Ok.

theamk

4 months ago

Sure, it is not as edgy Arch or something, but unless you have your own mirror, your stuff can be broken at any time.

To be fair, they are _usually_ pretty good about that, the last big breakage I've seen was that git "security" fix which basically broke git commands as root. There is also some problems with Ubuntu LTS kernel upgrades, but docker won't save you here, you need to use something like AMI images.

man8alexd

4 months ago

The irony is that the majority of docker images are built from the same packages and break all the same. But in your eyes, `apt install package` is bad but `RUN apt install package` inside a `Dockerfile` somehow makes it reproducible. I suspect you are confusing "having an artifact" with "reproducible builds" [1]. Having a docker image as an artifact is the same as having tar/zip with your application and its dependencies or having a filesystem snapshot or having VM image like AMI/OVM/VMDK. You can even have a deb file with all your dependencies vendored in.

[1]: https://en.wikipedia.org/wiki/Reproducible_builds

pjmlp

4 months ago

If only handling Dockerfiles were as easy.

man8alexd

4 months ago

"Dockerfile is simple", they promised. Now look at the CNCF landscape.

pjmlp

4 months ago

I stopped listening to cloud related podcasts, because it started to feel like it was just PR for whatever product the guest came up with.

theamk

4 months ago

why would you do this?

If you are considering bare-metal servers with deb files, you compare them to bare-metal servers with docker containers. And in the latter case, you immediately get all the compatibility, reproducibility, ease of deployment, ease of testing, etc... and there is no need for a single YAML file.

man8alexd

4 months ago

If you need a reliable deployment without catching 500 errors from Docker Hub, then you need a local registry. If you need a secure system without accumulating tons of CVEs in your base images, then you need to rebuild your images regularly, so you need a build pipeline. To reliably automate image updates, you need an orchestrator or switch to podman with `podman auto-update` because Docker can't replace a container with a new image in place. To keep your service running, you again need an orchestrator because Docker somehow occasionally fails to start containers even with --restart=always. If you need dependencies between services, you need at least Docker Compose and YAML or a full orchestrator, or wrap each service in a systemd unit and switch all restart policies to systemd. And you need a log collection service because the default Docker driver sucks and blocks on log writes or drops messages otherwise. This is just the minimum for production use.

theamk

4 months ago

Yes, running server farms in production is complex, and docker won't magically solve _every one_ of your problems. But it's not like using deb files will solve them either - you need most of the same components either way.

> If you need a reliable deployment without catching 500 errors from Docker Hub, then you need a local registry.

Yes, and with debs you need local apt repository

> If you need a secure system without accumulating tons of CVEs in your base images, then you need to rebuild your images regularly, so you need a build pipeline.

presumably you were building your deb with build pipeline as well.. so the only real change is that pipeline now has to has timer as well, not just "on demand"

> To reliably automate image updates, you need an orchestrator or switch to podman with `podman auto-update` because Docker can't replace a container with a new image in place.

With debs you only have automatic-updates, which is not sufficient for deployments. So either way, you need _some_ system to deploy the images and monitor the servers.

> To keep your service running, you again need an orchestrator because Docker somehow occasionally fails to start containers even with --restart=always. If you need dependencies between services, you need at least Docker Compose and YAML or a full orchestrator, or wrap each service in a systemd unit and switch all restart policies to systemd.

deb files have the same problems, but here dockerfiles have an actual advantage: if you run supervisor _inside_ docker, then you can actually debug this locally on your machine!

No more "we use fancy systemd / ansible setups for prod, but on dev machines here are some junky shell scripts" - you can poke the things locally.

> And you need a log collection service because the default Docker driver sucks and blocks on log writes or drops messages otherwise. This is just the minimum for production use.

What about deb files? I remember bad old pre-systemd days where each app had to do its own logs, as well as handle rotations - or log directly to third-party collection server. If that's your cup of tea, you can totally do this in docker world as well, no changes for you here!

With systemd's arrival, the logs actually got much better, so it's feasible to use systemd's logs. But here is a great news! docker has "journald" driver, so it can send its logs to systemd as well... So there is feature parity there as well.

The key point is there are all sorts of so-called "best practices" and new microservice-y way of doing things, but they are all optional. If you don't like them, you are totally free to use traditional methods with Docker! You still get to keep your automation, but you no longer have to worry about your entire infra breaking, with no easy revert button, because your upstream released broken package.

man8alexd

4 months ago

You switched from

> ease of deployment

> running server farms in production is complex

You just confirmed my initial point:

> "Dockerfile is simple", they promised. Now look at the CNCF landscape.

> with debs you need local apt repository

No, you don't need an apt repository. To install a deb file, you need to scp/curl the file and run `dpkg`.

>presumably you were building your deb with build pipeline as well

You don't need to rebuild the app package every time there is a new CVE in a dependency. Security updates for dependencies are applied automatically without any pipeline, you just enable `unattended-updates`, which is present out of the box.

> With debs you only have automatic-updates, which is not sufficient for deployments.

Again, you only need to run `dpkg` to update your app. preinst, postinst scripts and systemd unit configuration included in a deb package should handle everything.

> deb files have the same problems

No, they don't. deb files intended to run as a service have systemd configuration included and every major system now runs systemd.

> but here dockerfiles have an actual advantage: if you run supervisor _inside_ docker, then you can actually debug this locally on your machine!

Running a supervisor inside a container is an anti-pattern. It just masks errors from the orchestrator or external supervisor. Also usually messes with logs.

> No more "we use fancy systemd / ansible setups for prod, but on dev machines here are some junky shell scripts" - you can poke the things locally.

systemd/ansible are not fancy but basic beginner-level tools to manage small-scale infrastructure. That tendency to avoid appropriate but unfamiliar tools and retreat into more comfortable spaces reminds me of the old joke about a drunk guy searching for keys under a lamp post.

> What about deb files? I remember bad old pre-systemd days where each app had to do its own logs, as well as handle rotations - or log directly to third-party collection server.

Everything was out of the box - syslog daemon, syslog function in libc, preconfigured logrotate and logrotate configs included in packages.

There are special people who write their own logs bypassing syslog and they are still with us and they still write logs into files inside containers.

There are already enough rants about journald, so I'll skip that.

> but you no longer have to worry about your entire infra breaking, with no easy revert button, because your upstream released broken package.

Normally, updates are applied in staging/canary environments and tested. If upstream breaks a package - you pin the package to a working version, report the bug to the upstream or fix it locally and live happily ever after.

dabockster

4 months ago

> Basically the Linux world was actively designed to apps difficult to distribute.

It has "too many experts", meaning that everyone has too much decision making power to force their own tiny variations into existing tools. So you end up needing 5+ different Python versions spread all over the file system just to run basic programs.

rwmj

4 months ago

It was more like, library writers forgot how to provide stable APIs for their software, and applications decided they just wanted to bundle all the dependencies they needed together and damn the consequences on the rest of the system. Hence we got static linked binaries and then containers.

arccy

4 months ago

even if you have a stable interface... the user might not want to install it and then forget to remove it down the line

frumplestlatz

4 months ago

Pretty much this; systems with coherent isolated dependency management, like Java, never required OS-level container solutions.

They did have what you could call userspace container management via application servers, though.

drowsspa

4 months ago

NodeJS, Ruby, etc also have this problem, as does Go with CGO. So the problem is the binary dependencies with C/C++ code and make, configure, autotools, etc... The whole C/C++ compilation story is such a mess that almost 5 decades ago inventing containers was pretty much the only sane way of tackling it.

Java at least uses binary dependencies very rarely, and they usually have the decency of bundling the compiled dependencies... But it seems Java and Go just saw the writing on the wall and mostly just reimplement everything. I did have problems with the Snappy compression in the Kafka libraries, though, for instance .

skydhash

4 months ago

The issue is with cross platform package management without proper hooks for the platform themselves. That may be ok if the library is pure, but as soon as you have bindings to another ecosystem (C/C++ in most cases), then it should be user/configurable instead of the provider doing the configuration with post installs scripts and other hacky stuff.

If you look at most projects in the C world, they only provide the list of dependencies and some build config Makefile/Meson/Cmake/... But the latter is more of a sample and if your platform is not common or differs from the developer, you have the option to modify it (which is what most distros and port systems do).

But good luck doing that with the sprawling tree of modern packages managers. Where there's multiple copies of the same libraries inside the same project just because.

theamk

4 months ago

Funnily enough, one of the first containers I did on my current job was to package a legacy Java app.

It was pretty old, and required a very specific version of java, not available on modern systems. Plus some config files in global locations.

Packaging it in the docker container made it so much easier to use.

p2detar

4 months ago

That makes sense, but still - there is nothing OS-specific here, like system lib or even a database, it's just the JRE version that you needed to package into the container, or am I missing something?

antonvs

4 months ago

I don't agree with this. Java systems were one of the earliest beneficiaries of container-based systems, which essentially obsoleted those ridiculously over-complicated, and language-specific, application servers that you mentioned.

lmm

4 months ago

Java users largely didn't bother with containers IME, largely for the same reasons that most Java users didn't bother with application servers. Those who did want that functionality already had it available, making the move from an existing Java application server to Docker-style containers a minor upgrade at best.

antonvs

4 months ago

This is just a testament to how widely Java is used, and in how many different ways. Sounds like you're more focused on "Core Java" applications, or something like that. Every company I've been with since the late 90s was using application servers of some kind until Docker came along. And all of the more recent ones switched to containers and ditched the application servers.

The switch was often much more than a minor upgrade, because it often made splitting up monoliths possible in ways that the Java ecosystem itself didn't have good support for.

ruszki

4 months ago

Tomcat and Jetty are application servers which are in almost every Spring application. There are such application servers which you mentioned, like Wildfly, but they are not obsolete as a whole.

antonvs

4 months ago

Tomcat and Jetty are not application servers according to the Jakarta EE definition. They're just servlet containers.

The reason Spring includes those libraries is partly historical - Spring is old, and dates from the applications server days. Newer frameworks like Micronaut and Quarkus use more focused and performant libraries like Netty, Vert.x, and Undertow instead.

pjmlp

4 months ago

Not really, and apparently there is enough value to now having startups replicating application servers by running WebAssembly docker containers as Kubernetes pods.

antonvs

4 months ago

What are you thinking of specifically? Because using WASM doesn't sound like "replicating application servers", but rather like an attempt to address things like the startup speed of typical large Java apps.

Unless you just mean that using Kubernetes at all is replicating application servers, which was my point. Kubernetes makes language-specific application servers like Wildfly/JBoss or Websphere obsolete, and is much more powerful, generic, and an improvement in just about every respect.

pjmlp

4 months ago

I rather deal with Websphere 5 than Kubernetes, that version number is on purpose, anyone that was there will get it.

As for the question I mean the startups trying to sell the idea to use WebAssembly based pods as the next big idea.

globular-toast

4 months ago

Pyinstaller predates Docker. It's not about any individual language not being able to do packaging, it's about having a uniform interface for running applications in any language/architecture. That's why platforms like K8s don't have to know a thing about Python or anything else and they automatically support any future languages too.

ecnahc515

4 months ago

Sure they definitely were using Docker for their own applications, but also dotCloud was itself a PaaS, so they were trying to compete with Heroku and similar offerings, which had buildpacks.

The problem is/was that buildpacks aren't as flexible and only work if the buildpack exists for your language/runtime/stack.

rdsubhas

4 months ago

I used cgroups, lxc, chroots, self-extracting executables. I built rugged, portable applications for UNICEF laptops and camps before docker was a thing.

And I think this whole point about "virtualization", "security", making the most use of hardware, reducing costs, and so on, while true, it's an "Enterprise pitch" targeted at heads of tech and security. Nice side effects, but I couldn't care less.

There are real, fundamental benefits to containers for a solo developer running a solo app on a solo server.

Why? My application needs 2 or 3 other folders to write or read files into, maybe 2 or 3 other runtime executables (jvm, node, convert, think of the dozens of OSS CLI tools, not compile-time libraries), maybe apt-get install or download a few other dependencies.

Now I, as a indie developer, can "mkdir" a few files from a shell script. But that "mkdir" will work the first time. It will fail the second time saying "directory already exists". I can "apt-get install" a few things, but upgrading and versioning is a different story altogether. It's a matter of time before I realize I need atleast some barebones ansible or state management. I can tell you many times how I've reinvented "smallish" ansible in shell scripts before docker.

Now if I'm in an enterprise, I need to communicate this entire State of my app – to the sysadmin teams. Forget security and virtualization and all that. I need to explain every single part of the state, versions of java and tomcat, the directories, and all those are moving targets.

Containers reduce state management. A LOT. I can always "mkdir". I can always "apt-get install". It's an ephemeral image. I don't need to write half-broken shell scripts or use ansible or create mini-shell-ansible.

If you use a Dockerfile with docker-compose, you've solved 95% of state management. The only 5% left is to docker-compose the right source.

Skip the enterprisey parts. A normal field engineer or solo developer, like me, who's deploying a service on the field, even on my raspberry pi, would still use containers. It boils down to one word: "State management" which most people completely underestimate as "scripting". Containers grant a large control on state management to me, the developer, and simplify it by making it ephemeral. That's a big thing.

bunderbunder

4 months ago

Years ago, this is exactly how I got my coworkers interested in containers. I never pushed for any changes to how we do things in production. All I did was start using containers to manage run-time environment on my workstation for development and testing purposes. And then my colleagues started to see how much less time I spent fussing with it compared to our more typical VM-based way of managing run-time environment. Soon enough people started asking me to help them get set up the same way, and eventually we containerized our CI pipeline too. But we never changed what was happening in production because Ops was perfectly happy with their VMs+Ansible setup and nobody had a reason to mess with it that was more cogent than "rah rah containers."

Fast forward to now, though, and I feel like the benefit of containers for development has largely been undone with the adoption of Devcontainers. Because, at least from my perspective, the real value of containers for development was looser coupling between the run-time environment for the application you do your typing in, and the run-time environment where you do your testing. And Devcontainers are designed to make those two tightly coupled again.

sh34r

4 months ago

If you know your way around the Docker CLI, you can mount your workspace in a new container environment and run it however which way you want. You can attach VSCode to arbitrary containers. You can find the commands used to build the dev container image and run it, either in the logs or with docker inspect.

There’s no coupling being forced by devcontainers. It’s just a useful abstraction layer, versus doing it all manually. There is some magic happening under the hood where it takes your specified image or dockerfile and adds a few extra layers in there, but you can do that all yourself if you wanted to.

I will say, if you stray too far off the happy path with devcontainers, it will drive you insane, and you’ll be better off just doing it yourself, like most things that originated from MSFT. But those edge cases are pretty rare. 99% of workflows will be very happily supported with relatively minimal declarative json configuration.

SV_BubbleTime

4 months ago

Ok, but I love my devcontainer. It’s not like I can go back. I can’t install dozens of environment programs and variables and compilers and niche applications per machine.

The devcontainer, also does not preclude the simple testing container.

bunderbunder

4 months ago

You wouldn't have to. You just set up your containerization scheme so that it doesn't rely on the editor extension.

jsmith45

4 months ago

I'm confused by your perspective.

The simplest (and arguably best) usage for a devcontainer is simply to set up a working development environment (i.e. to have the correct version of the compiler, linter, formatters, headers, static libraries, etc installed). Yes, you can do this via non-integrated container builds, but then you usually need to have your editor connect to such a container, so the language server can access all of that, plus when doing this manually you need to handle mapping in your source code.

Now, you probably want to have your main Dockerfile set up most of the same stuff for its build stage, although normally you want the output stage to only have the runtime stuff. For interpreted languages the output stage is usually similar to the "build" stage, but out to omit linters or other pure development time tooling.

If you want to avoid the overlap between your devcontainer and your main Dockerfile's build stage? Good idea! Just specify a stage in your main Dockerfile where you have all development time tooling installed, but which comes before you copy your code in. Then in your .devcontainer.json file, set the `build.dockerfile` property to point at your Dockerfile, and the `build.target` to specify that target stage. (If you need some customizations only for dev container, your docker file can have a tiny otherwise unused stage that derives from the previous one, with just those changes.)

Under this approach, the devcontainer is supposed to be suitable for basic development tasks (e.g. compiling, linting, running automated tests that don't need external services.), and any other non-containerized testing you would otherwise do. For your containerized testing, you want the `ghcr.io/devcontainers/features/docker-outside-of-docker:1` feature added, at which point you can just use just run `docker compose` from the editor terminal, exactly like you would if not using dev containers at all.

alexjplant

4 months ago

When it comes to software it is state, not money, that is the root of all evil. Anything that I can do to constrain state mutation is worthwhile for preventing bugs. Containers are great for this, particularly if you've ever had to deal with "sysadmins" who SSH (or, more frequently in this instance, RDP) into individual application servers and manually update applications instead of using proper automation.

heresie-dabord

4 months ago

> Containers reduce state management. A LOT.

And if you use Podman to build/run containers without root privilege, you reduce state management and avoid unwanted privilege escalation.

ahartmetz

4 months ago

(Note about mkdir: mkdir -p succeeds if the directory already exists)

rdsubhas

4 months ago

yes, and that my friend is how mini-shell-ansible usually begins :)

PyWoody

4 months ago

(will also make intermediate directories as needed (super handy))

ahartmetz

4 months ago

Yeah - it's the "make sure that all these directories exist" command. If they already exist, it's just a trivial success case.

tracker1

4 months ago

Maybe because a single app on a single server rarely stays a single app. And while the landscape has generally improved, and imo is better under Linux than windows... there's nothing worse than trying to get a handful of .Net and Java applications installed and working in concert on a Windowss server with multiple framework versions for the differing apps. Let alone harder dependencies.

Docker for Windows Containers itself was a horrible exercise in frustration just because of it's own dependency issues, and I thought it was a bad idea from the start because of it, and it dilluted Docker for Linux IMO.

Docker/Containers and Compose are pretty great to work with, assuming your application has dependencies like Databases, Cache, etc. Not to mention options such as separating TLS certificate setup and termination from the application server(s) or scaling to larger orchestration options... though I haven't gone past compose for home-lab or on my own server(s).

I can also better position data storage and application configurations for backup/restore by using containers and volumes next to the compose/config. I've literally been able to migrate apps between servers with compose-down, rsync, dns change, compose up -d on the new server. In general, it's been pretty great all around.

user

4 months ago

[deleted]

NotPractical

4 months ago

> I can always "apt-get install".

I don't think you can reliably fix a specific version of a package though, meaning things will still break here the same way they did before containers.

invalidator

4 months ago

If you need a specific version of one package: apt-get install hello=2.10-3

If you want to lock down versions on a system, Apt Pinning: https://wiki.debian.org/AptConfiguration#Using_pinning

If you have a herd of systems - prod environments, VMs for CI, lots of dev workstations, and especially if your product is an appliance VM: you might want to run your own apt mirror, creating known-good snapshots of your packages. I use https://www.aptly.info/

Containers can also be a great solution though.

chriswarbo

4 months ago

That's what the apt sources are for; point them to a snapshot of known-good packages (e.g. S3, AptOnCD, whatever), and disable everything else.

I remember doing such things (via .deb packages, rather than random scripts) a couple of decades ago.

jiggunjer

4 months ago

That's two words. How about "deterministic".

abound

4 months ago

Perhaps ironically, most Docker builds aren't deterministic. Run `docker build`, clear the cache, run it again five minutes later and you might not have a bit-compatible image because many images don't pin their base and pull from live updating package repositories.

You can make a Docker image deterministic/hermetic, but it's usually a lot more work.

theamk

4 months ago

The build process is non-determenstic, sure.

But the images themselves are, and that is a great improvement on pre-docker state of the art. Before docker, if you wanted to run the app with all of the dependencies as of last month, you had _no way to know_ at all. With docker, you pull that old image and you get exactly the same version of every dependency (except kernel) with practically zero effort.

Sure, it's annoying that instead of few-kB-long lockfile you are now having hundred of MBs of docker images. But all the better alternatives are significantly harder.

Hasz

4 months ago

some steps, e.g ap-get, are not deterministic and practically, it would be painful to make them so (usually controlling updates with an external mirror, ignoring phased upgrades, bunch of other misc stuff).

You then start looking at immutable OSes, then get to something like NixOS.

man8alexd

4 months ago

Have you tried building rpm/deb packages?

theamk

4 months ago

We've tried this and it was a major PITA.

Something trivial - like "hey, that function is failing... was it failing with last week's version as well?" - is very hard to arrange if you have any non-trivial dependencies. You have to build some homebrew lockfile mechanism (ugly!) and then you discover that most open-source mirrors don't keep old versions for that long, and so now you have to set up mirror as well... And then there is dependency resolution problems as you try to downgrade stuff...

And then at some point someone gets a great idea: "hey, instead of trying to get dpkg to do things it was not designed for, why don't we snapshot entire filesystem" - and then the docker is born again.

wucke13

4 months ago

My take: containers forced devepopers to declare various aspects of the application in a standardized, opinioated way:

- Persistant state? Must declare a volume. - IO with external services? Must declare the ports (and maybe addresses). - Configurable parameters? Must declare some env variables. - Trasitive dependecies? Must declare them, but using a mechanism of your choosing (e.g. via the package manager of your base image distro).

Separation of state (as in persistency) and application (as in binaries, assets) makes updates easy. Backups also.

Having most all IO visible and explicit simplifies operation and integration.

And a single, (too?!?) simple config mechanism increases reusability, by enabling e.g. lightweight tailoring of generic application service containers (such as mariadb).

Together this bunch of forced, yet leaky abstractions is just good enough to foster immense reuse & composability on to a plethora of applications, all while allowing to treat them almost entirely like blackboxes. IMHO that is why OCI containers became this big, compared to other virtualization and (application-) cuntainer technologies.

lotyrin

4 months ago

Yeah. I remember when "okay, cool, how do we deploy, run, maintain the thing we (our leadership) just bought from you" was several weeks of meetings with the vendor and discussing awful "human-readable" documentation (if it existed).

Those meetings still happen sometimes, but "cloud" *aaS and containers really put a dent in that sort of thing.

ranger207

4 months ago

Yep, this is why. Containers are a way to package more of the OS environment than you can do otherwise

alexey-salmin

4 months ago

The article seems to assume that containers appeared to solve the software distribution problem and then somehow got repurposed into virtualization, isolation and management of production services. I think this view is very far from truth.

The virtualization/isolation aspect came first, the SWSoft Virtuozzo was doing that quite well in early 2000s. They even had [some] IO isolation which I think took around a decade to support elsewhere. Then gradually pieces of Virtuozzo/OpenVZ reached the mainline in a form of cgroups/LXC and the whole thing slowly brewed for a while until the Docker added the two missing pieces: the fast image rebuilds and the out-of-the-box user experience.

Docker of course was the revolution, but by then sufficiently advanced companies have been already using containers for isolation for a full decade.

DanielHB

4 months ago

I remember hearing about all the support from Intel for "0-cost" virtualization at the hardware level way before I heard about containers. From what I remember it was mostly to speed up virtual machines (VMWare stuff). It was a massive market differentiator for Intel in the server space.

I vaguely remember having to turn on some features in VirtualBox at the time to speed up my VMs, it was a massive uplift in performance if you had a CPU that supported it.

bayindirh

4 months ago

I believe BSD Jails created a "Why don't we have this under Linux" feeling as well.

rmoriz

4 months ago

To my knowledge the very first appearance was the "ctx" kernel patches project by linux-vserver.org by Herbert Pötzl et al. https://web.archive.org/web/20250307173707/http://linux-vser... (sadly, the domain seems to be expired just recently)

I used it heavily back in the days, long before kernel namespaces went upstream and docker became a thing.

bane

4 months ago

Containers happened because nobody can be bothered to build an entire application into a single distributable executable anymore - heck even the tooling barely exists anymore. But instead of solving problems like dependency management and linking, today's engineers simply build piles of abstraction into the problem space until the thing you want to do more than anything (i.e. execute an application) becomes a single call.

Of course you now need to build and maintain those abstract towers, so more jobs for everybody!

theamk

4 months ago

by "today's engineers", do you mean "2001 engineers"?

That's when sbuild[0], a tool to build deb packages in containers, was created. It was pretty innovative in that it started from clean container every time, and thus would build deb in a reliable way even if user's machine had some funky dependencies installed.

(Note that was schroot containers, docker did not exist back then)

[0] https://metadata.ftp-master.debian.org/changelogs//main/s/sb...

gorfian_robot

4 months ago

this is what happens when hw is too cheap

zbentley

4 months ago

You sure? Which hardware?

Put another way: stuff like Electron makes a pretty good case for the "cheap hardware leads to shitty software quality/distribution mechanisms" claim. But does Docker? Containers aren't generally any more expensive in hardware other than disk-space to run than any other app. And disk space was always (at least since the advent of the discrete HDD) one of the cheapest parts of a computer to scale up.

wmf

4 months ago

If you go back to the Sun days, you literally could not afford enough servers to run one app per server so instead you'd hire sysadmins to figure out how to run Sendmail and Oracle and whatever on one server without conflicting. Then x86/Linux 1Us came out and people started just running one app per server ("server sprawl") which was easy because there was nothing to conflict. This later became VM sprawl and containers were an optimization on that.

zbentley

4 months ago

I'm not getting it, sorry.

We had to have multiple apps per server before, and now we have containers which offer a convenient way to have multiple apps per server? That seems like the same thing. Could you explain more re: what you meant?

alexey-salmin

4 months ago

That's why Docker and layers exist. Containers predate them by more than a decade.

Joker_vD

4 months ago

> What people do with Docker is spin up a database or another service to develop or test against.

Yep. Being able to run

    docker run --rm --publish=127.0.0.1:27017:27017 'mongo:3.6.8'

    docker run --rm --publish=127.0.0.1:27017:27017 'mongo:5.0'

and then get rid of it with simple Ctrl-C is a godsend.

DanielHB

4 months ago

I remember installing Oracle in my desktop computer to be able to do my programming classes...

If the installation crashed (which was quite common, happened to me once) it was easier to just format the computer completely and start again. Effed up the database? Probably easier to format everything and install from scratch

tacker2000

4 months ago

The author suggests that Docker doesnt help development and that devs just spin up databases, but I have to disagree with that and Im pretty sure i am not the only one.

All my projects (primarily web apps) are using docker compose which configures multiple containers (php/python/node runtime, nginx server, database, scheduler, etc) and run as a dev environment on my machine. The source code is mounted as a volume. This same compose file is then also used for the deployment to the production server (with minor changes that remove debug settings for example).

This approach has worked well for me as a solo dev creating web apps for my clients.

It has also enabled extreme flexibility in the stacks that I use, I can switch dev environments easily and quickly.

boomskats

4 months ago

I agree with you 100%, though arguably what you could be describing is how docker changed your deployment workflow, not your development workflow (although with devcontainers that line is blurry, as you say).

I guess it's worth keeping in mind that Justin only quit Docker a few months ago, and his long tenure as CTO there will have (obviously) informed the majority of the opinions in the article. I think the deployment over development spin and some of the other takes there more closely reflect the conversations he had with large corp paying customers at the exec level than the workflows of solo devs that switch dev environments much more frequently than most etc.

tacker2000

4 months ago

It goes hand in hand, it changed both the dev and he deploy workflow.

Before Docker I was using Xampp and FTP’ing source code to the prod server.

man8alexd

4 months ago

Alternatives to FTP'ing, as deploying from a source code repository and tools like capistrano, heroku, chef deploy resource, puppet vcsrepo existed long before Docker.

lotyrin

4 months ago

Not in the hearts and minds of most individual SMB hackers, though (except Heroku to some degree in some circles). Most of the rest were only for Enterprisey teams. Many, many folks graduated straight from FTPing PHP to the customer's shared host to running docker compose on the customer's VPS or cloud instance. The least-common-denominator clients that hire these folks graduated from shared hosting (where PHP/MySQL was the only viable option, really) to cloud instances (where suddenly all the other languages and backing services were viable). The GoDaddy->AWS LightSail "our website budget is 5 dollars" pipeline.

rshnotsecure

4 months ago

Fascinating documentary on Kubernetes for those who have 50 minutes. Gives more background to the "Container Wars". The filmmakers also have documentaries on the history of Python, Argo, etc.

Some highlights:

- How far behind Kubernetes was at the time of launch. Docker Swarm was significantly more simple to use, and Apache Mesos scheduler could already handle 10,000 nodes (and was being used by Netflix).

- RedHat's early contributions were key, despite having the semi-competing project of OpenShift.

- The decision to Open Source K8S came down to one meeting brief meeting at Google. Many of the senior engineers attended remotely from Seattle, not bothering to fly out because they thought their request to go OS was going to get shutdown.

- Brief part at the end where Kelsey Hightower talks about what he thinks might come after Kubernetes. He mentions, and I thought this was very interesting ... Serverless making a return. It really seemed like Serverless would be "the thing" in 2016-2017 but containers were too powerful. Maybe now with KNative or some future fusing of Container Orchestration + K8S?

[1] - https://youtu.be/BE77h7dmoQU

btreecat

4 months ago

I feel that's going to be more interesting than this video. The speaker is very unpracticed.

jmclnx

4 months ago

FreeBSD jails years ago based upon a user request.

>hosting provider's ... desire to establish a clean, clear-cut separation between their own services and those of their customers

https://en.wikipedia.org/wiki/FreeBSD_jail

My guess Linux started getting requests rom various orgs for a while, so in true Linux fashion, we got a a few different container type methods years later.

I still think Jails are the best of the bunch, but they can be a bit hard to setup. Once setup, Jails works great.

So here we are :)

lproven

4 months ago

Money.

Containers were around for a decade or more on FreeBSD and Solaris. They let you divvy up expensive big Unix iron.

Same as VMs were around on mainframes from the late 1960s and expensive Unix RISC servers from the late 1980s.

Linux didn't need it because it was cheap. So Linux replaced that older more expensive stuff, on cheap COTS hardware: x86.

Once everything was commoditised and cost-cut, suddenly, efficiency started to matter, so VMware thrived and was copied and VMs were everywhere.

Then the low usage and inefficiency of resource sharing of VMs made them look expensive, so they started to get displaced by the cheaper easier tech of containers, making "it works on my machine, so let's ship my machine" scale to production.

pjmlp

4 months ago

Even earlier, I was introduced to the concept of containers in HP-UX Vault back in 1999, before FreeBSD and Solaris got them.

Unfortunely HPe has removed most of HP-UX documentation out of Internet so it is hard to point it out.

However there are still some digital traces,

HP-UX 10.24 release,

> This is a Virtual Vault release of HP-UX, providing enhanced security features. Virtual Vault is a compartmentalised operating system in which each file is assigned a compartment and processes only have access to files in the appropriate compartment and unlike most other UNIX systems the superuser (or root) does not have complete access to the system without following correct procedures.

https://en.wikipedia.org/wiki/HP-UX

A forum discussion,

https://community.hpe.com/t5/operating-system-hp-ux/hp-virtu...

Now it would be great to get back those HP-UX Vault PDFs.

lproven

4 months ago

Fair enough. I never worked on HP/UX. AIX, Solaris, Xenix, and VMS, OS/400, and RSTS, but never HP.

I wrote about containers being the next big thing in 2011:

https://www.theregister.com/Print/2011/07/18/brief_history_o...

I credit FreeBSD and Solaris and AIX, but I think I didn't know about HP/UX.

pjmlp

4 months ago

Yeah, I got to know them by luck, years later at Nokia Networks, which was an heavy HP-UX user before transitioning into Red-Hat Linux, Vaults wasn't also not that well known.

Thanks for the article link.

hedgehog

4 months ago

Containers happened because running an ad network and search engine means serving a lot of traffic for as little cost as possible, and part of keeping the cost down is bin packing workloads onto homogeneous hardware as efficiently as possible.

https://en.wikipedia.org/wiki/Cgroups

(arguably FreeBSD jails and various mainframe operating systems preceded Linux containers but not by that name)

cbdumas

4 months ago

What does the 'ad network and search engine' have to do with it? Wouldn't any organization who serves lots of traffic have the same cost cutting goals you mentioned?

wmf

4 months ago

It's an oblique way to say that Linux cgroups and namespaces were developed by Google.

hedgehog

4 months ago

Yes, to expand: Both search and ads mean serving immense amounts of traffic and users while earning tiny amounts of revenue per unit of each. The dominant mid-90s model of buying racks of Sun and NetApp gear, writing big checks to Oracle, etc, would have been too expensive for Google. Instead they made a big investment in Linux running on large quantities of commodity x86 PC hardware, and building software on top of that to get the most out of it. That means things like combining workloads with different profiles onto the same servers, and cgroups kind of falls out of that.

Other companies like Yahoo, Whatsapp, Netflix also followed interesting patterns of using strong understanding of how to be efficient on cheap hardware. Notably those three all were FreeBSD users at least in their early days.

dmoy

4 months ago

Yup and just to add timelines - Google Borg and containerization was what... 2003-2005? Docker was 2011-2013?

boomskats

4 months ago

*cgroups v1. We have Facebook to thank for v2, right?

dmoy

4 months ago

It is the unfortunate reality that we tend to only remember the creator(s) of the first version (if anyone). Not just cgroups, but a lot of tech or protocols.

Anyways digging it up, looks like the primary author was at Facebook for a year before cgroupsv2, redhat for three years before that, and Google before that. So... I don't know haha you'd have to ask him.

dilyevsky

4 months ago

Linux CGroups specifically were started at Google because their cluster management system Borg (or maybe it was still Babysitter at the time) needed a way to do resource tracking and admission control. Here's a comment by one of original devs: https://news.ycombinator.com/item?id=25017753

deviation

4 months ago

This is also somewhat highlighted in Google's paper "Borg, Omega, and Kubernetes" which they published in 2016.

https://static.googleusercontent.com/media/research.google.c...

figassis

4 months ago

> I was always surprised someone didn't invent a tool for ftping to your container and updating the PHP

We thought of it, and were thankful that it was not obvious to our bosses, because lord forbid they would make it standard process and we would be right back where we started, with long lived images and filesystem changes, and hacks, and managing containers like pets.

daxfohl

4 months ago

> Application composition from open source components became the dominant way of constructing applications over the last decade.

I'm just as interested in why this ^ happened. I imagine it's pretty unique to software? I don't hear of car companies publishing component designs free for competitors to use, or pharmaceuticals freely waiving the IP in their patents or processes. Certainly not as "the dominant way of" doing business.

I wonder if LLM coding assistants had come about earlier, whether this would have been as prevalent. Companies might have been more inclined to create more of their own tooling from scratch since LLMs make it cheap (in theory). Individuals might have been less inclined to work on open source as hobbies because LLMs make it less personal. Companies might be less inclined to adopt open-source LLM-managed libraries because it's too chaotic.

travisjungroth

4 months ago

I think open source software took off because it’s more standalone than the other things you listed and this makes the rewards much higher.

If I write some code, it needs a computer and environment to run. If I’m writing for what’s popular, that’s pretty much a given. In short, for code the design is the product.

If I design a pharmaceutical, someone still has to make it. Same for car parts. This effort is actually greater than the effort of design. If you include regulation, it’s way higher.

So, this great feedback loop of creation-impact-collaboration never forms. The loop would be too big and involve too much other stuff.

The closest thing isn’t actually manufacturing, it’s more like writing and music. People have been reusing each other’s stuff forever in those spaces.

spullara

4 months ago

Because dependencies on Unix are terrible for some languages that assume things are installed globally.

aPoCoMiLogin

4 months ago

it happened because the story of dependencies (system & application) was terrible. the ability to run the app on different distribution/kernel/compiler/etc was hard. there were different solutions like vagrant, but they were heavy and the DX wasn't there

lisbbb

4 months ago

I loved the assertion that AI ate up all the budget and that K8s is now "boring" technology. That's fine because it was getting pretty annoying with all the clone competitors for practically everything that were popping up every month!

Do you use K8s? No! That's old! I use Thrumba! It's just a clone of K8s by some startup because people figured out that the easiest way to make gobs of money is/was to build platform products and then get people to use them.

guigar

4 months ago

I love this sentence about DevOps "Somehow it seems easier for people to relate to technology than culture, and the technology started working against the culture."

draw_down

4 months ago

[dead]

cyberax

4 months ago

The article is just wrong. Before Docker, there was OpenVZ and Virtuozzo. They were used to provide cheaper "dedicatd machine" hosting back around 2005.

Then the technology from OpenVZ slowly made its way into the mainline Linux, in the form of cgroups and namespaces. LWN called it a "container puzzle", with tens of moving pieces. And it was largely finished by early 2010-s.

I built my own container system in 2012 that used cgroups to oversubscribe the RAM, with simple chroot-based file namespaces for isolation. We even used XFS projects (raise your hand if you know what this is!) for the disk quota management. I remember that I had to use systemtap to patch the kernel to be able to find out which process died as a result of the OOM killer, there were no standard ways to do that.

We sold it as a part of our biotech startup to Illumina. Then we sold it again to Amazon as a part of another startup :)

The genius of Docker was the layered overlayfs-based image building. This one simple innovation made it possible to build images in a constructive way, without having to waste half an hour for each minor change. I was floored with its simplicity and power when I first saw it.

man8alexd

4 months ago

OpenVZ didn't use cgroups but its own resource limits, which sucked.

cyberax

4 months ago

Well, yes. "Beancounters" and it also was not nestable. That's why it took quite a bit of time to get it all upstreamed.

man8alexd

4 months ago

Unless my memory fails me, OpenVZ was never upstreamed.

BandButcher

4 months ago

"The compute we are wasting is at least 10x cheaper, but we have automation to waste it at scale now."

So much this. keep it simple, stupid (muah)

0dayz

4 months ago

What blew my mind and convinced me to only use immutable distros is the immutability of it.

For instance I could create my own login screen for an web service without having to worry about the package manager overriding my code, because I inject it into the container, which is already updated.

I can also forcefully reroute much easier ports or network connections the way I want it.

superkuh

4 months ago

Pretty simple. Modern development cycles for the libs that software depends on now happen significantly faster that OS/Distros update. This is future shock. The symptoms of this disease express as containers.

Do your part. Don't code on the bleeding edge. Keep things stable for users.

lloeki

4 months ago

> There was one key innovation, which was Docker Hub

Vagrant has had that for VMs long before.

Dockerfiles and Docker Hub are directly inspired from Vagrantfiles and the Vagrant Box library

coppsilgold

4 months ago

When you launch a container (either through docker you manually through namespaces) you are effectively representing yourself to the kernel as a separate thing. This allows you to construct a completely separate environment when interacting with the kernel where none of your concerns are going to leak out and nothing you don't care for is going to leak in.

When people say that static executables would solve the problem they are wrong, a static executable just means that you can eschew constructing a separate file-system inside your container - and you will probably need to populate some locations anyway.

Properly configured containers are actually supposed to be secure sandboxes, such that any violation is a kernel exploit. However the Linux kernel attack surface is very large so no one serious who offers multi-tenant hosting can afford to rely on containers for isolation. They have to assume that a container escape 0day can be sourced. It may be more accurate to say that a general kernel 0day can be sourced since the entire kernel surface area is open for anyone to poke. seccomp can mitigate the surface area but also narrow down the usefulness.

112233

4 months ago

not .. really. Linux kernel has no concept of a container, you have to be super careful to avoid "mixing" host stuff in. I'm yet to see an case where "leaking in" would be prevented by default. Docker "leaks in" as much as you want. Containers also do not nest gracefully (due to, e.g., uids), so cannot be used as a software component. It's mostly a linux system admin thing right now.

coppsilgold

4 months ago

The Linux kernel provides the namespaces...

Docker has made some strange decisions for default behavior but if you take a more hands on approach such as with bubblewrap/bwrap nothing will leak in.

112233

4 months ago

How would you do it? I'm quite interested! How can you hide container processes in host procfs using bwrap? And make sure no mounts stay mounted in the host? The most "nothing leaks in" runtime I've seen is gVisor (before going VM). Attaining that with bwrap would be nice, but I'm sceptical.

w10-1

4 months ago

This and the comments may miss the forest for the trees.

Enterprise software vendors sold libraries and then "application servers", essentially promising infrastructure (typically tied to databases).

Enterprise software developers -- Google in particular -- got tired of depending on others' licensed infrastructure. This birthed Spring and Docker, splitting the market.

(Fun aside: when is a container a vm? When it runs via Apple containerization.)

IshKebab

4 months ago

Because Linux devs generally suck at making portable packages that are easy to install.

stego-tech

4 months ago

A really good post that captures the IT perspective of infrastructure change and modernization over my career. They also hit the nail on the head that one of today's problems is itself automation, where we've built tools that let us create sprawling estates with no accountability for cost other than an opaque monthly bill, though I'd also chip in that said automation is often restricted to whoever your chief cloud provider is (AWS, Azure, etc).

Where I find myself advocating today is very much a "rational check" on infrastructure, and curtailing accordingly. We have the tooling to ensure high availability, but does everything need to be HA? Do our SLAs for enterprise tooling really need five-nines of availability, or can we knock some applications down to a limited schedule? Does dev/test need to be live 24/7, or can we power it off when not in use? Why are we only focusing on availability and not scalability? The list goes on, but they're also not popular in enterprises with entrenched politics, which admittedly is where I find myself struggling against the current. If my social chops were better, I suspect I'd thrive in consultancy doing just that.

All that being said, I do like containers that are done right (properly documented, secure-by-default, ready for scaling), and I continue driving more applications towards containerization in the enterprise where possible. They're the right solution for ~60-80% of enterprise use cases, with the difficulty being getting vendors on board with the idea that their software won't have a dedicated VM or hardware anymore (which everyone fiercely resists, because container-based licensing can be a PITA to them). For the rest, VMs are more than fine, and we have a growing number of ways for both to exist peacefully in the same environment. As this area of technology matures (along with "backporting" from hyperscalers to private cloud again) further, I'm really looking forward to managing global estates in smaller teams for bigger firms - things that VMs, Containers, and Infrastructure-as-Code allow.

majke

4 months ago

Most of the responses here are knee-jerk.

I liked the article. It's close to my adverntures with containers. I think the invention of docker is indeed mostly packaging: Dockerfile (functional) is pretty neat, docker hub (addressing a container) is awesome, and the ENTRYPOINT in Dockerfile is great, it distinguishes Docker from .deb.

But indeed, beyond Dockerfile things are bleak. Docker compose never rose to my expectations. To get serious things you need load blancer, storage, addressing, and these are beyond traditional containers scope.

forrestthewoods

4 months ago

Containers were invented because Linux is incapable of reliably running software programs. The global pool of shared libraries is an abject failure and containers are a heavy handed workaround.

lazylizard

4 months ago

how did solaris zones get left out of the story?

lproven

4 months ago

George Santayana and Henry Spencer can tell you.

Santayana in 1905:

"Those who cannot remember the past are condemned to repeat it."

Spencer in 1987:

"Those who do not understand Unix are condemned to reinvent it, poorly."

somat

4 months ago

containers happened because the original execution isolation environment(the process) was considered a lost cause, Processes shared too much with each other so additional isolation primitives had to be added, but they had to be sort of tacked on to the side because more important than security or correctness is backwards compatibility. so now containers are considered a different thing than processes when really they are process with these additional isolation primitives enabled.

ghaff

4 months ago

In the early 2000s (yes, long after the original jails), containers were pitched as an alternative to VMware's VMs. They lost out for a variety of reasons--but mostly because as purely a different (and somewhat lighter-weight) encapsulation technique they weren't that interesting.

user

4 months ago

[deleted]

kotaKat

4 months ago

Containers happened because nobody knew what the hell they were doing and still have no clue what the hell they are doing. Software by the deranged for the deranged.

goodpoint

4 months ago

...and they don't care about security.

nkozyra

4 months ago

> was always surprised someone didn't invent a tool for ftping to your container and updating the PHP.

No FTP needed, you can just mount the application directory.

crummy

4 months ago

> Docker also made Go credible as a programming language,

Can someone explain why Docker was beneficial for Go?

jopsen

4 months ago

Because docker was written in golang.

And before docker, not many large applications were.

freetonik

4 months ago

Ironically, also because Go is one of few popular languages for web applications that can produce a single executable binary and does not require a container to deploy with ease.

CamouflagedKiwi

4 months ago

I think there's a pretty big citation needed on that part of the article. I'm not clear that Docker contributed to that anywhere near as much as a general increase of momentum around Go as it became better known in the industry.

nitwit005

4 months ago

I've never heard of anyone basing their choice to use Go off of the fact that Docker used it. Certainly, the fact that Google was pushing Go mattered far more.

I think you should see this article as someone who's a huge fan of Docker due to working there for years.

jcelerier

4 months ago

For me the main reason to use containers is "one-line install any linux distro userspace". So much simpler than installing a dozen VirtualBox boxes to test $APP on various versions of ubuntu, debian, nixos, arch, fedora, suse, centos etc.

kccqzy

4 months ago

Yeah nowadays we have the distrobox(1) command. Super useful. But certainly that's not why containers happened.

Saris

4 months ago

I have no idea how close I am, but Linux seems to have no way of having multiple versions of a dependency at once like Windows does. So I suspect thats the main reason.

nahuel0x

4 months ago

Because containers were lighter than VM's, made reproducible and modularizable how to build the VM contents, and "solved" Unix/POSIX process isolation / library management design flaws.

blu3h4t

4 months ago

You can laugh or not but its because they never finished gnu/hurd :D

sharts

4 months ago

Why does every discussion about containers always end up revolving around linux and cgroups? Didn’t we have Zones and even Jails way back?

gorfian_robot

4 months ago

in some situations, the cloud is a no-go and the lead time to install new hardware is quite lengthy. in those case vm's or containers allow for rapid change at the software/OS level while the meat space still moves like a glacier.

sounds minor, but it is a Big Deal for some

bboreham

4 months ago

The tool that connects into your container and updates the files was called "Tilt".

boomskats

4 months ago

Fwiw the actual video that he links to is well worth a watch.

Featuring one of the most Justin intros ever.

0xbadcafebee

4 months ago

Why it happened is not nearly as important as what it unveiled: that versioned immutable systems are the most powerful system design concept in history. Most people have not yet grasped what an insanely powerful concept it is. At some point in the future, maybe 50-100 years from now, someone will look back and say "holy shit; deploying web apps are where this concept came from?" I hope in my lifetime that people get it and start applying it to other system designs/components.

hackshack

4 months ago

radmind had this philosophy. https://radmind.org

It let you create diffs of a filesystem, and layer them with configurations, similar to containers. Useful for managing computer labs at the time.

0xbadcafebee

4 months ago

Well disk (or tape) images are as old as 'dd' and 'tar', that's not the revolutionary part. If the disk is writeable the state is still constantly mutating, so you're fighting a war of attrition (it's configuration management at that point, which is terrible). But a read-only disk that doesn't accrue changes, and only needs a reboot to fix, that's the revolutionary part. Anybody who ran thin-terminals can tell you how reliable and easier to manage those are than a full-blown OS.

At some point in the future people are going to realize that every system should work that way.

all2

4 months ago

Likely because Plan9's 'everything-is-a-filesystem' failed.

walkabout

4 months ago

The standard answer is, "because inventing and implementing them was easier than fixing Python packaging."

LexiMax

4 months ago

I think "fixing distro packaging" is more apropos.

In a past life, I remember having to juggle third-party repositories in order to get very specific versions of various services, which resulted in more than few instances of hair-pull-inducing untangling of dependency weirdness.

This might be controversial, but I personally think that distro repos being the assumed first resort of software distribution on Linux has done untold amounts of damage to the software ecosystem on Linux. Containers, alongside Flatpak and Steam, are thankfully undoing the damage.

walkabout

4 months ago

> This might be controversial, but I personally think that distro repos being the assumed first resort of software distribution on Linux has done untold amounts of damage to the software ecosystem on Linux.

Hard agree. After getting used to "system updates are... system updates; user software that's not part of the base system is managed by a separate package manager from system updates, doesn't need root, and approximately never breaks the base system (to include the graphical environment); development/project dependencies are not and should not be managed by either of those but through project-specific means" on macOS, the standard Linux "one package manager does everything" approach feels simply wrong.

bsder

4 months ago

> development/project dependencies are not and should not be managed by either of those but through project-specific means" on macOS, the standard Linux "one package manager does everything" approach feels simply wrong.

This predates macOS. The mainframe folks did this separation eons ago (see IBM VM/CMS).

On Unix, it's mostly the result of getting rid of your sysadmins who actually had a clue. Even in Unix-land in the Bad Old Days(tm), we used to have "/usr/local" for a reason. You didn't want the system updating your Perl version and bringing everything to a screeching halt; you used the version of Perl in /usr/local that was under your control.

bombcar

4 months ago

I wonder if it can be traced back to something RedHat did somewhere, because it may have all begun once you COULDN'T be absolutely certain that anything even remotely "enterprise" was running on a RedHat.

LexiMax

4 months ago

I think it's a natural outgrowth of what Linux is.

Linux is just a kernel - you need to ship your own userland with it. Therefore, early distros had to assemble an entire OS around this newfangled kernel from bits and pieces, and those bits and pieces needed a way to be installed and removed at will. Eventually this installation mechanism gets scope creep and and suddenly things like FreeCiv and XBill are distributed using the same underlying system that bash and cron use.

This system of distro packaging might be good as a selling point for a distro - so people can brag about their distro comes with 10,000 packages or whatever. That said, I can think of no other operating system out there where the happiest path of releasing software is to simply release a tarball of the source, hope a distro maintainer packages it for you, hope they do it properly, and hope that nobody runs into a bug due to a newer or older version of a dependency you didn't test against.

skydhash

4 months ago

Yours is a philosophy I encounter more and more. Where there should be that unified platform, ideally fast moving, where software is only tested against $latest. Stability is a thing of the past. The important thing is more feature.

Instead of designing a solution and perfecting it overtime, it's endless tweaking where there's a new redesign every years. And you're supposed to use the exact computer as the Dev to get their code to work.

ghaff

4 months ago

Red Hat was actually doing something more directly based on a variety of existing Linux projects than Docker but switched to OCI/Docker when that came about--rather than jumping on the CloudFoundry bandwagon. (Which many argues was obviously the future for container orchestration.)

Kubernetes was also not the obvious winner in its time with Mesos in particular seeming like a possible alternative when it wasn't clear if orchestration and resource management weren't possibly different product categories.

I was at Red Hat at the time and my impression was they did a pretty good job of jumping onto where the community momentum at the time was--while doubtless influencing that momentum at the time.

aaroninsf

4 months ago

Ngl this is why I started using them

jauntywundrkind

4 months ago

Never grew popular, perhaps. But I'm not sure how it failed, and not sure how many of the Venm Diagrams of concerns plan9 really has with containers.

Yes there was an idea of creating bespoke filesystems for apps, custom mount structures that plan9 had. That containers also did something semi-parallel to. But container images as read only overlays (with a final rw top overlay) feel like a very narrow craft. Plan9 had a lot more to it (everything as a file), and containers have a lot more to them (process, user, net namespaces, container images to pre-assembled layers).

I can see some shared territory but these concerns feel mostly orthogonal. I could easily imagine a plan9 like entity arising amid the containerized world: these aren't really in tension with each other. There's also a decade and a half+ gap between Plan9's hayday and the rise of containers.

yohbho

4 months ago

> Docker was easy to adopt as it did not change very much about how you used software.

What?! Docker images should seperate the program from the data. Something casual image maintainers fail to adhere to. I find docker extreme hard to use for persistent state that is not a remote DB.

It is even less secure than VMs (which seem to be like swiss chess if you try to break out). (Security-wise: rkt failed five years ago, BSD jails could do it better?)

Also, Docker was supposed to solve performance issues of VMs, debloating them. Instead, we got hypervisors or full VMs as hosts for Docker images, just to get the image as a de facto meta package manager to ship software.

So for me, it can be convenient, but I have no idea what I am doing when adding make file like statements to a yml to run full containers just to run e.g. a web server with some dependecies. But the workflow for that was novell to me, who previously fought with ruby and python module installations on host system.