hackernews client

Debian's Git Transition

256 pointsposted 2 months ago

(diziet.dreamwidth.org)

150 Comments

ckastner

2 months ago

There is some nuance to this. Adding comments to the stated goal "Everyone who interacts with Debian source code (1) should be able to do so (2) entirely in git:

(1) should be able does not imply must, people are free to continue to use whatever tools they see fit

(2) Most of Debian work is of course already git-based, via Salsa [1], Debian's self-hosted GitLab instance. This is more about what is stored in git, how it relates to a source package (= what .debs are built from). For example, currently most Debian git repositories base their work in "pristine-tar" branches built from upstream tarball releases, rather than using upstream branches directly.

[1]: https://salsa.debian.org

amluto

2 months ago

> For example, currently most Debian git repositories base their work in "pristine-tar" branches built from upstream tarball releases

I really wish all the various open source packaging systems would get rid of the concept of source tarballs to the extent possible, especially when those tarballs are not sourced directly from upstream. For example:

- Fedora has a “lookaside cache”, and packagers upload tarballs to it. In theory they come from git as indicated by the source rpm, but I don’t think anything verifies this.

- Python packages build a source tarball. In theory, the new best practice is for a GitHub action to build the package and for a complex mess to attest that really came from GitHub Actions.

- I’ve never made a Debian package, but AFAICT the maintainer kind of does whatever they want.

IMO this is all absurd. If a package hosted by Fedora or Debian or PyPI or crates.io, etc claims to correspond to an upstream git commit or release, then the hosting system should build the package, from the commit or release in question plus whatever package-specific config and patches are needed, and publish that. If it stores a copy of the source, that copy should be cryptographically traceable to the commit in question, which is straightforward: the commit hash is a hash over a bunch of data including the full source!

soneil

a month ago

This was one of the "lessons learnt" from the XZ incident. One of the (many) steps they took to avoid scrutiny was modifications that existed in the real tarball but not the repo.

turminal

2 months ago

For lots of software projects, a release tarball is not just a gzipped repo checked out at a specific commit. So this would only work for some packages.

Terr_

2 months ago

A simple version of this might be a repo with a single file of code in a language that needs compilation, versus, and the tarball with one compiled binary.

Just having a deterministic binary can be non-trivial, let alone a way to confirm "this output came from that source" without recompiling everything again from scratch.

amluto

2 months ago

For most well designed projects, a source tarball can be generated cleanly from the source tree. Sure, the canonical build process goes (source tarball) -> artifact, but there’s an alternative build process (source tree) -> artifact that uses the source tarball as an intermediate.

In Python, there is a somewhat clearly defined source tarball. uv build will happily built the source tarball and the wheel from the source tree, and uv build --from <appropriate parameter here> will build the wheel from the source tarball.

And I think it’s disappointing that one uploads source tarballs and wheels to PyPI instead of uploading an attested source tree and having PyPI do the build, at least in simple cases.

In traditional C projects, there’s often some script in the source tree that runs it into the source tarball tree (autogen.sh is pretty common). There is no fundamental reason that a package repository like Debian or Fedora’s couldn’t build from the source tree and even use properly pinned versions of autotools, etc. And it’s really disappointing that the closest widely used thing to a proper C/C++ hermetic build system is Dockerfile, and Dockerfile gets approximately none of the details right. Maybe Nix could do better? C and C++ really need something like Cargo.

SonOfLilit

2 months ago

The hacker in me is very excited by the prospect of pypi executing code from my packages in the system that builds everyone's wheels.

vetrom

2 months ago

Launchpad does this for everything, as does sbuild/buildd in debian land. They generally make it work by both: running the build system in a neutered VM (network access generally not permitted during builds, or limited to only a debian/ubuntu/PPA package mirror), and going to some degree of invasive process/patching to make build systems work without just-in-time network access.

SUSE and Fedora both do something similar I believe, but I'm not really familiar with the implementation details of those two systems.

amluto

a month ago

I’m only familiar with the Fedora system. The build is hermetic, but the source input come from fedpkg new-sources, which runs on the client used by the package developer.

amluto

2 months ago

This seems no worse than GitHub Actions executing whatever random code people upload.

It’s not so hard to do a pretty good job, and you can have layers of security. Start with a throwaway VM, which highly competent vendors like AWS will sell you at a somewhat reasonable price. Run as a locked-down unprivileged user inside the container. Then use a tool like gVisor.

Also… most pure Python packages can, in theory, be built without executing any code. The artifacts just have some files globbed up as configured in pyproject.toml. Unfortunately, the spec defines the process in terms of installing a build backend and then running it, but one could pin a couple of trustworthy build backends versions and constraint them to configurations where they literally just copy things. I think uv-build might be in this category. At the very least I haven’t found any evidence that current uv-build versions can do anything nontrivial unless generation of .pyc files is enabled.

zahlman

2 months ago

If it isn't at least a gzip of a subset of the files of a specific commit of a specific repo, someone's definition of "source" would appear to need work.

LtWorf

2 months ago

To get a specific commit from a repo you need to clone usually, which will involve a much bigger download than just downloading your tar file.

amluto

2 months ago

Shallow clones are a thing. And it’s fairly straightforward to create a tarball that includes enough hashes to verify the hash chain all the way to the commit hash. (In fact, I once kludged that up several years ago, and maybe I should dust it off. The tarball extracted just like a regular tarball but had all the git objects needed hiding inside in a way that tar would ignore.)

zahlman

a month ago

I don't actually see why you'd need to verify the hash chain anyway. The point of a source tarball, as I understand it, is to be sure of what source you're building, and to be able to audit that source. The development path would seem to be the developer's concern, not the maintainer's.

amluto

a month ago

> The point of a source tarball, as I understand it, is to be sure of what source you're building

Perhaps, in the rather narrow sense that you can download a Fedora source tarball and look inside yourself.

My claim is that upstream developers produce actual official outputs: git commits and sometimes release tarballs. (But note that release tarballs on GitHub are often a mess and not really desired by the developer.). And I further think that verification that a system like Fedora or Debian or PyPI is building from correct sources should involve byte-for-byte comparison of the source tree and that, at least in the common case, there should be no opportunity for a user of one of these systems to upload sources that do not match the claimed upstream sources.

The sadly common workflow where a packager clones a source tree, runs some scripts, and uploads the result as a “source tarball” is, IMO, wrong.

LtWorf

a month ago

You know git allows history rewrite right?

LtWorf

2 months ago

of the head, or of any commit?

amluto

2 months ago

I’m not sure why this would make a difference. The only thing special about the head is that there is a little file (that is not, itself, versioned) saying that a particular commit is the head.

mjw1007

2 months ago

> If a package hosted by Fedora or Debian or PyPI or crates.io, etc claims to correspond to an upstream git commit or release, then the hosting system should build the package, from the commit or release in question plus whatever package-specific config and patches are needed, and publish that.

For Debian, that's what tag2upload is doing.

n8m8

a month ago

shoutout AUR, I’m trying arch for the first time (Omarchy) and wasn’t planning on using the AUR, but realized how useful it is when 3 of the tools I wanted to try were distributed differently. AUR made it insanely easy… (namely had issues with Obsidian and Google Antigravity)

cryptonector

2 months ago

If "whatever tools they see fit" means "patch quilting" then please no. Leave the stone age and enter the age of modern DVCS.

lta

2 months ago

git can be seen as porcelain on top of patch quilting so it's not as much done âge as one might think

cryptonector

2 months ago

This is a misunderstanding of what Git does. Git is a Merkle hash tree, content-addressed, immutable/append-only filesystem, with commits as objects that bind a filesystem root by its hash. The diffs that make up a commit are not really its contents -- they are computed as needed. Now most of the time it's best to think of Git as a patch quilting porcelain, but it's really more than that, and while you can get very far with the patch quilting porcelain model, at some point you need to understand that it goes deeper.

ongy

2 months ago

That point is not reached during packaging though.

I prefer rebasing git histories over messing with the patch quilting that debian packaging standards use(d to use). Though last I had to use the debian packaging mechanisms, I roundtripped them into git for working on them. I lost nothing during the export.

cryptonector

2 months ago

Yes, I also end up doing things like that, but it's just a pain. If Debian did it themselves then adding a local commit would be truly trivial.

cryptonector

2 months ago

The whole patch quilting thing is awful. Just keep the patches as commits. It won't "trick" me or anyone else, especially if you keep them in branches that denote "debian".

Please, please, stop the nonsense with the patch quilting -- it's really cumbersome, it adds unnecessary cognitive load, it raises the bar to contributions, it makes maintenance harder, and it adds _zero value_. Patch quilting is a lose-lose proposition.

gioele

2 months ago

> The whole patch quilting thing is awful. Just keep the patches as commits.

I'd say that `quilt` the utility is pretty much abandoned at this point. The name `quilt` remains in the format name, but otherwise is not relevant.

Nowadays people that maintain patches do it via `gbp-pq` (the "patch queue" subcommand of the badly named `git-buildpackage` software). `gbp-pq switch` reads the patches stored in `debian/patches/`, creates an ephemeral branch on top of the HEAD, and replays them there. Any change done to this branch (new commits, removed comments, amended commits) are transformed by `gbp-pq export` into a valid set of patches that replaces `debian/patches/`.

This mechanism introduces two extra commands (one to "enter" and one to "exit" the patch-applied view) but it allows Debian to easily maintain a mergeable Git repo with floating patches on top of the upstream sources. That's impossible to do with plain Git and needs extra tools or special workflows even outside of Debian.

coryrc

2 months ago

> That's impossible to do with plain Git and needs extra tools or special workflows even outside of Debian

Rebase.

coryrc

2 months ago

Also rebasing has less information available to it, so it's less likely to update cleanly than merging. Don't do it!! Just consider the diff between the new head and upstream as "the diff" and describe the reasons for it.

cryptonector

2 months ago

What, no. In a merge you have two parents and their histories. In a rebase you have... the same thing as-if you had merged a fast-forward-ready branch. It's the same thing.

If you insist you can add Merge commits to bracket fast-forward pushes, but arguably there is no need, and especially so for something like Debian packages where the convention would be that Debian's patches are "always on top", so you can see them by doing `git log ${base}..${debian_release_branch}` for any release. (And what's the base? Whatever upstream branch/tag the Debian release is based on, but you can add more tags with a Debian naming convention to denote the bases.)

coryrc

2 months ago

In practical, large-scale usage, the default merging algorithm works better than the default rebase algorithm. But I did switch teams from using a rebase workflow to a merge workflow and manual conflict resolution needs went way, way down. Obviously there are confounding issues, but that's my experience.

If your patches never touch the same files as others, I think it doesn't matter. But, IIRC, if patch A and patch B both touch file F, and the changes in patch A are in context for diffs of patch B, it always fails if patch A changes patch B's context, but since merging incorporates all changes at once, these separate context changes don't apply.

It's been a while, but it might be only when you need to manually resolve patch A, then you also have to manually resolve patch B even if you wouldn't have had to touch it in a merge scenario.

cryptonector

a month ago

> In practical, large-scale usage, the default merging algorithm works better than the default rebase algorithm.

You're referring to having to do conflict resolution for each commit in the rebase series, as opposed to all at once for a merge. Either way if the upstream has added thousands of commits since the last time, you're in for no fun.

This is a case where Git could be better, but as I responded to u/gioele there exist tools that greatly help with the conflict resolution issue, such as this one that I wrote myself:

https://gist.github.com/nicowilliams/ea2fa2b445c2db50d2ee650...

which basically bisects to find the upstream commit that introduces a conflict with each commit in the rebase series.

This has one major advantage over merge workflow conflict resolution: you get the most post possible context for each manual conflict resolution you have to do! And you still get to have clean, linear history when you're done.

cryptonector

2 months ago

What siblings say. What you want is `git rebase`, especially with the `--onto` and `--interactive` options. You might also want something like bisect-rebase.sh[0], though there are several other things like it now.

[0] https://gist.github.com/nicowilliams/ea2fa2b445c2db50d2ee650...

Snild

2 months ago

Rebasing would mean there's no continuous versioning of the "patches on top", which might be undesirable. Also, the history rewriting might make cooperation difficult.

Merges would avoid those problems, but are harder to do if there are lots of conflicts, as you can't fix conflicts patch by patch.

Perhaps a workflow based on merges-of-rebases or rebase-and-overwrite-merge would work, but I don't think it's fair to say "oh just rebase".

cryptonector

2 months ago

> Rebasing would mean there's no continuous versioning of the "patches on top", which might be undesirable. Also, the history rewriting might make cooperation difficult.

Let's say you have these version tags upstream: foo-1.0.1, foo-1.1.0, foo-1.3.0, and corresponding Debian releases 1.0.1-0, 1.1.0-0, 1.1.0-1, 1.3.0-0, 1.3.0-1, and 1.3.0-2, and the same 3 patches in all cases, except slightly different in each case. Then to see the several different versions of these patches you'd just `git log --oneline foo-${version}..debian-${version}-${deb_version}`.

coryrc

2 months ago

Gerrit introduces the concept of Commit-Id; essentially a uuid ties to the first review which merged a proposed commit into the trunk.

Cherry picks preserve that Commit-Id. And so do rebases; because they're just text in a commit message.

So you can track history of patches that way, if you needed to. Which you won't.

(PS some team at google didn't understand git or their true requirements, so they wasted SWE-decades at that point on some rebasing bullshit; I was at least able to help them make it slightly less bad and prevent other teams from copying it)

Snild

2 months ago

But that Commit-Id footer has no functional effect. I don't see how it would help me if I have a clone of the repo, and my upstream (in this case, the debian maintainer) rebases.

> Which you won't.

Why not? Doesn't it make sense to be able to track the history of what patches have been applied for a debian package?

coryrc

2 months ago

You need additional tooling to make use of Commit-Id. With Gerrit, it does link them all together.

> Doesn't it make sense to be able to track the history of what patches have been applied for a debian package?

... no. Each patch has a purpose, which will be described in the commit message. Hopefully it does what it says it does, which you can compare with its current diff.

If it was upstreamed with minimal changes, then the diff is near-empty. Drop it.

If it was upstreamed with significant changes, then the diff will be highly redundant. Drop it.

If the diff appears to do what the commit message says it does, then it probably does what it says.

If the diff is empty, either it was upstreamed or you fucked up rebasing. Don't be negligent when rebasing.

adastra22

2 months ago

How is that not literally the git history?

Snild

2 months ago

It is, except after rebasing.

adastra22

2 months ago

That’s what git-rebase is for, and it is built into standard git.

dima55

2 months ago

Moving from a patch stack maintained by quilt to git is what this article is about.

blucaz

2 months ago

Maintaining separate upstream sources and downstream patches does provide value. Maybe not to you, but it does.

For example, it's trivial from a web browser with a couple of clicks to go and find out all the downstream changes to a package. For example to see how glibc is currently customized in debian testing/unstable you can just navigate this webpage:

https://sources.debian.org/src/glibc/2.42-6/debian/patches

If everything gets merged in the same git tree it's way harder. Harder but doable with a rebase+force push workflow, which makes collaboration way harder. Just impossible with a merge workflow.

As an upstream maintainer of several project, being able to tell at a glance and with a few clicks how one of my projects is patched in a distribution is immensely useful when bug reports are opened.

In a past job it also literally saved a ton of money because we could show legal how various upstreams were customized by providing the content of a few .debian.tar.gz tarballs with a few small, detached patches that could be analyzed, instead of massive upstream trees that would take orders of magnitude more time to go through.

cryptonector

2 months ago

> For example, it's trivial from a web browser with a couple of clicks to go and find out all the downstream changes to a package.

How is this not also true for Git? Just put all the Debian commits "on top" and use an appropriate naming convention for your branches and tags.

> If everything gets merged in the same git tree it's way harder.

Yes, so don't merge, just rebase.

> Harder but doable with a rebase+force push workflow, which makes collaboration way harder.

No force pushes, just use new branch/tag names for new releases.

> Just impossible with a merge workflow.

Not impossible but dumb. Don't use merge workflows!

> As an upstream maintainer of several project, being able to tell at a glance and with a few clicks how one of my projects is patched in a distribution is immensely useful when bug reports are opened.

Git with a suitable web front-end gives you exactly that.

> In a past job it also literally saved a ton of money because we could show legal how various upstreams were customized by providing the content of a few .debian.tar.gz tarballs with a few small, detached patches that could be analyzed, instead of massive upstream trees that would take orders of magnitude more time to go through.

`git format-patch` and related can do the moral equivalent.

lta

2 months ago

It's worth mentioning the quilting approach likely predates the advent of git by at least a decade.. I think compatibility with git has been available for a while now and I assume there was always something more pressing than migrating the base stack to git

XorNot

2 months ago

dgit handles the whole affair with very little fuss I've found and is quite a pleasant workflow.

cryptonector

2 months ago

What is dgit?

RandallBrown

2 months ago

https://wiki.debian.org/DgitFAQ

cryptonector

2 months ago

Thanks! IMO Debian should just switch to only Git.

user

2 months ago

[deleted]

user

2 months ago

[deleted]

IshKebab

2 months ago

What is patch quilting, for the blissfully unaware?

eichin

2 months ago

https://wiki.debian.org/UsingQuilt but the short form is that you keep the original sources untouched, then as part of building the package, you apply everything in a `debian/patches` directory, do the build, and then revert them. Sort of an extreme version of "clearly labelled changes" - but tedious to work with since you need to apply, change and test, then stuff the changes back into diff form (the quilt tool uses a push/pop mechanism, so this isn't entirely mad.)

IshKebab

2 months ago

Ha yes that does sound mad. If only there was a version control system specifically designed to track changes to code...

db48x

2 months ago

Quilt predates Git. Back then source was distributed as a tarball, and Debian simply maintained a directory full of patches to apply to the tarball.

IshKebab

2 months ago

Sure but Git has been available (and super popular) for almost 20 years now.

db48x

2 months ago

Yea, so? Debian goes back 32 or more years, and quilt dates to approximately the same time. It’s probably just a year or two younger than Debian.

At Mozilla some developers used quilt for local development back when the Mozilla Suite source code was kept in a CVS repository. CVS had terrible support for branches. Creating a branch required writing to each individual ,v file on the server (and there was one for every file that had existed in the repository, plus more for the ones that had been deleted). It was so slow that it basically prevented anyone from committing anything for hours while it happened (because otherwise the branch wouldn’t necessarily get a consistent set of versions across the commit), so feature branches were effectively impossible. Instead, some developers used quilt to make stacks of patches that they shared amongst their group when they were working on larger features.

Personally I didn’t really see the benefit back then. I was only just starting my career, fresh out of university, and hadn’t actually worked on any features large enough to require months of work, multiple rounds of review, or even multiple smaller commits that you would rebase and apply fixups to. All I could see back then were the hoops that those guys were jumping through. The hoops were real, but so were the benefits.

IshKebab

2 months ago

> Yea, so?

So it's clearly a way better solution and it's disappointing that they still haven't switched to it after 20 years? I dunno what else to say...

tremon

2 months ago

So has git-buildpackage; the debian historical archives don't go further back than v0.4, but the oldest bug report referencing gbp is from december 2006: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=403987

blibble

2 months ago

it's quite difficult to maintain a quilt like workflow with plain git

I've tried it

cryptonector

2 months ago

Quilt is difficult to maintain, but a quilt-like workflow? Easy: it's just a branch with all patches as commits. You can re-apply those to new releases of the upstream by using `git rebase --onto $new_upstream_commit_tag_or_branch`.

mxey

2 months ago

How do you track changes to the patches themselves?

cryptonector

2 months ago

By having a naming convention for your tags and branches, then you can always identify the upstream "base" upon which the Debian "patches" are based, and then you can trivially use `git log` to list them.

Really, Git has a solution to this. If you insist that it doesn't without looking, you'll just keep re-inventing the wheel badly.

blibble

2 months ago

but then if I want to see the history for a specific patch, or bisect them?

mercurial has a patch queue extension that married it and quilt, which was very easy to use

cryptonector

a month ago

Do you ever really want this? I don't recall wanting this. But you can still get this: just list the ${base_ref}..${deb_ref} commit ranges, select the commit you want, and diff the `git show` of the selected commits. It helps here to keep the commit synopsis the same.

E.g.,

  c0=$(git log --oneline ${base_ref0}..${deb_ref0} |
         grep "^[^ ] The subject in question" |
         cut -d' ' -f1)
  c1=$(git log --oneline ${base_ref1}..${deb_ref1} |
         grep "^[^ ] The subject in question" |
         cut -d' ' -f1)
  if [[ -z $c0 || -z $c1 ]]; then
    echo "Error: commits not found"
  else
    diff -ubw <(git show $c0) <(git show c1)
  fi

See also the above commentary about Gerrit and commit IDs.

(Honestly I don't need commit IDs. What happens if I eventually split a commit in a patch series into two? Which one, if either, gets the old commit ID? So I just don't bother.)

mxey

a month ago

So there’s no way to have commit messages on changes to patches? There’s also https://dep-team.pages.debian.net/deps/dep3/

People keep saying “just use Git commits” without understanding the advantages of the Quilt approach. There are tools to keep patches as Git commits that solve this, but “just Git commits” do not.

cryptonector

a month ago

Having maintained private versions of Debian packages, I have zero need for "commit messages on changes to patches". I can diff them as needed as I showed, but I rarely ever need to -- I mostly only rebase onto new upstreams. Seeing differences in patches isn't helpful because there is not enough context there as to what changed in the upstreams.

I rather suspect that "commit messages on changes to patches" is what Debian ended up with and back-justifies it.

Of course, I am not a Debian maintainer, so it's entirely possible I'm just missing the experience of it that would make me want "commit messages on changes to patches".

mxey

a month ago

Quilt was AFAIK used before Git, so you’re not wrong. But now that it’s there, it has some advantages.

I’m not arguing against replacing Quilt, but it should be more than just Git. I haven’t done Debian packaging in a long time but apparently there are some Git-based tools now?

IshKebab

a month ago

I don't know that I've ever wanted to diff a diff, but you could do that still. And bisecting would still be possible, especially if you use merges instead of rebases.

cryptonector

a month ago

Bisect rebases... you mean that you have two release branches based on divergent upstream branches and you want to quickly test where a bug was introduced on the way from the one to the other? What I would do in a rebase workflow is find the merge base (`git merge-base`) of the two release branches, and bisect from that to the release branch I'm interested in.

IshKebab

2 months ago

You can keep the old branches around if you want. Or merge instead of rebasing.

coryrc

2 months ago

Those who don't understand git are doomed to reimplement half of it poorly?

(I know that's not quite the Greenspun quote)

cryptonector

2 months ago

I think that's right, sadly.

m463

2 months ago

> 4. No-one should have to learn about Debian Source Packages, which are bizarre, and have been obsoleted by modern version control.

mschuster91

2 months ago

Now if a consequence of that could be that one (as an author of a piece of not-yet-debianized software) can have the possibility to decently build Debian packages out of their own repository and, once the package is qualified to be included in Debian, trivially get the publish process working, that would be a godsend.

At the moment, it is nothing but pain if one is not already accustomed and used to building Debian packages to even get a local build of a package working.

kpcyrd

2 months ago

The problem is that "once the package is qualified to be included in Debian" is _mostly_ about "has the package metadata been filled in correctly" and the fact that all your build dependencies also need to be in Debian already.

If you want a "simple custom repository" you likely want to go in a different direction and explicitly do things that wouldn't be allowed in the official Debian repositories.

For example, dynamic linking is easy when you only support a single Debian release, or when the Debian build/pkg infrastructure handles this for you, but if you run a custom repository you either need a package for each Debian release you care about and have an understanding of things like `~deb13u1` to make sure your upgrade paths work correctly, or use static binaries (which is what I do for my custom repository).

kakwa_

a month ago

Just a few bits about that.

I would recommend looking into the chroot based build tools like pbuilder (.deb) and mock (.rpm).

It greatly simplifies the local setup, including targeting different distributions or even architectures (<3 binfmt).

But I tend to agree, these tools are not easy to remember, specially for the occasional use. And packaging a complex software can be a pain if you fall down the dependency rabbit hole while trying to honor distros' rules.

That's why I ended-up spending quite a bit of time tweaking this set of ugly Makefifes: https://kakwa.github.io/pakste/ and why I often relax things allowing network access during build and the bundling of dependencies, specially for Rust, Go or Node projects.

rjsw

2 months ago

They could take a look at how pkgsrc [1] works.

[1] https://www.pkgsrc.org/

eduction

2 months ago

pkgsrc is great, I use this on smartos (as just an end user) and it’s extremely straightforward

Valodim

2 months ago

Oh, yes. This seems like nothing short of necessary for the long term viability of the project. I really hope this effort succeeds, thank you to everyone pushing this!

jonhohle

2 months ago

You might think, but here we are at the end of 2025 and this is still a WIP.

I don’t think it’s a bad move, but it also seems like they were getting by with patches and tarballs.

agwa

2 months ago

I can't find it now but I recently saw a graph of new Debian Developers joining the project over time and it has sharply declined in recent years. I was on track to becoming a Debian Developer (attended a couple DebConfs, got some packages into the archive, became a Debian Maintainer) but I ultimately burned out in large part because of how painful Debian's tooling makes everything. Michael Stapelberg's post about leaving Debian really rings true: https://michael.stapelberg.ch/posts/2019-03-10-debian-windin...

Debian may still be "getting by" but if they don't make changes like this Git transition they will eventually stop getting by.

boldcode69

2 months ago

[dead]

MarsIronPI

2 months ago

What I've always found off-putting about the Debian packaging system is that the source lives with the packaging. I find that I prefer Ports-like systems where the packaging specifies where to fetch the source from. I find that when the source is included with the packaging, it feels more unwieldy. It also makes updating the package clumsier, because the packager has to replace the embedded source, rather than just changing which source tarball is fetched in the build recipe.

sillystuff

2 months ago

Debian requires that packages be able to be built entirely offline.

> Debian guarantees every binary package can be built from the available source packages for licensing and security reasons. For example, if your build system downloaded dependencies from an external site, the owner of the project could release a new version of that dependency with a different license. An attacker could even serve a malicious version of the dependency when the request comes from Debian's build servers. [1]

[1] https://wiki.debian.org/UpstreamGuide#:~:text=make%20V=1-,Su...

ForHackernews

2 months ago

This is such a wonderful guarantee to offer to users. In most cases, I trust the Debian maintainers more than a trust the upstream devs (especially once you take into account supply chain attacks).

It's sad how much Linux stuff is moving away from apt to systems like snap and flatpak that ship directly from upstream.

MarsIronPI

2 months ago

So do Gentoo and Nix, yet they have packaging separate from the source code. The source is fetched, but the build is sandboxed from the network during the configure, build and install phases. So it's technically possible.

aidenn0

2 months ago

Nix definitely does not allow most things to be built offline (at least in the way Debian means it).

With Nix, any fetcher will download the source. It does so in a way that guarantees the shasum of what is fetched is identical, and if you already have something in the nix store with that shasum, it won't have to fetch it.

However, with just a mirror of the debian source tree, you can build everything without hitting the internet. This is assuredly not true with just a mirror of nixpkgs.

MarsIronPI

2 months ago

> With Nix, any fetcher will download the source.

OK, I see how the Debian idea differs from the Portage/Nix/etc. idea. For Portage and Nix it is enough that the build proper be offline, but the source code is fetched at the beginning of the package build. Not only do I find this sufficient, I prefer it because IMO it makes the package easier to work with (since you're only wrangling the packaging code, not upstream's).

NewJazz

2 months ago

There are probably still ways to maintain a source archive with a ports system. Just analyze the sources used by builds, create a mirror, and redirect fetches to use the mirror. It's not that crazy. The packaging would still be a separate affair.

sestep

2 months ago

This is exactly what Nix does, if I understand correctly: the "mirror" is cache.nixos.org as mentioned in XorNot's reply to the same parent post.

aidenn0

2 months ago

I'm about 80% certain that cache.nixos.org caches the results of the builds, not the input sources.

MarsIronPI

2 months ago

I think it does both, at least for some sources. After all, sources are derivations too.

sestep

2 months ago

This is correct; in Nix lingo these are referred to as "fixed output derivations". For some other interesting context, see this Nix forum post from last year in which they discussed deleting some stuff from cache.nixos.org to save money, but were clear that they'd keep all fixed output derivations and only delete other things that aren't derivable from those: https://discourse.nixos.org/t/upcoming-garbage-collection-fo...

XorNot

2 months ago

Nix and specifically nixpkgs is IMO very bad at this. It's not a distro: it's a collection of random links that in many cases now only exists in cache.nixos.org. The tarball server frequently doesn't have content, can't represent some content at all (recursive hash types), links have rotted away completely (broadcom driver zips referencing a domain which is now advertising online gambling).

Nix isn't functional: it's a functional core that moved every bit of the imperative part to an even less parseable stage, labelled it "evaluation" and then ignored any sense of hygiene about it.

No: your dependency tree for packaging should absolutely not include an opaque binary from a cache server, or a link to a years old patch posted on someone else's bugzilla instance (frequently link rotted as well).

Nothing has made me appreciate the decisions of mainstream distributions more then dealing with an alternative like this.

em-bee

2 months ago

that's a pretty damning detail about nix which everyone else seems to speak so highly about. are there any articles that explain this in more detail?

MarsIronPI

2 months ago

Not saying you're wrong, but in the ~year I've been using NixOS I've never noticed anything like that.

XorNot

a month ago

The heavy asterisk here is that none of this actually makes using NixOS impossible because it obviously still works. But when you get into the problem I am - and one of the major purported benefits of it which is reproducibility and traceability - this is a pretty serious issue.

So long as the NAR files in cache.nixos.org exist, everything will work - that's not a problem. But if you actually choose to exercise that traceability - which is what I've been working on - suddenly you start finding all this stuff. The problem is nixpkgs doesn't expose or archive the code: it archives a reference to code that existed somewhere at some time, and worse it obfuscates what the code was - I can obviously still go get it from the NAR files, but I can't get any of the context surrounding it.

By contrast, things like the Fedora and Debian patching systems have - crucially - actual archives of what they're building, the patches they're building them with, and the commit messages or other notes on why those patches are being applied and the change record of them. With NixOS you get a bunch of hashes that terminates on "wefu123r23hjcowiejcwe.nar" and you don't know what that is until nixpkgs happens to evaluate it and calculate it, which means it's impossible to even know up-front what's going to be pulled in.

Then of course you get to practical matters: just because you can exactly specify dependencies doesn't mean you should - we all realized with containers that having a couple dozen versions of libraries kicking around is a bad idea (and lo and behold that's what traditional distro packaging tries to minimize) - and that's where all those calculated paths burn you anyway. Nix is a fairly freeform programming language, so it's nigh impossible to stop some snowflake package from pulling in a different version of a compiler or library even if I can see it happening (example I currently have: 5 different version of Rust, 5 different versions of Golang - and the invariant I want on that is "no, it's this version and you deal with it" - but there's a lot of ways nix will let you make this which are very resistant to static analysis or automated correction).

tremon

2 months ago

This doesn't say what you think it does. It says that every binary package should only depend on its declared source packages. It does not say that source packages must be constructed without an upstream connection.

What the OP was referring to, is that Debian's tooling stores the upstream code along with the debian build code. There is support tooling for downloading new upstream versions (uscan) and for incorporating the upstream changes into Debian's version control (uupdate) to manage this complexity, but it does mean that Debian effectively mirrors the upstream code twice: in its source management system (mostly salsa.debian.org nowadays), and in its archive, as Debian source archives.

0x457

2 months ago

All that is required for this to work (building offline) and be immune to all bad thing you wrote: package build part must contain checksum of source code archive and mirror that source code.

cbmuser

2 months ago

> What I've always found off-putting about the Debian packaging system is that the source lives with the packaging.

Many packages have stopped shipping the whole source and just keep the debian directory in Git.

Notable examples are

- gcc-*

- openjdk-*

- llvm-toolchain-*

and many more.

throwaway7356

2 months ago

But isn't that incompatible with the proposed transition to Git?

bandrami

2 months ago

It made a lot of sense before centralized source storage (Debian packaging predates Sourceforge, let alone github).

But it's still nice to have when an upstream source goes dark unexpectedly, as does occasionally still happen.

mschuster91

2 months ago

> I find that when the source is included with the packaging, it feels more unwieldy.

On the other hand, it makes for a far easier life when bumping compile or run time dependency versions. There's only one single source of truth providing both the application and the packaging.

It's just the same with Docker and Helm charts. So many projects insist on keeping sometimes all of them in different repositories, making change proposals an utter PITA.

djaouen

2 months ago

I remember when a startup I used to work for made the transition from svn to git. They transitioned, then threw the guy who suggested the transition under the bus; he quit, and then the company collapsed. Lol!

danudey

2 months ago

I was hired at a small startup (~15 employees total) and one of the first things I did was to migrate their SVN repository to Git. Not too difficult, brought over the history and then had to write a bunch of tooling to handle the fact that not all of the source code was in one giant heirarchy anymore (since everything was microservices and self-contained libraries it made sense to split them out).

After I left that company I ended up at a larger company (~14k employees) in part because I'd worked on SVN-to-Git migrations before. Definitely a different beast, since there were a huge amount of workflows that needed changing, importing 10 years of SVN history (some of which used to be CVS history), pruning out VM images and ISOs that had been inadvertently added, rewriting tons of code in their Jenkins instance, etc.

All this on top of installing, configuring, and managing a geographically distributed internal Gitlab instance with multiple repositories in the tens or hundreds of gigabytes.

It was a heck of a ride and took years, but it was a lot of fun at the same time. Thankfully 'the guy who suggested the transition' was the CEO (in the first company) or CTO (in the second) nothing went wrong, no one got thrown under buses, and both companies are still doing a-okay (as far as source control goes).

lionkor

2 months ago

git is a skill check on learning tools to get a job done

djaouen

2 months ago

git is actually a case-study in "Worse Is Better". hg beats it in every way, except pure speed. Of course, git is still way better than svn, tho.

jdlyga

2 months ago

The way Git took over wasn't Git vs Mercurial (although that was a small part of it), but much more Git vs SVN, CVS, and people that never used source control before. It's similar to how Chrome became the dominant browser over Firefox. It was much more converts from Internet Explorer and Safari than advanced users that were already on Firefox.

bandrami

2 months ago

That is an important point: in 2005 "all code must be in version control" was still a controversial idea, particularly for companies that made software but were not "tech" companies. A lot of git's expansion came from teams putting their software in a VCS for the first time.

superkuh

2 months ago

>Making changes can be done with just normal git commands, eg git commit. Many Debian insiders working with patches-unapplied are still using quilt(1), a footgun-rich contraption for working with patch files!

Huh. I just learned to use quilt this year as part of learning debian packaging. I've started using it in some of my own forks so I could eventually, maybe, contribute back.

I guess the old quilt/etc recommendation in the debian build docs is part of the docs updates project needed that the linked page talks about.

evolve2k

2 months ago

As a process of community transition the team is right to focus on the need for more communications and documentation around the shift to git across the ecosystem.

I see alot of value in how steam helped communicate which software was and wasn’t ready to run on their new gaming platform. Tools like verification ticks and defined statuses for packages are very useful to communicate progress and to motivate maintainers to upgrade. Consider designing a similar verifition approach that helps the community easily track progress and nudge slow players. If it’s all too technical the community can’t help move things along.

https://www.steamdeck.com/en/verified

evolve2k

2 months ago

Correct me if I’m wrong but as I’m understanding it, the processes is well underway towards moving the core systems and libraries (or whatever it’s all called) across to the new way. But that there’s a massive job of extended libraries maintained by lots of other parties and this ecosystem of libraries have been using all manner of approaches, each of which has its drawbacks and the big goal here is to get all these maintainers onboard to switch over to the new git-based workflow that this transition team (and others) have been working hard to make logical and easy enough to implement.

Is that a fair general read of the situation? (I have further comments to make but wanted to check my basic assumptions first).

shmerl

2 months ago

I wish Debian would also transition to a modern bug tracker. Current one is very archaic.

kstrauser

2 months ago

It surely won't win any beauty contests, but do you think it's missing any needed functionality?

Sincere question. I haven't interacted with it much in ages.

agwa

2 months ago

The simple task of following a bug requires you to:

1. Send an empty email to a special address for the bug.

2. Wait 15-30 minutes for Debian's graylisting mail server to accept your email and reply with a confirmation email.

3. Reply to the confirmation email.

The last time I tried to follow a bug, I never got the confirmation email.

In practically every other bug tracker, following a bug is just pressing a button.

Like most of Debian's developer tooling, the bug tracker gets the job done (most of the time) but it's many times more inconvenient than it needs to be.

kstrauser

2 months ago

Fair points. But without looking at it myself, and for the benefit of people reading along, do you have to do that if you already have an account on the tracker? For instance, it's easy to follow issues on GitHub, but that's after you've jumped through some similar hoops to create an account.

agwa

2 months ago

There is no way to create an account for the Debian bug tracker. You have to jump through these hoops every single time you want to follow a bug.

kstrauser

2 months ago

Oh, wow. Yeah. Well, I asked, and now I know!

gspr

2 months ago

I really wish we could have both. An interactive web frontend, and the classic email-centric bug tracker, both serving the same data. I think both have its strengths. I suppose that the job is massive given how enormous and fast-moving the first have become.

IshKebab

2 months ago

Yeah but virtually every developer in the world has already jumped through that hoop. They don't need to do it again for every project.

Also the hoop can be as simple as "click here to sign in with <other account you already have>".

shmerl

2 months ago

I use reportbug to simplify the process of initial reporting, but whole interaction is still far from convenient.

https://tracker.debian.org/pkg/reportbug

csnover

2 months ago

As someone who uses Debian and very occasionally interacts with the BTS, what I can say is this:

As far as I know, it is impossible to use the BTS without getting spammed, because the only way to interact with it is via email, and every interaction with the BTS is published without redaction on the web. So, if you ever hope to receive updates, or want to monitor a bug, you are also going to get spam.

Again, because of the email-only design, one must memorise commands or reference a text file to take actions on bugs. This may be decent for power users but it’s a horrible UX for most people. I can only assume that there is some analogue to the `bugreport` command I don’t know of for maintainers that actually offers some amount of UI assistance. As a user, I have no idea how to close my own bugs, or even to know which bugs I’ve created, so the burden falls entirely on the package maintainers to do all the work of keeping the bug tracker tidy (something that developers famously love to do…).

The search/bug view also does not work particularly well in my experience. The way that bugs are organised is totally unintuitive if you don’t already understand how it works. Part of this is a more general issue for all distributions of “which package is actually responsible for this bug?”, but Debian BTS is uniquely bad in my experience. It shows a combination of status and priority states and uses confusing symbols like “(frowning face which HN does not allow)” and “=” and “i” where you have to look at the tooltip just to know what the fuck that means.

Marsymars

2 months ago

> As far as I know, it is impossible to use the BTS without getting spammed, because the only way to interact with it is via email, and every interaction with the BTS is published without redaction on the web. So, if you ever hope to receive updates, or want to monitor a bug, you are also going to get spam.

Do the emails from the BTS come from a consistent source? If so, it's not a good solution, but you could sign up with a unique alias that blackholes anything that isn't from the BTS.

joeyh

2 months ago

The command is `bts` in devscripts. I wrote it in 2001.

noirscape

2 months ago

The spam issue is probably one of the stronger arguments against email centered design for bug trackers, code forges and the like. It's a bit crazy that in order to professionally participate in modern software development, you're inherently agreeing that every spammer with a bridge to sell you is going to be able to send you unsollicited spam.

There's a reason most code forges offer you a fake email that will also be considered as "your identity" for the forge these days.

shmerl

2 months ago

It's just annoyingly clunky to use any time I need to interact with it, versus modern bug trackers like GitLab's and etc.

Also, locally patching reportbug to support XDG base directory spec is a chore (since maintainers didn't accept the fix for it for years).

thesnide

2 months ago

to be fair, it fits my exact needs. and without common javacript bloat.

so kudos to its authors

fanf2

2 months ago

Ian Jackson (the author of this article) also wrote debbugs.

gorgoiler

2 months ago

Forge != repository is a good design pattern. If the two are separate you can even use multiple forges per repository.

Perhaps you might host hardware designs or art assets that benefit from one kind of forge, alongside code that benefits from another? Or more simply use one forge for CI and another for code review.

hu3

2 months ago

https://archive.ph/vp6rp

jancsika

2 months ago

> The canonical git format is “patches applied”.

How many Debian packages have patches applied to upstream?

rurban

2 months ago

Most, because Debian is the only distro which strictly enforces their manpages and filesystem standards. And most source packages don't care much, resp. have other ideas

dima55

2 months ago

Lots. Because many upstream projects don't have their build system set up to work within a distribution (to get dependencies form the system and to install to standard places). All distros must patch things to get them to work.

latchup

2 months ago

Well, there are big differences in how aggressively things are patched. Arch Linux makes a point to strictly minimize patches and avoid them entirely whenever possible. That's a good thing, because otherwise, nonsense like the Xscreensaver situation ensues, where the original developers aggressively reject distro packages for mutilating their work and/or forcing old and buggy versions on unsuspecting users.

dima55

2 months ago

Huh? I contribute to Debian; I don't aggressively patch anything. You can too.

lionkor

2 months ago

It's "let's patch as little as possible" vs "let's enforce our rules with the smallest patch possible"

latchup

2 months ago

Well good for you. Then I suppose you don't speak for the Debian maintainers responsible for trainwrecks like this:

https://research.swtch.com/openssl

There seems to be a serious issue with Debian (and by extension, the tens of distros based on it) having no respect whatsoever for the developers of the software their OS is based on, which ends up hurting users the most. Not sure why they cannot just be respectful, but I am afraid they are shoveling Debian's grave, as people are abandoning stale and broken Debian-based distros in droves.

lelanthran

2 months ago

> nonsense like the Xscreensaver situation ensues, where the original developers aggressively reject distro packages

I didn't know about this. Link?

latchup

2 months ago

https://www.jwz.org/blog/2016/04/i-would-like-debian-to-stop...

and

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=819703#158

Needless to say, Zawinski was more than a little frustrated with how the Debian maintainers do things.

But honestly, this took 30 seconds to Google and was highly publicized at the time. This whole "I never heard of this, link??" approach to defend a lost argument when the point made is easily verifiable serves to do nothing but detract from discussion. Which, you know, is what this place is for.

lelanthran

2 months ago

I wasn't defending anything; searching for xscreensaver debian debacle yielded links that might or might not have been what you were referring to, They did not, however, yield a link to the JWZ site.

I genuinely wanted to know what this was about.

dspillett

2 months ago

A fair few I expect, amongst actively developed apps/utils/libs. Away from sid (unstable) Debian packages are often a bit behind upstream but still supported, so security fixes are often back-ported if the upstream project isn't also maintaining older releases that happen to match the version(s) in testing/stable/oldstable.

shevy-java

2 months ago

Debian is kind of slow in adapting to the modern world.

I kind of appreciate that debian put FOSS at a core value very early on; in fact, it was the first distribution I used that forced me to learn the commandline. The xorg-server or rather X11 server back then was not working so I only had the commandline, and a lean debian handbook. I typed in the commands and learned from that. Before this I had SUSE and it had a much thicker book, with a fancypants GUI - and it was utterly useless. But that was in 2005 or so.

Now, in 2025, I have not used debian or any debian based distribution in a long time. I either compile from source loosely inspired by LFS/BLFS; or I may use Manjaro typically these days, simply because it is the closest to a modern slackware variant (despite systemd; slackware I used for a long time, but sadly it slowed down too much in the last 10 years, even with modern variants such as alienbob's slackware variant - manjaro moves forward like 100x faster and it also works at the same time, including when I want to compile from source; for some reason, many older distributions failed to adapt to the modern era. Systemd may be one barrier here, but the issue is much more fundamental than that. For instance, you have many more packages now, and many things take longer to compile, e. g. LLVM and what not, which in turn is needed for mesa, then we have cmake, meson/ninja and so forth. A lot more software to handle nowadays).

IshKebab

2 months ago

> Debian is kind of slow in adapting to the modern world.

Yeah definitely. I guess this is a result of their weird idea that they have to own the entire world. Every bit of open source Linux software ever made must be in Debian.

If you have to upgrade the entire world it's going to take a while...

rilindo

2 months ago

I always thought that Debian is already on git, so this confused me. How is source control currently (or was) done with the Debian project?

Sesse__

2 months ago

The short answer is that it's not.

The longer answer is that a lot of people already use Git for Debian version control, and the article expands on how this will be better-integrated in the future. But what goes into the archive (for building) is fundamentally just a source package with a version number. There's a changelog, but you're free to lie in it if you so wish.

trebligdivad

2 months ago

This is great; I hate fighting distro source tools when I want to debug something.

throwaway7356

2 months ago

This just adds a new tool though.

Obligatory XKCD reference: https://xkcd.com/927/

danudey

2 months ago

New tools during the transition, hopefully fewer tools in the long run. Also things making a lot more sense in the long run.

cylemons

2 months ago

How do derivatives like Ubuntu build their packages compared to Debian?