The perils of transition to 64-bit time_t

219 points, posted 15 hours ago
by todsacerdoti

165 Comments

eschaton

10 hours ago

The way this was handled on Mac OS X for `off_t` and `ino_t` might provide some insight: The existing calls and structures using the types retained their behavior, new calls and types with `64` suffixes were added, and you could use a preprocessor macro to choose which calls and structs were actually referenced—but they were hardly ever used directly.

Instead, the OS and its SDK are versioned, and at build time you can also specify the earliest OS version your compiled binary needs to run on. Using this, the headers ensure the proper macros are selected automatically. (This is the same mechanism by which new- or deprecated-in-some-version annotations get set to enable weak linking for a symbol or to generate warnings for it, respectively.)

And it was all handled initially via the preprocessor, though now the compilers have a much more sophisticated understanding of what Apple refers to as “API availability.” So it should be feasible to use the same mechanisms on any other platform too.
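
Roughly, the shape of it is something like this hypothetical header (made-up names like `MYLIB_MIN_REQUIRED` and `my_seek*`, not Apple's actual macros):

    /* mylib.h -- hypothetical header; illustrates the pattern, not Apple's real macros */
    #include <stdint.h>

    typedef int32_t my_off32_t;
    typedef int64_t my_off64_t;

    int my_seek32(int fd, my_off32_t offset);   /* original call, ABI unchanged  */
    int my_seek64(int fd, my_off64_t offset);   /* added alongside, "64" suffix  */

    /* MYLIB_MIN_REQUIRED stands in for "earliest OS version this binary must
       run on"; in the real thing it is derived from the SDK/deployment target. */
    #if defined(MYLIB_MIN_REQUIRED) && MYLIB_MIN_REQUIRED >= 1050
    typedef my_off64_t my_off_t;
    #define my_seek my_seek64   /* source says my_seek, object references my_seek64 */
    #else
    typedef my_off32_t my_off_t;
    #define my_seek my_seek32
    #endif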

AshamedCaptain

8 hours ago

This does not solve the main issue as explained by TFA, which is that applications built against different "compiled OS versions" can no longer link with each other. Your application X, which declares that it runs on OS v.B, cannot link anymore with application Y, which declares that it runs on OS v.A, even when both are running under OS v.B.

In fact, what you describe is basically what every platform is doing... as doing anything else would immediately break compatibility with all current binaries.

kazinator

6 hours ago

The problem is that secondary libraries that are not glibc will not have multiply defined functions for different sizes of off_t, and will not have the switch in their header file to transparently select the correct set of functions based on what the client program wants.

Yet, somehow the article emphasizes that this is more of a problem for time_t than off_t.

This is believable, and it's probably because time_t is more pervasive. Whereas off_t is a POSIX thing involved in a relatively small number of interfaces, time_t is ISO C and is all over the place.

On top of everything, lots of C code assumes that time_t is an integer type, equal in width to int. A similar assumption for off_t is less common.

_kst_

35 minutes ago

Code that assumes time_t is the same width as int is already broken, and won't work on typical 64-bit systems where int is 32 bits and time_t is 64 bits.

In any case, I'm not sure I've ever seen any such code.

eschaton

4 hours ago

Since the different sizes are ultimately a platform thing, they need to support multiple variants on those platforms, or limit their support to a new enough base set of APIs that they can rely on 64-bit types being there.

38

6 hours ago

your comment has the tone of this being an elegant solution, but it reads to me like an awful hack. typeless macros are a nightmare I never want to deal with again.

eschaton

5 hours ago

These are very straightforward and do in fact serve as an elegant solution.

smackeyacky

5 hours ago

Elegant maybe in the 1980s before compilers got properly complicated. Now they’re a weird anachronism.

I’m old enough to have worked on C++ when it was a precompiler called cfront, and let me tell you, trying to work out whether your macro was doing something weird or the cfront processor did something weird was very frustrating. I swore off macros after that.

eschaton

4 hours ago

This is a case where you’re seeing the word “macro” and reacting to that when it’s really not warranted. It’s not using macros for complicated code generation, just selecting from a couple alternatives based on one or two possible compiler arguments.

I’m also old enough to remember CFront and this isn’t that.

db48x

9 hours ago

Lol, that only works if you can force everyone on the platform to go along with it. It is a nice solution, but it requires you to control the c library. Gentoo doesn’t control what libc does; that’s either GNU libc or MUSL or some other thing that the user wants to use.

eschaton

9 hours ago

It’s entirely opt-in, Apple doesn’t force it. If you just do `cc mything.c -o mything` you get a binary whose minimum required OS is the version of the SDK you built it against, just as with any other UNIX-like OS. It’s just giving the developer the option to build something they know will run on an earlier version too.

And since it was all initially done with the preprocessor rather than adding knowledge to the compilers, there’s no reason individual libraries can’t handle API versioning in exactly this way, including things like differing `ino_t` and `off_t` sizes.

dmitrygr

9 hours ago

> requires you to control the c library

Which is why basically every other sane operating system does that: BSDs, macOS, WinNT. Having the stable boundary be the kernel system call interface is fucking insane. And somehow the Linux userspace people keep failing to learn this lesson, no matter how many times they get clobbered in the face by the consequences of not learning it.

clhodapp

9 hours ago

Making the stable boundary be C headers is insane!

It means that there's not actually any sort of ABI, only a C source API.

kazinator

6 hours ago

Headers can do the magic to select the right ABI more or less transparently, based on preprocessor symbols which indicate the selection: is a certain type 32-bit or 64-bit.

This is similar to what Microsoft did in Win32 with the W and A functions (wide char and ascii/ansi). You just call MessageBox and the header file will map that to MessageBoxA or MessageBoxW based on whether you are a UNICODE build.

The character element type in the strings accepted by MessageBox is TCHAR. That can either be CHAR or WCHAR.

Once the code is compiled, it depends on the appropriate ABI: MessageBoxA or MessageBoxW; MessageBox doesn't exist; it is a source-level fiction, so to speak.
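
Stripped down, the header trick looks roughly like this (a simplified sketch, not the actual windows.h text):

    /* Simplified sketch of the Win32 A/W pattern. */
    #include <wchar.h>

    int MessageBoxA(void *hwnd, const char    *text, const char    *caption, unsigned type);
    int MessageBoxW(void *hwnd, const wchar_t *text, const wchar_t *caption, unsigned type);

    #ifdef UNICODE
    typedef wchar_t TCHAR;
    #define MessageBox MessageBoxW
    #else
    typedef char TCHAR;
    #define MessageBox MessageBoxA
    #endif

    /* Source calls MessageBox(...); the compiled object only ever references
       MessageBoxA or MessageBoxW. "MessageBox" never exists at the ABI level. */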

westurner

6 hours ago

cibuildwheel builds manylinux packages for glibc>= and musl because of this ABI.

manylinux: https://github.com/pypa/manylinux :

> Python wheels that work on any linux (almost)

Static Go binaries that make direct syscalls and do not depend upon libc or musl run within very minimal containers.

Fuchsia's syscall docs are nice too; Linux plus additional syscalls.

AshamedCaptain

8 hours ago

> no matter how many times they get clobbered in the face by the consequences of not learning it.

While I do think that the boundary should be set at the libc level just out of design cleanliness, I fail to see what the "consequence" of not doing so is. You're just changing the instruction binaries use for syscalls from a relative jump to a trap (or whatever), but you still have all the problems with data type sizes, function "versions" and the like which are what this discussion is about.

dmitrygr

8 hours ago

> I fail to see what the "consequence" of not doing so is.

TFA?

AshamedCaptain

7 hours ago

No. How would it benefit TFA _at all_?

For all practical purposes Linux currently has _two_ stable boundaries, libc (glibc) and kernel. If you move it so that the stable boundary is only the kernel, you still have this problem. If you move it so that the stable boundary is only libc, you still have this problem.

In fact, TFA's problem comes from applications passing time_ts around, which is strictly a userspace problem; the syscall interface is almost entirely orthogonal. Heck, the 32-bit glibc time_t functions probably use the 64-bit time_t syscalls these days...

asveikau

8 hours ago

> WinNT

This isn't true. There is an msvcrt in the OS, but it's mainly there for binaries that are part of Windows. The CRT is released as part of Visual Studio, out of band from the Windows release schedule.

Although the CRT's place in the layering is a little different, because so many things talk directly to Windows APIs.

dmitrygr

8 hours ago

IIRC, the CRT in Windows makes NO system calls. Those are all in ntdll.dll, and in ntdll.dll ONLY.

asveikau

8 hours ago

That's just a legacy of the strategy where NT native APIs were considered a private API, and you were meant to code against Win32 instead, to target both NT and 9x.

Syscalls are in ntdll; layered above that is kernel32 (today kernelbase), then most user-mode code including the CRT is above that. In 9x, kernel32 was the syscall layer; in NT, it's a set of user-mode shims above ntdll.

That's for most things. Things like gdi have kernel entry points too afaik.

Anyway my point is that the C library is developed out of band from that.

db48x

9 hours ago

It’s extra work, but I don’t know that it is necessarily insane. If libc was under the complete control of the kernel developers, then that gives other languages fewer options. Go famously (or infamously) uses certain syscalls without going through libc, for example. Sometimes the choices made for the C library just aren’t compatible with other languages. Frankly the C library, as it exists today, is insane. Maybe the solution is to split it in half: one for the syscalls and another for things like strings and environment variables and locales and all the other junk.

mort96

8 hours ago

Splitting libc into one "stable kernel API" part and one "library with things which are useful to C programmers" part would honestly make a ton of sense. OSes which declare their syscall interface to be unstable would be more convincing if they actually provided a stable API that's intended for other consumers than C.

AshamedCaptain

7 hours ago

Frankly I don't see the point of complaining that libc is not useful for non-C consumers. Sure, there are some ancillary functions in libc you'll likely never call. But what's the issue with making syscalls through libc? Again, the only difference is what the very last call instruction is. If your runtime can marshal arguments for the kernel, then it surely can marshal arguments for libc. They are almost always practically the same ABI.

And you can avoid the need for VDSO-like hacks which is basically the kernel exposing a mini-libc to userspace.

eschaton

9 hours ago

And yet it would still work out for Linux if musl, glibc, et al just adopted `API_VERSION_MIN` and `API_VERSION_MAX` macros themselves, it doesn’t actually have to be handled entirely at the `-isysroot` level.

codys

13 hours ago

There are a few options for Gentoo not discussed in the post, possibly because for Gentoo they would be a larger amount of work due to the design of their system:

1. Allow building against packages without installing them. The core issue here is that Gentoo package build and install happen as a single step: one can't "build a bunch of things that depend on one another" and then "atomically install all the built items into place". This means that Gentoo can easily be partially broken while doing updates that involve an ABI change (modulo `.so` versioning, see next option). This 64-bit time_t issue is an example of an ABI change that folks are very aware of and that is very widespread. It's also an example of an ABI change that isn't handled by the normal `.so` versioning scheme (see next option).

2. Extend the normal `.so` versioning to capture changes to the ABI of packages caused by packages they depend on. Normally, every `.so` (shared object/library) embeds a version number within it, and is also installed with that version number in the file name (`libfoo.so.1.0.0`, for example, would be the real `.so` file, and would have a symlink from `libfoo.so` to tell the linker which `.so` to use). This shared object version is normally managed by the package itself internally (iow: every package decides on their ABI version number to enable them to track their own internal ABI breakages). This allows Gentoo to upgrade without breaking everything while an update is going on as long as every package manages their `.so` version perfectly correctly (not a given, but does help in many cases). There is a process in Gentoo to remove old `.so.x.y.z` files that are no longer used after an install completes. What we'd need to do to support 64-bit time_t is add another component to this version that can be controlled by the inherited ABI of dependencies of each `.so`. This is very similar in result to the "use a different libdir" option from the post, but while it has the potential to set things up to enable the same kinds of ABI changes to be made in the future, it's likely that fixing this would be more invasive than using a different libdir.

akira2501

12 hours ago

> Allow building against packages without installing them.

A partial staged update system would work best. I should be able to schedule the build of several new packages, have them built into a sandbox, have new compilations use the sandbox first and then fall back to the system through a union, and then once everything is built, finally package up all the pieces and move them out of the sandbox and into the system at large.

You could make all gentoo updates transactional that way which would be a huge boon in many other ways.

aaronmdjones

12 hours ago

Gentoo already allows option 1 by specifying a directory to ultimately install the finished image to (normally it is set to /) [1]. One can do a complete rebuild of everything in @system and @world, install it to the specified subdirectory, and then sync it all over in one shot. Preferably you would do this from a live session, although in theory you could also bind-mount / onto a subdirectory of the place you reinstalled everything to, chroot into it, and then sync it (what is now /) into the bind-mounted real upper /.

[1] https://devmanual.gentoo.org/ebuild-writing/variables/#root

codys

11 hours ago

It's true that Gentoo has some pieces of what they need to build an implementation of option 1 (build packages that depend on other packages before completing an install), but it's currently the case that that is not what happens when one runs `emerge` (the Gentoo package building/installing tool); as you've noted, one would need to write scripts that wrap emerge (or do the same work manually) to attempt to accomplish this today.

I suspect that using bind mounts and overlays (to allow the "building packages" chroot a view of the "installed" root) could be used to accomplish this, or alternately some filesystem snapshotting features if we're thinking about this from the "external to emerge" angle. (It's my understanding that for Gentoo to do this ABI change, though, they need some solution integrated into emerge).

To some extent, this kind of potential model also reminds me of systems that integrate their package updating with filesystem snapshotting to allow rollbacks and actually-atomic upgrades of many files. I think one or more of the solaris distributions did this?

aaronmdjones

8 hours ago

> but it's currently the case that that is not what happens when one runs `emerge` (the gentoo packaging build/installing tool)

What I was getting at is that this would be a simple matter of changing /etc/portage/make.profile to point to a profile that uses the new ABI and setting (and exporting) the ROOT variable in the shell you type "emerge" into (or adding a ROOT=/whatever/ line to /etc/portage/make.conf). There's no need to write any wrappers.

codys

7 hours ago

Simply changing `ROOT` doesn't magically enable us to have `emerge` update our normal `ROOT=/`. There's extra steps here (the syncing, the bind mounting, etc) for the `emerge` with a custom `ROOT` to do an update to `/` itself.

And that's if we have a `ROOT` that allows us to source dependencies (libraries to link against, includes, etc). Currently it's unclear to me how exactly `ROOT` works, probably meaning there needs to be a bit more documentation there and potentially indicating that it might not be used much right now.

aaronmdjones

6 hours ago

> There's extra steps here (the syncing, the bind mounting, etc) for the `emerge` with a custom `ROOT` to do an update to `/` itself.

Those extra steps are handled by telling people what to do, in news items. This happened for example in the 17.1 -> 23.0 profile migration [1]; emerge wasn't updated to help people do this in any way whatsoever, users are expected to do everything themselves. This is the Gentoo way.

> And that's if we had a `ROOT` that allowed us to source dependencies (libraries to link against, includes, etc).

You don't need to source anything in ROOT because everything you're rebuilding is already in /.

If you intended to build a program that depended upon a library you did not already have (in /), you're right, this wouldn't work. That isn't the case here. The dynamic linker (ld-linux.so or whatever you fancy, the ELF interpreter, the thing that runs before main(), the thing that will look at the list of dependent libraries, try to load them, and balk at any mismatches) isn't executed until the program is, and setting ROOT merely tells portage where to put them; it doesn't execute them. This I alluded to in the other comment thread about building a userland that your CPU can't even execute.

[1] https://www.gentoo.org/support/news-items/2024-03-22-new-23-...

codys

6 hours ago

Huh. If it's OK to have instructions like those for users to run through, then I guess I don't understand what's holding up switching to 64-bit time_t for gentoo. Seems like they could make a new profile and tell folks to deal with it. Perhaps I don't understand what the requirements are here, which might indicate that the requirements aren't clear for others too.

rini17

9 hours ago

The third paragraph says the following, which means it's not a suitable solution for building the whole system:

When building a package, ROOT should not be used to satisfy the required dependencies on libraries, headers files etc. Instead, the files on the build system should be specified using /.

aaronmdjones

8 hours ago

Yes, that means it will look in the host system (/) for those dependencies. But when you're also recompiling and installing those dependencies to /whatever/ as well anyway, and /whatever/ is eventually going to end up being /, this is fine. The original / doesn't get touched until everything is done, and /whatever/ is all built to the new ABI, which it can be precisely because nothing in /whatever/ is getting consulted or used during the rebuild process.

codys

7 hours ago

No, what rini17 pointed out is a problem here, and I think my previous comments that setting ROOT might be "part of a solution" are less accurate (as it seems `ROOT` is even less of a complete solution here).

Lets consider this hypothetical representation of the 64-bit time issue:

In `ROOT=/`, we have a `libA.so` with 32-bit time_t, and a `libB.so` with 32-bit time_t that depends on `libA` (ie: `libB.so` -> `libA.so`).

We rebuild package `A` with a special root (`ROOT=/tmp/new`) which has 64-bit time_t. We now have a `/tmp/new` with `libA.so` built for 64-bit time_t. We now want to build `B` for `ROOT=/tmp/new` with 64-bit time_t and have it link against the new `/tmp/new/lib/libA.so`.

But setting `ROOT=/tmp/new` doesn't cause us to use `/tmp/new` as our source to link against. So (as it currently stands) we'll end up trying to link `libB.so` (64-bit time_t) against `/lib/libA.so` (32-bit time_t), and things just won't work.

If gentoo changes/extends ROOT (or adds something else like ROOT but used to locate dependencies for linking), then it could be a piece of a solution. But it sounds like it's pretty far from that in intention/practice right now.

aaronmdjones

7 hours ago

> But setting `ROOT=/tmp/new` doesn't cause us to use `/tmp/new` as our source to link against

That's a good thing, because it won't be /tmp/new/ when we're finished.

> So (as it currently stands) we'll end up trying to link `libB.so` (64-bit time_t) against `/lib/libA.so` (32-bit time_t), and things just won't work.

It won't work until /tmp/new/ becomes /, which it doesn't have to until you're done.

Remember that this is just for building and installing the software. It doesn't get executed which is precisely what you want because it wouldn't work properly otherwise, as you point out.

You can even use this method to build a complete userland for an instruction set architecture that your CPU doesn't even support. All you need do is also set CHOST to the tuple identifying the target architecture in addition to setting ROOT for where it should be installed [1]. You just wouldn't be able to chroot into it obviously.

[1] https://wiki.gentoo.org/wiki/Embedded_Handbook/General/Intro...

codys

6 hours ago

Before getting into the details here about why we need to read `libA.so` at build time: perhaps the documentation Gentoo provides about `ROOT` (the one rini17 noted) is incorrect/misleading, `ROOT` actually is used as the source of build-time dependencies automatically, and the documentation was intended for ebuild writers rather than emerge users? (In which case, Gentoo should fix/clarify it.)

"link" has 2 meanings:

1. As you've pointed out, at runtime the dynamic linker reads the library files.

2. Additionally, building software with `gcc`/etc, we tell it what libraries it needs to link against (`-lA`). When we do that, `gcc` as part of the build process searches for `libA.so` (in various locations dependent on its configuration and arguments), reads the `libA.so` file, and uses that info to determine what to emit in the object file that `gcc` is building. So we need to read a shared library. The same is true for headers used at build time. How much this actually ends up mattering varies depending on architectural details, and whether libraries have different symbols exposed (or generate different header files) depending on what the bit-width of time_t is. It may well be that for most packages we could have the build-time linking occur against a 32-bit time_t object as long as runtime linking is done to the 64-bit time_t object (assuming one is using identical package versions or at least versions with a compatible ABI).

aaronmdjones

6 hours ago

> and whether libraries have different symbols exposed (or generate different header files) depending on what the bit-width of time_t is

Hmm. You have a good point there; I hadn't thought about that...

Perhaps they can take an existing stage3 with 32-bit time_t, build a new stage1 through 3 with a 64-bit time_t, and compare the symbols and header files between them to identify any potential trouble spots. This would help them regardless of what migration course of action they decide upon.

Any software people have installed that isn't included in the stage3 (e.g. desktop window managers and everything else on top) wouldn't be included here and so is a potential pain point, but everything in @system would be, so at that point you can just rebuild @world again to fix everything.

seanhunter

an hour ago

In the original BSD man pages, the "Bugs" section for "tunefs" had the famous joke "You can tune a file system, but you can't tune a fish." But according to "Expert C Programming"[1], the source code for this manpage had a comment next to the joke saying:

   > Take this out and a UNIX Demon will dog your steps from now
   > until the time_t's wrap around. 
Obviously, back in the 70s when that was written, 2038 seemed unimaginably far in the future.

[1] https://progforperf.github.io/Expert_C_Programming.pdf

grantla

12 hours ago

> A number of other distributions such as Debian have taken the leap and switched. Unfortunately, source-based distributions such as Gentoo don’t have it that easy.

For Debian it was extremely painful. A few people probably burned out. Lots of people pointed to source-based distributions and said "they will have it very easy".

Denvercoder9

10 hours ago

> For Debian it was extremely painful.

Do you have any references that elaborate on that? From an outsider perspective, the time64 transition in Debian seemed to have been relatively uncontroversial and smooth. Way better than e.g. the /usr-merge.

ajsnigrutin

9 hours ago

> For Debian it was extremely painful. A few people probably burned out. Lots of people pointed to source-based distributions and said "they will have it very easy".

'easy' in a way "just tell the user to rebuild everything in one go"? :)

kccqzy

14 hours ago

My biggest takeaway (and perhaps besides-the-point) is this:

> Musl has already switched to that, glibc supports it as an option. A number of other distributions such as Debian have taken the leap and switched. Unfortunately, source-based distributions such as Gentoo don’t have it that easy.

While I applaud their efforts I just think as a user I want to be over and done with this problem by switching to a non source-based distribution such as Debian.

ordu

13 hours ago

There is an easy way to deal with this in Gentoo. Boot from USB or something, run mkfs.ext4 (or mkfs.whatever-fs-you-use) on your / and /usr partitions, mount them, unpack stage3 onto them, chroot into them, and run `emerge $all-my-packages-that-were-installed-before-mkfs`.

You can install a new copy of Gentoo instead of upgrading it incrementally.

robin_reala

13 hours ago

“easy”

ordu

13 hours ago

It is easy. It is just a tl;dr version of "how to install gentoo". :)

lnxg33k1

13 hours ago

I hope I will get a day where I have the will to recompile everything in the next 14 years; I think I've had the same install for the past 10 or so on the Gentoo desktop. The only thing I reinstalled recently has been the laptop, to switch from Arch to sid.

wtallis

14 hours ago

It sounds like the difficulty for source-based distributions comes from trying to do an in-place upgrade that makes incompatible changes to the ABI. So changing to an entirely different distribution would be at least as disruptive (though possibly less time consuming) as doing a clean install of Gentoo using a new ABI.

akira2501

12 hours ago

Starting a few years ago I partition all my drives with two root partitions. One is used and the other is blank. For precisely this reason. Sometimes it's easier to just roll out a brand new stage3 onto the unused partition, build an entirely new root, and then just move over to that once it's finished.

The bonus is you can build your new system from your old system using just chroot. Very convenient.

viraptor

10 hours ago

> by switching to a non source-based distribution such as Debian.

The distinction has more nuance. Source-based distros like NixOS don't have the same issue. The problem is more in how Gentoo builds/installs the packages than in building from source.

Also with third party closed source software, you're still going to have issues, even on binary systems. Actually you could even have issues with the first party packages if they're installed in separate independent steps.

mhandley

9 hours ago

"Let’s consider a trivial example:

  struct {
      int a;
      time_t b;
      int c;
  };
With 32-bit time_t, the offset of c is 8. With the 64-bit type, it’s 12."

Surely it's 16 not 12, as b needs to be 64-bit aligned, so padding is added between a and b? Which also kind of makes the point the author is trying to make stronger.

AshamedCaptain

7 hours ago

Most x86 ABIs do not mandate padding for 64-bit types, due to the lack of 64-bit loads (at the time).
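
This is easy to check with stand-in types; a small sketch whose output depends on the ABI it's compiled for:

    #include <stdio.h>
    #include <stddef.h>
    #include <stdint.h>

    struct t32 { int a; int32_t b; int c; };   /* stand-in for 32-bit time_t */
    struct t64 { int a; int64_t b; int c; };   /* stand-in for 64-bit time_t */

    int main(void) {
        printf("32-bit b: offsetof(c)=%zu sizeof=%zu\n",
               offsetof(struct t32, c), sizeof(struct t32));
        printf("64-bit b: offsetof(c)=%zu sizeof=%zu\n",
               offsetof(struct t64, c), sizeof(struct t64));
        /* i386 System V (int64_t aligned to 4 inside structs): 8/12, then 12/16.
           x86-64 and most 64-bit ABIs (aligned to 8):           8/12, then 16/24. */
        return 0;
    }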

nobluster

13 hours ago

For a large legacy 32 bit unix system dealing with forward dates I replaced all the signed 32 bit time_t libc functions with unsigned 32 bit time_t equivalents. This bought the system another 68 years beyond 2038 - long after I'll be gone. The downside is that it cannot represent dates before the unix epoch, 1970, but as it was a scheduling system it wasn't an issue.

If legacy dates were a concern one could shift the epoch by a couple of decades, or even reduce the time granularity from 1 second to 2 seconds. Each alternative has subtle problems of its own. It depends on the use case.
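
To make the cutoffs concrete, here is a quick sketch (run on a system whose time_t is already 64-bit):

    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    /* Prints the last representable instants for signed vs unsigned 32-bit
       seconds-since-1970. */
    int main(void) {
        time_t lim[2] = { (time_t)INT32_MAX, (time_t)UINT32_MAX };
        const char *label[2] = { "signed 32-bit  ", "unsigned 32-bit" };
        char buf[32];
        for (int i = 0; i < 2; i++) {
            strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", gmtime(&lim[i]));
            printf("%s max: %s UTC\n", label[i], buf);   /* 2038-01-19 / 2106-02-07 */
        }
        return 0;
    }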

suprjami

9 hours ago

If you can change the whole system from signed to unsigned, why not change to 64-bit?

Dylan16807

2 hours ago

The main problem here is gradual upgrades. Signed and unsigned code act exactly the same on dates before 2038, so you can upgrade piece by piece at your leisure.

The suggestions for messing with epoch or the timescale wouldn't work nearly as well, for those just switch to 64 bit.

n_plus_1_acc

14 hours ago

I'm no expert on C, but I was under the impression that type aliases like off_t are introduced to have the possibility of changing them later. This clearly doesn't work. Am I wrong?

wtallis

14 hours ago

Source vs binary compatibility. Using typedefs like off_t means you usually don't have to rewrite code, but you do have to re-compile everything that uses that type.

rblatz

12 hours ago

But isn’t the point of a source only distribution like Gentoo, to build everything yourself? Who is running gentoo but also lugging around old precompiled stuff they don’t have the source for?

mananaysiempre

11 hours ago

As the post describes, the problem is that on Gentoo you can’t really build everything then switch binaries for everything, or at least that’s not what happens when you update things the usual way.

Instead, dependencies are built and then installed, then dependents are built against the installed versions and then installed, etc. In the middle of this process, the system can technically be half-broken, because it could be attempting to run older dependents against newer dependencies. Usually this is not a real problem. Because of how pervasive time_t is, though, the half-broken state here is potentially very broken to the point that you may be unable to resume the update process if anything breaks for any reason.

ordu

14 hours ago

It kinda works, but not in a source-based distro. If you can atomically rebuild @world with a changed definition of off_t, then there will be no problem. But a source-based distro doesn't rebuild @world atomically. It rebuilds one package at a time, so there would be inconveniences like libc.so having 64-bit off_t while gcc was built for 32-bit off_t, so gcc stops working. Or maybe bash, coreutils, make, binutils or any other package that is needed for rebuilding @world. At this point you are stuck.

So such upgrade needs care.

jasomill

13 hours ago

As someone whose nearest exposure to a "source based distro" is FreeBSD, this sounds like madness, as it means a broken build could not only impair attempts to repair the build, but render the system unbootable and/or unusable.

And as someone who regularly uses traditional Linux distros like Debian and Fedora, the idea that a package management system would allow a set of packages known to be incompatible with one another to be installed without force, or that core package maintainers would knowingly specify incorrect requirements, is terrifying.

While I'm not familiar with Gentoo, my reading of this article suggests that its maintainers are well aware of these sorts of problems and that Gentoo does not, in fact, suffer from them (intentionally; inevitably mistakes happen just as they occasionally do on the bleeding edge of FreeBSD and other Linux distros).

AshamedCaptain

8 hours ago

FreeBSD does not use package management for the core system (i.e. freebsd itself, not ports). Gentoo does.

Athas

13 hours ago

Why not essentially treat it as a cross compilation scenario? NixOS is also source based, but I don't think such a migration would be particularly difficult. You'd use the 32-bit off_t gcc to compile a glibc with 64-bit off_t, then compile a 64-bit off_t gcc linked against that new glibc, and so on. The host compiler shouldn't matter.

I always understood the challenge as binary compatibility, when you can't just switch the entire world at once.

codys

13 hours ago

NixOS has it easier here because they don't require packages to be "installed" before building code against them. For Gentoo, none of their build scripts (ebuilds) are written to support that. It's plausible that they might change the ebuild machinery so that this kind of build (against non-installed packages) could work, but it would need investigation and might be a difficult lift to get it working for all packages.

"treat it as a cross compilation scenario" is essentially what the post discusses when they mention "use a different CHOST". A CHOST is a unique name identifying a system configuration, like "x86_64-unknown-linux-gnu" (etc). Gentoo treats building for different CHOSTs as cross compiling.

andrewaylett

12 hours ago

NixOS isn't the same kind of source-based. At some level, even Debian could be said to be source based: there's nothing stopping you from deciding to build every package from source before installing it, and obviously the packages are themselves built from source at some point.

NixOS sits between Debian and Gentoo, as it maintains an output that's capable of existing independently of the rest of the system (like Debian) but is designed to use the current host as a builder (like Gentoo). Gentoo doesn't have any way to keep individual builds separate from the system as a whole, as intimated in the article, so you need to work out how to keep the two worlds separate while you do the build.

I think what they're suggesting winds up being pretty similar to what you suggest, just with the right plumbing to make it work in a Gentoo system. NixOS would need different plumbing, I'm not sure whether they've done it yet or how but I can easily imagine it being more straightforward than what Gentoo is needing to do.

tadfisher

11 hours ago

There absolutely will be problems with different Nix profiles that aren't updated together; for example, if you update some packages installed in your user's profile but not the running system profile. But this is common enough with other glibc ABI breakage that folks tend to update home and system profiles together, or know that they need to reboot.

Where it will be hell is running Nix-built packages on a non-NixOS system with a non-ABI-compatible glibc. That is something that desperately needs fixing on the glibc side, mostly stemming from the design of nss and networking, which prevent linking against glibc statically.

oasisaimlessly

8 hours ago

> Where it will be hell is running Nix-built packages on a non-NixOS system with non-ABI-compatible glibc.

This isn't a thing. Nix-built binaries all each use the hardcoded glibc they were built with. You can have any number of glibc's simultaneously in use.

codys

14 hours ago

> there would be inconveniences like libc.so has 64-bit off_t

glibc specifically has support for the 32-bit and 64-bit time_t abi simultaneously.

From the post:

> What’s important here is that a single glibc build remains compatible with all three variants. However, libraries that use these types in their API are not.

ordu

13 hours ago

Yeah, glibc was a bad example. Probably libz.so or something would be better.

codys

13 hours ago

Yep, agreed. Though it does expose an option here, which would be "have glibc provide a mechanism for other libraries (likely more core/widely used libs) to support both ABIs simultaneously".

Presumably if that had been done in glibc when 64-bit time_t support was added, we could have had multi-size-time ABI support in things like zlib by now. Seems like a mistake on glibc's part not to create that initially (years ago).

Though if other distros have already switched, I'd posit that perhaps Gentoo needs to rethink its design a bit so it doesn't run into this issue instead.

o11c

12 hours ago

> have glibc provide a mechanism for other libraries (likely more core/widely used libs) to support both ABIs simultaneously

The linked article is wrong to imply this isn't possible - and it really doesn't depend on glibc providing a mechanism. All you have to do is:

* Compile the library itself with traditional time_t (32-bit or 64-bit depending on platform), but convert and call time64 APIs internally.

* Make a copy of every public structure that embeds a time_t (directly or indirectly), using time64_t (or whatever contains it) instead.

* Make a copy of every function that takes any time-using type, to use the time64 types.

* In the public headers, check if everything is compiled with 64-bit time_t, and if so make all the traditional types/functions aliases for the time64 versions.

* Disable most of this on platforms that already used 64-bit time_t. Instead (for convenience of external callers) make all the time64 names aliases for the traditional names.

It's just that this is a lot of work, with little benefit to any particular library. The GCC 5 std::string transition is probably a better story than LFS, in particular the compiler-supported `abi_tag` to help detect errors (but I think that's only for C++, ugh - languages without room for automatic mangling suck).

(minor note: using typedefs rather than struct tags for your API makes this easier)
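
A hand-wavy sketch of the header end of this (the aliasing step), with invented names (`foo_event`, `foo_wait`) and glibc's `_TIME_BITS` as the selector:

    /* foo.h -- hypothetical library; only the header-aliasing step is shown. */
    #include <time.h>
    #include <stdint.h>

    typedef int64_t foo_time64_t;

    /* Traditional API: ABI frozen, keeps using whatever time_t the platform had. */
    struct foo_event     { time_t       when; int kind; };
    int foo_wait         (struct foo_event *out);

    /* time64 copies: always 64-bit regardless of the platform's time_t. */
    struct foo_event_t64 { foo_time64_t when; int kind; };
    int foo_wait_t64     (struct foo_event_t64 *out);

    /* If the consumer builds with 64-bit time_t (glibc: -D_TIME_BITS=64),
       alias the traditional names to the time64 variants so existing source
       recompiles against the wide ABI without edits. On platforms whose
       time_t was already 64-bit you'd do the reverse mapping instead.       */
    #if defined(_TIME_BITS) && _TIME_BITS == 64
    #define foo_event foo_event_t64
    #define foo_wait  foo_wait_t64
    #endif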

AshamedCaptain

6 hours ago

This is a nightmare to implement (unless you go library by library and do it by hand, or you have significant compiler help), as e.g. functions may not directly use a struct with time_t but indirectly assume something about the size of the struct.

Note that for example the std::string transition did not do this.

jasomill

12 hours ago

This sounds like something similar to NeXTSTEP/macOS fat binaries, only with the possibility of code sharing between "architectures".

I like it, though it sounds like something that'd be unlikely to see adoption in the Linux world without the endorsement of multiple major distros.

codys

7 hours ago

Fat binaries (packing together 2 entirely different objects and picking one of them) could potentially work here, though I suspect we'd run into issues of mixed 32-bit/64-bit time_t things.

Another option, closer to how things work with LFS (large file support, mostly in the past now), is to create different interfaces to support 64-bit time_t and pick defaults for time_t at compile time with a macro that picks the right impl.

Also possible could be having something like a version script (https://sourceware.org/binutils/docs/ld/VERSION.html) to tell the linker what symbols to use when 64-bit time_t was enabled. While this one might have some benefits, generally folks avoid using version scripts when possible, and would require changes in glibc, making it unlikely.

Both of those options (version script extension, and LFS-like pattern) could allow re-using the same binary (i.e. smaller file size, no need to build code twice in general), and potentially enable mixing 32-bit time_t and 64-bit time_t code together in a single executable (not desirable, but does remove weird link issues).

ArsenArsen

13 hours ago

this is just as much of a problem on binary distros mind you. just because there might be a smaller time delta does not mean that the problem is solved

codys

13 hours ago

NixOS is an example of a distro that is source-based and can do the atomic rebuild here, because it has a way to build packages against other packages that aren't "installed". In NixOS, this is because there are very few things that get "installed" as the single global thing to use. But one could imagine that Gentoo could build something that would allow them to at least build up one set of new packages without installing them and then install the files all at once.

jandrese

12 hours ago

It's only the first step of the puzzle. And arguably only a half step. As the article points out anytime that off_t is stuffed into a struct, or used in a function call, or integrated into a protocol, the abstraction is lost and the actual size matters. Mixing old and new code, either by loading a library or communicating over a protocol, means you get offsets wrong and things start crashing. Ultimately the changeover requires everybody to segregate their programs between "legacy" and "has been ported or at least looked over", which is incredibly painful.

smueller1234

14 hours ago

They make it easier, but just at a source code level. They're not a real (and certainly not full) abstraction. An example that makes it obvious: if you replace the underlying type with a floating point type, the semantics would change dramatically, fully visible to the user code.

With larger types that otherwise have similar semantics, you can still have breakage. A straightforward one would be padding in structs. Another one is that a lot of use cases convert pointers to integers and back, so if you change the underlying representation, that's guaranteed to break. Whether that's good or not is another question, but it's certainly not uncommon.

(Edit: sibling comments make the same point much more succinctly: ABI compatibility!)

jonathrg

14 hours ago

It does work, the problem with ABI changes is that when you change it, you have to change it everywhere at the same time. By default there's nothing stopping you from linking one library built using 32-bit off_t with another library built using 64-bit off_t, and the resulting behaviour can be incredibly unpredictable.

raldi

13 hours ago

I think the main reason they’re introduced is to provide a hint to the programmer and maybe some typesafety against accidentally passing, say, a file descriptor.

devit

11 hours ago

The C library uses macros so that the referenced symbols are different depending on ABI (e.g. open becomes open64, etc.), but most 3rd party libraries don't bother with that, so they break if their API uses time_t/off_t.
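
Simplified, the redirection boils down to an asm-label rename in the header. A sketch of the idea, not glibc's actual declarations (glibc routes this through its `__REDIRECT` macro):

    /* Symbol redirection via the GCC/Clang asm-label extension. */
    #include <sys/types.h>

    #if defined(_FILE_OFFSET_BITS) && _FILE_OFFSET_BITS == 64
    /* Source still says lseek(); the emitted object references lseek64. */
    extern off_t lseek(int fd, off_t offset, int whence) __asm__("lseek64");
    #else
    extern off_t lseek(int fd, off_t offset, int whence);
    #endif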

progbits

14 hours ago

Edit: nevermind others raise better points.

cataphract

13 hours ago

That's not the problem. If time_t was a struct with a single int32_t, you'd be in the same situation when you changed it to int64_t (ABI incompatibility: you need more space to store the value now).

In order not to have this problem you'd need the API to use opaque structs, for which only pointers would be exchanged.
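
I.e. something like this hypothetical API, where the caller never learns the size:

    /* Hypothetical opaque-timestamp API: callers only ever hold a pointer,
       so the library can change the underlying representation freely.      */

    /* --- public header --- */
    typedef struct foo_time foo_time;             /* incomplete type          */
    foo_time  *foo_time_now(void);
    long long  foo_time_unix_seconds(const foo_time *t);
    void       foo_time_free(foo_time *t);

    /* --- library implementation --- */
    #include <stdlib.h>
    #include <time.h>
    struct foo_time { time_t value; };            /* could become int64_t, a
                                                     struct timespec, anything */
    foo_time *foo_time_now(void) {
        foo_time *t = malloc(sizeof *t);
        if (t) t->value = time(NULL);
        return t;
    }
    long long foo_time_unix_seconds(const foo_time *t) { return (long long)t->value; }
    void foo_time_free(foo_time *t) { free(t); }

The cost is an allocation and an extra indirection per value, which is why C APIs rarely bother doing this for something as small as a timestamp.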

beached_whale

14 hours ago

at the source level it does, but when you have compiled libraries it breaks.

SkiFire13

14 hours ago

Yeah, the problem in general with type aliases is that they're just that, aliases, not proper types. This means that they leak all the details of the underlying type, and don't work for proper encapsulation, which is what's needed for being able to change what should be implementation details.

tsimionescu

11 hours ago

The problem here would be exactly the same regardless of what form this type took. The issue is fundamental to any low level language: the size of a type is a part of its public API (well, ABI), by definition. This is true in C++ with private fields as well, for example: if you add a private field to a class, or change the type of a private field with one of a different size, all code using that class has to be re-built. The only way to abstract this is to use types strictly through pointers, the way Java does.

darkhelmet

3 hours ago

Every time I see people struggling with this, I am so incredibly glad that I forced the issue for FreeBSD when I did the initial amd64 port. I got to set the fundamental types in the ABI and decided to look forward rather than backward.

The amd64 architecture did have some interesting features that made this much easier than it might have been for other cpu architectures. One of which was the automatic cast of 32 bit function arguments to 64 bit during the function call. In most cases, if you passed a 32 bit time integer to a function expecting a 64 bit time_t, it Just Worked(TM) during the platform bringup. This meant that a lot of the work on the loose ends could be deferred.

We did have some other 64 bit platforms at the time, but they did not have a 64 bit time_t. FreeBSD/amd64 was the first in its family, back in 2003/2004/2005. If I remember correctly, sparc64 migrated to 64 bit time_t.

The biggest problem that I faced was that (at the time) tzcode was not 64 bit safe. It used some algorithms in its 'struct tm' normalization that ended up in some rather degenerate conditions, eg: iteratively trying to calculate the day/month/year for time_t(2^62). IIRC, I cheated rather than change tzcode significantly, and made it simply fail for years before approx 1900, or after approx 10000. I am pretty sure that this has been fixed upstream in tzcode long ago.

We did have a few years of whackamole with occasional 32/64 time mixups where 3rd party code would be sloppy in its handling of int/long/time_t when handling data structures in files or on the network.

But for the most part, it was a non-issue for us. Being able to have 64 bit time_t on day 1 avoided most of the problem. Doing it from the start was easy. Linux missed a huge opportunity to do the same when it began its amd64/x86_64 port.

Aside: I did not finish 64 bit ino_t at the time. 32 bit inode numbers were heavily exposed in many, many places. Even on-disk file systems, directory structures in UFS, and many, many more. There was no practical way to handle it for FreeBSD/amd64 from the start while it was a low-tier platform without being massively disruptive to the other tier-1 architectures. I did the work - twice - but somebody else eventually finished it - and fixed a number of other unfortunately short constants as well (eg: mountpoint path lengths etc).

wpollock

12 hours ago

I'm probably naive, but I see another way forward. If all the t64 binaries are statically linked, they have no danger of mixing ABIs. After 2038, all the t32 code is broken anyway, so there's no additional risk in going back to dynamic linking then. I feel if this were a solution the author would have mentioned it, but I'm willing to look foolish to hear what others will say.

layer8

12 hours ago

Static linking is probably impractical for applying security updates, as often you’d have to recompile/relink basically the whole system. In some cases it could also significantly increase memory usage.

GeorgeTirebiter

11 hours ago

with fast networks, huge disks, fast processors --- it seems wasteful to me to even consider shared libraries. Shared libraries are a technology that was useful when we were memory starved. We are no longer memory starved. So you replace the static binary? Big deal, size is not an issue (for 99% of the cases) given what we have today.

Recall, too, that the "link" step of those .o files is actually a "link / edit" step, where routines in the libraries not used are not linked.

layer8

11 hours ago

It’s much more straightforward to ensure consistency with shared libraries, and not having to rebuild stuff. Wasting disk space, RAM, network bandwidth and processing time is what seems wasteful to me.

gotoeleven

10 hours ago

Are you by chance a javascript programmer?

jiggawatts

9 hours ago

Static linking is the root cause for “modern” apps based on Electron taking minutes to start up and be useful. They’re statically linking almost an entire operating system (Chromium), an entire web server, server runtime framework, and an equivalent client framework for good measure.

On the fastest PC that money can buy this is somewhere between “slow” and “molasses”.

I miss the good old days when useful GUI programs were mere kilobytes in size and launched instantly.

troad

6 hours ago

Considering the most important remaining use case for 32-bit is embedded systems, wouldn't static linking be a non-starter due to space and performance constraints?

panzi

10 hours ago

The only place where this is relevant for me is running old Windows games via wine. Wonder how wine is handling this? Might as well re-map the date for 32bit wine to the late 90s/early 2000s, where my games are from. Heck, with faketime I can do that already, but don't need it yet.

loeg

13 hours ago

The C standard does not require time_t to be signed (nor does POSIX). Just changing the 32-bit type to unsigned would (in some senses) extend the lifetime of the type out to 2106. You could at least avoid some classes of ABI breakage in this way. (On the other hand, Glibc has explicitly documented that its time_t is always signed. So they do not have this option.)

pwg

13 hours ago

While that avoids 2038 as a "drop dead" date for 32-bit time_t, it also removes the ability to represent any date/time prior to 00:00:00 UTC on 1 January 1970 using 32-bit time_t because you lose the ability to represent negative values.

Having all existing stored date/times that are currently prior to the epoch suddenly become dates post 2038 is also not a good scenario.

loeg

12 hours ago

> it also removes the ability to represent any date/time prior to 00:00:00 UTC on 1 January 1970 using 32-bit time_t

Yes, of course. This is probably not the main use of negative values with signed time_t, though -- which is just representing the result of subtraction when the operand happened before the subtrahend.

> Having all existing stored date/times that are currently prior to the epoch suddenly become dates post 2038 is also not a good scenario.

In practice, there are ~zero of these on systems with 32-bit time_t and a challenging migration path as we approach 2038.

cryptonector

12 hours ago

> Yes, of course. This is probably not the main use of negative values with signed time_t, though -- which is just representing the result of subtraction when the operand happened before the subtrahend.

This is definitely a bigger concern, yes. One has to be very careful with subtraction of timestamps. But to be fair one already had to be very careful before because POSIX doesn't say what the size or signedness of `time_t` is to begin with.

Indeed, in POSIX `time_t` can even be `float` or `double`[0]!

  time_t and clock_t shall be integer or real-floating types.
Though on all Unix, BSD, Linux, and any Unix-like systems thankfully `time_t` is always integral. It's really only size and signedness that one has to be careful with.

Thus one should always subtract only the smaller value from the larger, and cast the result to a signed integer. And one has to be careful with overflow. Fortunately `difftime()` exists in POSIX. And there's a reason that `difftime()` returns a `double`: to avoid having the caller have to deal with overflows.

Basically working safely with `time_t` arithmetic is a real PITA.

  [0] https://pubs.opengroup.org/onlinepubs/009696799/basedefs/sys/types.h.html
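
A minimal example of leaning on `difftime()` instead of raw subtraction:

    #include <stdio.h>
    #include <time.h>

    int main(void) {
        time_t start = time(NULL);
        /* ... work ... */
        time_t end = time(NULL);

        /* difftime() does the subtraction as a double, so it doesn't matter
           whether time_t is 32- or 64-bit, signed or unsigned.              */
        double elapsed = difftime(end, start);
        printf("elapsed: %.0f seconds\n", elapsed);
        return 0;
    }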

loeg

12 hours ago

> Indeed, in POSIX `time_t` can even be `float` or `double`[0]!

Standard C, yes. Newer POSIX (your link is to the 2004 version) requires time_t be an integer type: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sy...

> Though on all Unix, BSD, Linux, and any Unix-like systems thankfully `time_t` is always integral. It's really only size and signedness that one has to be careful with. Thus one should always subtract only the smaller value from the larger, and cast the result to a signed integer. And one has to be careful with overflow. Fortunately `difftime()` exists in POSIX. And there's a reason that `difftime()` returns a `double`: to avoid having the caller have to deal with overflows.

> Basically working safely with `time_t` arithmetic is a real PITA.

Yes.

cryptonector

11 hours ago

Yes, I know that was older POSIX. But we're talking about old code, the unstated context is portability over a long period of time, and I wanted to make a point :)

So use `difftime()`, don't assume signedness or size, but do assume that it's an integral type.

umanwizard

13 hours ago

You would then lose the ability to represent times before Jan. 1st, 1970. Which is not just a theoretical concern; those times appear e.g. in databases with people's date of birth.

nobluster

13 hours ago

And a signed 32 bit time_t with an epoch of 1970 cannot represent dates before 1902. Using time_t to store legacy dates is not advisable - even if you ignore all the issues with time zones and changing local laws pertaining to offsets from UTC and daylight saving time.

pwg

13 hours ago

While true, that limitation has always existed, so everyone has already implemented whatever was necessary to represent dates earlier than that.

Changing 32-bit time_t to unsigned suddenly makes all dates from 1902 to Jan 1 1970 which were stored using time_t (even if it was non-advisable, it still will have occurred) appear to teleport into the future beyond 2038.

cryptonector

12 hours ago

That's alright, see, because I have no filesystems, no tarballs, no backups, no files on any kind of media with timestamps before 1970, and indeed, no one can except by manually setting those timestamps -- and why bother doing that?!

So any pre-1970 32-bit signed timestamps will be in... not spreadsheets, in what? In databases? No, not either. So in some sort of documents, so software consuming those will need fixing, but we're not going to see much teleporting of 1902-1970 timestamps to 2038-2106. I'm not concerned.

jasomill

11 hours ago

Many important software projects predate UNIX. Perhaps you want to create a Git repository for one of them, with historically accurate commit timestamps?

cryptonector

11 hours ago

Well, Git doesn't seem to set file timestamps when cloning. And as the sibling comment says, Git doesn't use 32-bit signed integers to store timestamps. So this is not a problem.

If, however, Git were to some day get an option to set file mtimes and atimes at clone time to the last-modified time of the files based on commit history, then you could always just use a 64-bit system where `time_t` is 64-bit.

cesarb

6 hours ago

More importantly than a git repository, you might have archives (a .tar or equivalent) of these software projects, and might want to unpack them while preserving the timestamps contained within these archives.

cryptonector

2 hours ago

Unix did not exist before 1970. Neither did tar. There are no tar files with files with timestamps before 1970 _unless_ you've used `touch -t 195001010000` or `utime(2)` or `futimes(3)` to set the files' time to before 1970. That seems pretty unlikely.

loeg

11 hours ago

Git represents timestamps in ASCII strings, like "12345." Not time_t. (I am not sure if it even allows negative integers in this format.)

loeg

12 hours ago

Databases do not and can not use system time_t. Consider how their on-disk state would be impacted by a change from 32-bit time_t to 64-bit! Instead they use specific or variable size integer types.

bloak

12 hours ago

Serious question: do people use time_t for representing a date of birth?

To me that wouldn't seem right: a date of birth isn't a timestamp and you typically receive it without a corresponding place or time zone so there's no reasonable way to convert it into a timestamp.

(The other problem is that a signed 32-bit time_t only goes back to 1901. You might not have to deal with a date of birth before 1901 today, unless you're doing genealogy, of course, but until fairly recently it's something you'd probably want to be able to handle.)

umanwizard

12 hours ago

> Serious question: do people use time_t for representing a date of birth?

I have seen people in real life use seconds since the UNIX epoch to represent DOB, yes.

nomel

12 hours ago

My birth certificate has location, day, year, hour, and minute of birth. Birth is an event in time, perfectly represented with a (UTC) timestamp.

Merad

12 hours ago

Your birth is an event that happened at an instant in time, but very few systems concern themselves with that detail. The vast majority need to store birth _date_ and have no interest in the time or location.

nomel

9 hours ago

> have no interest in the time

Setting the bottom couple of bytes to zero achieves this, while maintaining nearly universal consistency with all other events in time that might need to be related with a packing of bits. People do it because it's what's being used at nearly every other level of the stack.

cryptonector

12 hours ago

> Birth is an event in time, perfectly represented with a (UTC) timestamp.

That's not the same thing as `time_t` though. `time_t` is UTC. UTC is not `time_t`.

cryptonector

12 hours ago

Those databases might not even use `time_t`. It's their problem anyways, not the OS's.

petee

13 hours ago

OpenBSD did just this for 32-bit compat when they changed to 64-bit time 12 years ago, and it seems to have worked out fine.

brynet

11 hours ago

No, time_t is a signed 64-bit type on all architectures, 64-bit architectures have no "32bit compat".

https://www.openbsd.org/55.html

petee

9 hours ago

From the notes, this is what i was referring to, I think I just mistook the meaning -

Parts of the system that could not use 64-bit time_t were converted to use unsigned 32-bit instead, so they are good till the year 2106

scheme271

13 hours ago

If time_t is unsigned, how do times before the unix epoch get represented?

loeg

12 hours ago

They don't, much like they already do not. 32-bit time_t has always been finite, and 1970 was a long, long time ago. (See "in some senses" in my earlier comment.)

ddoolin

10 hours ago

Why was it 32 bits to begin with? Wasn't it known that 2038 would be the cutoff?

wongarsu

10 hours ago

It has been that way since Unix V4, which was released in 1973. Back then, moving to something that breaks after 65 years was a massive upgrade over the old format, which wrapped after about two and a half years. And at least from the standpoint of the Unix engineers, 32 bits was more than enough: nobody is using Unix V4 in production in 2024, never mind 2038.
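Rough arithmetic behind that wrap figure, assuming (as those early kernels did) a 32-bit counter of sixtieths of a second:

```c
#include <stdio.h>

int main(void) {
    /* 2^32 ticks at 60 Hz */
    double seconds = 4294967296.0 / 60.0;
    printf("wraps after ~%.0f days (~%.1f years)\n",
           seconds / 86400.0, seconds / (86400.0 * 365.2425));
    return 0;
}
```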

Why it made it into POSIX and wasn't updated is a different question that's a bit more difficult to answer.

int_19h

10 hours ago

It was not exactly a big concern when this was designed in the early '70s, especially when you consider that Unix itself was kind of a hack at the time.

modeless

10 hours ago

In the 1970s people probably would have laughed at the idea that Unix would still be running in the year 2038.

ddoolin

7 hours ago

I get that; I'm surprised to find my own software running 7 years later, let alone ~60, especially given the context.

PHGamer

2 hours ago

Well, sometimes the old stuff is better than the new stuff. It really is hilarious, though. Things in the software world have regressed in some ways, especially when it comes to privacy and locking down data.

NegativeLatency

10 hours ago

Ultimately probably hardware? I suspect it’s also been like this for a long time.

jdndhdhd

10 hours ago

Why is it 64-bit now? Isn't it known that 292277026596 would be the cutoff?

kazinator

6 hours ago

All those arguments apply to off_t as well. If a small-file executable with 32-bit off_t uses a large-file library with 64-bit off_t, there is no protection. Only glibc has the multiply-implemented functions, selected by header-file macrology.
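A quick way to see glibc's side of that macrology, as a sketch that assumes a 32-bit target or a multilib toolchain where `-m32` works:

```c
/* Build twice, e.g.:
 *   gcc -m32 offsize.c -o small && ./small
 *   gcc -m32 -D_FILE_OFFSET_BITS=64 offsize.c -o large && ./large
 * With the macro set, glibc's headers widen off_t and redirect calls such
 * as lseek/stat to their 64-bit implementations; a third-party library
 * compiled the other way gets no such protection. */
#include <stdio.h>
#include <sys/types.h>

int main(void) {
    printf("sizeof(off_t) = %zu\n", sizeof(off_t));
    return 0;
}
```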

rini17

8 hours ago

I hope Gentoo alone won't get overburdened with developing and maintaining such a hybrid toolchain, able to compile and run time32 processes alongside time64 ones, once everyone else has moved on. Better to do a complete reinstall from stage3.

malkia

8 hours ago

Microsoft's famous (infamous?) "A"/"W" functions, and the macros controlling them... but it works, and one can link two differently-built modules together no problem. I wonder if that would have been possible here?
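Roughly the pattern, as a simplified sketch rather than the actual windows.h text:

```c
#include <stddef.h>   /* wchar_t */

/* Every string-taking API is exported twice, and a macro picks one per
 * translation unit, so modules built either way still link and coexist. */
int MessageBoxA(void *hwnd, const char *text, const char *caption, unsigned type);
int MessageBoxW(void *hwnd, const wchar_t *text, const wchar_t *caption, unsigned type);

#ifdef UNICODE
# define MessageBox MessageBoxW   /* UTF-16 ("wide") callers */
#else
# define MessageBox MessageBoxA   /* narrow/"ANSI" callers */
#endif
```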

ok123456

10 hours ago

What about making time_t 64-bit across all profiles and incrementing all the profile versions by one? Gentoo users are used to breaking changes when upgrading profiles.

ndesaulniers

12 hours ago

I think we should start putting details of the ABI in ELF (not the compiler flags as a string). Wait... who owns the ELF spec??

Then the linker and loader could error out if someone attempted to link two incompatible objects with different ABIs.

For instance, I suspect you could have fields denoting the sizes of certain types. I guess DWARF has that... but DWARF is optional and sucks to parse.
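Short of changing ELF, a poor man's version is possible today with a marker symbol rather than an ELF field. A hypothetical sketch (all names made up; `_TIME_BITS` is the glibc-side knob a consumer would set):

```c
/* Hypothetical sketch, all names made up: a library header encodes the
 * time_t width it expects into a symbol reference, so linking a time64
 * application against a time32 build of the library fails at link time
 * instead of silently misreading struct layouts. */
#ifdef LIBFOO_BUILDING                    /* defined only when building libfoo itself */
const char libfoo_abi_time64 = 0;         /* a time32 build would define libfoo_abi_time32 instead */
#else
# if defined(_TIME_BITS) && _TIME_BITS == 64
extern const char libfoo_abi_time64;
static const char *const libfoo_abi_check __attribute__((used)) = &libfoo_abi_time64;
# else
extern const char libfoo_abi_time32;
static const char *const libfoo_abi_check __attribute__((used)) = &libfoo_abi_time32;
# endif
#endif
```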

fred_is_fred

13 hours ago

Besides epoch time and the LFS support mentioned, are there any other 32-bit bombs waiting for Linux systems like this?

tredre3

12 hours ago

The ext file system uses many 32-bit counters. Admittedly, version 4 fixed most of that (when formatted with the correct options).

kbolino

12 hours ago

In a similar vein, inodes can run out. On most conventional Linux file systems, inode numbers are 32 bits.

For many, this is not going to be a practical problem yet, as real volumes will run out of usable space before exhausting 2^32 inodes. However, it is theoretically possible with a volume as small as ~18 TiB (using 16 TiB for 2^32 4096-byte or smaller files, 1-2 TiB for 2^32 256- or 512-byte inodes, plus file system overheads).
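A quick back-of-the-envelope check of those numbers:

```c
/* 2^32 files of 4096 bytes each, plus 2^32 inodes of 256 or 512 bytes. */
#include <stdio.h>

int main(void) {
    unsigned long long n = 1ULL << 32;
    double tib = 1024.0 * 1024 * 1024 * 1024;
    printf("data:   %.1f TiB\n", n * 4096.0 / tib);          /* 16.0 TiB */
    printf("inodes: %.1f to %.1f TiB\n", n * 256.0 / tib,    /*  1.0 TiB */
                                         n * 512.0 / tib);   /*  2.0 TiB */
    return 0;
}
```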

Anticipating this problem, most newer file systems use 64-bit inode numbers, and some older ones have been retrofitted (e.g. inode64 option in XFS). I don't think ext4 is one of them, though.

lclarkmichalek

11 hours ago

It does happen in prod. Usually due to virtual FSes that rely on get_next_ino: https://lkml.org/lkml/2020/7/13/1078

Dylan16807

an hour ago

That method is wrapping and not checking for collisions? I would not call that a problem of running out then. It's a cheap but dumb generator that needs extra bits to not break itself.

fph

13 hours ago

IPv4, technically, is another 32-bit bomb, but that's not Linux-specific.

samatman

12 hours ago

I wouldn't call this a bomb at all. Bombs are events, not processes.

Resource contention for IPv4 has been around for a long time, with a number of workarounds and the ultimate out of supporting IPv6. There has been, to date, no moment of crisis, nor do I expect one in the future.

It will just get steadily more annoying/expensive to use IPv4 and IPv6 will relieve that pressure incrementally. We're at least two decades into that process already.

poincaredisk

12 hours ago

Two decades in, and IPv6 is still the more annoying option.

I wish they had been less ambitious and had just increased the address size when designing IPv6.

icedchai

5 hours ago

If more people bothered to configure IPv6 instead of complaining about it, V4 would be a thing of the past already.

o11c

4 hours ago

A lot of people don't have a choice.

Just last(?) year I finally switched to an ISP whose equipment supports IPv6. But I still can't actually use it since my wifi router's support for IPv6 somehow fails to talk to the ISP equipment.

Hm, there's a firmware upgrade ... (installs it) well, re-trying all those options again, it looks like one of them lets it finally work now. The others (including the default) still fail with arcane errors though!

shmerl

5 hours ago

LFS should have become mandatory.

dark-star

13 hours ago

Are there other distros where the switch to 64-bit time_t has already happened? What's the easiest way to figure out whether $distro uses 32-bit or 64-bit time_t? Is there something easier/quicker than writing a program to print `sizeof(struct stat)` and checking whether that's 88, 96, or 108 bytes (as hinted in the article)?

SAI_Peregrinus

12 hours ago

NixOS switched. And they're source-based, but with actual dependency tracking and all the (insanely complex) machinery needed to allow different programs to use different C library versions simultaneously.

`printf("sizeof (time_t) = %zu, %zu bits", sizeof (time_t), sizeof (time_t) * CHAR_BIT);` gives you the size in bytes and in bits. Needs time.h, stddef.h, and stdio.h.

jonathrg

12 hours ago

printf '#include <stdio.h>\n#include <time.h>\nint main() { printf("time_t is %%zu-bit\\n", sizeof(time_t)*8); }\n' | gcc -x c -o timesize - && ./timesize && rm ./timesize

If you're wondering about a specific distro that you're not using right now - just look it up.

teddyh

12 hours ago

Like the article says, Debian has already switched.

BoingBoomTschak

14 hours ago

A very thoughtful way of handling a problem much trickier than the earlier /usr merge. I had thought about 2 but not 1 and 3. I'd also kind of forgotten about that 2038 thing; time sure is flying!

I must say, mgorny's posts are always a treat for those who like to peer under the hood! (The fact that Gentoo has remained my happy place for years doesn't influence this position, though there's probably some correlation)

layer8

12 hours ago

> The second part is much harder. Obviously, as soon as we’re past the 2038 cutoff date, all 32-bit programs — using system libraries or not — will simply start failing in horrible ways. One possibility is to work with faketime to control the system clock. Another is to run a whole VM that’s moved back in time.

Yet another is to freeze the time for those programs at the last 32-bit POSIX second. They would just appear to execute incredibly fast :). Of course some will still break, and it’s obviously not suitable for many use cases (but neither is running in the past), but some might be just fine.

jeffbee

12 hours ago

Are we still keeping shared libraries? They are a complex solution to a problem that arguably stopped existing 20 years ago. Might be time to rethink the entire scheme.

filmor

12 hours ago

On Gentoo, definitely. I really don't want to rebuild my whole system whenever some foundational library fixes a bug. It already annoys me quite a bit that I need to set up sccache to get half-way reasonable compile times out of Rust projects (and I'm saying that as someone who enjoys gradually replacing significant portions of the userspace with Rust tools).

AshamedCaptain

6 hours ago

This by itself is half the reason I think Nix's approach is a step backwards.

otabdeveloper4

10 hours ago

The Docker image layers you so dearly love are an implementation of shared libraries, except done in a broken way that's a thousand times less performant and more insecure.

cryptonector

12 hours ago

The simplest thing to do is to make any 32-bit `time_t`s be unsigned. That buys another 68 years to get the transition done. Not that that's exactly easy, but it's easier than switching to 64-bit time_t.
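Rough arithmetic behind the 68 years:

```c
/* A signed 32-bit counter runs out ~68 years after 1970 (2038);
 * an unsigned one runs out ~136 years after 1970 (2106). */
#include <stdio.h>
#include <stdint.h>

int main(void) {
    double year = 365.2425 * 86400;   /* average Gregorian year in seconds */
    printf("signed 32-bit:   1970 + %.1f years\n", INT32_MAX / year);
    printf("unsigned 32-bit: 1970 + %.1f years\n", (double)UINT32_MAX / year);
    return 0;
}
```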

rini17

8 hours ago

You believe it will get somehow easier in 60 years?

cryptonector

6 hours ago

Yes!

For one, none of us will be the ones dealing with it, but more seriously, by then 32-bit Unix/Linux ABIs will be gone.