A compact bitset implementation used in Ocarina of Time save files

54 pointsposted 5 days ago
by todsacerdoti

47 Comments

orlp

21 hours ago

> The most novel aspect of OoT bitsets is that the first 4 bits in the 16-bit coordinate IDs index which bit is set. For example, 0xA5 shows that the 5th bit is set in the 10th word of the array of 16-bit integers. This only works in the 16-bit representation! 32 bit words would need 5 bits to index the bit, which wouldn't map cleanly to a nibble for debugging.

There is nothing novel about this really. It's neat that it works with hexadecimal printing to directly read off the sub-limb index but honestly who cares about that.

Outside of that observation there's no advantage to 16-bit limbs and this is just a bog-standard bitset where the first k bits indicate the position within the 2^k bit limb, and the remaining bits give you the limb index.

jb55

21 hours ago

TimorousBestie

18 hours ago

For what it’s worth, I found it interesting. I like your verve.

lpribis

20 hours ago

Haha, good to take the classic HN condescension on the chin ;)

orlp

17 hours ago

My comment was worded a bit too harshly, sorry about that, I certainly wouldn't have worded that way in a personal message.

As I mentioned the hexadecimal printing coincidence is a neat fact, I was just excited when clicking the link to find a novel bitset idea. In my disappointment to find the standard bitset (albeit with 16-bit limbs) I reacted a bit too harshly. And as per https://xkcd.com/1053/, just because something isn't new to me doesn't mean it's not new to anyone.

Const-me

20 hours ago

In C++, std::vector<bool> does pretty much the same thing under the hood.

Sadly, C++ standard library API designers made it impossible to directly access the array of integers stored inside std::vector<bool>. This made serialization/de-serialization of these vectors very inefficient, sometimes prohibitively so.

abcd_f

a day ago

How's this notable? It’s literally the most straight-forward, trivial bitset implementation.

rixed

21 hours ago

After reading your comment, I was expecting some kind of RLE over a bit set. But no, not even that?

So I guess, it is notable because it shows how far coding has been industrialized; ie has been dumbed down so much that a mere bit field can be presented as some esoteric trick.

At this speed AGI is indeed around the corner :-D

pixelpoet

21 hours ago

Reminds me of the kerfuffle about https://en.wikipedia.org/wiki/Tai%27s_model, where someone managed to get published what is basically the Trapezoidal Rule for numerical integration, and of course had to name it after themselves.

bangonkeyboard

21 hours ago

> Tai disagreed that Tai's model is simply the trapezoidal rule, on the basis that her model uses the summed areas of rectangles and triangles rather than trapezoids.

Oh my god.

perching_aix

20 hours ago

The world is big and rediscoveries are not always going to be so graceful. If stuff is harder / more annoying to find than to figure it out yourself, you're downright reasonable for intentionally attempting so even.

If I remember correctly, in the "Tia's rule" case specifically, her coworkers even defended her for the whole thing - implying of course that she was being ridiculed for this, making neither side of this story look very pretty if you ask me.

Also why people don't "RTFM". Yes, I can read through the pages of prose you wrote how you implemented e.g. bog standard role based auth. Or I can just brute force try using all the related looking commands and see what happens...

dylan604

17 hours ago

> The world is big and rediscoveries are not always going to be so graceful.

It is telling how someone can think their idea is so novel, and rather than doing research go ahead and announce it like it is something new.

perching_aix

17 hours ago

But it really isn't. How many e.g. business ideas do you think are thought up every day, only for people to find that they already exist and have well established players? Most times people just catch themselves on time, or go through this with a smaller audience first that they know in person. Has this really never happened to you or anyone you know? Is this concept really that alien to you?

dylan604

16 hours ago

There's a difference between saying I'm going to do something that has already been done but I'm going to roll my own compared to look at this cool thing I've done that nobody else has done. There are plenty of bits of code that I've written that easily could have been handled by something along the lines of a 'npm install' but then I wouldn't have learned anything, but I never released a Show HN or written a blog to toot my own horn for reinventing the wheel and acting like it was the first time a wheel had been used.

But as far as being alien to me, any time someone pats themself on the back with public posts about anything is alien to me. I'm fully aware that if I could think of it, someone else out there has already done it. The number of times I've done an search for something I'm working on that hasn't had a post of someone else doing it approaches zero.

convolvatron

21 hours ago

omg, its so obvious! making computers smart is hard, but making people think less requires no effort at all. I can't believe I missed the plan so thoroughly.

TOGoS

20 hours ago

The neat thing here is that they did it without involving three package managers, two Docker containers, and an LLM.

dylan604

17 hours ago

How I converted a C++ bit mover package to Rust

armada651

21 hours ago

> The most novel aspect of OoT bitsets is that the first 4 bits in the 16-bit coordinate IDs index which bit is set.

Am I the only one who was trouble parsing this sentence correctly?

ethan_smith

21 hours ago

It means the first 4 bits of a 16-bit ID determine which bit position (0-15) to set in the bitset, essentially using part of the ID itself as an index into the bitset.

coldpie

21 hours ago

No, haha, it took me a while, too. "Index" is operating as a verb here.

curiousgal

21 hours ago

The coordinates IDS are 16-bit long. The first 4 bits index (verb) which bit is set.

jb55

20 hours ago

prs welcome

jb55

20 hours ago

i updated the wording, hope its more clear

firesteelrain

19 hours ago

I used to work at a Company where I did parking software and it was based on Borland C++ Builder 6 codebase. Up to 2012, when I left, we were using Bit flags to indicate configuration settings. We would store these settings in a config file as binary and recall it back when the application started up. We stored a ton of configuration in each flag and it was easy to add more.

But in retrospect, it wasn't super great if the disc or installation somehow got corrupted. You would lose your entire configuration and parking operators do not have the best IT.

This way of storing configuration was designed back in the 90s at this Company and it survived over 20 years. This was prior to this Zelda game and I never played it.

dylan604

17 hours ago

wow, then DVD programming must owe Company some due thanks because that's how GPRM/SPRM values were stored as 16bits where we'd XOR the halibut out of it get the specific values we wanted to know about. it's a good thing Company existed as nobody else on the planet would have ever figured out how to bit shift or otherwise use masks to get data out of 16bit int in ways other than just storing an integer.

not really sure why you chose to come down the way you did, but wow, just wow.

jamesu

20 hours ago

Reminds me of the extensive use of bitsets in Starfox Adventures. Not only was object spawning tied to flags, many dynamic effects were too so if you prodded the flags in memory often a lot of funky things would happen like explosions getting triggered and cutscenes.

tacker2000

21 hours ago

How is this in any way novel? People have been doing this since the dawn of time.

jb55

21 hours ago

I will remove the novel wording just for the hacker news geniuses. I have been programming for 26 years and have never encountered this pattern, nor could I find it in any libraries, which is why I decided to wrap it up in a library.

If a simple bitset like this exists in a library somewhere I would love to see where! Most implementations I've seen over-complicate it for simple use cases like this.

sumtechguy

20 hours ago

You may enjoy this page then. https://graphics.stanford.edu/~seander/bithacks.html

Bunches of bit twiddling things people like to do in different places for either speed/space.

jb55

20 hours ago

I have had the experience of explaining to coworkers how bitwise operators even work more than once. I think sometimes people overestimate the average programmers knowledge when it comes to bit operations. modern programming is so detached from having to use that for day to day work.

I am aware of the bithacks page, but I just found encoding the bit coordinate in the ID itself so clever.

Sesse__

18 hours ago

Can I ask, how would you do it otherwise? Is there any other way? Or is the supposed novelty here that the two values are packed into the same byte instead of having some zero bits between?

jfyi

17 hours ago

Yeah, you weren't going to win this. Most developers I've met would have approached this from one of the two extremes:

"Lol, this is the obvious implementation."

or

"Lol, just use a boolean."

That is to say, those that might be interested would already know. I'd expect very little in between.

Fwiw, I do think noticing the id readability is notable from a reverse engineering standpoint.

dooglius

18 hours ago

C++ std::vector<bool>

a_e_k

21 hours ago

Darn. By "compact", I thought there'd be some clever sparsity or compression to get below 1 bit/flag in common cases.

bangonkeyboard

21 hours ago

Apparently "compact" here refers to the five lines of actual code.

Just use <bitstring.h>.

warvair

19 hours ago

This seems quite nice. Simple bit-shifting of the 16-bit ID gives both the index into the array of 16-bit words and the bit in the word and allows up to 65536 bitflags. This makes for a very efficient and easy to implement bitflag set/check system given these constraints.

estebarb

18 hours ago

It is hard to understand how it works from the explanation, but seems to be only useful for really sparse bitmaps. But in a general case, I don't see how it is better than roaring bitmaps.

gurkwart

20 hours ago

I love this one! Thanks for sharing.

It's not about the bitset itself. It's about how to organize and think about your flags.

The small visualization grid is fantastic for debugging, and the `word:bit` structure lends itself directly to organize your data in categories and display them as such.

Tiberium

16 hours ago

OpenAI o3's writing style is quite distinct :)

brcmthrowaway

a day ago

So there is no compression?

dazzawazza

a day ago

No. This might seem useless to non game coders but it's a very common pattern in videogames. The state of the world is frequently 1000s of yes/no data points.

It's not rocket science but this seems like a decent enough implementation of it. The number of data points rarely goes beyond thousands so compression doesn't have much value.

sandinmyjoints

20 hours ago

(From a non-game coder) How do teams keep track of which events need to be tracked in state, what their names are, etc.? (Not how as in how is this possible, but rather, what kinds of processes have evolved for this.) Like are state events typically planned out ahead of time, or do devs realize which ones they need as they go? What if I want to add a new event to state, how can I check if someone else has added it -- do I just scan all the names in the code and hope that it was named something that I will recognize?

hbn

19 hours ago

If you look at the code linked to for the OoT decomplication, it's centralized in a single file everyone would be committing to

https://github.com/zeldaret/oot/blob/4d2bc68bacdff06261f7a89...

AshamedCaptain

19 hours ago

That is not at all representative of how the original code would have looked like. At all.

I am tired of people assigning pseudo-magical properties to "decompilations" while at the same doublethinking them as a way to launder copyrighted works.

sandinmyjoints

18 hours ago

Yeah looking at that is what prompted this question. Are the events in order? If so: the index for an event in the bitmap is constant, so if you realize you need to save an event that wasn't previously being saved, how do you insert it to keep order? Do you break all previously saved data by changing all the following indexes? Or are the events not in order, which could make it harder to find if an event is already in it (and just generally be confusing)? Managing a long list of events like this given that it is used for persistent state seems like an interesting process challenge.

kvdveer

21 hours ago

No compression. In the olden days, storage was scarce. While you could compres the bitset, the worst case compression of that bitset would be larger than the uncompressed size. You likely have no way to guarantee the worst case is impossible, so you'd have to reserve extra space. In the end you'd store more data, in a slower to use format.

These days storage bandwidth is the main limiter, so now compression makes sense. When storage size is the main limiter, compression is ironically not very helpful.