LegionMammal978
6 hours ago
IME, there's one big thing that often keeps my programs from being unaffected by byte order: wanting to quickly splat data structures into and out of files, pipes, and sockets, without having to encode or decode each element one by one. The only real way to make this endian-independent is to have byte-swapping accessors for everything at the point where it's ultimately produced or consumed, but adding all the code for that is very tedious in most languages. One can argue that handling endianness is the responsible thing to do, but it just doesn't seem worthwhile when I practically know that no one will ever run my code on a big-endian processor.
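(A minimal sketch of the trade-off being described, using Python's `struct` module; the field layout here is made up for illustration. Native-order packing is one call, while the endian-independent version needs an explicit byte order spelled out at every encode and decode site.)

```python
import struct

# "Splat" a record in native byte order: one call, no per-field thought.
record = struct.pack("=IHH", 0xDEADBEEF, 42, 7)

# The endian-independent equivalent: pin an explicit order ("<" = little-endian)
# on every pack/unpack, i.e. the encoding/decoding work being avoided above.
portable = struct.pack("<IHH", 0xDEADBEEF, 42, 7)
a, b, c = struct.unpack("<IHH", portable)
```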
rocqua
4 hours ago
Byte swapping is equivalent to doing the encoding and decoding, is it not?
LegionMammal978
4 hours ago
The benefit is that you'd only have to do it for the parts of the data that are actively manipulated, which might be far less than the entirety of the data structure. Also, you can easily forward a copy elsewhere in the original format.
But if you know you're not going to have endianness problems, you can just skip that step entirely.
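(To illustrate the point about only paying for the fields you touch: a sketch, again with a made-up layout, where the bulk of a received blob is forwarded verbatim and only one field is ever decoded.)

```python
import struct

# Pretend this blob arrived off a socket in a fixed little-endian format.
blob = struct.pack("<IHH", 0xDEADBEEF, 42, 7)

# Decode only the single field this code path actually manipulates...
(flags,) = struct.unpack_from("<H", blob, 4)

# ...and forward the rest untouched, in the original wire format.
forwarded = blob
```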
GMoromisato
6 hours ago
I think the article's author would say that loading data "without having to encode or decode each element" is premature optimization and more likely to have bugs. I tend to agree.
dwattttt
5 hours ago
The optimisation the parent is referring to is development time/effort; if the alternative to dumping a structure to a file is to hand-roll your serialiser/deserialiser, that's a slower and probably more error-prone approach (depending on the context).
skybrian
5 hours ago
The article makes an argument that the hand-rolled solution is less buggy, if you approach it the right way.
For complicated data structures, it's probably best to use a library that serializes to a common standard. (For example, protocol buffers or JSON.)
But I think the article assumes you don't get to choose the protocol, so it probably has to be hand-written by someone.
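(For the case where you do get to choose: a trivial example of why a common standard like JSON sidesteps byte order entirely; the cost is re-encoding every element rather than splatting memory.)

```python
import json

# JSON is text, so there is no byte order to get wrong; every element
# is encoded and decoded individually, which is exactly the per-element
# cost the thread's parent comment wanted to avoid.
doc = {"magic": 0xCAFE, "items": [1, 2, 3]}
wire = json.dumps(doc).encode("utf-8")
decoded = json.loads(wire)
```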
LegionMammal978
4 hours ago
Not once you start getting into the range of hundreds of megabytes or more, which accounts for most situations where I'd use a binary format in the first place.
amluto
13 minutes ago
By the time I’m putting hundreds of MB somewhere, I want a defined format, not whatever the compiler happens to generate for this particular build of my software. There are plenty of nice ways to do this.
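(A sketch of what "whatever the compiler happens to generate" means in practice, here via Python's `ctypes` with a hypothetical header struct: the raw bytes you get include platform alignment padding and native byte order, neither of which is part of any defined format.)

```python
import ctypes
import sys

# A hypothetical on-disk header, dumped as raw memory.
class Header(ctypes.Structure):
    _fields_ = [("magic", ctypes.c_uint32),
                ("count", ctypes.c_uint16)]

h = Header(magic=0xCAFE, count=3)
raw = bytes(h)  # native byte order plus ABI tail padding: 8 bytes, not 6
```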
sobellian
5 hours ago
Maybe if you hand-roll the struct layout, but if you use something like flatbuffers I doubt you would see many more bugs - and flatbuffers will take care of endian swaps as necessary without you needing to think about it.
sgarland
3 hours ago
Depends what you’re doing. I have a side project that generates CSVs in the GB range. It keeps everything in bytes because encode/decode is a lot of overhead in loops when you’re hitting them millions of times.
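(A hedged sketch of the pattern being described, not the commenter's actual project: formatting rows directly as `bytes` so a hot loop never pays for a per-row `str`-to-`bytes` encode.)

```python
# Build CSV rows as bytes from the start; values here are made up.
rows = []
for i in range(3):
    rows.append(b"%d,%d\n" % (i, i * i))  # bytes formatting, no .encode()
out = b"".join(rows)
```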