cartoffal
4 days ago
The idea of "type safety over the network" is a fiction.
When it comes down to it, what is being sent over the network is 1s and 0s. At some point, some layer (probably multiple layers) are going to have to interpret that sequence of 1s and 0s into a particular shape. There are two main ways of doing that:
- Know what format the data is in, but not the content (self-describing formats) - in which case the data you end up with might be of an arbitrary shape, and casting/interpreting the data is left to the application
- Know what format the data is in, AND know the content (non-self-describing formats) - in which case the runtime will parse out the data for you from the raw bits and you get to use it in its structured form.
The fact that this conversion happens doesn't depend on the language; JS is no more unsafe than any other language in this regard, and JSON is no better or worse of a data serialisation format for it. The boundary already exists, and someone has to handle what happens when the data is the wrong shape. Where that boundary ends up influences the shape of the application but what is preferable will depend on the application and developer.
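To make the two cases concrete, here is a rough TypeScript sketch. The `Header` layout (a big-endian uint16 id followed by a uint8 flags byte) and the `readHeader` helper are made up for illustration; they are not from the article.

```typescript
// Self-describing: JSON carries its own structure, but the result has an
// unknown shape until the application inspects it.
const fromJson: unknown = JSON.parse('{"id": 7, "name": "ada"}');

// Non-self-describing: the bytes mean nothing without an agreed layout.
// Hypothetical layout: uint16 id (big-endian), then uint8 flags.
interface Header {
  id: number;
  flags: number;
}

function readHeader(bytes: Uint8Array): Header {
  // Someone still has to handle data of the wrong shape.
  if (bytes.length < 3) {
    throw new Error("header too short");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { id: view.getUint16(0, false), flags: view.getUint8(2) };
}

const header = readHeader(new Uint8Array([0x00, 0x07, 0x01]));
```

In both cases a check happens at the boundary; the difference is only whether the application or the decoding layer performs it.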
rappatic
4 days ago
> The idea of "type safety over the network" is a fiction
> When it comes down to it, what is being sent over the network is 1s and 0s
?
When it comes down to it, all of computing is 1s and 0s. This is not some feature that's particular to the wire.
lblume
4 days ago
The difference being that a client can be malicious, while e.g. a local file is assumed to behave with the same intent as another. Programs that run on one computer can always be statically verified, while the task is harder for server-client applications — the client could always be an untrusted impersonator!
sophacles
4 days ago
A local file can be "newly local", having recently been saved from the network or copied from a USB drive, etc.
And assuming a file is going to behave with good intent, or even the same intent as another file of the same format, is bad. It's how we get jpeg/png/etc parsing errors. It's how we end up with PDFs that are also valid executables, and 1000 more issues.
munchler
4 days ago
This happens with local files also, and was originally called “DLL hell”. The mismatch isn’t malicious, but the effect is the same.
JadeNB
4 days ago
> The difference being that a client can be malicious, while e.g. a local file is assumed to behave with the same intent as another.
I'm not sure what it means to assume something about the behavior of a file, presumably thought of as a static piece of data, but I'd certainly disagree that a modern computing system is entitled to assume that all local apps behave with the same intent as one another (except to the extent that it assumes that all local apps behave maliciously).
denkmoon
4 days ago
What does that have to do with type safety though? If anything, type safety improves whichever piece of the puzzle you do have control over by reducing the likelihood of you accepting malformed data.
tbrownaw
4 days ago
> The idea of "type safety over the network" is a fiction.
Not really, it just has to be enforced at run time rather than compile or link time.
cartoffal
4 days ago
In many statically-typed languages, types do not exist at runtime - they are erased once the program is known to be internally consistent. What is left is not type safety, it is parsing and validation of unstructured binary blobs (or arbitrary strings, depending on the protocol) into structured data. Structure and types are not the same thing, and in many languages they barely even overlap.
setr
3 days ago
Any input data has the same problem. Type safety exists after validation, and its guarantees hold only if your original validation was upheld.
Files, database, user input, network protocols, etc. I don’t know why the network would be any way special. You parse/validate unstructured binary blobs into structured data, and what’s left is type safety. It’s not in the runtime only because, if the compiler has done its job correctly, it is typesafe by construction.
In other words, how many times are you going to check your data structure is correct before you start assuming it’s correct? Once, at parsing and validation. After that, you’re working with structured data, and your types are just recording the fact.
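A minimal sketch of "validate once, then trust the type", in TypeScript. The `User` shape and the `parseUser` and `greet` names are hypothetical, chosen only to illustrate the point.

```typescript
interface User {
  id: number;
  name: string;
}

// The only place where untrusted input is inspected.
function parseUser(raw: string): User {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("expected an object");
  }
  const rec = data as Record<string, unknown>;
  const id = rec["id"];
  const name = rec["name"];
  if (typeof id !== "number" || typeof name !== "string") {
    throw new Error("wrong shape");
  }
  return { id, name };
}

// Everything past the boundary relies on the type and does not re-check.
function greet(user: User): string {
  return `hello, ${user.name} (#${user.id})`;
}

const greeting = greet(parseUser('{"id": 1, "name": "ada"}'));
```

The guarantee that `greet` relies on is exactly as strong as the checks in `parseUser`, which is the point about the original validation having to be upheld.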
cartoffal
14 hours ago
> I don’t know why the network would be any way special.
The network isn't special. This applies locally too. But the article we are commenting on (at least, _I_ was commenting on) is about the network, and it uses the phrase "type safety over the network" in particular.
> You parse/validate unstructured binary blobs into structured data, and what’s left _is type safety_.
That is in fact exactly what I said. The point is that at some point you started with unstructured binary blobs. As soon as data leaves the application, it is no longer "safe", and it is unsafe until it is ingested and (re)validated. So my point can be freely generalised to "type safety beyond the application boundary is a fiction". And the application boundary will always exist, whether you are working with a strongly or weakly, statically or dynamically typed language.
p0w3n3d
2 days ago
I agree with you. I think that, memory cost aside, keeping everything as strings just moves the validation of unknown or malformed values from the deserializer to a validator method. But it must happen somewhere. And it will break regardless when a new possible value appears.
You've just implemented an enum for POST and GET? Here's PATCH. You have all the HTTP status codes in your enum? What about the teapot? You had STARTING, PENDING, and FINISHED as the allowed values of ``state``? Our business analyst wants FAILED. Etc. Etc.
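As a rough illustration of the ``state`` case in TypeScript: a closed set of known values, plus one possible way to keep an unrecognised value around instead of failing outright. The `State` union and `parseState` are assumptions for the sake of the example, not something from the article.

```typescript
const KNOWN_STATES = ["STARTING", "PENDING", "FINISHED"] as const;
type KnownState = (typeof KNOWN_STATES)[number];

// Either a value we know how to handle, or the raw string we received.
type State =
  | { kind: "known"; value: KnownState }
  | { kind: "unknown"; raw: string };

function parseState(raw: string): State {
  const match = KNOWN_STATES.find((s) => s === raw);
  return match !== undefined
    ? { kind: "known", value: match }
    : { kind: "unknown", raw };
}

const pending = parseState("PENDING"); // a value the code was written for
const failed = parseState("FAILED");   // the value the analyst added later
```

Whether the check lives in a deserializer or a validator, the code still has to decide what to do with `FAILED`; the sketch only makes that decision explicit.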
bluGill
4 days ago
While it is all 1s and 0s, what those bits mean can easily be encoded. When you say what those bits mean in detail (which we need to anyway: what code page is that string?), we can then define what is valid, and in turn reject messages that, while still only 1s and 0s, are wrong. Also, by assigning meaning we can get closer to what we want. The string "12345" and the number 12345 can both mean the same thing, but we can put the number into 2 bytes if we want, while the string is at least 5 bytes. Not to mention that a number is easier to parse; turning a string into a number is not always trivial (depending, of course, on which code page is active).
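The size difference is easy to check. A small TypeScript sketch, assuming ASCII for the text form and a big-endian unsigned 16-bit integer for the binary form (both encodings are assumptions picked for the example):

```typescript
// Text form: one byte per ASCII digit.
const asText: number[] = Array.from("12345", (ch) => ch.charCodeAt(0));

// Binary form: the same value as a big-endian uint16.
const buf = new ArrayBuffer(2);
new DataView(buf).setUint16(0, 12345, false);
const asNumber: number[] = Array.from(new Uint8Array(buf));

// Once the meaning is fixed, a message of the wrong size can be rejected.
function decodeUint16(bytes: number[]): number {
  if (bytes.length !== 2) {
    throw new Error("expected exactly 2 bytes");
  }
  return new DataView(new Uint8Array(bytes).buffer).getUint16(0, false);
}

const roundTripped = decodeUint16(asNumber);
```

Here `asText` holds 5 bytes and `asNumber` holds 2, and `decodeUint16` turns the 2 bytes back into 12345.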
teeray
4 days ago
> At some point, some layer (probably multiple layers) are going to have to interpret that sequence of 1s and 0s into a particular shape
You do it once, at the application border. Doing it multiple times, in multiple layers, is a path to madness.
cartoffal
4 days ago
My point was more that the layers below the application will also have to parse the data into a particular format - in the case of networked applications, into TCP/IP packets, then anything particular to the message protocol, before hitting the application. And then the application will, at runtime (regardless of whether you are using a type safe language or not) have to parse and validate the shape of the data before it can be used.
01HNNWZ0MV43FF
4 days ago
Not to agree, but this is the same for files on disk or data in a database, where the malicious or misbehaving peer might just be an older version of the same program