Animats
a year ago
There's no such thing as finite-time arbitration for multiprocessor requests to memory, but it works anyway. This is similar. You can't guarantee time-bounded exactly once delivery, but after enough interchanges the probability of being wrong approaches zero.
This shows up explicitly in blockchain systems, where confidence improves as the number of confirmation cycles increases.
YZF
a year ago
This is the first time I'm hearing about the topic of arbitrating multiprocessor access to memory. Isn't there a trivial counterexample where the bandwidth is divided by the number of processors and each processor gets a timeslot?
Processors, despite being a distributed system (well really every ASIC is), don't typically suffer from your classical distributed systems style issues because they have no failures, or rather they only have certain catastrophic failure modes that are low probability and well controlled for. Your typical distributed system to contrast has constant failures.
rcxdude
a year ago
It's more to do with the underlying electronics: digital is an abstraction, everything is analog underneath, and logic gates and registers are actually just very non-linear analog circuits. What this means is that it's always in principle possible for an asynchronous edge (i.e. one that is not aligned to the other edges in the system) to be arbitrarily non-determinate: never actually being decided as a 1 or a 0 as it propagates through the digital circuit. This isn't an entirely theoretical issue, either: it can and will cause very strange behaviour in a digital system because a marginal or metastable value can be interpreted differently by different parts of the circuit that should be seeing the same value. It's possible in practice to push the chance of this happening extremely low, but it comes at the cost of latency: you need to spend more time to allow the circuit to actually settle into one state or the other.
Within a fully synchronous system, it's possible to avoid this, though. But most multi-processor systems are not synchronous.
londons_explore
a year ago
> But most multi-processor systems are not synchronous.
Is this actually true? A modern AMD or Intel CPU doesn't have one quartz crystal per core, so therefore all per-core clocks have to be derived from one clock source, meaning the relationship between all the clock frequencies and phases is deterministic, even if not specced in the datasheet.
YZF
a year ago
There would be a single clock typically but OP is right in the sense that as the signal propagates through the system you can get nondeterministic effects: https://en.wikipedia.org/wiki/Metastability_(electronics)
In practice the probability of that is just driven low enough that it's not an issue.
rcxdude
a year ago
Multi-core isn't the same as multi-processor, and in practice while they are derived from the same reference, dynamic per-core clock scaling makes it pretty asynchronous, to say nothing of the I/O paths between the many independently clocked parts of a modern computer.
Animats
a year ago
> This is the first time I'm hearing about the topic of arbitrating multiprocessor access to memory.
See [1]. This was a very real problem in early multiprocessor computer design.
fanf2
a year ago
Are there many (or even any) multi-drop buses in modern computers? I get the impression that these days almost everything is point-to-point links. But I know less about what’s happening on silicon. I have seen talk of crossbar interconnects on microcontrollers, which (aiui) are more active, unlike passive buses?
Animats
a year ago
Memory controllers with more than one port have an internal arbiter. The ports are connected to various CPUs, GPUs, and I/O devices. Something has to sequence requests to the memory interface. Memory controllers used to be separate chips, but since about 2008 they're usually on the CPU die.
Memory arbiters are where atomic operations on memory are actually implemented.
avianlyric
a year ago
I2C buses still tend to turn up in modern computers for running various low level components. Stuff like allowing the CPU to talk to power management systems, controlling fan speeds, RGB lights etc
I don’t think there are any high bandwidth multi-drop busses anymore. Electrically multi-drop busses don’t make it easy to ensure the needed signal integrity for high speed busses.
imtringued
a year ago
Modern processors have "packet" (nowhere near as complex as ethernet/IP/TCP) based Network On Chip interconnects based with packet routing.
Processors suffer from the classical distributed system style issues just the same, the difference is that as you said they tend to be more reliable and the configuration doesn't change at runtime, but another key aspect is that a lot of latency tradeoffs aren't as lethal due to the short distances within a chip.
Accessing data that is kept in the cache of another processor only costs you 100ns at most and a lot of that overhead comes from time spent on the cache coherency protocol. Modern directory based cache coherency protocols are complicated beasts. Each processor core needs to be aware which processors have a copy of the cache line it is about to access, so that it can request exclusive write access to that cache line and mark every single one of those copies dirty upon modification.
The way the distributed system problems manifest themselves on the higher levels is that cache coherency by itself doesn't get you correct software automatically. You still have to use synchronization primitives or atomics on the software level, because your registers are essentially stale read replicas. Compare and swap on the processor level is no different than compare and swap on mongodb.
dheera
a year ago
Easy exactly-once delivery: Send a transaction ID with the transaction using an at-least-once system. The receiver ignores duplicate ID transactions. The sender can guarantee never sending the same ID twice.
Maxatar
a year ago
That's what the article calls exactly once processing.
An analogy would be pizza delivery. You wouldn't call someone who delivered 100 identical pizzas to you even though you only ate one of them "exactly once delivery".
kelnos
a year ago
Agreed, if I had to be the one who dealt with the extra pizzas. But if there was someone standing outside in my yard in order to discard the 99 duplicate pizzas before they got to my door (a travesty in and of itself, but that's another discussion), I would think of the system as a whole as "exactly once delivery".
And, analogously, that does seem to be what Sequin's system provides.
Maxatar
a year ago
But that wouldn't work which is the problem.
If someone named Bob was standing outside your yard discarding the 99 pizzas, then Bob would not be able to guarantee that you actually get exactly one pizza. It's possible that the one pizza Bob tries to give you gets rained on before Bob can actually hand it to you.
Only you yourself can actually implement a means of discarding extraneous pizzas, any middlemen introduced into the chain just adds another layer of indirection that is susceptible to failure.
avmich
a year ago
No, Bob standing outside of your yard is principally closer to you - as in, holding your hand - and can't fail to release pizza only after you got hold of it.
Of course the rain may flip bits in any system, but that would amount to destruction of you. You won't worry about pizza then, but your clone will spring up and get the pizza from Bob. And Bob himself is similarly protected from single-point-of-failure problems.
Of course the whole Bob subsystem may be destroyed, but that's already beyond the threshold of the problem we choose. Otherwise we'd have the external system to check if we'd actually got the pizza, and resend it again.
EDIT: that's probably the definition of "only you" - the Bob being that close. Agree then.
Maxatar
a year ago
If you have a system where failure is impossible then none of this applies and yes, you can implement a deliver exactly once protocol.
It's only in systems where failure is a possibility that exactly once delivery becomes impossible.
avmich
a year ago
Correct, and we need Bob to actually be part of us in order to tolerate his failures.
magicalhippo
a year ago
But then it's not the transport system that provides exactly-once delivery, but rather the users of the system.
dheera
a year ago
Wrap that up and abstract it as a transport system. The users don't need to know the inner workings, they just need a transport API that guarantees exactly-once.
magicalhippo
a year ago
The point is you can't wrap that up, because the faults can happen at the edges, ie where the systems interact.
The users still need to interact with this wrapped system, and it's those interactions that can fail, thus preventing the overall exactly-once delivery.
To flip it around, if this wasn't an issue then what you're suggesting already exists in the form of ACID databases.
orf
a year ago
> they just need a transport API that guarantees exactly-once
Which is the same problem, but at a different layer.