charcircuit
14 hours ago
>In banking, telecom, and payments, reliability is not a nice to have. It is table stakes.
This reliability isn't achieved by being perfect 100% of the time. It comes from things like being able to handle states where transactions don't line up, so that payments can eventually be settled. Or, for telecom, making sure a single part of the system can't take down the whole thing, or adding redundancy. Essentially, these types of businesses require fault tolerance. The real world is messy and there are always going to be faults, so investing heavily in correctness may not be worth it compared to investing in fault tolerance.
cloudhead
2 hours ago
False dichotomy. If reliability matters, you have to invest in both. Fault tolerance is not a replacement for correctness.
rastrian
12 hours ago
Agree with the framing: in payments/telecom, reliability is often achieved via fault tolerance + reconciliation more than “perfect correctness.”
My point is narrower: those mechanisms still benefit from making illegal transitions unrepresentable (e.g. explicit state machines) so your retries/idempotency don’t create new failure modes. It’s not correctness vs tolerance, it’s correctness inside tolerant designs.
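A minimal sketch of what an explicit state machine with retry-safe transitions could look like, in Rust. The payment lifecycle, the `Payment` and `Event` types, and the `apply` function are all hypothetical names made up for illustration, not taken from any real payment system. The data-carrying enum rules out invalid states (there is no captured payment without an `auth_id`), while illegal transitions are rejected at runtime in a single place rather than ruled out by the type system.

```rust
// Hypothetical payment lifecycle; states and events are illustrative only.
#[derive(Debug, Clone, PartialEq)]
enum Payment {
    Pending,
    Authorized { auth_id: String },
    Captured { auth_id: String },
    Failed { reason: String },
}

#[derive(Debug, Clone, PartialEq)]
enum Event {
    Authorize { auth_id: String },
    Capture,
    Fail { reason: String },
}

// Returned when an event does not make sense for the current state.
#[derive(Debug, PartialEq)]
struct IllegalTransition {
    from: Payment,
    event: Event,
}

// Pure transition function: every allowed (state, event) pair is listed here.
fn apply(state: Payment, event: Event) -> Result<Payment, IllegalTransition> {
    use Event::*;
    use Payment::*;
    match (state, event) {
        // Forward transitions.
        (Pending, Authorize { auth_id }) => Ok(Authorized { auth_id }),
        (Pending, Fail { reason }) => Ok(Failed { reason }),
        (Authorized { auth_id }, Capture) => Ok(Captured { auth_id }),
        (Authorized { .. }, Fail { reason }) => Ok(Failed { reason }),
        // Retries of an event that was already applied leave the state unchanged.
        (Authorized { auth_id }, Authorize { auth_id: retry }) if auth_id == retry => {
            Ok(Authorized { auth_id })
        }
        (Captured { auth_id }, Capture) => Ok(Captured { auth_id }),
        (Failed { reason }, Fail { .. }) => Ok(Failed { reason }),
        // Everything else is rejected explicitly instead of silently accepted.
        (from, event) => Err(IllegalTransition { from, event }),
    }
}
```

With this shape, delivering `Capture` twice yields the same `Captured` state both times, while `Capture` on a `Pending` payment returns an `IllegalTransition` that can be logged and handed to reconciliation.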
discarded1023
14 hours ago
You'd like to know your fault tolerance is reliable and possibly even correct.
charcircuit
12 hours ago
Not if proving so is more expensive than not doing so. Reliability is only a means, not the end. Also, the human parts of the business would need to be simplified in order to model them. If they deviate from the model, that could invalidate it.
rastrian
12 hours ago
Agree on the economics. I’m not arguing for full formal proofs; I’m arguing for low-cost enforcement of invariants (ADTs/state machines/exhaustiveness) that makes refactors safer and prevents silent invalid states. Human processes will always drift, so you enforce what you can at the system boundary and rely on reconciliation/observability for the rest.
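As a rough illustration of enforcing at the boundary and relying on exhaustiveness inside, here is a small Rust sketch. The `Settlement` statuses, `parse_status`, and `is_final` are hypothetical names chosen for the example. Untrusted text is parsed exactly once into a closed type, and interior code matches on that type without a wildcard arm, so adding a variant later fails to compile until every such match handles it.

```rust
// Hypothetical settlement statuses as they might arrive from an external
// system; the names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Settlement {
    Pending,
    Settled,
    Reversed,
}

// Carries the unrecognized input so it can be logged or reconciled later.
#[derive(Debug, PartialEq)]
struct UnknownStatus(String);

// Boundary: untrusted text is converted once. Unknown values become an
// explicit error instead of leaking into the rest of the system.
fn parse_status(raw: &str) -> Result<Settlement, UnknownStatus> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "pending" => Ok(Settlement::Pending),
        "settled" => Ok(Settlement::Settled),
        "reversed" => Ok(Settlement::Reversed),
        other => Err(UnknownStatus(other.to_string())),
    }
}

// Interior: no wildcard arm, so a new `Settlement` variant is a compile
// error here until it is handled.
fn is_final(status: Settlement) -> bool {
    match status {
        Settlement::Pending => false,
        Settlement::Settled => true,
        Settlement::Reversed => true,
    }
}
```

The wildcard arm exists only in `parse_status`, where the input really is open-ended. That is the cheap part of the trade: the compiler flags every match that a refactor forgot, and anything unrecognized surfaces as an `UnknownStatus` value for observability to pick up.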
nickpsecurity
21 minutes ago
You can also argue that debugging time is expensive and that static checks reduce it. This is even more true for concurrency errors.