ekidd
a day ago
"Sum types" (aka "Rust enums") are something that I've really come to love for representing the idea "There are 3 possible cases here, here's the data we need in each case, and please remember that these cases are strictly mutually exclusive."
Making this super easy and short to express makes a lot of designs clearer.
There's also an interesting tradeoff here between sum types and abstract interfaces:
- Abstract interfaces: "We have an unknown number of cases/subclasses, but they each support this fixed set of methods." The set of implementations is open, but the set of operations is harder to update.
- Sum types (aka Rust enums): "We know all the cases, but there may be many different methods." The set of cases is harder to extend, but the set of methods manipulating them is completely open.
A good example of the latter pattern is actually LLVM! LLVM's intermediate representation is essentially a small assembly language with a fixed set of operations (and an escape hatch). But because the set of basic instructions is small and known, it's possible to write dozens and dozens of optimization passes that each manipulate that small set of instructions.
And a surprising number of programs work well with an LLVM-like structure.
So in a language like Rust, which supports both abstract interfaces (via traits) and sum types (via enums), the tradeoff is whether you think you have many different types of values that you'll manipulate in a limited number of ways, or a small number of types of values that you'll manipulate in many different ways.
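A minimal Rust sketch of that tradeoff (the `Shape`/`HasArea` names are just for illustration, not from any real codebase):

```rust
// Sum type: fixed set of cases, open set of operations.
enum Shape {
    Circle { r: f64 },
    Square { side: f64 },
}

// Anyone can add another function over the known cases...
fn area(s: &Shape) -> f64 {
    match s {
        Shape::Circle { r } => std::f64::consts::PI * r * r,
        Shape::Square { side } => side * side,
    }
}

// Trait: open set of types, fixed set of operations.
trait HasArea {
    fn area(&self) -> f64;
}

struct Triangle { base: f64, height: f64 }

// ...and anyone can add another type, but adding a method to the
// trait means touching every existing impl.
impl HasArea for Triangle {
    fn area(&self) -> f64 { 0.5 * self.base * self.height }
}

fn main() {
    assert_eq!(area(&Shape::Square { side: 2.0 }), 4.0);
    assert_eq!(Triangle { base: 4.0, height: 3.0 }.area(), 6.0);
}
```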
(Oh, and the metaphor of "algebraic types" goes deeper than you might think; there are some super interesting data structures based on taking the "derivatives" of simple algebraic types.)
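The best-known example of that "derivative" idea is the zipper: differentiating the list type gives a pair of lists, a one-hole context around a focused element. A rough Rust sketch (names made up):

```rust
// The derivative of the list type is a pair of lists: the elements
// on each side of a focused "hole". Moving the focus is O(1).
struct ListZipper<T> {
    before: Vec<T>, // elements left of the focus, nearest last
    focus: T,
    after: Vec<T>,  // elements right of the focus, nearest last
}

impl<T> ListZipper<T> {
    // Shift the focus one element to the right; false if at the end.
    fn right(&mut self) -> bool {
        match self.after.pop() {
            Some(next) => {
                let old = std::mem::replace(&mut self.focus, next);
                self.before.push(old);
                true
            }
            None => false,
        }
    }
}

fn main() {
    // Zipper over [1, 2, 3] focused on 1 (`after` stores nearest last).
    let mut z = ListZipper { before: vec![], focus: 1, after: vec![3, 2] };
    assert!(z.right());
    assert_eq!(z.focus, 2);
    assert_eq!(z.before, vec![1]);
}
```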
ulrikrasmussen
a day ago
The lack of sum types should really be considered a weird omission rather than a strange fancy language feature. I suspect their omission from popular languages is mostly due to the historical belief that object orientation was the way forward and that "everything is an object", leading many programmers to represent things that are decidedly not objects using abstractions designed for objects.

An object is a stateful thing which exists over a period of time (sometimes unbounded) and whose identity is characterized by its observable behavior. Abstractions for objects consist of interfaces which expose only a slice of the observable behavior to consumers. A value, on the other hand, is really just that: a value. It does not have state, it does not start or stop existing, and its identity is defined only by its attributes. You can use objects to model values (poorly), but you often end up doing a lot of extra work to stop your "fake value" objects from behaving like objects, e.g. by being careful to make all fields immutable and implementing proper deep equality.

Abstractions for values are not naturally expressed using interfaces, because interfaces force the implementation to be tied to the instance of an object; since values do not have a lifetime, this restriction is a huge disadvantage and often leads to clunky and inflexible abstractions. For example, consider the Comparable interface in Java, which tells you that an object models a value that can be compared to other values of the same type. It would be awfully nice if List<A> could implement this interface, but it cannot, because doing so would mean that you can only create lists of things (of some type A) which have a total order defined on them.
However, if you consider programming to be about not only expressing what you can do to stateful objects but also expressing values and their operations, then algebraic data types and traits/modules/type classes become the natural basic vocabulary, as opposed to classes and interfaces. When dealing with first-order algebraic data types, products (i.e. records) and coproducts (i.e. sum types) are unavoidable. A list is one of the simplest algebraic data types that uses both products and coproducts:
data List a = Nil | Cons a (List a)
One trait of lists is that lists of totally ordered things are themselves totally ordered using lexicographic ordering, and in Haskell this is expressed as a type class instance:
instance Ord a => Ord (List a) where ...
Crucially, the type class instance is separate from the definition of what a list is, which enables us to still talk about lists of e.g. functions, which do not have an order on them. These lists do not have the Ord trait, but they are still lists.
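Rust's standard library does exactly the same thing: `Vec<T>` is ordered lexicographically only when `T` itself is ordered, mirroring the Haskell instance above.

```rust
fn main() {
    // Vec<T> implements Ord only when T: Ord, much like
    // `instance Ord a => Ord (List a)` in Haskell.
    let a = vec![1, 2, 3];
    let b = vec![1, 3];
    assert!(a < b); // lexicographic: 1 == 1, then 2 < 3 decides

    // Lists of closures still exist; they just have no ordering.
    let fs: Vec<Box<dyn Fn(i32) -> i32>> = vec![Box::new(|x| x + 1)];
    assert_eq!(fs[0](41), 42); // but `fs < fs` would not compile
}
```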
Another important distinction between traits and interfaces is that some behavior of value types consist of simply identifying special values. For example, the Monoid type class in Haskell is
class Monoid a where
  mempty :: a
  mappend :: a -> a -> a
It is impossible to express Monoid as an interface, because one of the features of a monoid is that you have a special element `mempty`, but consumers cannot get this value without first having another value of type `a` on their hands.

Many languages now have proper support for both objects and values, in that they have better primitives for defining both products and coproducts, but I still think that most mainstream languages are missing proper primitives for defining traits.
epolanski
a day ago
> It is impossible to express Monoid as an interface because one of the features of a monoid is that you have a special element `mempty`, but consumers cannot get this value without first having another value of type `a` on their hands.
Can't you trivially do so in TypeScript?
- define a Magma<A> interface `concat<A>(a:A, b: A) => A`,
- define a Semigroup<A>, same implementation of `Magma<A>` could even just point to it,
- define Monoid<A> as Semigroup<A> and `empty` property (albeit I prefer the term `unit`, empty is very misleading).
Now you can implement as many monoids as you want for strings, functions, lists, and use those instances with APIs that require a Monoid. E.g. you could have a `Foldable` interface that requires a `Monoid`, so if you have a Foldable for some A you can define foldMap.
Not sure what the practical differences would be.
After all, Haskell's monoids, same as in TypeScript, are contracts, not laws. There are no guarantees that the monoid properties (left and right identity, associativity) hold for every element of type `a`.
ulrikrasmussen
a day ago
Yes, you can do that, and you can do something similar in Java or Kotlin by `object : Monoid<Int> { override val mempty = 0; override fun mappend(x: Int, y: Int): Int = x + y }`. This is an encoding of traits as objects though, and is not as nice to work with as e.g. typeclasses and ML modules.
pdpi
a day ago
You can express Monoid as an interface if Monoid<T> is implemented by a standalone object instead of by T itself — it’s how Scala does it.
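The same standalone-instance pattern can be sketched in Rust (names like `Sum` and `mconcat` are made up for illustration):

```rust
// The instance is a value of its own type, separate from the carrier
// type A -- roughly what Scala's implicits/givens resolve for you.
trait Monoid<A> {
    fn empty(&self) -> A;
    fn append(&self, x: A, y: A) -> A;
}

struct Sum; // the monoid instance for i32 under addition
impl Monoid<i32> for Sum {
    fn empty(&self) -> i32 { 0 }
    fn append(&self, x: i32, y: i32) -> i32 { x + y }
}

// Consumers take the instance explicitly instead of finding it on A,
// so `empty` needs no pre-existing value of type A.
fn mconcat<A, M: Monoid<A>>(m: &M, xs: Vec<A>) -> A {
    xs.into_iter().fold(m.empty(), |acc, x| m.append(acc, x))
}

fn main() {
    assert_eq!(mconcat(&Sum, vec![1, 2, 3, 4]), 10);
}
```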
ahartmetz
a day ago
I wonder why sum types are named like that. They are not the sum of their possible states in a way that I would recognize; more like an exclusive-or, or one out of many. At some point I came up with an explanation, but I forgot it again. If you need to explain why something is simple, it's not simple.
duckfan77
20 hours ago
The number of valid options is the sum of the valid options for each variant. This is also why product types are called product types.
If you have a struct (product type) containing an 8 bit integer and a boolean, the struct has 512 possible states. 256 for the integer, times 2 for each value of the boolean.
If you have a sum type where one variant is an 8 bit integer, and the other is a boolean, you have 258 possible states. 256 for when it's an integer, and 2 for when it's a boolean.
Sum types add the options of each variant, product types multiply the options for each item in it.
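The counting can even be checked mechanically; in Rust terms (a sketch, with illustrative names):

```rust
// Product type: an 8-bit integer AND a boolean.
#[allow(dead_code)]
struct Both { n: u8, flag: bool } // 256 * 2 = 512 states

// Sum type: an 8-bit integer OR a boolean.
#[allow(dead_code)]
enum Either { N(u8), Flag(bool) } // 256 + 2 = 258 states

fn main() {
    let ints = (u8::MIN..=u8::MAX).count(); // 256 values for Either::N
    let bools = 2; // Either::Flag(false) and Either::Flag(true)
    assert_eq!(ints + bools, 258); // sum type: add
    assert_eq!(ints * bools, 512); // product type: multiply
}
```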
phkahler
a day ago
Sum type sounds like the old Variant from VB.
We use something similar in Solvespace: where one might have used polymorphism, there is one class with a type enum and a bunch of members that can be used for different things depending on the type. There are definitely some disadvantages (what does valA hold in this type?) but also some advantages from the consistency.
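A rough sketch of that hand-rolled pattern (field and variant names are hypothetical, not from Solvespace):

```rust
// One class, a type tag, and multipurpose members -- a sum type built
// by hand. The stated drawback: val_a's meaning depends on `kind`.
enum Kind { Distance, Angle }

struct Entity {
    kind: Kind,
    val_a: f64, // millimeters if Distance, degrees if Angle
}

fn describe(e: &Entity) -> String {
    match e.kind {
        Kind::Distance => format!("{} mm", e.val_a),
        Kind::Angle => format!("{} deg", e.val_a),
    }
}

fn main() {
    let e = Entity { kind: Kind::Angle, val_a: 90.0 };
    assert_eq!(describe(&e), "90 deg");
}
```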
GhosT078
a day ago
This tradeoff sounds similar to the choice to use "tagged types" versus "variant records" in Ada. Ada has provided variant records since 1983 and tagged types since 1995 (and both with a very nice syntax).
wk_end
a day ago
According to this [0] tutorial, variant records in Ada have the downside that you only get an error at runtime if you use them incorrectly, i.e. access data from one branch of the sum type when the actual value is on another. That's a pretty huge drawback.
https://learn.adacore.com/courses/intro-to-ada/chapters/more...
csb6
a day ago
Ada doesn’t have pattern matching, so there is no way for it to enforce that only the valid fields are accessed except at runtime (beyond warnings if the discriminant is known at compile time). Most people would just use a case statement that checks the variant discriminant before accessing any fields.
I believe the SPARK subset can (through static analysis) ensure that only valid fields are accessed for the current discriminant.
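For contrast, a sketch of what pattern matching buys you in a language that has it: field access only happens inside the arm for the matching variant, so the "wrong branch" error becomes a compile-time impossibility.

```rust
enum Value {
    Int(i64),
    Text(String),
}

// The payload is only in scope inside the arm that matched it, and
// the compiler rejects a non-exhaustive match outright.
fn render(v: &Value) -> String {
    match v {
        Value::Int(n) => n.to_string(),
        Value::Text(s) => s.clone(),
        // deleting an arm here is a compile error, not a runtime one
    }
}

fn main() {
    assert_eq!(render(&Value::Int(42)), "42");
    assert_eq!(render(&Value::Text("hi".into())), "hi");
}
```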
GhosT078
a day ago
I'm not sure I understand your point. Is it about such an error being reported at run time versus compile time? The GNAT compiler will often provide compile time warnings that "Constraint_Error will be raised at run time" in many such scenarios. Other Ada type checks would make this scenario improbable (although it could be contrived).