ekidd
a day ago
"Sum types" (aka "Rust enums") are something that I've really come to love for representing the idea "There are 3 possible cases here, here's the data we need in each case, and please remember that these cases are strictly mutually exclusive."
Making this super easy and short to express makes a lot of designs clearer.
There's also an interesting tradeoff here between sum types and abstract interfaces:
- Abstract interfaces: "We have an unknown number of cases/subclasses, but they each support this fixed set of methods." The set of implementations is open, but the set of operations is harder to update.
- Sum types (aka Rust enums): "We know all the cases, but there may be many different methods." The set of cases is harder to extend, but the set of methods manipulating them is completely open.
A good example of the latter pattern is actually LLVM! LLVM's intermediate representation is essentially a small assembly language with a fixed set of operations (and an escape hatch). But because the set of basic instructions is small and known, it's possible to write dozens and dozens of optimization passes that each manipulate that small set of instructions.
And a surprising number of programs work well with an LLVM-like structure.
So in a language like Rust, which supports both abstract interfaces (via traits) and sum types (via enums), the tradeoff is whether you think you have many different types of values that you'll manipulate in a limited number of ways, or a small number of types of values that you'll manipulate in many different ways.
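A minimal Rust sketch of that tradeoff (the `Shape`/`HasArea` names are just for illustration, not from any real codebase):

```rust
// Sum type: fixed set of cases, open set of operations.
enum Shape {
    Circle { r: f64 },
    Square { side: f64 },
}

// Anyone can add another function over the known cases...
fn area(s: &Shape) -> f64 {
    match s {
        Shape::Circle { r } => std::f64::consts::PI * r * r,
        Shape::Square { side } => side * side,
    }
}

// Trait: open set of types, fixed set of operations.
trait HasArea {
    fn area(&self) -> f64;
}

struct Triangle { base: f64, height: f64 }

// ...and anyone can add another type, but adding a method to the
// trait means touching every existing impl.
impl HasArea for Triangle {
    fn area(&self) -> f64 { 0.5 * self.base * self.height }
}

fn main() {
    assert_eq!(area(&Shape::Square { side: 2.0 }), 4.0);
    assert_eq!(Triangle { base: 4.0, height: 3.0 }.area(), 6.0);
}
```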
(Oh, and the metaphor of "algebraic types" goes deeper than you might think; there are some super interesting data structures based on taking the "derivatives" of simple algebraic types.)
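The best-known example of that "derivative" idea is the zipper: differentiating the list type gives a pair of lists, a one-hole context around a focused element. A rough Rust sketch (names made up):

```rust
// The derivative of the list type is a pair of lists: the elements
// on each side of a focused "hole". Moving the focus is O(1).
struct ListZipper<T> {
    before: Vec<T>, // elements left of the focus, nearest last
    focus: T,
    after: Vec<T>,  // elements right of the focus, nearest last
}

impl<T> ListZipper<T> {
    // Shift the focus one element to the right; false if at the end.
    fn right(&mut self) -> bool {
        match self.after.pop() {
            Some(next) => {
                let old = std::mem::replace(&mut self.focus, next);
                self.before.push(old);
                true
            }
            None => false,
        }
    }
}

fn main() {
    // Zipper over [1, 2, 3] focused on 1 (`after` stores nearest last).
    let mut z = ListZipper { before: vec![], focus: 1, after: vec![3, 2] };
    assert!(z.right());
    assert_eq!(z.focus, 2);
    assert_eq!(z.before, vec![1]);
}
```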
ulrikrasmussen
a day ago
The lack of sum types should really be considered a weird omission rather than a strange fancy language feature. I suspect their omission from popular languages is mostly due to the historical belief that object orientation was the way forward and that "everything is an object", leading many programmers to represent things that are decidedly not objects using abstractions designed for objects.

An object is a stateful thing which exists over a period of time (sometimes unbounded) and whose identity is characterized by its observable behavior. Abstractions for objects consist of interfaces which expose only a slice of the observable behavior to consumers. A value, on the other hand, is really just that: a value. It does not have state, it does not start or stop existing, and its identity is defined only by its attributes. You can use objects to model values (poorly), but you often end up doing a lot of extra work to stop your "fake value" objects from behaving like objects, e.g. by being careful to make all fields immutable and implementing proper deep equality.

Abstractions for values are not naturally expressed using interfaces, because interfaces force the implementation to be tied to the instance of an object; since values do not have a lifetime, this restriction is a huge disadvantage and often leads to clunky and inflexible abstractions. For example, consider the Comparable interface in Java, which tells you that an object models a value that can be compared to other values of the same type. It would be awfully nice if List<A> could implement this interface, but it cannot, because doing so would mean that you can only create lists of things (of some type A) which have a total order defined on them.
However, if you consider programming to be about not only expressing what you can do to stateful objects but also expressing values and their operations, then algebraic data types and traits/modules/type classes become the natural basic vocabulary, as opposed to classes and interfaces. When dealing with first-order algebraic data types, products (i.e. records) and coproducts (i.e. sum types) are unavoidable. A list is one of the simplest algebraic data types that uses both products and coproducts:
data List a = Nil | Cons a (List a)
One trait of lists is that lists of totally ordered things are themselves totally ordered using lexicographic ordering, and in Haskell this is expressed as a type class instance:
instance Ord a => Ord (List a) where ...
Crucially, the type class instance is separate from the definition of what a list is, which enables us to still talk about lists of e.g. functions, which do not have an order on them. These lists do not have the Ord trait, but they are still lists.
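Rust's standard library does exactly the same thing: `Vec<T>` is ordered lexicographically only when `T` itself is ordered, mirroring the Haskell instance above.

```rust
fn main() {
    // Vec<T> implements Ord only when T: Ord, much like
    // `instance Ord a => Ord (List a)` in Haskell.
    let a = vec![1, 2, 3];
    let b = vec![1, 3];
    assert!(a < b); // lexicographic: 1 == 1, then 2 < 3 decides

    // Lists of closures still exist; they just have no ordering.
    let fs: Vec<Box<dyn Fn(i32) -> i32>> = vec![Box::new(|x| x + 1)];
    assert_eq!(fs[0](41), 42); // but `fs < fs` would not compile
}
```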
Another important distinction between traits and interfaces is that some behavior of value types consist of simply identifying special values. For example, the Monoid type class in Haskell is
class Monoid a where
  mempty :: a
  mappend :: a -> a -> a
It is impossible to express Monoid as an interface, because one of the features of a monoid is that you have a special element `mempty`, but consumers cannot get this value without first having another value of type `a` on their hands.

Many languages now have proper support for both objects and values, in that they have better primitives for defining both products and coproducts, but I still think that most mainstream languages are missing proper primitives for defining traits.
epolanski
a day ago
> It is impossible to express Monoid as an interface because one of the features of a monoid is that you have a special element `mempty`, but consumers cannot get this value without first having another value of type `a` on their hands.
Can't you trivially do so in TypeScript?
- define a Magma<A> interface `concat<A>(a:A, b: A) => A`,
- define a Semigroup<A>, same implementation of `Magma<A>` could even just point to it,
- define Monoid<A> as Semigroup<A> and `empty` property (albeit I prefer the term `unit`, empty is very misleading).
Now you can implement as many monoids as you want for strings, functions, lists, and use those instances with APIs that require a Monoid. E.g. you could have a `Foldable` interface that requires a `Monoid`, so if you have a Foldable for some A you can define foldMap.
Not sure what the practical differences would be.
After all, Haskell's monoids, same as in TypeScript, are contracts, not laws. There are no guarantees that the monoid properties (left and right identity, associativity) hold for every element of type `a`.
ulrikrasmussen
a day ago
Yes, you can do that, and you can do something similar in Java or Kotlin by `object : Monoid<Int> { override val mempty = 0; override fun mappend(x: Int, y: Int): Int = x + y }`. This is an encoding of traits as objects though, and is not as nice to work with as e.g. typeclasses and ML modules.
pdpi
a day ago
You can express Monoid as an interface if Monoid<T> is implemented by a standalone object instead of by T itself — it’s how Scala does it.
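The same standalone-instance pattern can be sketched in Rust (names like `Sum` and `mconcat` are made up for illustration):

```rust
// The instance is a value of its own type, separate from the carrier
// type A -- roughly what Scala's implicits/givens resolve for you.
trait Monoid<A> {
    fn empty(&self) -> A;
    fn append(&self, x: A, y: A) -> A;
}

struct Sum; // the monoid instance for i32 under addition
impl Monoid<i32> for Sum {
    fn empty(&self) -> i32 { 0 }
    fn append(&self, x: i32, y: i32) -> i32 { x + y }
}

// Consumers take the instance explicitly instead of finding it on A,
// so `empty` needs no pre-existing value of type A.
fn mconcat<A, M: Monoid<A>>(m: &M, xs: Vec<A>) -> A {
    xs.into_iter().fold(m.empty(), |acc, x| m.append(acc, x))
}

fn main() {
    assert_eq!(mconcat(&Sum, vec![1, 2, 3, 4]), 10);
}
```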
ahartmetz
a day ago
I wonder why sum types are named like that. They are not the sum of their possible states in a way that I would recognize; more like an exclusive-or, or one out of many. At some point I came up with an explanation, but I forgot it again. If you need to explain why something is simple, it's not simple.
duckfan77
20 hours ago
The number of valid options is the sum of the valid options for each variant. This is also why product types are called product types.
If you have a struct (product type) containing an 8 bit integer and a boolean, the struct has 512 possible states. 256 for the integer, times 2 for each value of the boolean.
If you have a sum type where one variant is an 8 bit integer, and the other is a boolean, you have 258 possible states. 256 for when it's an integer, and 2 for when it's a boolean.
Sum types add the options of each variant, product types multiply the options for each item in it.
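The counting can even be checked mechanically; in Rust terms (a sketch, with illustrative names):

```rust
// Product type: an 8-bit integer AND a boolean.
#[allow(dead_code)]
struct Both { n: u8, flag: bool } // 256 * 2 = 512 states

// Sum type: an 8-bit integer OR a boolean.
#[allow(dead_code)]
enum Either { N(u8), Flag(bool) } // 256 + 2 = 258 states

fn main() {
    let ints = (u8::MIN..=u8::MAX).count(); // 256 values for Either::N
    let bools = 2; // Either::Flag(false) and Either::Flag(true)
    assert_eq!(ints + bools, 258); // sum type: add
    assert_eq!(ints * bools, 512); // product type: multiply
}
```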
phkahler
a day ago
Sum type sounds like the old Variant from VB.
We use something similar in Solvespace: where one might have used polymorphism, there is one class with a type enum and a bunch of members that can be used for different things depending on the type. There are definitely some disadvantages (what does valA hold in this type?) but also some advantages from the consistency.
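A rough sketch of that hand-rolled pattern (field and variant names are hypothetical, not from Solvespace):

```rust
// One class, a type tag, and multipurpose members -- a sum type built
// by hand. The stated drawback: val_a's meaning depends on `kind`.
enum Kind { Distance, Angle }

struct Entity {
    kind: Kind,
    val_a: f64, // millimeters if Distance, degrees if Angle
}

fn describe(e: &Entity) -> String {
    match e.kind {
        Kind::Distance => format!("{} mm", e.val_a),
        Kind::Angle => format!("{} deg", e.val_a),
    }
}

fn main() {
    let e = Entity { kind: Kind::Angle, val_a: 90.0 };
    assert_eq!(describe(&e), "90 deg");
}
```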
GhosT078
a day ago
This tradeoff sounds similar to the choice to use "tagged types" versus "variant records" in Ada. Ada has provided variant records since 1983 and tagged types since 1995 (and both with a very nice syntax).
wk_end
a day ago
According to this [0] tutorial, variant records in Ada have the downside that you only get an error at runtime if you use them incorrectly, i.e. access data from one branch of the sum type when the actual value is on another. That's a pretty huge drawback.
https://learn.adacore.com/courses/intro-to-ada/chapters/more...
csb6
a day ago
Ada doesn’t have pattern matching, so there is no way for it to enforce that only the valid fields are accessed except at runtime (beyond warnings if the discriminant is known at compile time). Most people would just use a case statement that checks the variant discriminant before accessing any fields.
I believe the SPARK subset can (through static analysis) ensure that only valid fields are accessed for the current discriminant.
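For contrast, a sketch of what pattern matching buys you in a language that has it: field access only happens inside the arm for the matching variant, so the "wrong branch" error becomes a compile-time impossibility.

```rust
enum Value {
    Int(i64),
    Text(String),
}

// The payload is only in scope inside the arm that matched it, and
// the compiler rejects a non-exhaustive match outright.
fn render(v: &Value) -> String {
    match v {
        Value::Int(n) => n.to_string(),
        Value::Text(s) => s.clone(),
        // deleting an arm here is a compile error, not a runtime one
    }
}

fn main() {
    assert_eq!(render(&Value::Int(42)), "42");
    assert_eq!(render(&Value::Text("hi".into())), "hi");
}
```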
GhosT078
a day ago
I'm not sure I understand your point. Is it about such an error being reported at run time versus compile time? The GNAT compiler will often provide compile time warnings that "Constraint_Error will be raised at run time" in many such scenarios. Other Ada type checks would make this scenario improbable (although it could be contrived).