lacker
13 hours ago
Dependent types are very useful for some things. For example, I wish Python had the ability to express "a 10 x 5 matrix of float32s" as a type, and typecheck that.
The Curry-Howard correspondence, using dependent type system to have "proofs" be equivalent to "types", is powerful, but it can be really confusing. From a human point of view, there is a huge difference between "Your proof is wrong" and "You wrote a statement that fails typechecking".
Intuitively, when you make an error with types, it should be something fairly trivial that you just read the error and fix it up. When you make an error in a proof, it's understandable if it's very complicated and requires thought to fix. The natural UI is different.
So I agree with the author that the greatest benefit of Lean is not its typesystem per se, but its community. Specifically the fact that Lean's library of mathematics, mathlib, is organized like an open source community with pull requests. Whereas Isabelle's library of mathematics, the AFP, is organized like a scientific journal with referees.
I'm working on a dependent type system at the moment for a new theorem prover - Acorn, at https://acornprover.org - and my hope is to combine the good points of both Lean and Isabelle. It's nice that Lean has the power to cleanly express the simple dependent types that mathematicians often use, like vector spaces or quotients. But if you use dependent types too much then it does get complicated to debug what's happening.
Sharlin
12 hours ago
> For example, I wish Python had the ability to express "a 10 x 5 matrix of float32s" as a type
To clarify, as long as 5 and 10 are constants, this is entirely possible in C++ and Rust^1, neither of which are dependently typed (or at most are dependently typed in a very weak sense). In general, neither can ensure at compile time that an index only known at runtime is in bounds, even if the bounds themselves are statically known. A proper dependently-typed language can prevent runtime out-of-bound errors even if neither the indices nor the bounds are known at type check time.
^1 And weakly in many other languages whose builtin array types have compile-time bounds. But C++ and Rust let user-defined generic types abstract over constant values.
thomasahle
12 hours ago
You can do that in python using https://github.com/patrick-kidger/torchtyping
looks like this:
def batch_outer_product(x: TensorType["batch", "x_channels"],
y: TensorType["batch", "y_channels"]
) -> TensorType["batch", "x_channels", "y_channels"]:
return x.unsqueeze(-1) * y.unsqueeze(-2)
There's also https://github.com/thomasahle/tensorgrad which uses sympy for "axis" dimension variables: b, x, y = sp.symbols("b x y")
X = tg.Variable("X", b, x)
Y = tg.Variable("Y", b, y)
W = tg.Variable("W", x, y)
XWmY = X @ W - Ypatrickkidger
7 hours ago
Quick heads-up that these days I recommend https://github.com/patrick-kidger/jaxtyping over the older repository you've linked there.
I learnt a lot the first time around, so the newer one is much better :)
thomasahle
7 hours ago
Ah, I would have never thought jaxtyping supports torch :)
saghm
10 hours ago
> In general, neither can ensure at compile time that an index only known at runtime is in bounds, even if the bounds themselves are statically known.
Yeah, this seems like matrixes might not be a great first example for explaining the value of dependent types. It's fully possible to define a matrix that uses a generic type as the index for each dimension that doesn't allow expressing values outside the expected range; it would just be fairly cumbersome, and the usual issues would creep back in if you needed to go from "normal" integers back to indexes (although not if you only needed to convert the indexes to normal integers).
I find that the potential utility of dependent types is more clear when thinking about types where the "dimensions" are mutable, which isn't usually how I'd expect most people to use matrixes. Even a simple example like "the current length of a list can be part of the type, so you define a method to get the first element only on non-empty lists rather than needing them to return an optional value". While you could sort of implement this in a similar as described above with a custom integer-like type, the limitations of this kind of approach for a theoretically unbounded length are a lot more apparent than a matrix with constant-sized dimensions.
jyounker
8 hours ago
The advantage of dependent typing systems is that you can say that the arguments to a matrix multiplication function must have dimensions (m, i) and (i, n) and that result will have dimensions (m, n).
akst
an hour ago
As others have said, this isn't a unique value proposition of dependent types, by all means dependent types provide a solution to this, but so do many other approaches.
You can do this in C++ & Rust, and I've done something similar in Typescript. The point of the comment you're replying to is kind of making the point that pitches for dependent types probably should give examples of something not already solved by other systems (not that you can't do this in DT). See:
> Yeah, this seems like matrixes might not be a great first example for explaining the value of dependent types.
In order to adopt DT, you need to pay a combined cost of productivity loss in terms of established tooling and choose to forgo any human capital which is no longer applicable once you start using a language with Dependent types, and a loss of time spent learning to use a language that has dependent types, which almost certainly has a smaller library ecosystem than most mainstream languages without DT.
For that reason it's worth considering what examples illustrate a benefit that outweighs all the above costs. I don't think it's enough to simply assert some moral basis on safety when clearly salaries for work done in unsafe languages are already considerably high.
saghm
an hour ago
Yeah, right now there isn't really any example of the value tradeoff being high enough that its overcome the barriers to adoption. Examples making the benefits more clear are helpful in being able to talk about dependent types in my opinion because most programmers (including myself beyond a fairly surface-level understanding, honestly) probably aren't familiar with them; once that's been established, I think it's a lot easier to talk about why the approach hasn't yet become mainstream.
For example, with the potential utility in having dependent types for list APIs, the barrier is that in practice, it's not always straightforward to be able to define your code in a way where you actually know the size of the list. If I read input from a user, I might be able to do an initial check that a list of items is non-empty as part of the constructor for the type, but what happens if I want to start popping off elements from it? In practice, this seems like it would either force you to go back to manually checking or only ever process things in a purely functional way to avoid defining behavior that vary based on state that isn't known. If you're willing to take this approach, having the compiler enforce it is great! You have to be willing to use this approach often enough for it to be worth picking a language with dependent types though, and in practice it doesn't seem likely that people would choose to even if they were more aware of the option.
akst
14 minutes ago
> Examples making the benefits more clear are helpful in being able to talk about dependent types in my opinion because most programmers (including myself beyond a fairly surface-level understanding, honestly) probably aren't familiar with them
There's probably some truth in that (I'm probably in this camp as well), though I feel sometimes adoption of things like this can often be a communication problem, and the people who understand these technologies the most may struggle to identify what aspect of these technologies are valued more by the people they are trying to pitch them too.
Like they might even find those aspects boring and uninteresting because they are so deep into the technology, that they don't even think to emphasis them. Which feels like a similar story where a technology early to the party didn't take off until someone else came back to them later (like fat structs and data oritented design gaining some interest despite early versions of these ideas date back to Sketchpad (1963)).
> For example, with the potential utility in having dependent types for list APIs, the barrier is that in practice, it's not always straightforward to be able to define your code in a way where you actually know the size of the list
Yeah this example crossed my mind as well, it's not immediately clear how you deal with IO or data the result of inputs from the outside world. Although I have feeling this has to be a solved problem.
I guess more generally with API design, sometimes you don't really know the system you want to model yet so there's diminishing returns in being overly precise in how you model it if those things are subject to change. Not sure how substantive a concern this is when weighing up using dependent types, but it is something that crosses my mind as an outsider to all this dependent type stuff.
eru
3 hours ago
You don't need dependent types for that, either.
codebje
2 hours ago
Not for matrix multiplication, because there's no term-level chicanery to worry about; you just need phantom types for the two dimensions on your matrix type.
You can express a lot just carrying phantom types around and using basic pattern matching: in Rust I can express multiplication, concatenation of vectors to matrices, and inverses of square matrices, using phantom types.
But I can't say this, or anything remotely like it:
struct Matrix<M: u32, N: u32, T> {
dim: (M, N),
// imaginary dependently typed vectors
cells: Vec<M, Vec<N, T>>,
}
impl<M, N, T> Matrix<M, N, T> {
fn mul<O: u32>(&self, &other: Matrix<N, O, T>) -> Matrix<M, O, T> {
let cells: Vec<self.m, Vec<other.o, T>> = do_the_thing();
Matrix {
// any other values in dim, or a different type in cells, would be a type error
dim: (self.m, other.o),
cells
}
}
}
In other words, there's no dependent pair construction without dependent types. (Or at least dependent kinds and some tricks.)zozbot234
12 hours ago
Yes, the point of dependent types is that they give you the ability to do some sort of almost arbitrary (though not strictly Turing-complete) "computation" as part of type checking, which essentially dispenses with the phase separation between "compiling" and "running" code - or at least makes compile-time computation unusually powerful. So if you want to replicate the properties of dependent typing in these existing languages you'll need to leverage their existing facilities for compile-time metaprogramming.
> A proper dependently-typed language can prevent runtime out-of-bound errors even if neither the indices nor the bounds are known at type check time.
Yes, but the way this is done is by threading a proof "this index is within bounds" throughout the code. At runtime (e.g. within 'extracted' code, if you're using common dependently-typed systems), this simply amounts to relying on a kind of capability or ghost 'token' that attests to the validity of that code. You "manufacture" the capability as part of an explicit runtime check when needed (e.g. if the "index" or "bounds" come from user input) and simply rely on it as part of the code.
gpderetta
8 hours ago
You can do arbitrary computations as part of type checking in C++, yet I don't think it should be considered dependently typed.
It seems to me that dependent typing strictly requires going from runtime values to types.
(You can parametrize types by runtime values in c++ in a trivial sense, by enumerating a finite set a compile time and then picking the correct type at runtime according to a runtime value, still I don't think it counts as the set of valid types would be finite).
zozbot234
8 hours ago
The "runtime" values of a dependent type system are not quite runtime values either, because the computation is happening at compile time. When you "extract" a program from a dependently-typed to an ordinary programming language that can compile down to a binary, the dependent types are simply erased and replaced with ordinary, non-dependent types; though there may be "magic" casts in the resulting program that can only be proven correct via dependent types.
wk_end
4 hours ago
It's not that simple, right? Because the computation can actually rely on the types, so the types can't always just be erased. The complexity around knowing when the types are going to be erased or not is part of what motivated Idris 2's adoption of quantities.
https://idris2.readthedocs.io/en/latest/updates/updates.html...
codebje
an hour ago
QTT is about clarity on erasing values rather than types, eg, if you define 'Vect n a' do you need to set aside space in the memory structure for a vector for 'n' or not. The types themselves are not represented. If, eg, you wanted to implement something that printed out its own type, like:
vname : Vect n a -> String
vname = "Vect " ++ show n ++ " " ++ nameOf a
Then (1) you'd be requiring "n" to be represented at run-time (so a QTT value of "unrestricted"), and (2) you'd need a type class to find "nameOf" for "a" when executing, because the actual type of "a" is gone. At runtime a type class means a table of methods passed to the function so it can call the right "nameOf" implementation. class Named a where
nameOf a : String
vname : Named a => Vect n a -> String
≡ vname : String -> Vect n a -> Stringlacker
12 hours ago
That's a good point, for example in Eigen you can do
Eigen::Matrix<float, 10, 5>
I just really want it in Python because that's where I do most of my matrix manipulation nowadays. I guess you also would really like it to handle non-constants. It would be nice if these complicated library functions like
torch.nn.MultiheadAttention(embed_dim, num_heads, dropout=0.0, bias=True, add_bias_kv=False, add_zero_attn=False, kdim=None, vdim=None, batch_first=False, device=None, dtype=None)
would actually typecheck that kdim and vdim are correct, and ensure that I correctly pass a K x V matrix and not a V x K one.
zozbot234
12 hours ago
Python is a dynamic language where everything happens at run time, including checks of variable types. So you can already do this.
nephanth
10 hours ago
Python has static typechecking which, while not perfect, works pretty well as long as you're not actually trying to circumvent it (or manipulating things "too dynamically")
eru
3 hours ago
I think Python actually has multiple, different static typechecking systems, depending on which checker you use.
Python itself only gives you type annotations, but doesn't specify what they are supposed to mean.
(I think. Please correct me if I am wrong.)
bluGill
10 hours ago
The problem with runtime is when you make a mistake it can be a long time before you find out. Compile time means a large class of problems is prevented without perfect test coverage. On a small project it isn't a big deal but when you have hundgeds of developers over decades you will miss something
etbebl
12 hours ago
The latter is what would be most useful imo. Even Matlab can type check matrix sizes with constants these days, but I often wish I could use variables to express relationships between the sizes of different dimensions of inputs to a function.
tmtvl
10 hours ago
In Common Lisp:
(defparameter *test-array*
(make-array '(10 5)
:element-type 'Float
:initial-element 0.0))
(typep *test-array* '(Array Float (10 5)))
And the type check will return true.ndr
8 hours ago
Compile-time checks is what's implied in virtually all these conversations.
Does that check run at compile time?
tmtvl
7 hours ago
If I define a function foo which takes an (Array Float (10 5)) and then define a function bar which uses the result of calling foo with an (Array String (3 3)) then SBCL will print a warning. If I don't redefine foo and call bar with such an array then a condition will be raised. It's not quite the same as, say Rust's 'thing just won't compile', but that's due to the interactive nature of Common Lisp (I believe someone called it the difference between a living language and a dead language (though the conventional meaning of living language and dead language in programming is a bit different)).
chongli
5 hours ago
Can you replace 10 5 and 3 3 with variable names and still get the warning without first calling the function?
thesz
8 hours ago
"In general, neither can ensure at compile time that an index only known at runtime is in bounds, even if the bounds themselves are statically known."
I remember being able to use telescopes [1] in Haskell long time ago, around 2012 or so.
[1] https://www.pls-lab.org/en/telescope
Haskell was not and is not properly dependently typed.
uecker
12 hours ago
C has such types and can guarantee that there is no out-of-bounds access at run-time in the scenarios you describe: https://godbolt.org/z/f7Tz7EvfE This is one reason why I think that C - despite all the naysayers - is actually perfectly positioned to address bounds-safe programming.
Often in dependently-types languages one also tries to prove at compile-time that the dynamic index is inside the dynamic bound at run-time, but this depends.
Sharlin
10 hours ago
-fsanitize-bounds uses a runtime address sanitizer, surely? The program compiles fine. In a (strongly) dependently typed language, something like the following would refuse to typecheck:
int foo(int i) {
int bar[4] = { 1, 2, 3, 4 };
return bar[i]
}
The type checker would demand a proof that i is in bounds, for example int foo(int i) {
int bar[4] = { 1, 2, 3, 4 };
if i < 4
return bar[i]
else
return 0
}
In languages with an Option type this could of course be written without dependent types in a way that's still correct by construction, for example Rust: fn foo(i: 32) -> i32 {
let bar = [1, 2, 3, 4];
bar.get(i) // returns Option<i32>, not a raw i32
.unwrap_or(0) // provide a default, now we always have an i32
}
But ultimately, memory safety here is only guaranteed by the library, not by the type system.uecker
9 hours ago
You seem to be repeating what I said except you mixup strong and static typing. In a statically dependent-typed language you might expect it this not to compile, but this also depends. Run-time checking is certainly something you can combine with strong dependent typing.
Implementing bounds checking that returns option types (which can also implement in C) is not exactly the same thing. But dependent typing can be more elegant - as it is here.
uecker
8 hours ago
Update: For the interested, here are three ways to write this in C, the first using dependent type, the second using a span type, and the third using an option type. All three versions prevent out-of-bounds accesses: https://godbolt.org/z/nKTfhenoY
turndown
12 hours ago
I thought that this was a GCC extension(you need to use #define n 10 instead of int n = 10). Is this not the case anymore?
uecker
10 hours ago
This is in ISO C99. In C11 it was made an optional feature but in C23 we made the type part mandatory again.
akst
8 hours ago
One of the things I love about typescript is if your audacious enough you can kind of make some types like this (still a far cry from dependent types):
https://github.com/AKST/analysis-notebook/blob/main/lib/base... (Line 38)
It type checks, but I haven’t gotten a lot of mileage out of this (so far at least). I did something similar for vectors, which has been useful for when I destructure the elements, and it preserves that through vector operations.
https://github.com/AKST/analysis-notebook/blob/777acf427c65c...
The same is actually applies true for matrices, I wrote the multiplication to carry the new size into the output type
https://github.com/AKST/analysis-notebook/blob/777acf427c65c...
That said I mostly use this for square matrixes in web GL so you’re mostly working with square matrixes.
As an aside, this source is from a toy project largely more for leisure and fun, I don’t know if something like this would be suitable for a larger more data oriented project (at least in trend of how I represent data here).
aatd86
12 hours ago
> using dependent type system to have "proofs" be equivalent to "types"
Do you mean proposition as types? And proof has a program?
Or do you see it at somewhat a higher order, since values are proof for a type, and types can perhaps be proof at a higher level if they are first class citizen in the language?
(Which makes me think that dependent types is really a stairway between two type systems?)
Just wondering, since I had never personally seen it described in that fashion.
Other question: What is the function and runtime treatment of dependent types? Can they be instantiated on the spot by runtime values? A bit like defining/declaring functions that return arrays of runtime-known length? Does it involve two aspects of type checking? compile and runtime? Implementation-wise, doesn't it require too many indirections?
lacker
6 hours ago
Yeah, I just meant proposition-as-types. Sloppy terminology by me. I meant to complain about the general conflation of the "proving system" and the "typechecking system".
Generally in dependent type systems you are limited in some way in what sort of values a type can refer to. The fancy type checking happens at compile time, and then at run time some or most of the type information is erased. The implementation is tricky but the more difficult problem IMO is the developer experience; it involves a lot of extra work to express dependent types and help the compiler verify them.
olivia-banks
8 hours ago
I did a small small experiment in dependent typing for Python (and catching those errors before run time). It's a decently good fit, and with HM Type inference some neat things just fall out of the design.
But the real challenge is deciding how much of Python's semantics to preserve. Once you start enforcing constraints that depend on values, a lot of idiomatic dynamic behavior stops making sense. Curious what ways there are to work around this, but I think that's for a person better at Python than I :)