After 8 years, I rewrote my open-source PyTorch curvature library

1 point, posted 8 hours ago
by noahgolmant

1 comment

noahgolmant

8 hours ago

Back in 2018 I published pytorch-hessian-eigenthings, a niche open-source package for GPU-accelerated curvature analysis of PyTorch models. Loss-landscape curvature metrics like the eigenvalues of the Hessian have been implicated in many generalization properties of neural networks (flat-minima hypotheses, low-rank Hessian claims, etc.). But the full Hessian costs memory quadratic in the parameter count, which is usually infeasible. This library instead uses Hessian-vector products plus iterative methods (Lanczos, power iteration) to estimate the top eigenvalues and eigenvectors in linear memory. I stepped away from the project for years, but it ended up being used by other researchers doing curvature analysis. The original implementation had aged, so I decided to revisit it, and I now have more professional engineering experience under my belt to inform the design.
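
To give a flavor of the matrix-free idea (this is a minimal numpy sketch, not the library's API): power iteration only ever needs a "multiply by the Hessian" operation, so you can find the top eigenpair while storing one or two vectors instead of a quadratic-size matrix. Here the operator is a toy diagonal matrix with a known spectrum, standing in for an HVP computed by autodiff:

```python
import numpy as np

def power_iteration(hvp, dim, steps=100, seed=0):
    """Estimate the top eigenvalue/eigenvector of a symmetric operator
    given only matrix-vector products (never materializing the matrix)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    for _ in range(steps):
        w = hvp(v)          # linear memory: one vector in, one vector out
        v = w / np.linalg.norm(w)
    return v @ hvp(v), v    # Rayleigh quotient ~ top eigenvalue

# Toy "Hessian": diagonal with known spectrum, accessed only via products.
diag = np.array([5.0, 2.0, 1.0, 0.5])
lam, v = power_iteration(lambda x: diag * x, dim=4)
```

In the real setting, `hvp` is a double-backward (or forward-over-reverse) autodiff call against the loss, which costs roughly a few gradient evaluations per product.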

I just shipped a v1.0 rewrite. The new version adds new curvature operators (generalized Gauss-Newton, empirical Fisher) and new algorithms (Hutchinson and Hutch++ trace estimation, spectral density via stochastic Lanczos quadrature). It also has a fused Triton/torch.compile cross-entropy Hessian-vector kernel for foundation-model-scale vocabularies, where standard implementations blow up in memory. More importantly, it adds a lot of numerical analysis validating the operators: closed-form correctness tests on linear/logistic regression, where the Hessian is known analytically, and cross-library tests against curvlinops to catch regressions.
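
For anyone unfamiliar with Hutchinson's estimator, the core trick is that E[zᵀAz] = tr(A) when z has i.i.d. Rademacher (±1) entries, so the trace of the Hessian is estimable from matrix-vector products alone. A minimal numpy sketch (illustration only, not the library's implementation):

```python
import numpy as np

def hutchinson_trace(mvp, dim, n_probes=2000, seed=0):
    """Estimate tr(A) of a linear operator from matrix-vector products,
    using Rademacher probe vectors z with E[z z^T] = I."""
    rng = np.random.default_rng(seed)
    est = 0.0
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=dim)
        est += z @ mvp(z)          # one quadratic form per probe
    return est / n_probes

A = np.array([[3.0, 1.0],
              [1.0, 1.0]])        # toy symmetric operator; true trace = 4
t = hutchinson_trace(lambda x: A @ x, dim=2)
```

Hutch++ improves on this by first deflating the operator's dominant subspace with a low-rank sketch, which sharply reduces the variance when the spectrum is skewed, as Hessian spectra typically are.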

https://github.com/noahgolmant/pytorch-hessian-eigenthings

I'm hoping to use it for some follow-up analysis. For example, right now I'm looking at the agreement between update directions proposed by various optimizers (Muon, K-FAC, natural gradient descent) on Pythia checkpoints.
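
One simple way to quantify that kind of agreement (a hypothetical sketch of the analysis, not code from the library) is pairwise cosine similarity between flattened update vectors; the toy updates below just stand in for directions produced by the different optimizers:

```python
import numpy as np

def update_agreement(updates):
    """Pairwise cosine similarity between flattened optimizer updates.
    `updates` maps an optimizer name to its proposed parameter update."""
    names = list(updates)
    unit = {n: updates[n] / np.linalg.norm(updates[n]) for n in names}
    return {(a, b): float(unit[a] @ unit[b])
            for i, a in enumerate(names) for b in names[i + 1:]}

# Hypothetical toy vectors standing in for e.g. Muon / K-FAC / NGD updates.
rng = np.random.default_rng(0)
g = rng.standard_normal(10)
sims = update_agreement({"sgd": g,
                         "scaled": 2.0 * g,          # same direction
                         "random": rng.standard_normal(10)})
```

A similarity of 1 means two optimizers agree on direction exactly (differing only in step size), which is the distinction that matters for this kind of comparison.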

Very open to suggestions or requests from anyone who's been working in this space. I've been out of the field for a while, so pointers to recent work I should be aware of are very welcome.