saagarjha
2 hours ago
Honest question: I feel like kernels are usually short enough that you can fully understand their performance during development, before you even deploy them. If you get different results in production, it seems to me that you didn’t spend enough time understanding what’s going on earlier. Are there things you genuinely can’t get from this workflow?
SyzygyRhythm
6 minutes ago
Sometimes you have to optimize other people's code. Also, sometimes code behaves unexpectedly depending on the data, say over a certain size threshold. And sometimes it behaves differently on different hardware. You don't always find these things out until production.
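To make the size-threshold point concrete, here is a rough sketch of the kind of sweep that can surface it before deployment. Everything in it is an assumption for illustration: `sweep` and `find_cliffs` are hypothetical helper names, `kernel` is any Python callable standing in for the real kernel, and the 2x `factor` is an arbitrary cutoff. It only covers the sizes and the machine you actually run it on, which is why production can still surprise you.

```python
import time

def sweep(kernel, sizes, repeats=5):
    """Return {size: best seconds per element} for each input size.

    `kernel` is a stand-in callable (hypothetical), not a real GPU kernel.
    """
    results = {}
    for n in sizes:
        data = list(range(n))
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(data)
            # Keep the fastest run to reduce noise from the rest of the system.
            best = min(best, time.perf_counter() - start)
        results[n] = best / n
    return results

def find_cliffs(results, factor=2.0):
    """Return sizes whose per-element cost exceeds `factor` times the cost
    at the next smaller measured size."""
    sizes = sorted(results)
    return [b for a, b in zip(sizes, sizes[1:])
            if results[b] > factor * results[a]]

if __name__ == "__main__":
    timings = sweep(sum, [2**k for k in range(10, 21, 2)])
    print(find_cliffs(timings))
```

Running the same sweep on each hardware target you care about is the cheap way to catch the "different on different hardware" case, at least for the machines you have access to.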