Compiler optimizations for 5.8ms GPT-OSS-120B inference (not on GPUs)

9 points, posted 4 months ago
by olibaw

No comments yet