Hackernews
Show HN: A Zero-Copy 1.58-bit LLM Engine hitting 117 Tokens/s on a single CPU core
4 points
posted 8 hours ago
by dhilipsiva
(github.com)