Hackernews
MiniMax teased M3 Sparse Attention: 9.7x prefilling, 15.6x decoding at 1M
8 points, posted 10 hours ago by rebekkamikkoa (twitter.com)
No comments yet