MiniMax teased M3 Sparse Attention: 9.7x prefilling, 15.6x decoding at 1M

8 points, posted 10 hours ago
by rebekkamikkoa
