Hackernews
CAD: Disaggregating Core Attention for Efficient Long-Context LLM Training
6 points
posted 10 hours ago
by ginda307
(hao-ai-lab.github.io)
1 comment
user
10 hours ago
[deleted]