Hacker News
Field Notes on Scaling MoE Expert Parallelism with DeepEP
1 point
posted 10 hours ago
by todsacerdoti
(nousresearch.com)