Hackernews
Pipeline-parallel LLM inference across GPUs on separate machines
4 points by ngaut, posted 10 hours ago (github.com)