Pipeline-parallel LLM inference across GPUs on separate machines

4 points, posted 10 hours ago
by ngaut

No comments yet