Adaptive speculative decoding: picking draft lengths at runtime

2 points, posted 9 hours ago
by hasheddan
