Token-Count-Based Batching: Faster, Cheaper Embedding Inference for Queries

1 point, posted 13 hours ago
by fzliu
