1.8-3.3x faster Embedding finetuning now in Unsloth

3 points, posted 16 days ago

3 Comments

storystarling

15 days ago

Do the memory savings carry over to inference or is this strictly optimizing the backward pass? I'm running embedding pipelines via Celery and being able to squeeze this into lower VRAM would help the margins quite a bit.

danielhanchen

16 days ago

Excited to have collabed on this! Thanks electroglyph for the contrib!

electroglyph

16 days ago