Show HN: Fine-tuned Llama 3.2 3B to match 70B models for local transcripts

31 points, posted 5 months ago
by phantompeace

4 Comments

syntaxing

5 months ago

This is amazing. I've been having this exact problem with live STT (mainly for voice assistants). I'm curious whether your model + Whisper tiny would outperform Whisper small or even medium. I've been having issues where even Faster-Whisper small takes too long.

Also bummed that a purely non-thinking Qwen3-1.7B hasn't been released. Otherwise, I'm curious just "how low can you go".
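
A minimal sketch of the tiny-Whisper-plus-small-LLM pipeline asked about above, assuming faster-whisper for transcription and an Ollama-served Llama 3.2 3B for the cleanup pass. The audio file name, model tag, and prompt are placeholders, not the poster's or author's actual setup:

    # Sketch: Whisper tiny for a fast raw transcript, then one pass
    # through a small local LLM to clean it up.
    import requests
    from faster_whisper import WhisperModel

    # Tiny model keeps latency low; int8 helps on CPU.
    stt = WhisperModel("tiny", device="cpu", compute_type="int8")
    segments, _info = stt.transcribe("clip.wav")  # placeholder audio file
    raw = " ".join(seg.text.strip() for seg in segments)

    # Second pass: ask a local 3B model (served here via Ollama) to fix
    # punctuation and obvious mis-transcriptions without adding content.
    prompt = (
        "Clean up this speech-to-text output. Fix punctuation and obvious "
        "transcription errors; do not add or remove information.\n\n" + raw
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.2:3b", "prompt": prompt, "stream": False},
        timeout=120,
    )
    print(resp.json()["response"])

Whether this beats Whisper small end to end likely depends on where the errors come from: a text-only cleanup pass can fix mistakes that are recoverable from context, but not purely acoustic ones.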

ruben81ad

5 months ago

Thanks for sharing. It's impressive how much of a difference fine-tuning makes. Whether you use a large LLM or fine-tune a small one really comes down to cost.

As a fine-tuning noob, a question: how did you decide on the hyperparameter values?
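
For context on what typical values look like: a minimal sketch of a LoRA setup with Hugging Face peft/transformers, using common community defaults. These are illustrative starting points, not the author's actual hyperparameters:

    # Sketch: common starting hyperparameters for a LoRA fine-tune of a
    # small causal LM. All values here are widely used defaults, not the
    # ones used for the model in this post.
    from peft import LoraConfig
    from transformers import TrainingArguments

    lora = LoraConfig(
        r=16,                # adapter rank: higher = more capacity, more VRAM
        lora_alpha=32,       # scaling factor; often set to 2*r
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )

    args = TrainingArguments(
        output_dir="out",
        learning_rate=2e-4,             # typical for LoRA; full fine-tunes use ~1e-5
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,  # effective batch size of 16
        num_train_epochs=2,             # 1-3 is common; watch eval loss for overfitting
        lr_scheduler_type="cosine",
        warmup_ratio=0.03,
    )

A common approach is to sweep learning rate and epoch count first while watching eval loss, since rank mostly trades adapter capacity against memory.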

user

5 months ago

[deleted]

stuaxo

5 months ago

I can't trust Llama 3 because I have no idea what they did to the model to make it "less woke".