Show HN: On-device transcriber that's 97% accurate at identifying speakers

32 pointsposted 19 days ago
by marshalla

10 Comments

mattlohkamp

18 days ago

Hey Marshall! Cool to see this coming together, kudos for buildimg the tool you wish you had, thats the right reason to do things!

it seems like these “realtime meeting assistant / transcriber” services have taken a huge leap closer to being what I too have have often found myself wishing for. (Recently I gave Hedy AI a shot, very much in the same neighborhood functionally feels like)

out of curiosity, for Mimic’s Local Mode, whatre the tech specs required for a reasonable level of performance?

marshalla

18 days ago

Matt, Thanks!

I just tried Hedy, same concept, also a great tool. It's todos are nice.

MimicScribe works well with any Apple silicon Mac so it'll feel snappy on an M1 with 8GB of RAM even. It uses Apple's on-device ML accelerator, the ANE. https://mimicscribe.app/docs/performance

joey9prints

16 days ago

I think the privacy story is super important, I like the use of local models as the fix here and understand that the output wouldn't be as good as frontier models. Perhaps look into venice for more private inference.

marshalla

12 days ago

venice.ai? Ok nice I'll check it out. thanks. ya I added the openai compat endpoint, no brainer.

pavelpilyak

18 days ago

Looks great! Feature suggestion: would be great to plug in ollama for the AI parts. Not as great as a BYOK, but worth it to those who want to keep everything local

marshalla

18 days ago

Hey thanks! Good call. I had limited success getting Qwen 3.5 9B working for some of the longer prompts that require lots of json output. I feel like completely on-device is so close to being usable for this stuff though. I should revisit this, actually.

bissellator

18 days ago

running locally is great, but I would wonder about the system requirements (I love my mac air but ollama on 8GB with an Intel i5 probably would be a little too much for the little thing). But having a toggle option would be great

marshalla

17 days ago

Right? I feel like local model support is something that has strong ideological appeal and might influence someones feeling towards the app, but when it comes down to it most people will probably just use a cloud model for larger tasks unless they have beastly hardware. It's like how I have an Android in part because I might one day flash the ROM.

Ollama just exposes models via an OpenAI compatible endpoint though (I'm pretty sure), so adding that standard is probably a good idea. The prompts are a bit tuned for Gemini. I'd have to test how much that matters.

apex_sloth

18 days ago

Somehow I expected from the headline it identifies loudspeakers by their sound signature and got really curious :)

marshalla

18 days ago

lol that'd be a trick. I'd have it purposely misidentify to cheaper brands to mess with people.