
mc7alazoun

8 hours ago

Feasible, but too expensive! I get that privacy is a priority for you, but unfortunately, if you want quality models, you'd probably still have to use frontier closed models.

mazinz

8 hours ago

> No open source model that’s any good?

the Gemma you tried is tiny; there are 31B and 26B (A4B) variants. there's also Qwen 3.6 with 27B and 35B (A3B) variants, reportedly pretty good. try them on OpenRouter or something. these require 30-40 GB of memory to run, split between RAM and VRAM, or less if quantized below near-lossless 8-bit.

there are near-SOTA open models, but they are 1T+ parameters, i.e. at 8 bits per weight they require over a terabyte of memory to run.
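
for anyone who wants to check the numbers above, here's a rough back-of-the-envelope sketch. it assumes memory is dominated by the weights (parameter count times bits per weight) and ignores the KV cache, activations and runtime overhead, so real usage will be somewhat higher. `weight_memory_gb` is just a name I made up for illustration, not something from a library.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same bit width; KV cache,
    activations and framework overhead are not included.
    """
    # 1 billion parameters at 8 bits each is 1 GB, so scale by bits / 8.
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Sizes mentioned in the thread: ~27B-35B models and a 1T-parameter model.
    for label, params_billion in [("27B", 27), ("31B", 31), ("35B", 35), ("1T", 1000)]:
        for bits in (8, 4):
            gb = weight_memory_gb(params_billion, bits)
            print(f"{label} at {bits}-bit: ~{gb:.1f} GB for weights")
```

at 8-bit the ~30B models land right around 30 GB before overhead, which is where the 30-40 GB figure comes from, and a 1T model at 8-bit is about a terabyte. note that for the MoE variants (A3B, A4B) all parameters still have to sit in memory; only the active ones are used per token.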

user

7 hours ago

[deleted]

Ask HN: Is it feasible to run a model on device for complete privacy?

7 Comments
