Google's TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x

8 points, posted 10 hours ago
by geoffbp

6 Comments

psidium

9 hours ago

Is it correct to rephrase PolarQuant as “embedding vectors but as complex numbers”, since polar vectors are represented as complex numbers in other fields of application? Would this open a new field of transformations using complex-number arithmetic?

Edit: I believe I had my math concepts all wrong. Vectors are currently represented with xyz coordinates, so that can be our complex-number equivalent. This paper saves memory by ‘simply’ transforming this “complex number” into its polar form (radius/magnitude and angle). Since you use fewer bytes on the angle + radius, you then always refer to that polar form instead of the regular vector/complex-number representation. If I have my concepts correct, this idea is simple and genius.

Edit 2: I may be missing the forest for the trees here. I’ll try to learn more from the actual sources if I can.
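To make the intuition above concrete, here is a minimal sketch of the general polar-quantization idea the commenter describes: treat consecutive coordinate pairs of an embedding vector as complex numbers, then store each pair's angle and radius at low bit-width instead of the raw coordinates. This is a hypothetical illustration of the concept, not the paper's actual algorithm; the function names and bit-widths are made up for the example.

```python
import numpy as np

def polar_quantize(vec, angle_bits=4, radius_bits=4):
    """Quantize consecutive (x, y) pairs of a vector in polar form.

    Illustrative sketch only: uniform scalar quantization of angle and
    radius, not the scheme used in the PolarQuant paper.
    """
    x, y = vec[0::2], vec[1::2]
    radius = np.hypot(x, y)            # magnitude of each pair
    angle = np.arctan2(y, x)           # angle in (-pi, pi]

    r_max = radius.max() if radius.max() > 0 else 1.0
    a_code = np.round((angle + np.pi) / (2 * np.pi)
                      * (2 ** angle_bits - 1)).astype(np.uint8)
    r_code = np.round(radius / r_max
                      * (2 ** radius_bits - 1)).astype(np.uint8)
    return a_code, r_code, r_max

def polar_dequantize(a_code, r_code, r_max, angle_bits=4, radius_bits=4):
    """Reconstruct an approximate vector from the polar codes."""
    angle = a_code / (2 ** angle_bits - 1) * 2 * np.pi - np.pi
    radius = r_code / (2 ** radius_bits - 1) * r_max
    out = np.empty(2 * len(a_code))
    out[0::2] = radius * np.cos(angle)
    out[1::2] = radius * np.sin(angle)
    return out

rng = np.random.default_rng(0)
v = rng.standard_normal(8)             # toy 8-dimensional "embedding"
a, r, rm = polar_quantize(v)
v_hat = polar_dequantize(a, r, rm)
```

With 4 bits each for angle and radius, each coordinate pair costs 1 byte instead of 8 bytes of float32, at the cost of some reconstruction error.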

jqpabc123

10 hours ago

“This cheat sheet is necessary because, as we say all the time, LLMs don’t actually know anything; they can do a good impression of knowing things through the use of vectors.”

In their words, it's not real intelligence --- it's a facsimile thereof that can be convincing to some.

This can be useful in some limited cases --- but also risky and dangerous if relied on for decision making.