mikewarot
3 hours ago
Training a model requires repetition. In the case of large language models, that means feeding in trillions of tokens and using gradient descent to improve the model's predictive power, repeating that update loop an enormous number of times.
Tokenization saves a few orders of magnitude in training cost compared to working with raw streams of text. (But it also results in LLMs that suck at basic math, spelling, and rhyming.)
Doing the same thing with raw inputs from the world would likely add six more orders of magnitude to any given training task, since you would have to scale the input fed into the AI up to the much wider bandwidths you're talking about.
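For concreteness, the token-level training loop I'm describing looks roughly like this toy PyTorch sketch (the model, the sizes, and the data source are made-up placeholders, not a real setup):

    # Toy sketch of next-token-prediction training; everything here is a stand-in.
    import torch
    import torch.nn as nn

    vocab_size, d_model = 50_000, 512            # hypothetical sizes
    model = nn.Sequential(                       # stand-in for a real transformer
        nn.Embedding(vocab_size, d_model),
        nn.Linear(d_model, vocab_size),
    )
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
    loss_fn = nn.CrossEntropyLoss()

    def training_step(tokens):                   # tokens: (batch, seq_len) integer ids
        inputs, targets = tokens[:, :-1], tokens[:, 1:]
        logits = model(inputs)                   # (batch, seq_len-1, vocab_size)
        loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()                               # one gradient-descent update
        return loss.item()

    # Repeat this step over batch after batch of tokens, for the whole corpus.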
You also have to have some form of goal to compute a loss against, and it's unclear what that would be. I'd suggest using "surprise minimization" as the goal; something that can simply predict raw surprise might turn out to be useful.
To get the compute requirements down into a feasible range, I'd suggest starting with an autoencoder. Much as tokenization does for LLMs, you could take that raw input and just try to compress it into a much lower-dimensional representation. You could then try to predict that compressed value at some point in the future.
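Roughly what I have in mind, as a sketch (the layer sizes, and the assumption that the raw input can be flattened into fixed-size float vectors, are mine):

    # Toy autoencoder: compress a wide raw input down to a small latent vector.
    import torch
    import torch.nn as nn

    RAW_DIM, LATENT_DIM = 65_536, 256            # made-up sizes

    class AutoEncoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(RAW_DIM, 2048), nn.ReLU(),
                nn.Linear(2048, LATENT_DIM),
            )
            self.decoder = nn.Sequential(
                nn.Linear(LATENT_DIM, 2048), nn.ReLU(),
                nn.Linear(2048, RAW_DIM),
            )

        def forward(self, x):
            z = self.encoder(x)                  # compressed representation
            return self.decoder(z), z

    ae = AutoEncoder()
    opt = torch.optim.Adam(ae.parameters(), lr=1e-3)

    def reconstruction_step(x):                  # x: (batch, RAW_DIM) floats
        recon, _ = ae(x)
        loss = nn.functional.mse_loss(recon, x)  # preserve as much information as possible
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()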
mikewarot
36 minutes ago
Ugh... missed the 2 hour window.
Initially I was focused on the training and memory requirements, but as I thought about it while doing other things, it occurred to me that the same techniques that work for LLMs should work with your idea.
Use an autoencoder to reduce the dimensionality of the data while preserving as much information as possible. That buys you orders of magnitude of compression while keeping the representation useful, and cuts the compute requirements of the next steps by roughly that factor squared.
Once the autoencoder is sufficiently effective, you can try to predict its state at some point in the future. If you have any tagged data, you can then do the whole gradient-descent, repeat-for-a-trillion-iterations thing.
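A minimal sketch of that prediction stage, assuming an encoder like the one above is already trained and frozen (the sizes and the use of mean-squared error as the "surprise" signal are my assumptions):

    # Predict the latent state k steps ahead; treat prediction error as surprise.
    import torch
    import torch.nn as nn

    LATENT_DIM = 256                             # must match the autoencoder's latent size

    predictor = nn.Sequential(
        nn.Linear(LATENT_DIM, 512), nn.ReLU(),
        nn.Linear(512, LATENT_DIM),
    )
    opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)

    def prediction_step(z_now, z_future):        # both: (batch, LATENT_DIM)
        z_hat = predictor(z_now)
        surprise = nn.functional.mse_loss(z_hat, z_future)  # prediction error
        opt.zero_grad()
        surprise.backward()
        opt.step()                               # one gradient-descent iteration
        return surprise.item()

    # Loop over recorded (z_t, z_{t+k}) pairs from the frozen encoder,
    # for as many iterations as compute allows.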
The thing is, trillions of cycles aren't really a barrier these days. Start with deliberately small systems, and work up.
Good luck!