js8
3 months ago
Not pixels, but percels. Pixels are points in the image, while a "percel" is a unit of perceptual information. It might be a pixel with an associated sound at a given moment in time. In the case of humans, percels include other senses as well, and they can also be annotated with your own thoughts (i.e. percels can also include tokens or embeddings).
Of course, NNs like LLMs never process a percel in isolation, but always as a group of neighboring percels (aka context), with an initial focus on one of the percels.
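To make the idea above concrete, here is a minimal sketch of a percel as a record with optional channels, plus a context window around a focused percel. The `Percel` class, its field names, and the `context` helper are hypothetical illustrations, not anything js8 specified; a missing channel is simply represented as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Percel:
    """Hypothetical record: one unit of perceptual information at time step t."""
    t: int
    rgb: Optional[Tuple[float, float, float]] = None  # visual channel, None = missing
    audio: Optional[float] = None                     # audio sample, None = missing
    tokens: Tuple[str, ...] = ()                      # optional annotations ("thoughts")


def context(stream: Sequence[Percel], focus: int, radius: int) -> List[Percel]:
    """Return the focused percel together with its neighbors, clipped at the ends."""
    lo = max(0, focus - radius)
    hi = min(len(stream), focus + radius + 1)
    return list(stream[lo:hi])


# A toy stream in which every odd time step has no audio.
stream = [
    Percel(t=i, rgb=(0.1 * i, 0.0, 0.0), audio=None if i % 2 else 0.5)
    for i in range(5)
]
window = context(stream, focus=2, radius=1)
```

Here `window` holds the percels at time steps 1, 2 and 3, which is the "group of neighboring percels" a model would see when focusing on step 2.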
almoehi
3 months ago
I had written up a proposal for a research grant to work on basically exactly this idea.
It got reviewed by 2 ML scientists and one neuroscientist.
Got totally slammed (and thus rejected) by the ML scientists due to “lack of practical application” and highly endorsed by the neuroscientist.
There’s so much unused potential in interdisciplinary research but nobody wants to fund it because it doesn’t “fit” into one of the boxes.
behnamoh
3 months ago
Make sure the ML scientists don't take credit for your work. Sometimes they reject a paper so they can work on it on their own.
almoehi
3 months ago
Grant reviews are blind reviews - so you don’t know. Also - and even worse - there is no rebuttal process. It gets rejected without you having a chance to clarify / convince reviewers.
Instead you’d need to resubmit and start the entire process from scratch. What a waste of resources …
It was the final nail that made me quit pursuing a scientific career path despite having good pubs & a PhD w/ honours.
Unfortunately it’s what I enjoy the most.
Enginerrrd
3 months ago
That's unfortunate. My personal sense is that while agentic LLMs are not going to get us close to AGI, a few relatively modest architectural changes to the underlying models might actually do that, and I do think mimicry of our own self-referential attention is a very important component of that.
While the current AI boom is a bubble, I actually think that AGI nut could get cracked quietly by a company with even modest resources if they get lucky on the right fundamental architectural changes.
almoehi
3 months ago
I agree - and I think having an interdisciplinary approach is going to increase the odds here. There is a ton of useful knowledge in related disciplines - often just named differently - that turns out to be investigating the same problem from a different angle.
shepardrtc
3 months ago
Sounds like those ML "scientists" were actually just engineers.
verdverm
3 months ago
A lot of progress is made through engineering challenges
This is also "science"
falcor84
3 months ago
I love this idea, but can't find anything about it. Is this a neologism you just coined? If so, is there any particular paper or work that led you to think about it in those terms?
js8
3 months ago
Yes, I just coined the neologism. It was supposed to be partly sarcastic (why stay at pixels, why not just go fully multimodal and treat the missing channels as missing information?), so I am kind of surprised it got so upvoted.
(IME, often my comments which I think are deep get ignored but silly things, where I was thinking "this is too much trolling or obvious", get upvoted; but don't take it the wrong way, I am flattered you like it.)
causal
3 months ago
Pretending channels can be effectively merged into a single percel vector, that would open up interesting channels beyond human perception even, e.g. lidar. Or it would be interesting to train a model that feels at home in 4D space.
jaredhansen
3 months ago
I think there's a decent chance you may have just created the ideal name for what will become one of the most important concepts ever. Bravo!
SJMG
3 months ago
Deep things often, not always, take more attention to appreciate than the superficial. It's a precious resource people are seldom disposed to allocate a lot of when headline-surfing HN.
throwaway-aws9
3 months ago
Should future attributions in white papers go to js8 from HN?
Workaccount2
3 months ago
Isn't this effectively what the latent space is? A bunch of related vectors that all bundle together?
js8
3 months ago
No, latent space doesn't have to be made of percels, just like not every 2D array of 3-element vectors is an image made of pixels. Percels are tied to your sensors, components of what you perceive, in totality.
Of course there is an interesting paradox - each layer of the NN doesn't know whether it's connected to the sensors directly, or what kind of abstractions it works with in the latent space. So the boundary between the mind and the sensor is blurred and to some extent a subjective choice.
taneq
3 months ago
“Percel” is still a way cooler and arguably more descriptive term than “token” though.
causal
3 months ago
This is an interesting thought. Trying to imagine how you represent that as a vector.
You still need to map percels to a latent space. But perhaps with some number of dimensions devoted to modes of perception? E.g. audio, visual, etc
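One way to picture "dimensions devoted to modes of perception" is a vector with a fixed slot per modality plus a presence flag, so a missing channel is encoded as missing rather than as silence or black. This is only a sketch under assumptions: the `LAYOUT` table, the slot widths, and the `to_vector` helper are hypothetical, and a real model would presumably learn an embedding rather than copy raw values.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical layout: modality name -> number of dimensions reserved for it.
LAYOUT = {"visual": 3, "audio": 2}


def to_vector(channels: Dict[str, Optional[Sequence[float]]]) -> List[float]:
    """Concatenate one block per modality, each followed by a presence flag."""
    vec: List[float] = []
    for name, width in LAYOUT.items():
        values = channels.get(name)
        if values is None:
            vec.extend([0.0] * width)  # placeholder values for a missing channel
            vec.append(0.0)            # flag: channel absent
        else:
            if len(values) != width:
                raise ValueError(f"{name} expects {width} values, got {len(values)}")
            vec.extend(float(v) for v in values)
            vec.append(1.0)            # flag: channel present
    return vec


percel_vec = to_vector({"visual": (0.2, 0.4, 0.6), "audio": None})
```

With this layout `percel_vec` has 7 entries: three visual values, a visual flag of 1.0, two audio placeholders, and an audio flag of 0.0.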
milanove
3 months ago
I'm not an ML expert or practitioner, so someone might need to correct me.
However, I believe the percel's components together as a whole would capture the state of the audio+visual+time. However, I don't think the state of one particular mode (e.g. audio or visual or time) is encoded with a specific subset of the percel's components. Rather, each component of the percel itself would represent a mixture (or a portion of a mixture) of the audio+visual+time. So, you couldn't isolate out just the audio or visual or time state specifically by looking at some specific subset of the percel's components, because each component is itself a mixture of the audio+visual+time state.
I think the classic analogy is that if river 1 and river 2 combine to form river 3, you cannot take a cup of water from river 3 and separate out the portions from river 1 and river 2; they're irreversibly mixed.
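A toy sketch of the mixing argument above, assuming the percel is produced by a dense linear map over a `[visual, audio]` input. The weights `W` and `W_LOSSY` and the `mix` helper are made up for illustration. Note the hedge: with a square invertible map the full vector still determines the inputs, and only a lossy projection behaves like the river analogy.

```python
from typing import List, Sequence


def mix(weights: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Dense linear map: every output component is a weighted sum of all inputs."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Hypothetical weights; the input is ordered as [visual, audio].
W = [[1.0, 1.0],
     [1.0, -1.0]]

a = mix(W, [3.0, 1.0])  # strong visual, weak audio
b = mix(W, [1.0, 3.0])  # weak visual, strong audio

# Component 0 is identical for both inputs, so on its own it cannot say
# how much of the signal came from which modality.
same_first_component = a[0] == b[0]

# A lossy projection (fewer outputs than inputs) merges the two inputs entirely.
W_LOSSY = [[1.0, 1.0]]
merged_a = mix(W_LOSSY, [3.0, 1.0])
merged_b = mix(W_LOSSY, [1.0, 3.0])
```

In this sketch `a` and `b` differ only in their second component, while `merged_a` and `merged_b` are equal, which is the case where the cup of water from river 3 can no longer be separated.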
BrokenCogs
3 months ago
I was going to say toxel
causal
3 months ago
Like a tokenized 3D voxel?
BrokenCogs
3 months ago
Tokenized pixel. I understand now that's not what js8 was talking about, so my original comment doesn't really make sense
szundi
3 months ago
[dead]
user
3 months ago