bob1029
9 days ago
I feel like there is some kind of information theory constraint which confounds our ability to extract higher order behavior from multiple instances of the same LLM.
I spent quite a bit of time building a multi agent simulation last year and wound up at the same conclusion every day - this is all just a roundabout form of prompt engineering. Perhaps it is useful as a mental model, but you can flatten the whole thing to a few SQL tables and functions. Each "agent" is essentially a SQL view that renders a string template into the prompt.
I don't think you need an actual 3D world, wall clock, etc. The LLM does not seem to be meaningfully enriched by having a fancy representation underlie the prompt generation process. There is clearly no "inner world" in these LLMs, so trying to entertain them with a rich outer environment seems pointless.
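To make the flattening concrete, here is a minimal sketch of the agent-as-SQL-view idea. The schema, names, and template are made up for illustration, not from any real system:

```python
import sqlite3

# Illustrative sketch: each "agent" is just a row of state plus a
# prompt template, i.e. the SQL-view framing described above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agents (name TEXT, role TEXT, memory TEXT)")
conn.execute(
    "INSERT INTO agents VALUES "
    "('miner_1', 'gather iron', 'saw a creeper at spawn')"
)

# A "view" that renders the prompt string directly in SQL.
conn.execute("""
CREATE VIEW agent_prompts AS
SELECT name,
       'You are ' || name || '. Your goal: ' || role ||
       '. You remember: ' || memory AS prompt
FROM agents
""")

prompt = conn.execute(
    "SELECT prompt FROM agent_prompts WHERE name = 'miner_1'"
).fetchone()[0]
# This string is what would be sent to the LLM; a "3D world" only
# matters insofar as it writes back into the memory column.
```

Whether the memory column is updated by a physics engine or by hand, the model only ever sees the rendered string.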
chefandy
9 days ago
TBH I haven't seen a single use of LLMs in games that wasn't better served by traditional algorithms beyond less repetitive NPC interactions. Maybe once they get good enough to create usable rigged and textured meshes with enough control to work in-game? They can't create a story on the fly that's reliable enough to be a compelling accompaniment to a coherent game plot. Maps and such don't seem to need anything beyond what current procedural algorithms provide, and they're still working with premade assets— the implementations I've seen can't even reliably place static meshes on the ground in believable positions. And as far as NPCs go— how far does that actually go? It's pure novelty worth far less than an hour of time. Let's even say you get a guided plot progression worded on the fly using an LLM, is that even as good, let alone better, than a dialog tree put together by a professional writer?
This Civ idea at least seems like a new approach to some extent, but it still seems not to add much conceptually. Even if it doesn't, learning that is still worthwhile. But almost universally these ideas seem to be either buzzwordy solutions in search of problems, or a cheaper-than-people source of creativity with some serious quality tradeoffs that still requires far too much developer wrangling to actually save money.
I'm a tech artist so I'm a bit biased towards the value of human creativity, but also likely the primary demographic for LLM tools in game dev. I am, so far, not compelled.
JohnMakin
9 days ago
It's been posted in-depth a few times across this forum to varying degrees by game developers - I was initially very excited about the implementation of LLMs in NPC interactions, until I read some of these posts. The gist of it was: the thing that makes a game fundamentally a game is its constraints. LLM-based NPCs fundamentally break these constraints in a way that is not testable or predictable by the developer and will inevitably destroy the gameplay experience (at least with current technology).
chefandy
9 days ago
Yeah, same. Epic's Matrix demo implemented it and even without a plot, the interactions were so heavily guided that the distinction was pointless. So you can find out what that NPC's spouse's name is and their favorite color. Is that neat? Sure, it's neat. Is it going to make it a better game? Probably less than hiring another good writer to make NPC dialog. To be truly useful, I think they would have to be able to affect the world in meaningful ways that worked with the game plot, and again, when you clamp that down as much as you'd need to in order to still have a plot, you're looking at a fancy decision tree.
MichaelZuo
9 days ago
Nobody will know for sure until a big budget game is actually released with a serious effort behind its NPCs.
chefandy
9 days ago
I can't see anything that Gen AI NPCs would add unless maybe you're talking about a Sims kind of game where the interactions are the point, and they don't have to adhere to a defined progression. Other than that, it's a chat bot. We already have chatbots and having them in the context of a video game doesn't seem like it would add anything revolutionary to that product. And would that fundamentally stand a chance of being as compelling to socially-focused role-playing gamers as online games?
This is my field so I'm always looking for the angle that new tech will take. I still rank this lower than VR— with all of its problems— for potential to significantly change player interactions. Tooling to make games is a different story, but for actual use in games? I don't see it yet.
mywittyname
9 days ago
Sandbox games are probably where they will shine. Imagine being able to play Minecraft and prompt it to generate a world that resembles Tatooine, or a vampire-themed mansion. Expectations are lower with sandbox games, so there's no risk of breaking immersion like there would be in an LLM Elder Scrolls game when someone tricks an NPC into solving problems in Python.
Granted, I'm certain there will be copyright issues associated with this capability, which is why I don't think it will be established game companies who first take a crack at this approach.
chefandy
9 days ago
The problem is what it takes to implement that. I've seen companies currently trying to do exactly that, and their demos go like this: "ok, give me a prompt for the environment" and if they're lucky, they can cherry-pick some stuff the crowd says, and if they're not, they sheepishly ask for a prompt that would indicate one of the 5 environment types they've worked on and include several of the dozen premade textured meshes they've made. In reality you've got a really, really expensive procedural map with asset placement that's worse than if it was done using traditional semi-pre-baked approaches. A deceptive amount of work goes into the nitty gritty of making environments, and even with all of the incredible tooling that's around now, we are not even close to automating that. It's worth noting that my alma mater has game environment art degree programs. Unless you're making these things, you can't easily see how much finesse and artistic sensibility it takes to make beautiful compositions with complementary lighting and nice atmospheric progression. It's not just that nobody has really given it a go— it's really difficult. When you have tooling that uses AI controlled by an artist who knows these things, that's one thing. When it needs to produce great results every time so players keep coming back? That's a very different task. Everyone I've met who thought it was currently feasible lacked knowledge of generative AI, game development, or both.
Automating the tools so a smaller workforce can make more worlds and more possibilities? We're already there— but it's a very large leap to remove the human creative and technical intermediaries.
MichaelZuo
8 days ago
What are the actual claims and/or arguments?
chefandy
8 days ago
To automatically generate a freeform 3D sandbox environment based on a prompt? Do you have a specific criticism or counterpoint?
MichaelZuo
8 days ago
How does this relate to the parent comment? Is there some counterargument why X is unlikely due to Y reasons?
chefandy
8 days ago
"Sandbox games are probably where they will shine. Imagine being able to play Minecraft, and tell a prompt to generate a world that resembles Tatooine, or a vampire-themed mansion."
"The problem is what it takes to implement that. I've seen companies currently trying to do exactly that, and their demos go like this "ok, give me a prompt for the environment" and if they're lucky, they can cherry pick some stuff the crowd says and if they're not, they sheepishly ask for a prompt that would visit indicate one of 5 environment types they've worked on and include several of the dozen premade textured meshes they've made[...]"
I was clearly directly addressing what they said. Unless you have a specific, substantive, on-topic question or statement, I'm going to assume that you're just fishing for things to argue about.
MichaelZuo
7 days ago
You can assume whatever you want, but your assumptions can never outweigh the assumptions of anyone else on HN… so the last sentence doesn’t make sense?
Plus listing past examples doesn’t indicate future possibilities must conform to that… unless there is a specific argument on why that should be the case on the balance of probabilities… so are you sure you understood my previous questions?
chefandy
7 days ago
confirmed.
MichaelZuo
7 days ago
> confirmed.
Huh? Is this an AI response?
fennecfoxy
5 days ago
As someone who has tried a lot of role-play models, I think there is definitely value in what LLMs (or similar tech) can add to NPCs; it's just that most people don't know how to prompt for it.
Using the RP models, over time I've found certain things that can guide them to creating better stories; an agent system is much easier to use but even using single character cards it's not hard to stuff them with a narrator and several individual characters in one go. I recently switched from kunoichi (8b, decent) to an Aria derivative (13b, much better).
In the majority of role-play stories I do now, it's super easy to refine the prompt so that characters avoid pointless details and the common tropes, especially with newer models.
Maybe I should make a PoC, would be a fun project. But yeah I agree that chatting to an NPC about its day doesn't necessarily make for great gameplay - but it's relatively easy now to guide it into interesting scenarios/experiences, which _does_ make for great gameplay.
E.g. the wife of the hunter you murdered in a fantasy game; normally we just think that we killed a character in a game - but when the hunter's wife decides in the background to train with a sword so that she can avenge her husband, then finally comes to find you and calls you out for murdering her husband - suddenly it's murder, and a revenge story. It's not too hard to prevent a decent model from injecting fluff (like where she bought her sword and how much for).
Edit: just tested this to see what would happen; I first walked into a cottage with a grandfather and his young granddaughter, stabbed him in front of her and ran away (spent the next 2 years of "game time" hiding in a forest). Character motivation updates for the granddaughter were essentially: distraught, vowing revenge, travelling around to hone her skills, speaking with unsavoury types in taverns to find my whereabouts, finding & confronting me, killing me. I was able to query it for "3 dialogue options/actions with percent chances and distinct outcomes in JSON format", which it gave; the chance of her forgiving me was 0.01%, which I suppose is fair enough. It did fail to create nice JSON tho, the model is not fine-tuned for that at all.
But it's definitely possible with multiple loras/prompts/queries to extract dynamic dialogue options, actions, stats, percent chances for plot/story paths etc. LLMs in games definitely need to be managed by a traditional rules-based framework; the LLM should only be used for the creative bits. Stats/player skill will always determine who wins a fight, but the fight starting because of dialogue or past events could totally be LLM driven.
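A minimal sketch of that split, where a rules layer owns validity and outcomes while the LLM only supplies the creative text. The `llm_generate` function is a hypothetical stand-in that returns canned output; the validation-and-retry loop is the point:

```python
import json
import random

def llm_generate(prompt: str) -> str:
    # Hypothetical stand-in for a real model call. Real models often
    # return malformed JSON, which is why we validate below.
    return ('{"options": ['
            '{"text": "Beg forgiveness", "chance": 0.01}, '
            '{"text": "Draw your sword", "chance": 0.7}, '
            '{"text": "Flee", "chance": 0.29}]}')

def dialogue_options(prompt: str, retries: int = 3):
    """Ask the LLM for options, but let the rules layer own validity."""
    for _ in range(retries):
        raw = llm_generate(prompt)
        try:
            options = json.loads(raw)["options"]
        except (json.JSONDecodeError, KeyError):
            continue  # malformed output: re-query rather than trust it
        # Rules-based sanity check: chances must sum to ~1.
        if abs(sum(o["chance"] for o in options) - 1.0) < 1e-6:
            return options
    return [{"text": "...", "chance": 1.0}]  # deterministic fallback

opts = dialogue_options(
    "The hunter's widow confronts the player. 3 options as JSON.")
# The outcome stays with stats/RNG; the LLM only wrote the text.
chosen = random.choices(opts, weights=[o["chance"] for o in opts])[0]
```

The deterministic fallback is what keeps the game a game: a bad generation degrades to a fixed line instead of breaking the plot.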
chefandy
2 days ago
I’m specifically talking about non-text-based games. You’re still limited by the game assets, animations (including hair, clothes, weapon movements, etc.), environments, and characters that are the waypoints for the plot— so you’ve already got a finite number of possibilities. You can’t create a new class of weapon on the fly, or a new character, or a new plot with current assets while maintaining story stability unless it’s really, really restricted, right? So what do you get aside from variability in dialog that you can’t get from a random number generator? And when it comes down to it, does that unpredictability, and all of the effort it takes to wrangle it, make the game better than having a professional writer make a handful of variations on a bunch of lines?
I can’t think of a scenario within the limitations of real games with visual assets that have progressive plots and characters for which that would yield a better game than having people craft it. Players are going to be no more tolerant of bugs, slowdowns, bad dialog, plot holes, misleading information, and annoyance just because an LLM is the source rather than substandard design or QA.
Maybe I’m not quite grasping what you’re proposing?
caetris2
9 days ago
You've absolutely nailed it here, I agree. To make any progress at all at the tremendously difficult problem they are trying to solve, they need to be frank about just how far away they are from what it is they are marketing.
I whole-heartedly support the authors' commercial interest in drumming up awareness and engagement. This is definitely a cool thing to be working on; however, it would make more sense to frame the situation more honestly and attract folks with a desire to solve tremendously hard problems, based on a level of expertise and awareness that truly moves the ball forward.
What would be far more interesting would be for the folks involved to say all the ten thousand things that went wrong in their experiments and to lay out the common-sense conclusions from those findings (just like the one you shared, which is truly insightful and correct).
We need to move past this industry and its enablers continually trying to win using the wrong methodology -- pushing away the most inventive and innovative people who are ripe and ready to make paradigm shifts in the AI field and industry.
teaearlgraycold
9 days ago
It would however be very interesting to see these kinds of agents in a commercial video game. Yes they are shallow in their perception of the game world. But they’re a big step up from the status quo.
dartos
9 days ago
It’s a game where you, a vampire, convince townsfolk that you’re not, so they let you in their house.
The NPCs are run by LLMs. It’s pretty interesting.
caetris2
9 days ago
Yes... Imagine a blog post at the same quality as this paper that framed their work and their pursuits in a way that genuinely got people excited about what could be around the corner, but with the context that frames exactly how far away they are from achieving what would be the ultimate vision.
canadianfella
9 days ago
[dead]
shkkmo
9 days ago
> I don't think you need an actual 3D world, wall clock, etc. The LLM does not seem to be meaningfully enriched by having a fancy representation underly the prompt generation process.
I don't know how you expect agents to self organize social structures if they don't have a shared reality. I mean, you could write all the prompts yourself, but then that shared reality is just your imagination and you're just DMing for them.
The point of the minecraft environment isn't to "enrich" the "inner world" of the agents and the goal isn't to "entertain" them. The point is to create a set of human understandable challenges in a shared environment so that we can measure behavior and performance of groups of agents in different configurations.
I know we aren't supposed to bring this up, but did you read the article? Nothing of your comment addresses any of the findings or techniques used in this study.
grahamj
8 days ago
I wrote and played with a fairly simple agentic system and had some of the same thoughts RE higher order behaviour. But I think the counter-points would be that they don't have to all be the same model, and what you might call context management - keeping each agent's "chain of thought" focused and narrow.
The former is basically what MoE is all about, and I've found that at least with smaller models they perform much better with a restricted scope and limited context. If the end result of that is something that can do things a single large model can't, isn't that higher order?
You're right that there's no "inner world" but then maybe that's the benefit of giving them one. In the same way that providing a code-running tool to an LLM can allow it to write better code (by trying it out) I can imagine a 3D world being a playground for LLMs to figure out real-world problems in a way they couldn't otherwise. If they did that wouldn't it be higher order?
logicchains
9 days ago
>I feel like there is some kind of information theory constraint which confounds our ability to extract higher order behavior from multiple instances of the same LLM.
It's a matter of entropy; producing new behaviours requires exploration on the part of the models, which requires some randomness. LLMs have only a minimal amount of entropy introduced, via temperature in the sampler.
fennecfoxy
5 days ago
As I've pointed out in the past, I also think it's fair to say that we overestimate human variability, and that most human behaviour and language coalesces for the most part.
The same goes for the creative industry, where a talking point is that "AIs just rehash existing stuff, they don't produce anything new". Neither do most artists; everything we make is almost always some riff on prior art or nature. Elves are just humans with pointy ears. Goblins are just small elves with green skin. Dwarves are just short humans. Dragons are just big lizards. Aliens are just humans with an odd shaped head and body.
I don't think people realise how very rare it is that any human being experiences or creates something truly novel and not yet experienced or created by our species yet. Most of reality is derivative.
InDubioProRubio
9 days ago
Maybe we need gazelles and cheetahs - many gazelle-agents getting chased towards a goal, doing the brute force work - and the constraint cheetahs chase them, evaluate them, and leave them alive (memory intact) as long as they come up with better and better solutions. Basically an evolutionary algo, running on top of many agents, running simultaneously on the same hardware?
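As a toy sketch of that loop, assuming a plain scoring function stands in for the "cheetah" evaluator and Gaussian noise stands in for the "gazelle" variation (both are made-up placeholders, not anything an LLM agent actually does):

```python
import random

def mutate(solution):
    # "Gazelle" step: brute-force variation of a candidate solution.
    return [x + random.gauss(0, 0.1) for x in solution]

def fitness(solution):
    # "Cheetah" step: hypothetical evaluator; here, closeness to zero.
    return -sum(x * x for x in solution)

random.seed(1)
population = [[random.uniform(-1, 1) for _ in range(3)]
              for _ in range(20)]
for _ in range(50):
    offspring = [mutate(s) for s in population]
    # Survivors keep their state ("memory intact") only while they
    # stay competitive; the rest are culled each generation.
    population = sorted(population + offspring,
                        key=fitness, reverse=True)[:20]

best = max(population, key=fitness)
```

The open question in the agent setting is what plays the role of `fitness`: with LLM agents there is usually no cheap, reliable evaluator, which is where this analogy gets hard.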
FeepingCreature
9 days ago
Do you want stressed and panicking agents? Do you think they'll produce good output?
In my prompting experience, I mostly do my best to give the AI way, way more slack than it thinks it has.
InDubioProRubio
9 days ago
No, I want the hunters to zap the prey with tiredness. Basically electron holes, hunting for free electrons, annihilating state. Neurons have something similar, where they usually prevent endless excitement and hyperfixation, which is why a coder in flow is such a strange thing.
fennecfoxy
5 days ago
This only works (genetic algo) if you have some random variability in the population. For different models it would work but I feel like it's kind of pointless without the usual feedback mechanism (positive traits are passed on).
nobrains
9 days ago
I had the opposite thought. Opposite to evolution...
What if we are a CREATED (i.e. instant created, not evolved) set of humans, and evolution and other backstories have been added so that the story of our history is more believable?
Could it be that humanity represents a de novo (Latin for "anew") creation, bypassing the evolutionary process? Perhaps our perception of a gradual ascent from primitive origins is a carefully constructed narrative designed to enhance the credibility of our existence within a larger framework.
What if we are like the Minecraft people in this simulation?
thrway01234
9 days ago
I feel that is too complicated. The simplest explanation is usually the right one. I think we live on an earth with actual history. Note that this does not necessarily mean that we are not living in a simulation, as history itself can be simulated.
If we are indeed in a simulation, I feel there are too many details to be "designed" by a being. There are too many facts that are connected and unless they fix the "bugs" as they appear and reboot the simulation constantly, I don't think it is designed. Otherwise we would have noticed the glitches by now.
If we are in a simulation, it has probably been generated by a computer following a set of rules. Maybe it ran a simplified version to evolve millions of possible earths, and then we are living in the version they selected for the final simulation? In that case all the facts would align and it could potentially be harder to notice the glitches.
I don't think we are living in a simulation because bugs are hard to avoid, even with close to "infinite" computing power. With great power comes great possibilities for bugs
Perhaps we are in fact living in one of the simplified simulations and will be turned off at any second after I have finished this senten
wongarsu
9 days ago
We also can't rule out that Gaia or Odin made the world five minutes ago, and went to great lengths to make the world appear ancient.
It certainly makes sense if you assume that the world is a simulation. But does it actually explain anything that isn't equally well explained by assuming the simulation simulated the last 13 billion years, and evolution really happened?
j1elo
9 days ago
As long as we don't get to the point of being able to simulate a Universe ourselves, the odds are against us being in a simulation, it seems! :)
sangnoir
8 days ago
There's a built-in assumption that there would be no constraints applied on nested simulations anywhere further down the stack, which is IMO unlikely, unless every layer has unlimited compute or is otherwise interested in investigating nested simulations.
cen4
9 days ago
That depends on giving them a goal/reward like increasing "data quality".
I mean, frogs don't use their brains much either; in spite of the rich world around them, they don't really explore.
But chimps do. They can't sit quiet in a tree forever, and that boils down to their Reward/Motivation Circuitry. They get pleasure out of exploring. And if they didn't, we wouldn't be here.
fhe
9 days ago
so well put. exactly how I've been feeling and trying to verbalize.