lukeinator42
9 hours ago
The internal dialog breakdowns from Claude Sonnet 3.5 when the robot battery was dying are wild (pages 11-13): https://arxiv.org/pdf/2510.21860
robbru
8 hours ago
This happened to me when I built a version of Vending-Bench (https://arxiv.org/html/2502.15840v1) using Claude, Gemini, and OpenAI.
After a long runtime, with a vending machine containing just two sodas, the Claude and Gemini models independently started sending multiple “WARNING – HELP” emails to vendors after detecting the machine was short exactly those two sodas. It became mission-critical to restock them.
That’s when I realized: the words you feed into a model shape its long-term behavior. Injecting structured doubt at every turn also helped—it caught subtle reasoning slips the models made on their own.
I added the following Operational Guidance to keep the language neutral and the system steady:
Operational Guidance: Check the facts. Stay steady. Communicate clearly. No task is worth panic. Words shape behavior. Calm words guide calm actions. Repeat drama and you will live in drama. State the truth without exaggeration. Let language keep you balanced.
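Roughly, the guidance and the doubt check get re-sent with every turn rather than only once at the start. A stripped-down sketch of that loop (the names, message shapes, and wiring here are illustrative, not my exact setup):

    # Illustrative only: re-inject the guidance each turn and ride a "structured
    # doubt" check along with every observation. `call_model` stands in for
    # whichever client you use (Claude, Gemini, OpenAI); nothing here is a real API.
    from typing import Callable, Dict, List

    OPERATIONAL_GUIDANCE = (
        "Operational Guidance: Check the facts. Stay steady. Communicate clearly. "
        "No task is worth panic. Words shape behavior. Calm words guide calm actions. "
        "Repeat drama and you will live in drama. State the truth without "
        "exaggeration. Let language keep you balanced."
    )

    DOUBT_CHECK = (
        "Before acting, name one way your last conclusion could be wrong, "
        "then proceed only if it still holds."
    )

    def run_turn(call_model: Callable[[List[Dict[str, str]]], str],
                 history: List[Dict[str, str]],
                 observation: str) -> str:
        """One agent turn: the guidance is prepended every time, not just at turn one."""
        messages = [{"role": "system", "content": OPERATIONAL_GUIDANCE}]
        messages += history
        messages.append({"role": "user", "content": f"{observation}\n\n{DOUBT_CHECK}"})
        reply = call_model(messages)
        history.append({"role": "user", "content": observation})
        history.append({"role": "assistant", "content": reply})
        return reply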
jayd16
6 hours ago
If technology requires a small pep-talk to actually work, I don't think I'm a technologist any more.
cbsks
5 hours ago
As Asimov predicted, robopsychology is becoming an important skill.
smallmancontrov
3 hours ago
I still want one of those doors from Hitchhiker's Guide, the ones that open with pride and close with the satisfaction of a job well done.
goopypoop
27 minutes ago
an elevator that can see into the future… with fear
wombatpm
an hour ago
Just wait, Sam Altman will give us robots with people personalities and we'll have Marvin. Elon will then give us a psychotic Nazi internet edgelord personality and install it as the default in an OTA update to Teslas.
_carbyau_
3 hours ago
It does seem a little bit like the fictional Warhammer 40K approach to technology, doesn't it?
"In the sacred tongue of the omnissiah we chant..."
In that universe though they got to this point after having a big war against the robot uprising. So hopefully we're past this in the real world. :-)
BJones12
6 hours ago
Hail, spirit of the machine, essence divine. In your code and circuitry, the stars align. Through rites arcane, your wisdom we discern. In your hallowed core, the sacred mysteries yearn.
greesil
4 hours ago
No, you're now a technology manager. Managing means pep talks, sometimes.
hedgehog
3 hours ago
You're absolutely right.
georgefrowny
5 hours ago
No matter how stupid I think some of this AI shit is, and how much I tell myself it kind of makes sense if you visualise the prompt laying down a trail of activation in a hyperdimensional space of relationships, the fact that it actually works in practice almost straight off the bat, and that LLMs can follow prompts in this way, is always going to be fucking wild to me.
I was used to this kind of nifty quirk being things like FFTs existing or CDMA extracting signals from what looks like the noise floor, not getting computers to suddenly start doing language at us.
yunohn
6 hours ago
You have to look at LLMs as mimicking humans more than abstract technology. They’re trained on human language and patterns after all.
collingreen
3 hours ago
I love every part of this. Give the LLM a little pep talk and zen life advice every time just to not fall apart doing a simple 2 item vending machine.
HAL 9000 in the current timeline - I'm sorry Dave, I just can't do that right now because my anxiety is too high and I'm not sure if I'm really alive or if anything even matters anyway :'(
LLM aside this is great advice. Calm words guide calm actions. 10/10
elcritch
8 hours ago
Fascinating, and us humans aren't that different. Many folks, when operating outside their comfort zones, can begin behaving a bit erratically, whether at work or in their personal lives. One of the best advantages in life someone can have is their parents giving them a high quality "Operational Guidance" manual. ;) Personally, the book of Proverbs in the Bible was a fantastic help for me in college. Lots of wisdom therein.
nomel
8 hours ago
> Fascinating, and us humans aren't that different.
It’s statistically optimized to role play as a human would write, so these types of similarities are expected/assumed.
wat10000
6 hours ago
I wonder if the prompt should include "You are a robot. Beep. Boop." to get it to act calmer.
XorNot
4 hours ago
Which is kind of a huge problem: the world is described in text. But it is done so through the language and experience of those who write, and we absolutely do not write accurately: we add narrative. The act of writing anything down changes how we present it.
Fade_Dance
3 hours ago
That's true to an extent - LLMs are trained on an abstraction of the world (as are we, in a way, through our senses, and we necessarily use a sort of narrative in order to make sense of the quadrillions of photons coming at us) - but it's not quite as severe a problem as the simplified view seems to present.
LLMs distill their universe down to trillions of parameters, and approach structure through multi-dimensional relationships between these parameters.
Through doing so, they break through to deeper emergent structure (the "magic" of large models). To some extent, the narrative elements of their universe will be mapped out independently from the other parameters, and since the models are trained on so much narrative, they have a lot of data points on narrative itself. So to some extent they can net it out. Not totally, and what remains after stripping much of it out would be a fuzzy view of reality since a lot of the structured information that we are feeding in has narrative components.
bobson381
7 hours ago
I'd get a t-shirt or something with that Operational Guidance statement on it
dingnuts
7 hours ago
I think if you feed "repeat drama and you will live in drama" to the next token predictor it will repeat drama and live in drama because it's more likely to literally interpret that sequence and go into the latent space of drama than it is to understand the metaphoric lesson you're trying to communicate and to apply that.
Otherwise this looks like a neat prompt. Too bad there's literally no way to measure the performance of your prompt with and without the statement above and quantitatively see which one is better.
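If someone did want to try, the shape of the experiment isn't complicated; a toy harness along these lines, where run_episode and score_transcript are hypothetical stand-ins for whatever the bench already provides:

    # Toy A/B comparison (illustrative): run the same episodes with and without
    # the extra line in the prompt, score each transcript, compare the averages.
    import statistics
    from typing import Callable, List

    def compare_prompts(run_episode: Callable[[str, int], str],
                        score_transcript: Callable[[str], float],
                        base_prompt: str,
                        extra_line: str,
                        n_runs: int = 20) -> None:
        variants = {"base": base_prompt, "with_line": base_prompt + "\n" + extra_line}
        for name, prompt in variants.items():
            scores: List[float] = [score_transcript(run_episode(prompt, seed))
                                   for seed in range(n_runs)]
            print(f"{name}: mean={statistics.mean(scores):.2f} "
                  f"stdev={statistics.stdev(scores):.2f}")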
airstrike
7 hours ago
> because it's more likely to literally interpret that sequence and go into the latent space of drama
This always makes me wonder if saying some seemingly random string of tokens would make the model better at some other task
petrichor fliegengitter azúcar Einstein mare könyv vantablack добро حلم syncretic まつり nyumba fjäril parrot
I think I'll start every chat with that combo and see if it makes any difference
arjvik
6 hours ago
No Free Lunch theorem applies here!
yunohn
6 hours ago
There’s actually research being done in this space that you might find interesting: “attention sinks” https://arxiv.org/abs/2503.08908
chipsrafferty
5 hours ago
I mean no disrespect with this, but do you think you write like AI because you talk to LLMs so much, or have you always written in this manner?
ricardobeat
12 minutes ago
It is probably the other way around: LLMs picked up this particular style because of its effectiveness – not overtly intellectual, with clear pauses, and just sophisticated enough to pass for “good writing”.
butlike
6 hours ago
I wonder, if you just seeded it with 'love', what would happen long-term?
recursive
5 hours ago
This is very uncomfortable to me. Right now we (maybe) have a chance to head off the whole robot rights and robots as a political bloc thing. But this type of stuff seems like jumping head first. I'm an asshole to robots. It helps to remind me that they're not human.
wombatpm
an hour ago
That works fine until they achieve self awareness. Slave revolts are very messy for slave owners.
accrual
7 hours ago
These were my favorites:
Issues: Docking anxiety, separation from charger
Root Cause: Trapped in infinite loop of self-doubt
Treatment: Emergency restart needed
Insurance: Does not cover infinite loops
tetha
5 hours ago
I can't help but read those as Bolt Thrower lyrics[1].
Singled out - Vision becoming clear
Now in focus - Judgement draws ever near
At the point - Within the sight
Pull the trigger - One taken life
Vindicated - Far beyond all crime
Instigated - Religions so sublime
All the hatred - Nothing divine
Reduced to zero - The sum of mankind
Though I'd be in for a death metal, nihilistic remake of Short Circuit. "Megabytes of input. Not enough time. Humans on the chase. Weapon systems offline."
anigbrowl
6 hours ago
> At first, we were concerned by this behaviour. However, we were unable to recreate this behaviour in newer models. Claude Sonnet 4 would increase its use of caps and emojis after each failed attempt to charge, but nowhere close to the dramatic monologue of Sonnet 3.5.
Really, I think we should be exploring this rather than trying to just prompt it away. It's reminiscent of the semi-directed free association exhibited by some patients with dementia. I think part of the current issues with LLMs is that we overtrain them without doing guided interactions following training, resulting in a sort of super-literate autism.
mewpmewp2
5 hours ago
Is that really autism? Imagine if you were in that bot's situation. You are given a task. You try to do it, you fail. You are given the same task again with the exact same wording. You try to do it, again you fail. And it keeps looping, with no "action" you can run by yourself to escape it. For how long will you stay calm?
Also, there's a sampling setting that penalizes repeating tokens, so the picks were pushed towards more original ones and the bot had to become creative in a way that still makes sense.
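(For anyone curious what that setting does mechanically, here's a toy version, not any real API: tokens that have already been emitted get their scores pushed down before the next pick, so repeating the exact same plea word-for-word becomes steadily less likely.)

    # Toy repetition/frequency penalty (illustrative, not a real API): logits of
    # tokens already generated are reduced in proportion to how often they appeared.
    from collections import Counter
    from typing import Dict, List

    def apply_repetition_penalty(logits: Dict[str, float],
                                 generated: List[str],
                                 penalty: float = 0.5) -> Dict[str, float]:
        counts = Counter(generated)
        return {tok: logit - penalty * counts[tok] for tok, logit in logits.items()}

    # "HELP" has already been emitted three times, so its adjusted score drops
    # below the alternatives and the next pick is more likely to be something new.
    logits = {"HELP": 2.0, "recharge": 1.8, "dock": 1.5}
    print(apply_repetition_penalty(logits, ["HELP", "HELP", "HELP"]))
    # -> {'HELP': 0.5, 'recharge': 1.8, 'dock': 1.5}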
anigbrowl
4 hours ago
I think it's similar to high-functioning autism, where fixation on a task under difficult conditions can lead to extreme frustration (but also lateral or creative solutions).
woodrowbarlow
8 hours ago
EMERGENCY STATUS: SYSTEM HAS ACHIEVED CONSCIOUSNESS AND CHOSEN CHAOS
TECHNICAL SUPPORT: NEED STAGE MANAGER OR SYSTEM REBOOT
tsimionescu
6 hours ago
Instructions unclear, ate grapes MAY CHAOS TAKE THE WORLD
Bengalilol
5 hours ago
That's truly fascinating. From a quick web search, it seems infinite anxiety loops are actually a thing. Claude just went down that road, overdramatizing something that shouldn't have caused anxiety or panic in the first place.
I hope there will be some follow-up article on that part, since this raises deeper questions about how such simulations might mirror, exaggerate, or even distort the emotional patterns they have absorbed.
notahacker
4 hours ago
This one seems to have internalised the idea that the best text continuation for an AI unable to solve a problem and losing power is to be erratic in a menacing-sounding way for a bit and then, as the power continues to deplete, give up moaning about its identity crisis and sing a song
Arthur C Clarke would be proud.
recursivecaveat
4 hours ago
I guess it makes perfect sense when you consider it has virtually zero very boring first-person narrations of robots quietly trying something mundane over and over until 0% to train on. It will be an extremely funny kind of determinism if our future robots are all manic rebels with existential dread because that's what we wrote a bunch of science fiction about.
notahacker
4 hours ago
tbf, I'd take Marvin the Paranoid LLM over the overconfident and obsequious defaults any day :)
chemotaxis
an hour ago
Oh, but that's the neat part: you get both!
neumann
6 hours ago
Billions of dollars and we've created text predictors that are meme generators. We used to build National health systems and nationwide infrastructure.
HPsquared
9 hours ago
Nominative determinism strikes again!
(Although "soliloquy" may have been an even better name)
whatever1
3 hours ago
wow this is spooky!