Why do so many "agentic AI" systems collapse without persistent state?

3 points | posted a month ago
by JohannesGlaser


4 Comments

verdverm

a month ago

> readable files that the model initializes from every run

This is how AGENTS.md et al. are supposed to work; you can include many more things, like ...

> Append-only logs, rules, inventories, histories

I include open terminals and files, for example; these may make it into the system prompt. The same problem arises here: how much and when. Same story for tools, MCP, skills.

> Without persistent state

There are a lot of different ways people are approaching this. In the end, you are just prewarming a cache (the system prompt). The next step is to give the agent control over that system prompt, toward self-controlled / dynamic context engineering.

You, increasingly in collaboration with an agent, are doing context engineering. One can extend the analogy toward a memory or knowledge hierarchy. You're also going to want a table of contents or a librarian (a context-collecting subagent or phase, search; lots of open design space here).
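Roughly, the "prewarming" I mean looks like the sketch below. It's a minimal Python illustration with made-up artifact names and a made-up token budget, not any particular framework's convention:

    import pathlib

    # Illustrative only: artifact names and the token budget are assumptions,
    # not any framework's convention.
    ARTIFACTS = ["AGENTS.md", "rules.md", "inventory.md", "history.log"]
    TOKEN_BUDGET = 8000  # rough cap on how much "prewarmed" context to inject

    def estimate_tokens(text: str) -> int:
        # crude heuristic: roughly 4 characters per token
        return len(text) // 4

    def build_system_prompt(root: pathlib.Path) -> str:
        """Prewarm the 'cache': concatenate readable artifacts into the system
        prompt until the budget runs out (the 'how much and when' problem)."""
        parts, used = [], 0
        for name in ARTIFACTS:
            path = root / name
            if not path.exists():
                continue
            text = path.read_text()
            cost = estimate_tokens(text)
            if used + cost > TOKEN_BUDGET:
                break  # a librarian/subagent could summarize or rank instead
            parts.append(f"## {name}\n{text}")
            used += cost
        return "\n\n".join(parts)

The librarian would replace the naive "stop at the budget" step with summarization, ranking, or search.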

JohannesGlaser

a month ago

I think there’s an important distinction here.

What you describe (AGENTS.md, open files, terminals, system prompts) is still context shaping inside the prompt space. It’s about what to load and how much, and yes, that quickly turns into dynamic context engineering.

What I’m experimenting with is one step earlier: treating state as an external artifact, not as an emergent property of the prompt. The files aren’t hints or instructions that compete for relevance, but the assistant’s working state itself. On initialization, the model doesn’t decide what to pull in; it reconstructs orientation from a fixed set of artifacts.

In that sense it’s not prewarming a cache so much as rebuilding a process from disk. Forgetting, correction, and continuity are handled by explicitly changing those artifacts, not by prompt evolution.
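As a minimal sketch of what I mean by rebuilding from disk (file names are illustrative, not a spec): the artifact set is fixed, and a missing artifact is a hard error rather than something the model quietly routes around.

    import pathlib

    # Hypothetical artifact set; the point is that it is fixed, not chosen per turn.
    STATE_FILES = ("rules.md", "inventory.md", "decisions.log")

    def load_state(state_dir: pathlib.Path) -> str:
        """Reconstruct orientation from a fixed, inspectable artifact set.
        Missing files are an error, not something to route around."""
        sections = []
        for name in STATE_FILES:
            path = state_dir / name
            if not path.exists():
                raise FileNotFoundError(f"state artifact missing: {path}")
            sections.append(f"=== {name} ===\n{path.read_text()}")
        return "\n\n".join(sections)

    # Every run starts the same way: state in, then the conversation.
    # system_prompt = load_state(pathlib.Path("./assistant_state"))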

I agree there’s a lot of open design space here. My main point was that persistent state tends to be discussed as a prompt or retrieval problem, whereas treating it as first-class state changes the failure modes quite a bit.

Curious how far current agent frameworks really go in that direction in practice.

verdverm

a month ago

What you are describing is context construction. When working with agents and LLMs, the only way you get anything beyond their training is through the system prompt and message history. There is nothing else.

You can wrap it in whatever fancy notions and anthropomorphic concepts you want, but in the end it is just context engineering, regardless of how and when you create, store, retrieve, and inject artifacts. A good framework will give you building blocks and flexibility in how you use them and when that happens. That's why I use ADK, anyway.

Maybe you are talking about giving the agent tools for working with this state or cache? I have that in my ADK-based setup.

If I have a root AGENTS.md, or a user-level file of a similar nature, and these are always loaded for every conversation, how is what you are describing different?
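For concreteness, "tools for working with state" can be as simple as the sketch below. This is a framework-agnostic illustration, not my actual ADK code; how tool registration looks is framework-specific.

    import json, time, pathlib

    STATE_LOG = pathlib.Path("state/events.jsonl")

    def read_state() -> str:
        """Tool: return the current state log verbatim for the agent to read."""
        return STATE_LOG.read_text() if STATE_LOG.exists() else ""

    def append_state(entry: str) -> str:
        """Tool: append a timestamped entry; the agent never edits in place."""
        STATE_LOG.parent.mkdir(parents=True, exist_ok=True)
        record = {"ts": time.time(), "entry": entry}
        with STATE_LOG.open("a") as f:
            f.write(json.dumps(record) + "\n")
        return "ok"

    # A framework would register read_state/append_state as callable tools;
    # the registration details vary by framework.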

JohannesGlaser

a month ago

At the lowest level, you’re right: everything the model ever sees is context. I’m not claiming a channel beyond tokens. The distinction I’m trying to draw isn’t where state ends up, but how it is governed.

AGENTS.md (and similar conventions) are a good step toward making agent context explicit. But they are still instructional artifacts: static guidance that gets loaded into the prompt. They don’t define a state lifecycle. They don’t encode history, correction, or invalidation over time. And they don’t change unless a human edits them.

In most agent setups I’ve worked with, “state” is assembled per turn: an agent or orchestration layer decides what to include, summarize, drop, or rewrite. That makes continuity an emergent property of context engineering. It works locally, but over time you see drift, silent overwrites, and loss of accountability.

What I’m experimenting with is treating state as a process artifact, not an input artifact. The assistant doesn’t curate its own context. On startup, it reconstructs orientation from a fixed, inspectable set of external files — logs, rules, inventories — with explicit lifecycle rules. State changes happen deliberately (append, correct, invalidate), not implicitly via prompt evolution.
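A rough sketch of what “append, correct, invalidate” could look like (names and the JSONL layout are illustrative, not a reference implementation):

    import json, time, uuid, pathlib

    LOG = pathlib.Path("assistant_state/journal.jsonl")

    def _write(op: str, body: dict) -> str:
        """All state changes are appended; nothing is rewritten in place."""
        LOG.parent.mkdir(parents=True, exist_ok=True)
        rec = {"id": str(uuid.uuid4()), "ts": time.time(), "op": op, **body}
        with LOG.open("a") as f:
            f.write(json.dumps(rec) + "\n")
        return rec["id"]

    def append(fact: str) -> str:
        return _write("append", {"fact": fact})

    def correct(target_id: str, fact: str) -> str:
        # A correction supersedes an earlier record but leaves it visible.
        return _write("correct", {"target": target_id, "fact": fact})

    def invalidate(target_id: str, reason: str) -> str:
        # Forgetting is explicit and auditable, not a side effect of prompt drift.
        return _write("invalidate", {"target": target_id, "reason": reason})

The point is that every change is a new, attributable record, and reconstruction replays the journal rather than trusting whatever the prompt last said.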

So yes, the model ultimately reads tokens. But forgetting, correction, and continuity are handled outside the prompt logic. The prompt becomes closer to a bootloader than a workspace.

If you always load a root AGENTS.md plus a stable artifact set, the surface can look similar. In practice, the difference shows up in failure modes: how systems degrade over weeks instead of minutes.

I’m not arguing current frameworks can’t approximate this — just that persistent state is usually framed as a context problem, rather than as first-class state with explicit lifecycle semantics. That shift changes what “agentic” failure even looks like.