This is indicative of too much context. Remember, these systems don't "think"; they predict. If you think of the context as an insanely large map with shifting and duplicate keys and queries, the hallucination and apparent loss of context make sense. Find ways to reduce the context for better results: reduce sample sizes, exclude unrelated repositories and code. Remember that more context also means more cost, and when the AI investment money dries up, that will be untenable for developers.
If you can't reduce the context, it suggests the scope of your prompt is too large. The system doesn't "think" about the best solution to a prompt; it predicts what outputs you'll accept. So if you prompt it to build an online casino website with user accounts and logins, games, bank card processing, analytics, advertising networks, etc., the agent will require far more context than just prompting for the login page.
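As a concrete illustration (a minimal sketch with a hypothetical repository path, keyword list, and character budget, not any particular agent's API), pruning a repo down to task-relevant files before building the prompt might look like this:

    from pathlib import Path

    MAX_CHARS = 40_000            # rough character budget; an assumption, tune per model
    KEYWORDS = ("login", "auth")  # hypothetical: this session is only about the login page

    def gather_context(repo_root: str) -> str:
        """Collect only the files that look relevant to the current task."""
        chunks, total = [], 0
        for path in sorted(Path(repo_root).rglob("*.py")):
            text = path.read_text(errors="ignore")
            if not any(k in text.lower() for k in KEYWORDS):
                continue                           # unrelated code stays out of the context
            snippet = f"# --- {path} ---\n{text}\n"
            if total + len(snippet) > MAX_CHARS:
                break                              # stay inside the budget
            chunks.append(snippet)
            total += len(snippet)
        return "".join(chunks)

    # hypothetical path; the prompt is scoped to one feature, not the whole site
    prompt = "Build only the login page.\n\n" + gather_context("./my-casino-site")

The specific filter doesn't matter; the point is that you, not the agent, decide what goes into the context window.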
So to answer the question, if my agent loses context, I feel like I've messed up.
Context management is a core skill of using an LLM. So if it loses key context (e.g. tasks, instructions, or constraints), I screwed up, and I need to up my game.
Just throwing stuff into an LLM and expecting it to remember what you want, without any involvement on your part, isn't how the technology works (or could ever work).
An LLM is a tool, not a person, so I don't have an emotional response to hitting its innate limitations. If you get "deeply frustrated" or feel "helpless anger" instead of just working the problem, that seems like an unconstructive reaction, to say the least.
LLMs are a limited tool; learn what they can and cannot do, how to get the best out of them, and leave your emotions at the door. Getting upset at a tool won't accomplish anything.
I can totally feel the shift, the rot or whatever, when it happens. With Opus 1M it seems to happen more often in my recent experience, even though my approach hasn't changed a bit.
So I've taught myself not to have an emotional response while working with LLMs. The practical response is to start a new session or dive into the code myself.
If you have an emotional response to anything an agent or LLM does, then you should lay off the sauce for a while and take a walk or something. This stuff is just dumb tech, no matter what the appearances, and it doesn't warrant getting emotionally invested in your interactions with it. It's a tool, and just like with a hammer or a chainsaw, there is no point in getting upset at it. You are in control; you are the user.
I just posted this on HN this morning and was looking through "new" but I'm trying to solve this exact problem:
https://annealit.ai
That's interesting. I mean, I've got an openclaw setup with Claude that merges and stores chats from whatsapp and the web client once a day, and has a ton of context accessible... but there's something about being right in the middle of solving a hard technical problem, where you're deep in the weeds about which columns should represent which data, and suddenly it's like, what were we talking about? Oh, I should try reading the database structure again from scratch. I don't think that's a problem that any clever arrangement of memory or personality files can actually solve.
But I think when you actually structure memory in the right form based on the "workload" (e.g. Google Spreadsheet, Microsoft Word XML, coding-language ASTs/DAGs), then additive "unforgetting" truly becomes possible.
Edit:
I truly believe this is solvable, just like we're doing for natural language, but with code/schemas/etc.: relational, document, graph, vector!
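For what it's worth, here's a minimal sketch of what additive, structured memory could look like, using SQLite as a stand-in. The schema, table, and file names are assumptions for illustration, not anything annealit.ai actually does:

    import sqlite3

    def open_memory(path: str = "agent_memory.db") -> sqlite3.Connection:
        """Open (or create) an append-only memory the agent can reload each session."""
        conn = sqlite3.connect(path)
        conn.execute(
            """CREATE TABLE IF NOT EXISTS facts (
                   id INTEGER PRIMARY KEY,
                   topic TEXT NOT NULL,   -- e.g. 'schema', 'decision', 'constraint'
                   body  TEXT NOT NULL,
                   created_at TEXT DEFAULT CURRENT_TIMESTAMP
               )"""
        )
        return conn

    def remember(conn: sqlite3.Connection, topic: str, body: str) -> None:
        conn.execute("INSERT INTO facts (topic, body) VALUES (?, ?)", (topic, body))
        conn.commit()

    def recall(conn: sqlite3.Connection, topic: str) -> list[str]:
        rows = conn.execute("SELECT body FROM facts WHERE topic = ? ORDER BY id", (topic,))
        return [body for (body,) in rows]

    conn = open_memory()
    remember(conn, "schema", "orders.customer_id references customers.id")
    # At the start of the next session, prepend recall(conn, "schema") to the prompt
    # so the agent doesn't have to re-read the database structure from scratch.
    print(recall(conn, "schema"))

The idea is that facts only get added, never silently dropped, which is what I mean by additive unforgetting.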
Now your coding assistant is suffering from dementia too. How sad. I ask it to save important stuff to a file.