Ask HN: Where is legacy codebase maintenance headed?

5 points, posted a day ago
by AnnKey

Item id: 46547015

4 Comments

journal

12 hours ago

realistically, you can get a project started with few enough tokens to have a long enough conversation to generate something that looks like a 1.0. eventually you reach a point where every request becomes more expensive and caching doesn't help. you'll have to truncate/prune/hoist the context however you can; summarizing is what i do, and i get creative. i have absolutely no idea how anyone using agents is producing anything maintainable over a long iterative period.

this is the llm bitcoin moment: they'll raise prices so high that running agents the way you're used to now will leave you with no pants on. you need to aim for minimum context, not stuff it with everything irrelevant.

jf22

6 hours ago

Yes, the workflow has shifted.

I've handled the sloppiest slop with LLMs and turned the worst code into error-free, modern, tested code in a fraction of the time it used to take me.

People aren't worried about cost, because $1k in credits to get 6 months of work done is a no-brainer.

A year from now, semi-autonomous LLMs will produce entire applications while we sleep. We'll all be running multi-agent setups and basically writing specs and md files all day.

al_borland

a day ago

I work mostly in Ansible and Copilot is completely incompetent when trying to deal with it. I’ve tried several models that are available (Claude, Gemini, various GPTs, Codex), and they’ve all been pretty bad.

For example, just this week I asked whether a when condition on a block is evaluated once for the block or applied to each task. I thought it was each task, but wanted to double check. It told me the condition was evaluated once for the block, which was not what I was expecting, so I set up a test and ran it: the model was wrong. Ansible evaluates the condition for every task in the block. This seems like a basic thing and it got it completely wrong, and this happens every time I try to use it. I have no idea how people are trusting it to write 80% of their code.
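A minimal sketch of that kind of test (the playbook below is illustrative; the flag variable and task names are made up, not my exact playbook): if the when were evaluated once for the whole block, the last task would still run after the flag flips. In practice Ansible re-evaluates the inherited condition for each task, so it gets skipped.

    # hypothetical test playbook; run with ansible-playbook against localhost
    - hosts: localhost
      gather_facts: false
      vars:
        flag: true
      tasks:
        - when: flag | bool
          block:
            - name: Runs, flag is still true
              ansible.builtin.debug:
                msg: "first task"

            - name: Flip the flag mid-block
              ansible.builtin.set_fact:
                flag: false

            - name: Skipped, because the inherited condition is re-checked per task
              ansible.builtin.debug:
                msg: "never shown"

Running it shows the third task reported as skipped, which confirms the per-task evaluation.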

We recently got access to agent mode, which is the default now. Every time it tries to do anything, it destroys my code. When asked to insert a single task, it doesn't understand how to format YAML/Ansible, and I always have to fix its output after it's done.

I can't relate to anything people are saying about AI when it comes to my job. If the AI were a co-worker, I wouldn't trust them with anything, and I would pray they were cut in the next round of layoffs. It's constantly wrong, but really confident about it. It apologizes, but never actually corrects its behavior. It's like working with a psychopath.

In terms of training AI on our code base, that seems unlikely. We're not even allowed to give our entire team (fewer than 10 people) access to our code. We also can't use just any AI tool out there. We can only use Copilot and the models it offers, and only through our work account with SSO, so various privacy rules apply (from my limited understanding). We don't yet have access to a general-purpose AI at work, though I think one is in pilot.

I have no idea where it's heading, as I have trouble squaring the reality of my experience with the anecdotes I read online, to the point where I question whether any of it is real or it's all investors trying to keep stock prices going up. Maybe if I were working in a more traditional language, or in a greenfield environment that started with AI, it would be better. Right now, I'm not impressed at all.

raw_anon_1111

20 hours ago

I don't use Ansible, but both Codex (and plain ChatGPT) and Claude Code are excellent with CloudFormation, Terraform, and the CDK. Sometimes with ChatGPT I have to tell it to "verify its code using the latest documentation" for newish features.