Agent_Builder
11 hours ago
This resonates. What we saw in practice is that most failures don’t come from models being too dumb, but from being given too much freedom.
In our work with GTWY.ai, the biggest reduction in hallucinations came from constraining what an agent was allowed to do at each step, not from better prompts or verification layers.
Once inputs, tools, and outputs were explicit, the model stopped confidently inventing things. It felt less “creative”, but far more useful.
Fewer degrees of freedom beat smarter models, at least in production.
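Concretely, each step declared its contract up front. A stripped-down sketch of the shape of it (illustrative names only, not GTWY.ai's actual API):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class StepContract:
        """Everything a step may see, call, and produce, declared up front."""
        inputs: dict           # named inputs only; no ambient state
        tools: frozenset       # tools callable during this step, nothing else
        output_schema: dict    # field name -> expected type

    def check_output(contract: StepContract, output: dict) -> dict:
        # Anything outside the declared shape is an error, not something
        # the model gets to improvise.
        extra = set(output) - set(contract.output_schema)
        if extra:
            raise ValueError(f"undeclared output fields: {sorted(extra)}")
        for field, typ in contract.output_schema.items():
            if not isinstance(output.get(field), typ):
                raise ValueError(f"missing or mistyped field: {field!r}")
        return output

    summarize = StepContract(
        inputs={"document": "quarterly_report.txt"},
        tools=frozenset({"read_file"}),
        output_schema={"summary": str},
    )

With nothing ambient to reach for, an out-of-scope answer surfaces as a validation error instead of a confident hallucination.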
killcoder
10 hours ago
I don't buy the idea that any constraint lowers performance by pushing the model out of distribution. Sure, if you ask the model to output 'reasoning' as JSON steps, that is a completely different 'channel' from its trained 'reasoning' output. For real tasks, though, I think it's more about picking the _right_ context-free grammar to enforce format correctness. You can enforce an in-distribution format and get the best of both worlds. I don't think the industry should settle so hard on JSON-for-everything.
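A rough sketch of what I mean, using the third-party `regex` module's partial matching as a stand-in for a real token-level grammar engine (the format, vocab, and helper names are made up; a regular grammar like this is just the simplest case of a CFG):

    import regex  # pip install regex

    # An in-distribution format: the Thought:/Action: layout models
    # already emit, rather than a JSON wrapper.
    FORMAT = regex.compile(
        r"Thought: [^\n]+\nAction: (search|finish)\([^\n]*\)\n"
    )

    def allowed(prefix: str, vocab: list[str]) -> list[str]:
        """Tokens that keep the output a viable prefix of FORMAT."""
        # partial=True accepts strings that could still extend to a
        # full match, which is exactly the constrained-decoding check.
        return [t for t in vocab
                if FORMAT.fullmatch(prefix + t, partial=True)]

    # A JSON-ish opener gets masked before it can start:
    print(allowed("", ['{"thought"', "Thought: "]))  # ['Thought: ']

Same hard guarantee of format correctness, but the surface the model writes in is one it was actually trained on.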
Agent_Builder
10 hours ago
I think we’re mostly aligned. The constraints we’re talking about weren’t about forcing everything into JSON or limiting reasoning bandwidth.
Inside a step, the model still reasons freely in plain language. The constraint is on what authority exists at that step.
The failures we saw came from permissions and assumptions silently carrying over between steps, not from the model “thinking wrong”. Once a step ended, any authority it had ended too.
So it’s less “constrained decoding” and more “constrained capability scope over time”. Free reasoning within a step, hard boundaries between steps.
That separation is what removed a lot of surprising behavior for us.
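A boiled-down version of the boundary, if it helps (Step, run_step, and call_model are illustrative names, not our actual internals):

    from dataclasses import dataclass
    from typing import Callable

    Tool = Callable[[str], str]

    ALL_TOOLS: dict[str, Tool] = {
        "read_file": lambda path: f"<contents of {path}>",
        "send_email": lambda body: "sent",
    }

    @dataclass(frozen=True)
    class Step:
        prompt: str
        grants: frozenset  # tool names this step may call; nothing is ambient

    def run_step(step: Step, call_model: Callable[[str, dict], str]) -> str:
        # The toolbox is built fresh from this step's grants and discarded
        # when the step returns, so one step's permissions can never leak
        # into the next. Inside call_model, reasoning is plain free text.
        scoped = {name: ALL_TOOLS[name] for name in step.grants}
        return call_model(step.prompt, scoped)

    pipeline = [
        Step("Summarize /tmp/report.txt", frozenset({"read_file"})),
        Step("Draft a reply to the author", frozenset()),  # no tools at all
    ]

The decoding itself is untouched; what expires at each boundary is the authority, which is where our surprising behavior was actually coming from.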