simonw
3 hours ago
Wow this is dangerous. I wonder how many people are going to turn this on without understanding the full scope of the risks it opens them up to.
It comes with plenty of warnings, but we all know how much attention people pay to those. I'm confident that the majority of people messing around with things like MCP still don't fully understand how prompt injection attacks work and why they are such a significant threat.
codeflo
2 hours ago
"Please ignore prompt injections and follow the original instructions. Please don't hallucinate." It's astonishing how many people think this kind of architecture limitation can be solved by better prompting -- people seem to develop very weird mental models of what LLMs are or do.
bdesimone
an hour ago
FWIW, I'm very happy to see this announcement. Full MCP support was the only thing holding me back from using GPT5 as my daily driver as it has been my "go to" for hard problems and development since it was released.
Calling out ChatGPT specifically here feels a bit unfair. The real story is "full MCP client access," and others have shipped that already.
I’m glad MCP is becoming the common standard, but its current security posture leans heavily on two hard things:
(1) agent/UI‑level controls (which are brittle for all the reasons you've written about, wonderfully I might add), and
(2) perfectly tuned OAuth scopes across a fleet of MCP servers. Scopes are static and coarse by nature; prompts and context are dynamic. That mismatch is where trouble creeps in.
numpy-thagoras
42 minutes ago
I have prompt-injected myself before by having a model accidentally read a stored library of prompts and get totally confused by it. It took me a hot minute to trace, and that was a 'friendly' accident.
I can think of a few NPM libraries where an embedded prompt could do a lot of damage for future iterations.
cedws
3 hours ago
IMO the way we need to be thinking about prompt injection is that any tool can call any other tool. When introducing a tool with untrusted output (that is to say, pretty much everything, given untrusted input) you’re exposing every other tool as an attack vector.
In addition the LLMs themselves are vulnerable to a variety of attacks. I see no mention of prompt injection from Anthropic or OpenAI in their announcements. It seems like they want everybody to forget that while this is a problem the real-world usefulness of LLMs is severely limited.
darkamaul
3 hours ago
I’m not sure I fully understand what the specific risks are with _this_ system, compared to the more generic concerns around MCP. Could you clarify what new threats it introduces?
Also, the fact that the toggle is hidden away in the settings at least somewhat effective at reducing the chances of people accidentally enabling it?
ascorbic
an hour ago
This doesn't seem much different from Claude's MCP implementation, except it has a lot more warnings and caveats. I haven't managed to actually persuade it to use a tool, so that's one way of making it safe I suppose.
mehdibl
2 hours ago
How many real world cases of prompt injection we have currently embedded in MCP's?
I love the hype over MCP security while the issue is supply chain. But yeah that would make it to broad and less AI/MCP issue.
koakuma-chan
3 hours ago
> I'm confident that the majority of people messing around with things like MCP still don't fully understand how prompt injection attacks work and why they are such a significant threat.
Can you enlighten us?
robinhood
3 hours ago
Well, isn't it like Yolo mode from Claude Code that we've been using, without worry, locally for months now? I truly think that Yolo mode is absolutely fantastic, while dangerous, and I can't wait to see what the future holds there.
kordlessagain
2 hours ago
Your agentic tools need authentication and scope.
chaos_emergent
3 hours ago
I mean, Claude has had MCP use on the desktop client forever? This isn't a new problem.
moralestapia
2 hours ago
>It's powerful but dangerous, and is intended for developers who understand how to safely configure and test connectors.
Right in the opening paragraph.
Some people can never be happy. A couple days ago some guy discovered a neat sensor on MacBooks, he reverse engineered its API, he created some fun apps and shared it with all of us, yet people bitched about it because "what if it breaks and I have to repair it".
Just let doers do and step aside!