Show HN: A minimal TS library that generates prompt injection attacks

31 pointsposted 19 hours ago
by yaoke259

14 Comments

sippeangelo

18 hours ago

Was the whole lib and website vibe coded? I can't find any instructions on how to use it, the repo is for the website itself and the readme is AI blurb that doesn't make me any wiser.

  // Test your AI system
  const results = await injector.runTests(yourAISystem);
???

Even the "prompt-injector" NPM package is something completely different. Does this project even exist?

mpalmer

10 hours ago

The website copy is obviously generated, and has not been reviewed for correctness.

The website trumpets "25+ curated prompt injection patterns from leading security research". The README of the linked Github promises: "100+ curated injection patterns from JailbreakBench".

None of the research sources are actually linked for us to review.

The README lists "integrations" with various security-oriented entities, but no such integration is apparent in the code.

The project doesn't earn the credibility it claims for itself. Because the author trusts bad LLM output enough to publish it as their own work, we have to assume that they don't have the knowledge or experience to recognize it as bad output.

Sorry for the bluntness, but there are few classes of HN submission that rankle as much as these polished bits of fluff. My advice: do not use AI to publicly imply abilities or knowledge you don't have; it will never serve you well.

yaoke259

6 hours ago

Yes, to be completely honest this is a vibe coded project and I'm by no means a security expert. This was more of a fun, side project/experiment based on a shower thought. I admit it's not good/disingenuous to imply security knowledge, but for what it's worth, I just prompted Claude to research the latest papers on prompt injection and it made the claims on its own. Again this should not be an excuse for not reviewing the AI's output more carefully, so in the future I'll be more careful with LLM output and also present it as a vibe-coded project. Apologies, I'm just a noob in prompt injection security who doesn't know what he's doing :(

mpalmer

5 hours ago

There's absolutely no problem with not knowing what you're doing! Just, you know, own it.

Part of what I find exhausting about projects like this is I can't see any evidence of the person who ostensibly created it. No human touch whatsoever - it's a real drag to read this stuff.

By all means, vibe code things, but put your personal stamp on it if you want people to take notice.

yaoke259

5 hours ago

yes absolutely, updating the page now as we speak!

yaoke259

6 hours ago

Your feedback is valuable and correct, I'll extract the library into /core in the repo and also manually verify all the citations. I'll read into the prompt injection literature more deeply and turn this from a shower thought project into something more mature

mosselman

15 hours ago

What are some good prevention mechanisms for this? A sort of firewall for prompts? I've seen people recommend LLMs, but that seems like it wouldn't work well. What is the industry standard? Or what looks promising at least?

hoppp

14 hours ago

Nothing yet. Probably a new kind of model needs to be trained that can find injected prompts, sort if like an immune system for LLMs. Then the sanitized data can be passed to the LLM after.

No real solution for it yet. I would be interested to try to train a model for this but no budget atm.

HKayn

17 hours ago

Why did you use something as heavy as SvelteKit for a website with a single page? This doesn't inspire confidence.

yaoke259

6 hours ago

Sveltekit is not heavy, it is compiled into lightweight bundles