lr001328
We built a free scanner that checks 7 layers of AI discoverability — llms.txt, JSON-LD structured data, OpenAPI spec, A2A agent cards, health endpoints, robots.txt/sitemap, and whether you have a machine-readable service catalog.
You enter a URL, it streams results in real time via SSE, and gives you a score out of 100 with specific findings per layer.
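For anyone curious what consuming that stream looks like, here is a minimal sketch of parsing SSE events into per-layer findings. The payload shape (`layer`, `found`, `points`) is an assumption for illustration, not the scanner's documented API:

```python
# Minimal SSE parsing sketch. The event payload fields are hypothetical,
# not the scanner's actual wire format.
import json

def parse_sse(lines):
    """Yield decoded JSON payloads from an iterable of SSE text lines."""
    buffer = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("data:"):
            buffer.append(line[len("data:"):].strip())
        elif line == "" and buffer:
            # A blank line terminates one SSE event
            yield json.loads("\n".join(buffer))
            buffer = []

# Example stream (hypothetical findings)
stream = [
    'data: {"layer": "llms.txt", "found": false, "points": 0}',
    "",
    'data: {"layer": "json-ld", "found": true, "points": 20}',
    "",
]
events = list(parse_sse(stream))
score = sum(e["points"] for e in events)  # running total toward the /100 score
```

In practice you'd read the lines from a streaming HTTP response instead of a list, but the framing rules (data lines, blank-line delimiter) are the same.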
Why we built it: 80% of URLs cited by ChatGPT, Perplexity, and Copilot don't rank in Google's top 100 for the same query. AI discovery is a fundamentally different layer from traditional search — and most sites are completely invisible to it.
Some things we learned building the audit engine:
- Structured data matters most. Sites with proper JSON-LD schema see measurably higher AI citation rates. Microsoft has confirmed schema markup helps their LLMs.
- llms.txt is aspirational. We check for it, but we should be honest: no major AI platform has publicly confirmed that it reads the file, and our statistical analysis shows no correlation with citation rates. We still think it's worth having as a context primer for dev docs, but it's not the silver bullet some people think it is.
- AI crawlers don't execute JavaScript. GPTBot, ClaudeBot, PerplexityBot — none of them run JS. If your site is a client-rendered SPA with no SSR, AI agents see an empty page.
- The A2A protocol is early but interesting. Google's Agent-to-Agent spec includes agent cards at /.well-known/agent-card.json. Almost nobody has one yet, but the spec exists and crawlers are starting to look for it.
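Since AI crawlers only see raw HTML (no JS execution), a structured-data check reduces to scanning the server-rendered source for `<script type="application/ld+json">` blocks. A rough sketch of that layer, assuming Python's stdlib HTML parser (the real audit engine may work differently):

```python
# Sketch: find and decode JSON-LD blocks in raw (server-rendered) HTML,
# roughly how a structured-data audit layer might work. Details assumed.
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_ld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and ("type", "application/ld+json") in attrs:
            self.in_ld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_ld = False

    def handle_data(self, data):
        if self.in_ld and data.strip():
            try:
                self.blocks.append(json.loads(data))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is itself a reportable finding

html = '''<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "SoftwareApplication", "name": "Example"}
</script>
</head><body></body></html>'''
parser = JSONLDExtractor()
parser.feed(html)
```

Note that this is run against the HTML as fetched, with no JS execution: if your SPA injects its JSON-LD client-side, `parser.blocks` stays empty, which is exactly what GPTBot and friends see.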
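Several of the layers above boil down to probing well-known paths and checking for a 200. A sketch of that probe, with a fetcher injected so it's testable; the path list mirrors the layers we scan, but the exact paths and any scoring weights here are illustrative assumptions:

```python
# Sketch of probing well-known discovery endpoints. Paths mirror the
# layers described above; the mapping is illustrative, not our exact list.
WELL_KNOWN = {
    "/llms.txt": "llms.txt context primer",
    "/.well-known/agent-card.json": "A2A agent card",
    "/openapi.json": "OpenAPI spec",
    "/robots.txt": "crawler policy",
}

def probe(fetch):
    """fetch(path) -> HTTP status; return the layers that answered 200."""
    return {path: desc for path, desc in WELL_KNOWN.items() if fetch(path) == 200}

# Fake fetcher standing in for real HTTP requests against a target host
found = probe(lambda path: 200 if path in ("/robots.txt", "/llms.txt") else 404)
```

In production you'd swap the lambda for a real HTTP client with timeouts and redirect handling; injecting the fetcher keeps the scoring logic trivially testable.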
Try it: https://clarvia.dev