dsagostini
13 hours ago
Hello HN, I'm the builder of ViaMetric.
We noticed that many sites whose robots.txt allows User-agent: * still return 403s to LLM crawlers, because legacy WAF rules effectively block 'non-browser' user agents.
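You can reproduce the check yourself by requesting the same URL with a crawler user agent and comparing the status to a browser-like request. A minimal sketch in Python (the UA strings here are illustrative, not the exact ones the tool sends):

```python
# Compare how a site answers a crawler UA vs. a browser UA.
# UA strings are illustrative placeholders, not ViaMetric's exact values.
from urllib.error import HTTPError
from urllib.request import Request, urlopen

BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36"
GPTBOT_UA = "Mozilla/5.0; compatible; GPTBot/1.0; +https://openai.com/gptbot"

def make_probe(url: str, user_agent: str) -> Request:
    """Build a GET request carrying the given User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})

def status_for(url: str, user_agent: str) -> int:
    """Fetch the URL; a 403 here (but not for the browser UA) points at the WAF."""
    try:
        with urlopen(make_probe(url, user_agent), timeout=10) as resp:
            return resp.status
    except HTTPError as err:
        return err.code
```

If `status_for(url, GPTBOT_UA)` is 403 while `status_for(url, BROWSER_UA)` is 200, robots.txt never got a say: the firewall rejected the crawler first.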
This free tool sends real HTTP requests with the GPTBot and PerplexityBot user agents to verify that your firewall is letting them through. It also computes a 'Content Density' score (text-to-HTML ratio), which we've found correlates with better retrieval in RAG systems.
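The density idea is simple: visible text length divided by total markup length, so tag soup, scripts, and inline CSS drag the score down. A rough stdlib sketch of my reading of the metric (the tool itself uses Cheerio; this is not ViaMetric's exact formula):

```python
# "Content Density" sketch: visible-text chars over total HTML chars.
# Approximation of the text-to-HTML ratio, not ViaMetric's exact formula.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text nodes, skipping <script> and <style> contents."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)

def content_density(html: str) -> float:
    """Return visible-text length over total HTML length (0.0 for empty input)."""
    if not html:
        return 0.0
    parser = TextExtractor()
    parser.feed(html)
    text = "".join(parser.chunks).strip()
    return len(text) / len(html)
```

A page that is mostly JS bundles and wrapper divs scores near zero; a mostly-prose page lands much higher, which matches the intuition that RAG retrievers have more to work with.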
Tech Stack: Next.js, Cheerio for parsing, Supabase backend. Feedback welcome!