Show HN: Yikes imagine trusting Google's documentation

11 points posted 11 hours ago
by aa_y_ush

14 Comments

wibbily

3 hours ago

Huh? The first "conflict" you list isn't a conflict.

> The snippet from "search docs crawling indexing pause online business" states that adding a Disallow: / rule for Googlebot in robots.txt will keep Googlebot away permanently as long as the rule remains. "search help office hours 2023 june", however, advises against disallowing all crawling via robots.txt, warning that such a file "may remove the website's content, and potentially its URLs, from Google Search." This directly contradicts the claim that a full-disallow rule safely blocks Googlebot without negative consequences, creating a true conflict about the effect and advisability of using a disallow rule to block Googlebot.

If you want to block Googlebot "permanently", why would you expect to stay listed in Search? The first page actually agrees with the second - if you only want to temporarily block crawling, it recommends not blocking Googlebot.
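For anyone who wants to check what a full-disallow rule does to a given crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt body, the `can_crawl` helper, and the example.com URL are made up for illustration. It only models how the rules are matched, not what Google does with its index afterwards.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Googlebot entirely, allow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative URL only.
googlebot_allowed = can_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/products")
other_allowed = can_crawl(ROBOTS_TXT, "SomeOtherBot", "https://example.com/products")

print("Googlebot allowed:", googlebot_allowed)
print("Other bot allowed:", other_allowed)
```

As long as the rule stays in place, the parser reports every path as off limits for Googlebot, which is the behavior both pages describe.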

Actually, your last "conflict" is bad too. A 503 fetching robots.txt does stop crawling the site, for at least twelve hours and possibly forever (if other pages return errors). The only crawling Google will continue to do is to keep trying to fetch robots.txt.

I appreciate what you're trying to set up here but 2/4 is a pretty bad record for a demo.

yash_5339

8 hours ago

This definitely solves a good problem. Companies generally don't keep good Confluence docs and documentation. A common source of truth helps the entire org. But I was wondering whether it would be helpful for the external world, because I feel companies usually double-check any information before releasing it publicly, especially anything related to the code base. (Just a thought.)

Portoaj

7 hours ago

Totally agree - we've found a handful of conflicts on every large company whose public docs we've looked at but a lot of the value is definitely in internal docs where there's no technical writer double checking everything that comes out.

rathinshah

10 hours ago

This is really good. I wonder if these "truths" propagate anywhere or not.

aa_y_ush

10 hours ago

thanks! yes, we enable auto updating documentation, automatic conflict resolution, and accurate search indexing.

SchmitzAndrew

7 hours ago

would be cool to extend this to enable auto-creating a pr to update a docs repo!

aa_y_ush

7 hours ago

Hey Andrew, that is a 100% in play.

ayushman_gupta_

8 hours ago

Love the concept. Just curious if and how you guys determine which source is correct in case of a conflict?

aa_y_ush

8 hours ago

We do! An org can define precedence rules, but the engine looks at things like recency, authority, majority voting etc. We also flag criticality to raise manual reviews when needed.
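To make the idea concrete, here is a rough sketch of how precedence rules, majority voting, authority, and recency could be combined. This is not the actual engine. The `Claim` type, the `resolve` function, the numeric `authority` field, and the tie-breaking order (majority, then authority, then recency) are all assumptions for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    source: str      # where the statement was found (hypothetical names)
    value: str       # what the source asserts
    updated: date    # last-modified date of the source
    authority: int   # higher means more trusted (assumed scale)


def resolve(claims: list[Claim], precedence: tuple[str, ...] = ()) -> str:
    """Pick one value from conflicting claims.

    Org-defined precedence wins outright. Otherwise rank by majority vote,
    then authority, then recency (an assumed ordering).
    """
    if not claims:
        raise ValueError("no claims to resolve")

    # 1. Explicit precedence rules: the first listed source that has a claim wins.
    for source in precedence:
        for claim in claims:
            if claim.source == source:
                return claim.value

    # 2. Fallback: count how many sources agree on each value.
    votes = Counter(claim.value for claim in claims)
    best = max(claims, key=lambda c: (votes[c.value], c.authority, c.updated))
    return best.value


claims = [
    Claim("wiki", "timeout=30s", date(2022, 1, 1), 1),
    Claim("api-docs", "timeout=60s", date(2024, 5, 1), 3),
    Claim("readme", "timeout=30s", date(2023, 1, 1), 1),
]

print(resolve(claims))                            # majority decides
print(resolve(claims, precedence=("api-docs",)))  # precedence rule decides
```

In this toy data, two stale sources outvote the newer, more authoritative one unless a precedence rule is set, which is the kind of case where flagging for manual review would matter.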

mtyagi

10 hours ago

Makes sense now why integrations sometimes break unexpectedly. Conflicting info in official docs is a real problem.

aa_y_ush

10 hours ago

have you seen a similar situation before?

mtyagi

10 hours ago

Seen it a lot in enterprise environments. Teams maintain parallel Confluence spaces and internal API docs. They drift constantly. The newer page is correct, but search still surfaces the old one first