Show HN: An experimental AntiBot, AntiCrawl reverse proxy for the web

23 points, posted 13 hours ago
by pulkitsh1234

31 Comments

drdaeman

12 hours ago

Doing the ad industry's work? :-)

I strongly suspect someone from the ad industry will start offering an option to serve a heavily obfuscated WASM-based rendering engine to render the website, with obligatory promises that it "protects the integrity of your content", "stops the AI crawl theft", and of course it also "lowers development costs by ensuring consistent rendering across all platforms".

/s, obviously

ronsor

12 hours ago

Well, this is going to be extremely resource intensive (read: your site will be a DDoS magnet), and it's also an accessibility nightmare.

ranger_danger

12 hours ago

Curious if you have any better suggestions

ronsor

12 hours ago

Don't serve your website as PNGs wrapped in a single page application.

Also, if the main reason for blocking bots is to reduce server load, this solution is going to require running multiple browser instances on a server, which will require a lot of resources just to serve normal traffic.

Edit: I should also mention this is going to chew through bandwidth.

1231232131231

an hour ago

Literally anything other than converting websites to images. For example: captchas, proof-of-work captchas, bot detection using techniques such as TLS fingerprinting, blacklisting known VPS/VPN IPs, etc.
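To make the proof-of-work idea concrete, here is a minimal hashcash-style sketch. It is only an illustration under stated assumptions: the function names, the `challenge:nonce` format, and the difficulty value are all hypothetical, and a real deployment would also need challenge expiry and replay protection.

```python
import hashlib
import os


def make_challenge() -> str:
    """Server side: issue a random challenge string (hypothetical format)."""
    return os.urandom(16).hex()


def leading_zero_bits(digest: bytes) -> int:
    """Count the number of leading zero bits in a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one cheap hash to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce (about 2**difficulty hashes on average)."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, difficulty=16)
    print(challenge, nonce, verify(challenge, nonce, 16))
```

The asymmetry is the point: the visitor pays for many hashes, the server pays for one. It raises the cost of bulk crawling but does not prove a human is present.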

throwaway888abc

11 hours ago

One option would be to render each div/element as a separate image wrapped in ARIA markup

ranger_danger

9 hours ago

Wouldn't the ARIA markup contain the very text they don't want bots to have?

rmbyrro

12 hours ago

Curious that the author works for BrowserStack, a browser-automation service

AnnaMere

11 hours ago

> Render Webpage as PNG for static content

Why not just take a screenshot yourself and use an image tag to render the image without using Puppeteer?

You can add a small CSS media query to show different image sizes for different types of devices.

Why do you even need Puppeteer for such basic things?

myflash13

12 hours ago

I’ve actually seriously been thinking of using WebAuthn to “authenticate” every single page load with a passkey unlocked by a biometric device only, so that I can be sure that every single page load had a meat finger on TouchID or a meat face in front of FaceID before showing the page to them.

In the future I imagine that there will be biometrically secure browsers that will be required for top security applications, that can guarantee that a single physical person is actually physically present while using it.

threatofrain

12 hours ago

WebAuthn doesn't reveal if someone's a human and you can't see whether they used FaceID or whatever, and once someone proves they're human you should just give them a session token.

myflash13

11 hours ago

But I believe there is a spoof-proof way to accept hardware keys only? Limit yourself to known good hardware manufacturers and you should be good.

jeroenhd

11 hours ago

That'll just exclude most of your visitors. The visitors you're not refusing are using TPMs or secure elements you can't trust anyway because of a long history of side channel attacks for "known-good" hardware manufacturers.

hamandcheese

12 hours ago

Couldn't someone just write a software key that lies and says it's biometric?

AFAIK end-to-end manufacturer attestation isn't part of the spec (yet...).

KomoD

12 hours ago

There's even an emulator built in to Chrome

Dev Tools -> 3 dots -> More tools -> WebAuthn -> Enable virtual authenticator environment

myflash13

11 hours ago

Attestation statements can be used to restrict WebAuthn use to certain hardware keys only.

hamandcheese

11 hours ago

But is there a chain of trust going all the way back to the manufacturer? Or is that simply metadata which could be spoofed?

gruez

2 hours ago

Yes. That's what attestation is for.
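For anyone wondering what "restricting to certain hardware keys" looks like in practice, below is a rough sketch that reads the AAGUID (the authenticator model identifier) out of the raw authenticator data from a registration and checks it against an allowlist. The allowlisted AAGUID and the function names are made up for illustration. Note that this check alone proves nothing: the AAGUID is only trustworthy after the attestation statement's signature and certificate chain have been verified against the manufacturer's root, which this sketch does not do.

```python
import uuid
from typing import Optional

# Hypothetical allowlist; real AAGUIDs would come from vendor metadata.
ALLOWED_AAGUIDS = {uuid.UUID("00000000-0000-0000-0000-000000000001")}

# Bit 6 of the flags byte: attested credential data is present.
FLAG_ATTESTED_CREDENTIAL_DATA = 0x40

# authData layout: rpIdHash (32) | flags (1) | signCount (4) | aaguid (16) | ...
AAGUID_OFFSET = 37
AAGUID_LENGTH = 16


def extract_aaguid(auth_data: bytes) -> Optional[uuid.UUID]:
    """Return the AAGUID from registration authData, or None if absent."""
    if len(auth_data) < AAGUID_OFFSET + AAGUID_LENGTH:
        return None
    flags = auth_data[32]
    if not flags & FLAG_ATTESTED_CREDENTIAL_DATA:
        return None
    raw = auth_data[AAGUID_OFFSET:AAGUID_OFFSET + AAGUID_LENGTH]
    return uuid.UUID(bytes=raw)


def is_allowed_authenticator(auth_data: bytes) -> bool:
    """Allowlist check only; assumes attestation was verified elsewhere."""
    aaguid = extract_aaguid(auth_data)
    return aaguid is not None and aaguid in ALLOWED_AAGUIDS
```

A virtual authenticator can report any AAGUID it likes, so skipping the signature check reduces this to exactly the spoofable metadata asked about above.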

jeroenhd

11 hours ago

WebAuthn doesn't work the way you think it does. Most computers don't have fingerprint readers or even webcams, let alone webcams capable of verifying a user. Instead, you'll end up making people type in their Windows password or scanning QR codes.

Remote attestation concepts already exist to solve this problem: https://httptoolkit.com/blog/apple-private-access-tokens-att...

Cloudflare ran an experiment with it, and I think Apple and Cloudflare are working to make it into a proper RFC. It's only a matter of time before other browsers will support it too with the way things are going right now.

dyml

12 hours ago

Please don’t use WebAuthn on every page load.

Two reasons:

1) The protocol is not designed to do this, and the UI/UX is not designed to support it. There are better ways.

2) It will likely not work. There are virtual/software authenticators (available in dev tools) that could generate a valid response without a human.

jeroenhd

11 hours ago

FWIW, using WebAuthn to start a session, setting up a cookie, and validating that cookie to grant access seems like a pretty usable pattern. Not much more invasive than the "checking your connection" screen Cloudflare likes to throw.
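A rough sketch of the cookie half of that pattern, assuming the WebAuthn ceremony has already succeeded elsewhere. The cookie format, `SECRET_KEY`, and `SESSION_TTL` are hypothetical choices for illustration; most web frameworks ship a signed-session mechanism that would be a better starting point.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

SECRET_KEY = secrets.token_bytes(32)  # hypothetical; load from config in practice
SESSION_TTL = 3600  # seconds a session stays valid (assumed value)


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_cookie(now: Optional[float] = None) -> str:
    """Call once after a successful WebAuthn check; returns the cookie value."""
    current = time.time() if now is None else now
    expires = int(current + SESSION_TTL)
    payload = f"{secrets.token_hex(8)}.{expires}"
    return f"{payload}.{_sign(payload)}"


def validate_cookie(cookie: str, now: Optional[float] = None) -> bool:
    """Call on every later page load; no authenticator prompt needed."""
    try:
        session_id, expires, sig = cookie.split(".")
        expiry = int(expires)
    except ValueError:
        return False
    expected = _sign(f"{session_id}.{expires}")
    # Constant-time comparison on bytes to avoid timing leaks.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    current = time.time() if now is None else now
    return current < expiry
```

The visitor sees the authenticator prompt once per session, and every subsequent request is a cheap HMAC check.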

Aachen

12 hours ago

That sounds awful. From there it's a minuscule step to actually needing to authenticate with every website, like a cookie you can't erase if you want to continue using the internet.

rmbyrro

12 hours ago

C'mon folks, PNG? WebAuthn on every page?

If we're talking lunars, why don't you print posters and distribute physically in your neighborhood?

100% bot safe & "meat finger" guarantee.

rmbyrro

12 hours ago

I presume you're not interested in traffic coming from search engines, right?

jeroenhd

11 hours ago

Search engines have publicly listed IP ranges + user agents, so setting up an HTML whitelist for the search engines you do like shouldn't be impossible. Especially now that Google no longer presents its cache to visitors.
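A minimal sketch of such a whitelist check, requiring both the user agent and the source IP to match. The crawler name and the networks below are placeholders (reserved documentation ranges), not any real search engine's published list; in practice you would load the ranges each search engine publishes and refresh them periodically.

```python
import ipaddress

# Hypothetical data: documentation-only ranges standing in for published ones.
CRAWLER_RANGES = {
    "examplebot": [
        ipaddress.ip_network("192.0.2.0/24"),
        ipaddress.ip_network("2001:db8::/32"),
    ],
}


def is_allowed_crawler(remote_ip: str, user_agent: str) -> bool:
    """True only if the UA names a known crawler AND the IP is in its ranges."""
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False
    ua = user_agent.lower()
    for token, networks in CRAWLER_RANGES.items():
        if token in ua and any(addr in net for net in networks):
            return True
    return False


if __name__ == "__main__":
    ua = "Mozilla/5.0 (compatible; ExampleBot/1.0)"
    print(is_allowed_crawler("192.0.2.10", ua))    # inside the placeholder range
    print(is_allowed_crawler("198.51.100.7", ua))  # spoofed UA from another IP
```

Matching on the user agent alone is not enough, since anyone can send a crawler's UA string; the IP check is what makes the whitelist meaningful. Whitelisted crawlers get the HTML version, everyone else gets whatever the proxy serves.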

freen

12 hours ago

Well, I’m not going to use this for its intended purpose, however I AM going to try to see if I can use it to pipe webpages to a simple touchscreen e-ink display and retain interactivity.

tonyponydong

12 hours ago

Please don’t make the web more inaccessible to assistive technology users

ranger_danger

9 hours ago

You're not wrong, but tell that to the bots :) Do you have any other suggestions?

gruez

2 hours ago

>You're not wrong, but tell that to the bots :)

Tell that to the judge when you're slapped with an ADA lawsuit because you failed to provide "reasonable accommodation" to disabled users.

ranger_danger

2 hours ago

I have to say, I've been making websites professionally for 25 years, and I've never heard of this being a thing even once.