My advice to little sites: Basic Auth stops bots dead in their tracks.
The credentials have to be discoverable of course, which makes it useless for most sites, but it's not that inconvenient. Once entered, browsers offer to use the same creds on subsequent visits.
I have several apps that use 3D assets that are large enough for it to become worrisome when hundreds of bots are requesting them day and night. Not any more, lol.
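For anyone who wants to try this without a web server module, here is a minimal sketch of the check in Python. The `guest`/`guest` credentials, the `assets` realm, and the function names are made up for illustration; in practice most people would just switch on Basic Auth in their server or framework.

```python
import base64
import hmac

# Hypothetical shared credentials, posted somewhere a human visitor can find them.
EXPECTED_USER = "guest"
EXPECTED_PASSWORD = "guest"


def check_basic_auth(header):
    """Return True if an Authorization header value carries the expected credentials."""
    if not header:
        return False
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":  # the auth scheme name is case-insensitive
        return False
    try:
        decoded = base64.b64decode(token.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False  # not valid base64 or not valid UTF-8
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # Constant-time comparison; evaluate both so timing does not depend on the user name.
    user_ok = hmac.compare_digest(user.encode(), EXPECTED_USER.encode())
    pass_ok = hmac.compare_digest(password.encode(), EXPECTED_PASSWORD.encode())
    return user_ok and pass_ok


def respond(header):
    """Return (status, extra_headers) for a request with the given Authorization value."""
    if check_basic_auth(header):
        return 200, {}
    # The 401 plus WWW-Authenticate is what makes browsers show the login prompt.
    return 401, {"WWW-Authenticate": 'Basic realm="assets"'}
```

A request without the header gets the 401 challenge, which is the part bots that never answer prompts fail at.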
That's good advice, thank you.
In our approach, we do our best not to affect the user experience. For example, consider a company website with a blog. The company does its best to attract a larger audience to its blog, products, and so on. I'd guess a good part of that audience would be lost if the website required authentication on their very first visit.
However, for returning, and especially regular, clients I think that is a really simple and good solution.
Where can I learn more about JA5? John Althouse (which is what JA stands for) has not published anything about JA5 yet.
Thanks for sharing!
The heuristics you use are interesting, but this will likely only be a hindrance to lazy bot creators. TLS fingerprints can be spoofed relatively easily, and most bots rotate their IPs and signals to avoid detection. With ML tools becoming more accessible, it's only a matter of time until bots are able to mimic human traffic well enough, both on the protocol and application level. They probably exist already, even if the cost is prohibitively high for most attackers, but that will go down.
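To illustrate why spoofing is cheap, here is a rough sketch of a JA3-style fingerprint (JA3 being the publicly documented scheme, not necessarily what the article uses). The function name and the sample values are made up, and this sketch skips details such as filtering GREASE values.

```python
import hashlib


def ja3_style_hash(tls_version, ciphers, extensions, curves, point_formats):
    """MD5 over the ClientHello fields, JA3-style.

    Fields are comma-separated; the values inside each list are joined with "-".
    """
    parts = [
        str(tls_version),
        "-".join(str(v) for v in ciphers),
        "-".join(str(v) for v in extensions),
        "-".join(str(v) for v in curves),
        "-".join(str(v) for v in point_formats),
    ]
    return hashlib.md5(",".join(parts).encode("ascii")).hexdigest()


# Hypothetical values standing in for what a real browser sends.
browser = ja3_style_hash(771, [4865, 4866, 49195], [0, 10, 11], [29, 23], [0])

# A bot that sends the same fields in the same order gets the same hash.
bot_copying_browser = ja3_style_hash(771, [4865, 4866, 49195], [0, 10, 11], [29, 23], [0])

# A lazy bot with its library's default cipher order stands out.
lazy_bot = ja3_style_hash(771, [49195, 4865, 4866], [0, 10, 11], [29, 23], [0])
```

Every input is chosen by the client, so the hash only separates clients that don't bother to copy a browser's ClientHello.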
Theoretically, deploying ML-based defenses is the only viable path forward, but even that will become infeasible. As the amount of internet traffic generated by bots surpasses the current ~50%, you can't realistically block half the internet.
So, ultimately, I think allow lists are the only option if we want to have a usable internet for humans. We need a secure and user-friendly way to identify trusted clients, which, unfortunately, is ripe to be exploited by companies and governments. All proposed device attestation and identity services I've seen make me uneasy. This needs to be a standard built into the internet, based on modern open cryptography, and not controlled by a single company or government.
I suppose it already exists with TLS client authentication, but that is highly impractical to deploy. Is there an ACME protocol for clients? ... Huh, Let's Encrypt did support issuing client certs, but they dropped it[1].
[1]: https://news.ycombinator.com/item?id=44018400
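For reference, the server side of TLS client authentication is the easy part. A minimal sketch with Python's `ssl` module, where the function names and file paths are placeholders:

```python
import ssl


def require_client_certs(ctx):
    """Make the handshake fail unless the client presents a certificate that verifies."""
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def make_server_context(cert_file, key_file, client_ca_file):
    """Build a server-side context that only accepts clients signed by client_ca_file.

    All three paths are placeholders for files you would have to create yourself.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # The server's own certificate and private key.
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    # The CA whose signature on a client certificate we are willing to trust.
    ctx.load_verify_locations(cafile=client_ca_file)
    return require_client_certs(ctx)
```

The impractical part is everything this leaves out: issuing a certificate to every client, getting it installed in their browsers, and renewing it.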
This is not realistically useful.