Self-Hosted JA4 to combat AI bots

3 points, posted 11 hours ago
by ArcHound

9 Comments

mmarian

9 hours ago

As a learning exercise: great. As an actual mitigation technique: less so, since motivated actors can rotate even JA4 fingerprints pretty easily these days. Detecting rotation patterns might still work (for now :D)

ArcHound

9 hours ago

This is the sad conclusion of the next part. JA4 is a great supplement that can squeeze out some additional info, but a motivated attacker can evade it.

How motivated noisy AI scrapers really are is still an open question. Even a solution that cuts out 50 percent of the dumbest scraping attempts would provide much-needed relief to a struggling site.

mmarian

9 hours ago

I'm curious, what kind of site struggles do you have in mind? In my experience, JA4 is used as a hammer for which a nail must be found; simpler solutions often work better.

ArcHound

8 hours ago

I think we agree that JA4 is situational. It really saved me when investigating a credential stuffing attack: random logins with a random chance of success, spread across many ASNs, all with the same fingerprint.
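A minimal sketch of that kind of investigation, assuming login events have already been parsed into records with a JA4 fingerprint and an ASN. The field names `ja4` and `asn`, the helper name, the threshold, and the sample values (placeholder hashes, documentation-range ASNs) are all made up for illustration and are not taken from the post:

```python
from collections import defaultdict


def fingerprints_across_asns(events, min_asns=3):
    """Map each JA4 fingerprint to the set of ASNs it was seen from,
    keeping only fingerprints observed from at least `min_asns` ASNs."""
    asns_by_ja4 = defaultdict(set)
    for event in events:
        asns_by_ja4[event["ja4"]].add(event["asn"])
    return {
        ja4: asns
        for ja4, asns in asns_by_ja4.items()
        if len(asns) >= min_asns
    }


# Hypothetical login attempts: one fingerprint spread over many ASNs,
# plus an unrelated client seen from a single network.
events = [
    {"ja4": "t13d1516h2_aaaaaaaaaaaa_bbbbbbbbbbbb", "asn": 64496},
    {"ja4": "t13d1516h2_aaaaaaaaaaaa_bbbbbbbbbbbb", "asn": 64497},
    {"ja4": "t13d1516h2_aaaaaaaaaaaa_bbbbbbbbbbbb", "asn": 64498},
    {"ja4": "t13d1715h1_cccccccccccc_dddddddddddd", "asn": 64499},
]

suspicious = fingerprints_across_asns(events, min_asns=3)
for ja4, asns in suspicious.items():
    print(ja4, sorted(asns))
```

A fingerprint shared across many unrelated networks is only a hint: popular browsers produce the same JA4 from everywhere, so in practice this would be combined with other signals such as the login failure rate.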

In my experience, bots come at all levels of sophistication. Added together, they can put a ridiculous load on a site (especially a fragile one that you have to secure anyway). So I look at the volume and try to block whatever dumb traffic I can get away with blocking.

It is a game of whack-a-mole, but it can also cut overall traffic to a fraction of the original, which has tangible infra cost benefits.

And yes, captcha works better in a lot of cases. Fortunately I'm not selling JA4, I'm just curious.

And yes, IP rate limits and ASN checks work really well in plenty of cases. Side note: I got a high-throughput free offline ASN checker too! https://blog.miloslavhomer.cz/asn-check/
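For reference, a per-IP rate limit can be sketched as a sliding window over request timestamps. This is a generic in-memory illustration, not the setup used on the blog; the class name, the limits, and the explicit `now` argument (passed in to keep the example deterministic) are assumptions:

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client IP in any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip, now):
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10)
decisions = [limiter.allow("192.0.2.1", t) for t in (0, 1, 2, 11)]
print(decisions)
```

In a real deployment this state would usually live in the reverse proxy or a shared store rather than in application memory.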

mmarian

8 hours ago

I agree JA4 is situational, but the number of use cases is smaller than most people think. Like you said, a CAPTCHA works better; it would've stopped the credential stuffing. Managed DDoS services (Cloudflare et al.) plus rate limits are better at handling DDoS.

Cool ASN project, but doesn't IPInfo already offer this for free: https://ipinfo.io/lite ?

Bender

10 hours ago

This is a good write-up. Is your blog running JA4 right now?

ArcHound

10 hours ago

Hello again! Yes it is. If you have an exotic client, I'm here for it :D

Bender

10 hours ago

Nice. I was more curious about the clients using HTTP/2: of those that spoof all the other headers a browser sends, what percentage is JA4 detecting as bots? That is the missing piece in my blog write-up, as I don't do SSL fingerprinting. I am trying to see what percentage is getting through my very crude methods.

ArcHound

10 hours ago

I'll get back to you on this; I'll need to parse some logs. I should at least have the ALPNs.
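For context, the ALPN is recoverable from a JA4 string itself: the first section (JA4_a) ends with the first and last character of the first ALPN value the client offered, so `h2` for HTTP/2, `h1` for `http/1.1`, and `00` when no ALPN was sent. A minimal sketch of tallying it, assuming the fingerprints have already been extracted from the logs; the function names and sample fingerprints (with placeholder hashes) are hypothetical:

```python
from collections import Counter


def ja4_alpn(ja4):
    """Return the two-character ALPN field from the JA4_a section.

    JA4_a layout: protocol (1) + TLS version (2) + SNI flag (1)
    + cipher count (2) + extension count (2) + ALPN (2) = 10 chars.
    """
    part_a = ja4.split("_", 1)[0]
    if len(part_a) != 10:
        raise ValueError(f"unexpected JA4_a section: {part_a!r}")
    return part_a[8:10]


def alpn_counts(fingerprints):
    """Count how often each ALPN field appears in a list of JA4 strings."""
    return Counter(ja4_alpn(fp) for fp in fingerprints)


# Hypothetical fingerprints pulled from access logs.
fingerprints = [
    "t13d1516h2_aaaaaaaaaaaa_bbbbbbbbbbbb",
    "t13d1516h2_aaaaaaaaaaaa_bbbbbbbbbbbb",
    "t13d1715h1_cccccccccccc_dddddddddddd",
    "t12i080500_eeeeeeeeeeee_ffffffffffff",
]

counts = alpn_counts(fingerprints)
h2_share = counts["h2"] / len(fingerprints)
print(counts, h2_share)
```

The ALPN only shows what the client offered during the TLS handshake. Deciding which of those HTTP/2 clients are bots would still need a separate list of known-browser fingerprints to compare against.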