ryanator777
3 hours ago
Just published a blog post on this topic. We've been fortunate at SerpApi to avoid too many negative effects of this, but I think other web scraping companies were hit harder.
3 hours ago
Just wrote a blog post covering this topic:
2 days ago
Generally works for me with https://www.google.com/search?gbv=1&q=test with JS blocked at domain-level by uBlock Origin.
With the caveat that this used to work 100% of the time, but for the past couple of months it has indeed occasionally redirected to the “Turn on JavaScript to keep searching” page you mention, https://www.google.com/httpservice/retry/enablejs . I'd say the refusal happens in about 1 in 20 searches. Said differently, I’d prefix your “refusing” with a “sometimes”.
I haven’t investigated the reason for this sometimes-ness. Would love to find an answer here, or ideas/leads (aside from switching to another search engine; yes, I do know about them, but sometimes Google remains better). Or maybe the sometimes-ness was just A/B testing, and the full switch is happening, so this is now a thing of the past.
EDIT: you must have posted precisely as the A/B test ended: I did several non-JS searches today at $job, and to confirm what I was writing here I did a test one, successfully. But 30 min later, I can confirm your observation: 100% blocked.
a day ago
If I copy "https://www.google.com/search?gbv=1&q=test" into the URL bar in Firefox 128.6.0esr on Linux and hit enter, I get a page that says "Turn on JavaScript to keep searching."
a day ago
Yes, see EDIT at the bottom of my post.
2 days ago
Google is now forcing me to go elsewhere, and I pay for gsuite. I may as well move it all over to Proton.
I use noscript and refuse to turn on JavaScript for anything but actual Web Applications. I do not turn it on for general browsing of content to be read, or for search, because of the UX abuse that JavaScript enables.
The majority of exploits in the wild are delivered via drive by JavaScript.
That said I'm all for honest Advertising if the UX is not shit.
In fact, I think there should be an HTML5 <ad></ad> tag implemented in the browser's sandbox that supports the IAB VAST specs, so none of that "VAST MACRO" garbage would need to be done via huge JavaScript payloads.
2 days ago
Maybe Proton should get into the search business. Except Proton requires I turn on JavaScript.
2 days ago
I think that lamenting the end of an era because Google doesn’t offer hyperlinked docs is like lamenting the end of fine dining because Olive Garden doesn’t offer cloth serviettes.
We’re looking in the wrong place if we want an ad company to be the champion of anything but revenue optimization.
8 hours ago
I got it to work again with a user agent from Links: `Links (2.29; Linux 6.11.0-13-generic x86_64; GNU C 13.2; text)`.
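For anyone who wants to try the same trick from a script, here is a minimal sketch of building such a request. The `buildSearchRequest` helper is made up for illustration, the `gbv=1` parameter is simply copied from the URLs quoted elsewhere in this thread, and there is no guarantee Google keeps accepting this user agent.

```javascript
// Hypothetical helper: builds the URL and headers for a no-JS search request.
// The UA string is the Links one quoted above; whether Google still accepts
// it is an assumption, not something this code can guarantee.
const LINKS_UA = "Links (2.29; Linux 6.11.0-13-generic x86_64; GNU C 13.2; text)";

function buildSearchRequest(query, userAgent = LINKS_UA) {
  const url = new URL("https://www.google.com/search");
  url.searchParams.set("gbv", "1"); // parameter used in the URLs quoted in this thread
  url.searchParams.set("q", query);
  return { url: url.toString(), headers: { "User-Agent": userAgent } };
}

const req = buildSearchRequest("test");
console.log(req.url);
// To actually send it (any runtime with fetch):
//   const res = await fetch(req.url, { headers: req.headers });
```

The request is only built here, not sent, so the sketch can be run without touching the network.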
a day ago
It occurs to me this might be of interest to readers in this
thread.
A couple years ago I put together a list of sites that render
well using Lynx and EWW. Since both browsers don't support
JavaScript out of the box, maybe this is interesting to people
here?
https://ohmeadhbh.github.io/bobcat/
I noticed the Greycoder site has several sites I should probably
add. But if you have a link you think should be included,
please submit a PR on GitHub. While I'm not horribly sensitive
to GitHub's weirdness after being purchased by Microsoft, I am
sensitive to people who are sensitive to it. So if you don't
want to go onto GitHub to submit a PR, you can find my email
address at https://github.com/ohmeadhbh, just send me an email.
2 days ago
I've filed this issue in their Google Search Community: https://support.google.com/websearch/thread/318978583
You can also click the wrench icon in the Google SERP and choose Send feedback.
In the meantime, use another search engine, such as Brave Search, DDG or Searx.
2 days ago
Good time to get off google then!
a day ago
Highly recommend Kagi Search as an alternative. The results are generally better than Google's anyways, and don't require JavaScript. It is a paid service, but at this point having reliable/privacy-respecting search is worth it.
Not affiliated with Kagi btw.
a day ago
I see it compares ads/sponsored results with Kagi showing just web pages. If I use uBlock and block elements, does Kagi have an advantage over that? In other words, are the search results themselves superior, setting aside any QOL improvements Kagi might bring?
a day ago
I personally find the search results from Kagi to be superior to Google/DDG/etc, beyond just not having ads or sponsored content. Before switching to Kagi, I had started to feel like a lot of the front page results from Google were just sites that had managed to maximize their SEO but not actually have much valuable content. That hasn’t seemed to be the case with Kagi. I generally find that the results from them are a lot more informative.
Of course that’s highly subjective, so I think it’s worth trying out their free tier to see if the potential improvement in search result quality is something you notice and find valuable enough to spend money on.
a day ago
Or use free SearXNG: https://searx.be
a day ago
a day ago
Somewhat apropos... this isn't the first time people have been curious about which websites work with JS turned off:
From five years ago: https://dev.to/ziizium/famous-websites-with-javascript-disab...
And eight years ago: https://www.jakobstoeck.de/2017/websites-which-work-great-wi...
It does seem like JavaScript is required on more sites as time moves forward.
a day ago
"It does seem like Javascript is required on more sites as time moves forward."
If the statement was "Javascript is used on more sites" then I would agree and it is easy to test for use of Javascript.
But a statement like "Javascript is required on more sites" is difficult to agree with, as I have a very different experience.
For example, I am now retrieving Google results from the command line without Javascript using a specific UA string. Arguably, that means no Javascript is "required" to retrieve search results. A specific UA string is now required though. Use the wrong UA string and then Javascript is "required" to retrieve the results.
Rather than focusing on Javascript, a more interesting question might be whether more sites are requiring specific UA strings.
By default I do not use Javascript (I do not use a graphical web browser) nor do I send a User-Agent header. The overwhelming majority of websites "work" for me with no problems. To me, it does not seem that more sites are requiring specific UA strings as time moves forward.
Google www search is just one website. The www is vast.
2 days ago
I guess Google is fed up with freeloaders piggybacking. Requiring JS is going to break a bunch of LLM crawlers immediately.
2 days ago
It'll also break for a lot of people with impaired vision and screen readers. Screen readers can't keep up with the insane development pace of JS and CSS and so people with impaired vision are going to be left behind. It's an accessibility nightmare.
2 days ago
Google and most search engines work fine with screen readers with JavaScript enabled. I think your understanding of how web accessibility works is likely severely outdated. So many websites use JavaScript that it would be a disservice if the web didn't support accessible interfaces for pages with JavaScript.
https://en.m.wikipedia.org/wiki/WAI-ARIA
That said, the first rule of ARIA use is to prefer native HTML elements over custom, scripted ones where possible, as that's always less error prone. That doesn't mean websites shouldn't use JavaScript when they have reasons to do so, as long as they correctly follow ARIA.
2 days ago
And which reasons do you think Google absolutely has in order to disable completely the usage of the search engine without Javascript?
2 days ago
This is a common myth.
Screen readers are not a type of web browser. They are software which interacts with other software running on the computer, including web browsers. There is nothing which inherently makes JS or CSS incompatible with screen readers.
2 days ago
Yep. There's a bazillion of accessible JS libraries. Just manage tabindex/aria attributes. Accessibility is about actual DOM not the html string returned from server.
JS gives the same improvements for screen readers as for everyone else especially with complex apps.
Bad JS of course ruins things as usual, same as bad HTML with table layout or whatever. But that's not JS on google.com;)
2 days ago
That doesn't make it a myth. There are plenty of screen readers that break directly because of shitty use of javascript.
2 days ago
Sure. But it certainly isn't as simple as "screen readers don't support Javascript or CSS" (or whatever). You can make an inaccessible web site with or without modern web technologies - a site based around HTML image maps or table layouts will baffle a screen reader just as badly.
2 days ago
Two words: shadow dom. Now tell me how a reader is supposed to know what's what?
2 days ago
My understanding is that people with impaired vision use the regular browser and a layer on top of it, such as VoiceOver. They don't need a special version of website. And screen readers don't need to keep up with JS.
2 days ago
I think google did this to force AI on us. Does not matter to me, I left google a year or 2 ago.
Makes me wonder about Google and AI, with my tin-foil hat on, I cannot help but think Google/AI searches your cache and cookies looking for info.
2 days ago
Google has questionable behavior in its browser[0] and tracking technologies[1] that sound similar to what you describe, but I believe the search itself is behaving normally. It runs slowly because all LLM chatbots use tons of processing power to pore through servers full of data that may or may not be accurate.
I agree Google is trying to force AI on us, but for a different reason: to demonstrate its value to shareholders.
[0]: https://www.eff.org/deeplinks/2023/09/how-turn-googles-priva...
[1]: https://www.schneier.com/blog/archives/2025/01/google-is-all...
2 days ago
The way web applications work, there is domain separation of data (be it cache or cookies), so Google's "AI" isn't going to be able to read data that it didn't already have access to.
2 days ago
this wouldn't matter if the page itself was calling the AI thingamajig
2 days ago
This isn't making sense to me. Due to the same-origin policy, anything on google.com can only access the cookies and other application data stored there by google.com. Doesn't matter if it's JavaScript or an "AI", Google can't break the same-origin policy and read cookies from other domains. AI changes nothing about this.
(Google services' widespread use by 3rd party sites does give them more data, but they have that whether you load google.com with JS or not. And again, unclear how AI changes anything about what data is available to them.)
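To make the origin comparison concrete, here is a small sketch of the check the same-origin policy is built on: two URLs share an origin only when scheme, host, and port all match. The `sameOrigin` helper is just for illustration, and note that cookies themselves are scoped by domain and path rather than by strict origin, so this shows the principle rather than the exact cookie rules.

```javascript
// Two URLs share an origin only if scheme, host, and port all match.
// URL.origin serializes exactly those three parts.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Same scheme, host, and (default) port: same origin.
console.log(sameOrigin("https://www.google.com/search?q=test", "https://www.google.com/maps"));
// Different host: different origin.
console.log(sameOrigin("https://www.google.com/", "https://www.reddit.com/"));
// Different scheme: different origin.
console.log(sameOrigin("https://www.google.com/", "http://www.google.com/"));
```

Only the first pair compares as same-origin; the path and query string play no part in the comparison.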
2 days ago
Google can eventually allow anything on their domains to access whatever they want in the future and the end user can do nothing about this. https://news.ycombinator.com/item?id=40918052
2 days ago
If they do go this route it'd of course be highly controversial (not that they care), but also limited to Chrome. Even if they dare put it in Chromium itself, Brave, Vivaldi, etc would rip it out. Microsoft would just redirect the collection in Edge to their servers.
2 days ago
> Makes me wonder about Google and AI, with my tin-foil hat on, I cannot help but think Google/AI searches your cache and cookies looking for info.
This is nonsense. Any cached data or cookies that Google’s scripts have access to was saved by those same scripts. If any site’s “AI” (not sure what you mean by that) could search through objects cached by other sites, you’d have bigger problems.
2 days ago
What's FF esr 128?
EDIT: Figured it out - https://www.reddit.com/r/firefox/comments/1ca4ii3/what_is_fi... - it's "Firefox Extended Support Release" - https://support.mozilla.org/en-US/kb/firefox-esr-release-cyc...
2 days ago
Yep, sorry. The default browser for Debian Linux. I've also found it's blocking firefox forks like Palemoon 33. But really old browsers from the 2015 era don't get blocked (yet). User-agent spoofing does nothing.
2 days ago
How does it detect despite the user-agent?
2 days ago
I was wrong. There are a few user-agents they still allow. Like "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:133.0) Gecko/20100101 Firefox/22.0"
2 days ago
Firefox Extended Support Release version 128
2 days ago
2 days ago
Firefox Extended Support Release
a day ago
Is anyone collecting links to interesting sites that work without javascript? This could slowly be turned into a simple search engine that returns only results compatible with Dillo and other small browsers.
a day ago
Are there any alternatives? On the browser that I use, with javascript disabled,
Ask.com: Does not work; does nothing at all.
Ecosia: Blocks me indirectly by cloudflare.
Startpage: Blocks me explicitly, saying "Your connection has been suspended"
a day ago
FWIW... I was able to get startpage.com to work with javascript turned off. But it doesn't really look that nice in a text browser.
Maybe they're doing some weird geofencing or dislike your ISP?
2 days ago
Pretty proud of the fact that Kagi not only works without Javascript, but it looks and behaves almost the same. Javascript is used to enhance the UX not create it. It is like 'lite' mode is always on.
2 days ago
This is just outrageous, I'm off to DuckDuckGo then.
2 days ago
Same on Dillo, reloading makes the JS wall go away for now, but probably won't last.
Edit: Reloading doesn't work anymore for me. Unless the sca_esv=xxxxxxxxx param is present it will redirect to the JS wall.
2 days ago
Maybe try changing the user agent? I can use Google on my Kindle's web browser and that can barely handle Javascript (though the Kindle does do limited execution)
2 days ago
I have tried spoofing the user agent. No effect. It seems that if the browser is new enough and JS is turned off, it blocks you. But if you use a really old browser (~2015 Firefox) that doesn't support modern stuff, it still allows non-JS search. I think they must have the server looking at HTTP headers or fingerprinting or something. I don't think they could do the redirect based on CSS or HTML5 support without JS being run.
2 days ago
I believe they're doing a meta-tag redirect (possibly inside a noscript tag?) in at least some cases. Source: I'm developing a web engine that doesn't have JS support.
2 days ago
That's exactly what they are doing:
<!DOCTYPE html>
<html>
<head><title>Google Search</title>...</head>
<body>
<noscript>
<meta content="0;url=/httpservice/retry/enablejs?sei=..." http-equiv="refresh">
<div style="display:block">Please click <a href="/httpservice/retry/enablejs?sei=...">here</a> if you are not redirected within a few seconds.</div>
</noscript>...
2 days ago
You're correct.
<noscript><meta content="0;url=/httpservice/retry/enablejs?sei=a3qIZ42cGcvcp84P5p_mwQI" http-equiv="refresh"><style>table,div,span,p{display:none}</style><div style="display:block">Please click <a href="/httpservice/retry/enablejs?sei=a3qIZ42cGcvcp84P5p_mwQI">here</a> if you are not redirected within a few seconds.</div></noscript></header>
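If you want to check whether a fetched page contains this kind of wall, a rough sketch along these lines could work. The `findNoscriptRefresh` helper is hypothetical and regex-based, so it is only meant for quick checks on markup shaped like the snippets above, not as a general HTML parser.

```javascript
// Rough sketch: find a <meta http-equiv="refresh"> inside <noscript> and
// return its target URL, or null if there is none.
function findNoscriptRefresh(html) {
  const noscript = /<noscript[^>]*>([\s\S]*?)<\/noscript>/gi;
  let block;
  while ((block = noscript.exec(html)) !== null) {
    const metas = block[1].match(/<meta\b[^>]*>/gi) || [];
    for (const meta of metas) {
      // Skip <meta> tags that are not refresh directives.
      if (!/http-equiv\s*=\s*["']?refresh/i.test(meta)) continue;
      // content="<seconds>;url=<target>": capture the target.
      const target = /content\s*=\s*["'][^"']*?url\s*=\s*([^"']+)["']/i.exec(meta);
      if (target) return target[1].trim();
    }
  }
  return null;
}

// Sample markup modeled on the snippet above (the sei value stays elided).
const sample =
  '<noscript><meta content="0;url=/httpservice/retry/enablejs?sei=..." ' +
  'http-equiv="refresh"><div>Please click here</div></noscript>';
console.log(findNoscriptRefresh(sample));
```

A non-null result means the page will bounce a no-JS browser to the returned path; a page with real results and no such tag returns null.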
2 days ago
Here's a skeleton overlay.js for an old style firefox extension to mitigate the meta-redirect part of the blocking.
// Legacy (XUL overlay) Firefox extension: remove <noscript> blocks from each
// loaded page, with the aim of defusing the meta-refresh redirect inside them.
var remover = {
  init: function() {
    var appcontent = document.getElementById("appcontent");
    if (appcontent) {
      appcontent.addEventListener("DOMContentLoaded", function(e) {
        var doc = e.originalTarget;
        if (doc instanceof HTMLDocument) {
          var noscripts = doc.getElementsByTagName('noscript');
          // The collection is live, so walk it backwards while removing;
          // a forward loop would skip every other element.
          for (var i = noscripts.length - 1; i >= 0; i--) {
            //if(noscripts[i].innerHTML.indexOf('meta content="0;url=/httpservice/retry/enablejs?sei=') != -1) {
            noscripts[i].parentNode.removeChild(noscripts[i]);
            //}
          }
        }
      }, true);
    }
  }
};
window.addEventListener("load", function() { remover.init(); }, false);
2 days ago
Have you tested this? I don't think this will help much because you'll just get a page that requires JavaScript to function.
2 days ago
This won't help indeed, the noJS page now is only this redirect and contains no search results.
It’s a blank page with one line of text: “Please click here if you are not redirected within a few seconds”.
2 days ago
That depends. It works on setups where Google still gives you the old HTML results, but not on setups where it sends you to the JS application. So on 3 out of 5 of my desktop setups it works. On the ones where, for some reason, Google only sends me to the JS app, it doesn't work, like you suggest; or it does, but I'm just left looking at a blank page.
2 days ago
Other browsers, just an ad to use one of five other browsers, and incidentally use Javascript. Always use noscript to reduce the attack surface.
For the dyed-in-the-wool: run lynx https://www.google.com, tab and type in test, then tab and enter. Now how can I get lynx to remove the ad?
Startpage search on "Google requires Javascript" replies "Allow JavaScript in your browser - Google AdSense Help" - now isn't that special?
18 hours ago
SERP systems are no longer working because of this. Lynx seems to be a solution here.
2 days ago
It is not just Firefox esr 128. I can reproduce this with other UA strings. It has been intermittent for me so far. Sometimes I get /httpservice/retry/enablejs? and sometimes I get results as usual. I do all searching from the command line. No Javascript.
Some HN commenters suggest DDG or paid search. Funnily enough, DDG is returning a CAPTCHA at the moment.
Fortunately I have many other free www search engine options that are working fine.
In addition there are countless website search engines that all continue to work. No Javascript required.
2 days ago
Disappointing. DuckDuckGo seems to still work w/o JS enabled. But Bing also fails to do anything w/o JS. HN comments still seem to work, thankfully.
2 days ago
DuckDuckGo specifically offers JS-free frontends and Bing still works -- it's just their tracking redirect pages that require JS. Thankfully userscripts exist to deobfuscate the tracking URLs to plain ones on search results (as they're slightly more complex than Google's).
a day ago
How do you get Bing to work? I go there, type in a query, hit return and then nothing. The eyeglass icon seems like it could be a search button, but nothing happens when it's clicked. I should mention I'm trying it on Firefox 128.6.0esr.
2 days ago
Can userscripts run in a page which has JS disabled?
2 days ago
Stop using Google. Google has literally nothing but SEO spam and malware. Use DuckDuckGo or Kagi.
2 days ago
Google is currently the only search engine allowed to crawl Reddit, which sometimes yields good original user content of actual { non-blogspam, non-SEOed-to-hell, non-AI } value.
2 days ago
All search engines include Reddit results by default and you can usually refine by adding some param like site:reddit.com which works the same in Google as in other search engines
e.g. https://duckduckgo.com/?t=ffab&q=filter+coffee+site%3Areddit...
Edit: maybe you are thinking about the AI deal which is exclusive to Google. That's not the same thing as search engine indexing https://www.cbsnews.com/news/google-reddit-60-million-deal-a...
2 days ago
While I couldn’t find a better document: https://www.tomsguide.com/computing/search-engines/google-is... describes how non-Google search engines cannot get new results from Reddit.
2 days ago
Yes, I was wrong. I found this one as well https://arstechnica.com/gadgets/2024/07/non-google-search-en...
2 days ago
> “Edit: maybe you are thinking about the AI deal which is exclusive to Google. That's not the same thing as search engine indexing.”
@mastazi that’s what I’m talking about, and I think your AI vs. indexing nuance is incorrect. I wasn’t sure, so I just did a quick N=1 verification: searching for the title of a random, popular, one-week-old Reddit post with a precise, unique title,
- Insta-found it on Goog as top result
- Didn't find it on DDG, with or without site:reddit.com
Looks like sibling comment from @cpressland (thanks!) is correct: as of today and until other search engines sign licensing agreements with Reddit, “non-Google search engines cannot get new results from Reddit”. See https://www.reddit.com/robots.txt , which links to https://support.reddithelp.com/hc/en-us/articles/26410290525... , section “Reddit may license public content for commercial or non-commercial use”
2 days ago
you are right, and I was wrong;
I found this article by Ars Technica that highlights the consequences of the deal for search engines:
https://arstechnica.com/gadgets/2024/07/non-google-search-en...
so Reddit, the website co-created by Aaron Swartz, is now a walled garden.
Wow.
2 days ago
I thought that DDG and Kagi get a large amount of their results from Google? Do these have reddit results stripped out?
> a random 1week-old popular Reddit post with a precise unique title
While a week is a long time, is it possible that the other search engines just hadn't got around to indexing it yet?
2 days ago
DDG gets a large amount of their results from Bing: https://duckduckgo.com/duckduckgo-help-pages/results/sources...
Kagi I don’t know much about, but quickly reading about it, seems to indeed pay Goog.
> “While a week is a long time, is it possible that the other search engines just hadn't got around to indexing it yet?”
I doubt it, search engines these days are much faster, and this is double-confirmed by 1. articles linked in sibling comments, 2. reading https://www.reddit.com/robots.txt
2 days ago
2 days ago
ChatGPT search isn't allowed to index reddit
2 days ago
> Google is currently the only search engine allowed to crawl Reddit
Stop using Reddit. Reddit is already following the path toward nothing but SEO spam and malware. Use Hacker News or Fediverse.
2 days ago
This partnership has created a problem, though:
My Google is set to German. Apparently Reddit has autotranslated all their content into German.
If I do a Google search for "ssd zfs pool" the 4th result is "SSD-Pool eine schlechte Idee? : r/Proxmox" and this links to `https://www.reddit.com/r/Proxmox/comments/12a5abh/ssd_pool_a...`.
The Reddit page is in German, but as you may have noticed, the URL has `?tl=de` appended, while it contains `ssd_pool_a_bad_idea` in the path. If I remove the `?tl=de`, I get the original version, in English.
This means that what Google crawled, what it has in its index, was already in German. So Reddit translated the original page into German, then made it accessible for Google to index it.
For me this causes the problem that I am now getting a lot of AI-translated Reddit content, even though I'd really like to have the English version to begin with, because I assume that it won't contain translation errors.
I mean, the translation is very good, you probably wouldn't notice that it is one, but still...
https://www.reddit.com/r/Proxmox/comments/12a5abh/ssd_pool_a...
https://www.reddit.com/r/Proxmox/comments/12a5abh/ssd_pool_a...
a day ago
Add this to your uBlock Origin filters:
reddit.com$removeparam=tl
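For contexts where a uBlock Origin filter isn't available (a script or a bookmarklet, say), roughly the same cleanup can be sketched in plain JavaScript. The `stripRedditTranslation` helper and the example URL are made up for illustration; the real URL in the parent comment is truncated, and reading `tl` as a translation-language parameter is an assumption based on the `?tl=de` behavior described there.

```javascript
// Hypothetical helper: drop the `tl` query parameter from reddit.com URLs
// and leave every other URL (and every other parameter) untouched.
function stripRedditTranslation(href) {
  const url = new URL(href);
  const isReddit =
    url.hostname === "reddit.com" || url.hostname.endsWith(".reddit.com");
  if (isReddit) url.searchParams.delete("tl");
  return url.toString();
}

// Made-up example URL, not the truncated one from the comment above.
console.log(
  stripRedditTranslation("https://www.reddit.com/r/example/comments/abc123/some_post/?tl=de")
);
```

Unlike the filter, this only rewrites a URL string you hand it; it does not intercept navigation on its own.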
2 days ago
2 days ago
Funny, that is indeed true, didn't know that.