The Age of PageRank Is Over (2022)

69 points, posted 12 hours ago
by dotcoma

59 Comments

Fordec

10 hours ago

Can the script on this then be flipped? Build a search engine, clearly smaller in scope and commercial utility, that heavily de-ranks any site linking to a payment or ad network. The end result should, in theory, be filled with what one would consider the "old" internet: primarily blogs and sites not trying to sell you things or abuse your data.

None of the large companies would do it, but that would be the point.

JumpCrisscross

10 hours ago

Introducing: Kagi.

sph

9 hours ago

That's not what Kagi does. Nor does its "Small Web" search mode, as it only searches blogs that have been manually added to a specific GitHub repo (so for the most part it is a collection of US tech blogs, not very diverse at all).

No VC-backed or commercial search engine would do what OP is talking about. But I can see a use for a niche search engine that ranks websites inversely proportional to the number of trackers and ad networks they depend on. Heck, I would pay for that, but I'm a nerd.

Maybe marginalia.nu would like this idea.
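
A minimal sketch of the "rank inversely to trackers" idea described above, not how any existing engine does it. The blocklist, the function names, and the `relevance / (1 + trackers)` formula are all assumptions chosen for illustration; a real system would use a maintained filter list.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Illustrative blocklist only (assumption); a real engine would load a maintained list.
AD_TRACKER_DOMAINS = {"doubleclick.net", "googlesyndication.com", "google-analytics.com"}


class ResourceCollector(HTMLParser):
    """Collects the hostname of every src/href attribute found in a page."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                host = urlparse(value).hostname
                if host:  # relative URLs have no hostname and are skipped
                    self.hosts.append(host.lower())


def count_trackers(html):
    """Number of distinct blocklisted domains the page references."""
    collector = ResourceCollector()
    collector.feed(html)
    hits = set()
    for host in collector.hosts:
        for domain in AD_TRACKER_DOMAINS:
            if host == domain or host.endswith("." + domain):
                hits.add(domain)
    return len(hits)


def adjusted_score(relevance, html):
    """Hypothetical penalty: divide relevance by (1 + tracker count)."""
    return relevance / (1 + count_trackers(html))
```

With this formula a page with no trackers keeps its score, and a page referencing two blocklisted domains drops to a third of it.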

JumpCrisscross

9 hours ago

> not what Kagi does

Kagi maintains a "non-commercial index (Teclis) and non-commercial news index (TinyGem)" [1]. They also "prioritize non-commercial sources," implicitly downranking monetized sites. (Also, you can manually downrank problem domains [2].)

My devices have randomly switched back to Google or DDG from time to time. The first thing I did was check my ad blocker was working--I was simply stunned by the amount of blogspam puke.

> No VC-backed or commercial search engine would do what OP is talking about

No ad-based (i.e. free) engine can, at least not sustainably. A paid one obviously can and does.

[1] https://help.kagi.com/kagi/search-details/search-quality.htm...

[2] https://help.kagi.com/kagi/features/website-info-personalize...

freediver

5 hours ago

> That's not what Kagi does.

We do, actually. We penalize sites with lots of ads/trackers on them in our results and boost non-monetized pages. It is one of the main reasons Kagi results have a specific 'flavor' to them. (Kagi CEO here)

baxtr

9 hours ago

What’s the reason a search engine would ask for a login? (Like Kagi does)

TeMPOraL

9 hours ago

Because it's a paid service? That's the entire point.

And that also enables tons of user-centric features they talked about, starting with the earliest ones and my favorites: being able to uprank and downrank domains (like, "ban pinterest", "pin Wikipedia", "downrank w3schools", "uprank cppreference", etc.), and adding rewrite rules to results (like "reddit.com" -> "old.reddit.com"). Both of these are personalizations tied to your account, so they're active on any device as long as you're logged in.

They've added more cool stuff since, but these two alone were what has kept me a paying customer for the past years.
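
For anyone curious what per-account uprank/downrank and rewrite rules could look like mechanically, here is a rough sketch. It is not Kagi's implementation; the rule tables, weights, and the `personalize` function are hypothetical and only mirror the examples named above.

```python
from urllib.parse import urlparse, urlunparse

# Hypothetical per-account preferences (assumptions, mirroring the examples above).
DOMAIN_WEIGHTS = {"pinterest.com": 0.0, "w3schools.com": 0.5, "cppreference.com": 2.0}
PINNED = {"wikipedia.org"}
REWRITES = {"reddit.com": "old.reddit.com"}


def _match(host, rules):
    """Return the rule domain that host equals or is a subdomain of, else None."""
    for domain in rules:
        if host == domain or host.endswith("." + domain):
            return domain
    return None


def personalize(results):
    """results: list of (url, score). Returns URLs re-ranked and rewritten."""
    ranked = []
    for url, score in results:
        parts = urlparse(url)
        host = (parts.hostname or "").lower()
        key = _match(host, DOMAIN_WEIGHTS)
        weight = DOMAIN_WEIGHTS[key] if key else 1.0
        if weight == 0.0:
            continue  # a weight of zero means the domain is banned
        pinned = _match(host, PINNED) is not None
        bare_host = host[4:] if host.startswith("www.") else host
        target = REWRITES.get(bare_host)
        if target:
            parts = parts._replace(netloc=target)  # swap the host, keep the path
        ranked.append((pinned, score * weight, urlunparse(parts)))
    # Pinned domains first, then by weighted score, highest first.
    ranked.sort(key=lambda r: (r[0], r[1]), reverse=True)
    return [url for _, _, url in ranked]
```

Because the rules live in plain tables keyed by domain, storing them against an account is enough to make them follow the user across devices.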

baxtr

5 hours ago

OK, the paid part of course makes sense.

However, I still find it a bit creepy that they know all about my searches and even my up- and downvotes.

Google at least can be used in incognito mode.

evoke4908

an hour ago

Google knows what you search, even in "incognito" mode, and even when you're logged out. They correlate your IP address with your search profile and use everything you search for in your ad profile.

Kagi does not track search history, period. There's no history attached to your account even if you wanted it. The login is purely for authentication.

Ferret7446

8 hours ago

Those sites don't exist any more. Not literally of course, but effectively. There are probably only a couple tens of thousands, if I had to guess (and many of them are abandoned blogs with a couple dozen pages of no note, hosted by some server that has not yet been unplugged because it's physically lost). Also, good luck trying to find them (either electronically or physically).

(And I say that as someone who owns such a site.)

Case in point, I wanted to link to http://bash.org/?5273, but bash.org no longer exists.

farseer

7 hours ago

You can use Wikipedia's built in search for that I guess.

trod123

8 hours ago

There are some personal subjects that no one will be comfortable searching for while it's tied to a billing address (i.e. any login).

Abortion in some states, for example. It doesn't matter what any company says right now, the position is clear that the tune can and often does change the moment it's inconvenient for them.

Paying for the privilege of being targeted is crazy talk.

JumpCrisscross

8 hours ago

> while its tied to a billing address (i.e. any login)

Total false equivalence. Kagi accepts Bitcoin [1]. Logins are not identifiable in the way billing addresses are.

> doesn't matter what any company says right now, the position is clear that the tune can and often does change the moment its inconvenient for them

You think the free engines don't know who you are?

[1] https://blog.kagi.com/accepting-paypal-bitcoin

trod123

7 hours ago

Bitcoin is traceable. Not a false equivalence at all. Search history is kept tied to a profile/semaphore/data lake and then also tied financially to the way you pay, as well as the devices you use.

When data is collected, it will be trawled through for anything and everything. Absence of data is just as unique. The only way is to blend into the masses generically.

Free engines have a very hard time differentiating me from the mass of other web traffic.

I compile my own browser to make it that way though, even my phone doesn't register an overnight location, and runs GrapheneOS.

Importantly, this should be a fundamental right of every person, not just those that have the money and expertise to enforce it. The data collection shouldn't be allowed; it's the equivalent of quartering a digital soldier in every home (something forbidden by the constitution).

Any business funded and dependent on arbitrary preferential loans (from the printer, regardless of how indirect) is state-run and nationalized industry, and should be bound by the same constitutional requirements of government.

I take my privacy seriously because of my job, and the indisputable fact that sensitive roles get targeted.

JumpCrisscross

3 hours ago

> this should be a fundamental right of every person not just those that have the money and expertise to enforce it

Sure. But it isn’t, and as long as the choice is between free and ad-supported, or paid with privacy contractually (but not technically) guaranteed, there is an obvious answer as to where the smart money goes. And that’s before we get to search quality.

A unicorn third option would be nice. But if even people who can pay for this aren't willing to, it condemns the idea of a public option in the cradle--it screams people don't value privacy in search enough to a quantifiable degree.

amenhotep

5 hours ago

People post video evidence of them literally committing felonies to their own personal public social media accounts. I know "no-one" is hyperbole but risk perception and tolerance is much more variable than it would need to be for this to be persuasive.

ben30

9 hours ago

When colleagues ask questions in Slack, I sometimes paste Kagi's search summary. Quick and usually spot-on.

Funny thing is, I've told the team about Kagi, but not everyone's willing to pay/see the benefits yet. Meanwhile, I'm wondering why they're asking if a good search engine could answer it so easily.

geenat

10 hours ago

I'm not convinced Google has stopped using backlinks and other classic pagerank attributes for search ranking.

28304283409234

10 hours ago

Nobody said they stopped. The article argues that the practise has become infinitely less effective, and Google is optimizing for their customers as opposed to their users.

captainmuon

8 hours ago

For many queries, it seems the top result is just Stackoverflow, Wikipedia or something similar, then a couple of organic links, and then just junk. And on the second or third page, the results just stop. I have a feeling that at this point, it is a mixture of white- and blacklisting, and single-page classifiers based on heuristics or "AI". PageRank is surely in there, but probably not dominant. (Sidenote: I wonder if they secretly do ignore rel="nofollow" because so many links use it.)

Funnily, I've used PageRank for some smaller search projects, and it still works very well if you use it in a vertical, say "educational resources" or "programming" or similar. I just did a broad crawl, 3-4 steps away from some known "good" seed URLs, calculated PageRank and mixed it in with Solr's classifiers.
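
The crawl-then-PageRank approach described above can be sketched roughly as follows. This is plain power iteration over a link graph, with a hypothetical `blend` helper standing in for the mixing step; the damping factor, iteration count, and blend weight are assumptions, and nothing here uses Solr's actual API.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each crawled page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outlinks is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def blend(text_score, page_rank, weight=0.3):
    """Hypothetical mix of a text-relevance score with PageRank."""
    return (1 - weight) * text_score + weight * page_rank


# Tiny example graph, a few steps out from a "good" seed page.
graph = {"seed": ["a", "b"], "a": ["b"], "b": ["seed"], "c": ["b"]}
ranks = pagerank(graph)
```

In this toy graph the page that collects links from the seed and its neighbours ends up with the highest rank, while the page nobody links to keeps only the baseline share.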

stanislavb

10 hours ago

I think you are right. They might be using a plethora of other factors in addition to backlinks; however, genuine links from authoritative sources still seem to be the best "voting" mechanism to bubble up good content/results.

I can imagine that Reddit could join the search war - first, they have so much user gen content that people are deliberately looking for; second, they could use the voting mechanisms that are already in place to give preference to the "better" content. Of course, I might be wrong :).

cryptoz

10 hours ago

The part where you might be wrong is the part where Reddit does it. IIRC they have been “working really hard on a better search” for Reddit for…15 years? Every couple years spez will pop up and claim they almost have it!

I wouldn’t hold my breath.

cryptos

7 hours ago

I like the idea of paying for quality content, or search results. If you don't pay for it, someone else will, and the "sponsor" will probably have different objectives than you. So you won't get the best information, and you would likely waste money because of that (e.g. buying an advertised low-quality product).

ricardo81

10 hours ago

I used to get sent a monthly cheque for $1K around the turn of the millennium simply to have a link on my home page, such was the power of Pagerank back then. This was handy as it allowed me to invest time into learning web development, plus the site was educational and paid for its existence.

Producing content was harder back then and the web was a lot smaller. Google most definitely still use links to rank but are much more likely to discount/devalue links than they did historically.

l5870uoo9y

9 hours ago

Buying backlinks is still around and costly.

ricardo81

8 hours ago

No doubt. Search still struggles to measure intent from the content creator PoV.

28304283409234

10 hours ago

I will do you one better though, given my conversation with teenagers this weekend: Pagerank is dead, because webpages are dead.

The only moment these kids use a search engine is when they do homework. In any and every other moment they just search "locally" in TikTok, or insta.

It scares the shit out of me.

(Edit: paying kagi customer here! Keep it up, kagi! I still love and need you!)

nicbou

10 hours ago

I run a website and that doesn’t seem to be the case. Some things don’t fit into a TikTok video. Some things can’t be answered by ChatGPT. People still search for things and find me.

EZ-E

9 hours ago

I've come to realize the same thing. I found out that looking for restaurants on Google Maps is the "boomer" thing to do, then I realized younger people use TikTok for this now. And here I thought TikTok was just for memes and funny videos.

fxtentacle

9 hours ago

In China, there's WeChat, which is basically the everything app, from chat, to food delivery, to ecommerce, payment, and even navigation. Normally, that would violate App Store guidelines, but they are dominant enough in the Chinese market that Apple caved in and granted them special permissions.

throwccp

8 hours ago

If they didn't, Apple would have been banned. That's the power of the CCP.

trod123

8 hours ago

Don't forget indoctrination and demoralization for thought reform as well. Beams marxist reeducation right into every teen's life in subtle ways.

There's good reason it's being considered a national security threat when all user data is being shipped through China, and their businesses are a government partnership.

Political Warfare and Subversion are real and difficult issues to deal with given our 'open' society.

trod123

8 hours ago

> It scares the shit out of me.

It should, because any time austerity or other adverse circumstances arise where life is on the line, these are the first people who end up dying, and worse, the responsibility for the travesty lies entirely with the previous generation in terms of political power. These are summer children, and winter is coming.

People aren't naturally stupid. It's a long process of drugs and torture that makes them like that. Only the drugs are fluoride and dopamine, and the torture is arbitrary struggle sessions built into every process to eliminate rational reason and thought where they have to interact with it in society during critical identity formation periods (permanently damaging them).

No doubt there will likely be a great dying in the near future. You can only kick the can so many different times in cycles before those cycles all line up at the same time. Mother nature is a bitch, and the unprepared die when safety nets fail.

The first most important and crucial tool for survival is having your brain and knowing how to reason following rational first principles and thought.

Dalewyn

10 hours ago

>because webpages are dead.

I think it's more accurate to say that the internet and perhaps personal computing in general is dead.

Practically everything about what makes a computer tick has been abstracted away, because rightfully or otherwise people today just don't care. When was the last time you saw someone actually using the address bar in a web browser instead of Googling? Or indeed TikTok'ing or Insta'ing. Nobody knows what a file or folder is either.

roelschroeven

7 hours ago

> When was the last time you saw someone actually using the address bar in a web browser instead of Googling?

I used to laugh when people Googled a website's name instead of entering it manually. But these days, I often find myself either Googling for the name (or DuckDuckGoing usually, but as a verb that just doesn't have the same ring to it), or relying on autocomplete from my bookmarks or history.

I feel website names have become less predictable mainly because of the explosion of possible top level domains: even if I know the exact name of a website, I can't reliably remember which TLD to use. Plus I'm more and more worried of accidentally using the wrong URL (through a misspelling or a wrong TLD or whatever) for fear of ending up on some scam site instead of the real thing.

Dalewyn

6 hours ago

>Plus I'm more and more worried of accidentally using the wrong URL (through a misspelling or a wrong TLD or whatever) for fear of ending up on some scam site instead of the real thing.

I also Google for websites I really should know by heart more often than I want to admit.

Why? Because I've accidentally typed googl.com or google.colm and then immediately slammed ALT+F4 enough bloody times that I actually no longer trust myself to type straight.

At least if I typo "googl" into Google I'll either get an autocorrected result or utter gibberish instead of a drive-by trojan to my face.

JumpCrisscross

10 hours ago

> everything about what makes a computer tick has been abstracted away

It’s always been abstractions. Even when we didn’t know it. (We mastered the technological use of transistors before we understood why they work.)

openrisk

9 hours ago

> people today just don't care.

When did people 'care'? Complex technologies have always been pushed on people. Regulation, consumer protection etc. is supposed to be the informed, delegated 'caring' filter.

The interesting long-term dynamic is that the problem is self-correcting. If you use amazing technologies to breed masses of addicted, exploited idiots instead of informed empowered citizens, eventually your walled garden will collapse onto itself.

Dalewyn

9 hours ago

>When did people 'care'?

Back when using a computer required having some idea of what was going on. Even if all you really cared about was playing Doom, you still needed to make sense of all those levers with nonsensical labels on them and boy did we figure them out.

Mind you, I don't see where we are today as necessarily a bad thing. It's a very good thing that most people can just use a computer as just another tool like a screwdriver or a car.

But on the other hand, we've also lost the joys and miseries of getting our hands dirty.

openrisk

8 hours ago

Yes, I also think the phenomenal adoption of computing is fundamentally a good thing as it unlocks new levels for society, at least in principle.

In the short term it triggered a race to a low-information, consumerist bottom in many respects (privacy invasion, addictive dark patterns, locked-down platforms, general enshittification etc.).

But this state of affairs, while unfortunate and a lamentably gross waste, does not feel terminal. It is very unfulfilling beyond superficial sugar rushes and essentially hostile to the user-product being exploited. Especially so for the talented people that itch to get their hands dirty.

Informed and able minorities are what moves the needle, not dazed and confused masses. The history of (online) computing does not end here. It is still very early days.

wiseowise

10 hours ago

> When was the last time you saw someone actually using the address bar in a web browser instead of Googling?

I see the guy in the mirror every once in a while, why?

audiodude

9 hours ago

To be fair, I usually type the first few characters of the website I want to go to, and then pick the thing I want from the autocomplete results.

trod123

8 hours ago

I think it's a fair point to say that most of humanity will probably be dead within 20-30 years. You have far more destructive people in positions that only make survival less likely, compared to the intelligent ones.

The people who made things work will die of old age or withdraw their support (on strike), and LLMs will prevent new workers from developing the same expertise (since entry-level positions will be removed).

Systems that have stood strong as oaks for centuries will suddenly fail, and with that collapse so too goes the food production.

Non-market socialist systems don't work, but you get those same systems during currency collapse (where Ponzi outflows exceed inflows, or debt growth exceeds GDP).

No one knows a thing because socialism has done its dirty work, having captured academia over the past 50+ years, and indoctrinated the masses.

The benefits were front-loaded (as all Ponzis are), and it happened slowly enough that no one noticed over multiple generations, and the generation that got the most benefit won't cede political power (they took power in the 1990s, and remain the majority today). They'll give it up only once it's pried from their dead hands, which will come from natural aging.

Menticide from the totalitarian state has a deleterious effect, making it harder for people to see the problems and take any action. Joost Meerloo wrote extensively about this with regard to the Nazis and Mao.

What we are seeing today is hubris and a natural consequence of ignoring lessons learned.

hyperG

5 hours ago

"I think its a fair point to say that most of humanity will probably be dead within 20-30 years. "

Why do I even read this stupid fucking website

globular-toast

9 hours ago

The majority of people never did personal computing. Remember the Eternal September where internet users were saddened by hordes of normies messing up the net? I feel like there was probably a spike around the early 2000s in number of people doing personal computing. Since then it's gone back to normal. Most people aren't interested and have no need for it. Tiktok etc is just the new TV: something to mindlessly rot away in front of. We also need to remember that even those people who did use computers never used them how we use them. They used Windows and Microsoft stuff. Hardly in control of their own computing. Part of me is saddened by it, but then I wonder if that isn't the case for everything: cyclists are sad about all the cars, cooks are sad about all the ready meals etc.

imiric

8 hours ago

> We also need to remember that even those people who did use computers never used them how we use them. They used Windows and Microsoft stuff. Hardly in control of their own computing.

That's just elitism.

MS and Windows gave access to computers to millions of people who otherwise wouldn't have been interested. MS products allowed them to be productive, and enabled thousands of businesses to function. MS was instrumental in the explosion of the internet and WWW in the 90s. No niche hacker-oriented or consumer OS had as large of an impact as Windows.

We can argue whether MS has lost its way since then, but claiming that people weren't in control of their computing in the 90s because they used MS products is silly. The aggressive tracking, SaaS business models and everything else MS is criticized for today came much later.

the_third_wave

9 hours ago

> paying kagi customer here! Keep it up, kagi! I still love and need you!

What is it with the Kagi sycophancy on this site? Right now there's yet another discussion on the front page full of glowing Kagi adoration interspersed with some reality checks about the actual quality of the search results. I understand that Kagi is a Y-Combinator company but does it have to be laid on so thickly?

As to what I use for search: a self-hosted SearxNG (a meta-search engine, i.e. it proxies search results from other search engines) instance (started with Searx but followed when development moved there) combined with Recoll for local search, recoll-webui and the recoll 'engine' to integrate results into SearxNG. I also experimented with YaCy (a fully self-hosted search engine with its own web crawler) but have not gotten usable results yet; the system seems to get bogged down once the index grows beyond a certain size.

fxtentacle

9 hours ago

Just try it, I guess. I was also sceptical and ran my own search engine and at one time I was even pitching people a paid indexing service that would allow people to self-host their search engine and by pooling money they gain world-class crawling.

But in the end, I noticed that for me, improving search results is mostly about suppressing garbage. Kagi lets me filter out Pinterest and some of the worst SEO spam farms. And with them gone, the results already feel much better.

I'd guess Kagi is popular here because they sell what people crave.

input_sh

8 hours ago

It really reminds me of Roam Research, which was also an overpriced, niche product that was unavoidable for quite some time in these circles. Now if you google it (or kagi it?) you just see a bunch of reddit posts asking if Roam is dead.

You'd think Kagi has millions of paying users, and not ~33k, seemingly half of which are on HN.

Hopefully this time around they don't actually start referring to their community as a cult like Roam did.

28304283409234

7 hours ago

Didn't even know they were a ycombinator company and now that I do I like them a lot less.

I am not a sycophant. Just a happy customer. And happy to have the chance to be a customer, not just a data point. I love kagi trying to serve me, with actual usefulness, instead of serving some dark marketplace of advertisers that have no interest in my wellbeing.

paradox460

8 hours ago

Kagi is not a yc company. They're privately funded and their last round was for less than a million dollars and was from private investors

immibis

8 hours ago

Maybe it's a good product or maybe they paid dang off. Two possibilities. Which one is more likely?

syndicatedjelly

11 hours ago

Amazingly prescient. The original version of that article was written in 2019