Internet Archive: Security breach alert

1091 points posted 2 months ago
by ewenjo

156 Comments

Springtime

2 months ago

Just in terms of privacy, it's worth noting that anyone who has uploaded something on IA already has their email address publicly viewable.

This isn't commonly known (even judging by comments here), but the publicly viewable metadata of every upload contains the uploader's IA account email address. So from a security perspective the breach is bad, but from a privacy perspective a lot of users probably weren't aware that this detail was already exposed if they've uploaded anything.

hunter2_

2 months ago

This raises an interesting question: should email addresses be private? Addresses of buildings aren't private, and they're somewhat analogous as with many computing concepts. (Aside: Before spam filters were quite good, it was typical to avoid scraping of addresses by mild obfuscation, but I think those days are gone, and this is distinct from privacy anyway.)

If someone wants to upload and never be found out, then they need to use a throwaway address in any case, lest they be providing their "private" address to the administrators of the service without explicitly forbidding further disclosure. If I say something to Alice without demanding that Alice keep it from Bob, then I implicitly don't mind if Alice tells Bob what I said.

keybpo

2 months ago

It's not just uploads but any item that uses the email address as a unique user identifier (I'm not technical enough to explain this more clearly but [1]).

An email address will be part of the XML in a user's uploads but also in their profile, which anyone can access by simply changing the URL from https://archive.org/details/@foobar to https://archive.org/download/foobar. So, in essence, one just needs to have a registered account, independently of any uploads made.

[1] https://help.archive.org/help/accounts-a-basic-guide-2/

steffanA

2 months ago

This is bad enough. This alone is a privacy bug/data leak.

Theoretically, someone could scrape the pages and compile a list of exposed email addresses.

rrwo

2 months ago

One solution is to use a unique email address for every website, and change the address if the site gets compromised (with the old address getting added to a spam filter).

999900000999

2 months ago

I pulled an old friend's website down from the Internet Archive.

He's moved on to the next stage, but I was glad I was able to put his site back up.

It'll be a shame if IA goes down permanently, but we need a decentralized solution anyway.

Having a single mega organization in charge of our collective heritage isn't a good idea.

gabeio

2 months ago

I have always thought about this. It would be interesting to have users actually store small amounts of redundant info on a device connected to the internet. Very similar to what a torrent does but with more peers (more data shards than full copies) and fewer seeds. And try and keep a huge database for everyone. Obviously open source and it would end up something like tor where they just assist the network with security patches but they don’t actually have any real “control” (admin dashboard control) over the network at large. We already do something smaller but like that with website static file caching, but at much smaller scale. Obviously security implications of this would be very hard but maybe not impossible to overcome. ipfs comes close but it again does more seeds than peers.

if anyone knows something like what I'm suggesting, I'd love to hear about it!

max-throat

2 months ago

This is why BitTorrent and other P2P solutions were invented, but alas: A. The RIAA, MPAA, and ESA have given these technologies a terrible reputation. B. Nobody likes to seed. Some kind of seeding-based crypto would have been a great incentive if cryptocurrency wasn't also demonized by now.

aucisson_masque

2 months ago

It's called torrent protocol and it doesn't work, no one wants to spend money and bandwidth hosting a god forsaken movie or book that only a handful of people care about.

EamonnMR

2 months ago

I keep wanting to do this for old sites, make like a personal mini IA. Besides just using wget or curl, any tips for pulling down useable complete websites from IA?

account42

2 months ago

Agreed, especially an organization that has already shown itself to not always be impartial.

Simran-B

2 months ago

A decentralized solution, doesn't that scream internet archive on blockchain? What could go wrong.

steffanA

2 months ago

More details here about the data breach. Stolen database contains 31 million records.

https://www.bleepingcomputer.com/news/security/internet-arch...

ano-ther

2 months ago

> the Have I Been Pwned data breach notification service created by Troy Hunt, with whom threat actors commonly share stolen data to be added to the service

Do they? Why?

mkl

2 months ago

> The data will soon be added to HIBP

My unique-to-archive.org email address is not there yet.

maltris

2 months ago

My question is: How did Scott Helme end up with a password hash that features his own name?

Funes-

2 months ago

Friendly reminder to generate a unique password for every account you create so database leaks like this one don't bother you (other than on the site where the password is used).
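A minimal sketch of what "generate a unique password" can look like, assuming Python's standard `secrets` module. The function name `generate_password`, the alphabet, and the default length are illustrative choices, not something from this thread; in practice a password manager does this for you.

```python
import secrets
import string

# Characters to draw from; the exact alphabet is an arbitrary choice here.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 24) -> str:
    """Return a random password drawn with a cryptographically secure RNG."""
    if length < 1:
        raise ValueError("length must be positive")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    # One independent password per site, so a leak on one site
    # reveals nothing about the password used on another.
    for site in ("archive.org", "example.com"):
        print(site, generate_password())
```

Because each password is drawn independently, a cracked hash from one site is useless for logging in anywhere else.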

ewenjo

2 months ago

Just noticed the site now alerts this:

> Have you ever felt like the Internet Archive runs on sticks and is constantly on the verge of suffering a catastrophic security breach? It just happened. See 31 million of you on HIBP!

mewpmewp2

2 months ago

Joke's on them... I'm already on HIBP countless times...

mendym

2 months ago

I assume that if this is a bad actor, then account email/name will be leaked?

uticus

2 months ago

Is it a genuine alert, or hacking artifact?

Sometimes with friendly / attempt-at-humorous error messages it’s difficult to tell

EKSolutions

2 months ago

It looks like someone has compromised one of their subdomains for Polyfill

Update: Subdomain seems to be returning normal responses again now.

Aachen

2 months ago

You mean the IA included some JS polyfill from a subdomain and that's what's compromised / where the alert is coming from?

jrochkind1

2 months ago

That would perhaps explain how they managed to inject the JS alert popup, right?

EasyMark

2 months ago

One of those instances when you really wish curses worked on whoever was pulling this stunt “may you and your descendants suffer the bites of 10000 fleas for 10000 nights as punishment for your misdeeds”

PenguinRevolver

2 months ago

Probably not the best time to say this, but it's surprisingly easy to go through a collection with items and grab every email along with the usernames.

https://archive.org/metadata/naturally_a_girl/metadata

One way or another, there was going to be someone who would take loads of emails with a username attached to it. A bit intrigued by how the hacker compromised the database and got the passwords.

fewgrehrehre

2 months ago

Damn, I had no idea about this. Definitely would've changed some things had I known that emails were public.

This honestly seems like a bit of a design flaw.

Nathans220

2 months ago

Why go for the Internet Archive? Go for something else, not the fucking archive!

mewpmewp2

2 months ago

We all need our easily accessible decentralized archive of some sort...

pityJuke

2 months ago

This thread is looking like it'll be one of the first places this incident will be documented (seems to be on the top of Google).

Already there are two new users just for this.

mendym

2 months ago

i see more than 2

ewenjo

2 months ago

Yeah, I was looking around, but saw no mention of it anywhere until I realized it just happened.

quart

2 months ago

[flagged]

iamtedd

2 months ago

I have had an IA account for a number of years, with a gmail address. Nine months ago, I changed the email address to a masked address using my own domain. Now I find that my gmail address was still stored, and was involved in the breach. Why? I get that they might store change history, but why?

BTW, for the current account details, I changed the password to another random string generated by my password manager, and also deleted the masked email address and generated another one, so going forward this sort of thing isn't that much of an issue for me.

keybpo

2 months ago

I have a similar situation, where I signed up with my main account and later changed IA's email to a more private address. It was the first email I checked on HaveIBeenPwned and it doesn't show up in this leak. The other couple IA accounts I have, whose emails and passwords are exclusive to them, they all show in this leak alright. I have no explanation for your situation but this was also my immediate thought and I also wanted to give the opposite perspective.

account42

2 months ago

It's also possible that the breach was earlier or going on for longer than reported.

marviel

2 months ago

https://www.reddit.com/r/DataHoarder/comments/h02jl4/lets_sa...

I found this reddit thread from /r/DataHoarder about backing up the internet archive particularly interesting, given the circumstances

nikisweeting

2 months ago

It's been tried several times, but it's hard because it's such a massive quantity of data. The IPFS backup never really got off the ground.

They have their own backups which I think is good enough for now unless someone plans on donating a few hundred million.

creer

2 months ago

Backup / duplication is not an easy project for sure. But meanwhile, for now IA is a single organization operating under one legal system. And with one technical setup, which is relevant today. That's a major weakness.

EamonnMR

2 months ago

Suppose we each backed up sites we cared about rather than trying to mirror the whole thing...

Aachen

2 months ago

A few minutes ago (22:48 UTC), I got three emails from HIBP about accounts of mine breached on the Internet Archive. Troy is quick! And I'm surprised the author of that alert() actually had the data as well as followed through

Bit of a shame the emails contain an ad for a password manager, saying there's two easy steps to become more secure: Step 1: use our password manager (fair enough), "Step 2: Enable 2 factor authentication and store the codes inside your [password manager]" ehh now it's back to 1 factor or am I missing something?

Edit: according to https://www.bleepingcomputer.com/news/security/internet-arch... (via https://news.ycombinator.com/item?id=41793669), Troy Hunt / HIBP already received and verified this "three days ago" as of yesterday 6pm AoE

almyk

2 months ago

I think it is safer to have 2FA in your password manager than not using 2FA at all. Because even if they got your password, if they don't have access to your password manager they can't login.

If you protect your password manager with a yubikey or any other hardware key, then your 2FA inside your password manager is quite secure and convenient. But this is very individual, what your threat model is and how secure you want/need to be.

nixosbestos

2 months ago

I was going to disagree with you (and I sort of do about password managers and storing 2FA in them, but I also unlock my password manager with a yubikey).

But doesn't a DB compromise mean that the attacker would have the TOTP seed as well? It can only increase your account security elsewhere, but not re-using passwords also prevents the IA leak from hurting you elsewhere?
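To make the point about the seed concrete: a TOTP code is derived from nothing but a shared secret and the current time, so whoever holds the seed can generate valid codes. Below is a minimal sketch of RFC 6238 TOTP (HMAC-SHA1 variant, 30-second step, 6 digits) using only Python's standard library. The function name `totp` is illustrative, and whether IA's database held any TOTP seeds at all is not established in this thread.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1 variant) for a given time."""
    # The moving factor is the number of whole time steps since the epoch.
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks an offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    import time

    seed = b"12345678901234567890"  # test secret from the RFC, not a real seed
    print(totp(seed, int(time.time())))
```

Nothing in the computation depends on the user's device, which is why a leaked seed is equivalent to a leaked second factor for that one account.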

EasyMark

2 months ago

They use bcrypt and I always use a really long password so I’m not gonna freak out over this one for once.

bjourne

2 months ago

Are bcrypt password hashes difficult to crack? I signed up for IA over 10 years ago with a much weaker password than those I use today.
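A rough way to reason about this question: bcrypt is deliberately slow (its cost parameter is the base-2 logarithm of the number of key-expansion rounds), so what matters is how many candidates an attacker must try. The sketch below is plain arithmetic under loudly stated assumptions: `RATE` is a hypothetical guess rate, not a measurement, and the cost factor IA used is not given in this thread.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of candidate passwords of exactly this length."""
    return alphabet_size ** length


def years_to_exhaust(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case time to try every candidate, in years."""
    seconds = keyspace(alphabet_size, length) / guesses_per_second
    return seconds / (365 * 24 * 3600)


if __name__ == "__main__":
    RATE = 10_000.0  # hypothetical guesses per second against a slow hash
    for label, alphabet, length in [
        ("8 lowercase letters", 26, 8),
        ("16 mixed printable characters", 94, 16),
    ]:
        print(f"{label}: {years_to_exhaust(alphabet, length, RATE):.3g} years")
```

This only bounds exhaustive search. A weak password that appears in a wordlist can fall far sooner, since an attacker tries likely candidates first.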

tkgally

2 months ago

As of 01:09 GMT on October 10, the Internet Archive is back up.

In fact, the Wayback Machine and the book archives are responding more quickly than they did for me a week ago, when I showed the Archive to the students in an online class I teach. I gave the students a homework assignment that involves accessing some old books at the Archive. That assignment is due in about 12 hours, and I was just getting ready to e-mail the students about the outage when I saw that the site is working again.

divbzero

2 months ago

As of 08:34 GMT on October 10, the Internet Archive is down again.

lordfrito

2 months ago

Confused about this breach... I received a notification from HIBP about this hack, but I don't recall ever creating an account on archive.org (was creating an account there even a thing?).

What info does archive.org have on people? Is this info scraped from other websites and stored in the archive.org database? Or is this info related to personal archive.org accounts (as I said I don't recall making an account)?

floam

2 months ago

They are actual archive.org accounts. Maybe you made an account to upload something, or to check out a digitized book from their library?

AdmiralAsshat

2 months ago

Well this should be fun.

Now I'll have to dig through my IA account and remember if I donated to them directly via credit card (and if they stored it), or if it was through PayPal.

paxys

2 months ago

Even if you paid by credit card, there's zero chance they processed the payment themselves.

zelse

2 months ago

HaveIbeenpwnd says it was just passwords/usernames/emails, so seemingly not. (My company just got an email from them about the breach and I confirmed I'm in there with a quick search on their website.)

gaudystead

2 months ago

Good point and thank you for the reminder. Time to go check my email archives...

account42

2 months ago

If they stored your email from your donation the IA would have already used it to spam you themselves, no attackers needed.

pentagrama

2 months ago

The reported alert on the site states:

> Have you ever felt like the Internet Archive runs on sticks and is constantly on the verge of suffering a catastrophic security breach? It just happened. See 31 million of you on HIBP!

But is this an official message from the company? It sounds odd and unprofessional, especially the "See 31 million of you on HIBP!" part, which jokingly refers to a huge privacy issue for users. Could it also be that the site was hacked, with hackers posting that message in addition to the data breach and DDoS attack?

andrelaszlo

2 months ago

Troy Hunt's tweet mentions the IA getting breached, defaced AND DDoSed. Here it is, in case you don't want to use that site:

>>>

Let me share more on the chronology of this:

30 Sep: Someone sends me the breach, but I'm travelling and didn't realise the significance

5 Oct: I get a chance to look at it - whoa!

6 Oct: I get in contact with someone at IA and send the data, advising it's our goal to load within 72 hours

7 Oct: They confirm and I ask for a disclosure notice

8 Oct: I follow up on the disclosure notice and advise we'll load tomorrow

9 Oct: They get defaced and DDoS'd, right as the data is loading into HIBP

The timing on the last point seems to be entirely coincidental. It may also be multiple parties involved and when we're talking breach + defacement + DDoS, it's clearly not just one attack.

<<<

gtirloni

2 months ago

It's a thankless job to be always begging for donations to keep something working when the Internet at large doesn't value it as much as it should. And now getting targeted like that? I wouldn't judge them if this is an official communication coming from exhausted and frustrated staff.

internetter

2 months ago

The alert is gone now. It appears the attacker compromised their front end deployment

Uptrenda

2 months ago

The funny thing is the internet archive is more connected to hacker culture than cracking a website will ever be. I hate posers more than anything. Hopefully the internet archive comes back stronger than ever.

TZubiri

2 months ago

Yeah, this is hacker news, not hacking news

driver8_

2 months ago

That sucks, I was reading my email in the morn and saw the news from haveibeenpwned.com, and I'm indeed affected by it.

Consolation is that I used a randomly generated unique password. I tried to reset my credentials and see if there are any 2FA options, but the site is overloaded and throwing 504s.

left-struck

2 months ago

I’ve been mentioning this a lot lately but it’s also a good idea to use email forwarding services like Firefox relay, icloud/apple “hide my email”, duckduckgo has a free one, simplelogin you can host yourself… In an email breach you can confirm who was breached if you used a unique email, and it also means your actual email remains at least as secure as those services I mentioned

Aachen

2 months ago

Should we be linking to the site that is very likely to be breached? Could start to host any type of malware until the access can be definitively revoked

btown

2 months ago

This - dang/mods is there a policy for this?

RGamma

2 months ago

Let's hope it was someone dumb enough to be extraditable.

popcalc

2 months ago

No one gets extradited when the attack aligns with US interests abroad.

odo1242

2 months ago

Fun fact: this is the first time using a password manager (Bitwarden) protected me from a security breach! Now I only have to update my archive.org password instead of all of them lol

adfm

2 months ago

They're hiring, if you're looking for a job.

https://www.indeed.com/viewjob?jk=3bb8222ccd9a88ea

Aachen

2 months ago

> Software Engineer, Archiving & Data Services (Remote) [...] Preliminary duties of the role will primarily focus on developing Archive-It

That is. Paying over 100k at the lower end of the range for 3y experience as software engineer

bawolff

2 months ago

Reporting on security issues is always so terrible. Is it a data breach or is it a DDoS? (Or both). Those are opposite things. One is trying to release secret information one is trying to make the site inaccessible.

odo1242

2 months ago

It is both. They got attacked by a DDOS after the security breach.

Aachen

2 months ago

That's like complaining the reporting on the weather forecast channel is so often wrong. This news broke about an hour ago and the IA is down, what witchcraft do you expect news media to practice! Nobody yet has the answers you're looking for, give it some time and log files will be audited and the reporting becomes useful :)

meindnoch

2 months ago

How much of the archive is affected? Could be a targeted effort to tamper with historical records.

EamonnMR

2 months ago

If they wanted to do that they'd probably not try to draw this much attention.

jl6

2 months ago

Does the IA publish hashes of its data to a 3rd party, so we could (in principle) verify that nothing has been tampered with?
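In principle, the verification asked about here could look like the sketch below: compute a SHA-256 digest per file, publish the resulting manifest somewhere independent, then later recompute and compare. It assumes only Python's standard `hashlib`; the names `build_manifest` and `find_mismatches` are hypothetical, and nothing here reflects how IA actually stores or publishes checksums.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(root: Path, manifest: dict) -> list:
    """Return names that were changed, added, or removed since the manifest was built."""
    current = build_manifest(root)
    names = manifest.keys() | current.keys()
    return sorted(n for n in names if manifest.get(n) != current.get(n))
```

The scheme only helps if the manifest lives outside the system being checked; an attacker who can rewrite both the data and the published hashes defeats it.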

markus_zhang

2 months ago

Wouldn't be surprised if the service was purchased by some publishing empires. This kind of thing usually costs some $$$.

xyst

2 months ago

One of the many benefits of owning my own email server:

- I have a catch-all set up to forward all emails to a specific user on the mail server

- able to set up ad hoc email addresses for each online service (e.g., iarch@example.com)

- able to claim example.com in haveibeenpwned

Now I get breach emails from hibp for the whole domain. Unfortunately, I was exposed in this IA breach
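The per-service address idea can be sketched as follows. This is only an illustration in Python: the helper names `alias_for` and `leaked_by` and the naming scheme are made up for the example (the comment above uses a hand-picked alias like `iarch`), and the catch-all itself is mail-server configuration that is not shown here.

```python
import re
from typing import List, Optional


def alias_for(site: str, my_domain: str) -> str:
    """Build a per-site address such as 'archive-org@example.com'."""
    # Collapse anything that is not a letter or digit into a single hyphen.
    label = re.sub(r"[^a-z0-9]+", "-", site.lower()).strip("-")
    if not label:
        raise ValueError("site must contain at least one letter or digit")
    return f"{label}@{my_domain}"


def leaked_by(address: str, known_sites: List[str], my_domain: str) -> Optional[str]:
    """Given an address seen in a breach or in spam, return the site it was issued to."""
    for site in known_sites:
        if alias_for(site, my_domain) == address.lower():
            return site
    return None


if __name__ == "__main__":
    sites = ["archive.org", "walmart.com"]
    print(alias_for("archive.org", "example.com"))
    print(leaked_by("archive-org@example.com", sites, "example.com"))
```

Since every service gets its own address, mail arriving at one of them points back to the service that leaked or sold it.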

lolinder

2 months ago

In case anyone would like these benefits but doesn't want to actually run an email server: All you actually need to accomplish this is a domain name and a decent provider. Fastmail is what I use and it's been great for me.

lunatuna

2 months ago

I used to do this, now I use icloud and the 'hide my email' tool and it works without any hassle. Even asks me when signing up for something if I want to hide my email. It is easier than adding it to my old setup. Even easier than when I was using my free Google for Business setup.

The rest of apple's email landscape sucks. It is pretty poor at managing spam, the client is terrible, it doesn't sync rules between the desktop app, icloud email, and iphone.

I hate email in general. It is getting to be 1 in a 100 type scenario of anything of value and likely worse if I knew all the emails that were deleted before I saw them.

nostromo

2 months ago

The only drawback being that all of your outgoing email is sent directly to the receiver’s spam folder..?

CobaltFire

2 months ago

I do the same thing. Absolutely worth the small hassle.

core-utility

2 months ago

You don't need to deal with the hassle of your own email server for this. Just buy a domain and use Fastmail, Protonmail, or any other service you trust.

alwayslikethis

2 months ago

Simplelogin can do the first two. The third matters little anyways if you don't reuse passwords.

wackget

2 months ago

Great until you need to give someone an email address in real life and awkwardness ensues.

  Cashier: "What's your email?"
  Me:      "walmart@somedomain.com"
  Cashier: "No I meant YOUR email address."
  Me:      "Yeah walmart@somedomain.com"
  Cashier: "Oh do you work for Walmart???"
  Me:      "No see I set up my email so... oh nevermind, 420BLAZEIT@GMAIL.COM"

appendix-rock

2 months ago

All things that aren’t remotely unique to running your own mail server.

account42

2 months ago

Good. Maybe this will get them to reconsider their website changes that make the IA unusable without javascript.

honeybadger1

2 months ago

Lets attack one of the bastions of information freedom...in the name of Palestine, sigh. Ass-hat hackers.

tomrod

2 months ago

That's a shame.

We need not one but many internet archives. Just one and we will repeat the outcome of the Library of Alexandria.

kiba

2 months ago

The Library of Alexandria wasn't that significant and likely wasn't destroyed in one cataclysmic event, but rather centuries of neglect.

19h00

2 months ago

They reported a DDOS attack yesterday, wonder if this is their alert as they manage the fallout?

n3uman

2 months ago

https://blog.archive.org/2021/02/04/thank-you-ubuntu-and-lin... "The Internet Archive is wholly dependent on Ubuntu and the Linux communities that create a reliable, free (as in beer), free (as in speech), rapidly evolving operating system. It is hard to overestimate how important that is to creating services such as the Internet Archive." Maybe CUPS?

Wowfunhappy

2 months ago

Archive.org is now down. Could anyone explain what it used to show?

1024core

2 months ago

Why should an Archive need accounts anyways? This is like a public library: you don't need to authenticate yourself to enter a public library, do you?

msephton

2 months ago

I just got a Discord "breaking news" notification about this from a server I am in; it said it may not show on Have I Been Pwned as it is so new.

crispair

2 months ago

I wonder how they got access to their database? I read in this thread that they likely used a supply chain attack by replacing some polyfill scripts. So they could've injected malicious code (XSS) that logged email and password to a remote server which they could have gone through. With a bit of luck they could've gotten access to an admin account or whatever…

Nathans220

2 months ago

Strange, I just received this message when going to the archive.org website. I thought I might have misspelled the URL.

alkonaut

2 months ago

Does IA have much information on users? I’ve been in dozens of these HIBP leaks (including this one) but still none have concerned me, since they were mostly just email/password and nothing else.

Does IA store anything sensitive for any users? Physical addresses, credit cards, etc.?

pastureofplenty

2 months ago

Maybe this will make Google reconsider relying on them for cached versions of webpages.

1970-01-01

2 months ago

Archive.org is completely down

pmontra

2 months ago

Does anybody know the details of the attack via the JS library? Was that the exploit of a bug that could affect every site or a supply chain attack targeted at the Internet Archive?

meow_catrix

2 months ago

Bet it’s just a stored XSS alert from a poisoned cache.

bn-l

2 months ago

The recent news on IA has made me worried about it. It seems to be a fragile thing and if it goes it'll be something we'll all regret.

Nathans220

2 months ago

After this error: 504 Gateway Time-out. Now: 503 Service Unavailable, "No server is available to handle this request." Not looking good.

silexia

2 months ago

Why does this link to The Verge (garbage clickbait site) and not to the original source, the Internet Archive?

Apocryphon

2 months ago

Hachette Book Group or Hack-it Boot Group?

godshatter

2 months ago

The conspiracy theorist in me wonders what was accidentally copied into the archive that powerful interests want removed and if this is all smoke and mirrors while they make that happen.

carloslfu

2 months ago

"You are all cooked" vibes from that message hahaha

Levitating

2 months ago

I just received my haveibeenpwned.com email...

max_

2 months ago

Is Internet Archive the same as Archive.is?

el_jay

2 months ago

And only weeks before a US election.

excalibur

2 months ago

The overall state of cybersecurity in 2024 depends to an astonishing degree on Troy Hunt's schedule.

anigbrowl

2 months ago

They have a Telegram channel and there's some blurb about it being pushback on US support of Israel, but it reads as bullshit. Probably a script kiddie.

themingus

2 months ago

I was disappointed to discover that https://haveibeenpwned.com does not report an email as pwned if it is subaddressed/plus addressed. myemail@gmail.com is reported as still safe, but myemail+archive@gmail.com is pwned. I wonder if my email has been leaked by any other websites without me knowing.
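Since the lookup described here treats the plus-addressed form and the base form as different strings, one workaround is to derive both and check each. A small sketch in Python; the function name `strip_subaddress` is illustrative, and treating `+` as the subaddress delimiter is a provider convention (Gmail uses it), not a universal rule.

```python
def strip_subaddress(address: str) -> str:
    """Return the base address with any '+tag' removed from the local part."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    base = local.split("+", 1)[0]
    # A local part that starts with '+' has no base; leave it untouched.
    if not base:
        return address
    return f"{base}@{domain}"


def variants_to_check(address: str) -> list:
    """Both forms worth looking up: the address as given and its base form."""
    base = strip_subaddress(address)
    return [address] if base == address else [address, base]


if __name__ == "__main__":
    print(variants_to_check("myemail+archive@gmail.com"))
```

Going the other way (finding every tag ever used with a base address) is not possible from the base address alone, which is one reason a domain-wide search is more convenient for this.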

firen777

2 months ago

Considering the hacker's motive: https://x.com/Sn_darkmeta/status/1844358501952618976

Is it safe to assume the hacker wants to erase the evidence?

Forcing the service offline also means they want to prevent people from archiving evidence in the next how-ever-long hours. Combining with the spoken language they used in that video, are they planning some online disinformation campaign?

----

Edit: some more info about this group: https://old.reddit.com/r/technology/comments/1g0kupb/hacktiv...

----

This group claims to be pro-Palestinian and it's entirely based in Russia.

https://therecord.media/middle-east-financial-institution-6-...

>SN_BLACKMETA has operated its Telegram channel since November 2023, boasting of DDoS incidents and cyberattacks on infrastructure in Israel, the Palestinian Territories and elsewhere. While all of the group’s messages focus on the Palestinian Territories and perceived opponents to Palestine, many of its posts are written in Russian.

>The group’s account on X also shows that it was created by someone in Staraya, a town in Novgorod Oblast, Russia. The account’s initial language was also set to Russian.

>The researchers added that analysis of timestamps and activity patterns showed possible evidence that the actors within the group are operating in a timezone “close to Moscow Standard Time (MSK, UTC+3) or other Middle Eastern or Eastern European time zones (UTC+2 to UTC+4).”

~~Attacks include pro palestine sites and groups, so~~ take that "pro palestine" with a grain of salt.

EDIT: edited for clarity on what is actually in the article and not in outside anonymous sources. If you want to read more, [there's a clearer report on one of their attacks and their usual targets.](https://www.radware.com/security/threat-advisories-and-attac...)

anon115

2 months ago

I wouldn't be surprised if it has something to do with Israel

Krasnol

2 months ago

This is why humanity can't have nice things.

worstspotgain

2 months ago

In unrelated news, apparently most world leaders in the Internet era, from Thatcher to GHWB to Mitterand to Rabin, expressed great admiration for Vladimir Putin.

Ekaros

2 months ago

So now the data also has an off-site third-party archive. Isn't this in line with the goals of the organization? It is now less likely to be destroyed in many eventualities.

lloydatkinson

2 months ago

Deeply disappointing. The only reason I have an IA account is to upload correct book covers to obviously wrong or poor quality books on the Library.

joshchernoff

2 months ago

What an asshole, honestly this is a good public service they offer.

haha112

2 months ago

Damn I get the notice too

EchoReflection

2 months ago

shouldn't info about this breach be ON the IA landing page??

haha112

2 months ago

Where to see dump data?

dt3ft

2 months ago

Imagine if we could get rid of passwords. Entirely. Forever.

indus

2 months ago

I mistakenly read HIBP as Half Price Books..wait what?

mendym

2 months ago

Now it shows a 'Temporarily Offline' message

phplovesong

2 months ago

WHY would you attack IA? What's the point?

testfrequency

2 months ago

I’m feeling extremely conflicted on all of this with IA right now.

On one hand, I love IA

On the other hand…I’m in a long thread with their support right now on removing old snapshots of a social media account I have. Creeps are actively using the old snapshots to dox me and send me death threats using my PII.

It’s incredibly frustrating and IA keeps insisting they cannot do anything about it.

A small part of me hoped IA didn’t recover from today because I knew my info would be finally deleted :/

kleiba

2 months ago

What kind of asshole attacks the Internet Archive of all places on the web??

wasabinator

2 months ago

Some people on this planet add such negative value. What does this clown hope to gain, apart from costing us all an incredibly useful shared resource?

ErikAugust

2 months ago

“According to their twitter, they’re doing it just to do it. Just because they can. No statement, no idea, no demands.”

A special place in Hell…

mynameyeff

2 months ago

huh i thought everyone already knew this

muppetman

2 months ago

Great. Bunch of pricks. Refuse to remove any of my data they scraped.

msephton

2 months ago

They seem to roll out the "we're being DDoS'd" line every time there's some other thing happening.