Just in terms of privacy, it's worth noting that anyone who has uploaded something on IA already has their email address publicly viewable.
This isn't something that's commonly known (even judging by comments here), but the publicly viewable metadata of every upload contains the uploader's IA account email address. So from a security perspective it's bad, and from a privacy perspective a lot of users who've uploaded anything probably weren't aware of this detail.
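If you want to check what one of your own uploads exposes, something like the sketch below would do it. It only scans a metadata dictionary you already have for email-shaped values. The `sample` record and the `uploader` field name are assumptions for illustration, not a dump of a real item.

```python
import re

# Loose pattern for email-shaped strings; good enough for a self-audit.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def email_fields(metadata: dict) -> dict:
    """Return {field: [emails]} for every field holding an email-like value."""
    found = {}
    for key, value in metadata.items():
        # Metadata values may be a single string or a list of strings.
        values = value if isinstance(value, list) else [value]
        emails = [m for v in values if isinstance(v, str)
                  for m in EMAIL_RE.findall(v)]
        if emails:
            found[key] = emails
    return found

# Hypothetical metadata record for one of your own uploads.
sample = {
    "identifier": "my_upload",
    "title": "Some scanned book",
    "uploader": "someone@example.com",
}
print(email_fields(sample))
```

Any field that shows up in the result is one where the address is sitting in plain view.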
This raises an interesting question: should email addresses be private? Addresses of buildings aren't private, and they're somewhat analogous as with many computing concepts. (Aside: Before spam filters were quite good, it was typical to avoid scraping of addresses by mild obfuscation, but I think those days are gone, and this is distinct from privacy anyway.)
If someone wants to upload and never be found out, then they need to use a throwaway address in any case, lest they be providing their "private" address to the administrators of the service without explicitly forbidding further disclosure. If I say something to Alice without demanding that Alice keep it from Bob, then I implicitly don't mind if Alice tells Bob what I said.
This is bad enough. This alone is a privacy bug/data leak.
Theoretically, someone could scrape the pages and compile a list of exposed email addresses.
One solution is to use a unique email address for every website, and change the address if the site gets compromised (with the old address getting added to a spam filter).
I pulled an old friend's website down from the Internet Archive.
He's moved on to the next stage, but I was glad I was able to put his site back up.
It'll be a shame if IA goes down permanently, but we need a decentralized solution anyway.
Having a single mega organization in charge of our collective heritage isn't a good idea.
I have always thought about this. It would be interesting to have users actually store small amounts of redundant info on a device connected to the internet. Very similar to what a torrent does, but with more peers (more data shards than full copies) and fewer seeds. And try to keep a huge database for everyone. Obviously open source, and it would end up something like Tor, where they just assist the network with security patches but don’t actually have any real “control” (admin dashboard control) over the network at large. We already do something like that with website static file caching, but at a much smaller scale. Obviously the security implications of this would be very hard but maybe not impossible to overcome. IPFS comes close, but it again has more seeds than peers.
if anyone knows something like what I'm suggesting, I'd love to hear about it!
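For what it's worth, the "more shards than full copies" idea can be shown with plain XOR parity. This is a toy sketch under my own assumptions (one parity shard, so only one lost shard is repairable); a real system would use proper erasure coding and would also have to remember the original length.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_shards(data: bytes, k: int) -> list:
    """Split data into k equal shards (zero-padded) plus one XOR parity shard."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    shards = [padded[i * size:(i + 1) * size] for i in range(k)]
    return shards + [reduce(xor_bytes, shards)]

def recover(shards: list) -> list:
    """Rebuild the single missing shard (marked None) by XOR-ing the others."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only repair one missing shard")
    shards = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        shards[missing[0]] = reduce(xor_bytes, present)
    return shards

data = b"a page worth keeping around"
shards = make_shards(data, k=4)   # 5 peers each hold about a quarter of the data
shards[2] = None                  # one peer disappears
rebuilt = b"".join(recover(shards)[:4])[:len(data)]
```

Five peers store roughly 1.25x the data between them instead of each holding a full copy, and any single peer can vanish without losing anything.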
This is why BitTorrent and other P2P solutions were invented, but alas:
A. The RIAA, MPAA, and ESA have given these technologies a terrible reputation.
B. Nobody likes to seed. Some kind of seeding-based crypto would have been a great incentive if cryptocurrency wasn't also demonized by now.
It's called torrent protocol and it doesn't work, no one wants to spend money and bandwidth hosting a god forsaken movie or book that only a handful of people care about.
I keep wanting to do this for old sites, make like a personal mini IA. Besides just using wget or curl, any tips for pulling down useable complete websites from IA?
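One approach that goes a bit beyond plain wget: list the captures through the Wayback Machine's CDX endpoint, then fetch each one with the `id_` modifier so you get the original bytes without the Wayback toolbar and rewritten links. The sketch below only builds the URLs. The parameter names are as I understand the CDX server to accept them, so treat them as assumptions and check the current docs before relying on this.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

def cdx_query_url(site: str, limit: int = 1000) -> str:
    """Build a CDX query listing captured URLs under `site` (prefix match)."""
    params = {
        "url": site,
        "matchType": "prefix",        # everything under this path
        "output": "json",
        "fl": "timestamp,original",   # only the fields we need
        "filter": "statuscode:200",   # skip redirects and errors
        "collapse": "urlkey",         # one row per distinct URL
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"

def raw_snapshot_url(timestamp: str, original: str) -> str:
    """URL of the unmodified capture (the `id_` suffix asks for raw content)."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"

print(cdx_query_url("example.com/"))
print(raw_snapshot_url("20200101000000", "http://example.com/"))
```

Feed the first URL to curl, loop over the rows it returns, and download each `raw_snapshot_url` into a matching local path. Go slowly and be polite about request rates.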
Agreed, especially an organization that has already shown itself to not always be impartial.
A decentralized solution, doesn't that scream internet archive on blockchain? What could go wrong.
> the Have I Been Pwned data breach notification service created by Troy Hunt, with whom threat actors commonly share stolen data to be added to the service
Do they? Why?
> The data will soon be added to HIBP
My unique-to-archive.org email address is not there yet.
My question is: How did Scott Helme end up with a password hash that features his own name?
Friendly reminder to generate a unique password for every account you create so database leaks like this one don't bother you (besides on the site they're used).
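A password manager does this for you, but if you just need one right now, Python's `secrets` module is enough. A minimal sketch; the alphabet and default length are my own arbitrary choices.

```python
import secrets
import string

# Letters, digits and a few symbols that most sites accept.
ALPHABET = string.ascii_letters + string.digits + "-_!@#%^*"

def new_password(length: int = 24) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(new_password())
```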
Just noticed the site now alerts this:
> Have you ever felt like the Internet Archive runs on sticks and is constantly on the verge of suffering a catastrophic security breach? It just happened. See 31 million of you on HIBP!
Joke's on them... I'm already on HIBP countless times...
I assume that if this is a bad actor, then account email/name will be leaked?
Is it a genuine alert, or hacking artifact?
Sometimes with friendly / attempt-at-humorous error messages it’s difficult to tell
It looks like someone has compromised one of their subdomains for Polyfill
Update: Subdomain seems to be returning normal responses again now.
You mean the IA included some JS polyfill from a subdomain and that's what's compromised / where the alert is coming from?
That would perhaps explain how they managed to inject the JS alert popup, right?
One of those instances when you really wish curses worked on whoever was pulling this stunt “may you and your descendants suffer the bites of 10000 fleas for 10000 nights as punishment for your misdeeds”
Probably not the best time to say this, but it's surprisingly easy to go through a collection with items and grab every email along with the usernames.
https://archive.org/metadata/naturally_a_girl/metadata
One way or another, there was going to be someone who would take loads of emails with a username attached to it. A bit intrigued by how the hacker compromised the database and got the passwords.
Damn, I had no idea about this. Definitely would've changed some things had I known that emails were public.
This honestly seems like a bit of a design flaw.
Why go for the Internet Archive? Go for something else, not the fucking archive!
We all need our easily accessible decentralized archive of some sort...
This thread is looking like it'll be one of the first places this incident will be documented (seems to be on the top of Google).
Already there are two new users just for this.
Yeah, I was looking around, but saw no mention of it anywhere until I realized it just happened.
I have had an IA account for a number of years, with a gmail address. Nine months ago, I changed the email address to a masked address using my own domain. Now I find that my gmail address was still stored, and was involved in the breach. Why? I get that they might store change history, but why?
BTW, for the current account details, I changed the password to another random string generated by my password manager, and also deleted the masked email address and generated another one, so going forward this sort of thing isn't that much of an issue for me.
I have a similar situation, where I signed up with my main account and later changed IA's email to a more private address. It was the first email I checked on HaveIBeenPwned and it doesn't show up in this leak. The other couple IA accounts I have, whose emails and passwords are exclusive to them, they all show in this leak alright.
I have no explanation for your situation, but this was also my immediate thought and I wanted to give the opposite perspective.
It's also possible that the breach was earlier or going on for longer than reported.
It's been tried several times, but it's hard because it's such a massive quantity of data. The IPFS backup never really got off the ground.
They have their own backups which I think is good enough for now unless someone plans on donating a few hundred million.
Backup / duplication is not an easy project for sure. But for now IA is a single organization operating under one legal system and, as is relevant today, on one technical setup. That's a major weakness.
Suppose we each backed up sites we cared about rather than trying to mirror the whole thing...
A few minutes ago (22:48 UTC), I got three emails from HIBP about accounts of mine breached on the Internet Archive. Troy is quick! And I'm surprised the author of that alert() actually had the data as well as followed through
Bit of a shame the emails contain an ad for a password manager, saying there's two easy steps to become more secure: Step 1: use our password manager (fair enough), "Step 2: Enable 2 factor authentication and store the codes inside your [password manager]" ehh now it's back to 1 factor or am I missing something?
Edit: according to https://www.bleepingcomputer.com/news/security/internet-arch... (via https://news.ycombinator.com/item?id=41793669), Troy Hunt / HIBP already received and verified this "three days ago" as of yesterday 6pm AoE
I think it is safer to have 2FA in your password manager than not using 2FA at all. Because even if they got your password, if they don't have access to your password manager they can't login.
If you protect your password manager with a yubikey or any other hardware key, then your 2FA inside your password manager is quite secure and convenient. But this is very individual, what your threat model is and how secure you want/need to be.
I was going to disagree with you (and I sort of do about password managers and storing 2FA in them, but I also unlock my password manager with a yubikey).
But doesn't a DB compromise mean that the attacker would have the TOTP seed as well? It can only increase your account security elsewhere, but not reusing passwords also prevents the IA leak from hurting you elsewhere, right?
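On the TOTP seed point: a code is just an HMAC of the current 30-second counter under the shared secret, so whoever holds the seed (a leaked server database, or your password manager) can produce valid codes. A stdlib-only sketch of the RFC 6238 calculation with SHA-1, which is what most authenticator apps use by default:

```python
import hashlib
import hmac
import struct

def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given time."""
    counter = unix_time // step                       # number of elapsed time steps
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                           # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# Shared secret and timestamp taken from the RFC 6238 test vectors.
print(totp(b"12345678901234567890", 59, digits=8))
```

Nothing in there depends on the device, only on the secret and the clock, which is why where the seed is stored matters so much.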
They use bcrypt and I always use a really long password so I’m not gonna freak out over this one for once.
Are bcrypt password hashes difficult to crack? I signed up for IA over 10 years ago with a much weaker password than those I use today.
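Rough way to reason about it: bcrypt is deliberately slow, and each +1 on the cost factor doubles the work per guess, so what matters is the size of your password's search space divided by the attacker's guess rate. The sketch below is back-of-the-envelope arithmetic only. The guess rate is a made-up number, not a benchmark, and the hash string is a placeholder in bcrypt's format, not a real hash.

```python
def bcrypt_cost(hash_str: str) -> int:
    """Read the cost factor out of a bcrypt string like $2b$12$<salt+hash>."""
    _, _version, cost, _rest = hash_str.split("$", 3)
    return int(cost)

def rate_at_cost(base_rate: float, base_cost: int, cost: int) -> float:
    """Guess rate scales by 1/2 for every +1 on the cost factor."""
    return base_rate / 2 ** (cost - base_cost)

def average_crack_years(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Expected brute-force time (half the keyspace) in years."""
    keyspace = alphabet_size ** length
    return keyspace / 2 / guesses_per_second / (365 * 24 * 3600)

placeholder = "$2a$10$" + "x" * 53          # format example only
cost = bcrypt_cost(placeholder)
rate = rate_at_cost(100_000, base_cost=5, cost=cost)  # hypothetical attacker rate

print(average_crack_years(26, 6, rate))     # 6 lowercase letters
print(average_crack_years(62, 12, rate))    # 12 mixed alphanumerics
```

Note this only models blind brute force. A short or dictionary-style password from 10 years ago falls to a wordlist far sooner than the numbers suggest, so changing it is the safe move.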
As of 01:09 GMT on October 10, the Internet Archive is back up.
In fact, the Wayback Machine and the book archives are responding more quickly than they did for me a week ago, when I showed the Archive to the students in an online class I teach. I gave the students a homework assignment that involves accessing some old books at the Archive. That assignment is due in about 12 hours, and I was just getting ready to e-mail the students about the outage when I saw that the site is working again.
As of 08:34 GMT on October 10, the Internet Archive is down again.
Confused about this breach... I received a notification from HIBP about this hack, but I don't recall ever creating an account on archive.org (was creating an account there even a thing?).
What info does archive.org have on people? Is this info scraped from other websites and stored in the archive.org database? Or is this info related to personal archive.org accounts (as I said I don't recall making an account)?
They are actual archive.org accounts. Maybe you made an account to upload something, or to check out a digitized book from their library?
Well this should be fun.
Now I'll have to dig through my IA account and remember if I donated to them directly via credit card (and if they stored it), or if it was through PayPal.
Even if you paid by credit card, there's zero chance they processed the payment themselves.
Have I Been Pwned says it was just passwords/usernames/emails, so seemingly not. (My company just got an email from them about the breach and I confirmed I'm in there with a quick search on their website.)
Good point and thank you for the reminder. Time to go check my email archives...
If they stored your email from your donation the IA would have already used it to spam you themselves, no attackers needed.
The reported alert on the site states:
> Have you ever felt like the Internet Archive runs on sticks and is constantly on the verge of suffering a catastrophic security breach? It just happened. See 31 million of you on HIBP!
But is this an official message from the company? It sounds odd and unprofessional, especially the "See 31 million of you on HIBP!" part, which jokingly refers to a huge privacy issue for users. Could it also be that the site was hacked, with hackers posting that message in addition to the data breach and DDoS attack?
Troy Hunt's tweet mentions the IA getting breached, defaced AND DDoSed. Here it is, in case you don't want to use that site:
>>>
Let me share more on the chronology of this:
30 Sep: Someone sends me the breach, but I'm travelling and didn't realise the significance
5 Oct: I get a chance to look at it - whoa!
6 Oct: I get in contact with someone at IA and send the data, advising it's our goal to load within 72 hours
7 Oct: They confirm and I ask for a disclosure notice
8 Oct: I follow up on the disclosure notice and advise we'll load tomorrow
9 Oct: They get defaced and DDoS'd, right as the data is loading into HIBP
The timing on the last point seems to be entirely coincidental. It may also be multiple parties involved and when we're talking breach + defacement + DDoS, it's clearly not just one attack.
<<<
It's a thankless job to be always begging for donations to keep something working when the Internet at large doesn't value it as much as it should. And now getting targeted like that? I wouldn't judge them if this is an official communication coming from exhausted and frustrated staff.
The alert is gone now. It appears the attacker compromised their front end deployment
The funny thing is the internet archive is more connected to hacker culture than cracking a website will ever be. I hate posers more than anything. Hopefully the internet archive comes back stronger than ever.
Yeah, this is hacker news, not hacking news
What are they looking for here? Negative karma?
That sucks. I was reading my email in the morning and saw the news from haveibeenpwned.com, and I'm indeed affected by it.
Consolation is that I used a randomly generated unique password. I tried to reset my credentials and see if there are any 2FA options, but the site is overloaded and throwing 504s.
I’ve been mentioning this a lot lately but it’s also a good idea to use email forwarding services like Firefox relay, icloud/apple “hide my email”, duckduckgo has a free one, simplelogin you can host yourself…
In an email breach you can confirm who was breached if you used a unique email, and it also means your actual email remains at least as secure as those services I mentioned
Should we be linking to the site that is very likely to be breached? Could start to host any type of malware until the access can be definitively revoked
This - dang/mods is there a policy for this?
Let's hope it was someone dumb enough to be extraditable.
No one gets extradited when the attack aligns with US interests abroad.
Fun fact: this is the first time using a password manager (Bitwarden) protected me from a security breach! Now I only have to update my archive.org password instead of all of them lol
> Software Engineer, Archiving & Data Services (Remote) [...] Preliminary duties of the role will primarily focus on developing Archive-It
That is. Paying over 100k at the lower end of the range for 3y experience as software engineer
Reporting on security issues is always so terrible. Is it a data breach or is it a DDoS? (Or both). Those are opposite things. One is trying to release secret information one is trying to make the site inaccessible.
It is both. They got attacked by a DDOS after the security breach.
That's like complaining the reporting on the weather forecast channel is so often wrong. This news broke about an hour ago and the IA is down, what witchcraft do you expect news media to practice! Nobody yet has the answers you're looking for, give it some time and log files will be audited and the reporting becomes useful :)
How much of the archive is affected? Could be a targeted effort to tamper with historical records.
If they wanted to do that they'd probably not try to draw this much attention.
Does the IA publish hashes of its data to a 3rd party, so we could (in principle) verify that nothing has been tampered with?
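In principle the check itself is simple: publish a manifest of SHA-256 digests somewhere the archive doesn't control, then later recompute and compare. A minimal sketch with in-memory "files"; the file names and contents are invented for illustration, and I don't know whether IA does anything like this today.

```python
import hashlib

def manifest(files: dict) -> dict:
    """Map each file name to the SHA-256 hex digest of its bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}

def changed(files: dict, published: dict) -> list:
    """Names that were altered, added, or removed relative to the published manifest."""
    current = manifest(files)
    names = published.keys() | current.keys()
    return sorted(n for n in names if published.get(n) != current.get(n))

# Hypothetical snapshot, hashed at archiving time and published elsewhere.
original = {"page.html": b"<h1>1999</h1>", "logo.gif": b"GIF89a..."}
published = manifest(original)

# Later: someone quietly edits one file.
tampered = dict(original, **{"page.html": b"<h1>2024</h1>"})
print(changed(tampered, published))
```

The hard part isn't the hashing, it's getting the manifest held by enough independent parties that it can't be rewritten along with the data.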
Wouldn't be surprised if the service was purchased by some publishing empires. This kind of things usually costs some $$$.
One of the many benefits of owning my own email server:
- I have a catch-all set up to forward all emails to a specific user on the mail server
- able to set up ad hoc email addresses for each online service (i.e., iarch@example.com)
- able to claim example.com in haveibeenpwned
Now I get breach emails from hibp for the whole domain. Unfortunately, I was exposed in this IA breach
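The per-service address scheme in the list above can be sketched in a few lines. This version adds a short HMAC tag so the aliases aren't guessable from the site name. That tag, the `secret` value, and the helper names are my own additions for illustration, not something the setup above requires.

```python
import hashlib
import hmac

def alias_for(site: str, secret: bytes, domain: str = "example.com") -> str:
    """Derive a stable per-site address such as archiveorg.<tag>@example.com."""
    label = "".join(c for c in site.lower() if c.isalnum())
    tag = hmac.new(secret, site.lower().encode(), hashlib.sha256).hexdigest()[:6]
    return f"{label}.{tag}@{domain}"

def source_of(address: str, sites: list, secret: bytes, domain: str = "example.com"):
    """Given an address seen in a breach or in spam, return the site it was made for."""
    for site in sites:
        if alias_for(site, secret, domain) == address.lower():
            return site
    return None

secret = b"keep-this-private"            # hypothetical key, store it somewhere safe
sites = ["archive.org", "walmart.com"]
leaked = alias_for("archive.org", secret)
print(leaked, "->", source_of(leaked, sites, secret))
```

With a catch-all on the domain, none of these addresses has to be created in advance, and a leaked one can be blocked without touching the rest.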
In case anyone would like these benefits but doesn't want to actually run an email server: All you actually need to accomplish this is a domain name and a decent provider. Fastmail is what I use and it's been great for me.
I used to do this, now I use icloud and the 'hide my email' tool and it works without any hassle. Even asks me when signing up for something if I want to hide my email. It is easier than adding it to my old setup. Even easier than when I was using my free Google for Business setup.
The rest of apple's email landscape sucks. It is pretty poor at managing spam, the client is terrible, it doesn't sync rules between the desktop app, icloud email, and iphone.
I hate email in general. It is getting to be 1 in a 100 type scenario of anything of value and likely worse if I knew all the emails that were deleted before I saw them.
The only drawback being that all of your outgoing email is sent directly to the receiver’s spam folder..?
I do the same thing. Absolutely worth the small hassle.
You don't need to deal with the hassle of your own email server for this. Just buy a domain and use Fastmail, Protonmail, or any other service you trust.
Simplelogin can do the first two. The third matters little anyways if you don't reuse passwords.
Great until you need to give someone an email address in real life and awkwardness ensues.
Cashier: "What's your email?"
Me: "walmart@somedomain.com"
Cashier: "No I meant YOUR email address."
Me: "Yeah walmart@somedomain.com"
Cashier: "Oh do you work for Walmart???"
Me: "No see I set up my email so... oh nevermind, 420BLAZEIT@GMAIL.COM"
All things that aren’t remotely unique to running your own mail server.
Good. Maybe this will get them to reconsider their website changes that make the IA unusable without javascript.
Let's attack one of the bastions of information freedom...in the name of Palestine, sigh. Ass-hat hackers.
That's a shame.
We need not one but many internet archives. Just one and we will repeat the outcome of the Library of Alexandria.
The Library of Alexandria wasn't that significant and likely wasn't destroyed in one cataclysmic event, but rather centuries of neglect.
They reported a DDOS attack yesterday, wonder if this is their alert as they manage the fallout?
https://blog.archive.org/2021/02/04/thank-you-ubuntu-and-lin...
"The Internet Archive is wholly dependent on Ubuntu and the Linux communities that create a reliable, free (as in beer), free (as in speech), rapidly evolving operating system. It is hard to overestimate how important that is to creating services such as the Internet Archive." Maybe CUPS?
Archive.org is now down. Could anyone explain what it used to show?
Why should an Archive need accounts anyways? This is like a public library: you don't need to authenticate yourself to enter a public library, do you?
I just got a Discord "breaking news" notification about this from a server I'm in. It said it may not show on Have I Been Pwned as it is so new.
I wonder how they got access to their database? I read in this thread that they likely used a supply chain attack by replacing some polyfill scripts. So they could've injected malicious code (XSS) that logged emails and passwords to a remote server, which they could then have gone through. With a bit of luck they could've gotten access to an admin account or whatever…
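If a swapped-out third-party script really was the way in (still speculation at this point), the usual browser-side defence is Subresource Integrity: the page pins a hash of the script, and the browser refuses to run anything that doesn't match. A sketch of how that `integrity` value is computed; the script contents here are made up.

```python
import base64
import hashlib
import hmac

def sri_hash(content: bytes) -> str:
    """Return an SRI value of the form sha384-<base64 digest>."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")

def matches(content: bytes, integrity: str) -> bool:
    """True if the fetched content still matches the pinned integrity value."""
    return hmac.compare_digest(sri_hash(content), integrity)

good = b"window.polyfill = function () {};"      # hypothetical script body
pinned = sri_hash(good)
evil = good + b"alert('See 31 million of you on HIBP!');"

print(pinned)
print(matches(good, pinned), matches(evil, pinned))
```

The pinned value goes in the `integrity` attribute of the script tag. It wouldn't help if the attacker controlled the page that holds the tag, only if they controlled the script host.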
Strange I just received this message when going to the archive.org website I thought I might have misspelled the url
Does IA have much information on users? I’ve been in dozens of these HIBP leaks (including this one) but still none have concerned me, since they were mostly just email/password and nothing else.
Does IA store anything sensitive for any users? Physical addresses, credit cards, etc.?
Maybe this will make Google reconsider relying on them for cached versions of webpages.
Archive.org is completely down
Does anybody know the details of the attack via the JS library? Was that the exploit of a bug that could affect every site, or a supply chain attack targeted at the Internet Archive?
Bet it’s just a stored XSS alert from a poisoned cache.
The recent news on IA has made me worried about it. It seems to be a fragile thing and if it goes it'll be something we'll all regret.
After this error 504 Gateway Time-out
Now 503 Service Unavailable
No server is available to handle this request.
Not looking good
Why does this link to the verge (garbage clickbait site) and not to the original source of the internet archive?
Hachette Book Group or Hack-it Boot Group?
I hope it will be back again soon
The conspiracy theorist in me wonders what was accidentally copied into the archive that powerful interests want removed and if this is all smoke and mirrors while they make that happen.
"You are all cooked" vibes from that message hahaha
I just received my haveibeenpwned.com email...
Is Internet Archive the same as Archive.is?
And only weeks before a US election.
Any information on SN_Blackmeta?
The overall state of cybersecurity in 2024 depends to an astonishing degree on Troy Hunt's schedule.
They have a Telegram channel and there's some blurb about it being pushback on US support of Israel, but it reads as bullshit. Probably a script kiddie.
I was disappointed to discover that https://haveibeenpwned.com does not report an email as pwned if it is subaddressed/plus addressed. myemail@gmail.com is reported as still safe, but myemail+archive@gmail.com is pwned. I wonder if my email has been leaked by any other websites without me knowing.
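Since plus-addressed variants are treated as separate addresses, the practical workaround is to look up both the tagged address and the base address for every account you have. A small sketch that expands a list of addresses into everything worth checking; it only manipulates strings and does not talk to HIBP.

```python
def base_address(address: str) -> str:
    """Strip a +tag from the local part: myemail+archive@gmail.com -> myemail@gmail.com."""
    local, _, domain = address.rpartition("@")
    return f"{local.split('+', 1)[0]}@{domain}"

def addresses_to_check(addresses: list) -> list:
    """Every address as given, plus its untagged base, de-duplicated and sorted."""
    expanded = set()
    for address in addresses:
        expanded.add(address.lower())
        expanded.add(base_address(address.lower()))
    return sorted(expanded)

print(addresses_to_check(["myemail+archive@gmail.com", "myemail+shop@gmail.com"]))
```

The reverse direction is the annoying one: you can't enumerate tags you've forgotten, so it's worth keeping a list of the ones you hand out.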
Considering the hacker's motive: https://x.com/Sn_darkmeta/status/1844358501952618976
Is it safe to assume the hacker wants to erase the evidence?
Forcing the service offline also means they want to prevent people from archiving evidence for the next however many hours. Combined with the spoken language they used in that video, are they planning some online disinformation campaign?
----
Edit: some more info about this group: https://old.reddit.com/r/technology/comments/1g0kupb/hacktiv...
----
This group claims to be pro-Palestinian and it's based entirely in Russia.
https://therecord.media/middle-east-financial-institution-6-...
>SN\_BLACKMETA has operated its Telegram channel since November 2023, boasting of DDoS incidents and cyberattacks on infrastructure in Israel, the Palestinian Territories and elsewhere. While all of the group’s messages focus on the Palestinian Territories and perceived opponents to Palestine, many of its posts are written in Russian.
>The group’s account on X also shows that it was created by someone in Staraya, a town in Novgorod Oblast, Russia. The account’s initial language was also set to Russian.
>The researchers added that analysis of timestamps and activity patterns showed possible evidence that the actors within the group are operating in a timezone “close to Moscow Standard Time (MSK, UTC+3) or other Middle Eastern or Eastern European time zones (UTC+2 to UTC+4).”
~~Attacks include pro palestine sites and groups, so~~ take that "pro palestine" with a grain of salt.
EDIT: edited for clarity on what is actually in the article and not in outside anonymous sources. If you want to read more, [there's a clearer report on one of their attacks and their usual targets.](https://www.radware.com/security/threat-advisories-and-attac...)
I wouldn't be surprised if it has something to do with Israel
This is why humanity can't have nice things.
In unrelated news, apparently most world leaders in the Internet era, from Thatcher to GHWB to Mitterand to Rabin, expressed great admiration for Vladimir Putin.
So now the data also has an off-site third-party archive. Isn't this in line with the goals of the organization? It is now less likely to be destroyed in many eventualities.
Deeply disappointing. The only reason I have an IA account is to upload correct book covers to obviously wrong or poor quality books on the Library.
What an asshole, honestly this is a good public service they offer.
Damn I get the notice too
shouldn't info about this breach be ON the IA landing page??
Imagine if we could get rid of passwords. Entirely. Forever.
I mistakenly read HIBP as Half Price Books..wait what?
Now it shows a 'Temporarily Offline' message
WHY would you attack IA? What's the point?
I’m feeling extremely conflicted on all of this with IA right now.
On one hand, I love IA
On the other hand…I’m in a long thread with their support right now on removing old snapshots of a social media account I have. Creeps are actively using the old snapshots to dox me and send me death threats using my PII.
It’s incredibly frustrating and IA keeps insisting they cannot do anything about it.
A small part of me hoped IA didn’t recover from today because I knew my info would be finally deleted :/
What kind of asshole attacks the Internet Archive of all places on the web??
Some people on this planet add such negative value. What does this clown hope to gain, apart from costing us all an incredibly useful shared resource?
“According to their twitter, they’re doing it just to do it. Just because they can. No statement, no idea, no demands.”
A special place in Hell…
huh i thought everyone already knew this
Great. Bunch of pricks. Refuse to remove any of my data they scraped.
They seem to roll out the we're being DDOS'd every time there's some other thing happening.