dperfect
22 days ago
Nerdsnipe confirmed :)
Claude Opus came up with this script:
It produces a somewhat-readable PDF (first page at least) with this text output:
(I used the cleaned output at https://pastebin.com/UXRAJdKJ mentioned in a comment by Joe on the blog page)
pests
22 days ago
So it was a public event attended by 450 people:
https://www.mountsinai.org/about/newsroom/2012/dubin-breast-...
https://www.businessinsider.com/dubin-breast-center-benefit-...
Even names match up, but oddly the date is different.
elmomle
22 days ago
Your links are for the inaugural (first) ball in December 2011; OP's text referred to a second annual ball in December 2012.
pests
22 days ago
You are right, my first link is incorrect, but the second does seem to be from 2012.
sorbus-25
22 days ago
DUBIN BREAST CENTER SECOND ANNUAL BENEFIT MONDAY, DECEMBER 10, 2012 HONORING ELISA PORT, MD, FACS AND THE RUTTENBERG FAMILY HOST CYNTHIA MCFADDEN SPECIAL MUSICAL PERFORMANCES CAROLINE JONES, K'NAAN, HALEY REINHART, THALIA, EMILY WARREN MANDARIN ORIENTAL 7:00PM COCKTAILS LOBBY LOUNGE 8:00PM DINNER AND ENTERTAINMENT MANDARIN BALLROOM FESTIVE ATTIRE
Groxx
20 days ago
Since it looks like this got flagged (probably because, out of context and at a glance, it looks like the insane babble that somewhat frequently occurs here), some context: this appears to be text recovered from the PDF in the links up-thread. There's more text than that link shows, though, and I'm not entirely sure why it's posted in this specific thread, but it's relevant-ish at least.
sorbus-25
20 days ago
It's from a contemporaneous reference to the very same event listed in the PDF. I found it online and archived it: https://web.archive.org/web/20260206040716/https://what2wear...
It includes screenshots of what looks like an expanded document for the event.
Why relevant? I found it by searching the archive for "DBC". There were references to "Dubin", then I found the rest online easily. All that extra text could have helped with decoding the base64 text.
turtlesdown11
21 days ago
interesting, Eva Dubin was highlighted today for offering Epstein her 15 year old daughter and her friends.
She's a medical doctor, who became amnesic when on the stand for Maxwell's case
>Pressed about gaps in her memory, Dubin told the court: "It's very hard for me to remember anything far back and sometimes I can't remember things from last month. My family notices it. I notice it."
nialv7
22 days ago
looks like we have it. in the end it's pretty mundane...
JKCalhoun
21 days ago
There are plenty of other PDFs with Base64 encoded attachments.
klustregrif
22 days ago
Which begs the question why was it censored?
nickthegreek
21 days ago
They censored "Dont" in one location. The current thought process is that they were redacting mentions of Don T.
pbhjpbhj
21 days ago
Might be they censored all pages mentioning keywords, this one says "Breast" ... perhaps they censored all sexual content?
klustregrif
21 days ago
At the risk of repeating myself: which begs the question, why?
mmastrac
21 days ago
To protect the people in power, as always.
redeeman
21 days ago
what is insane is that everyone just accepts it, knows that this happens, and dont go lynch the ones in charge immediately.
There was a time when the guy making the cannon had to sit on top of it for the first shot. Perhaps this kind of policy could be adapted to other situations as well.
Take the job to guard Epstein? Take the consequences when things go wrong.
Protect criminals? Take the very real consequences if found out.
ben_w
21 days ago
> what is insane is that everyone just accepts it, knows that this happens, and dont go lynch the ones in charge immediately.
For a while, my pet conspiracy theory was that this was Epstein's real cause of death: a lynching by a prison guard made to look like suicide.
I never took it too seriously, because no actual evidence; now I'm more inclined to think it was a coconspirator hoping it would mean no more evidence getting out.
quickthrowman
21 days ago
Epstein being murdered is the one conspiracy that I personally still think may be possible/probable.
All it takes is a single actor paying off some guards to ‘fall asleep’, a camera to be disabled, and a 15 minute window of opportunity. It’s much more probable than something like the US Government planning 9/11 and somehow keeping thousands of co-conspirators silent.
I don’t really spend a whole lot of time thinking about it since as you said, we’ll never know for sure. It just seems at least probable if he actually did have kompromat on powerful people.
QuercusMax
21 days ago
Did you see this? https://www.cbsnews.com/news/epstein-files-jail-cell-death-v...
The noose they found in his cell was not the thing that strangled him. If he wasn't murdered then they faked his death.
mikeyouse
21 days ago
Likely because the named list is a bunch of Trump appointees and mega donors and they're illegally trying to spare them the embarrassment.
lanyard-textile
21 days ago
Distraction.
notpushkin
22 days ago
> It produces a somewhat-readable PDF (first page at least) with this text output
Any chance you could share a screenshot / re-export it as a (normalized) PDF? I’m curious about what’s in there, but all of my readers refuse to open it.
dperfect
22 days ago
Screenshot: https://imgur.com/eWCfYYd
dperfect
21 days ago
Letting Claude work a little longer produced this behemoth of a script (which is supposed to be somewhat universal in correcting similar OCR'd PDFs - not yet tested on any others though): https://pastebin.com/PsaFhSP1
which uses this Rust zlib stream fixer: https://pastebin.com/iy69HWXC
and gives the best output I've seen it produce: https://imgur.com/itYWblh
This is using the same OCR'd text posted by commenter Joe.
daveguy
21 days ago
> which is supposed to be somewhat universal in correcting similar OCR'd PDFs
Xerox would like a word.
https://news.ycombinator.com/item?id=29223815
Point being, "correcting" to "correct looking" may be worse than just accepting errors. Errors are often clearly identified by humans as a nonsense word. "Correcting" OCR can result in plausible, but wrong results that are more difficult for the human in the loop to identify.
dperfect
21 days ago
That's true if we're correcting OCR of actual output text. In this case, it's operating on the base64 text, trying to produce chunks that form valid zlib streams and PDF syntax so the file can be intact enough to be opened. "Just accepting errors" would mean not seeing any content in the file because it cannot be read.
So yes, the "fixed" output has errors, but it’s not hallucinating details like an LLM, nor is it trying to produce output that conforms to any linguistic or stylistic heuristics.
The phrase "correcting similar OCR'd PDFs" should have been "correcting similar OCR'd base64 representations of PDFs".
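To illustrate the general idea (this is not the linked script, just a minimal Python sketch under stated assumptions): treat "does this chunk inflate as a valid zlib stream?" as the acceptance test, and try swapping characters that OCR plausibly confuses until the test passes. The `CONFUSABLE` table, the `repair_base64` name, and the `max_positions` cap are all hypothetical choices made for the example.

```python
import base64
import itertools
import zlib

# Hypothetical confusion sets: glyphs an OCR pass might mix up.
# Each set lists the character itself first, so the unmodified
# text is always the first candidate tried.
CONFUSABLE = {"l": "l1I", "1": "1lI", "I": "Il1", "O": "O0", "0": "0O"}


def inflates(raw: bytes) -> bool:
    """Return True if raw is a complete, valid zlib stream."""
    try:
        zlib.decompress(raw)
        return True
    except zlib.error:
        return False


def repair_base64(text: str, max_positions: int = 8):
    """Brute-force confusable characters until the decoded bytes inflate.

    Only the first max_positions ambiguous characters are varied,
    which keeps the search small (at most 3 ** max_positions tries).
    Returns the repaired base64 string, or None if nothing worked.
    """
    positions = [i for i, ch in enumerate(text) if ch in CONFUSABLE]
    positions = positions[:max_positions]
    options = [CONFUSABLE[text[i]] for i in positions]
    for combo in itertools.product(*options):
        chars = list(text)
        for i, ch in zip(positions, combo):
            chars[i] = ch
        candidate = "".join(chars)
        try:
            raw = base64.b64decode(candidate, validate=True)
        except ValueError:  # binascii.Error is a ValueError subclass
            continue
        if inflates(raw):
            return candidate
    return None
```

The zlib checksum is what makes this different from "correcting" toward plausible-looking text: a candidate is accepted only if the stream decompresses and its Adler-32 matches, so a wrong guess is very unlikely to slip through. A real file would need this applied per stream and with a smarter search than plain brute force.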
the_real_cher
21 days ago
This is cool!