anigbrowl
3 days ago
I found this part interesting:
There are also other documents that appear to simulate a scanned document but completely lack the “real-world noise” expected with physical paper-based workflows. The much crisper images appear almost perfect without random artifacts or background noise, and with the exact same amount of image skew across multiple pages. Thanks to the borders around each page of text, page skew can easily be measured, such as with VOL00007\IMAGES\0001\EFTA00009229.pdf. It is highly likely these PDFs were created by rendering original content (from a digital document) to an image (e.g., via print to image or save to image functionality) and then applying image processing such as skew, downscaling, and color reduction.
tombrossman
3 days ago
GNOME Desktop users can put this in a Bash script in ~/.local/share/nautilus/scripts/ for more convincing-looking fake PDF scans, accessible from the right-click menu. I do not recall where I copied it from originally, so I can't give credit; thanks, random internet person (probably on Stack Exchange). It works perfectly.
ROTATION=$(shuf -n 1 -e '-' '')$(shuf -n 1 -e $(seq 0.05 .5))
for pdf in "$@";
do magick -density 150 "$pdf" \
-linear-stretch '1.5%x2%' \
-rotate 0.4 \
-attenuate '0.01' \
+noise Multiplicative \
-colorspace 'gray' \
"${pdf%.*}-fakescan.${pdf##*.}"
done
barrkel
2 days ago
That seq is probably supposed to be $(seq 0.05 0.05 0.5). Right now it's always 0.05.
Note that you can get random numbers straight from bash with $RANDOM. It's 15 bit (0 to 32767) but good enough here; this would get between 0.05 and 0.5: $(printf "0.%.4d\n" $((500 + RANDOM % 4501)))
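To make the difference concrete, here is a minimal sketch assuming bash and GNU seq. The helper name `random_angle` is made up for illustration; it just wraps the printf above.

```shell
#!/usr/bin/env bash
# With two arguments, seq treats them as FIRST and LAST with an implied
# step of 1, so this only ever prints 0.05.
seq 0.05 .5

# With an explicit step as the middle argument, seq counts from 0.05
# up to 0.5 in steps of 0.05.
seq 0.05 0.05 0.5

# Hypothetical helper: build an angle from $RANDOM instead.
# 500 + RANDOM % 4501 is an integer from 500 to 5000, and %.4d pads it
# to four digits, giving a value from 0.0500 to 0.5000.
random_angle() {
  printf '0.%.4d\n' $((500 + RANDOM % 4501))
}

random_angle
```

Unlike the shuf-over-seq version, the $RANDOM version is not limited to ten discrete angles.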
streetfighter64
3 days ago
Shouldn't $ROTATION be set inside the loop and actually used in the magick command?
tombrossman
2 days ago
You know, now that you point it out, that seems obvious. I think maybe I was experimenting with rotation and left that in, unused. I did this years ago. The loop works OK though. Thanks for the feedback (and now I have to finish editing that script ...)
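For reference, a sketch of how the script might look with the points raised in this thread applied: the rotation is picked inside the loop, actually passed to -rotate, and the seq call has an explicit step. It assumes ImageMagick 7 (the `magick` command) and GNU shuf/seq; the function names `pick_rotation` and `fake_scan` are invented here and are not from the original script.

```shell
#!/usr/bin/env bash
# Sketch only: per-file random skew for fake scans.

# Pick a sign ('-' or nothing) and a magnitude from 0.05 to 0.5.
# LC_ALL=C keeps '.' as the decimal separator regardless of locale.
pick_rotation() {
  local sign magnitude
  sign=$(shuf -n 1 -e '-' '')
  magnitude=$(shuf -n 1 -e $(LC_ALL=C seq 0.05 0.05 0.5))
  printf '%s%s\n' "$sign" "$magnitude"
}

fake_scan() {
  local pdf rotation
  for pdf in "$@"; do
    rotation=$(pick_rotation)   # new angle for every input file
    magick -density 150 "$pdf" \
      -linear-stretch '1.5%x2%' \
      -rotate "$rotation" \
      -attenuate '0.01' \
      +noise Multiplicative \
      -colorspace 'gray' \
      "${pdf%.*}-fakescan.${pdf##*.}"
  done
}

# Only convert when files are passed in, e.g. from the Nautilus scripts menu.
if [ "$#" -gt 0 ]; then
  fake_scan "$@"
fi
```

Note that every page of a given PDF still gets the same angle, since the rotation is chosen once per file rather than once per page.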
lordgrenville
2 days ago
Nothing about this is specific to GNOME, right? ImageMagick is cross-platform.
turboponyy
2 days ago
I guess the GNOME-specific part is that GNOME comes with the Nautilus file browser, and the instructions add a script for Nautilus.
But yeah, this will work as long as you have ImageMagick and Nautilus installed.
lordgrenville
2 days ago
Oh I missed that part, was just looking at the script
landdate
2 days ago
Or just run the script and pass the PDF as an argument...
mimischi
2 days ago
I like https://lookscanned.io/
landdate
2 days ago
[flagged]
taskforcegemini
2 days ago
you sound as grumpy as my cat looks. there's no need for this language
landdate
2 days ago
[flagged]
nullbio
2 days ago
The real question is: Which of the documents are the ones that are "simulating" scanned documents, and what political narrative do they reinforce?
The only reason I can think of for why someone would want to do this is to pass off fraudulent or AI generated images as real.
boromisp
2 days ago
A simpler explanation could be wanting to skip the print->sign->scan ceremony required by some institutions.
lucideer
17 hours ago
Another explanation is that it's simply one form of lazy ineffective obfuscation performed by inexperienced relative luddites in an attempt to walk the fine line between complying with the supreme court directive & not releasing anything useful.
Other investigations into the files have found oddities like redaction of the word "don't" indicating a haphazard find-&-replace approach to redaction, possibly LLM-aided.
The DOJ/Akamai online hosted search feature is also incomplete - potentially due to some of these "digitally scanned" files not being subject to OCR.
lucideer
17 hours ago
> to pass off fraudulent or AI generated images as real.
Possibly but I don't find it compelling, if only because a significant portion of the media reportage on the files has made claims that are entirely baseless - if there were a narrative to be sold one would expect such reportage to be actively leveraging such fraudulent images.
reactordev
2 days ago
This. Slip in a few thousand “fakes” with the trove of goods to be able to fabricate a narrative.
streetfighter64
3 days ago
Very interesting. That document in particular seems to be an interview of A. Acosta by the DoJ from 2019. But what reason would the FBI have for pretending it's a scanned document, if it is genuine? Perhaps there's some aspect of Epstein's deal with Acosta that they'd rather not reveal to the public?
https://www.justice.gov/epstein/files/DataSet%207/EFTA000092...
juujian
3 days ago
Not that I can speak from personal experience or anything... But somebody on an email chain may have requested a scanned version of the document to ensure there is no metadata and the employee might have found it easier to just flatten the pdf and apply a graphical filter to make the document appear like a scanned document. There might even be a webtool available somewhere to do so, I wouldn't know...
agopo
3 days ago
[dead]
ThePowerOfFuet
2 days ago
Straight to the signup page? A bit blatant, no?
mikkupikku
3 days ago
> the employee might have found it easier to just flatten the pdf and apply a graphical filter to make the document appear like a scanned document
Is that remotely plausible? I can't imagine faking a scan being easier than just walking down the hall to the copier room.
dahcryn
2 days ago
If I look at my personal work situation, working from home would mean I can't do it immediately, but would have to remember to do it the next day. Or I could just do it digitally right now in a few minutes and have it off my to-do list.
Don't attribute to malice what can be attributed to laziness; these are government workers.
streetfighter64
2 days ago
I think maybe the old "don't attribute to malice" adage goes out the window when we're talking about a coverup of a giant child sex trafficking ring run by high-up people in the government.
sporkland
2 days ago
While I don't disagree with your point about the Epstein case being a massive CYA for a ton of people in power, the fact is that if they deeply wanted to cover something up, the right way to do it would be to actually print it and scan it. This looks like someone took a shortcut on some broad order to print and scan all digital media.
jojobas
2 days ago
It's thousands of pages, surely investing some time in a script is faster. They were in a rush as well.
If they were faking the documents rather than the delivery method they definitely could have invested some time in flawless looks.
smcnally
2 days ago
Or more-realistic flawed looks as the case is here.
meinersbur
2 days ago
The time advantage of faking a scan becomes better the more pages you have to scan.
usr1106
2 days ago
Nice. But 5 years seems unrealistic. Who stays in the same job using the same processes for 5 years these days? Even if the task remains the same, input formats might change, requiring extra maintenance of the tool. I should recalculate that for 3 years before using it in my automation decisions.
luplex
2 days ago
You do not work in the public sector, where processes change rarely, slowly, and partially.
salynchnew
2 days ago
If it's already scanned, then you don't have to leave your desk.
normie3000
2 days ago
Working from home and no scanner in the house?
ongy
2 days ago
No printer.
juujian
a day ago
Look, what I'm saying is that I don't have a scanner at home or at work and I've found this.
Spooky23
2 days ago
You’re talking about 1,000 FBI agents locked in a building. There’s no printer.
ffsm8
3 days ago
Depending on their technical capability, yes.
I mean even in this thread you got what are essentially one-liners to do it.
Definitely less hassle than doing it irl
streetfighter64
2 days ago
How big a percentage of FBI / DoJ employees are running Linux (with ImageMagick) on their work computer? I'd be surprised to see a similar one-liner for a stock Windows installation.
Yeah they might have used some web converter, but that on the other hand would have been extremely incompetent handling of the secret data.
1718627440
2 days ago
Installing MSYS2 is a matter of a few minutes. There is also WSL, and macOS features a POSIX shell. ImageMagick is likely already installed as a dependency of something, as ffmpeg often is.
mikkupikku
2 days ago
I know I'm not the brightest bulb by any measure, but do some people really take less than at least a few minutes to come up with one-liners for problems as novel as graphical transformations to PDFs? Maybe if the presumed techie hacker / federal worker took it as an amusing challenge I could see this being done, but genuinely out of pure laziness? That's incredible if true.
naniwaduni
2 days ago
It's not a novel problem. But yes, I don't think people quite appreciate how quick and easy it is for people who are in the habit of brewing up one-liners to solve simple problems to do that. I've done it here on HN for jq toy problems before, and I don't really doubt there are people similarly familiar with imagemagick.
vlovich123
2 days ago
It’s a mix of “they’ve done it many times before” and these days AI. But remember the “they’ve done it many times before” just means that in a technical and popular forum you’re likely to find the handful of people who have done so regularly enough to remember the one liner. Also this is probably easily searchable as well so even prior to AI not super hard.
jeltz
2 days ago
There is nothing novel about it. I saw at least one person say that they have done exactly the same thing out of laziness.
breppp
2 days ago
I am only guessing that they had to remove the document from a classified network in a way where data won't possibly leak
draw_down
3 days ago
[dead]
zoky
2 days ago
Such a weird way to do it when it would be vastly easier to just blow the document out to paper and re-scan it.
brazzy
2 days ago
Vastly easier when you do it to one or a handful of documents.
But if you want to do it to 2000 documents...
fc417fc802
2 days ago
But at that point why bother with the fakery? Why does it matter if it's obviously of digital origin? As long as it's rendered down to an image, problem solved.
Was the motivation for this benign (an employee skirting regulations) or malicious?
pbhjpbhj
2 days ago
4 reams (4×500) is hardly a lot for commercial equipment to handle - paper trays will take a ream at a time. Document analysis would still show some shenanigans were in play, but you'd get a bit of variation at least.
userinexperienc
2 days ago
[dead]
hiccuphippo
2 days ago
I mean, I do that all the time when they ask me to print something, sign it, and then scan it.
Sign a blank paper, scan it, paste the original doc on it. Then keep the scan for future docs.
foxglacier
2 days ago
An easier trick I've used is just sign directly on the computer screen over the displayed document with a whiteboard marker and take a photo with my phone.