Paperless-ngx: scan, index and archive all your physical documents

41 points, posted 13 hours ago
by saikatsg

22 Comments

candiddevmike

9 hours ago

I wish there was some way to combine paperless-ngx with Google Docs-like tools. Being able to combine living documents and scanned versions would be very helpful. I currently just scan things and upload them to Google Drive as a way to centralize everything.

I suppose I could convert "finished" Google Docs to PDF and save them in paperless, but it just seems like these systems will always be disconnected in some way.

orastor

3 hours ago

I would pay for a FOSS paperless-ngx fork with support for running on a read-only filesystem of arbitrary file structure, giving me full-text search with OCR for images, PDFs, and ideally descriptions of video files.

pratio

9 hours ago

paperless-ngx is the successor of paperless and paperless-ng. Around that time I moved to https://teedy.io, which is also open source (https://github.com/sismics/docs) and also supports LDAP.

I've been itching to give paperless-ngx a shot because I just love it, but LDAP support hasn't ended up in the docs yet, even though the pull request was merged: https://github.com/paperless-ngx/paperless-ngx/pull/5190.

Regardless, I just love how this project just keeps coming back to life

candiddevmike

9 hours ago

As someone who is adding SSO to B2C apps, are you an LDAP or nothing kind of person or would you consider things with OIDC/OAuth integration too?

LDAP is such a pain in the ass to integrate with, and it seems like most things are going OIDC these days.

pratio

9 hours ago

Absolutely, would love OIDC/OAuth. I use https://goauthentik.io/. Teedy supports only LDAP so that's what I'm using right now.

candiddevmike

9 hours ago

Nice, thank you. I've been busy adding OIDC client support to a household management app (https://homechart.app) and I'm now adding support for making it an OIDC provider too. In theory, you'd already have accounts for all of your household members (ideally with TOTP or WebAuthn), so it should be a good identity provider.

I've been avoiding LDAP like the plague. I think MS is moving away from self-hosted AD, and LDAP really loses its luster for most folks when the self-hosted options are something like OpenLDAP.

vetinari

9 hours ago

OIDC is not really a replacement for LDAP. SAML2 could be, but OIDC in itself has no concept like group membership.

Kerberos, yes, but LDAP no.

What are your pain points integrating with LDAP? It is pretty simple.

candiddevmike

9 hours ago

OIDC _can_ have group memberships if the provider/client support it via claims.
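
A minimal sketch of what "group membership via claims" can look like on the client side. The claim name `groups` is an assumption: it is not part of the core OIDC spec, and providers name and shape it differently. The sketch also skips signature verification, which a real client must not do.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload_b64 = parts[1]
    # JWTs use unpadded base64url; restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def groups_from_claims(claims: dict, claim_name: str = "groups") -> list[str]:
    """Return group names from a claim. The claim name is provider-specific (assumed here)."""
    value = claims.get(claim_name, [])
    if isinstance(value, str):
        # Some providers send a single group as a plain string.
        return [value]
    return [str(group) for group in value]
```

An app would then map those group names onto its own roles, which is roughly what an LDAP group lookup gives you.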

LDAP is a pain because you have to expose/support a lot of knobs for integration (bind vs anonymous, secure vs unsecure, group format, root DNs, etc.). OIDC is (in theory) a lot simpler for the most part as the bare minimum is discovery URL, client ID, and client secret.
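
To make the "bare minimum" concrete, here is a rough Python sketch assuming the discovery document has already been fetched and parsed into a dict. `OIDCClientConfig` and `build_authorization_url` are hypothetical names for illustration; `authorization_endpoint` is the field from the discovery document, and the scope value is just an example.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class OIDCClientConfig:
    """The three settings an admin has to supply; the rest comes from discovery."""
    discovery_url: str
    client_id: str
    client_secret: str  # only used later, at the token endpoint


def build_authorization_url(discovery_doc: dict, config: OIDCClientConfig,
                            redirect_uri: str, state: str) -> str:
    """Build an authorization-code request URL from a parsed discovery document."""
    params = {
        "response_type": "code",
        "client_id": config.client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid profile",  # example scopes, adjust per provider
        "state": state,
    }
    # Simplification: assumes the endpoint has no query string of its own.
    return discovery_doc["authorization_endpoint"] + "?" + urlencode(params)
```

Compare that with LDAP, where each of the knobs listed above tends to become its own config field.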

CodeWriter23

8 hours ago

From their docs site:

> Documents are saved as PDF/A format which is designed for long term storage…[snip]

Can someone please tell me what attributes make a given file format more suitable for long term storage over another?

vibbix

8 hours ago

Everything the document needs (fonts, images, etc.) is stored within the document file. It's entirely self-contained.
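
A PDF/A file also declares its conformance level in its embedded XMP metadata (the `pdfaid:part` and `pdfaid:conformance` properties). A rough Python sketch for reading that declaration follows; `claimed_pdfa_level` is a made-up helper name, and it only reports what the file claims. It does not check that fonts are actually embedded, which needs a real validator such as veraPDF.

```python
import re
from typing import Optional

# XMP can serialize a property as an element (<pdfaid:part>1</pdfaid:part>)
# or as an attribute (pdfaid:part="1"); these patterns accept both forms.
_PART = re.compile(rb"pdfaid:part(?:>|=[\"'])\s*(\d)")
_CONFORMANCE = re.compile(rb"pdfaid:conformance(?:>|=[\"'])\s*([A-Za-z])")


def claimed_pdfa_level(pdf_bytes: bytes) -> Optional[str]:
    """Return the PDF/A level a file claims (e.g. 'PDF/A-2b'), or None if it claims none."""
    part = _PART.search(pdf_bytes)
    if part is None:
        return None
    level = part.group(1).decode()
    conformance = _CONFORMANCE.search(pdf_bytes)
    if conformance is not None:
        level += conformance.group(1).decode().lower()
    return f"PDF/A-{level}"
```

Usage would be something like `claimed_pdfa_level(open("scan.pdf", "rb").read())`, with `scan.pdf` standing in for one of your own files.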

MarioMan

8 hours ago

Among other things, it usually means that the file type has wide interoperability (which makes it more likely you can open it in the far future) and comes in a format resistant to damage, so if bits are changed or removed, you can still recover the rest of the document (usually this means avoiding compressed formats). As to how well-suited PDF/A is for these aspects, I'm not experienced enough to say.

einpoklum

9 hours ago

After scanning a document, how is it different than any other document I have as a file (other than it being not-very-editable)? i.e. is this a general-purpose document management system, or - what?

> The easiest way to deploy paperless is docker compose

Ok, that's a first red flag.

RockRobotRock

9 hours ago

Go ahead man, manually install and configure redis, mariadb, gotenberg, and tika to see if you like the software. It's a free country.

viraptor

9 hours ago

Not a general purpose one really, but it is a document management system. It's aimed at incoming mail. You get automatic OCR and learned classification / tagging / date finding.

And "docker compose up" is the easiest way to deploy things these days in general. That's got nothing to do with this software specifically.

RiverCrochet

9 hours ago

> After scanning a document, how is it different than any other document I have as a file (other than it being not-very-editable)?

You don't want to use paperless-ngx for editable stuff really. You want to use it for stuff like bills, invoices, and business records.

Once it's in paperless, it's searchable and you don't have to worry about where it is. As long as the scan is good, it will grab the OCR text and then you can search for things like an account number. My uncle basically scans everything bill-related into his instance and then shreds the paper.

You can also tag documents and search by tag. And since it's a web app, if you can do the self-hosted thing, it works well on the phone.
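
The same search can be scripted. Below is a rough Python sketch that only builds the request, based on my understanding of the paperless-ngx REST API: the `/api/documents/` path, the `query` parameter and the `Token` auth scheme are assumptions to check against the API docs for your version, and the host and token are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(base_url: str, token: str, query: str) -> Request:
    """Build (but do not send) a full-text search request against a paperless-ngx instance."""
    # Endpoint path and parameter name are assumptions; verify against your instance.
    url = f"{base_url.rstrip('/')}/api/documents/?{urlencode({'query': query})}"
    return Request(url, headers={
        "Authorization": f"Token {token}",
        "Accept": "application/json",
    })
```

Passing the result to `urllib.request.urlopen` would send it; the response should be JSON listing the matching documents.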

noncoml

9 hours ago

I have my printer set to scan and save the files to an NFS share. Paperless-NGX picks them up from there, does OCR and saves them. I guess I could just leave them on the NFS share, but I do like the UI of P-NGX.

maxace

5 hours ago

I have dedicated scanners at my 2 business locations with shortcuts to SFTP scans onto the server. paperless-ngx monitors the folder and automatically ingests the document. It's literally just two button presses and any document is digitized, tagged, OCR'd, and archived within about a minute. I have the scanners set the file name based on their location so I can tell at a glance where something came from in the paperless inbox view.
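
If a scanner can't set the file name itself, the same location-prefix trick could be done by a small script on the server. This is a hypothetical Python sketch, not the setup described above: `drop_into_consume_dir` and the `location_filename` naming scheme are made up, and it assumes the target is whatever folder paperless-ngx is configured to watch.

```python
import shutil
from pathlib import Path


def drop_into_consume_dir(scan: Path, consume_dir: Path, location: str) -> Path:
    """Copy a scan into the watched folder with a location prefix in its name."""
    target = consume_dir / f"{location}_{scan.name}"
    counter = 1
    # Avoid overwriting an earlier scan that has not been ingested yet.
    while target.exists():
        target = consume_dir / f"{location}_{scan.stem}_{counter}{scan.suffix}"
        counter += 1
    shutil.copy2(scan, target)
    return target
```

For example, `drop_into_consume_dir(Path("scan0001.pdf"), Path("/srv/paperless/consume"), "office-a")` would land as `office-a_scan0001.pdf`; both paths are placeholders.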

zeagle

2 hours ago

Any suggestion for a scanner for this purpose?

tga

7 minutes ago

Take a look at full-duplex multifunctional printers, many times they are cheaper than standalone scanners. Just as an example, a black and white laser like the Brother MFC-L2820DW should last you a long time.
