Show HN: AI tool to scan internal docs for GDPR violations before audits

2 points, posted 11 hours ago
by kinottohw

Item id: 45761471

13 Comments

kingnothing

11 hours ago

You need to have compliance certifications or no one will use this. Think along the lines of SOC2, HIPAA, willingness to sign BAAs, etc. The hardest part of this company is going to be sales. You're not selling to small businesses who will pop in a credit card number -- this is an offering for enterprises with annual agreements and longer sales cycles.

Also, consider supporting CCPA for California businesses.

kinottohw

11 hours ago

Actually, we’re mostly targeting small companies (10–50 people) that need guidance to avoid big fines but can’t afford the bigger, full-featured compliance tools. Do you think there’s really no room for something like this in the market without having all the compliance certifications first?

pavel_lishin

11 hours ago

Wouldn't the act of allowing this service to scan your docs potentially violate compliance, if the data there does contain things that shouldn't leak?

hobofan

11 hours ago

Yup. Maybe the business model could be to automatically forward the offense to the sanctioning agency and take a cut of the penalty?

kinottohw

10 hours ago

We’re aiming more at helping teams spot issues early so they can fix them before any fines happen.

kinottohw

11 hours ago

You're right. For now we’re only testing with fake/synthetic data, so no real info is ever scanned. We’re already using local processing, encryption, and access controls to make sure everything stays compliant.

pavel_lishin

11 hours ago

But when I logged in, I got the option to integrate my Dropbox account.

kinottohw

11 hours ago

Yes, you can test with real docs. They get processed locally and nothing gets saved on our servers except the scan results, which are encrypted. We’ve been testing ourselves by connecting our own Dropbox/Google accounts using fake docs that simulate GDPR issues.

hobofan

11 hours ago

What do you mean? Your demo video clearly shows the document contents in the dashboard. From all I could see, the document contents would be processed by a cloud LLM.

Everything I see reads like you have a strange understanding of "local" and shouldn't be trusted with building such software.

kinottohw

11 hours ago

Yes, the document content is visible in the dashboard when you’re logged in, but it’s fetched at runtime from whichever integration you’re using (Dropbox, Google, etc.) and never stored on our servers. The cloud LLM just processes the document on the fly to spot potential issues. And the data you see in the demo is all fake.

kinottohw

3 hours ago

Yeah, you’re both right, it’s not “local” in the strict sense like running everything including the LLM in your browser. What I meant is that the docs are fetched at runtime and never stored on our servers. I’m totally open to ideas on how to make the setup better, even if it means tweaking the business model a bit.

pavel_lishin

10 hours ago

> The cloud LLM just processes the document on the fly

That... doesn't sound local, dude. "Locally" would mean that the LLM is actively running in my browser, and in my browser only, which is not what you're describing.

I understand that you're claiming that the documents aren't being stored permanently, but they're still being transferred to your servers, and their full contents are being read there by something.

hobofan

10 hours ago

So the data isn't processed locally.