Show HN: A local-first, reversible PII scrubber for AI workflows

23 points, posted 11 hours ago
by tjruesch

5 Comments

minixalpha

13 minutes ago

I'd like to know if there's a tool that can automatically replace sensitive information before I paste content into ChatGPT, and then automatically restore the sensitive information when I copy the results from ChatGPT. The logic for both "replacement" and "restoration" should be handled locally on my computer.

welcome_dragon

2 hours ago

Reversible as in you can re-identify the data? That doesn't sound secure.

bigiain

44 minutes ago

The post addresses that:

Security First

Because the “PII Map” (the link between ID:1 and John Smith) effectively is the PII, we treat it as sensitive material.

The library includes a crypto module that forces AES-256-GCM encryption for the mapping table. The raw PII never leaves the local memory space, and the state object that persists between the masking and rehydration steps is encrypted at rest.

I've bookmarked this as inspiration for a medium- to long-term project I'm considering: taking dumps of our production database and automatically anonymizing them (one way). That means replacing all names with meaningless but semantically representative placeholders (gender-matched where obvious, perhaps Alice, Bob, Mallory, Eve, or Trent, and gender-neutral ones like Jamie or Alex where suitable), using similar techniques to rewrite email addresses (alice@example.org, bob@example.com, mallory@example.net), and doing the same for addresses, place names, and whatever else can be pulled out with Named Entity Recognition. I suspect I'll generally be able to do a higher-accuracy version of this, since I'll have an understanding of the database structure, and we're already in the process of adding metadata about table and column data sensitivity. I will definitely be checking out the regexes and NER models used here.
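
For what it's worth, here is a rough sketch of how the one-way replacement could work. This is my own guess at the approach, not anything taken from the linked library. The names `pseudonymize_name` and `pseudonymize_email` and the two lists are made up for illustration. It uses a keyed hash (HMAC-SHA256) so that the same input always maps to the same placeholder within one dump, and it assumes the key is thrown away afterwards so the mapping cannot be recomputed. Gender matching is not handled here.

```python
import hashlib
import hmac
import secrets

# Hypothetical placeholder pools; any list of neutral names and reserved domains works.
PLACEHOLDER_NAMES = ["Alice", "Bob", "Mallory", "Eve", "Trent", "Jamie", "Alex"]
RESERVED_DOMAINS = ["example.org", "example.com", "example.net"]


def _digest(key: bytes, value: str) -> bytes:
    """Keyed hash of the normalized value (trimmed, lowercased)."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).digest()


def pseudonymize_name(key: bytes, name: str) -> str:
    """Map a real name to a stable placeholder such as 'Bob-1a2b3c'."""
    d = _digest(key, name)
    base = PLACEHOLDER_NAMES[d[0] % len(PLACEHOLDER_NAMES)]
    # The hex suffix keeps different people from collapsing into one placeholder.
    return f"{base}-{d[1:4].hex()}"


def pseudonymize_email(key: bytes, email: str) -> str:
    """Map a real address to a stable placeholder on a reserved example domain."""
    d = _digest(key, email)
    domain = RESERVED_DOMAINS[d[0] % len(RESERVED_DOMAINS)]
    return f"user-{d[1:5].hex()}@{domain}"


# One random key per dump (assumption: it is discarded once the run finishes).
key = secrets.token_bytes(32)
print(pseudonymize_name(key, "John Smith"))
print(pseudonymize_email(key, "john.smith@corp.example"))
```

Because the placeholder is derived from the value rather than stored in a table, foreign-key-style relationships survive: the same customer gets the same placeholder in every table, and there is no map to protect once the key is gone.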

fluidcruft

44 minutes ago

My hope is it means it assigns coded identifiers and the key remains local. When the document returns, the identifiers can be restored. So the PII itself never leaves the premises.
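
Something like the following is what I have in mind, as a minimal sketch only. The regexes, the `<LABEL:n>` placeholder format, and the function names `mask` and `rehydrate` are my own assumptions, not the library's actual API. A real tool would also need NER for names and, as the post says, would encrypt the mapping rather than keep it in a plain dict.

```python
import re

# Hypothetical, deliberately simple patterns; real detection needs far more than this.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def mask(text):
    """Replace detected values with coded identifiers.

    Returns the masked text and the placeholder -> original map,
    which stays on the local machine.
    """
    mapping = {}  # placeholder -> original value (this map is itself PII)
    seen = {}     # original value -> placeholder, so repeats reuse one ID

    for label, pattern in PATTERNS.items():
        def substitute(match, label=label):
            value = match.group(0)
            if value not in seen:
                placeholder = f"<{label}:{len(mapping) + 1}>"
                seen[value] = placeholder
                mapping[placeholder] = value
            return seen[value]

        text = pattern.sub(substitute, text)
    return text, mapping


def rehydrate(text, mapping):
    """Put the original values back wherever a placeholder survived."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


original = "Contact alice@example.org or call +1 555-010-9999. Reply to alice@example.org."
masked, pii_map = mask(original)
print(masked)                       # only this string would be sent to the model
print(rehydrate(masked, pii_map))   # restored locally from the map
```

Only `masked` ever needs to leave the machine; `pii_map` is the sensitive part and is what you would encrypt at rest.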

handfuloflight

2 hours ago

This is an awesome share and development. Kudos!