ontouchstart
2 days ago
How do you prevent AI agents from scraping your data the same way you scrape HF?
For example, I can “cache” your page as a shared link in this comment:
https://www.openpaperdigest.com/paper/paperdebugger-a-plugin...
Or in a gist somewhere:
https://gist.github.com/ontouchstart/38d80cab66794014d17e193...
Then I can have a bot scrape these pages, with context, as training data.
Your inference costs could get out of hand this way, and then you would need VC money to sustain your website. I wish you the best of luck getting there.
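To make the scraping step concrete, here is a minimal sketch of what such a bot could look like, using only the Python standard library. The names `TextExtractor`, `html_to_text`, and `scrape` are hypothetical and made up for illustration; this is not the actual POC, and it assumes the cached pages are plain HTML reachable over HTTP.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Reduce an HTML document to its visible text, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def scrape(urls, fetch=None):
    """Return {url: text} for each URL.

    `fetch` is a hypothetical injection point (url -> HTML string) so the
    function can be tried offline; by default it downloads over HTTP.
    """
    if fetch is None:
        def fetch(url):
            with urlopen(url, timeout=10) as resp:
                return resp.read().decode("utf-8", "replace")
    return {url: html_to_text(fetch(url)) for url in urls}
```

Calling `scrape([...])` with the shared links would return the page text keyed by URL. Every such request is one the site has to serve, which is where the cost concern comes from.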
ontouchstart
2 days ago
The reason I said that is that I already have a POC that uses an LLM to go to a gist and do something with the date in it:
https://gist.github.com/ontouchstart/03f4c7ee853061772b479d9...