Anthropic requires 30 day data retention for Fable and Mythos

201 points, posted a day ago
by lebovic

84 Comments

pseudosavant

4 hours ago

It is actually worse than that. It is at least 30 days. There is an "almost" that is doing a ton of heavy lifting here: "deletion after 30 days in almost all cases". My read of that is they can hang onto data for as long as they want, even if they usually won't. And "all traffic" with an agentic harness is basically the entire codebase you work on.

> We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won’t use this data to train new Claude models, or for any non-safety-related purpose, and we’ve instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases (see this post for further details). The data will help us defend against complex and novel attacks (including new jailbreaks and attacks that operate across many requests) as well as help us identify and reduce false positives.

bagels

3 hours ago

How were they not already auditing access to customer data?

codebje

3 hours ago

They were not keeping it beyond the timeframe necessary for the model to process it, so there wasn't access there to audit.

eth0up

43 minutes ago

I cannot help wondering if the 'we won't train on your data' applies across the fence over there in pentagon land, where the classified contracts be. Yeah, of course they are not connected. Or..

Present-day user-LLM activity is a goldmine of intel that the agencies literally spent lives and billions trying to get, without coming close, yet they elect to just let this one slip by..

Maybe. Really, I don't dispute it.

But why? It's what, or precisely what, they always dreamed of.

tcp_handshaker

3 hours ago

Half of my customers will drop them right away, and the other half, after I explain to them what this means.

usef-

an hour ago

It's only for this model, not the one you're already using. And they're not training on the data. It's supposedly to detect abuse, etc. (such as someone retrying repeatedly with different variations to get around their protections).
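
To make the "retrying with variations" case concrete, here is a minimal Python sketch of why that kind of detection needs some retained history. Everything in it is an assumption for illustration: the `RetryVariationDetector` class, its parameters, and the use of `difflib` string similarity are hypothetical and are not Anthropic's actual method.

```python
from collections import defaultdict, deque
from difflib import SequenceMatcher


class RetryVariationDetector:
    """Hypothetical sketch: flag an account that keeps resubmitting
    near-identical prompts, which is only visible across requests."""

    def __init__(self, window=20, similarity=0.8, threshold=3):
        self.similarity = similarity  # ratio above which two prompts count as variations
        self.threshold = threshold    # number of earlier near-duplicates that triggers a flag
        # Per-account history of recent prompts; this is the "retention".
        self._history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, account_id, prompt):
        """Record a prompt and return True if it looks like a repeated retry."""
        history = self._history[account_id]
        near_duplicates = sum(
            1
            for old in history
            if SequenceMatcher(None, old, prompt).ratio() >= self.similarity
        )
        history.append(prompt)
        return near_duplicates >= self.threshold
```

With zero retention the `_history` deque would always be empty, so a check like this could never fire; that is the trade-off being argued about here.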

gmerc

19 minutes ago

Yet

usef-

12 minutes ago

Maybe, but to do so they'd need to offer new terms of service and we'd have to accept. I believe they'd lose a lot of their core business market if they did so.

vntok

2 hours ago

You must have very unrepresentative customers. What will they use?

bethekidyouwant

3 hours ago

Even worse: when you git push something, Microsoft gets all your code!

dannyw

2 hours ago

Yes, that is the intended purpose of “git push”: it’s to save. And only if you use GitHub.

A better analogy here is probably “every time you use VS Code, the files you edit get sent to Microsoft”.

Some legitimate concerns:

• You have trade secrets. Previously, you could use services like Bedrock, etc., with signed contracts and significant reputations. Your contract is between AWS and you, and stays within your AWS security boundary.

• Security breaches. Remember when Anthropic accidentally published the source tree of Claude Code? Or Meta’s recent AI recovery bot that didn’t check if the supplied recovery email was actually the email of the Instagram account? The best way to reduce your exposure is to minimise storage.

• Weaponised T&S. For example, what if Anthropic decided to build a classifier for “usage in unsupported regions” that’s super overbearing (as we see with Fable) and vacuums up all context/input/output if there’s Mandarin? Contractually they could now retain it forever, not just 30 days, for ‘trust and safety purposes’ and perhaps have AI scan for any new or interesting ML techniques at scale, for Anthropic’s own use? They only say they can’t train Claude models on the data.

layer8

3 hours ago

Only if you push it to GitHub.

tcp_handshaker

3 hours ago

That is why, for the last five years, I have been checking in code of the most atrocious quality with them. So far... it's working....

vntok

2 hours ago

Thank you for your service.

IFC_LLC

7 minutes ago

Anthropic is desperate for the IPO and will release a half-baked product that they are so afraid to release, you can literally feel the shiver through the text of their press release.

Now they want some way of either fixing it or, in case someone actually makes a big boo-boo with their model, being able to blame the guy in the end.

connorboyle

4 hours ago

A startup that uses agentic coding tools such as Claude Code or Codex is packaging up their entire codebase and sending it directly to their LM provider. Depending on their product, they might be sending it directly to a potential competitor.

Odd times we are living in!

ai-x

4 hours ago

people over-rate how much software/IP is useful in running a successful business. There is genuinely very little IP in this world that needs to be protected. Everyone else is running stupid CRUD apps.

They also over-index on the fear of LargeCo stealing IP from SmallCo. In fact, LargeCo is typically more scared about even the possibility of any product team looking at competitor internals due to lawsuits.

hnlmorg

3 hours ago

I’d be more scared of a data leak due to LargeCo being hacked than I would about LargeCo prying into the data.

What I don’t trust LargeCo with is personal information. I’ve heard too many horror stories about Govs and LargeCos swapping customer nudes or stalking exes to be comfortable with anything personal on those systems. But that’s a whole different topic.

switchbak

3 hours ago

LargeCo is probably struggling under the weight of technical debt and organizational challenges/politics.

I bet if you gave them the Codebase of the Gods, it’d be a heap of hacks inside a couple months.

tsunamifury

2 hours ago

You could not be more wrong in the aggregate.

This is literally how LLMs will continue to learn to code and easily replace whatever you build with them.

Incredible that you could so blithely misunderstand this.

sly010

3 hours ago

> people over-rate how much software/IP is useful in running a successful business

Indeed, by a couple of trillion...

noncoml

2 hours ago

How can you make such bold and generic claims without some data backing them up?

ai-x

an hour ago

Actuaries look for data. Visionaries take leaps of faith. There was no data proving LLMs would work at scale. Google waited for the data. OpenAI and then Anthropic took the leap of faith. The result is there for all to see. The core attribute of a successful AI researcher was whether they were AGI-pilled, not whether they were waiting for data on unknown unknowns.

bob1029

2 hours ago

Trust and liability are the actual currency in a software business.

Your email domain is significantly more important than whatever is in your corporate GitHub repositories.

drchaim

4 hours ago

and all their keys, because sooner or later, the harness is gonna read them

ai-x

4 hours ago

One company's irrational fear is a competitive advantage for someone else.

skybrian

3 hours ago

Yes, it certainly is an odd situation when some people believe you cannot use Mythos-class models because security while others believe you must do code reviews with Mythos-class models because security.

Ifkaluva

3 hours ago

Not just “a startup”! Also, famously, Meta, with their AI usage dashboards.

stainablesteel

19 minutes ago

they would kill their own product if they did this

it would be like if TSMC started designing their own chips to compete with the people they sell their services to; they have more to gain by limiting their participation to a specific corner

kingcauchy

9 minutes ago

« Trust us, we’re doing this for the good of humanity » (fills pockets with stock value and externalities from data center pollution) « No seriously, trust us, at least we’re not Sam Altman »

Update: « Oh and we’re the only ones who will stop AI from turning into SkyNet and eating your babies, you just have to pay us to make sure we invent SkyNet first »

throwaway85825

12 minutes ago

Given the model intelligence plateau and public data exhaustion the only way to improve in customer use cases is by training the model on customer data.

borissk

3 minutes ago

If this is true, then Anthropic, Google and maybe OpenAI models will keep getting better and better and everyone else will be left in the dust, as they won't have access to so much customer data.

samuelknight

2 hours ago

And by Fable they really mean Opus 4.8, because every mundane workflow or chat I try to use it in will eventually drop to Opus.

Sol-

3 hours ago

Fortunately I can't use Fable anyway, since their hyperactive content flaggers do not let you work on anything remotely biological or medical-related (e.g. parse a CSV with some medical content: nope, you're probably a bioterrorist) and you get downgraded to Opus immediately.

nmfisher

2 hours ago

I'm not even working on anything biological/medical, almost all PyTorch work is getting flagged (not even a safety notice and a downgrade, just an outright refusal with "this is against our ToS").

torginus

2 hours ago

My 2 cents is that doctors are people with lots of money and very specific needs who generally don't really go for tech jobs, so they're probably planning to create a separate monetization tier.

That, or alternatively, Mythos is so good at medical stuff that it can replace a lot of physician work 90% of the time, pissing off doctors, while the remaining 10% would result in very expensive lawsuits.

peyton

3 minutes ago

More likely whoever they’re consulting is protecting their own bags.

DrewADesign

2 hours ago

> That, or alternatively, Mythos is so good at medical stuff that it can replace a lot of physician work 90% of the time, pissing off doctors

Well they definitely don’t give a teaspoon of shit about putting people out of work by hawking munged-up versions of those people’s data, which was involuntarily ‘ingested’ for the benefit of society (in a way that happened to fuel a centabillion dollar industry.) So it’s prolly not that one.

pbgcp2026

3 hours ago

Yes! I have hit the same brick wall. What sort of idiots are doing this? Honestly, I have no idea. And just before their IPO. So far Anthropic's marketing has been perfect and spotless. This is a serious slip-up.

sigmar

an hour ago

It's temporary. From the Fable blog post:

>To release the model both safely and quickly, we’ve tuned these safeguards conservatively—they’ll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions. With more capable models arriving in the coming months, we’re working to improve our safeguards and reduce false positives as quickly as we can.

solenoid0937

3 hours ago

It's good they're being overcautious here. The alternative is far worse.

siva7

3 hours ago

The alternative of... saving lives?

anigbrowl

an hour ago

They don't want the real risk of someone using it to make biological or genetically targeted weapons, and they don't want the social risk of someone asking it a bunch of leading questions in order to 'prove' some racist thesis or to 'prove' Mythos is woke if it declines to go along with their performative inquiry.

Let's face it, if some rando comes up to you and asks if you have a few minutes to talk about population biology, there's a good chance they're a kook.

airstrike

3 hours ago

Will someone think of the children

matheusmoreira

an hour ago

Pretty incredible just how much good will Anthropic managed to burn.

shusaku

14 minutes ago

Are they really burning good will? For many users this is a deal breaker. But for the general public, politicians, etc they’re stamping “safety” on their brand.

matheusmoreira

10 minutes ago

Surveillance is always advanced as a safety measure.

abofh

12 minutes ago

Lawyers are gonna be making this a legal quagmire for years. Even after it gets retracted.

crazylogger

40 minutes ago

Didn’t they all but admit they’ve been storing and actively looking at requests with this post: https://www.anthropic.com/news/detecting-and-preventing-dist... ?

If they weren’t storing, they’d be oblivious to what customers are doing, making this kind of detection impossible. What data did they train their classifier on, if not real user (distiller) traffic?

cowsandmilk

15 minutes ago

Why can’t they have trained the classifier on internal red teaming?

buzer

an hour ago

Mentioned in the earlier topic as well, but one very important point here is that it looks like Anthropic is becoming a GDPR controller for all submitted data for this model (when they are in GDPR scope, anyway). So data subjects would have the Article 15 right to request information about processing and possibly a copy of the data. The latter might be contested under "rights of others", but the former is more absolute.

What this means is that if someone makes an Article 15 request, they would be entitled to know, at minimum, whether Anthropic holds personal data about them and from whom it received this data.

If someone wants to do that, I would recommend combining it with an Article 18 request to forbid deleting the data, on the grounds of a legal claim, in case you contest Anthropic's reply. Otherwise they could just delete the data per their retention policy, and the data protection authority would find much later that they no longer hold the data.

Another issue here is that their data processing agreement frames everything as controller-to-processor, i.e. they do not appear to have SCCs in place to actually receive this personal data as controller. So the original exporter would likely also be in breach if they send any GDPR-covered personal data to this model.

OkWing99

44 minutes ago

I remember the "Don't be evil" days from Google. At some point most morals change with enough money.

giancarlostoro

3 hours ago

Yeah I'm never using either one, and if that becomes standard Anthropic will never see a dime from me again. I'm going to draw the line in the sand right there.

thekevan

4 hours ago

So if you are under an NDA, does this violate it?

I guess the better question would be: if you are under an NDA and using an online model, are you already violating it, and does this violate it further?

FiloSottile

3 hours ago

In the same way that using Gmail and Dropbox and iCloud and Notion violates it. (Which IANAL but for most NDAs would be not at all.)

layer8

3 hours ago

I never had an NDA permit such usage.

FiloSottile

3 hours ago

Your NDAs prohibit emailing a colleague about, e.g., the project, or discussing it in a Slack DM with the client, or tracking progress on it in JIRA? You have to do NDA’d work exclusively with local tools or end-to-end encryption? Those are some difficult NDAs!

layer8

3 hours ago

We use in-house, on-premises email, issue tracking, and messaging. Depending on the project, external communication does require E2EE email. Development happens on local hardware and software unless required otherwise by the customer.

FiloSottile

2 hours ago

I’m pretty sure (even just based on the revenue of various SaaS products) that’s not typical, hence “most NDAs”. I’m also sure some require a SCIF, but that’s not most of them.

keithnz

4 hours ago

The real risk is using it at all, as you are already sending them your data. If you are OK with that, then this retention/review seems OK.

pbgcp2026

3 hours ago

There were two (expensive) exceptions / alternatives so far: Bedrock and Vertex. Their Zero Data Retention was in fact contractually enforced. Now it is all f...d because of these morons at Anthropic. For now I am better off just using DS via their API.

This is just a tragic moment for Tech. We just killed AI privacy. OpenAI already follows this trend and others will too.

The only hope now is ... tada .. Mistral LOL

osti

3 hours ago

Hmmm no? The only way is to deploy your own local model; using anyone else's, you are at their whim as to what happens to your data.

dannyw

2 hours ago

It’s not binary. With AWS, you previously had contractual guarantees with a third party that’s been in business for a couple of decades, which explicitly stated zero seconds of data retention: only as long as needed for inference.

Consider the security angle too. You now have to rely on Anthropic’s infrastructure security. You did not previously when you used Bedrock/Vertex/etc.

Daedren

3 hours ago

From a personal-use perspective, yes. The big issue here is enterprise and existing contracts, as surely most companies will have signed zero-retention agreements.

Vortex777

3 hours ago

I mean, it's not just the 30-day data retention part; I think the serious trade-off of this product is token efficiency. They trade it for precision. The claim they make, that it found a 30-year-old software bug in millions of lines of code, is just precision. To a human it looks like a lot, but for the model it's just the ability to process (token processing). Let's see how long it runs. Peace.

catigula

4 hours ago

Then don’t use it.

kccqzy

4 hours ago

That’s exactly what my employer has communicated. It will not be allowed.

ai-x

3 hours ago

Step 1: Find all companies that refuse to use, or ban, SOTA models out of irrational fear.

Step 2: Use SOTA models to copy them and crush them.

Step 3: Profit.

(Yes, not every business is easily replicable, but you sure can find some)

pbgcp2026

3 hours ago

This. And AI labs seem to be above IP / Copyright law and absolutely nothing will happen to them when they grab all the data and package it up.

applfanboysbgon

3 hours ago

Can you name a single example of a business that has been replaced by another business leveraging LLMs to copy and "crush" their software?

pbgcp2026

3 hours ago

Pretty much any Chinese business. (Except takeouts and laundries)

Wowfunhappy

3 hours ago

Step 4, get sued because you violated an NDA or other regulation?

ai-x

3 hours ago

I'm not talking about Claude copying.

I'm talking about scouring Twitter/LinkedIn and looking at posts from employees who say a SOTA model is banned. Look at what the business does. Copy it using SOTA. Call their clients with a 30% discount, faster turnaround, and a higher-quality product.

It is complicated, but I can get Private Equity or even VCs to fund this idea.

tl;dr -- I'm actually agreeing with you. Anthropic will never copy your business model due to NDA. But there is plenty of fearmongering about them copying you, because of which you won't use their models. If their models are genuinely SOTA, you can use that information to your advantage and crush scaredy-cats.

Edit: The fact that these get downvoted is exactly the reason why it's easy to win

bandrami

4 hours ago

I mean, this is the biggest reason for my employer's position.

lvl155

4 hours ago

I actually think that’s warranted. And if you used it to poke around, you would also agree.

unshavedyak

3 hours ago

> And if you used it to poke around, you would also agree.

Would you elaborate? Not sure what you're describing

anigbrowl

an hour ago

All the pre-publicity from Anthropic was about how it was amazing at finding security vulnerabilities, so it's not a stretch to think that some people would want to exploit that for nefarious purposes.

zb3

3 hours ago

What an annoying company, I wish it didn't exist..

pbgcp2026

3 hours ago

All I can say to my team (and my clients): "f...k Anthropic". They've just put both Bedrock and Vertex on the slippery slope of "we don't collect your prompts. period. ... comma ... except ..."

Right now we have changed the code of all our agents to data retention mode 'none' (Note: not "default" or "inherited", this is not enough now!) and we are fighting with GCP doco to set similar things for Vertex.

This is just terrible.