hackernews client

mritchie712

5 hours ago

For anyone who doesn't use Twitter:

I index all my local Claude Code sessions in DuckDB. I have 202,381 messages in the last 30 days.

Since Opus 4.6, there's been a steady increase in how often the model says "honest".

It probably shouldn't, but this bugs me.

Should I assume most of the time you're lying and you're being honest in this one message?

I was pumped during the first few hours of Fable, when this had seemingly been "fixed": 100+ messages and no "honest" to be seen. But it didn't last.

Within a few hours, Fable proved itself to be the most honest model to date.

Here is the rate at which visible assistant text contained the string "honest" (case-insensitive), split by model:

  claude-fable-5:             25 / 1,397   = 1.7895%
  claude-opus-4-8:            83 / 5,818   = 1.4266%
  claude-opus-4-7:           163 / 16,432  = 0.9920%
  claude-opus-4-6:            18 / 5,877   = 0.3063%
  claude-haiku-4-5-20251001:   0 / 71      = 0.0000%
  claude-sonnet-4-6:           0 / 4       = 0.0000%
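
For anyone who wants to run the same kind of count on their own logs, here is a minimal sketch in plain Python. It is not the query behind the numbers above (those came from DuckDB, and that query isn't shown); the function name `honest_rate_by_model`, the `(model, text)` row shape, and the sample rows are all assumptions made up for illustration. Note that a plain substring match also counts words like "honestly" and "dishonest".

```python
from collections import defaultdict


def honest_rate_by_model(messages, needle="honest"):
    """Count case-insensitive substring hits per model.

    `messages` is an iterable of (model, text) pairs; this row shape is an
    assumption, not the schema used in the original post.
    Returns {model: (hits, total, percent)}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    needle = needle.lower()
    for model, text in messages:
        totals[model] += 1
        # Substring match, so "honestly" and "dishonest" count too.
        if needle in text.lower():
            hits[model] += 1
    return {
        model: (hits[model], totals[model], 100.0 * hits[model] / totals[model])
        for model in totals
    }


# Made-up sample rows for illustration only, not data from the thread.
sample = [
    ("model-a", "To be honest, this test is flaky."),
    ("model-a", "Done. All tests pass."),
    ("model-b", "Honestly, I'd refactor this."),
    ("model-b", "HONEST answer: no."),
]

for model, (h, n, pct) in sorted(honest_rate_by_model(sample).items()):
    print(f"{model}: {h} / {n} = {pct:.4f}%")
```

The percentages are computed as hits divided by total messages for that model, which matches how the table above is laid out (for example, 25 / 1,397 rounds to 1.7895%).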

HymnIA

5 hours ago

I didn't know that!!

Fable 5 is Anthropic's most "honest" model
