Confirmed: Reflection 70B's official API is a wrapper for Sonnet 3.5

250 pointsposted 5 days ago
by apsec112

65 Comments

scosman

5 days ago

Context: someone announced a Llama 3.1 70B fine tune with incredible benchmark results a few days ago. It's been a dramatic ride:

- The weight releases were messed up: released Lora for Llama 3.0, claiming it was a 3.1 fine tune

- Evals initially didn't meet expectations when run on released weights

- The evals starting performing near/at SOTA when using a hosted endpoint

- Folks are finding clever ways to see what model is running on the endpoint (using model specific tokens, and model specific censoring). This post claims there's proof it's not running on their model, but just a prompt on Sonnet 3.5

- After it was caught and posted as being Sonnet, it stop reproducing. Then others in the thread claimed to find evidence he just switched the hosted model to GPT 4o using similar techniques.

Lots of mixed results, inconsistent repos, and general confusion from the bad weight releases. Lots of wasted time. Not clear what's true and what's not.

ga6840

5 days ago

Who is Sahil Chaudhary? Why he doesn't announce such a great advancement himself? Why Matt Shumer first announces it only because -- according to a later claim on X.com -- he trusted Sahil, does that mean Matt is unable to participate most of the progress? Then why announce a breakthrough without mentioning he was not fully involved to a level he can verify the result in the first place?

jazzyjackson

5 days ago

One more reason not to pay attention to things that only seem to exist on x.com

numpad0

4 days ago

I recognize that surname from Twitter spams. Twitter has had financial rebates program for paying accounts for a while, and for months tons of paid spam accounts have been reply squatting trending tweets with garbage. Initially they appeared Sub-Saharan African, but the demographic seem to be constantly shifting eastward from there for some reason, through the Middle East and now around South-Indian/Pakistani regions. This one and variants thereof are common one in the Indian category among those.

Maybe someone got lucky with that and trying their hands at LLM finetuning biz?

sumedh

4 days ago

Matt and Sahil did an interview and it was mostly Matt doing the talking while Sahil looked like a hostage forced by Matt to do the interview.

GaggiX

5 days ago

When they were using the Sonnet 3.5 API, they censored the word "Claude" and replaced "Anthropic" with "Meta", then later when people realized this, they removed it.

Also, after GPT-4o they switched to a llama checkpoint (probably 405B-inst), so now the tokenizer is in common (no more tokenization trick).

vertis

4 days ago

Yeah I managed to get it to admit that it was Claude without much effort (telling it not to lie), and then it magically stopped doing that. FWIW Constitutional AI is great.

wis

4 days ago

They implemented the censoring of "Claude" and "Anthropic" using the system prompt?

Shouldn't they have used simple text replacement? they can buffer the streaming response on the server and then .replace(/claude/gi, "Llama").replace(/anthropic/gi, "Meta") on the streaming response while streaming it to the client.

Edit: I realized this can be defeated, even when combined with the system prompt censoring approach.

For example when given a prompt like this: tell me a story about a man named Claude...

It would respond with: once upon a time there was a man called Llama...

nacs

4 days ago

> Shouldn't they have used simple text replacement?

They tried that too but had issues.

1) Their search and replace only did it on the first chunk of the returned response from Claude.

2) People started asking questions that had Claude as the answer like "Who composed Clair de lune?" for which the answer is supposed to be "Claude Debussy" which of course got changed to Llama Debussy, etc.

It's been one coverup-fail after another with Matt Shumer and his Reflection scam.

DebtDeflation

4 days ago

I was following the discussion on /r/LocalLlama over the weekend. Even before the news broke that it was Claude not a Llama 3.1 finetune, people had figured out that all Reflection really had was a custom system prompt telling it to check its own work and such.

JohnMakin

4 days ago

Would really like to see this float back to the front page rather than getting buried 4+ deep despite its number of upvotes - this is very significant and very damning, and this guy is a real big figure apparently in the AI "hype" space (as far as I understand - that stuff actually hurts my brain to read so I avoid it like the plague).

Evidence I find damning that people have posted:

- Filtering out of "claude" from output responses - would frequently be a blank string, suggesting some manipulation behind the scenes

- errors in output caused by passing in <CLAUDE> tags in clever ways which the real model will refuse to parse (passed in via base64 encoded string)

- model admitting in various ways that it is claude/built by anthropic (I find this evidence less pursuasive, as models are well known to lie or be manipulated into lying)

- Most damning to me, when people were still playing with it, they were able to get the underlying model to answer questions in arabic, which was not supported on the llama version it was allegedly trained on (ZOMG, emergent behavior?)

Feel free to update this list - I think this deserves far more attention than it is getting.

JohnMakin

4 days ago

adding - tokenizer output test showed consistency with claude, this test is allegedly no longer working

salomonk_mur

5 days ago

It's amazing what people will do for clout. His whole reputation is ruined. What was Schumer's endgame?

throw_me_uwu

4 days ago

But does reputation work? Will people google "Matt Shumer scam", "HyperWrite scam", "OthersideAI scam", "Sahil Chaudhary scam", "Glaive AI scam" before using their products? He wasted everyone's time, but what's the downside for him? Lots of influencers did fraud, and they do just fine.

petercooper

4 days ago

Sure, it's complicated. The core of the AI world right now isn't that large and in many ecosystems it's common for people to speak to each other behind the scenes and to learn about alleged incidents regarding individuals in the space. Such whispering can become an impediment for someone with a "name" in a space, even if not necessarily a full loss of their reputation or opportunities.

Der_Einzige

4 days ago

Except that bad PR is good PR. Trump proves this daily. Terry A Davis proved this in the context of tech (he coined the term "glowie" in its original racially charged usage in addition to temple OS). If Chris Chan ever learned to code to make the Sonichu game of their dreams, I'm sure that there would be a minor bidding war on their "talent"

petercooper

4 days ago

Yeah, that can certainly be true! Those sorts of people really seem to lean into their quirky reputation, though. I'm not sure someone could do so well in an academic or engineering discipline and maintain a broad level of professional respect with that approach?

consp

4 days ago

> Lots of influencers did fraud, and they do just fine.

Since the current created legal landscape does not punish fraudsters they keep doing it and succeeding. Same thing as society allowing people to fail upward.

moralestapia

4 days ago

This may sound harsh but it's true.

You could do shit things and still come out with people perceiving you as a "winner"; because you got money, status, whatever you wanted, e.g. Adam Neumann. This is "fine" because people want to associate themselves with winners.

Or, you could do pretty much the exact same thing but come out looking as an absolute loser; e.g. SBF, this guy, etc... This is terrible as people do not want to be associated with losers.

IMO, this guy's career is dead, forever.

ipsum2

5 days ago

It's also amazing that GlaiveAI will be synonymous with fraud in ML now, because an investor decided to fake some benchmarks. The founder of GlaiveAI, Sahil Chaudhary also participated in the creation of the model.

vertis

4 days ago

I wonder if the other investors will sue.

joegibbs

5 days ago

That's what I'm wondering. Did he think that nobody would bother checking it? Then he was saying all that stuff about the model being "corrupted during upload" - maybe he didn't think it was going to get as much traction as it did?

JTyQZSnP3cQGa8B

5 days ago

I doubt it considering he’s been overselling his scam all over LinkedIn.

michaelt

4 days ago

Plenty of people have scammed their way to the top of the benchmark league tables, by training on the benchmarking datasets. And a lot of the people who do this just get ignored - they don't take much heat for it.

If the scam hadn't gained enough publicity for people to start paying attention, he would have gotten away with it :)

K0balt

4 days ago

But not really, which is what confuses the heck out of me. Thousands of people downloaded and used the model. It obviously wasn’t spectacular.

It’s like claiming to have turned water into wine, then giving away thousands free samples all over the world (of water) so that everyone instantly knows you’re full of crap.

The only explanation I can imagine for perpetrating this fraud is a fundamental misunderstanding that the model would be published for all to try?

I just can’t wrap my head around the incentives here. I guess mental illness or vindictive action are possibilities?

Hard to imagine how this plays out.

blackeyeblitzar

5 days ago

I haven’t followed this story. What did he do that ruined his reputation? The story link here is broken for me.

postalcoder

5 days ago

An AI engagement farmer on twitter claimed to create a llama 3.1 fine tine, trained on "reflection" (ie internal thinking) prompting that outperformed the likes of Llama 405B and even the closed source models on benchmarks.

The guy says that the model is so good because it was tuned on data generated by Glaive AI. He tells everyone he uses Glaive AI and that everyone else should use it too.

Releases the model on HF, is an absolute poopstorm. People cannot recreate the stated benchmarks, the guy who released the model literally said "they uploaded it wrong". Pretty much turns to dog-ate-my-homework type excuses that don't make sense either. Turns out people find it's just llama 3.0 with some lora applied.

Then some others do some digging to find out that Glaive AI is a company that Matt Schumer invested in, which he did not disclose on Twitter.

He does a holding pattern on Twitter, saying something to the effect of "the weight got scrambled!" and says that they're going to give access to a hosted endpoint and then figure out the weight issue later.

People try out this hosted model and find out it's actually just proxying requests through to anthropic's sonnet 3.5 api, with some filtering for words like "Claude".

After he was found out, they switch the proxy over to gpt 4o.

The endgame of this guy was probably 1. to promote his company and 2. to raise funding for another company. Both failed spectacularly, this guy is a scammer to the nth degree.

Edit: uncensored "Glaive AI".

ipsum2

5 days ago

This is accurate, but you don't need to censor GlaiveAI. They helped create the model. They're complicit in the scam.

postalcoder

5 days ago

I took out Glaive so as not to give them free publicity – all I did was mess up the formatting of my comment.

And yes, you're correct. Glaive employee(s) contributed to the model uploaded on HF.

resource_waste

4 days ago

All press is good press.

The dude has 15 minutes of fame and can capitalize on it.

0cf8612b2e1e

5 days ago

Recent thread: https://news.ycombinator.com/item?id=41459781

Author’s original (soon to be deleted tweet?)

  I'm excited to announce Reflection 70B, the world’s top open-source model.

  Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes.

  405B coming next week - we expect it to be the best model in the world.

loop22

5 days ago

A much better summary is this Twitter/X thread: https://x.com/RealJosephus/status/1832904398831280448

esperent

5 days ago

How does one read this without a Twitter account? I only see one post.

RONROC

4 days ago

Wait till some idiot reposts it on Mastodon lol

omega3

4 days ago

> My name rhymes with "odd" and starts with the third letter of the alphabet 3. I share my name with a famous French composer (C*** Debussy)

Hilarious.

crimsoneer

4 days ago

Have we got a "confirmed" from someone reputable/trustworthy yet? Like, it looks pretty compelling to me but I'm not sure I trust this mess of reddit posts/twitter threads/unsourced screenshots from people I don't know yet...

cpthammer

4 days ago

Schumer (or whoever was doing the updates) was continuously changing the api to dodge accusations. I personally replicated the evidence (most critically, the claude meta tag prompt injection) but have no way to prove anything now that it is down.

zozbot234

5 days ago

Okay, let's think this through step by step. Isn't 'reflection thinking' a pretty well known technique in the AI prompt field? So this model was supposed to be so much better... why, exactly? It makes very little sense to me. Is it just about separating the "reflections/chain of thoughts" from the "final output" via specific tags?

energy123

5 days ago

Even though this was a scam, it's somewhat plausible. You finetune on synthetic data with lots of common reasoning mistakes followed by self-correction. You also finetine on synthetic data without reasoning mistakes where the "reflection" says that everything is fine. The model then learns to recognize output with subtle mistakes/hallucinations due to having been trained to do that.

baegi

4 days ago

But wouldn't the model then also learn to make reasoning mistakes in the first place, where in some cases those mistakes could have been avoided by not training the model on incorrect reasoning?

Of course if all mistakes are corrected before the final output tokens this is fine, but I could see this method introducing new errors altogether.

jazzyjackson

5 days ago

Supposedly was not just prompted to use reflection, but fine tuned on synthetic data demonstrating how to use the <|thinking|> tokens to reason, what self correction looks like etc

imtringued

5 days ago

The problem with LLMs is that they struggle to generalize out of distribution. By training the model on a sequence of semantically tagged steps, you allow the model to stay in the training distribution for a larger amount of prompts.

I don't think it is 100% a scam, as in, his technique does improve performance, since a lot of the benefits can be replicated by a system prompt, but the wild performance claims are probably completely fabricated.

serjester

4 days ago

With this being a fraud, does anyone have opinions on the <thought> approach they took? It seems like an interesting idea to let the model spread its reasoning across more tokens.

At the same time it also seems like it’d already be baked into the model through RLHF? Basically just a different COT flow?

Havoc

4 days ago

Also noticed posts about it seemed to rise quite rapidly on Reddit. Might well be organic - Reddit is a crazy bunch - but had my doubts when I saw it.

nojvek

4 days ago

Welcome to AI hyperbole claims, just like Crypto hyperbole claims of 2020.

ben30

5 days ago

Milkshake duck

1123581321

4 days ago

This is pedantic, but a milkshake duck’s dark secret has no connection to its initial appeal.