> China's distillation labs
This notion that Chinese labs are merely distilling frontier models is quite an unwarranted slur. Those labs have published WAY more useful research than US labs on RL techniques, novel model architectures, training pipelines, etc. They have also hit intelligence-per-parameter densities that US labs have yet to attain.
Apart from that, merely training a model on outputs from another model, off-policy and without the logits, doesn’t really work that well.
The Chinese labs know how to build frontier-level models. GLM-5.2 shows that they no longer even need Nvidia chips to do it.
But have they? I understand that the Chinese side is illuminated and the American side is dark. I disagree that the Chinese labs have created anything that isn't already in an American research lab or production data center. Sure, the Chinese labs have published their findings, and not for nothing. But are they novel? Unlikely, imo.
I recently watched a video of one of these “Chinese models”; it kept insisting it was Claude when the user asked. Sorry, there’s no “slur” here, just legit suspicion.
> We are at the mercy of frontier labs for access to SOTA LLMs
I disagree with this use of SOTA, and this topic is why.
Anthropic and OpenAI have “cutting-edge” models. These are beyond the state of the art, but they are closed, secretive, and hard to quantify.
The “state of the art” is open-source, open-weights models that can be inspected, studied, shared and critiqued, because that is what is meant by “the art”: it is the knowledge and principles and evidence and materials available to all. The “state of the art” is the highest point of that.
I wish we could make this distinction and stop blessing two secretive, unverifiable, loss-making companies with so much power.
(Putting that aside, I suspect, without evidence, mind you, that the endless march toward improving models by making them bigger is not the solution anyway.)
Sorry, but I think your requirement that something only be “the art” if any arbitrary person can critique it is off.
The frontier labs are working on the state of the art, but it’s just art that you aren’t allowed to see.
Unfortunately.
It is work using the principles of the art, obviously.
But "state of the art" implies the highest state of general availability, not just in terms of access to some product, but of use of the ideas, concepts, methodologies etc.
Anthropic and OpenAI have "cutting edge" models; the state of the art is behind the cutting edge.
The state of the art is the best open source, open weights model available. More or less by definition.
I am probably tilting at windmills here.
The art is the standard engineering practices that go into building the thing.
It's the things you would be trained in as part of a bachelor's degree and some graduate coursework.