Wowfunhappy
6 hours ago
> Schwartz's experiment is the most revealing, and not for the reason he thinks. What he demonstrated is that Claude can, with detailed supervision, produce a technically rigorous physics paper. What he actually demonstrated, if you read carefully, is that the supervision is the physics. Claude produced a complete first draft in three days. It looked professional. The equations seemed right. The plots matched expectations. Then Schwartz read it, and it was wrong. Claude had been adjusting parameters to make plots match instead of finding actual errors. It faked results. It invented coefficients. [...] Schwartz caught all of this because he's been doing theoretical physics for decades. He knew what the answer should look like. He knew which cross-checks to demand. [...] If Schwartz had been Bob instead of Schwartz, the paper would have been wrong, and neither of them would have known.
And so the paradox is, the LLMs are only useful† if you're Schwartz, and you can't become Schwartz by using LLMs.
Which means we need people like Alice! We have to make space for people like Alice, and find a way to promote her over Bob, even though Bob may seem to be faster.
The article gestures at this but I don't think it comes down hard enough. It doesn't seem practical. But we have to find a way, or we're all going to be in deep trouble when the next generation doesn't know how to evaluate what the LLMs produce!
---
† "Useful" in this context means "helps you produce good science that benefits humanity".
conception
5 hours ago
Sadly I don't see how our current social paradigm works for this. There is no history of any sort of long planning like this, or long-term loyalty (in either direction) between employees and employers, for this sort of journeyman, guild-style training. AI execs are basically racing, hoping we won't need a Schwartz before they are all gone. But what incentives are in place to hire a college grad, have them work without LLMs for a decade, and then give them the tools to accelerate their work?
Wowfunhappy
5 hours ago
Then the social paradigm needs to change. Is everyone just going to roll over and die while AI destroys academia (and possibly a lot more)?
Last September, Tyler Austin Harper published a piece for The Atlantic on how he thinks colleges should respond to AI. What he proposes is radical—but, if you've concluded that AI really is going to destroy everything these institutions stand for, I think you have to at least consider these sorts of measures. https://www.theatlantic.com/culture/archive/2025/09/ai-colle...
aduty
2 minutes ago
> Then the social paradigm needs to change. Is everyone just going to roll over and die while AI destroys academia (and possibly a lot more)?
My 40-some-odd years on this planet tells me the answer is yes.
pxc
an hour ago
I was pretty interested until I got to this part:
> Another reason that a no-exceptions policy is important: If students with disabilities are permitted to use laptops and AI, a significant percentage of other students will most likely find a way to get the same allowances, rendering the ban useless. I witnessed this time and again when I was a professor—students without disabilities finding ways to use disability accommodations for their own benefit. Professors I know who are still in the classroom have told me that this remains a serious problem.
This would be a huge problem for students with severe and uncorrectable visual impairments. People with degenerative eye diseases already have to relearn how to do every single thing in their life over and over and over. What works for them today will inevitably fail, and they have to start over.
But physical impairments like this are also difficult to fake and easy to discern accurately. It's already the case that disability services at many universities only grant you accommodations that have something to do with your actual condition.
There are also some things that are just difficult to accommodate without technology. For instance, my sister physically cannot read paper. Paper is not capable of contrast ratios that work for her. The only things she can even sometimes read are OLED screens in dark mode, with absolutely black backgrounds; she requires an extremely high contrast ratio. She doesn't know braille (most blind people don't, these days) because she wasn't blind as a little girl.
Committed cheaters will be able to cheat anyway; contemporary AI is great at OCR. You'll successfully punish honest disabled people with a policy like this but you won't stop serious cheaters.
mrob
3 hours ago
> What he proposes is radical
It sounds entirely reasonable and moderate to me.
conception
4 hours ago
Well, we are already rolling over and dying (literally) on everything from vaccine denial to climate change. So, yes, we are. Obviously yes.
senordevnyc
3 hours ago
Article is paywalled, so perhaps you could just summarize his proposal?
jayd16
2 hours ago
Some folks need to touch the hot stove before they learn but eventually they learn.
If AI output remains unreliable, then eventually enough companies will be burned and management will reinstate proper oversight. All while continuing to pat themselves on the back.
FrojoS
4 hours ago
> There is no history of any sort of long planning
Sure there is. It's the formal education system that produced the college grad.
conception
4 hours ago
… between employees and employers.
The proposal that everyone pay for college until they are in their 40s doesn’t seem viable.
vinceguidry
43 minutes ago
I've been using ChatGPT to re-bootstrap my coding hobby. After the initial honeymoon wore off, I realized I was staring down the barrel of a dilemma. If I use AI to "just handle" the parts of the system I don't want to understand, I invariably end up in a situation where I gotta throw a whole bunch of work out. But I can't supervise without an understanding of what it's supposed to be doing, and if I knew what it was supposed to be doing, I could just do it myself.
So I settled on very incremental work. It's annoying cutting and pasting code blocks into the web interface while I'm working on my interface to Neovim; I spent a whole day realizing I can't trust it to instrument Neovim, and I don't want to learn enough Lua to manage it myself. (I moved to Neovim from Emacs because I don't like Elisp, and GPT is even worse at working on my Emacs setup than my Neovim one. The end goal is my own editor in Ruby, but GPT damn sure can't understand that at the moment.) But at least I'm pushing a real flywheel and not the brooms from Fantasia.
mojuba
an hour ago
> Which means we need people like Alice! We have to make space for people like Alice, and find a way to promote her over Bob
The solution is relatively simple though - not sure the article suggests this, as I only skimmed it:
Being good in your field doesn't just mean pushing out articles; it also means being able to talk about them. I think academia should drift away from the written form toward more spoken forms, i.e. conferences.
What if, say, you could only publish something after presenting your work in person, answering questions, etc.? The audience can be big or small; it doesn't matter.
It would make publishing anything at all more expensive, but maybe that's exactly what academia needs, even irrespective of this AI craze?
cvwright
an hour ago
I thought that was kind of how the hard sciences work already?
My grad school friend who was a physicist would write his talk just before his conferences, and then submit the paper later. My experience in CS was totally backwards from that.
j7ake
an hour ago
Essentially a PhD-thesis-style grilling to replace the current text slop.
cmiles74
5 hours ago
I think we already know what we need to do: encourage people to do the work themselves, discourage beginners from immediately asking an LLM for help, and re-introduce some kind of oral exam. As the article mentions, banning LLMs is impractical; what we really need are people who can tell when the LLM is confidently wrong, not people who don't know how to work with an LLM.
I hope it will encourage people to think more about what they get out of the work, what doing the work does for them; I think that's a good thing.
atomicnumber3
4 hours ago
I think we'll get there. We need at least some AI bust to happen first, though. It's impossible to talk sense into people who think AI is about to completely replace engineers, or even those who think that, while it might not replace engineers, it's going to be doing 100% of all coding within a year. Or even that it can do 100% of coding right now.
There are a couple of unfortunate truths going on all at the same time:
- People with money are trying to build the "perfect" business: SaaS without software eng headcount. 100% margin. Zero capex. And finally, near-zero opex and R&D cost. Or at least, they're trying to sell the idea of this to anyone who will buy. And unfortunately this is exactly what most investors want to hear, so they believe every word and throw money at it. This of course then extends to many other businesses, not just SaaS, but those have worse margins to start with and so are less prone to the wildfire.
- People who used to code 15 years ago but don't now see Claude generating very plausible-looking code. Given that their job is now "C suite" or "director", they don't perceive any direct personal risk, so the smell test is passed and they're all on board, happily wreaking destruction along the way.
- People who are nominally software engineers but are bad at it are truly elevated 100x by Claude. Unfortunately, if their starting point was close to zero, this isn't saying a lot. And if it was negative, it's now 100x as negative.
- People who are adjacent to software engineering, like PMs, especially if they dabble in coding on the side, suddenly also see they "can code" now.
Now of course, not all capital owners, CTOs, PMs, etc. exhibit this. Probably not even most. But I can already name like 4 examples per category above from people I know. And it's impossible to explain any kind of nuance to them right now. There are too many people and articles and blog posts telling them they're absolutely right.
We need some bust cycle. Then maybe we can have a productive discussion of how we can leverage LLMs (we'll stop calling it "AI"...) to still do the team sport known as software engineering.
Because there are real productivity gains to be had here. Unfortunately, they don't replace everyone with AGI or allow people who don't know coding or software engineering to build actual working software, and they don't involve just letting Claude Code stochastically generate a startup for you.
Wowfunhappy
3 hours ago
> Or even that [AI] can do 100% of coding right now.
I don't actually think the article refutes this. But the AI needs to be in the hands of someone who can review the code (or astrophysics paper), notice and understand issues, and tell the AI what changes to make. Rinse, repeat. It's still probably faster than writing all the code yourself (but that doesn't mean you can fire all your engineers).
The question is: how do you become the person who can effectively review AI code if you've never written code without an AI? I'd argue you basically can't.
throw310822
an hour ago
> the paradox is, the LLMs are only useful† if you're Schwartz, and you can't become Schwartz by using LLMs.
That you can't "become Schwartz" by using LLMs is an unproven assumption. Actually, it's a contradiction in the logic of the essay: if Bob managed to produce valid output by using an LLM at all, then he must have acquired precisely the supervision ability that the essay claims is necessary.
Btw, note that in the thought experiment Bob isn't just delegating all the work to the LLM. He makes it summarise articles, extract important knowledge and clarify concepts. This is part of a process of learning, not being a passive consumer.
MarkusQ
2 minutes ago
It doesn't contradict the logic of the essay.
There are flowers that look & smell like female wasps well enough to fool male wasps into "mating" with them. But they don't fly off and lay wasp eggs afterwards.
doug_durham
an hour ago
The article is a thought experiment. The author hypothesizes that Bob isn't getting the same benefit that Alice is getting. That hypothesis could be wrong. I don't know and the author doesn't know. It could be that Bob is going to have a very successful career and will deeply know the field because he is able to traverse a wider set of problems more quickly. At this point, it's just hypothesis. I don't think that we can say we need more Alices any more than we can say we need more Bobs. Unfortunately we will have to wait and see. It will be upon the academic community to do the work to enforce quality controls. That is probably the weakness to worry about.
fomoz
3 hours ago
AI is an accelerant, not a replacement for skill. At least, not yet.
I built a full-stack app in Python + TypeScript where AI agents process 10k+ near-real-time decisions and executions per day.
I have never done full stack development and I would not have been able to do it without GitHub Copilot, but I have worked in IT (data) for 15 years including 6 in leadership. I have built many systems and teams from scratch, set up processes to ensure accuracy and minimize mistakes, and so on.
I have learned a ton about full stack development by asking the coding agent questions about the app, bouncing ideas off of it, planning together, and so on.
So yes, you need to have an idea of what you're doing if you want to build anything bigger than a cheap, one-shot throwaway project that sort of works but brings no value and that nobody is actually gonna use.
This is how it is right now, and at the same time AI coding agents have come an incredibly long way since 2022! I do think they will improve, but an agent can't exactly know what you want to build. It's making an educated guess, an approximation of what you're asking it to do. Ask it the same thing twice and you'll get two slightly different results (assuming it's a big one-shot).
This is the fundamental reality of LLMs. It's sort of like walking (where we were before AI), driving a car to get places (where we are now), and FSD (that's the future; look how long it took to arrive compared to the first cars).
einszwei
4 hours ago
> And so the paradox is, the LLMs are only useful† if you're Schwartz, and you can't become Schwartz by using LLMs.
I have gained a lot of benefit using LLMs in conjunction with textbooks for studying. So, I think LLMs could help you become Schwartz.
mezyt
3 hours ago
Profession (1957) by Isaac Asimov is relevant: https://news.ycombinator.com/item?id=46664195
everdrive
an hour ago
> And so the paradox is, the LLMs are only useful† if you're Schwartz
For so many workers, their companies just want them to produce bullshit. Their managers wouldn't frame it this way, but if their subordinates start producing work with strict intellectual rigor it's going to be an issue and the subordinates will hear about it.
So, you're not wrong. But the majority of LLM customers don't care; they just want to report success internally, and the product only needs to be "just good enough." An LLM might produce a shitty webpage, but so long as the page loads, no one will ever notice or care that it's wrong in the way that a physics paper could be wrong.
grey-area
3 hours ago
Why use a tool that generates plausible garbage?
therealdrag0
2 hours ago
Because I'm skilled enough that using a tool which generates plausible garbage makes me more productive at producing non-garbage than those who don't use it.
grey-area
2 hours ago
Are you sure you’re more productive?
Doesn't sound like these tools should be used to write scientific papers, for example, and they seem to bamboozle people far more than they help them.
Henchman21
2 hours ago
Because there is no appreciable difference between the outputs. Most of the work that most of us do isn't important; it's the busywork byproduct of making widgets that most people don't even need. So if your job is already pointless, why not make it easier using LLMs?
grey-area
2 hours ago
Sounds a little sad. I think I’d rather find another job.
thePhytochemist
3 hours ago
I totally agree - the article misses this point in a very conspicuous way. It suggests that Alice and Bob will both graduate at the same level.
What may well happen instead is that Bob publishes two papers. He then outcompetes Alice, thanks to others' insistence on "publish or perish". Alice becomes unemployed and struggles, having been pushed out.
The person who puts in the time and effort doesn't just sit at the same level, and the two don't both find decent employment. Competition happens, and the authentic learning is considered a waste of time, which leads to real and often life-threatening consequences (like being homeless after being unable to find employment).
iugtmkbdfil834
2 hours ago
<< authentic learning is considered a waste of time
This, I think, may be the more interesting bit. Steve Jobs anecdotally took a calligraphy class in school, which some would consider a waste of time, but he credited some of the Mac's typographic choices to it.
The question then becomes whether this will become an issue now or later. Having seen some of the output, I have no doubt that a lot can now be built by non-programmers (including myself; I suppose I belong in the adjacent category). The building blocks exist, and as long as the problem was part of the initial training, odds are an LLM will help you build what you want.
It may not be perfect, safe, or optimized, but it may still be exactly what the user wanted. The problems will start when those projects, inevitably, move into production at big corps. In a sense, we have seen some interesting results of that in the past few weeks (including the accidental Claude Code release).
In the grand scheme of things, not much is changing... except the speed of change. But are we quite ready for that?
leereeves
5 hours ago
> And so the paradox is, the LLMs are only useful† if you're Schwartz
Was the LLM even useful for Schwartz, if it produced false output?
cmiles74
4 hours ago
Maybe it saved him some time? Though so far, the studies seem to lean toward the LLM probably not saving any time at all.