extr
6 hours ago
What is with the negativity in these comments? This is a huge, huge surface area that touches a large percentage of white collar work. Even just basic automation/scaffolding of spreadsheets would be a big productivity boost for many employees.
My wife works in insurance operations - everyone she manages from the top down lives in Excel. For line employees a large percentage of their job is something like "Look at this internal system, export the data to excel, combine it with some other internal system, do some basic interpretation, verify it, make a recommendation". Computer Use + Excel Use isn't there yet...but these jobs are going to be the first on the chopping block as these integrations mature. No offense to these people but Sonnet 4.5 is already at the level where it would be able to replicate or beat the level of analysis they typically provide.
Scubabear68
2 hours ago
Having wrangled many spreadsheets personally, and worked with CFOs who use them to run small-ish businesses, and all the way up to one of the top 3 brokerage houses worldwide using them to model complex fixed income instruments... this is a disaster waiting to happen.
Spreadsheet UI is already a nightmare. The formula editing and relationship visualization are not there at all. Mistakes are rampant in spreadsheets, even my own carefully curated ones.
Claude is not going to improve this. It is going to make it far, far worse with subtle and not so subtle hallucinations happening left and right.
The key is really this - all LLMs that I know of rely on entropy and randomness to emulate human creativity. This works pretty well for pretty pictures and creating fan fiction or emulating someone's voice.
It is not a basis for getting correct spreadsheets that show what you want to show. I don't want my spreadsheet correctness to start from a random seed. I want it to spring from first principles.
sothatsit
an hour ago
I don't think tools like Claude are there yet, but I already trust GPT-5 Pro to be more diligent about catching bugs in software than me, even when I am trying to be very careful. I expect even just using these tools to help review existing Excel spreadsheets could lead to a significant boost in quality if software is any guide (and Excel spreadsheets seem even worse than software when it comes to errors).
That said, Claude is still quite behind GPT-5 in its ability to review code, and so I'm not sure how much to expect from Sonnet 4.5 in this new domain. OpenAI could probably do better.
scosman
20 minutes ago
> all LLMs that I know of rely on entropy and randomness to emulate human creativity
Those are tuneable parameters. Turn down the temperature and top_p if you don't want the creativity.
> Claude is not going to improve this.
We can measure models vs humans and figure this out.
To your own point, humans already make "rampant" mistakes. With models, we can scale inference time compute to catch and eliminate mistakes, for example: run 6x independent validators using different methodologies.
One-shot financial models are a bad idea, but properly designed systems can probably match or beat humans pretty quickly.
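As a rough sketch of what I mean (the model name and prompts here are placeholders, not anything from the announcement), low-temperature sampling plus a few independent checking passes looks something like this with the Anthropic Python SDK:

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def ask(prompt: str) -> str:
        # temperature=0 strips out most of the sampling randomness (top_p is tunable the same way)
        resp = client.messages.create(
            model="claude-sonnet-4-5",  # placeholder model name
            max_tokens=1024,
            temperature=0.0,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.content[0].text

    draft = ask("Write an Excel formula that sums column B where column A equals 'bound'.")

    # Scale inference-time compute: independent checking passes, each with a different methodology
    checks = [
        "Check only the cell references in this formula for wrong-column or off-by-one errors:\n\n",
        "Re-derive a formula from the requirement 'sum column B where column A equals bound' and say whether it matches this one:\n\n",
        "List edge cases (blanks, text in numeric cells) where this formula could silently fail:\n\n",
    ]
    reviews = [ask(c + draft) for c in checks]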
th0ma5
15 minutes ago
> Turn down the temperature and top_p if you don't want the creativity.
This also reduces accuracy in real terms. The randomness is used to jump out of local minima.
noosphr
2 hours ago
My first job out of uni was building a spreadsheet infrastructure-as-code version control system after a Windows update made an eight-year-old spreadsheet go haywire and lose $10m in an afternoon.
Spreadsheets are already a disaster.
daveguy
an hour ago
> Spreadsheets are already a disaster.
Yeah, that's what OP said. Now add a bunch of random hallucinations hidden inside formulas inside cells.
If they really have a good spreadsheet solution they've either fixed the spreadsheet UI issues or the LLM hallucination issues or both. My guess is neither.
extr
an hour ago
Is this just a feeling you have, or is this downstream of actual use cases where you've applied AI and observed and measured its reliability?
mbesto
43 minutes ago
Not the parent poster, but this is pretty much the foundation of LLMs. They are by their nature probabilistic, not deterministic. This is precisely what the parent is referring to.
lionkor
an hour ago
Not OP, but having used LLMs in professional settings like programming, editing, and writing technical specifications, I can say OP is correct.
Without extensive prompting and injecting my own knowledge and experience, LLMs generate absolutely unusable garbage (on average). Anyone who disagrees very likely is not someone who would produce good quality work by themselves (on average). That's not a clever quip; that's a very sad reality. SO MANY people cannot be bothered to learn anything if they can help it.
extr
an hour ago
I would completely disagree. I use LLMs daily for coding. They are quite far from AGI and it does not appear they are replacing Senior or Staff Engineers any time soon. But they are incredible machines that are perfectly capable of performing some economically valuable tasks in a fraction of the time it would have taken a human. If you deny this your head is in the sand.
lionkor
40 minutes ago
Capable, yeah, but not reliable, that's my point. They can one shot fantastic code, or they can one shot the code I then have to review and pull my hair out over for a week, because it's such crap (and the person who pushed it is my boss, for example, so I can't just tell him to try again).
That's not consistent.
wahnfrieden
26 minutes ago
You can ask your boss to submit PRs using Codex's "try 5 variations of the same task and select the one you like most" feature, though.
MattGaiser
an hour ago
> Mistakes are rampant in spreadsheets
To me, the case for LLMs is strongest not because LLMs are so unusually accurate and awesome, but because if human performance were put on trial in aggregate, it would be found wanting.
Humans already do a mediocre job of spreadsheets, so I don't think it is a given that Claude will make more mistakes than humans do.
lionkor
an hour ago
But isn't this only fine as long as someone who knows what they are doing has oversight and can fix issues when they arise and Claude gets stuck?
Once we all forget how to write SUM(A:A), will we just invent a new kind of spreadsheet once Claude gets stuck?
Or in other words; what's the end game here? LLMs clearly cannot be left alone to do anything properly, so what's the end game of making people not learn anything anymore?
silenced_trope
39 minutes ago
> The key is really this - all LLMs that I know of rely on entropy and randomness to emulate human creativity. This works pretty well for pretty pictures and creating fan fiction or emulating someone's voice.
I think you need to turn down the temperature a little bit. This could be a beneficial change.
scoot
39 minutes ago
Or you could, you know, read the article before commenting to see the limited scope of this integration?
Anyway, Google has already integrated Gemini into Sheets, and recently added direct spreadsheet editing capability, so your comment was disproven before you even wrote it.
cube00
6 hours ago
I don't trust LLMs to do the kind of precise deterministic work you need in a spreadsheet.
It's one thing to fudge the language in a report summary, it can be subjective, however numbers are not subjective. It's widely known LLMs are terrible at even basic maths.
Even Google's own AI summary admits it, which I was surprised at; marketing won't be happy.
"Yes, it is true that LLMs are often bad at math because they don't 'understand' it as a logical system but rather process it as text, relying on pattern recognition from their training data."
extr
5 hours ago
Seems like you're very confused about what this work typically entails. The job of these employees is not mental arithmetic. It's closer to:
- Log in to the internal system that handles customer policies
- Find all policies that were bound in the last 30 days
- Log in to the internal system that manages customer payments
- Verify that for all policies bound, there exists a corresponding payment that roughly matches the premium.
- Flag any divergences above X% for accounting/finance to follow up on.
Practically this involves munging a few CSVs, maybe typing in a few things, setting up some XLOOKUPs, IF formulas, conditional formatting, etc.
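Concretely, that kind of check fits in a short script (the file and column names below are invented for illustration, not from any real system):

    import pandas as pd

    # Hypothetical exports from the two internal systems
    policies = pd.read_csv("policies_bound_last_30d.csv")   # policy_id, premium
    payments = pd.read_csv("payments_last_30d.csv")         # policy_id, amount_paid

    merged = policies.merge(payments, on="policy_id", how="left")

    # Flag missing payments, plus divergences above a threshold (here 5%)
    merged["divergence_pct"] = (
        (merged["amount_paid"] - merged["premium"]).abs() / merged["premium"] * 100
    )
    flags = merged[merged["amount_paid"].isna() | (merged["divergence_pct"] > 5)]
    flags.to_csv("flagged_for_accounting.csv", index=False)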
Will AI replace the entire job? No...but that's not the goal. Does it have to be perfect? Also no...the existing employees performing this work are also not perfect, and in fact sometimes their accuracy is quite poor.
AvAn12
2 hours ago
> “Does it have to be perfect?”
Actually, yes. This kind of management reporting is either (1) going to end up in the books and records of the company - big trouble if things have to be restated in the future - or (2) going to support important decisions by leadership, who will be very much less than happy if the analysis turns out to have been wrong.
A lot of what ties up the time of business analysts is ticking and tying everything to ensure that mistakes are not made and that analytics and interpretations are consistent from one period to the next. The math and queries are simple - the details and correctness are hard.
jacksnipe
16 minutes ago
Is this not belligerently ignoring the fact that this work is already done imperfectly? I can’t tell you how many serious errors I’ve caught in just a short time of automating the generation of complex spreadsheets from financial data. All of them had already been checked by multiple analysts, and all of them contained serious errors (in different places!)
extr
an hour ago
Speak for yourself and your own use cases. There is a huge diversity of workflows to which automation can be applied in any medium to large business. They all have differing needs. Many Excel workflows I'm personally familiar with already incorporate a "human review" step. Telling a business leader that they can now jump straight to that step, even if it requires 2x human review, with AI doing all of the most tedious and low-stakes prework, is a clear win.
Revanche1367
19 minutes ago
>Speak for yourself and your own use cases
Take your own advice.
2b3a51
2 hours ago
There is another aspect to this kind of activity.
Sometimes there can be an advantage in leading or lagging some aspects of internal accounting data for a time period. Basically sitting on credits or debits to some accounts for a period of weeks. The tacit knowledge to know when to sit on a transaction and when to action it is generally not written down in formal terms.
I'm not sure how these shenanigans will translate into an ai driven system.
AvAn12
an hour ago
That's the kind of thing that can get a company into a lot of trouble with its auditors and shareholders. Not that I am offering accounting advice, of course. And yeah, one cannot "blame" an AI system or try to AI-wash any dodgy practices.
Ntrails
4 hours ago
Checking someone else's spreadsheet is a fucking nightmare. If your company has extremely good standards it's less miserable, because at least the formatting etc will be consistent...
The one thing LLMs should consistently do is ensure that formatting is correct. Which will help greatly in the checking process. But no, I generally don't trust them to do sensible things with basic formulation. Not a week ago GPT 5 got confused whether a plus or a minus was necessary in a basic question of "I'm 323 days old, when is my birthday?"
xmprt
4 hours ago
I think you have a misunderstanding of the types of things that LLMs are good at. Yes you're 100% right that they can't do math. Yet they're quite proficient at basic coding. Most Excel work is similar to basic coding so I think this is an area where they might actually be pretty well suited.
My concern would be more with how to check the work (ie, make sure that the formulas are correct and no columns are missed) because Excel hides all that. Unlike code, there's no easy way to generate the diff of a spreadsheet or rely on Git history. But that's different from the concerns that you have.
Wowfunhappy
an hour ago
> Yes you're 100% right that they can't do math.
The model ought to be calling out to some sort of tool to do the math—effectively writing code, which it can do. I'm surprised the major LLM frontends aren't always doing this by now.
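The building blocks for that already exist; a minimal sketch of the pattern with the Anthropic SDK (model name and prompt are placeholders) would be roughly:

    import anthropic

    client = anthropic.Anthropic()

    calculator = {
        "name": "calculator",
        "description": "Evaluate an arithmetic expression and return the numeric result.",
        "input_schema": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    }

    resp = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model name
        max_tokens=512,
        tools=[calculator],
        messages=[{"role": "user", "content": "What is 17.5% of 84,230?"}],
    )

    # Instead of guessing at the arithmetic, the model emits a tool_use block;
    # the host application does the actual math and sends a tool_result back.
    for block in resp.content:
        if block.type == "tool_use" and block.name == "calculator":
            print(block.input)  # e.g. {"expression": "84230 * 0.175"}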
mapt
an hour ago
So do it in basic code, where typing G53 instead of G$53 doesn't crash a mass transit network because somebody's algorithm forgot to order enough fuel this month.
collingreen
4 hours ago
I've built spreadsheet diff tools on Google Sheets multiple times. As the need grows I think we will see diffs and commits and review tools reach customers.
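A naive version of that kind of diff (here against .xlsx files via openpyxl rather than Sheets, with made-up file names) is only a few lines:

    from openpyxl import load_workbook

    def cells(path):
        # data_only=False (the default) returns formulas rather than cached values
        out = {}
        for ws in load_workbook(path).worksheets:
            for row in ws.iter_rows():
                for c in row:
                    if c.value is not None:
                        out[(ws.title, c.coordinate)] = c.value
        return out

    def diff(path_a, path_b):
        a, b = cells(path_a), cells(path_b)
        for key in sorted(set(a) | set(b)):
            if a.get(key) != b.get(key):
                print(f"{key[0]}!{key[1]}: {a.get(key)!r} -> {b.get(key)!r}")

    diff("model_v1.xlsx", "model_v2.xlsx")

The hard part isn't the cell-by-cell comparison, it's presenting the changes in a way reviewers can actually reason about.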
break_the_bank
3 hours ago
hey Collin! I am working on an AI agent on Google Sheets, and I am curious if any of your designs are out in public. We are trying to re-think what diffs should look like and want to make something nicer than what we currently have, so I'm curious.
alfalfasprout
an hour ago
proficient != near-flawless.
> Most Excel work is similar to basic coding so I think this is an area where they might actually be pretty well suited.
This is a hot take. One I'm not sure many would agree with.
koliber
3 hours ago
Maybe LLMs will enable a new type of work in spreadsheets. Just like in coding we have PR reviews, with an LLM it should be possible to do a spreadsheet review. Ask the LLM to try to understand the intent and point out places where the spreadsheet deviates from the intent. Also ask the LLM to narrate the spreadsheet so it can be understood.
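As a rough sketch (file name, model name, and prompt are made up; this isn't any existing product feature), that review flow could be as simple as dumping the formulas and asking for a narration:

    from openpyxl import load_workbook
    import anthropic

    # Dump every formula with its address, then ask the model to narrate the sheet
    # and flag formulas that look inconsistent with its inferred intent.
    wb = load_workbook("forecast.xlsx")  # made-up file name
    dump = "\n".join(
        f"{ws.title}!{c.coordinate}: {c.value}"
        for ws in wb.worksheets
        for row in ws.iter_rows()
        for c in row
        if isinstance(c.value, str) and c.value.startswith("=")
    )

    client = anthropic.Anthropic()
    review = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model name
        max_tokens=2000,
        messages=[{
            "role": "user",
            "content": "Narrate what this spreadsheet appears to do, then list any "
                       "formulas that look inconsistent with that intent:\n" + dump,
        }],
    )
    print(review.content[0].text)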
Insanity
3 hours ago
That first condition "try to understand the intent" is where it could go wrong. Maybe it thinks the spreadsheet aligns with the intent, but it misunderstood the intent.
LLMs are a lossy validation, and while they work sometimes, when they fail they usually do so 'silently'.
monkeydust
an hour ago
Maybe we need some kind of method or framework to develop intent. Most of the things that go wrong in knowledge work are down to a lack of common understanding of intent.
runarberg
3 hours ago
> The one thing LLMs should consistently do is ensure that formatting is correct.
In JavaScript (and I assume most other programming languages) this is the job of static analysis tools (like eslint, prettier, typescript, etc.). I'm not aware of any LLM-based tools which perform static analysis with as good results as the traditional tools. Is static analysis not a thing in the spreadsheet world? Are the tools which do static analysis on spreadsheets subpar, or do they have some disadvantage not seen in other programming languages? And if so, are LLMs any better?
eric-burel
2 hours ago
Just use a normal static analysis tool and shove the result into an LLM. I believe Anthropic properly figured out that agents are the key, in addition to models, contrary to OpenAI, which is run by a psycho who only believes in training the bigger model.
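A naive sketch of that pipeline's first half (a hand-rolled formula "lint" whose report you could then paste into an LLM prompt; the file name and the heuristic are made up):

    import re
    from openpyxl import load_workbook

    # Crude "lint" pass: flag formulas containing hard-coded numeric literals,
    # a common source of silent spreadsheet errors. The findings can then be
    # handed to an LLM for triage and explanation.
    MAGIC_NUMBER = re.compile(r"(?<![A-Z$\d])\d+(?:\.\d+)?")

    findings = []
    for ws in load_workbook("model.xlsx").worksheets:
        for row in ws.iter_rows():
            for cell in row:
                if isinstance(cell.value, str) and cell.value.startswith("="):
                    if MAGIC_NUMBER.search(cell.value):
                        findings.append(f"{ws.title}!{cell.coordinate}: {cell.value}")

    print("\n".join(findings))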
dpoloncsak
4 hours ago
Sysadmin of a small company. I get asked pretty often to help with a pivot table, vlookup, or just general excel functions (and smartsheet, these users LOVE smartsheet)
toomuchtodo
3 hours ago
Indeed, in a small enough org, the sysadmin/technologist becomes support of last resort for all the things.
JumpCrisscross
3 hours ago
> these users LOVE smartsheet
I hate smartsheet…
Excel or R. (Or more often, regex followed by pen and paper followed by more regex.)
lossolo
4 hours ago
Last time, I gave Claude an invoice and asked it to change one item on it. It did so nicely and gave me the new invoice. Good thing I noticed it had also changed the bank account number...
The more complicated the spreadsheet and the more dependencies it has, the greater the room for error. These are probabilistic machines. You can use them, I use them all the time for different things, but you need to treat them like employees you can't even trust to copy a bank account number correctly.
onion2k
7 minutes ago
Something that Claude Sonnet does when you use it to code is write scripts to test whether or not something is working. If it does that for Excel (e.g. some form of verification) it should be fine.
Besides, using AI is an exercise in a "trust but verify" approach to getting work done. If you asked a junior to do the task you'd check their output. Same goes for AI.
mikeyouse
4 hours ago
We've tried to gently use them to automate some of our report generation and PDF->Invoice workflows and it's a nightmare of silent changes and absence of logic... basic things like specifically telling it "debits need to match credits" and "balance sheets need to balance" are ignored.
wholinator2
2 hours ago
Yeah, asking an LLM to edit one specific thing in a large or complex document/codebase is like those repeated "give me the exact same image" gifs. It's fundamentally a statistical model, so the only thing we can be _certain_ of is that _it's not_ the same. It might get the desired change 100% correct, but it's only gonna get the entire document 99.5% right.
jay_kyburz
38 minutes ago
>Does it have to be perfect? Also no.
Yeah, but it could be perfect, why are there humans in the loop at all? That is all just math!
next_xibalba
2 hours ago
The use cases for spreadsheets are much more diverse than that. In my experience, spreadsheets are just as often used for calculation. Many of them require high accuracy, rely on determinism, and necessitate an understanding of maths ranging from basic arithmetic to statistics and engineering formulas. Financial models, for example, must be built up from ground truth and need to always use the right formulas with the right inputs to generate meaningful outputs.
I have personally worked with spreadsheet based financial models that use 100k+ rows x dozens of columns and involve 1000s of formulas that transform those data into the desired outputs. There was very little tolerance for mistakes.
That said, humans, working in these use cases, make mistakes >0% of the time. The question I often have with the incorporation of AI into human workflows is, will we eventually come to accept a certain level of error from them in the way we do for humans?
onion2k
12 minutes ago
> It's widely known LLMs are terrible at even basic maths.
Claude for Excel isn't doing maths. It's doing Excel. If the LLM is bad at maths then teaching it to use a tool that's good at maths seems sensible.
Kiro
an hour ago
Most real-world spreadsheets I've worked with were fragile and sloppy, not precise and deterministic. Programmers always get shocked when they realize how many important things are built on extremely messy spreadsheets, and that people simply accept it. They'd rather just spend human hours correcting discrepancies than try to build something maintainable.
mbreese
2 hours ago
I don't see the issue so much in the deterministic precision of an LLM as in the lack of observability of spreadsheets. Just looking at two different spreadsheets, it's impossible to see what changes were made. It's not like programming where you can run a `git diff` to see what changes an LLM agent made to a source code file. Or even a word processing document where the text changes are clear.
Spreadsheets work because the user sees the results of complex interconnected values and calculations. For the user, that complexity is hidden away and left in the background. The user just sees the results.
This would be a nightmare for most users to validate what changes an LLM made to a spreadsheet. There could be fundamental changes to a formula that could easily be hidden.
For me, that's the concern with spreadsheets and LLMs - which is just as much a concern with spreadsheets themselves. Try collaborating with someone on a spreadsheet for modeling and you'll know how frustrating it can be to try to figure out what changes were made.
laweijfmvo
5 hours ago
I don't trust humans to do the kind of precise deterministic work you need in a spreadsheet!
baconbrand
5 hours ago
Right, we shouldn’t use humans or LLMs. We should use regular deterministic computer programs.
For cases where that is not available, we should use a human and never an LLM.
davidpolberger
2 hours ago
I like to use Claude Code to write deterministic computer programs for me, which then perform the actual work. It saves a lot of time.
I had a big backlog of "nice to have scripts" I wanted to write for years, but couldn't find the time and energy for. A couple of months after I started using Claude Code, most of them exist.
baconbrand
2 hours ago
That’s great and the only legitimate use case here. I suspect Microsoft will not try to limit customers to just writing scripts and will instead allow and perhaps even encourage them to let the AI go ham on a bunch of raw data with no intermediary code that could be reviewed.
Just a suspicion.
extr
4 hours ago
"regular deterministic computer programs" - otherwise known as the SUM function in Microsoft Excel
brookst
an hour ago
Do you trust humans to be precise and deterministic, or even to be especially good at math?
This is talking about applying LLMs to formula creation and references, which they are actually pretty good at. Definitely not about replacing the spreadsheet's calculation engine.
bg24
4 hours ago
"I don't trust LLMs to do the kind of precise deterministic work" => I think LLM is not doing the precise arithmetic. It is the agent with lots of knowledge (skills) and tools. Precise deterministic work is done by tools (deterministic code). Skills brings domain knowledge and how to sequence a task. Agent executes it. LLM predicts the next token.
game_the0ry
2 hours ago
> I don't trust LLMs to do the kind of precise deterministic work you need in a spreadsheet.
I was thinking along the same lines, but I could not articulate as well as you did.
Spreadsheet work is deterministic; LLM output is probabilistic. The two should be distinguished.
Still, it's a productivity boost, which is always good.
doug_durham
5 hours ago
Sure, but this isn't requiring that the LLM do any math. The LLM is writing formulas and code to do the math. They are very good at that. And like any automated system you need to review the work.
causal
4 hours ago
Exactly, and if it can be done in a way that helps users better understand their own spreadsheets (which are often extremely complex codebases in a single file!) then this could be a huge use case for Claude.
MangoCoffee
an hour ago
LLMs are just a tool, though. Humans still have to verify them, like with every other tool out there.
A4ET8a8uTh0_v2
an hour ago
Eh, yes. In theory. In practice, and this is what I have experienced personally, bosses seem to think that you now have interns, so you should be able to do 5x the output... guess what that means: no verification, or a rubber stamp.
chpatrick
3 hours ago
They're not great at arithmetic but at abstract mathematics and numerical coding they're pretty good actually.
sdeframond
2 hours ago
> I don't trust LLMs to do the kind of precise deterministic work you need in a spreadsheet.
Rightly so! But LLMs can still make you faster. Just don't expect too much from it.
mhh__
3 hours ago
If LLMs can replace mathematica for me when I'm doing affine yield curve calculations they can do a DCF for some banker idiots
informal007
2 hours ago
You might trust them when the precision is extremely high and others agree with that.
High precision is possible because they can achieve it through multiple cross-validations.
prisonguard
2 hours ago
ChatGPT is actively being used as a calculator.
zarmin
4 hours ago
>I don't trust LLMs to do the kind of precise deterministic work
not just in a spreadsheet, any kind of deterministic work at all.
find me a reliable way around this. i don't think there is one. mcp/functions are a band aid and not consistent enough when precision is important.
after almost three years of using LLMs, i have not found a single case where i didn't have to review its output, which takes as long or longer than doing it by hand.
ML/AI is not my domain, so my knowledge is not deep nor technical. this is just my experience. do we need a new architecture to solve these problems?
baconbrand
3 hours ago
ML/AI is not my domain but you don’t have to get all that technical to understand that LLMs run on probability. We need a new architecture to solve these problems.
mrcwinn
5 hours ago
I couldn’t agree more. I get all my perfectly deterministic work output from human beings!
goatlover
5 hours ago
If only we had created some device that could perform deterministic calculations and then wrote software that made it easy for humans to use such calculations.
bryanrasmussen
4 hours ago
ok but humans are idiots, if only we could make some sort of Alternate Idiot, a non-human but every bit as generally stupid as humans are! This A.I. would be able to do every stupid thing humans did with the device that performed deterministic calculations, only many times faster!
baconbrand
3 hours ago
Yes and when the AI did that all the stupid humans could accept its output without question. This would save the humans a lot of work and thought and personal responsibility for any mistakes! See also Israel’s Lavender for an exciting example of this in action.
burnte
11 minutes ago
> What is with the negativity in these comments?
A lot of us have seen the effects of AI tools in the hands of people who don't understand how or why to use the tools. I've already seen AI use/misuse get two people fired. One was a line-of-business employee who relied on output without ever checking it, got herself into a pretty deep hole in 3 weeks. Another was a C suite person who tried to run an AI tool development project and wasted double their salary in 3 months, nothing to show for it but the bill, fired.
In both cases the person did not understand the limits of the tools and kept replacing facts with their desires and their own misunderstanding of AI. The C suite person even tried to tell a vendor they were wrong about their own product because "I found out from AI".
AI right now is fireworks. It's great when you know how to use it, but if you half-ass it you'll blow your fingers off very easily.
atleastoptimal
an hour ago
HN has a base of strong anti-AI bias, which I assume is partially motivated by insecurity over being replaced, losing their jobs, or having missed the boat on AI.
lionkor
an hour ago
I use AI every day. Without oversight, it does not work well.
If it doesn't work well, I will do it myself, because I care that things are done well.
None of this is me being scared of being replaced; quite the opposite. I'm one of the last generations of programmers who learned how to program and can debug and fix the mess your LLM leaves behind when you forgot to add "make sure it's a clean design and works" to the prompt.
Okay, that's maybe hyperbole, but sadly only a little bit. LLMs make me better at my job, they don't replace me.
extr
an hour ago
Based on the comments here, it's surprising anything in society works at all. I didn't realize the bar was "everything perfect every time, perfectly flexible and adaptable". What a joy some of these folks must be to work with, answering every new technology with endless reasons why it's worthless and will never work.
jay_kyburz
41 minutes ago
I think perhaps you underestimate how antithetical the current batch of LLM AIs is to what most programmers strive for every day, and what we want from our tools. It's not about losing our jobs, it's about "correctness" (or, as said below, deterministic).
In a lot of jobs, particularly in creative industries, or marketing, media and writing, the definition of a job well done is a fairly grey area. I think AI will be mostly disruptive in these areas.
But in programming there is a hard minimum of quality. Given a set of inputs, does the program return the correct answer or not? When you ask it what 2+2 is, do you get 4?
When you ask AI anything, it might be right 50% of the time, or 70% of the time, but you can't blindly trust the answer. A lot of us just find that not very useful.
sothatsit
22 minutes ago
I really don’t think this is accurate. I think the median opinion here is to be suspicious of claims made about AI, and I don’t think that’s necessarily a bad thing. But I also regularly see posts talking about AI positively (e.g. simonw), or talking about it negatively. I think this is a good thing, it is nice to have a diversity of opinions on a technology. It's a feature, not a bug.
hypeatei
an hour ago
> HN has a base of strong anti-AI bias
Quite the opposite, actually. You can always find five stories on the front page about some AI product or feature. Meanwhile, you have people like yourself who convince themselves that any pushback is done by people who just don't see the true value of it yet and that they're about to miss out!! Some kind of attempt at spreading FOMO, I guess.
MattGaiser
an hour ago
HN has an obsession with quality too, which has merit, but is often economically irrelevant.
When US-East-1 failed, lots of people talked about how the lesson was cloud agnosticism and multi cloud architecture. The practical economic lesson for most is that if US-East-1 fails, nobody will get mad at you. Cloud failure is viewed as an act of god.
mceoin
an hour ago
I second this. Spreadsheets are the primary tool used for 15% of the U.S. economy. Productivity improvements will affect hundreds of millions of users globally. Each increment in progress is a massive time save and value add.
The criticisms broadly fall between "spreadsheets are bad" and "AI will cause more trouble than it solves".
This release is a dot in a trend towards everyone having a Goldman-Sachs level analyst at their disposal 24/7. This is a huge deal for the average person or business. Our expectation (disclaimer: I work in this space) is that spreadsheet intelligence will soon be a solved problem. The "harder" problem is the instruction set and human <> machine prompting.
For the "spreadsheets are bad" crowd -- sure, they have problems, but users have spoken and they are the preferred interface for analysis, project management and lightweight database work globally. All solutions to "the spreadsheet problem" come with their own UX and usability tradeoffs, so it'a a balance.
Congrats to the Claude team and looking forward to the next release!
mapt
an hour ago
The vast majority of people in business and science are using spreadsheets for complex algorithmic things they weren't really designed for, and we find a metric fuckton of errors in those sheets when you actually bother auditing them, mistakes which are not at all obvious without troubleshooting by... manually checking each and every cell & cell relation, peering through parentheses, following references. It's a nightmare to troubleshoot.
LLMs specialize in making up plausible things with a minimum of human effort, but their downside is that they're very good at making up plausible things which are covertly erroneous. It's a nightmare to troubleshoot.
There is already an abject inability to provision the labor to verify Excel reasoning when it's composed by humans.
I'm dead certain that Claude will be able to produce plausibly correct spreadsheets. How important is accuracy to you? How life-critical is the end result? What are your odds, with the current auditing workflow?
Okay! Now! Half of the users just got laid off because management thinks Claude is Good Enough. How about now?
practice9
an hour ago
LLMs are getting quite good at reviewing the results and implementations, though
lionkor
an hour ago
Not really, they're only as good as their context and they do miss and forget important things. It doesn't matter how often, because they do, and they will tell you with 100% confidence and with every synonym of "sure" that they caught it all. That's the issue.
pavel_lishin
5 hours ago
My concern is that my insurance company will reject a claim, or worse, because of something an LLM did to a spreadsheet.
Now, granted, that can also happen because Alex fat-fingered something in a cell, but that's something that's much easier to track down and reverse.
manquer
4 hours ago
They are already doing that with AI, rejecting claims in higher numbers than before.
Privatized insurance will always find a way to pay out less if they can get away with it. It is just the nature of having the trifecta of profit motive, socialized risk, and light regulation.
smithkl42
3 hours ago
If you think that insurance companies have "light regulation", I shudder to think of what "heavy regulation" would look like. (Source: I'm the CTO at an insurance company.)
manquer
2 hours ago
"Light" was not meant to imply the quantity of paperwork you have to do, but rather whether you are allowed to do the things you want to do as a company.
More compliance or reporting requirements usually tend to favor the larger existing players who can afford them, and they are also used to make life difficult for the end user and reject more claims.
It is the kind of thing that keeps you and me busy; major investors don't care about it at all. The cost of compliance, or the lack of it, is no more than a rounding number in the balance sheet, and the fines or penalties are puny and laughable.
The enormous profits year on year for decades now, and the amount of consolidation allowed in the industry, show that the industry is able to do pretty much what it wants. That is what I meant by light regulation.
smithkl42
an hour ago
I'm not sure we're looking at the same industry. Overall, insurance company profit margins are in the single digits, usually low single digits - and in many segments, they're frequently not profitable at all. To take one example, 2024 was the first profitable year for homeowners insurance companies since 2019, and even then, the segment's entire profit margin was 0.3% (not 3% - 0.3%).
https://riskandinsurance.com/us-pc-insurance-industry-posts-...
lotsofpulp
2 hours ago
They have too much regulation, and too little auditing (at least in the managed healthcare business).
nxobject
an hour ago
I agree, and I can see where it comes from (at least at the state level). The cycle is: bad trend happens that has deep root causes (let's say PE buying rural hospitals because of reduced Medicaid/Medicare reimbursements); legislators (rightfully) say "this shouldn't happen", but don't have the ability to address the deep root causes so they simply regulate healthcare M&As – now you have a bandaid on a problem that's going to pop up elsewhere.
lotsofpulp
an hour ago
I mean even in the simple stuff, like denying payment for healthcare that should have been covered. CMS will come by and audit a handful of cases, out of millions, every few years.
So obviously the company that prioritizes accuracy of coverage decisions by spending money on extra labor to audit itself is wasting money. Which means insureds have to waste more time getting the payment for healthcare they need.
JumpCrisscross
3 hours ago
> They already doing that with AI, rejecting claims at higher numbers than before
Source?
nartho
2 hours ago
Haven't risk-based models been a thing for the last 15-20 years?
philipallstar
3 hours ago
> It is just nature of having the trifecta of profit motive , socialized risk and light regulation.
It's the nature of everything. They agree to pay you for something. It's nothing specific to "profit motive" in the sense you mean it.
manquer
2 hours ago
I should have been clearer - profit maximization above all else, as long as it is mostly legal. Neither profit nor profit maximization at all costs is the nature of everything.
There are many other entity types, from unions [1], cooperatives, public sector companies, and quasi-government entities to PBCs and non-profits, that all offer insurance and can occasionally do it well.
We even have some in the US, and we don't even think of them as communism - like the FDIC, or things like social security / unemployment insurance.
At some level, isn't government and taxation itself nothing but insurance? We agree to pay taxes to mitigate a variety of risks, including foreign invasion or smaller things like getting robbed on the street.
[1] Historically, worker collectives or unions self-organized to socialize the risks of major work-ending injuries or death.
Ancient to modern armies operate because of this insurance - the two ingredients that made them not mercenaries were a form of long-term insurance benefit (education, pension, land, etc.) for them or their family members in the event of death, and sovereign immunity for their actions.
jimbokun
3 hours ago
Couldn't they accomplish the same thing by rejecting a certain percentage of claims totally at random?
manquer
2 hours ago
That would be illegal though; the goal is to do this legally, after all.
We also have to remember all claims aren't equal, i.e. some claims end up being way costlier than others. You can achieve similar % margin outcomes by putting up a ton of friction: preconditions, multiple appeals processes, prior authorization for prior authorization, reviews by administrative doctors who have no expertise in the field being reviewed and don't have to disclose their identity, and so on.
While the U.S. system is the most extreme or evolved, it is not unique; it is what you get when you privatize insurance. Any country with private insurance has some lighter version of this and is on the same journey.
Not that public health systems or insurance a la the NHS in the UK or Germany's system work well; they are underfunded and mismanaged, with wait times of months to see a specialist, and so on.
We have to choose our poison - unless you are rich, of course; then the U.S. system is by far the best, and people travel to the U.S. to get the kind of care that is not possible anywhere else.
nxobject
an hour ago
> While U.S. system is most extreme or evolved, it is not unique, it is what you get when you end up privatize insurance any country with private insurance has some lighter version of this and is on the same journey .
I disagree with the statement that healthcare insurance is predominantly privatized in the US: Medicare and Medicaid, at least in 2023, outspent private plans for healthcare spending by about ~10% [1]; this is before accounting for government subsidies for private plans. And boy, does America have a very unique relationship with these programs.
https://www.healthsystemtracker.org/chart-collection/u-s-spe...
jimbokun
an hour ago
Why does saying "AI did it" make it legal, if the outcome is the same?
keernan
3 hours ago
>>They already doing that with AI, rejecting claims at higher numbers than before .
That's a feature, not a bug.
elpakal
2 hours ago
This is a great application of this quote. Insurance providers have 0 incentive to make their AI "good" at processing claims; in fact, it's easy to see how "bad" AI can lead to a justification to deny more claims.
wombatpm
2 hours ago
Wait until a company has to restate earnings because of a bug in a Claudified Excel spreadsheet.
pluc
2 hours ago
Anthropic now has all your company's data, and all you saved was the cost of one human minus however much they charge for this. The good news is it can't have your data again! So starting from the 163rd-165th person you fire, you start to see a good return and all you've sacrificed is exactitude, precision, judgement, customer service and a little bit of public perception!
A4ET8a8uTh0_v2
an hour ago
It is bad in a very specific sense, but I did not see any other comments express the bad parts instead of focusing merely on the accuracy part (which is an issue, but not the issue):
- this opens up a ridiculous flood of data that would otherwise be semi-private to one company providing this service
- this works well on small data sets, but will choke on ones it needs to divvy up into chunks, inviting interesting (and yet unknown) errors
There is a real benefit to being able to 'talk to data', but anyone who has seen corporate culture up close and personal knows exactly where it will end.
edit: and I am saying all this as a person who actually likes LLMs.
trollbridge
44 minutes ago
The biggest problem with spreadsheets is that they tend to be accounts for the accumulation of technical debt, which is an area that AI tools are not yet very good at retiring, but very good at making additional withdrawals from.
gadders
5 hours ago
Yeah, this could be a pretty big deal. Not everyone is an excel expert, but nearly everyone finds themselves having to work with data in excel at some time or other.
threetonesun
2 hours ago
Probably because many people here are software developers, and wrapping spreadsheets in deterministic logic and a consistent UI covers... most software use cases.
lacker
2 hours ago
It's like the negativity whenever a post talks about hiring or firing. A lot of people are afraid that they are going to lose their jobs to AI.
hbarka
4 hours ago
What does scaffolding of spreadsheets mean? I see the term scaffolding frequently in the context of AI-related articles, but I'm not familiar with this method and I'm hesitant to ask an LLM.
Rudybega
4 hours ago
Scaffolding typically just refers to a larger state machine style control flow governing an agent's behavior and the suite of external tools it has access to.
BuildItBusk
3 hours ago
I have to admit that my first thought was "April Fools". But you are right. It makes a lot of sense (if they can get it to work well). Not only is Excel the world's biggest "programming language", it's probably also one of the most unintuitive ways to program.
protonbob
3 hours ago
> but these jobs are going to be the first on the chopping block as these integrations mature.
Perhaps this is part of the negativity? This is a bad thing for the middle class.
jpadkins
2 hours ago
in the short run. In the long run, productivity gains benefit* all of us (in a functional market economy).
*material benefit. In terms of spirit and purpose, the older I get the more I think maybe the Amish are on to something. Work gives our lives purpose, and the closer the work is to our core needs, the better it feels. Labor saving so that most of us are just entertaining each other on social networks may lead to a worse society (but hey, our material needs are met!)
informal007
2 hours ago
Agree with you, but it cannot be stopped. The development of technology always makes wealth distribution more centralized.
informal007
2 hours ago
This will push the development of open source models.
People think of privacy first when it comes to their data, so local deployment of open source models is their first choice.
tokai
3 hours ago
What's with claiming negativity when most of the comments here are positive?
Workaccount2
2 hours ago
I think Excel is a dead end. LLM agents will probably greatly prefer SQL, SQLite, and Python instead of bulky made-for-regular-folks Excel.
Versatility and efficiency explode while human usability tanks, but who cares at that point?
informal007
2 hours ago
Databases might be the future, but viable solutions on Excel are evidence that it works.
intended
5 hours ago
I used to live in excel.
The issue isn’t in creating a new monstrosity in excel.
The issue is the poor SoB who has to spelunk through the damn thing to figure out what it does.
Excel is the sweet spot of just enough to be useful, capable enough to be extensible, yet gated enough to ensure everyone doesn’t auto run foreign macros (or whatever horror is more appropriate).
In the simplest terms - it's not Excel, it's the business logic. If an Excel file works, it's because there's someone who "gets" it in the firm.
extr
5 hours ago
I used to live in Excel too. I've trudged through plenty of awful worksheets. The output I've seen from AI is actually more neatly organized than most of what I used to receive in Outlook. Most of that wasn't hyper-sophisticated cap table analyses. It was analysis from a Jr Analyst or line employee trying to combine a few different data sources to get some signal on how XYZ function of the business was performing. AI automation is perfectly suitable for this.
intended
4 hours ago
How?
Neat formatting didn't save any model from having the wrong formula pasted in.
Being neat was never a substitute for being well rested, or sufficiently caffeinated.
Have you seen how AI functions in the hands of someone who isn't a domain expert? I've used it for things I had no idea about, like Astro+ web dev. User ignorance was magnified spectacularly.
This is going to have Jr Analysts dumping well formatted junk in email boxes within a month.
gedy
5 hours ago
It's actually really cool. I will say that "spreadsheets" remain a bandaid over dysfunctional UIs, processes, etc., and engineering spends a lot of time enabling these bandaids versus someone just saying "I need to see number X" rather than "I need BI analytics data in a realtime spreadsheet!", etc.
doctorpangloss
5 hours ago
> What is with the negativity in these comments?
Some people - normal people - understand the difference between the holistic experience of a mathematically informed opinion and an actual model.
It's just that normal people always wanted the holistic experience of an answer. Hardly anyone wants a right answer. They have an answer in their heads, and they want a defensible journey to that answer. That is the purpose of Excel in 95% of places it is used.
Lately people have been calling this "sycophancy." This was always the problem. Sycophancy is the product.
Claude for Excel is leaning deeply into this garbage.
behnamoh
4 hours ago
> How teams use Claude for Excel
Who are these teams that can get value from Anthropic? One MCP and my context window is used up and Claude tells me to start a new chat.