Havoc
a day ago
Not sure what these guys are studying but can tell you in the real world - essentially zero AI rollout in accounting world for anything serious.
We've got access to some fancy enterprise copilot version, deep research, MS office integration and all that jazz. I use it diligently every day...to make me a summary of today's global news.
When I try to apply it to actual accounting work, it hallucinates left, right & center on stuff that can't be wrong. Millions and millions off. That's how you get the taxman to kick down your door. Even a simple "are these two numbers the same" check gives false positives so often that it's impossible to trust. So now I've got a review tool whose output I can't trust? It's like a programming language where the equality (==) symbol has a built in 20% random number generator and you're supposed to write mission critical code with it.
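To make the analogy concrete, here's a hypothetical sketch of what that "equality operator" would look like (the function and error rate are made up for illustration):

```python
import random

def llm_eq(a, b, error_rate=0.2):
    """Hypothetical illustration of the analogy: an equality
    check that lies at random ~20% of the time."""
    if random.random() < error_rate:
        return not (a == b)  # confidently wrong
    return a == b
```

Any reconciliation pipeline built on a primitive like this inherits its error rate, and reviewing the output with the same primitive doesn't help.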
coffeefirst
a day ago
I keep trying to get it to review my personal credit card statements. I have my own budget tracking app that I made, and sometimes there are discrepancies. Resolving this by hand is annoying, and an LLM should be able to do it: scrape the PDF, compare the records to mine, find the delta.
I've tried multiple models over the course of 6 months. Yesterday it told me I made a brilliant observation, but it hasn't managed to successfully pin down a single real anomaly. Once it told me the charges were Starbucks, when I had not been to a Starbucks—it's just that Starbucks is a probable output when analyzing credit card statements.
And I'm only dealing with a list of 40 records that I can check by hand, with zero consequences if I get it wrong beyond my personal budgeting being off by 1%.
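(For what it's worth, the deterministic version of that comparison is a few lines of pandas. A sketch, with made-up column names and data, assuming both sides can be parsed into date/amount records:)

```python
import pandas as pd

# Hypothetical data standing in for the two sources: the card
# statement (after PDF extraction) and my own budget records.
statement = pd.DataFrame({
    "date": ["2024-05-01", "2024-05-03", "2024-05-07"],
    "amount": [12.50, 89.99, 4.25],
})
my_records = pd.DataFrame({
    "date": ["2024-05-01", "2024-05-03"],
    "amount": [12.50, 89.99],
})

# An outer merge with indicator=True flags rows that appear on
# only one side; those are the discrepancies to resolve by hand.
delta = statement.merge(my_records, how="outer", indicator=True)
anomalies = delta[delta["_merge"] != "both"]
print(anomalies)
```

No hallucinated Starbucks: every flagged row is an actual row from one of the two inputs.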
I can't imagine trusting any business that leans on this for inappropriate jobs.
phkahler
a day ago
>> I keep trying to get it to review my personal credit card statements. I have my own budget tracking app that I made, and sometimes there are discrepancies. Resolving this by hand is annoying, and an LLM should be able to do it: scrape the PDF, compare the records to mine, find the delta.
This is a perfect example of what people don't understand (or on HN keep forgetting). LLMs do NOT follow instructions, they predict the next word in text and spit it out. The process is somewhat random, and certainly does not include an interpreter (executive function?) to execute instructions - even natural language instructions.
coffeefirst
a day ago
Agreed. I keep trying stuff because I feel like I’m missing whatever magic people are talking about.
So far, I’ve found nothing of value besides natural language search.
balder1991
13 hours ago
Yeah, if you go to a subreddit like ClaudeAI, you convince yourself there's something you don't know, because they keep telling people it's all the fault of their prompts if the LLM isn't turning them into billionaires.
But then you read more of the comments and you see it’s really different interpretations from different people. Some “prompt maximalists” believe that perfect prompting is the key to unlocking the model's full potential, and that any failure is a user error. They tend to be the most vocal and create a sense that there's a hidden secret or a "magic formula" you're missing.
Jensson
9 hours ago
It's basically making stone soup: people won't believe it can be done, but then you put a stone in water and boil it, and tell people that if they aren't getting a nice soup they aren't doing it right. Just put in all these other ingredients that aren't required but really help, and you get this awesome soup!
Then someone says that isn't stone soup, they just did all the work without the stone! But that's just a stone hater; how can you not see this awesome soup made by the stone?
canonistically
an hour ago
I think it's more like lottery winners giving "buy lottery tickets" as financial advice.
It is clear at this point that any meaning found in LLM outputs is projected there by the user. Some people by virtue of several intertwined factors can get some acceleration out of them, but most can't. It becomes like a football fan convinced that their rituals are essential for the team's victory.
Add the general very low understanding of machine learning (or even basic formal logic) and people go from a realistic emulation of conversations to magical thinking about having a mind in a box.
Either that or I am taking crazy pills. Because sometimes it feels like that.
cyrialize
a day ago
There's a very fun video about accounting by Dan Toomey [0] that I think really drives home the point that accounting is:
1) Extremely important
2) Not that glamorous
I always think of accountants as the "nerds" of the finance world. I say this lovingly - I think in another life I would have become an accountant. I find it very fascinating. I worked at a company that dealt with auditing datasets, so I knew much more about accounting than I would have otherwise.
Nobody ever wants to listen to accountants, because they're either giving you bad news or telling you things you should be doing. No one can deny how important they are, despite how much it seems like everyone wants to get rid of them.
An accounting story I love is how my old company got a lot of business because of Enron. Part of the reason that Enron was caught was due to their audit fees.
Their audit fee disclosures showed that Arthur Andersen was charging a huge percentage for non-audit work (audit fee disclosures break down what portion of the fees was audit-related and what wasn't). This was a huge red flag.
My company was the only one at the time that kept track of audit fees, and so a huge number of people paid to access that data stream.
If one day I quit programming, maybe I'll get my CPA.
Havoc
16 hours ago
Yeah the boring part is definitely true. It is a good path to a reasonably high paycheque with bulletproof job security. To take your Enron example - even when it turned into a smoking crater and all business stopped they still had accountants working on the wreckage years later.
Very nearly went programming (now do that as a hobby). Still not sure how I feel about that choice, especially around mental stimulation. But if we're about to hit a recession/depression then it's not a bad place to be. The space I'm in has future revenues locked in for 10+ years.
1vuio0pswjnm7
a day ago
"...can tell you in the real world - essentially zero AI rollout in accounting world for anything serious."
The jobs the researchers concluded were affected were "unregulated" ones where there are no college education or professional certification requirements, e.g.,
receptionists
translators
software "engineers"
"Not sure what these guys are studying..." Apparently, they studied payroll data from ADP on age, job title and headcount, together with, who would have guessed, data from an AI company (Anthropic):
https://digitaleconomy.stanford.edu/publications/canaries-in...
This study has not been peer-reviewed
0xdde
a day ago
It should also be noted that there are some pretty big flaws in the analysis. They mention "the distribution of firms using ADP services does not exactly match the distribution of firms across the broader US economy," but make no attempt to adjust their analysis for it. They also drop 30% of the data for which there is no job title recorded. With such a skewed sample, it's hard to tell how the analysis is supposed to generalize.
tracker1
21 hours ago
This seems to assign a cause without merit other than AI is the buzzword of the day.
Receptionist jobs down... this has been the case for a while, and does include some AI, but AVR and general speech recognition and matching have been pretty good for a while; I'd say AI is half a step back.
Translators, maybe AI, but again, speech recognition in general is pretty good and not strictly an AI thing... but I'll give that one credit.
Software Engineers, maybe it's more about the (I don't even know the right current FAANG acronym anymore) companies that have laid off tens of thousands in the past few years and largely replaced them (if at all) with either contract or H-1B workers? Only to grow short-term margins at already profitable companies.
ecshafer
a day ago
There seems to be this dream of Tax AI Software that will just do all of the taxes. But other than using AI as a fancy text search, I don't see it happening for a long long time. LLMs can't do arithmetic or count.
Havoc
a day ago
Yeah - classifying an invoice as building rent or, say, printer ink, it'll have some success. So we'll see some of it at the very bottom end.
>LLMs can't do arithmetic or count.
Yes. The fancy copilot stuff does use pandas/python to look at excel files so stuff like add up a table does work sometimes, but the parameters going into the pandas code need to make sense too in the garbage in garbage out sense. The base LLM doesn't seem to understand the grid nature of Excel so it ends up looking at the wrong cells or misunderstands how headings relate to the numbers etc.
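As a concrete illustration of the header problem: the same sheet read with the header row off by one gives pandas a completely different table, and any downstream sum is garbage in, garbage out. A sketch with made-up data:

```python
import pandas as pd

# Hypothetical sheet layout: a title row sits above the real
# headers, the kind of grid the LLM routinely misreads.
sheet = [
    ["Q1 Expenses", ""],       # title row
    ["category", "amount"],    # real header row
    ["rent", 1000],
    ["printer ink", 50],
]

# Correct read: row 1 is the header, data starts at row 2.
good = pd.DataFrame(sheet[2:], columns=sheet[1])
total = good["amount"].sum()   # 1050

# Off-by-one read: the title row becomes the header, the real
# header row is counted as data, and there is no "amount"
# column at all to sum.
bad = pd.DataFrame(sheet[1:], columns=sheet[0])
```

The pandas code in both cases is perfectly valid; the error is in the parameters going in, which is exactly the part the LLM gets wrong.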
It'll get better but there doesn't seem to be the equivalent of "use LLM to write boilerplate code" in this world.
rwmj
21 hours ago
We use Concur (SAP's expenses software), and it can scan your paper receipts and fill in the fields for you. I'd say it's about 30% accurate. Occasionally it'll be incredible, but mostly you end up having to manually adjust fields. It even gets categories completely wrong, like classifying a train ticket as a phone bill. All this means you spend a lot of time checking everything. It'd be hard for me to say honestly that it saves any time; it probably takes a bit more.
rogerkirkness
a day ago
It is profoundly bad at accounting. But with a calculator tool, it works okay for math.
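The calculator-tool pattern is roughly this: the model emits an expression as text and a deterministic evaluator does the arithmetic. A minimal sketch (the function name and scope are made up, not any particular vendor's tool API):

```python
import ast
import operator

# Map AST operator nodes to real arithmetic, so the evaluator
# only accepts plain math and nothing else.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculator(expression: str) -> float:
    """Safely evaluate an arithmetic expression like '2 * (3 + 4)'."""
    def ev(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)

print(calculator("1047.25 * 12"))  # 12567.0
```

The math is then exact; what's still on the model is picking the right numbers to feed in, which is where the accounting failures above come from.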
toss1
a day ago
Yup, using AI for any serious tax calculation or even advice is a REALLY BAD idea.
A close relative is a top expert in US Trust & Estate Tax law working at a well-known BigLaw firm. Of course they have substantial AI initiatives, integration with their system, mandatory training, etc.
She finds the AI marginally useful for some things, but overall not very, and there are serious errors, particularly the types of errors only a top expert would catch.
One of the big examples is that in the world of T&E law, there are a lot of mediocre (to be kind) attorneys who claim expertise but are very bad at it (causing a lot of work for the more serious firms and a lot of costs & losses for the intended heirs). The mediocre-minus attorneys of course also write blogs and papers to market themselves, often in greater volume than the top experts. Many of these blogs/papers are seriously WRONG, as in giving the exact opposite of the right advice.
Everyone here sees where this is going. The AI has zero ability to reason or figure out which parts of its training input are from actual top experts and which are dreck. The AI cannot reason, and cannot even validly check its 'thinking' against the existing tax code (which is massive), or the regulations and rulings (which are orders of magnitude more massive). So the AI gives advice that is confident, cheerful, and WRONG.
Worse yet, the LLM's advice is wrong in ways only a top expert would know, and in ways that will massively screw the heirs. But the errors will likely only be discovered decades later, when it is too late to fix.
Seriously, do NOT use LLMs for tax advice, unless you are also consulting a TOP professional. And skipping the LLM part is best.
My relative is quite frustrated and annoyed by the whole thing: the AI should be more helpful with these massive codes/regs/rulings, but she finds it often more work than just using the standard Westlaw/Lexis legal database searches.
Balgair
a day ago
Aside: Hey, what's the prompt you're using for a summary of the news events?
Havoc
16 hours ago
Nothing super sophisticated - the key part is telling it the categories I want. It seems really good at sticking to that out of the box and seems to do 2-3 items for each category.
```
Summarize today's news stories in concise bullet points focusing on relevance and clarity. Organize the summary into the following sections:
* World Events: Major global developments and international headlines
* Geopolitics: Political tensions, alliances or diplomatic events
* London: Local news and developments specific to London
* Supply chains: etc
```
One thing I've had ZERO luck with is making it look for forward-looking economic indicators. No idea why, but it doesn't work at all.
vonneumannstan
a day ago
LLMs basically can't do arithmetic directly, trying to get them to do so is a skill issue. Most models can and will happily write and execute code to do that work instead.
Havoc
16 hours ago
What I described is a setup that DOES have pandas & Python access and uses it heavily to figure out the Excel files. Which is neat, but the output is still wrong.
giancarlostoro
a day ago
Which drives me a little crazy. Every LLM worth its salt should just... hand the arithmetic of any question off to MCP or whatever; I assume the good ones do.
achenet
a day ago
> It's like a programming language where the equality (==) symbol has a built in 20% random number generator and you're supposed to write mission critical code with it.
<bad joke> Why are we talking about JavaScript in a thread about AI? </bad joke>
tuatoru
17 hours ago
There are a lot of jobs which don't require meticulous accuracy - coming up with marketing plans, press releases, writing HR policies, reading and summarising reports, etc.
Even in accounting I am sure you could use AI at times to help write or read your emails or summarise legislation or IRS rulings. Have it drive Excel or your financial systems directly? No, not yet.
canonistically
an hour ago
> There are a lot of jobs which don't require meticulous accuracy - coming up with marketing plans, press releases, writing HR policies, reading and summarising reports, etc.
I'm sorry but you think marketing plans, press releases, HR policies and report summaries do not need to be accurate? What sort of organization do you work with??
IIAOPSW
a day ago
In fairness, a "20% random number generator" on "mission critical code" is something they literally do at NASA