Tell HN: Is this AI doom loop? I thought o1-preview was amazing, until it wasn't

11 points, posted 17 hours ago
by moomoo11

Item id: 41683399

13 Comments

proc0

16 hours ago

Right, I think it's a limitation of deep learning. Transformers and gigantic leaps in scaling have allowed AI models to reach impressive capabilities, but at its core it's limited by its training data. This is not how actual intelligence works. People don't need terabytes of information to learn how to communicate and reason.

For this reason the current AI trend will have a bigger impact on creative tasks, rather than critical technical ones. It's already great for generating art quickly, at least for the concepting phase, and other creative assets that can afford to be generic. Solving technical problems, on the other hand, requires reasoning beyond what can be extracted from training data.

We'll need a new paradigm of AI in order to have a chance at creating models that properly reason. Even without detailed knowledge of the brain, we can safely speculate that the reasoning and language areas are extremely efficient compared to cutting-edge LLMs, which means there are algorithms more complex and efficient than simple artificial neural connections that just sum weights with a bias.
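For reference, the "sum weights with a bias" operation the parent is describing is just a single artificial neuron. A minimal sketch (illustrative names, not from any real framework):

```python
# One artificial neuron: weighted sum of inputs, plus a bias,
# passed through a nonlinearity (ReLU here).
def neuron(inputs, weights, bias):
    # weighted sum of inputs plus bias
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ReLU activation: pass positive values through, clamp negatives to zero
    return max(0.0, z)

print(neuron([1.0, 2.0], [0.5, -0.25], 0.1))  # 0.5 - 0.5 + 0.1 = 0.1
```

Everything in a transformer ultimately bottoms out in huge stacks of this operation, which is the parent's point about its simplicity relative to biological circuits.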

mergisi

2 hours ago

AI can be frustrating when it falls short on complex tasks, especially with long or intricate code. To get the most out of it, you need to break big problems down into smaller pieces and choose the right AI for your needs, like OpenAI o1 or Claude Sonnet. Leveraging dev tools like Cursor AI can also help productivity. For tasks like SQL generation, specialized tools like https://ai2sql.io/ work great. AI isn't perfect for everything, but used selectively it can still be super helpful.

eschneider

17 hours ago

Yes. You're expecting too much. Generative AI models don't "understand" your problem, they don't even "understand" how to program. They're just fitting whatever data they've seen to your input.

gwoolhurme

10 hours ago

In his defense, that is how it's marketed: that this new model can reason.

loveparade

5 hours ago

Yeah, but "reason" is not a well-defined term. It means different things to different people in different contexts. It's just marketing speech. You can easily argue that all ML models, even those from 50 years ago, can reason to some extent.

gwoolhurme

4 hours ago

Fully agree; that's kind of my point, though. It's a very tall order for some people, like the OP.

jprete

16 hours ago

It never occurred to me before that chatbot randomness might have the same reward structure as a slot machine, but apparently it does. OpenAI got you to spend all your credits on seven attempts at this one problem. I'm not saying you're addicted, but I wonder about the people who absolutely gush over it.

resource0x

16 hours ago

It's time for "How many programmers does it take to screw in a lightbulb using AI" jokes.

muzani

14 hours ago

It might be just an OpenAI thing where they ramp up the power a week after the demo to get more subscriptions, then bring it down gradually. The forums now happily gaslight you into thinking it's a conspiracy theory, despite the evidence. It's easy to catch - just share an amazing input/output response with a friend, then try the exact same thing a month later.

It's one of the arguments for using open source AI even though it's still a little behind - at least when you're running it on your own system, you know if you're the problem.

solardev

15 hours ago

Does Claude work any better for you?

moomoo11

12 hours ago

I have tried Claude and I find that it is more "to the point", but it still suffers from giving wrong or unusable answers.

Both Claude and ChatGPT are good for rote tasks, but Claude is definitely more succinct.

b20000

13 hours ago

these models are search engines with some interpolation thrown in

rvz

14 hours ago

> When people hype up that the AI solved something for them, I wonder were they lazy like me working with something complex, or were they lazy and didn't even try on something simple?

The truth they won't tell you is that they have likely invested in that AI tool and are hyping it up with their VC friends, insisting "it works" even when they know it doesn't.

Each time I talk to the AI bros about these limitations, they resort to whataboutisms like "But humans hallucinate too!" and "The human brain is the same as an LLM" as nonsense excuses.

LLMs do not "understand" your problems, nor can they reason about them. o1 is no different. Instead of buying into the scam, prompting endlessly for garbage results, and attempting to replace your co-worker, actual programmers can plan the code themselves and solve it, especially for unseen code or syntax that is still changing.

Whenever I see someone promoting another AI tool, I always check who invested, and 9 times out of 10 it is funded by VCs and ex-FAANG engineers, yet again on the snake oil grift. (And they know it, but will never admit it.)