alangibson
5 months ago
That wasn't prerecorded, but it was rigged. They probably practiced a few times and it confused the AI. Still, it's no excuse. They've dropped Apollo-program levels of money on this and it's still dumb as a rock.
I'm endlessly amazed that Meta has a ~2T market cap, yet they can't build products.
qgin
5 months ago
I don't think it was pre-recorded exactly, but I do think they built something for the demo that responded to specific spoken phrases with specific scripted responses.
I think that's why he kept saying exactly "what do I do first" and the computer responded with exactly the same (wrong) response each time. If this was a real model, it wouldn't have simply repeated the exact response and he probably would have tried to correct it directly ("actually I haven't combined anything yet, how can I get started").
poidos
5 months ago
It's because their main business (ads, tracking) makes infinite money, so it doesn't matter what all the other parts of the business are, what they do, or whether they work at all.
IncreasePosts
5 months ago
That's Google's main business too; they have infinite money, plus about 50% more of it than Meta, and they're still in the top two for AI.
marcus_holmes
5 months ago
Google are well-known, like Meta, for making products that never achieve any kind of traction, and are cancelled soon after launch.
I don't know about anyone else, but I've never managed to get Gemini to actually do anything useful (and I'm a regular user of other AI tools). I don't know what metric it gets into the top 2 on, but I've found it almost completely useless.
pnt12
5 months ago
I agree they aren't building great user products anymore, but Gemini is solid (maybe because it's more an engineering/data achievement than a UX thing? The user controls are basically a chat window).
I asked it for a deep research report on a topic and it really helped my understanding, backed by a lot of sources.
Maybe it helps that their search is getting worse, so Gemini looks better by comparison. But nowadays even Kagi seems worse.
FrinkleFrankle
5 months ago
Kagi has their own AI assistant that lets you choose any model from the major, and some not-so-major, providers. You can even hop between them in the same chat. It can also search for results using Kagi, including any lenses you've configured.
It's worked extremely well for me. Their higher subscription tier was less than ChatGPT + Kagi combined. I haven't used Gemini in its own interface yet to compare, though.
bhrlady
5 months ago
In what ways does Kagi seem worse? Any specific examples?
thebytefairy
5 months ago
Please share an example. Your 'almost completely useless' claim runs counter to any model benchmark you could choose.
DoctorOW
5 months ago
I'm not the person you're responding to, but I feel I have a great example. Replacing the Google Assistant with Gemini has made my phone both slower and less accurate. More than once I've said "Hey Google, play <SONG> by <ARTIST>" and had my phone chirp back that the song is available for streaming instead of just playing it. Once, it even claimed it wasn't capable of playing music, I assume because that's true on other platforms.
marcus_holmes
5 months ago
The most spectacular failure was when I asked it to make a logo for a project. The project has "cogs" in the title, but that refers to Cost of Goods Sold, not the physical object, so I specified that it should not include a cog in the logo. Of course, it included a cog in the logo.
I asked it to help me create a business plan. Partway through it switched to Indonesian, for no reason I could see. Then, after about two hours of work on the plan, with about 200K tokens in the context, it stopped outputting anything reasonable.
I have tried to get it to help with Google Sheets formulae about a dozen times so far. Not once has it actually got anything right. Not once.
It's serviceable as a chatbot, but completely useless if you try to get it to actually do anything.
bugglebeetle
5 months ago
Gemini just eclipsed ChatGPT to become #1 on the Apple App Store for these kinds of apps. The 2.5 Pro series is also good/SOTA at coding, but unfortunately poorly trained for the agentic workflows that have become predominant.
johnnienaked
5 months ago
Annoying to boot
rixed
5 months ago
Haven't Google also famously faked a phone call with an AI some years ago for an event?
bitpush
5 months ago
> When you call a business, the person picking up the phone almost always identifies the business itself (and sometimes gives their own name as well). But that didn't happen when the Google assistant called these "real" businesses:
That's the whole argument?
gruez
5 months ago
No, because if you read the article you'd see that there's more, like the "business" not asking for customer information or the PR people being cagey when asked for details/confirmation.
johnnyanmac
5 months ago
At this point, honesty is an oasis in what 2025 has become: a year of scams and grifts. I'm just waiting for all the bubbles to pop.
fullshark
5 months ago
It's been this way since natural internet user base growth dried up
johnnyanmac
5 months ago
True. We at least had a period where they weren't so blatant about it. But now it's robbery in broad daylight.
privatelypublic
5 months ago
Well, it _IS_ a rock after all.
imiric
5 months ago
At that point, why not just go all out with the fake demo and play all the responses from a soundboard?
They could learn a thing or two from Elon.
smelendez
5 months ago
That was my thought — the memory might not have been properly cleared from the last rehearsal.
I found the use case honestly confusing though. This guy has a great kitchen, just made steak, and has all the relevant ingredients in house and laid out but no idea how to turn them into a sauce for his sandwich?
pessimizer
5 months ago
Yes. Even if the demo worked perfectly, it's hopelessly contrived. Just get text-to-speech to slowly read you the recipe.
al_borland
5 months ago
> Just get text-to-speech to slowly read you the recipe.
Even this feels like overkill, when a person can just glance down at a piece of paper.
I don’t know about others, but I like to double-check what I’m doing. Simply having a reference I can look at would be infinitely better than something talking to me, which would need to repeat itself.
XorNot
5 months ago
A hardened epaper display I could wash under a sink tap for the kitchen, with a simple page forward/back voice interface would actually be pretty handy now that I think about it.
com2kid
5 months ago
Paper gets lost, it gets wet, it gets blown around if not weighed down, and when weighed down it quickly gets covered with things.
When prepping raw ingredients, one has to be careful not to contaminate the paper, or at least the thing weighing the paper down, which may be covering the next step.
I cook a lot of food, and having hands-free access to steps is a killer feature. I don't even need the AI; just the ability to pull up a recipe and scroll through it using the wrist controller they showed off would be a noticeable, albeit small, improvement to my life multiple times per week.
hypertele-Xii
5 months ago
I just bought paper made from stone that doesn't get wet, so there's one problem solved!
x0x0
5 months ago
You could imagine some utility to something that actually worked if it allowed you to continue working / not have to clean a hand and get your phone out while cooking. (Not a ton of utility, but some). But if it stumbles over basic questions, I just can't see how it's better than opening a recipe and leaning your phone against the backsplash.
JKCalhoun
5 months ago
Or pick up a bottle of bulgogi sauce?
dgfitz
5 months ago
> confused the AI.
I will die on this hill. It isn’t AI. You can’t confuse it.
jayd16
5 months ago
They "poisoned the context" which is clearly what they meant.
AdieuToLogic
5 months ago
>>> confused the AI.
>> I will die on this hill. It isn’t AI. You can’t confuse it.
> They "poisoned the context" which is clearly what they meant.
The "demo" was clearly prescriptive and not genuinely interactive. One could easily make the argument that the kayfabe was more like an IVR[0] interaction.
0 - https://en.wikipedia.org/wiki/Interactive_voice_response
dgfitz
5 months ago
“The blue square is red.”
“The blue square is blue.”
“The blue square is green.”
The future is here.
taneq
5 months ago
Ok, you’ve piqued my interest. What’s required in order for something to be genuinely confused?
MyOutfitIsVague
5 months ago
The ability to think. If it can't think in the first place, it can't get confused. Whether it's "real AI" or not depends on the semantics of what you consider AI to be:
* If you think it's something that resembles intelligence enough to be useful in the same ways intelligence is, and to seem to be intelligence, this is clearly it. The "plant-based meats" of AI.
* If you think it means actual intelligence that was manufactured by a human, this is not that. It's shockingly impressive autocorrect, and it's useful, but it's not actually thinking. This would be "artificially created intelligence"; in essence, real intelligence with an artificial origin. The lab-grown meat of AI.
For the latter, I really think it needs reasoning ability that isn't based on language. Plenty of animals can think and reason without depending on language. Language is a great carrier of intelligence, which is why LLMs work so well, but language is not the foundation of intelligence.
That said, I think "confused" is a fine enough anthropomorphization. I refer to things like Bluetooth issues as the machine getting confused all the time. Back in the day, Netflix would often have problems with the wrong image showing for a show, and I called that "Netflix getting confused". We know it's not actually getting confused.
taneq
5 months ago
You're just moving the question from "confused" to "think", and I think you're also conflating "to be confused" (which was what I said) with "to feel confused" (which is a whole other thing.)
I guess my definition of 'to be confused' is something like 'to have a significant mismatch between your model and reality', so yeah, you could argue that a PID controller is "confused" by a workload with significant inertia. And 'to feel confused' would be 'to identify a significant mismatch between your model and reality', of which clearly a PID controller is not capable, but most multicellular life forms are.
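To make the PID example concrete, here's a toy sketch (plain Python, made-up gains and masses, not any real controller): the same gains that settle a light load will badly overshoot a load with twenty times the inertia, i.e. the controller's implicit assumptions about the plant no longer match reality.

    # Toy PID loop driving a mass toward a setpoint of 1.0.
    # Gains are hand-picked for mass=1; nothing here is a real tuning recipe.
    def run_pid(mass, kp=2.0, ki=0.1, kd=2.0, setpoint=1.0, dt=0.01, steps=3000):
        pos, vel, integral, prev_err = 0.0, 0.0, 0.0, setpoint
        peak = 0.0
        for _ in range(steps):
            err = setpoint - pos
            integral += err * dt
            deriv = (err - prev_err) / dt
            prev_err = err
            force = kp * err + ki * integral + kd * deriv
            vel += (force / mass) * dt   # heavier mass: same force, less effect
            pos += vel * dt
            peak = max(peak, pos)
        return peak

    print(run_pid(mass=1.0))    # peaks close to the setpoint
    print(run_pid(mass=20.0))   # overshoots well past it with the same gains

The controller isn't wrong in any interesting sense; its assumptions just stopped matching the plant.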
chmod775
5 months ago
If by "genuinely confused" we are talking about what is generally called the emotion "confusion", then you need meta-cognition to even recognize your thinking is confused in order to then experience it. LLMs can model reflection by using their own output as input, but otherwise lack meta-cognitive processes. Through reflection they can later recognize what they did previously was wrong, but they will never experience "what I am currently thinking seems wrong/does not fit".
However "confusion" can also mean "mistaking one thing for another" or simply "doing the wrong thing", which is something computer programs have been able to fuck up since forever.
Fade_Dance
5 months ago
Just try asking an AI about the Seahorse Emoji (insert close-yet-comically-incorrect emojis since they won't render here):
_________
ChatGPT: There is a seahorse emoji: — wait, no, that’s a lobster. Okay, for real now:
Here it is: (unrenderable emoji) Seahorse — argh, it’s still not showing properly here.
Let’s try again:
(hedgehog) (sauropod) (T-rex) (otter) (seal) (lobster) …but no actual (squid) or (mermaid) matches…
Okay, deep breath — I checked the official Unicode list. And you’re right: The seahorse emoji does exist — it’s — no, okay really:
(hedgehog) (hippo) (peacock) Still nothing?
Wait — turns out, I was wrong, and you’re right to call me out.
There is a seahorse emoji:
— no!
_________
tomrod
5 months ago
Capacity for will, I would imagine the OP may respond. I'm not averse to the position.