hackernews client

Show HN: Open-source real-time talk-to-AI wearable device for few $

103 pointsposted a year ago

125 Comments

throwaway314155

a year ago

> In the US, about 1/5 children are hospitalized each year that don’t have a caregiver. Caregiver such as play therapists and parents’ stress can also affect children's emotions.

Trust me, large language models are not anywhere close to being able to substitute as an effective parent, therapist, or caregiver. In fact, I'd wager any attempts to do so would have mostly _negative_ effects.

I would implore you to reconsider this as a legitimate use case for your open device.

> We believe this is a complement tool and it is not intended to replace anyone.

Well which is it? Both issues you list heavily imply that your tool will serve as a de facto replacement. But then you finish by saying you don't intend to do that. So what aspects of the problems you listed will be solved as a simple "complement tool"?

Intralexical

a year ago

> In fact, I'd wager any attempts to do so would have mostly _negative_ effects.

It does kinda send an interesting message to a child, doesn't it? "You're not worth the time of anybody human, so here's a machine instead."

And that's before the chat even starts (and eventually goes off the rails).

edmundsauto

a year ago

Wouldn't the alternate message be "you're not worth the time of anybody human or machine"? That seems strictly worse.

guappa

a year ago

But they don't need a machine, they need a human :D

CryptoBanker

a year ago

Might be a good lesson for the world they will enter…see the average customer support experience from large companies these days.

szundi

a year ago

I had a quite good social sciences teacher.

I never forget one of his remarks: There can be only one thing that is worse than someone not having a mother - that he has one.

So maybe a chatty LLM is not the worse thing that can happen with someone.

petee

a year ago

Wow someone had a bad childhood...why even share that with your class?

HeatrayEnjoyer

a year ago

Why not?

noworriesnate

a year ago

Because it's literally misrepresenting the facts.

HeatrayEnjoyer

a year ago

Misrepresenting that they had a bad childhood? What makes you think that's the case?

noworriesnate

a year ago

Here's the fact that's being presented:

> There can be only one thing that is worse than someone not having a mother - that he has one.

However, this is not accurate.

fragmede

a year ago

> Trust me, large language models are not anywhere close to being able to substitute as an effective parent, therapist, or caregiver.

You're asking us to trust you, but why should we trust you in this matter? Regardless of if I think ChatGPT is any good at those things, you'd need some supporting evidence for that one way or another before continuing.

throwaway314155

a year ago

It's an expression. In this context I just meant "it should be obvious". Maybe try steel-manning my argument first. If you really can't see why that's likely the case after using a LLM yourself, then I'll be happy to admit that I'm making an emotional argument and you're in no way required to "trust me".

fragmede

a year ago

https://chatgpt.com/share/6701aab3-2138-8009-b6b8-ec345b4382...

Why is that "not anywhere close to being able to substitute as an effective parent, therapist, or caregiver."?

Maybe I've had a bad parents/therapists/caregivers all my life, but it seems like an entirely reasonable response. If there's a more specific scenario you'd like me to pose and show me that it's advice is no good, I'm happy to ask it.

throwaway314155

a year ago

I gladly admit that I was making an appeal to emotional intelligence and you won't likely agree with me no matter what back and forth we go through.

fragmede

a year ago

I'm not sure why you assume I'm coming from a position of bad faith but to skip the back and forth, I'll just plainly state where I'm coming from. I'm agnostic as to the whole thing and ultimately, to be totally transparent, I still have a human therapist, for good reason. But he's only available during set hours so when I'm in crisis at 3am on a Tuesday, I also fully admit that I'll have conversations with ChatGPT. I'm sure I'm not alone in doing so.

I'm not trying to convince you that it's, right now, a replacement for a human parent/therapist/caregiver. it's the "not anywhere close" part that I'm responding to. It's closer than talking with a speak and spell, or a See'n'say, for instance, but also ahead of static worksheets that you can't have a conversation with. I have no idea if this is good for society, and I have no idea where this technology will take us.

I want to know the limitations of this technology, and I'm willing to be convinced that, hey, maybe what some of it's saying isn't helpful as a therapist, because that's interesting. The number of R's in strawberry, for instance has a specific technical reason it's bad at, because of how tokenization works. If, after being fed every psychology textbook, the advice it gives would be egregiously or subtly bad/wrong/harmful, or biased towards, say, giving a Freudian analysis when the industry's moved way past that, I'd like to know and hear about it, so I better know when not to trust its advice and be able to warn others.

throwaway314155

a year ago

i'm of the opinion that it's like self driving cars. Even if you get 99.999% of the way there it's still "not anywhere close" to the real thing because you're speaking with something that has little to no agency and acting as though it's a good substitute for a person.

My instincts tell me that humans are pretty good at detecting this difference. And when they aren't - they still won't like being lied to or tricked about it. You can see it already - generative art, or music for instance is (in some cases) objectively more impressive than art created by humans all else constant. You might trick a contest into giving you an award but the moment people find out it's generated, they almost immediately react angrily and no longer express interest in the result.

That's because they used to attribute the result to a person and now they know it's not a person. The psychology there probably isn't even fully fleshed out, but i feel it instinctively, as I said before. And I suspect others do as well based on the reactions here.

Sorry for assuming bad faith. i've met a lot of persons here who really do think LLM's in their current form are a kind of sentience. Blake Lemoigne (sp?) is a good example of that kind of naïveté.

I too have a human therapist, doctors, etc. And I too find myself chatting with ChatGPT, etc. about personal issues and in certain cases benefit tremendously from it. In particular, whenever it is something I would normally feel embarassed to say to another actual human. Since I am very confident ChatGPT doesn't have feelings or even an internal monologue with which it could "judge" me - I have no issue telling it such things. The benefit here is from the questions I can have answered that would otherwise go unanswered. I think this makes for a potential assistive technology as you implied earlier (better than a worksheet).

But for precisely that same reason, it will never work (in its current form) as a complete substitute for a human. And attempts to do so may in fact be actively harmful (as I originally suggested). Again, I'll just say that I don't think there's yet enough research on this but that "I know it when I see it". Any sufficiently serious topic I discuss with ChatGPT ultimately winds up with me drained because I feel as though I'm talking to a wall and not actually being acknowledged by anyone with agency who matters to me.

I will definitely admit that this is a highly opinionated take and is rooted in a lot of my personal feelings on the matter. As such, I can't really say that I've definitively proven that my point is the correct point. But, I hope you at least get the gist of what I am saying.

fragmede

a year ago

For something that's not rigorously defined, 99.999% and 100% is pretty frickin close together in my book. Like, TherapistGPT isn't going to randomly say you should go kill yourself.

Unfortunately, I'm not sure what your point actually is. Is ChatGPT in it's current form, a replacement for human contact? absolutely not. do people have strong emotions around something using a GPU and a bunch of math and was generated instead of being organically hand crafted by a human being, and having it fall into the uncanny valley? totally. is this box of matrices and dot products outputting something I personally find useful, despite shortcomings? yeah.

I agree that there's totally this brick wall feeling when ChatGPT spins itself in circles because it ran out the context window or whatever.

at the end of the day, I think the yacht rock cover of "closer" is fun, even though it's AI generated. however that makes you feel about my opinions.

https://youtu.be/ejubTfUhK9c

ben_w

a year ago

> Like, TherapistGPT isn't going to randomly say you should go kill yourself.

It won't literally do that, the labs are all careful about the obvious stuff.

But consider that Google Gemini's bad instructions almost gave someone botulism*, there's a high chance of something like that in almost every field. I couldn't tell you what that would mean in therapy for the same reason I wouldn't have known Gemini's recipe would lead to culturing botulism.

These are certainly more capable than anything before them, but the Peter Principle still applies, we should treat them as no more than interns for now. That may be OK, may even be an improvement on not having them, but it's easy to misjudge them.

* https://news.ycombinator.com/item?id=40724283

user

a year ago

[deleted]

fragmede

a year ago

Hey I think you replied and then deleted it so I don't totally know what your said but I wanted to say thank you for your time, even though we didn't manage to come face to face on things it was nice talking to you.

throwaway314155

a year ago

Likewise.

eddd-ddde

a year ago

Honestly I don't see it as an "obvious" thing.

I won't be surprised if in a couple more years this kind if thing is the norm. I don't think there's anything inherently different from a person that listens to you.

ben_w

a year ago

It wasn't obvious for a long time, but the closest we have to a relevant experiment* shows that physical contact is also necessary for parenting, especially soft contact: https://en.wikipedia.org/wiki/Harry_Harlow

Humanoid robots are improving, so I won't say "never", but I will say "not yet". Not in isolation at least.

* and likely the closest we ever will, because it was disturbing enough to a big influence on animal welfare movement

tempodox

a year ago

I'm at a loss for words. If you really think there's no difference between a human and a machine, I don't know what to tell you.

lancesells

a year ago

I'm not trying to shit on this but the world seems to have jumped headfirst into dystopian Black Mirror episodes. It looks like people too young for children thinking AI technology is the answer.

Again, not shitting on the people creating this and this is the forum for it, but I feel all of this is just such a wrong direction for people and humanity in general.

zq2240

a year ago

Like in pederatic care, not every child has a parent who takes good care of them. In hospitals, it is more often play therapists who do this work, but their negative stress can also affect children's emotions. For example, some children feel very traumatized before doing line placement/blood test. This tool can help explain the specific process to them using empathic language and encourage them on specific topics.

I mean doctors and play therapists still have to do their job, We have interviewed some doctors who feel particularly frustrated about how to comfort children before tests or surgeries. They hope for a tool can help building comfort for kids -> which means time is faster to run tests.

tempodox

a year ago

Ultimately, you are repackaging the services of actual LLM suppliers without having any knowledge or control of how those services might develop in the future. So it is logically and physically impossible for you to represent the fitness of those services for any purpose whatsoever. And anyone else you may have asked questions about this cannot either.

I can only urge you to reconsider how honest, realistic, and credible those promises you make can possibly be. After all, you are playing with the lives and wellbeing of humans here. Every drug and therapeutic device has to go through rigorous vetting and testing before being cleared for human treatment. Ever heard of clinical trials? And you seriously think you can skip that with “we asked some pediatricians”? Please, think again. And ask someone with more domain knowledge than vague hopes in a technology they don't understand.

zq2240

a year ago

Thanks for your feedback. I want to clarify that Starmoon is not being positioned as a therapeutic or medical device. Rather, it is intended as a supplementary tool that might potentially support emotional well-being, similar to how some doctors use YouTube videos to comfort kids for non-clinical support. Currently, we have agreements to pilot the product with a few hospitals in London to collect more data through trials to improve it.

tempodox

a year ago

A doctor can look at a YT video and know what's in it. They can judge whether it's appropriate for a child to watch. No doctor can know what output LLMs will generate, it is impossible to verify that it will be safe. Contrary to your claim those things aren't similar at all.

akadeb

a year ago

Thanks for this feedback. I hear what you are saying. If I understand correctly its like LLMs are a black box doctors dont understand and while it talks back in a friendly voice it can cause harm if it says something awful. While this is not likely due to our prompting it is not impossible either.

Junru and I will discuss the approach with pediatric care based on this. And I agree having domain expertise/advisors to guide us in the right direction is important

laylower

a year ago

Maybe convert it to an offline version of furby instead. My position would be not to confound treating/medicine with a neat stuffed animal.

renewiltord

a year ago

[flagged]

maeil

a year ago

The machine in question is not based on a set of rules of better quality. It regurgitates the average of the exact humans you're talking of. This is no improvement.

renewiltord

a year ago

No. It regurgitates the average of all humans represented in the text record available. These humans are 80% using therapists. The average human is not.

maeil

a year ago

It regurgitates the average human completion of a certain text/sentence. Guess which humans have been writing the average text related to these subjects. Not plumbers and bakers.

bn-l

a year ago

Fully agree. I think one thing future generations will look back on us with disgust is how flippantly we care for children. With an LLM we have at least a chance of instilling the right values, providing the right answers to questions, etc.

nradov

a year ago

No one knows what the "right" values are, least of all an LLM.

bn-l

a year ago

We can get close. And heaps closer than the alternative.

nradov

a year ago

Who is "we"? Are you referring to the handful of tech company employees who control LLM development? Do they have the right values? What are the right values?

bn-l

a year ago

The we is we as a species. But if you're point is that it won't end up like that then yeah, fair enough.

moralestapia

a year ago

>I would implore you to reconsider this as a legitimate use case for your open device.

OP, I would implore you to not listen to any of this "advice" at all and just keep on building really nice things.

I can already think of a dozen valuable applications of it in a therapheutic context.

Ignore those who don't "do".

brailsafe

a year ago

> Ignore those who don't "do".

I'm actually pretty ok with ignoring those who don't "think" before they "do", not that the OP is one of those people, but "doing" as a mark of virtue seems fairly likely destructive

moralestapia

a year ago

One day of doing is worth a billion years of thinking.

The world is material, not imaginary.

brailsafe

a year ago

Ya, I guess, or you could just measure twice and cut once

Nullabillity

a year ago

There's nothing admirable about charging head-first in the wrong direction.

moralestapia

a year ago

Perhaps you are a psychic but that is not my case.

"Charging head-first", even in the wrong direction, is the only thing worth doing.

RHSeeger

a year ago

That type of logic is generally used to address the problem of being unwilling to act, for fear of making the decision; and it's reasonable in that context. But taking the time to understand the situation so that you can make the best decision possible (with the available information) is almost always beneficial.

RHSeeger

a year ago

I would argue just the opposite. Thinking without doing accomplishes very little. Doing without thinking might accomplish something, or it might be utterly destructive and take 1000x the amount of "doing" (and a lot of thinking) to undo.

brailsafe

a year ago

Agreed, but would add that deciding not to do something is an underappreciated action of doing. If the thinking process results in deciding your deployable resources can be better used, how would that not also be "doing". The act of relentless material production seems so wasteful tasteless.

moralestapia

a year ago

You and GP and all others in this thread of comments.

Can you share anything you've done so that we can all see it?

RHSeeger

a year ago

I'm not sure I see the point in that, since you wouldn't have the context involved in each decision to know if we made a good one. Unless you're specifically looking to identify that we haven't done anything _you_ consider valuable, so our opinion that thinking before we act is important isn't valid.

I can say

- I have a wonder family that loves me (and vice versa)

- I have a place to live and can pay the bills

- The people I work with are glad I'm working with them (and vice versa)

- I've been a software developer for decades and I get to solve new and (what I find) interesting problems on a fairly regular basis. Most of that problem solving involves thinking; with a minor bit of doing at the end.

So overall, I'm pretty happy with where my life philosophy has taken me.

brailsafe

a year ago

> I've been a software developer for decades and I get to solve new and (what I find) interesting problems on a fairly regular basis. Most of that problem solving involves thinking; with a minor bit of doing at the end

Great answer overall, but this bit specifically is so crucial to maintaining interest long-term. Mechanistic programming needs to be done sometimes, and it's every junior's first impulse when presented with what they think has an obvious solution (see the problem -> start coding, or "why don't we just refactor this? I'll just put in a ton of extra unpaid hours and make everything better"), but that wears thin very quickly to the point of risking burnout.

You need to develop perspective on why you do something and what impact it may or may not have on small or large scales, and put yourself in positions where the majority of your labor goes to understanding how to apply your skills or resources in a sufficient way, given the constraints in front of you and down the line; this isn't just to advance your measurable skills or your resume, but to maintain your interest in what can quickly become incredibly dull and soul-sucking.

moralestapia

a year ago

Those things are great things to have, unironically.

You're blessed.

zq2240

a year ago

Thank you for all your advice.

echoangle

a year ago

I don't want to criticize a cool project but why do people feel the need to create new hardware for AI things? It was the same thing with the rabbit r1. Why do I need a device that contains a screen, a microphone and a camera? I have that, it's called a smartphone. Being a bit smaller doesn't really help because I have my phone with me almost all the time anyways. So it's actually more annoying to carry the phone and the new device instead of just having the phone. I would be happy with it just being an app.

suriya-ganesh

a year ago

I can answer to this, having worked on an assistant that is always on, from your phone.

The platforms (ios, Android, etc.) are very limiting. It is hard to have something always on and listening. Especially apple is aggressive with apps running in the background.

You need constant permissioning and special privileges. The exposed APIs themselves are not enough to build deep and stable integrations to the level of Siri/Google Assistant.

echoangle

a year ago

Oh, I didn't get that it's supposed to always be listening. Maybe I'm not the target audience but I wouldn't want that anyways. If that's important, that might be a good reason. I think this needs to change in the future though if AI agents are supposed to become popular, I can't imagine buying separate hardware every time. Either the integration in the OS needs to become better or Google/Apple will monopolize the market and be the only options.

jsheard

a year ago

> Oh, I didn't get that it's supposed to always be listening. Maybe I'm not the target audience but I wouldn't want that anyways.

I don't know about this project, but generally when a voice assistant is "always listening" they mean it's sitting in a low power state listening for a very specific trigger like "Hey Siri" or "OK Google" and literally nothing else. As much as they would probably like to have it really listening all the time, the technology to have a portable device run actual speech recognition and parsing at all times with useful battery life doesn't really exist yet.

joeyxiong

a year ago

You are right, “always listening" they mean it's sitting in a low power state listening for a very specific trigger like "Hey Siri" or "OK Google" and literally nothing else.

fragmede

a year ago

Yes it does. A Nvidia Jetson Nano with a microphone running Whisper with a banana sized battery will give you 8 hours of transcription.

user

a year ago

[deleted]

echoangle

a year ago

Yes, I thought it was button-triggered.

xnx

a year ago

If you have a separate -dedicated- Android smartphone for this task, why wouldn't the app run in the foreground?

suriya-ganesh

a year ago

In some sense this is what, Rabbit R1 tried to do. they just shipped a low end custom android skin in a new formfactor so that they own the platform. Didn't go well for them

fsflover

a year ago

> The platforms (ios, Android, etc.) are very limiting.

I'm not sure what that "etc" is supposed to mean, but GNU/Linux phones run desktop OS and impose no artificial limits on you.

explorigin

a year ago

I don't think you're their target audience. I'd love something like this for my kid (who isn't ready for a smartphone).

Other problems are persistence. Have you looked at how hard it is to keep an app running in the background on an iPhone? on a Samsung phone? For an app that needs to be always-on, it's a non-starter unless you're Apple or Google respectively.

meiraleal

a year ago

We definitely need an alternative to the duopoly iOS/Android

fsflover

a year ago

Librem 5 and Pinephone already exist.

moralestapia

a year ago

>why do people feel the need to create new hardware for AI things?

Because people have agency and hobbies, and they're free to decide what to spend their money and time on.

bn-l

a year ago

He means why do they think it’s a good idea not why do they think they have the right to.

moralestapia

a year ago

You missed this part:

"Because people have agency and hobbies"

More attention next time.

AlexeyBelov

a year ago

[flagged]

dmitrygr

a year ago

Apple would stop you from scooping up all that delicious delicious data. Google probably would too. Always-on listening requires building e-waste.

jejeyyy77

a year ago

because these devices are intended to be always on, which is not possible with iphone?

allears

a year ago

This tool requires a paid subscription, but it doesn't say how much. The hardware is affordable, but the monthly fees may not be. Also, the hardware is only useful as long as the company's servers are up and running -- better hope they don't go out of business, get sold, etc.

joeyxiong

a year ago

Sorry for the confusion, we are still discussing the paid subscription pricing, but I can be sure that the price of premium subscription will not be higher than $9 per month.

akadeb

a year ago

Thanks for the useful feedback. I just updated our product page with the subscription price.

Since you can use docker for our backend, you can self-host your own service with our hardware. You can use our subscription only if you want us to handle the STT/TTS/LLM costs

xtagon

a year ago

I highly, highly doubt we've reached the level of AI safety required to make it a good idea to replace (or even just supplement) caregivers for children. Nobody has truly solved the safety problems with AI yet, just doing the best they can--seems like a terrible idea to put that in direct intimate access of emotionally vulnerable children. We've already passed the threshold of AI suggesting to testers to commit suicide[0], and the bar has been raised to actual users being told that[1] and someone reportedly following through.[2]

[0]: https://www.artificialintelligence-news.com/news/medical-cha...

[1]: https://ainiro.io/blog/googles-ai-encouraging-people-to-comm...

[2]: https://www.euronews.com/next/2023/03/31/man-ends-his-life-a...

jstanley

a year ago

Personally I have found talking to AI to be much more draining than typing. It's a bit like having a phone call vs IM. I'd basically always prefer IM as long as I'm getting quick responses.

josephg

a year ago

Since the new OpenAI voice model launched, I feel the opposite. Some of the responses me and my gf have gotten from it were fantastic. It’s really good at role play and using intonation now. And you can interrupt it halfway through a response if it’s going off track.

For example, I spent 20 minutes the other day talking through some software architecture decisions for a solo project. That was incredible. No way I would have typed out my thoughts as smoothly.

willsmith72

a year ago

I still use text most of the time (technical or complex problems, copy pasting materials...), but for things like language learning or getting answers while commuting/walking, voice is a no-brainer.

ProjectArcturis

a year ago

I want to talk my input and read its output. Both are faster.

afro88

a year ago

For the use case that this project is for?

user

a year ago

[deleted]

zq2240

a year ago

Yeah, I know your point. Compared with human communication, I think talk with AI can be self-paced.

aithrowawaycomm

a year ago

This seems to be yet another reckless and dishonest scam from yet another cohort of AI con artists. From starmoon.app:

> With a platform that supports real-time conversations safe for all ages...Our AI platform can analyse human-speech and emotion, and respond with empathy, offering supportive conversations and personalized learning assistance.

These claims are certainly false. It is not acceptable for AI hucksters to lie about their product in order to make a quick buck, regardless of how many nice words they say about emotional growth.

Do you have a single psychologist on your staff that signed off on any of this? Telling lies about commercial products will get you in trouble with regulators, and it truly seems like you deserve to get in trouble.

arendtio

a year ago

Can you please elaborate on why this is 'certainly false'? What is missing?

To me, it looks like you have some experience with the topic and believe that it is very hard to build something like the device in question, but which properties of the solution make you so certain?

aithrowawaycomm

a year ago

The primary thing that's missing is any evidence that the claim is true, or even plausible. There's no indication that they even tested this with kids.

I don't take advertising at face value, even if that advertising might appeal to sci-fi sensibilities. Your question has an air of "well you can't PROVE the flying spaghetti monster is false."

arendtio

a year ago

I think the plausibility is granted by the usage of the emotion intelligence model[1].

However, I agree with you that this is very thin ice. Given the selection of books used as decoration in the video, the authors seem to have more of a business background [2] than one of psychology.

I don't like calling someone a liar when no evidence is present (either way). I would rather say: 'Bold claims, can you prove it?'

[1]: https://github.com/StarmoonAI/Starmoon/blob/main/.env.exampl...

[2]: https://youtu.be/59rwFuCMviE?t=69

akadeb

a year ago

Those books made good props for the demo but I should add that we aren't exactly experts in it either. I have a background in computer engineering and math at UIUC and my cofounder has a background in data science and machine learning from city university in london and architectural computation at UCL.

Ours are indeed bold claims and we have updated some copies on the website to update our core value prop

akadeb

a year ago

hey there I am one of the founders. this is our project which we are trying to grow through open-source. I agree our wording can be better so its backed by data and not just a marketing stint.

> Do you have a single psychologist on your staff that signed off on any of this?

We've been talking to pediatricians at portland hospital and cromwell hospital in london to support the "safe for all ages" claim but I agree that we want to back all our claims with data

edent

a year ago

Note to readers. The Cromwell Hospital is a private company. It is not part of the NHS.

zq2240

a year ago

I would argue that most hospitals in the US are private.

meiraleal

a year ago

These claims are certainly not false. They described ChatGPT.

> It is not acceptable for AI hucksters to lie about their product in order to make a quick buck

You created a fake/throwaway just to make posts with this kind of cheap insults?

stavros

a year ago

I'd love a hardware device that streamed the audio to an HTTP endpoint of my choosing, and played back whatever audio I sent. I can handle the rest myself, but the hardware side is tricky.

akadeb

a year ago

Neat! How can I help you? I think our devkit's components will help you get started. Here is the main.cpp code for this

``` /* * @file streams-i2s-webserver_wav.ino * * This sketch reads sound data from I2S. The result is provided as WAV stream which can be listened to in a Web Browser * * @author Phil Schatzmann * @copyright GPLv3 /

#include <WiFi.h> #include "AudioTools.h"

const char ssid = "<stavros_ssid>"; const char *password = "stavros_pw"; AudioWAVServer server(ssid, password);

I2SStream i2sStream; ConverterFillLeftAndRight<int16_t> filler(LeftIsEmpty); // fill both channels - or change to RightIsEmpty

void setup() { Serial.begin(115200); AudioLogger::instance().begin(Serial, AudioLogger::Info);

    //   // Connect to Wi-Fi
    Serial.println("Connecting to WiFi...");
    WiFi.begin(ssid, password);
    while (WiFi.status() != WL_CONNECTED)
    {
        delay(1000);
        Serial.println("Connecting...");
    }
    Serial.println("Connected to WiFi");

    // Print the IP address
    Serial.print("IP address: ");
    Serial.println(WiFi.localIP());

    // start i2s input with default configuration
    Serial.println("starting I2S...");
    auto config = i2sStream.defaultConfig(RX_MODE);

    // working well
    config.i2s_format = I2S_STD_FORMAT;
    config.sample_rate = 44100;  // INMP441 supports up to 44.1kHz
    config.channels = 1;         // INMP441 is mono
    config.bits_per_sample = 16; // INMP441 is a 24-bit ADC

    config.pin_ws = 19; // Adjust these pins according to your wiring
    config.pin_bck = 18;
    config.pin_data = 21;
    config.use_apll = true; // Try with APLL for better clock stability
    i2sStream.begin(config);
    Serial.println("I2S started");

    // start data sink
    server.begin(i2sStream, config, &filler);

}

// Arduino loop void loop() { // Handle new connections server.copy(); }

```

This code just listens to audio you record on a microphone (INMP441 MEMS microphone used here) and streams it to an endpoint of the microcontroller's IP address. If you would like more help on this give me a shout anytime. my email: akash at starmoon dot app

stavros

a year ago

This is good, thanks! Now that I see this, I realize that I'd want wakeword detection as well, which is getting a bit far. I like what the Willow team did with their stack, unfortunately that project seems to have died. It was pretty ideal, otherwise...

zq2240

a year ago

Hi, you can check out this: https://www.youtube.com/watch?v=qq2FRv0lCPw. I think it might be useful to you

stavros

a year ago

This is great, thank you!

zombot

a year ago

I swear, every last bit of exchange from the "project developers" in this comment section, and all of the project repo, including the readme, is LLM-generated.

akadeb

a year ago

We are all LLM-generated

butterfly42069

a year ago

I think this is great, ignore the people comparing your project to the commercial Rabbit R1 project, those people are comparing apples and oranges.

A lot of the subscription based pull ins could be replaced by networking into a machine running whisper/ollama etc anyway.

Keep up the great work I say :)

akadeb

a year ago

thanks for your words of encouragement!

> A lot of the subscription based pull ins could be replaced by networking into a machine running whisper/ollama etc anyway.

could you clarify this point? I think local LLMs are great for us to reduce cost and improve privacy concerns. For conversational AI however, is there a better way to run STT and TTS models?

butterfly42069

a year ago

Can I message you on github, I'll happily go in depth with you and you can pick my brain :)

crooked-v

a year ago

So one big question is, will the service refuse to answer when topics like sex, self harm, physical violence, drug use, or the like come up? Every bigcorp LLM tends towards the social propriety of a Victorian governess, and for plenty of people being able to talk about those things is a baseline requirement for even the blandest 'friend'.

wokwokwok

a year ago

Yes, it will refuse, because it uses openAI for the model.

The interesting thing to do with this project would be to fork it and run it with open inference models.

…buuuuuut, this is one of those “modern” web apps that has a dozen third party api dependencies to worry about, built on non-self-hostable platform (superbase) so even if you wanted to, it’s probably actually impossible to run in an isolated sandbox you completely control.

/shrug

pocketarc

a year ago

In the 2001 movie AI, the protagonist children play with an "old" robotic teddy bear named "Teddy".

The bear's movement isn't great, and its voice sounds robotic. Projects like this make me think that Teddy either could be built with today's tech, or is very close to being buildable.

w-ll

a year ago

For sure we are getting toys like that by next xmass. I legit going into my parts bin to see if I can wipe something up to stick in a teddy bear right now... Might not have movement, but a talking teddy bear is fun little project.

akadeb

a year ago

will I think the devkit on our website will speed up your project. And if you would like extra parts (like a servo) for a moving teddy head, I am happy to send it to you free of charge. email me at akash at starmoon dot app

akadeb

a year ago

I have yet to watch AI (2001). I was very inspired by asimov's book the positronic man where the main family robot Andrew NDR has a positronic brain and finds himself "feeling" human emotions.

gcanyon

a year ago

I was a solo latchkey kid from age... 5 or 6 maybe? I developed a love of reading and spent basically all my waking hours that weren't forcibly in the company of others doing that, by myself: summertime in San Diego, teenage me read 2-4 books a day. I grew up to be incredibly introverted (ironic that I work as a product manager, which strongly favors extroverts) and I wonder how differently I might have turned out if a digital companion had urged me to be more social (something my parents never did), or just interacted with me on a regular basis.

vunderba

a year ago

I predicted a Teddy Ruxpin / AG Talking Bear driven by LLMs a while ago. My biggest fear is that the Christmas toy of the year would be a mass produced always listening device that's effectively constantly surveilling and learning about your child, courtesy of Hasbro.

akadeb

a year ago

For our device currently, conversational audio is only used by the LLM when a button is pressed like a "push-to-talk" feature. In the future we want to either make it a capacitive touch sensor or listen for a wake word, like "Hi starmoon". I think it would be awesome to actually have models be running locally. And we don't have to have any conversation data stored in our db.

But I agree with your thought, I think this is a big fear with google home and alexa as well. ie. Are home automation tools always listening to our conversations. IIRC pre-wake word detection, none of the audio recorded is used to market products to you.

deanputney

a year ago

Is this specific hardware necessary? If I wanted to run this on a Raspberry Pi zero, for example, is that possible?

akadeb

a year ago

Dean I think that's important feedback. It's important for us to be platform agnostic. Like Junru stated, we are supporting ESP32 models: WROOM/WROVER and S3. But down the line would like to support rasp pi 4B, 5, zero etc.

Would you like to try building it with our devkit? We will prioritize raspberry pi firmware support if there is enough demand around this.

zq2240

a year ago

Sorry, it currently support esp32-devkit and Seeed Studio Xiao ESP32S3. For the Raspberry Pi Zero, you may need to switch to a different PlatformIO environment and replace the corresponding GPIO pins.

napoleongl

a year ago

I can see something like this being used in various inspection scenarios. Instead of an inspectior having to fill out a template or fiddle with an ipad-thingie in tight situations they can talk to this, and a LLM converts it to structured data according to a template.

akadeb

a year ago

Oh I see that. what inspection scenario you are thinking of? Have you faced something similar as the inspector and needed structured data immediately?

danielbln

a year ago

Any plans for being able to run the entire thing locally with local models?

zq2240

a year ago

Thanks for asking. We will launch local LLM, STT, and TTS models in the near future versions.

_joel

a year ago

You can make an OpenAI compatible local server using LMStudio[1] and load any model you want. It'd have to be on another host though, the s3 has some inference capabilities with addons afair, but nowhere enough grunt to run locally at any usable token/s

[1] https://lmstudio.ai/

homarp

a year ago

or with open source tools like

llama.cpp - https://github.com/ggerganov/llama.cpp/blob/master/examples/...

mistral.rs - https://github.com/EricLBuehler/mistral.rs/blob/master/docs/...

lmstudio and ollama use llama.cpp underneath. cut the middle man

endofreach

a year ago

Why not just give kids MDMA if they feel lonely?

akadeb

a year ago

I think I heard this on a joe rogan podcast. Will add this to our roadmap along with DMT as an alternative.