NotebookLM's automatically generated podcasts are surprisingly effective

907 points, posted 10 months ago

510 Comments

whyenot

10 months ago

This is amazing. I uploaded the instruction manual for a Scholander pressure chamber (a piece of equipment for measuring plant moisture stress) and made a podcast from it. The information in the podcast was accurate, it included some light banter and jokes, while still getting across the important topics in the instructions. I don't know what I would use a podcast like this for, but the fact that something like this can be created without human intervention in just a few minutes is jaw dropping, and maybe also just a teeny bit scary.

cainxinth

10 months ago

> I don't know what I would use a podcast like this for…

Say you need to read those instructions, but it’s also really nice out and you want to go for a jog: two birds, one stone.

bbor

10 months ago

Yeah I totally get people’s criticisms that the podcasts aren’t quite human-expert-level in terms of symbolic reasoning, but this still blows my mind. The intuitive skill these show, not to mention the ability to accurately (again, if shallowly) parse and transform huge bodies of content in seconds is absolutely scary, IMO.

I’d feed it the Singularity paper, but I’m not sure I need that extra boost of anxiety these days…

https://edoras.sdsu.edu/~vinge/misc/singularity.html

andrepd

10 months ago

This isn't "quite expert-level in terms of symbolic reasoning" in the same way as a soapbox isn't "quite a formula 1"

bbor

10 months ago

We accidentally invented general models that can coherently muse about the philosophical beliefs of Gilles Deleuze at length, and accurately, based on two full books that they summarized. You can be cynical until your dying day, that’s your right — but I highly recommend letting that fact be a little bit impressive, someday. There’s no way you live through any event that’s more historically significant, other than perhaps an apocalypse or two.

In other words: soapbox is presumably some sort of toy car that goes 15mph, and formula 1 goes up above 150mph at least (as you can tell, I’m not a car guy). If you have any actual scientific argument as to why a model that can score 90-100 on a typical IQ test has only 1/10th the symbolic reasoning skills of a human, I’d love to eat my words! Maybe on some special highly iterative, deliberation-based task?

globnomulous

9 months ago

These aren't "general" models. They're statistical models. They're autocorrect or autocomplete on steroids -- and autocorrect/autocomplete don't require symbolic reasoning.

It's also not at all clear to me what "symbolic" could mean in this context. If it means the software has concepts, my response would be that they aren't concepts of a kind that we can clearly recognize or label as such (edit: and that's to say nothing of the fact that the ability to hold concepts/symbols and understand them as concepts/symbols presupposes internal life and awareness).

The best analogy I've heard for these models is this: you take a completely, perfectly naive, ignorant man, who knows nothing, and you place him in a room, sealed off from everything. He has no knowledge of the outside world, of you, or of what you might want from him. But you slip under the door of his cell pieces of paper containing mathematical or linguistic expressions, and he learns or is somehow induced to do something with them and pass them back. When what he does with them pleases you, you reward him. Each time you do this, you reinforce a behavior.

You repeat this process, over and over. As a result, he develops habits. Training continues, and those habits become more and more precisely fitted to your expectations and intentions.

After enough time and enough training, his habits are so well formed that he seems to know what a sonnet is, how to perform derivatives and integrals, and seems to understand (and be able to explain!) concepts like positive and negative, and friend and foe. He can even write you a rap-battle libretto about nineteenth-century English historiography in the style of Thomas Paine imitating Ikkyu.

Fundamentally, though, he doesn't know what any of these tokens mean. He still doesn't know that there's an outside world. He may have ideas that are guiding his behavior, but you have no way of knowing that -- or of knowing whether they bear any resemblance to concepts or ideas you would recognize.

These models deal with tokens similarly. They don't know what a token is or represents -- or we have no reason to think they do. They're just networks of weights, relationships, and tendencies that, from a given seed and given input, generate an output, just like any program, just like your phone keyboard generates predictions about the next word you'll want to type.

Given billions and billions and billions and billions of parameters, why shouldn't such a program score highly on an IQ test or on the LSAT? Once the number of parameters available to the program reaches a certain threshold (edit: and we've programmed a way for it to connect the dots), shouldn't we be able to design it in such a way that it can compute correct answers to questions that seem to require complex, abstract reasoning, regardless of whether it has the capacity to reason? Or shouldn't we be able to give it enough data that it's able to find the relationships that enable it to simulate/generate patterns indistinguishable from real, actual reasoning?

I don't think one needs to be cynical to be unimpressed. I'm unimpressed simply because these models aren't clearly doing anything new in kind. What they're doing seems to be new, and novel, only because of the scale at which they do what they do.

Edit: Moreover, I'm hostile to the economic forces that produced these models, as everybody should be. They're the purest example of what Jaron Lanier has been warning us about -- namely that, when information is free, the wealthiest are going to be the ones who profit from it and dominate, because they'll be the ones able to pay for the technology that can exploit it.

I have no doubt Altman is aware of this. And I have no doubt that he's little better than Elizabeth Holmes, making ethical compromises and cutting legal corners, secure in the smug knowledge that he'll surely pay his moral debts (and avoid looking at the painting in the attic) and obviously make the world a better place once he has total market dominance.

And none of the other major players are any better.

stavros

9 months ago

> These aren't "general" models. They're statistical models. They're autocorrect or autocomplete on steroids -- and autocorrect/autocomplete don't require symbolic reasoning.

This is very "humans are just hunks of matter! They can't think!".

fennecfoxy

9 months ago

To be fair parent pointing out they're purely statistical machines predicting next token is incorrect anyways.

They are essentially next-token predictors after initial training, but instruct models are then fine-tuned on reasoning and Q/A scenarios. Afaik, early research has determined that this isn't just pure parroting; it does actually result in some logic in there as well.

People also have to remember the training for these is super shallow at the moment, when compared with what humans go through in our lifespans as well as our millions of years of evolution (as humans).

globnomulous

9 months ago

What you're saying doesn't contradict what I wrote. I said models are trained. You're saying they're trained and fine tuned -- i.e. continue to be trained. I also didn't say they do any kind of parroting or that logic doesn't take place.

I'm saying, rather, that the models do what they're taught to do, and what they're taught to do are computations that give us a result that looks like reasoning, just the way I could use 3ds max as a teenager to generate on my computer screen an output that looked like a cube. There was never an actual cube in my computer when I did that. To say that the model is reasoning because what it does resembles reasoning is no different from saying there was an actual cube somewhere in my computer every time I rendered one.

globnomulous

9 months ago

I'm not sure what you're getting at. Could you explain?

stavros

9 months ago

That "they predict the next token" doesn't necessarily imply "they can't reason".

globnomulous

9 months ago

I'm not saying "they predict tokens; therefore, they can't reason." I'm saying "something that can't reason can predict tokens, so prediction isn't evidence of reasoning."

More specifically, my comment aims to meet the challenge posed by the person I answered:

> I highly recommend letting that fact be a little bit impressive, someday. There’s no way you live through any event that’s more historically significant, other than perhaps an apocalypse or two. [...] If you have any actual scientific argument as to why a model that can score 90-100 on a typical IQ test has only 1/10th the symbolic reasoning skills of a human, I’d love to eat my words.

I have no idea what would constitute a "scientific argument" in this instance, given that the challenge itself is unscientific, but, regardless, the results that so impress this person are, without question, achievable without reasoning, symbolic or otherwise. To say that the model "muses" or "has [...] symbolic reasoning" is to make a wild, arbitrary leap of faith that the data, and workings of these models, do not support.

The models are token-prediction machines. That's it. They differ not in kind but in scale from the software that generates predictions in our cell-phone keyboards. The person I answered can be as impressed as he wants to be by the high quality he thinks he sees in the predictions. That's fine. I'm not. In that respect, we just disagree. But if he's impressed because he thinks the model's predictions must or do betoken reasoning, he's off in la la land -- and so his wide-eyed, bushy-tailed enthusiasm is based on nonsense.

It's no different from believing that your phone keyboard is capable of reasoning, simply because you are delighted that it guesses the 'right' word often enough to please you.

stavros

In the case of these models, we have no such reason/evidence. In fact, we have good reason for thinking that something other than reasoning as we think of it takes place. We have good reason, that is, to think they work just like any other program. We don't think winzip, Windows calculator, a Quake bot, or a piece of malware performs acts of reasoning. And the fact that these models appear to be reasoning tells us something about the people observing them, not about the programs themselves. These models appear to be reasoning only because the output of the model is similar enough to 'the real thing' for us to have trouble saying with certainty that they aren't the real thing. They're simulations whose fidelity is high enough to create a feeling in us -- and to pass some tests. (In that sense, they're most similar to special effects.) (Edit: and that's not to say feelings are wrong, invalid, or incorrect. They're one of the key ways we experience the things we understand.)

Is reasoning taking place in these models? Sure, it's possible. Is there an awareness or being of some kind that does the reasoning? Sure, that's possible, too. We're matter that thinks. Why couldn't a program in a computer be matter that thinks? There's a great novel by Greg Egan, Permutation City, that deals partly with this: in one section, our distant descendants pass to another universe, where matter superficially appears to be random, disorganized, and low in enthalpy. When that random activity and apparent lack of life and complexity are analyzed in the right way, though, interference patterns are revealed, and these contain something that looks like a rich vista bursting with directed, deliberate activity and life. It contains patterns that, for all the world, look and act like the universe we know -- with things that are living and things that are not, with ecosystems, predators, prey, communities, reproduction, etc. These patterns aren't in, and aren't expressed in, the matter itself. They 'exist' only in the interference patterns that ripple through it.

That's 100% plausible, too. Why couldn't an interference pattern amount to a living thing, an organism, or an ecosystem? The boundary we draw between hard, physical stuff and those patterns is arbitrary. Material stuff is just another pattern.

My point isn't that reasoning doesn't take place in these models or can't. It's, first, that you and I do something we call reasoning, and the best available information tells us these models aren't doing that. Second, if they are doing something we can call reasoning, we have no idea whether our understanding of the model's output tells us what its reasoning actually is or is actually doing. Third, if we want to attribute reasoning to these models, we also have to attribute a reasoner or an interiority where reasoning can take place -- meaning we'd need to attribute something similar to consciousness or beinghood to these models. And that's fine, too. I have no problem with that. But if we make that attribution, then we, again, have no reason to attribute to it a beinghood that resembles ours. We don't know its internal life; we know ours.

Finally -- if we make any of these claims about the capabilities or nature of these models, we are necessarily making the exact same claims about all other programs, because those work the same way and do the same things as these models. Again, that's fine and reasonable (though, I'd argue, wrong), because you and I are evidence that stuff and electricity can have beinghood, consciousness, awareness, and intentions -- and that's exactly what programs are.

The point that I don't think is disputable is the following: these models aren't a special case. They aren't 'programs that reason, in contrast to programs that don't.' They aren't 'doing something we can do, in contrast to other programs, which don't.' And even if they're doing something we can (or should) call reasoning, reasoning requires interiority -- and we have no idea what that interiority looks or feels like. Indeed, we have no good reason to think there's any at all -- unless, again, we think other programs do as well.

globnomulous

9 months ago

The poor quality "content" that's been proliferating recently has been created, largely, using the very tools that AI has built, or their immediate precursors. AI, for all its benefits, has only made that worse.

If you're saying, in good faith, that most of the infomercials, televangelist programs, talk radio, celebrity autobiographies, self-help books, scandalous expose books, and health/exercise fad books etc etc etc that came out 50 years ago were made for no reason beyond advancing human knowledge, you're either too young to remember any media from before our current era or haven't looked beyond survivorship bias.

Tech folks love sentiments like this because it entirely emotionally places the onus on the people getting ripped off by big tech companies for being ripped off. If their work was that awful, companies wouldn't be clamoring to vacuum it up into their models to make more of it. Nearly all of the salable output from these models exists solely because it took a creative product someone made with the intention of selling it and it's using it to sell a simulacrum.

It's using nostalgia to deflect guilt for harpooning the livelihood of many people because it's just more convenient and profitable to empower mediocre "content creators" they use to justify doing it.

tivert

9 months ago

> Tech folks love sentiments like this because it entirely emotionally places the onus on the people getting ripped off by big tech companies for being ripped off.

This, times a million. Add to that the ancient quote from Plato(?) criticizing writing or the other ancient quote complaining about the irresponsibility of the youth, unthinkingly deployed to attempt to delegitimize any kind of critique of nearly anything.

The technology industry seems to be overflowing with so-called "rational" people who mainly seem to use whatever intelligence they have to rationalize away responsibility for whatever problems their beloved technology has caused. It's a really stupid and obnoxious pattern; and once you see it, it's hard to not see it everywhere and be annoyed.

I think one element of it is naked greed (especially from the entrepreneurs) but I think another big part is a kind of stuntedness and parochialism that's often fueled by overconfidence (because of success in software engineering, forming an identity around "being smart" etc).

chefandy

9 months ago

It's one of the reasons I left tech altogether after decades. It's like most people in the tech business right now think their totally unique supreme intellectual might gives them enough pan-subject-matter expertise. The further I moved away from development within the business, the more it repelled me.

caeril

9 months ago

> > the people getting ripped off

Nobody is getting "ripped off" by ML models any more than by other humans. When a human wants to launch a high-quality podcast, they survey the market, listen to a lot of other high quality podcasts, and then set to creating their own derivative work.

What ML models are doing is really no different. It's just much, much faster.

Everything humans create is derivative of other works. Speed is the only difference.

9 months ago

> The people that pirate were never going to pay in the first place.

I think I agree with your larger point, but is this part true? When Spotify provided a much simpler UX to get the goods, people were happy to pay $10/month and Napster et al basically died.

9 months ago

>That said, I'd still make the same point that people who value art and the artist will buy from and support the artist.

The chances anyone will come across the artist when their marketplace is flooded with increasingly plausible simulacra become more and more slim as time goes on.

AI is choking off any hope for artists supported by patronage, simply by virtue of discoverability being lost and trust being eroded.

chefandy

9 months ago

Well gosh, good thing someone in big tech gave me permission to be mad about many in my field being screwed by big tech! Too bad that won't help pay for my cancer treatment because there's no way in hell they'll push out a cure soon enough when they're dumping billions of dollars into figuring out how to sell other people's artwork. At least people won't have to waste an uncomfortable few minutes writing a thoughtful note to my wife in the aftermath when they can just "Ok, google" it.

>> Art is a way of seeing, not a way of creating.

> There's no glib decree

This is a glib decree and it completely ignores most of what art actually is in our world, rather than the quaint little box that most people in the NN business try to stuff it into. Your patronizing tone doesn't lend any authority or add depth to your initial analysis, which you essentially just restated using more words. The "art vs craft" dichotomy doesn't even approach the depth and complexity of the interplay of art and commerce in worlds like video game development, music, cinema and television, and writing... hell even advertising. Like most tech dudes that assume their incredible mental might gives them some kind of pan-topic expertise allowing them to casually dismiss subject matter experts in other fields based on a few a priori thought exercises, you simply don't know how much more you need to learn to make informed decisions about this topic.

doctorpangloss

9 months ago

Have you never in your life enjoyed a pirated movie, game, book or music track?

10 months ago

I think and hope that you're wrong. There's always been cheese, and there's a lot of it now. But there is still a market for top-notch insight.

For example, Perun. This guy delivers an hourlong presentation on (mostly) the Ukraine-Russia war and it's pure quality. Insights, humour, excellent delivery, from what seems to be a military-focused economist/analyst/consultant. We're a while away from some bot taking this kind of thing over.

https://www.youtube.com/@PerunAU

Or hardcore history. The robots will get there, but it's going to take a while.

https://www.dancarlin.com/hardcore-history-series/

DanHulton

10 months ago

I keep seeing this assertion: "the robots will get there" (or its ilk), and it's starting to feel really weird to me.

It's an article of faith -- we don't KNOW that they're going to get there. They're going to get better, almost certainly, but how much? How much gas is left in the tank for this technique?

Honestly, I think the fact that every new "groundbreaking" news release about LLMs has come alongside a swath of discussion about how it doesn't actually live up to the hype, that it achieves a solid "mid" and stops there, I think this means it's more likely that the robots AREN'T going to get there some day. (Well, not unless there's another breakthrough AI technique.)

Either way, I still think it's interesting that there's this article of faith a lot of us have "we're not there now, but we'll get there soon" that we don't really address, and it really colors the discussion a certain way.

llamaLord

10 months ago

IMO it seems almost epistemologically impossible that LLMs following anything even resembling the current techniques will ever be able to comfortably out-perform humans at genuinely creative endeavours because they, almost by definition, cannot be "exceptional".

If you think about how an LLM works, it's effectively going "given a certain input, what is the statistically average output that I should provide, given my training corpus".

The thing is, humans are remarkably shit at understanding just how exceptional someone needs to be to be genuinely creative in a way that most humans would consider "artistic"... You're talking 1/1000 people AT best.

This creates a kind of devil's bargain for LLMs where you have to start trading training set size for training set quality, because there's a remarkably small amount of genuinely GREAT quality content to feed these things.

I DO believe that the current field of LLM/LXM's will get much better at a lot of stuff, and my god anyone below the top 10-15% of their particular field is going to be in a LOT of trouble, but unless you can train models SOLELY on the input of exceptionally high performing people (which I fundamentally believe there is simply not enough content in existence to do), the models almost by definition will not be able to outperform those high performing people.

Will they be able to do the intellectual work of the average person? Yeah absolutely. Will they be able to do it probably 100/1000x faster than any human (no matter how exceptional)?... Yeah probably... But I don't believe they'll be able to do it better than the truly exceptional people.

d1sxeyes

9 months ago

I’m not sure. The bestsellers lists are full of average-or-slightly-above-average wordsmiths with a good idea, the time and stamina to write a novel and risk it failing, someone who was willing to take a chance on them, and a bit of luck. The majority of human creative output is not exceptional.

A decent LLM can just keep going. Time and stamina are effectively unlimited, and an LLM can just keep rolling its 100 dice until they all come up sixes.

Or an author can just input their ideas and have an LLM do the boring bit of actually putting the words on the paper.

llamaLord

9 months ago

I get your point, but using the best-sellers list as a proving point isn't exactly a slam-dunk.

What's that saying? "Nobody ever went broke overestimating the poor taste of the average person"

9 months ago

Yes, LLMs are probably inherently limited, but the AI field in general is not necessarily limited, and possibly has the potential to be more genuinely creative than even most exceptional creative humans.

beefnugs

9 months ago

I loosely suspect too many people are jumping into LLMs and I assume real research is being strangled. But to be honest, all of the practical things I have seen, such as those by Mr Goertzel, are painfully complex, and very few people can really get into them.

MatrixMan

9 months ago

Agreed. I think people are extrapolating with a linearity bias. I find it far more plausible that the rate of improvement is not constant, but instead a function of the remaining gap between humans and AI, which means that diminishing returns are right around the corner.

There's still much to be done re: reorganizing how we behave such that we can reap the benefits of such a competent helper, but I don't think we'll be handing the reins over any time soon.

jcgrillo

9 months ago

In addition to "will the robots get there?" there's also the question "at what cost?". The faith-basedness of it is almost fractal:

- "Given this thing I saw a computer program do, clearly we'll have intelligent AI real soon now."

- "If we generate sufficiently smart AI then clearly all the jobs will go away because the AI will just do them all for us"

- "We'll clearly be able to do the AI thing using a reasonable amount of electricity"

None of these ideas are "clear", and they're all based on some "futurist faith" crap. Let's say Microsoft does succeed (likely at colossal cost in compute) in creating some humanlike AI. How will they put it to work? What incentives could you offer such a creature? What will it want in exchange for labor? What will it enjoy? What will it dislike? But we're not there yet, first show me the intelligent AI then we can discuss the rest.

What's really disturbing about this hype is precisely that this technology is so computationally intensive. So of course the computer people are going to hype it--they're pick and shovel salespeople supplying (yet another) gold rush.

atrus

I think the main problem with Perun's videos is that they are videos. I run a little program on my home-lab that turns them into podcasts and I find that I enjoy them far more because I need to be less engaged with a podcast to still find them enjoyable. (Also, I gave up on being up to date with the Ukraine situation, since up to date information is almost always wrong. I am happy to be a week or 14 days behind if the information I am getting is less wrong).

I like Hardcore history very much, but I think it would be far worse in a video form.

Torkel

10 months ago

Just turn off the screen with youtube video playing and there's no difference from a podcast?

I listen to Perun at the gym every week, audio only.

bogtog

10 months ago

Perun is peak podcast-like YouTube. In the gym, I just keep my screen on to share my YouTube tastes with the world and sometimes peek at some visuals

lasc4r

10 months ago

That's a paid service that some people balk at.

maest

10 months ago

PipePipe on Android does it for free. (Or NewPipe or some other *Pipe players)

nordsieck

10 months ago

> That's a paid service that some people balk at.

AFAIK, it's only a paid feature to play video in the background.

satvikpendem

10 months ago

It doesn't have to be paid, YouTube on the mobile browser can do it for free.

posterboy

10 months ago

on firefox

richardw

10 months ago

I’d also like a podcast. I usually walk around with the video in my pocket to be honest. Audio is 80% of the value in his case.

graemep

10 months ago

> He reads bullet points he has prepared, and makes predictable dad jokes in monotone, re-uses and reruns the same points, icons, slides, etc.

The presentation is a matter of taste (I like it better than you do), but the content is very informative and insightful.

It's not really about what is happening at the frontline right now. That's not its aim. It's for people who want dense information and analysis. The state of the Ukrainian and Russian economies (subjects of recent Perun videos) does not change daily or weekly.

caulk

10 months ago

Drifting off topic, but do you have any examples of those analysis/sitrep content creators you prefer?

richardw

10 months ago

Not your parent commenter but I like:

https://m.youtube.com/@militaryandhistory

https://m.youtube.com/@suchomimus9921

https://m.youtube.com/@WardCarroll

Fav is probably Suchomimus right now. Updated faster, shorter reports. I feel like I get the info sooner after it happens.

authorfly

10 months ago

All of the other commentators have replied with a good diverse set of YouTubers and included ones with biases from both sides; I'd recommend the ones they have linked. Some (take note of the ones that release information quicker) might be more biased or more prone to reporting murky information than others.

formerly_proven

10 months ago

https://www.youtube.com/@anderspuck

10 months ago

This is true but the quality frontier is not a single bar. For mainstream content the bar is high. For super-niche content, I wouldn’t be surprised if NotebookLM already competes with the existing pods.

This will be the dynamic of generated art as it improves; the ease of use will benefit creators at the fringe.

I bet we see a successful Harry Potter fanfic fully generated before we see a AAA Avengers movie or similar. (Also, extrapolating, RIP copyright.)

llm_trw

10 months ago

On the contrary, the mainstream eats any slop you put in front of it as long as it follows the correct form - one needs only look at cable news - the super niche content is that which requires deep thinking and novel insights.

Or to put another way, I've heard much better ideas on a podcast made by undergrad CS students than on Lex Fridman.

10 months ago

I would say the opposite is true - mainstream cares much less about the quality content but more about catchy headline.

solumunus

10 months ago

It's the complete opposite. Unless your definition of mainstream includes stuff like this deep dive into Russia/Ukraine, in which case I think you're misunderstanding "mainstream".

sqeaky

10 months ago

I know I'm not the first to say this, but I think what's going on is that these AI things can produce results that are very mid. A sort of extra medium. Experts beat modern LLMs but modern llms are better than a gap.

If you just need a voice discussing some topic because that has utility and you can't afford a pair of podcasters (damn, check your couch cushions) then having a mid podcast is better than having no podcast. But if you need expert insight because expert insight is your product and you happen to deliver it through a podcast then you need an expert.

If I were a small software shop and I wanted something like a weekly update describing this week's updates for my customers, and I have a dozen developers and none of us are particularly vocally charismatic, putting out a weekly update generated from commits, completed tickets, and developer notes might be useful. The audience would be very targeted and the podcast wouldn't be my main product, but there's no way I'd be able to afford expert level podcasters for such a position.

I would argue Perun is a world-class defense logistics expert, or at least expert enough, passionate enough, and charismatic enough to present as such. Just like the guys who do Knowledge Fight are world-class experts on debunking Alex Jones, and Jack Rhysider is an expert and fanboy of computer security, so Darknet Diaries excels, and so on...

These aren't for making products, they can't compete with the experts in the attention economy. But they can fill gaps and if you need audio delivery of something about your product this might be really good.

Edit - but as you said the robots will catch up, I just don't know if they'll catch up with this batch of algorithms or if it'll be the next round.

FranzFerdiNaN

10 months ago

> I know I'm not the first to say this, but I think what's going on is that these AI things can produce results that are very mid. A sort of extra medium. Experts beat modern LLMs but modern llms are better than a gap.

I've seen people manage to wrangle tools like Midjourney to get results that surpass extra medium. And most human artists barely manage to reach medium quality too.

The real danger of AI is that, as a society, we need a lot of people who will never be anything but mediocre still going for it, so we can end up with a few who do manage to reach excellence. If AI causes people to just give up even trying and just hit generate on a podcast or image generator, then that is going to be a big problem in the long run. Or not, and we just end up being stuck in a world that is even more mediocre than it is now.

roenxi

10 months ago

AI looks like it will commoditise intellectual excellence. It is hard to see how that would end up making the world more mediocre.

It'd be like the ancient Romans speculating that cars will make us less fit and therefore cities will be less impressive because we can't lift as much. That isn't at all how it played out; we just build cities with machines too and need far fewer workers in construction.

Jevon23

10 months ago

There are… many people who think that cities are worse off because of cars. Maybe not for the same reasons, but still.

9 months ago

I will take clean water, safe sewage removal, and other modern amenities over the insubstantial vagaries of "soul" any day.

cglan

10 months ago

cars have made us much less fit though...

sqeaky

9 months ago

OutOfHere

10 months ago

For such historical topics, my LLM-based software podgenai does a pretty good job imho. It is easier for it since it's all internal knowledge that it already knows about.

dr_dshiv

It's always funny when I find out that various people I respect follow Perun uploads closely.

robinsonb5

10 months ago

The thing is, we have been here before.

Think back to the mid-1980s and the first time everyone got their hands on a Casio or Yamaha keyboard with auto-accompaniment.

10 months ago

This is just a "no true Scotsman" take.

Popular music has already been synthetic and soulless for decades now. People will listen to what sounds good to them, and we already know the bar is very low, and that the hard truth is that it is all subjective anyway.

agentultra

9 months ago

More of a behavioural science take. Is music the sound that is played or the people making the sound?

We’ve had software accompaniment for a long time. Elevator music. The same 4 chords arranged in similar ways for decades. Hasn’t destroyed music. Neither will AI.

At some point people are going to want to know who’s on the other side making the music.

Unless your argument is that nobody values artists… which is I guess one of the primary conceits of GenAI enthusiasts today.

jazzypants

10 months ago

Sure, bars and restaurants will have an endless supply of boring music, but no one is ever going to go to an AI music event.

Workaccount2

9 months ago

Read it and weep:

https://www.nydailynews.com/2016/05/29/thousands-of-new-york...

jazzypants

9 months ago

The music is written by human beings and the animation is done by human beings.

You just proved my point for me.

https://legacy.iftf.org/future-now/article-detail/making-mik...

sirsinsalot

10 months ago

And yet Clint Eastwood by the Gorillaz was a Casio demo track.

It isn't so black and white.

https://youtube.com/shorts/Wn0NtSNeQEQ

lugu

10 months ago

That is to the point. Gorillaz has talent, and that is what made Clint Eastwood a hit. Not the Yamaha.

_emacsomancer_

10 months ago

Similarly, Under Mi Sleng Teng[1], but here too it required human musical talent.

[1]: https://en.wikipedia.org/wiki/Sleng_Teng

jazzypants

10 months ago

To be clear, Dan the Automator added an additional drum track, an additional bass track, and a melodica track as well as numerous other sound effects. They didn't just loop the Casio demo track.

oddthink

10 months ago

It goes way too far, IMHO.

It ends up sounding like a smarmy Sunday-morning talk show conversation, with over-exaggerated affect and no content.

So far I've just fed it technical papers, which may be part of the problem, but what I got back was, "Gosh, imagine if a recommender system really understood us? Wow, that would be fantastic, wouldn't it?"

mdp2021

10 months ago

Already in the sample embedded by Simon. "Gosh", "wow", "like", "like", "like", "[wooooaaaawiiiing, woooooooaawiiiiiiing]", "Oh my god", "I was so, like...".

https://www.youtube.com/watch?v=ssDdqq_9TzI&t=34s [April Ludgate meets Tynnifer, Parks and Rec]

skapadia

10 months ago

While it's impressive, I agree that it tends to make over the top comments or reactions about everything. It could probably make a Keurig machine sound like a revolutionary coffee maker.

3abiton

10 months ago

I ran one of my papers through it, mind blown how well they dumbed it down without losing too many details (still quite a lot was omitted). I wonder if it's domain specific, and I wonder what's the variance by topic.

Al-Khwarizmi

10 months ago

Same here. In fact, I typically struggle communicating my scientific research to journalists, and next time I'll use this. It found some good metaphors to make even a quite math-heavy paper's core concepts understandable to the audience without losing correctness, which is something that both I and the journalist typically fail to do (I keep the correctness but don't make it understandable enough, so then journalists start coming up with metaphors and do the opposite).

A lawyer friend of mine also suggested giving it the Spanish civil code, a long, arid legal text. The podcast of course didn't cover the whole text in 10 minutes, which would be impossible, but they selected some interesting tidbits and actually had me hooked until the end and made me learn a few things about it, which is no small merit. And my friend was quite impressed and didn't complain about correctness.

tkgally

10 months ago

I did the same thing, running one book I edited and another book I wrote through it, and it did quite well. I was particularly impressed with how the “hosts” came up with their own succinct examples and metaphors to explain what I had written at much greater length. (I should mention that one of those books was in Japanese, and they captured it clearly in English.)

10 months ago

In the case of those books and podcasts, who cares if you read or listen to them? The point is that the books are sold and make the right lists. The point is that the podcasts are downloaded so ads can be sold or that vanity numbers can be reported.

In terms of such music and films (whether created by human or AI) sometimes it's just because we are social creatures and need shared experiences to talk with others about.

peutetre

10 months ago

But knowing it's synthetic, why would you buy the book or listen to the podcast in the first place? There's nothing social or shared in a synthetic affectation.

corysama

10 months ago

In an ideal world, I would sit down with an espresso or a beer, and review collections of research papers on a regular basis.

In reality, between work, sleep and family, I rarely have anything resembling that kind of time and mental energy reserve available.

But, what I can afford is to listen to podcasts while doing other things. Doing that gives me enough of an overview to keep up with a general topic and find new topics that might be worth investing into deeper.

Wouldn’t it be great if someone made a podcast channel specifically for “Papers corysama wants to hear about at this moment”? I think so. Apparently, so do a lot of other people. But, they don’t want to listen to my specific channel.

Al-Khwarizmi

10 months ago

I wouldn't read an AI-generated book (except maybe once as a curiosity), but I would definitely listen to AI-generated music if it were good enough.

Reading a book is a time investment so I want it to convey the thoughts of another human being, otherwise it would feel like wasting my time. Listening to music, on the other hand, often is something that I do while I exercise, to keep a brisk pace and not get bored. As long as it sounds good, fits the genres and styles I like and is upbeat enough for exercising, I wouldn't have much of a problem with AI music - maybe it would even be a plus, since there are some specific music genres where I have already listened to pretty much everything there is (and no more is being made), and it would be great to have more.

I don't listen to podcasts, but I suppose in that case it depends on how you do so: devoting your full time and attention like a book, or as a background while you do something else like exercise music? As far as I know, many listeners are in the latter case, so I don't see why they wouldn't listen to AI podcasts.

9 months ago

An interactive conversation / tutorial session beats a book pretty much all of the time. Nonfiction books contain a lot of information that's redundant to a reader familiar with the topic, and not enough for someone new. They don't backtrack if you clearly missed an important point. And so on. It's like fractal geometry.

llmfan

9 months ago

Isn't novel in any way -> This is not how it works, there are studies showing that AI can be creative. Or at the very least (since the definition of creativity can be controversial) produce output that is indistinguishable from novel, creative output, which is enough for the purpose discussed here.

devb

10 months ago

> I would definitely listen to AI-generated music if it were good enough

Why not just seek out the original works that the AI stole from?

CaptainFever

9 months ago

Because that's not how it works.

devb

9 months ago

Yes it is. How else are they "trained?"

CaptainFever

9 months ago

Your comment implies that there is an existing piece of music which can substitute for the generated music. While substitutability varies from person to person, your original statement implies to me that each piece of generated music has an accompanying original (from which it was "stolen") that you can listen to instead, since it is similar enough. I think we both know that that is not the case.

I know that you likely intended to imply that you can substitute the aforementioned AI music with an existing piece of music of the same genre, but that is not a view shared by all. Sometimes the generated music scratches such a specific and personal itch that it cannot be replicated by something in the same genre.

A better counterargument to your original comment would be "It is not an exclusive situation. I can listen to and support both generated music and handcrafted music at the same time. They both contain music tracks that I like."

devb

9 months ago

You don't have to be a big fucking nerd about it, you know what I meant. The generated music wouldn't exist without the foundation of stolen music made by people.

CaptainFever

9 months ago

No, I didn't know what you meant. Communication is hard, and there are multiple ways to interpret your statements. It is better to be specific.

To be more specific about the second sentence, if there are any readers in doubt:

> The generated music wouldn't exist without the foundation of stolen music made by people.

The word "stolen" is a value judgement that is not shared by all. It is a word meant to invoke an emotional response in the reader. For example, Stallman has argued that the data could not have been stolen, or else it would not be there anymore. So, removing this word gives you:

> The generated music wouldn't exist without the foundation of existing music made by people.

Which is a true fact that has never been in debate.

However, this is not relevant to the main point that not all generated music has a suitable handcrafted substitute, and that there is no actual need to choose exclusively to listen to generated or human crafted music. Furthermore, the conversation has turned uncivil (the first sentence). Therefore, goodbye.

JonathanFly

10 months ago

>But why would I buy those books or listen to those podcasts that are synthetic affectations of no substance?

A randomly selected NotebookLM podcast is probably not substantial enough on its own. But with human curation, a carefully prompted and cherry-picked NotebookLM podcast could be pretty good.

Or without curation, I would use this on a long drive where audio was the only option to get a quick survey of a bunch of material.

llmthrow102

10 months ago

That's the same question I have. There are already so many great podcasts/music/everything in the niches I like that I don't have the time to listen to them all. I also like to have quiet introspective time.

So where does AI regurgitated slop fit into my life?

dromtrund

10 months ago

9 months ago

Right? The fact that the LLM output is indistinguishable from a podcast says more about podcasts than about LLMs.

If anything, listening to that reminded me of why I stopped listening to podcasts in the first place - every 5 second snippet of something interesting ends up suffocated by 5 minutes of filler and dead air.

jeremyjh

10 months ago

That's actually a really good application of AI, because the quality of the content is meaningless as long as it hits the bullet points. They only do this to check a box that training on <topic> was done.

globnomulous

9 months ago

10 months ago

I think it is right that people don't care and there is some merit to it.

Reading, or listening to a podcast, these days is more akin to meditation - many people do it to reinforce an identity rather than to expand on themselves.

And I do think that is reasonable as, for many people, there are few other structures that can keep them in check with themselves.

lordnacho

10 months ago

> The reason so much writing, podcasting, and music is vulnerable to AI disruption is that quality has already become secondary.

I was thinking this kind of thing is the perfect way to generate sports commentary.

alickz

10 months ago

I think the average person is more interested in the output than in the process e.g. more people want to read The Shining than want to read about how The Shining was written

grugagag

10 months ago

I'd say most people skip the reading part and watch the movie instead.

mistrial9

10 months ago

> the interesting thing is that most people don't really care

no one has gotten feedback from "most people" .. this is raw hyperbole

InDubioProRubio

10 months ago

Thank you for saying that, it was always a background task thought, but now that you put it in words. This. The churn shall burn..

bambax

10 months ago

Yes, this is impressive, it has all the idiosyncrasies of podcasting, the pauses, turns of phrase, even the tones where we hear people putting things in quotes, etc.

... but it's also pointless. And it's likely different episodes on different topics will tend to sound very much alike; it's already the case here, I'm sure I heard another example where the two voices were the same.

In less than a year we all have learned to recognize AI images with pretty good accuracy; text is more difficult, but podcasting seems easy in comparison.

janoc

10 months ago

Well, yes. Replace the various music and book publishing mills with LLMs for even more low quality drivel filling the marketplaces because now even the already low barrier of having to actually pay someone to produce it will be removed.

That's definitely going to be an improvement. Not.

user

10 months ago

[deleted]

hn_throwaway_99

10 months ago

I thought this was a great, insightful comment, but noodling over it a little more made me think it's not just content producers who are responsible for this "quality vacuousness" epidemic.

I think this is just partly an inevitable consequence of going from "content scarcity" to our new normal of "content obesity" over the past 20 years or so. In this new era of an overwhelming amount of content, it's just natural to compare it all against each other, e.g. to essentially "optimize" it to the "best" form, but in doing that we've fallen into a homogeneity, and the resulting lack of variation is an actual lowering of quality in and of itself.

2 examples to explain what I mean:

9 months ago

9 months ago

10 months ago

Must have been prompted to be an American podcaster.

Bring on the one that's all British and snarky!

9 months ago

As an American, I find it exhausting. I think of it as fake Silicon Valley/SF "nice" affect combined with non-US English as a second language floridity. My setup prompt for ChatGPT includes a reminder that if it answers too long or slowly, then it will take time away from my medical research grad students and people will probably die as a result of the delays. It helps a little.

mharig

10 months ago

[dead]

djur

10 months ago

This seems to be a common trait of a lot of the more "aligned", "helpful" LLMs out there. You can drop any random excerpt from your diary into ChatGPT and it will tell you about how brilliant, sensitive, and witty you are. It's really quite sickening.

firtoz

10 months ago

Reminds me of my father who'd tell every kid that they're a genius, including myself. It got me motivated to try things, but whenever there was a failure, I felt terribly betrayed.

scotty79

10 months ago

General advice from psychology is that when it comes to success you should praise the kids for things they control, like effort, time spent, inquisitiveness, and concentration, not things that are out of their control like talent or luck. Basically praise for what they did, not what they are.

When it comes to morality, it's the other way around. You praise kids for being good people when they do something right. Because you want them to internalize the identity of a good person and associate it with those behaviors.

Internalizing identity of a genius is mostly useless, rarely beneficial, often harmful.

10 months ago

This is impressive from a technical point of view and probably useful from an educational one; I really like the idea that a piece of text can be transformed into any kind of media format easily, depending on your preferences. As recently as a year ago I was using Apple’s text to speech tool to listen to Wikipedia articles while biking, and needless to say, they weren’t very exciting to listen to.

But I don’t think it’s much of a threat to actual podcasts, which tend to be successful because of the personalities of the hosts and guests, and not because of the information they contain.

Which leads me to hope that the next versions of Notebook will allow more customization of the speakers’ voices, tone, education level, etc.

JimDabell

10 months ago

> But I don’t think it’s much of a threat to actual podcasts, which tend to be successful because of the personalities of the hosts and guests, and not because of the information they contain.

I wonder if any “blended” podcasts will pop up, where a human host uses a tool like this for an artificial cohost.

Merik

10 months ago

Latent Space AI Engineering podcast does this with an AI cohost; mostly for intros and segues. A recent episode used it to summarise a Twitter AMA and while it’s usually used to good effect, that one was one of the first episodes where the quality of the cohost part was lacking, as it mispronounced things, and was a bit muddled in parts. That said, the podcast has been an incredibly useful and insightful regular listen for me.

swyx

9 months ago

hey that was me! yeah we've been amping up the ai content in the pod as you see, hopefully experimenting in tasteful ways.

I'm not super proud of the Twitter AMA one and if u listen back now i fixed many of the bad cutovers. I doubt i'll repeat it again on current tech.

thank you for listening! feedback and ideas welcome.

someothherguyy

10 months ago

10 months ago

The underlying NotebookLM is doing better at this - each claim in the note cites a block of text in the source. So it’s engineered to be more factually grounded.

I would not be surprised if the second pass to generate the podcast style loses some of this fidelity.

ColinEberhardt

10 months ago

I don’t think this is all that impressive, the generated podcast is pretty shallow - lots of ‘whoa meta’ and the word ‘like’ thrown into every sentence.

Yes, it will generate a middle-of-the-road waffling podcast, but not one with any real depth.

infogulch

10 months ago

Look I agree with you at a certain level, maybe it can't emulate deep conversations about big topics (maybe it can, I haven't seen an attempt...), but a vast vast majority of podcasts and radio shows are just like this: shallow and incredibly simplified with no more than a nod to the underlying concepts. 70% personality, 20% dumb analogies that the producer thought up in thirty minutes, and <10% actually communicating the material is standard fare for normie podcasts, sadly.

Honestly, given the personalization maybe it's a net improvement.

djur

10 months ago

Kind of feels like looking at an overflowing landfill and thinking "I wonder if we can invent a robot that just generates new trash directly into the landfill".

squigz

10 months ago

This holier than thou attitude that crops up in these threads is so annoying, as if people wanting to casually enjoy a mediocre podcast or radio show on the 1 hour commute to their shitty job is a crime.

klabb3

10 months ago

I don’t think anyone cares about other people’s cheap pleasures. What people do care about is the displacement of quality and craft. For instance, you could say the same thing about the state of the web - say when searching for recipes. Maybe some people like the ads, the consent forms, the backstories? Why so purist? Isn’t it nice with a bit of scrolling and getting in the mood for cooking with a bit of SEO?

10 months ago

I was blown away by how impressive it was. I honestly thought it was real. I still can't believe these realistic audio capabilities are not being used for pure evil everywhere we look.

> like thrown into every sentence

I think that's actually part of why it sounds real, because tons of people do actually talk like that.

To me what would make it even better is the ability to throw in random jokes and utilize information about their surroundings and recent events.

I have been using MeloTTS for text-to-speech and I thought that was about the best we could do right now, but apparently I was very wrong. Is there an offline model one can download today that sounds as good as this NotebookLM?

JonathanFly

10 months ago

Bark can sound as good, but Google is using SoundStorm which was specifically trained on dialogs. Surprisingly Bark can even sort of match it without being trained to do so, but not reliably. (https://x.com/jonathanfly/status/1675987073893904386)

And SoundStorm has more than twice the context window of Bark so dialogs are a tight fit.

ranger_danger

10 months ago

I just tried the default bark.cpp example from the github readme, and to me it still doesn't sound close enough to realistic, and the audio quality itself was a bit scratchy... maybe I'm doing something wrong.

When I tried my own text with it, it went completely off the rails... skipping completely over random words, and also switching to different voices in the middle of a sentence. Trying to run the large model also crashed entirely.

JonathanFly

10 months ago

You aren't doing anything wrong - Bark out of the box uses a randomly generated voice and I like to think it's modeling the world of random voices which includes bad microphones/audio-quality. (Even bad 'actors' - see how many Bark voices sound like they are reading a script.)

Presumably it was trained on noisy data. But it can generate and use a clean voice, they are in there. Most of the Suno default voices are not great either - but a great voice can sound perfectly clear. I haven't done much with Bark lately but on my Twitter there's plenty of clear examples of very realistic voices. Actually here I ran a prompt based on some copy and pasted text 20 times in Bark. I put a couple better results up front, but even in later samples you can find lots of evidence of human-sounding voices. https://sndup.net/bzhz5/

shepherdjerred

9 months ago

You’re talking about advancements made through multiple lifetimes. This burst in AI has lasted about 15 years.

TBH I think it’s more of a knee jerk reaction from those tired of hearing about AI or who just want to post contrarian opinions (which I totally do sometimes, too).

abraxas

10 months ago

At the risk of sounding cliché, this is the worst this tech will ever be. I find it equally scary and fascinating what lies ahead.

roywiggins

10 months ago

The content is nothing that special these days, you could get it out of Gemini or Claude probably- but the audio affect is awfully convincing.

- Some of the commentary was insightful and provided a pretty nice marketing summary of ideas I tried to convey in my terse (US style) resume.

- Some of the comments were so marketing-ey that I wanted to gag. But at the same time, I recognize that my setpoint on these issues is far toward the less-bs side, and that some-bs actually does appeal to a lot of people and that I could probably play the game a little stronger in that regard.

Overall I was quite impressed.

Then for fun I gave it a Dutch immigration letter, one which said little more than "yeah you can stay, and we'll coordinate the document exchange". They turned that into a 7 minute podcast. I only listened to the first 30 seconds, so I can only imagine how they filled the rest. The opener was funny though: "Have you ever thought of just chucking it all and moving to a distant land?" ... lol. Not so far off the mark, but still quite funny to come up with purely from an administrative document.

amunozo

10 months ago

I tried it converting bureaucratic documents from Spain, even a paper sheet to just ask for holidays, and it created the funniest podcast I've ever heard. I'm glad I'm not the only one doing this stupid thing.

handelaar

10 months ago

So basically what you're all saying is how it's technically impressive. Okay.

It is also completely and utterly worthless -- an inefficient and slow method of receiving not-very-many words which were written by nobody at all.

The one and only point listening to a discussion about anything is that at least one of the speakers is someone who has an opinion that you may find interesting or refutable. There are no opinions here for you to engage with. There is no expertise here for you to learn from. There is no writing here. There are no people here.

There is nothing of any value here.

double051

10 months ago

This sentiment feels overly dismissive about the possibilities here. This is the first pass at a new user experience, and I find it already to be compelling to try for various subjects.

Andrej Karpathy has been tweeting about it positively, and I believe he has a good intuition about these kinds of technologies. https://twitter.com/karpathy

joe_the_user

9 months ago

> This sentiment feels overly dismissive about the possibilities here.

No, I see the gp as talking about the possibilities of this technology - its possibility to waste someone's time. The problem, in a sense, isn't just that it's injecting simple content with "fluff" but that the fluff is formulaic. Listening to a human speak in awestruck tones about "magic" gives the listener at least a sense that a real person was convinced by X. Listening to a simulation of this, you lose the filter of the real person.

Of course, this is just the automated continuation of the existing standard of talk show hosts who gush over whatever is placed in front of them so it's just one more step down the general mediocratization of the world, not a special step. But it still is a step in that direction.

yeahwhatever10

9 months ago

I don't hate the product, but God I hate appeal to authority.

sodality2

10 months ago

This is some insane catastrophizing. The value is that it turns it into a form factor that may be easier to consume, pay attention to, etc.

pjc50

10 months ago

Turns what into an easy form factor?

Some of this appears to be auto-summarization + read aloud, but the underlying question of "is there anything here at all" is worth asking.

sodality2

10 months ago

Any content you upload. PDFs, text, etc. Academic papers was one example I thought of (and have used).

EGreg

9 months ago

Welcome to all entertainment

Why consume entertainment? It’s just a time waster, right?

Well that’s how the news is often consumed. Through some sort of “morning joe” podcast

mdp2021

10 months ago

Since when are industrial snacks healthy food?

InsideOutSanta

10 months ago

This probably isn't really a good analogy. It's just a fact that for most people, a conversation is more engaging than an academic paper. It's easier to pay attention to it, and it's easier to retain the information in it.

This might be healthy food that tastes like a snack.

mdp2021

9 months ago

10 months ago

Indeed. But MREs, protein shakes, Huel etc. are also a product of industrialisation.

In this case, I could see potential value for a better iteration of this tech, making it a meal replacement shake rather than a candy bar.

There's too much interesting content for me to read it all, and I have a long commute. Right now I'm using that commute to learn German, and that is a good use of that time, but let's say I didn't need to because I hadn't moved country or I was already fluent: in this hypothetical, I'd gladly have a better AI than this(!) generate podcasts about the articles that I don't have time to read.

But the AI would need to be better than this one for that to be worthwhile — I just popped one of my own blog posts into it, and it was kinda OK-ish, but did make some stuff up. Now sure, the Gell-Mann Amnesia effect was written with humans in mind, but that's a shared disappointment and not a reason to let this AI off that particular hook.

redleggedfrog

10 months ago

"... insane catastrophizing." Nice unique phrase. Guessing you're not an LLM. ;^)

The thing that is being offered is of no interest to me, as is almost any AI-generated content. I'm a human, and am interested in what humans do and say and think. AI content offends my sensibilities at every level. I dismiss it without even thinking twice. So all those people who do podcasts, music, art, whatever, with AI, well, you lost me folks. I pay a lot of money for the things I like. AI ain't getting any of it, not out of spite (can't spite an AI, they're not human!) but on principle.

cdrini

9 months ago

I will note this is slightly less an example of "AI generated" and more an example of "AI transformed". This takes existing human-written documents or articles and transforms them into a podcast. Based on what you've written here, this shouldn't necessarily be in contradiction with your values, since you're still getting thoughts from other humans, and you can still pay money to the humans who made the original article, etc.

sodality2

10 months ago

That's fine. To say you don't like something is fine. To say something holds no value is a stronger claim.

redleggedfrog

10 months ago

I'd go even further than "hold no value" and say it's actively detrimental to both the individual and society. We already have an avalanche of dehumanizing technology that isolates and placates us. We see the results of this with problems in mental health and socialization. This is a downward spiral as AI content will likely appeal to those who lack social skills as they don't have to cope with tricky vagaries of other humans - which is part of what makes us human and gives us social growth.

sodality2

10 months ago

Even more catastrophizing. Do you get upset when you read the abstract from an academic paper? Or when you listen to a real podcast that does summarize a difficult topic in easier/shallower terms? Is the problem the fact that an AI summarized it? Can you point to a real harm here, or will you just hand-wave, instead of seeing the reality of making information more available being a net positive?

joelanman

9 months ago

a few real harms:

- massively inefficient use of energy, water and other resources at a time we really need to address the climate crisis

- AI 'slop' with myriad mistakes and biases performing a mass DDOS on people trying to learn things and know what's true

- moving resources away from actually producing factual and original content

sodality2

9 months ago

Thank you, these are mostly extremely valid complaints. I hope with time these come to be inefficiencies that can be moved past (AI models turn into local-first energy efficient tools, become more intelligent at summarization). Right now though, wholeheartedly agree.

The last one seems to be irrelevant for this specific use case - the content is produced, it's put into an easier to digest format. No one thought SparkNotes would kill books.

joelanman

9 months ago

I was referring to real podcasts

redleggedfrog

9 months ago

Interesting. You have turned this around to be about me instead of the ideas. You must be good at arguing on the internet. I'm not.

sodality2

9 months ago

Well, I'm just curious why you think something like this has negative value - I _do_ care about the ideas but you are the one who expressed that sentiment.

EGreg

9 months ago

Here is what AI can do thus far:

1) humans produced a lot of content in good faith on the internet

2) the AI was trained on it and as a result produced a non-von-Neumann architecture that no one really understands, but which can reason about many things

3) even simply remixing the intelligible and artistic output of millions of humans in lots of nonlinear ways, directed by natural language, leads to amazing possibilities that obviate the need for humans to train anymore because by the time they do, it will all be commoditized.

4) doing it at scale means it can be personalized (also create unlimited amounts of fraudulent yet believable art / news / claims etc.) to spam the internet with fake information for short-term goals, some for LULz, others profit or control etc.

5) targeting certain goals, like reputation destruction of specific people or groups, seems like low hanging fruit and will probably proliferate in the next couple years, with no way to stop it

6) astroturfing all kinds of movements, with fake participants, is also a pretty easy goal with huge incentives — expect websites where 95% of the content and participants are fake trying to attract VC money or sell tokens, etc.

7) but ultimately, the real game changer is commoditizing everything you consider to be uniquely human and meaningful, including jokes, even eventually sex and intimacy. Visuals for heterosexual men, audio for heterosexual women (this is before the sexbots and emotionbots that learn everyone’s micro-preferences better than they know themselves, and can manipulate people at scale into being motivated to do all kinds of things and gently peer-pressure those who might resist).

8) For a few years they will console themselves with platitudes like “the AIs aren’t meant to replace, but enhance, centaurs of human + computer are better than a computer alone” until human in the loop will clearly be a liability and people will give up… the platitudes will become famous as epitomizing optimistic delusions as humans replaced themselves

Would probably be used by busy parents to raise their kids at first, in a “set and forget it” way, educating them etc. But eventually will be weaponized by corporations or whoever trains the models, to nudge everyone towards various things.

Even without AI, the software improves all the time through teams of humans sending automatic updates over-the-air. It can replace a few things you do… gradually then all at once. Driving. Teaching. Entertainment. Intimacy. And so on.

I think the most benign end-game is humans have built a zoo for themselves… everyone is disconnected from everyone by like 100 AIs, and can no longer change anything. The AIs are sort of herding or shepherding the humans into better lifestyles, and every need is satisfied by the AIs who know the micro-preferences of the humans and kids and pets etc.

But it will be too tempting for the corporations to put backdoors to coordinate things at scale, once humans rely on their AIs rather than other humans, a bit like in the movie “Eagle Eye”. But much more subtle. At that point most anything is possible.

EGreg

9 months ago

Hahaha

Here we go, a claim that AI will create a glut of things detrimental to society

And then you’ll have the usual response that the things detrimental to society have already been there and this is nothing new

And round and round we go, while AI advances and totally commoditizes all the things humans produce that you found meaningful.

RobinL

10 months ago

To take one example of where this is valuable:

- Take some dense research paper or other material that is unsuitable for listening to aloud

- Listen to it (via NotebookLM) whilst commuting/washing up or whatever

This way you'll have a big headstart on what it's all about when you come to read the details.

I imagine in future we'll see a version of this where the listener can interject and ask questions too, that feels like a potentially very powerful way to learn.

usaar333

10 months ago

I tried that with a paper. It emphasized the wrong points and 8 out of 10 minutes were just filler.

I like the idea of audio based formatting, but this particular implementation is quite inefficient

bbor

10 months ago

Interesting! I tried it with a (famous, tbf) philosopher’s book and it did pretty well. Absolutely not optimized for speed, but that’s on purpose. Could you share what field/type of paper you tried? I’m not doubting you at all: I’m sure it still has many topics it fails to capture, mathematics probably being one of them.

usaar333

10 months ago

https://repository.law.umich.edu/mjlr/vol25/iss3/3/

Most of this is unlikely to be in training data.

causi

10 months ago

It's just format-shifting content. Rather than reading an article, someone might prefer to have the content casually chit-chatted at them. Nothing wrong with that, and a handy function if you're into that sort of thing. I can see uses for it.

GTP

10 months ago

I often listen to podcasts when I go out for a walk. If this really works as advertised, it could be a chance to revise some material while I'm enjoying the weather (or, in this season, the rain... But you got my point).

cdrini

9 months ago

This seems like a pretty disingenuous reading of the comments and a misunderstanding of the feature. All your points are valid, but I just don't think they apply here, because the generated podcast is based on a human-written article. It's not asking an AI to create a podcast from scratch -- in which case I think all your points would be entirely valid. It's transforming existing human-created content into a different medium. There _are_ opinions to engage with. There _is_ expertise to learn from. There _is_ writing. There _are_ people. These were all in the source content used to create the podcast.

rafram

10 months ago

the8thbit

10 months ago

Not the person you're responding to, but no, it doesn't really bother me at all. What does bother me is that I don't have confidence in the value of the output, whereas if I listen to This American Life, or a podcast or audiobook from a trusted authority, I don't have to worry about that.

orangecat

9 months ago

Fascinating. I don't have that reaction at all, but if it's common it could account for some of the variation in people's perceptions of AI.

ZeroGravitas

10 months ago

I feel like this is also exposing the same fundamental flaw with human created content of a similar nature.

Two attractive human "journalists" with nice speaking voices and fake rapport reading a script that was written for them is not really far off this.

I was about to say the only real benefit is that the AI voices won't start running for Congress on authoritarian lies or peddling anti-vax takes as the next step in their career, but thinking about it they are probably already being used for this.

edanm

9 months ago

Yeah, don't even get me started on audiobook narrators. Sometimes these people read entire books of nonfiction that was written entirely by someone else.

IshKebab

10 months ago

Yeah they perfectly recreated the annoying useless podcast chat format!

Amazingly impressive but not actually useful.

I wonder why they wouldn't try to recreate a more useful format?

hn_throwaway_99

10 months ago

OK, this is pretty amazing, but is there a "Valley Girl" setting in NotebookLM somewhere? In the sample given in this article, both of the "podcasters" had to add a "like", like every 5 seconds. I couldn't take it:

> this tech is just like leaps and bounds of where it was yesterday like we're watching it go from just spitting out words to like...

niemandhier

I asked a friend if they had any ideas about something, and they asked an LLM, and it's like... If I wanted an LLMs answer, I'd ask it myself. I want your answer, distilled through your experience and opinions...

GaggiX

10 months ago

If it was vomit, why did you spend an hour on it? People complain about 2 minutes of audio sometimes, I cannot imagine a full hour of an unknown podcast, it must have been quite interesting.

Ardren

10 months ago

Because they assumed that there was a good reason that their friend sent it!?

I had a friend who did the same to me. I was sent a message asking my opinion on a tech topic. I spent 30min researching/reading to make sure my reply was accurate and then found out the question was generated by an LLM, and he just wanted to show off how good an LLM was.

It will color every interaction you have with that person...

unraveller

9 months ago

You ever watched a reviewbrah video? He doesn't get to the "without any further ado" moment until after the halfway point of the video. The prank is the wasted time. But the joke is every other YTber does it more subversively without you getting any laughs out of it. It proves we give way more attention to slop than we dare to calculate.

https://www.youtube.com/@TheReportOfTheWeek

GaggiX

10 months ago

No one listens to an hour of actual vomit just because a friend sent it to them. At minimum, you should value your time more if you do.

BaculumMeumEst

10 months ago

Probably spent an hour waiting for it to get to the good part. Haha!

10 months ago

I loved when they said they were going to play a snippet from a generated podcast and then some robotic male voice says something like "Insert audio snippet here".

CaptainFever

9 months ago

Note that this was at the 1min mark.

leetrout

10 months ago

10 months ago

I just gave it straight up erotica from an old Usenet post. The results are hilarious.

I also tried the Flyting of Dunbar and Kennedy. It was actually well done. https://notebooklm.google.com/notebook/1d13e76e-eb4b-48ef-89...

Also just uploading msdos 1.25 asm https://github.com/microsoft/MS-DOS/tree/main/v1.25/source

It was way better than I thought

I think the best is the self referential. This actual comment thread: https://notebooklm.google.com/notebook/4a67cf10-dd3b-42b3-b5...

valleyer

10 months ago

FWIW, I put MS-DOS's IO.ASM into this thing, and it did indeed make a fun little podcast that understood the high-level context quite well.

But when it makes references to such-and-such happening on line number X, and I go check line X, it turns out to be totally mistaken.

kristopolous

10 months ago

So like, a regular podcast then?

I tried feeding it the Voynich manuscript but it's just erroring out

Make sure you check the last link in my first post. It's the nightmares of Philip K Dick

10 months ago

Apparently people are already spamming podcast sites with NotebookLM: https://x.com/ListenNotes/status/1840470094708899992

>do you have tools to detect if audio is generated by notebooklm?

>we’re seeing a rise in fake, single-episode podcasts submitted to http://listennotes.com using it.

pbw

10 months ago

I really enjoy these. I’ve listened to them while driving: blog posts by Astral Codex Ten or Paul Graham that I had never bothered to read.

There are millions of real podcasts, but now there are an infinite number of AI generated ones. They are definitely not as good as a well-made human one, but they are pretty darn decent, quite listenable and informative.

Time is not fungible. I can listen to podcasts while walking or driving when I couldn’t be reading anything.

Here’s one I made about the Aschenbrenner 165-page PDF about AGI: https://youtu.be/6UmPoMBEDpA

d4rkp4ttern

10 months ago

I actually find this alternative Google pdf-to-podcast service much better — it is less sensationalist and goes into more technical depth:

probably_wrong

10 months ago

From what I can gather there are three virtual speakers in the one about Pi: the man, the woman, and a third voice whose only role is to say "yeah" once. If they were real people, that third guy would definitely feel left out.

But the one about common words almost gave me anxiety: listening to two people discuss nothing as if they had spent hours of research and had something important to tell is very depressing.

Cool voices, although I'm getting the same vibe I get when listening to radio announcers from the 1920s [1]. If this were a human I'd be convinced that they're parodying the genre.

[1] https://www.theatlantic.com/national/archive/2015/06/that-we...

10 months ago

Not true. I gave it a few documents and webpages of things I'm interested in and it was surprisingly engaging.

kombookcha

10 months ago

I strongly doubt that you're gonna be listening to this stuff recreationally once the novelty wears off, but if I'm wrong and you actually enjoy listening to two robots pretend to be excited about your documents and webpages long term, then have fun with that I guess.

emsixteen

10 months ago

> As a podcast listener, I lose interest if I can tell the audio is AI-generated...

I've never naturally come across a podcast that's AI generated to have this reaction.

empath75

9 months ago

Youtube is full of AI generated glurge now, though.

shinycode

9 months ago

I hate those videos so much. It would be awesome to have all AI crap removed from the paid version of YouTube

Quothling

10 months ago

jonplackett

10 months ago

If we could just, like, stop it, like, saying like all the, like, time. That would, like, make it 100x better.

kingkongjaffa

10 months ago

Haha The example audio sounds like the guys from manager tools https://www.manager-tools.com/2005/07/the-single-most-effect...

globular-toast

10 months ago

I've always found podcasts like this boring and uninspiring. In fact, I'm starting to see a pattern: the less I like something, the more likely it can be done well with AI. But I know I'm in the minority as so many seem to be ok with filling their lives with "content".

felipeerias

10 months ago

AnIrishDuck

9 months ago

Yes, technically. But the broader point is true. Go is a game with well-defined win and loss conditions that can be automatically evaluated.

9 months ago

Is there anything to the notion that in Go, success and failure are concrete, objective, and more or less easy to measure (or at least measured along the same kind of rules)? While it is computationally intractable to iterate through future moves to an end state, it’s still relatively easy to understand how well you’re doing at any point, and you measure that in basically the same way every game.

For some parts of language, that’s true: there’s grammar, there’s syntax, there’s patois, there’s argot—all these things seem accountable to words’ collective frequency within articulable groups of speakers, more-or-less-fully knowable on their own, and with success metrics that evolve but that do so through collective processes that models can measure and calibrate to. And indeed the models are great at those aspects of language.

“Succeeding” at writing is more than just “saying it well,” it’s also “having something worth saying” and “being worth listening to.” The second point is where things seem to get hazier for computable models. For sure there’s a set of facts that are more or less constant about the world, and well-reported. Science, repackaging history that’s already been done, lurid tales of crime—the stuff podcasts are made of! Not to mention the vast sea of data that sensor networks and automated research can produce—vast reservoirs of subtle truth that humans struggle to begin to mine for insight! It makes complete sense that this is computable stuff, and that computed writing might well be worth learning from.

But important writing—classically, anyway—seems to involve communicating new or idiosyncratic knowledge, and often reveals some of the process of developing it. The podcast Serial, for example, was a smash hit specifically because it didn’t rely on things that were part of the record—and because it reminded people how contingent memory and “truth” are. Bob Woodward writes things that are shamelessly tinted with Bob-Woodward-worldview, but people reveal important and true things only to Bob Woodward because they trust who he is and how he’s behaved for a lifetime (prominent longtime investigative journalist in the US, on the national security beat). Nassim Taleb seems to come up around here: in something like Antifragile his project wasn’t necessarily about new facts but about interpreting them in contrarian fashion and grouping those contrarian insights to synthesize a new theory.

Which brings us to the third component: “being worth listening to.” Writing is an act of communication: the writer matters. A parent hangs their child’s crayon drawing on the fridge not because it’s “authentic to the style of the kids’-crayon-drawing mode of visual art,” not because it’s novel or informative or even true-to-life, but because it came from a person they love. A “Dear John” letter devastates a soldier because it comes from a person with an outsized part in their life and identity. Chinese publishers’ booths at trade shows are wall-to-wall translations of The Governance of China because it’s politically unwise not to. My favorite writers feel fresh: you feel elements of their personality come through. People have a special fetish for true crime. Not that there’s any lack of fictitious crime to read about, but the fact that it happened to real humans potentiates the drama for these readers. It’s this aspect that I have a hard time understanding as computable (or commoditizable, I guess… are those similar phenomena?).

Already we seem to be drawing these distinctions in our collective reaction to LLM-stuff. We can’t wait to get hallucinations under control so we can chuck in gigantic boring contracts and internal wikis and financial reports, and get out comprehensible insight—but we roll our eyes at the tsunami of empty slop that’s overtaken Google results. We giggle at AI ventriloquism like this Neuro character [0], but die a little inside every time we read anodyne LLM-ish promotional copy and sameish AI art. First-level customer support seems like a perfect role for a chatbot—“turn it off and on again,” but nicely!—but people on the receiving end hate it [1] even for that task well-suited to it.

I’m only a layperson of course, but I wonder if any of those distinctions might be fruitful? Some of it I guess sums up to the old writing advice “show, don’t tell”—are there examples of machine writing showing promise in that way?

[0] https://m.youtube.com/@neurochron_fan_channel (video; brain rot)

[1] https://www.theregister.com/2024/07/09/gartner_simply_replac...

gcanyon

9 months ago

10 months ago

It’s hard for me to believe that this isn’t two real people talking. The only complaint I have is that they say “like” a little too often.

rcarmo

10 months ago

Reminds me of Futurama news stories. Actually, what if NotebookLM could be customized to generate podcasts voiced by Morbo the Annihilator and his co-host Linda van Schoonhoven?

Still, I don’t have much confidence in podcasts as knowledge transfer tools. It’s a nice gimmick with great voice synthesis, but it feels formulaic and a bit stilted from a knowledge navigation perspective.

drusepth

9 months ago

I hate podcasts because they're so often focused on the speakers' personalities and windy, undirected things. I've tried to listen to so many podcasts and always dip an episode or two in because they devolve into people just chatting instead of actually presenting well-organized facts about what I want to listen to / learn about.

The structure and bare-minimum "human" aspect of this seems perfect for people like me to actually get into podcasts. I do wish I could further cut out all the disfluencies (um, like, uh, etc) though.

The only barrier for me IMO is wondering how accurate those facts actually are (typical research-with-AI concern).

I'm very much looking forward to a more interactive form of this, though, where I can selectively dive deeper (or delve ;) ) into specific topics during the podcast, which is admittedly very surface-level right now.

gexla

10 months ago

Getting complex jokes right would be impressive for me. I don't have much of a sense of aesthetic for music and most art. A painting looks good, but I don't understand how I'm supposed to appreciate it. Half my music could be AI generated, and I wouldn't notice if it's background music. An AI generated wine would taste the same to me as a $1000 bottle. But I think most people understand comic genius. Chapelle's jokes are far better than someone who is on stage to deliver a performance with predictable material. You could probably apply this to all other art as well. A rap artist will recognize the genius of one artist vs another one who is cranking out junk. As with writing. As with music. I think we're still in that stage where we're impressed the AI can do anything at all.

VMG

10 months ago

> Chapelle's jokes are far better than someone who is on stage to deliver a performance with predictable material.

10 months ago

I fed it some info about my UX mobile app. Some parts are very cringe, extremely positive, but in the end it went on to brainstorm a potential 'next step' feature that was quite creative: letting end-users test out prototypes during the wire-framing process. Also some more marketing-like text like "It's like drawing on a napkin, but the napkin is in your phone". I like that.

So as a brainstorming tool, it's a nice low-effort way to get some new perspectives. Compared to the chat, where you have to keep feeding it new questions, this just 'explores' the topic and goes on for 10 minutes.

vochsel

10 months ago

They've really nailed the back and forth of the two speakers!

It would be interesting to know if it's multimodal voice, or just clever prompting and recombining...

emsign

10 months ago

Who wants to listen to this? Is there seriously a market for non-human hosts?

_ink_

10 months ago

For me, it all depends on the quality of content. If it's good I wouldn't care by whom it was generated. The podcast thing is impressive, but not quite there yet. But I could imagine that this will change in the next few years.

mrdevlar

10 months ago

Like... probably... like not.

qnleigh

9 months ago

I hope they add a feature to tune frequency of the word "like." The hosts in the example were using it multiple times per second.

But more seriously, I suppose there will probably soon be a flood of AI-generated podcasts, if this hasn't happened already. Pick a niche but not too niche topic, feed in a bunch of articles on it, and boom you've got season one. Given the quality, I could see one actually catching on...

Also this would be handy for getting listening practice in other languages. Makes it much easier to find content that you find interesting.

olavgg

10 months ago

This is really awesome. I just added my startup website as a source, which is a mess of data engineering content written a little bit by myself and mostly by ChatGPT 3.5 one year ago. What I find really impressive: it reads the big SVG I have on the landing page and creates a story about a real-world use-case scenario.

The result: https://intellistream.ai/static/intellistream_podcast2.ogg

dartos

9 months ago

SVGs are just text, after all.

Lockal

10 months ago

Here is a list of adverbs/adjectives from that page: "surprisingly, astonishingly, deep dive (s/delve/dive/), effectively, honestly, actually, realistically, finally". What is actually happening: endless yapping. Both in podcasts and this article.

  - "Hold up. What if I say that sky is not blue?"
  - "Whoa, I did not even think about it. "
  - "Wait, so if the sky isn't blue, what color is it then?"
  - "Maybe... it's invisible? Like, we can see through it, so technically it's not there!"
  - "Exactly. This idea is revolutionary, right?"
  - "Bla bla bla bla bla bla bla bla bla"

I failed to listen through the whole example audio attached, because, you know, it is mostly, like, throwing, like, arbitrary, like, questions - and confirming them, you know, with words like "exactly/see/yeah/you got it/you know it/yeahaha/pretty much, right/that's a million dollar question", you know. It's a brainrot conversation I would never listen to.

gapeslape

Let's say the use case is that you want to get a light, conversational summary of some dense, technical articles while you're out for a walk. Even if you thought this service was awesome on day one, if you used this every day for a month, would you hate it by the end or not? It's neat, but I can imagine it becoming repetitive quickly, and the seams starting to show after the initial impression wears off.

dartos

9 months ago

Wow like those like AI podcast like hosts were like so annoying.

They like kept like saying like like in between each like word.

10/10 for realism.

hu3

10 months ago

This is better than I expected.

I sent the podcast audio to a friend whose first language is not English, without telling them it was AI generated.

They found it entertaining enough to listen to the end.

Sure, it needs more human unpredictability and some added goofiness. Maybe some interruptions because humans do that too. But it's already not-bad.

nirav72

9 months ago

This is amazing. I fed it a Linux Bash shell & CLI reference guide in PDF format I had on my machine. It took about 10 minutes. But wow. Obviously it didn't go into any details. But it kinda gave a great overview of what bash is, how it works and how bash scripts can be useful.

spikey_sanju

10 months ago

nickhodge

10 months ago

what fresh hell are we creating?

jldugger

10 months ago

Remember how in Fahrenheit 451 Montag's wife surrounds herself in her parlor, walls decked out with massive TVs running an interactive 24h soap opera?

That seems the direction we're headed in, and some people say the zipbombers can't come soon enough.

palmfacehn

10 months ago

Podcasts and chat are interesting, but the real potential in this would be to synthesize new documents from the inputs. Apply the information gleaned from the study materials to a user scenario and output a new work of fiction.

efitz

10 months ago

What are humans for, then?

stuaxo

10 months ago

I like how he says not robotic sounding podcasts but then does sound a bit like a robot.

I didn't listen further in to see if it was a robot or just that he was American (I may later though).

replete

10 months ago

Is there a `like_temperature` that could be, like, adjusted??

freedomben

10 months ago

I'm not normally one to require features in order to use, but this one is an absolute must for me.

m3kw9

10 months ago

You guys understand how many people are creating a pipeline for this? The prompt is basically "From the article, create a podcast format script".

DiscourseFan

10 months ago

Ok, so, this is my impression from shoving philosophy texts into it.

So, absolutely - wow factor. But still need content validation on top. Don't think any of you are surprised but felt it was worth emphasizing.

https://theteardown.substack.com/p/ai-expressing-empathy-fre...

abdellah123

10 months ago

This is mind-blowing!!

10 months ago

Ladies and Gentlemen, let the race (to the bottom) begin!

While the vultures will shit out AI generated garbage in volume to make ever diminishing returns while externalizing hosting cost to Youtube and co, actual creators will starve because nobody will see their content among the AI generated shit tsunami.

Finally the AI bros are finishing the enshittification job their surveillance advertising comrades couldn't. Destroy ALL the internet! Burn all human culture! Force feed blipverts to children for all I care, as long as I make bank!

I guess it's easiest to destroy culture if you didn't have any to begin with.

senko

9 months ago

Is there a tool to do the opposite? I can't stand podcasts as a format (even if transcribed).

simonw

9 months ago

Google Gemini running in AI Studio accepts audio files, so you can upload an MP3 to it directly and prompt it to "rewrite this content as a casual blog post" (or whatever format you want) and it should work really well.

Or manually transcribe the podcast with Whisper (I use the MacWhisper app for this all the time) and then dump that transcript into an LLM and ask it to reformat that.
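
A minimal sketch of that second route, assuming the transcript is already a plain string (from Whisper or anything else). The function names, the chunk size, and the exact prompt wording are hypothetical, and the actual LLM call is left out because it depends on which provider you use:

```python
def split_transcript(transcript: str, max_chars: int = 8000) -> list[str]:
    """Split a transcript into chunks of at most max_chars characters,
    breaking on blank lines (paragraphs) where possible."""
    chunks: list[str] = []
    current = ""
    for para in transcript.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # A single paragraph longer than the limit is cut into fixed-size pieces.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) > max_chars:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def build_reformat_prompt(chunk: str, target_format: str = "a casual blog post") -> str:
    """Wrap one transcript chunk in a reformatting instruction (hypothetical wording)."""
    return f"Rewrite this content as {target_format}:\n\n{chunk}"


if __name__ == "__main__":
    demo = "First point from the episode.\n\nSecond point from the episode."
    for piece in split_transcript(demo, max_chars=40):
        print(build_reformat_prompt(piece))
        print("---")
```

Each prompt would then be sent to whatever model you prefer. Chunking only matters for long episodes that do not fit in one request.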

stevage

10 months ago

Jesus it's good. I gave it some of my travel blogs, and wow. I mean, there are flaws, particularly in the shallowness of the analysis, but it's at least as good as some time-poor podcast hosts would do.

ionwake

10 months ago

TBH I'm wondering, is there any way to increase the depth or change the approach by prompting a model for it? Will that be in a future release or hybrid product? (The audio tech is seamless, 100% perfect.) It's the quality of the content that needs work now. Is there no way to plug this into another LLM?

jcgrillo

9 months ago

OK, but what's it for? The great thing about books is that they're written in long form, often with references, footnotes, diagrams, etc. The great thing about technical documentation is that it's thorough and germane to some piece of software or hardware. What's good about taking these precise, accurate, and largely correct sources of information and mashing them all up into some simulated inane banter between two "hosts"? Why would anyone ever want this?

EDIT: to be clear, what I'm really asking is what does this tech demo extend to--what might we imagine actually using this technology for? Or is that not the point?
