
NotebookLM is quite powerful and worth playing with

75 points, posted a year ago

24 Comments

OutOfHere

a year ago

The two voices are really good, but there are some major problems with the generated audio:

1. The speakers keep cutting each other off mid-sentence, completing each other's sentences. This is cute if it happens once in a while, but it's happening so often as to be extremely annoying.

2. It made a 20 minute file, the last 5 minutes of which were essentially a waste, a repetition.

3. There is an excessive amount of filler material. The speakers are not real people, and they don't need to act as such. The listener's time should be valued better.

wenc

a year ago

You just described the average human podcast (that's not on a Top 100 list). :)

The output reminds me of any number of new marketing podcasts that are still trying to find their footing.

Podcasts are generally terrible. They will never get close to text at delivering content or enabling critical analysis of it, because unlike text you cannot revisit sections with ease. You cannot ingest information at the same rate as when reading, either. While it is a fun gimmick, it feels like a step back in terms of learning and the rate at which you can ingest information. Imagine how useless this site would be if every comment section were just recordings instead of text comments.

OutOfHere

a year ago

While that's quite true, you probably haven't listened to the output of my software podgenai yet. It is not a substitute for reading material, but I'd say it's quite instructive in its own way.

AtlasBarfed

a year ago

Podcast: a one-minute Wikipedia page read, regurgitated as a 20-minute pseudo-suspenseful waste of time.

throw310822

a year ago

> Imo LLM capability (IQ, but also memory (context length), multimodal, etc.) is getting way ahead of the UIUX of packaging it into products.

Which means: LLMs are evolving faster than we are inventing ways of incorporating them into products. Thus all real-world impacts of LLMs lag behind their current state and capabilities by a margin that is possibly even getting larger with time.

FistfulOfHaws

a year ago

Nat Friedman estimated that if we stopped developing new models today, we’d have 5-10 years' worth of innovation to tap into before we’d need more models.

sandspar

a year ago

A number he pulled out of his ass.

3abiton

a year ago

In other words, lots of unexplored potential.

belter

a year ago

Also for long, winding, subtly woven security attacks.

lupire

a year ago

Tweet content:

NotebookLM is quite powerful and worth playing with https://notebooklm.google

It is a bit of a re-imagination of the UIUX of working with LLMs organized around a collection of sources you upload and then refer to with queries, seeing results alongside and with citations.

But the current most new/impressive feature (that is surprisingly hidden almost as an afterthought) is the ability to generate a 2-person podcast episode based on any content you upload. For example someone took my "bitcoin from scratch" post from a long time ago: https://karpathy.github.io/2021/06/21/blockchain/ and converted it to podcast, quite impressive: https://notebooklm.google.com/notebook/ba017fec-7068-4085-97...

You can podcastify anything. I gave it train_gpt2.c (C code that trains GPT-2): https://github.com/karpathy/llm.c/blob/master/train_gpt2.c and made a podcast about that: https://notebooklm.google.com/notebook/2585c187-b059-475a-b4... I don't know if I'd exactly agree with the framing of the conversation and the emphasis or the descriptions of layernorm and matmul etc., but there's hints of greatness here and in any case it's highly entertaining.

Imo LLM capability (IQ, but also memory (context length), multimodal, etc.) is getting way ahead of the UIUX of packaging it into products. Think Code Interpreter, Claude Artifacts, Cursor/Replit, NotebookLM, etc. I expect (and look forward to) a lot more and different paradigms of interaction than just chat.

That's what I think is ultimately so compelling about the 2-person podcast format as a UIUX exploration. It lifts two major "barriers to enjoyment" of LLMs. (1) Chat is hard. You don't know what to say or ask. In the 2-person podcast format, the question asking is also delegated to an AI, so you get a much more chill experience instead of being a synchronous constraint in the generating process. (2) Reading is hard and it's much easier to just lean back and listen.

OutOfHere

a year ago

notebooklm.google.com looks to be an evolution of illuminate.google.com.

Strictly speaking, two-voice conversations are slightly overrated. A single voice, akin to a lecture rather than a discussion, is no worse, and is quite sufficient for conveying information. Either approach does the job.

The downloaded audio file extension is wav, implying that it's not compressed, but this is a minor quirk.
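For anyone who wants to check rather than infer from the extension: a .wav file is a RIFF container, and its `fmt ` chunk carries a format tag (1 is plain PCM, i.e. uncompressed; other tags, such as 0x55 for MPEG Layer 3, indicate a compressed codec). Below is a minimal Python sketch that reads that tag. The helper name `wav_format_tag` and the in-memory sample are illustrative assumptions; it has not been run against an actual NotebookLM download.

```python
import io
import struct
import wave


def wav_format_tag(data: bytes) -> int:
    """Return the format tag stored in the 'fmt ' chunk of RIFF/WAVE bytes."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # first chunk starts right after the 12-byte RIFF header
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            (tag,) = struct.unpack_from("<H", data, pos + 8)
            return tag
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    raise ValueError("no 'fmt ' chunk found")


# Build a tiny PCM WAV in memory as a stand-in for a downloaded file
# (hypothetical sample: 8 silent 16-bit mono frames at 8 kHz).
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(b"\x00\x00" * 8)
sample = buf.getvalue()

tag = wav_format_tag(sample)
print("PCM (uncompressed)" if tag == 1 else f"format tag {tag:#06x}")
```

To inspect a real download, read the file in binary mode and pass its bytes to `wav_format_tag`.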

simonw

a year ago

In classic Google news, NotebookLM and Illuminate are two unrelated initiatives by entirely separate teams: https://twitter.com/raiza_abubakar/status/184056133374773269...

lupire

a year ago

"two person podcast" format is very similar to the extremely popular "FAQ" format.

OutOfHere

a year ago

Between a lecture and an FAQ, I don't think there is any clear winner.

salad-tycoon

a year ago

It goes beyond the concept of a notebook summary, as it will add in lots of filler material. Humorous examples abound: type a single ridiculous sentence into it and the podcast still seems obliged to fill a certain amount of time. I’d rather it just do my notes and not veer off.

8f2ab37a-ed6c

a year ago

I mentioned this in another post, but I really hope they keep investing in NotebookLM and expand its ability to source more types of files, including codebases, complex websites, etc. It feels really powerful for anybody studying or consulting many different clusters of learning materials at once.

Being able to talk to the books you're trying to learn, have them test you for knowledge, being able to combine many resources all at once, just so powerful.

andrewinardeer

a year ago

I fed it a CSV of numbers relating to Australian Football player statistics and it spat out an analysis.

sandspar

a year ago

The podcast presenters have the typical Google AI affliction of coming across as anxious. I added my sleep routine and the 8 minute podcast was roughly 3 minutes of "important note" type content, where the AI lectures you about risk as if you are a child. Gemini is a nanny bot and it's interesting to see that NotebookLM is as well. The form factor is cool though.

vitorgrs

a year ago

I just tried the podcast generation, and was surprised! Sadly it's English only, but I gave it a Portuguese link, and it did well...

wenc

a year ago

I gave it some of my Google Docs and it generated an amazing back-and-forth audio podcast conversation that was quite believable!

The only issue was that it got some things wrong (like my gender, but my name is gender-neutral, so that's understandable). Factually nothing else was wrong, but it harped too much on one little quote that was really not very consequential. If only there were a way to tweak the podcast script...

But otherwise, it's really very good. The vocal inflections were all very natural. The two presenters' banter was somewhat trite, but that's par for the course even for the average human podcast. If only they could do a "Conversations with Tyler" style podcast with my Google Docs content....

wenc

a year ago

This is the RAG (Retrieval-Augmented Generation) tool for Google Docs that I've been waiting for.

It's like Perplexity for your personal docs.
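As a rough illustration of what "retrieval" means here: score each uploaded source against the question, keep the best matches, and hand them to the model together with their names so the answer can cite them. The toy sketch below uses plain word-count cosine similarity. How NotebookLM actually retrieves and cites is not described in this thread, and the source names, texts, and function names are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, sources, k=2):
    """Return up to k (source_name, score) pairs with a non-zero score."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), name)
              for name, text in sources.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored[:k] if score > 0]


def build_prompt(query, sources, hits):
    """Assemble a prompt that asks the model to cite sources by name."""
    cited = "\n".join(f"[{name}] {sources[name]}" for name, _ in hits)
    return ("Answer using only these sources and cite them by name:\n"
            f"{cited}\n\nQuestion: {query}")


# Hypothetical uploaded sources (illustrative text, not real file contents).
sources = {
    "sleep_notes.txt": "Go to bed at ten and avoid caffeine after noon.",
    "bitcoin_post.md": "A transaction is signed with a private key and added to a block.",
    "train_gpt2.c": "The forward pass applies layernorm and matmul for each layer.",
}

query = "how does layernorm work in the forward pass"
hits = retrieve(query, sources)
prompt = build_prompt(query, sources, hits)
print(prompt)
```

A real system would swap the word counts for learned embeddings and send `prompt` to a language model; the retrieve-then-cite shape stays the same.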

OutOfHere

a year ago

Are you saying this has Google Drive integration? Afaik, it doesn't care where the input file comes from.

For the record, for the past year, you could have uploaded your personal doc to ChatGPT and had it do things with it.

wenc

a year ago

Yes, it’s integrated with Google Drive and allows external uploads too.

This is more seamless than uploading files.