NotebookLM podcasts, but good

Once again Google has made something awesome and completely failed to market it

Apr 21, 2025

I really wanted to like Google's NotebookLM when it came out. Given how much I like podcasts, I should have been the ideal audience for the ‘audio overview’ feature, which generates a podcast interview based on some provided text. Unfortunately, in my initial experiments, I found the results frustratingly superficial. The audio quality was wildly impressive, but the content lacked the analytical depth of the podcasts I like [1]. It felt more like a talk show recap than a serious research briefing.

Muttering ‘skill issue’ to myself and trying to prompt for longer outputs unfortunately didn’t bear fruit. Neither did asking the Twitter hivemind for ideas.

But with Gemini 2.5 looking like a noticeable step up in usefulness, it felt like the right time to try again. Could a carefully engineered prompt get me the desired technical depth? I chose a paper I’d been meaning to read anyway as the test case. I also asked Gemini to put together a prompt for me which would help…

… and discovered that there was a 500-character limit.

After some editing, here's the prompt I landed on:

Generate a deep technical briefing, not a light podcast overview. Focus on technical accuracy, comprehensive analysis, and extended duration, tailored for an expert listener. The listener has a technical background comparable to a research scientist on an AGI safety team at a leading AI lab. Use precise terminology found in the source materials. Aim for significant length and depth. Aspire to the comprehensiveness and duration of podcasts like 80,000 Hours, running for 2 hours or more.
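If you are trimming your own draft to fit, a quick length check saves some back-and-forth. Here is a minimal Python sketch; the 500 figure is the limit mentioned above, while the helper name and the assumption that NotebookLM counts characters the same way Python's `len` does are mine and not confirmed.

```python
# Limit reported in this post. How NotebookLM actually counts characters
# (e.g. whitespace, emoji) is an assumption here.
NOTEBOOKLM_PROMPT_LIMIT = 500


def chars_over_limit(prompt: str, limit: int = NOTEBOOKLM_PROMPT_LIMIT) -> int:
    """Return how many characters must be cut (0 if the prompt already fits).

    Leading and trailing whitespace is ignored, since it is easy to trim.
    """
    return max(0, len(prompt.strip()) - limit)


if __name__ == "__main__":
    # Hypothetical draft: the first sentence of the prompt above.
    draft = "Generate a deep technical briefing, not a light podcast overview."
    print(f"{len(draft.strip())} characters, {chars_over_limit(draft)} over the limit")
```

Paste a full draft into `draft` and cut until the second number reaches zero.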

Feeding this prompt into NotebookLM along with Redwood's Ctrl-Z paper yielded a 30-minute, relatively technical interview. While this wasn’t the full 2-hour deep dive I was shooting for in the prompt, it was good enough that I’m planning to actually use it for papers I wouldn’t otherwise have had time to read (probably including some Deep Research outputs).

To check how much the prompt had actually helped, I compared against two baselines: the default approach (no custom prompt, ~12 min), and a simple two-sentence prompt asking for technical detail in a 'deep dive, rather than chat show' format (~15 min). Neither baseline had the technical depth I wanted.

You can compare the results yourself:

No prompt (audio, 12:43)
Simple prompt (audio, 15:48)
Full prompt (audio, 30:29)

As usual, I’d be really excited to hear your feedback and see further improvements on this prompting approach. A/B testing different ideas takes time, so please share any refinements you discover!

[1] 80k, AXRP, Dwarkesh etc.

Khe Hy

Apr 23

This is awesome. I've been trying different NotebookLM prompts and then comparing them. I really want longer episodes, but can't get them past 20 minutes. Had not thought of using an LLM to prompt NotebookLM.

Valerie Ehrlich, PhD

Apr 21

Looking forward to trying this. I love the audio overview feature but have struggled with optimizing it, especially for different audiences.

Speculative Decoding
