Last updated April 21
I'm excited about the potential of Deep Research (DR) models. As with many things, it seems like the quality of the output you get depends a lot on how well you construct your prompt. I’ve seen lots of people come away very unimpressed with the output, often for reasons like “the models are too credulous of low-quality sources”.
But that seems like the sort of thing it should be pretty easy to fix. If I had an intern or a research assistant whose reports were over-indexing on whatever clickbait they’d found first, I’d give them a quick rundown of how to determine source quality and then tell them to try again.
Except, actually, this is exactly the sort of information I would expect a language model to know. Guides for evaluating sources are all over the internet. I’d tell my intern/RA to get some guidance from Claude or Gemini 2.5 on this, rather than explain it myself. The guidance might end up looking like this.
After a fair bit of experimentation, I've built a Claude project that handles all the repeatable high-effort prompt engineering (like source selection) for you. It asks for your preferences, clarifies what you need, and produces a well-structured prompt that you can then feed into any DR model.
I think the two best DR models are made by OpenAI (as part of ChatGPT Enterprise or Pro) and Google DeepMind (as part of Gemini). Since Gemini 2.5 was released, I’ve preferred Gemini, though it’s pretty close, and I’d suggest experimenting with both.
How to use the project
Start a chat¹ in the Claude project
Tell it what you want to research
Answer the clarifying questions it asks
Copy the resulting prompt artefact into your preferred Deep Research model
Press go
What does the project do?
I think the prompts produced by the project improve on the status quo by:
Asking for the right context
Structuring the request in a way DR models respond well to
Improving the quality of the sources that get used
Formatting the output to match your requirements
Mostly, though, it should save you time. The briefs are often a page or more, and I think that’s appropriate given the size of the report that will be produced. But who’s going to write a page-long prompt by hand whenever they want to use a model?
Setup Guide
If you'd like to try this yourself, I've made the source files publicly available here. Here's a detailed setup guide:
Step-by-Step Setup Instructions:
Create a new Claude project
Your project should have the following components in the Project Knowledge section:
STYLE_GUIDANCE
SOURCE_HIERARCHY
OUTPUT_FORMATS
CONTEXT_BRIEF
For each component:
Go to the corresponding tab in the Google Doc
Select all the text (Ctrl+A)
Use "Copy as markdown"
In your Claude project, click "+" in the Project Knowledge section
Select "Add text content"
Paste the markdown content
Name it according to the tab name (e.g., "STYLE_GUIDANCE")
For the Core Instructions:
Go to the "Project Instructions" tab in the Google Doc
Copy the text as markdown (as before)
In your Claude project, click "Edit" next to the instructions field
Paste the markdown content
When you're done, your project should look something like this:
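If you'd rather set this up programmatically than through the web UI, a rough equivalent with the Anthropic Python SDK might look like the sketch below. Everything here is my own scaffolding rather than part of the project: the local file names, the model string, and the token budget are placeholders. The idea is the same, though: the Core Instructions become the system prompt and the four knowledge files ride along as reference documents.

```python
# A minimal sketch, assuming you've saved the Google Doc tabs locally
# under the (hypothetical) file names below.
import anthropic
from pathlib import Path

# The four Project Knowledge components, saved as local markdown files.
KNOWLEDGE_FILES = [
    "STYLE_GUIDANCE.md",
    "SOURCE_HIERARCHY.md",
    "OUTPUT_FORMATS.md",
    "CONTEXT_BRIEF.md",
]

# The "Project Instructions" tab plays the role of the system prompt.
core_instructions = Path("PROJECT_INSTRUCTIONS.md").read_text()

# Wrap each knowledge file so the model can tell the documents apart.
knowledge = "\n\n".join(
    f'<document name="{name}">\n{Path(name).read_text()}\n</document>'
    for name in KNOWLEDGE_FILES
)

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=4000,
    system=f"{core_instructions}\n\n{knowledge}",
    messages=[
        {
            "role": "user",
            "content": "I want to research <your topic>. Ask me your clarifying questions.",
        }
    ],
)

# Prints Claude's clarifying questions; continue the conversation from here.
print(response.content[0].text)
```

From there, the flow matches the “How to use the project” steps above: answer the clarifying questions in follow-up messages, then paste the finished brief into your DR model of choice.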
When to use this?
I don’t think DR models should replace Google for most of the things you use Google for (for that, use Claude 3.7 with search and extended thinking enabled, or GPT-4.5 with search enabled). But if you had a free, moderately competent, unbelievably diligent research assistant who also turned things around in 10ish minutes rather than days, what would you ask them to do?
Next
I'm continuing to tweak this, and I'd love to hear how it works for you. If you try it out, let me know what kind of results you get and how it could be improved. I’m also keen for more anecdata about how ChatGPT’s version compares to Gemini’s.
¹ I’d recommend using 3.7 Sonnet with thinking enabled. You can also set the project up as a ‘Gem’ in the Gemini app and use Gemini 2.5 instead. In my experience, and also that of one commenter, thinking models do a better job of following the instructions and combining the different advice.
This is excellent. Thanks for putting this together.
Thanks! I just set up the project and tried an example, and the prompt Claude generated for me referred to files in its knowledge base like 'STYLE_GUIDANCE', which seems suboptimal since the deep research model (OAI or GDM) won't have access to that. Have you found it does that? (I can privately share the example if that is useful.)