Ever feel like the stuff you get back from Claude or ChatGPT is kind of... generic?

You ask for research. You get a confident-sounding wall of text. The numbers feel right. The framing is fine. But you cannot quite tell where any of it came from, and you would not bet a client meeting on it.

I had that feeling one too many times this month, so I ran a small experiment.

Same research brief, different tools.

The question: what is it actually like to work at SpaceX, xAI, and Tesla? I wanted real numbers from Glassdoor, Indeed, and Blind. Ratings, work-life balance, culture, quotes from actual employees.

(Clearly, I'm thinking about the IPO on Friday.)

I expected the tools to land in roughly the same place. They did not.

The general web search version (in Claude) came back with the most data. Specific numbers, multiple sources, long quotes. Reads like a research analyst wrote it. Felt thorough. But when I looked closely, almost nothing had a citation I could click. Just a wall of numbers with no way to verify any of them.

The second tool wrote the prettiest version. Crisp through-line ("best work, worst balance"), tight section headers, a great line about SpaceX engineers feeling their work "makes the world better." Sounded like a journalist. But the hard numbers were softer. Rounded. Fewer data points. Still no citations.

The third version was the ugliest to look at. No prose. Just a table. SpaceX: 3.8 on Glassdoor across 2,731 reviews, 68% recommend, work-life balance 2.5. Tesla: 3.5 across 12,000 reviews. xAI: also a 3.8, but only 40 reviews (which made the whole number basically meaningless). Every single row had a source URL and the date I accessed it.

That third one is the one I would actually hand to a client.

But the weaker results were not the tools' fault. They were mine, for not knowing what I wanted or how to direct the tools to get it.

Most of us are using one AI research tool for two completely different jobs, and we do not realize it.

The first job is explore. You do not know who has the answer yet. You need to scan. You need breadth. General web search is great at this. It reads snippets from dozens of pages and gives you a rough map.

The second job is extract. You already know exactly which pages have the answer. Now you need clean, structured, verifiable data pulled off those specific pages. General web search is not built for this. It was never going to be.

When you ask a search tool to do extraction, you get what I got on run one. A confident-sounding pile of numbers you cannot defend in a meeting. (Sound familiar?)

The fix is not a better tool. It is knowing which job you are doing.

For exploration, ask broad questions and let the tool roam.

For extraction, give it three things: the exact pages to look at, the exact fields you want (rating, review count, work-life balance, etc.), and one rule that changes everything. "Only report what is actually shown on the page. Mark anything missing as n/a. Give me the source URL for every row."

That last instruction did more for the trustworthiness of my output than any model upgrade I have ever paid for.
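If you want to reuse that brief instead of retyping it, here is a minimal sketch in Python of how the three parts could be assembled into one prompt. The function name `build_extraction_prompt` and the example.com URLs are placeholders I made up for illustration, not the pages from my experiment, and the layout of the prompt is just one reasonable choice.

```python
def build_extraction_prompt(pages, fields):
    """Assemble an extraction brief: exact pages, exact fields, one grounding rule."""
    # Part 1: the exact pages to look at
    page_lines = "\n".join(f"- {url}" for url in pages)
    # Part 2: the exact fields you want back
    field_lines = "\n".join(f"- {name}" for name in fields)
    # Part 3: the rule that keeps the output verifiable
    rule = (
        "Only report what is actually shown on the page. "
        "Mark anything missing as n/a. "
        "Give me the source URL for every row."
    )
    return (
        "Pages to read:\n" + page_lines + "\n\n"
        "Fields to extract:\n" + field_lines + "\n\n"
        "Rule: " + rule
    )


# Hypothetical placeholder URLs; swap in the real review pages you care about.
pages = [
    "https://example.com/company-a/reviews",
    "https://example.com/company-b/reviews",
]
fields = ["rating", "review count", "work-life balance"]

prompt = build_extraction_prompt(pages, fields)
print(prompt)
```

Paste the printed text into whichever tool you use. The point is only that pages, fields, and the rule always travel together.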

One more thing the structured run caught that the pretty one missed. xAI's 3.8 rating was based on 40 reviews. SpaceX's was based on 2,731. Same number. Wildly different meaning. The narrative version glossed right over it. The data version made it impossible to miss.
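Once the data is in rows, that check is easy to automate. Below is a rough sketch in Python using the three ratings and review counts from my table. The 100-review cutoff and the name `flag_thin_samples` are my own assumptions for illustration, so pick whatever threshold fits your use case.

```python
# Hypothetical cutoff: below this many reviews, treat the rating as weak evidence.
MIN_REVIEWS = 100

# Ratings and review counts as reported in the structured run.
rows = [
    {"company": "SpaceX", "rating": 3.8, "reviews": 2731},
    {"company": "Tesla", "rating": 3.5, "reviews": 12000},
    {"company": "xAI", "rating": 3.8, "reviews": 40},
]


def flag_thin_samples(rows, min_reviews=MIN_REVIEWS):
    """Return the companies whose rating rests on fewer than min_reviews reviews."""
    return [row["company"] for row in rows if row["reviews"] < min_reviews]


thin = flag_thin_samples(rows)
print(thin)  # ['xAI']
```

SpaceX and xAI share the same 3.8, but only xAI gets flagged, which is exactly the distinction the narrative version lost.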

The simplest cheat code for making the choice:

Use web search when you just need a scan of high-level snippets (think the back cover of a book, or the summary of a property listing on Zillow).
Use extraction when you want the details (think chapter-by-chapter summaries, or the full property description on Zillow).

Alex

Alex Talks AI
