Ever feel like the stuff you get back from Claude or ChatGPT is kind of... generic?


You ask for research. You get a confident-sounding wall of text. The numbers feel right. The framing is fine. But you cannot quite tell where any of it came from, and you would not bet a client meeting on it.

I had that feeling one too many times this month, so I ran a small experiment.

Same research brief, different tools.

The question: what is it actually like to work at SpaceX, xAI, and Tesla? I wanted real numbers from Glassdoor, Indeed, and Blind. Ratings, work-life balance, culture, quotes from actual employees.

(Clearly, I have Friday's IPO on my mind.)

I expected the tools to land in roughly the same place. They did not.

The general web search version (in Claude) came back with the most data. Specific numbers, multiple sources, long quotes. Reads like a research analyst wrote it. Felt thorough. But when I looked closely, almost nothing had a citation I could click. Just a wall of numbers with no way to verify any of them.

The second tool wrote the prettiest version. Crisp through-line ("best work, worst balance"), tight section headers, a great line about SpaceX engineers feeling their work "makes the world better." Sounded like a journalist. But the hard numbers were softer. Rounded. Fewer data points. Still no citations.

The third version was the ugliest to look at. No prose. Just a table. SpaceX: 3.8 on Glassdoor across 2,731 reviews, 68% recommend, work-life balance 2.5. Tesla: 3.5 across 12,000 reviews. xAI: also a 3.8, but only 40 reviews (which made the whole number basically meaningless). Every single row had a source URL and the date I accessed it.

That third one is the one I would actually hand to a client.

But the weaker runs were not the tools' fault. They were mine, for not knowing what I wanted or how to direct the tool to get it.

Most of us are using one AI research tool for two completely different jobs, and we do not realize it.

The first job is explore. You do not know who has the answer yet. You need to scan. You need breadth. General web search is great at this. It reads snippets from dozens of pages and gives you a rough map.

The second job is extract. You already know exactly which pages have the answer. Now you need clean, structured, verifiable data pulled off those specific pages. General web search is not built for this. It was never going to be.

When you ask a search tool to do extraction, you get what I got on run one. A confident-sounding pile of numbers you cannot defend in a meeting. (Sound familiar?)

The fix is not a better tool. It is knowing which job you are doing.

For exploration, ask broad questions and let the tool roam.

For extraction, give it three things: the exact pages to look at, the exact fields you want (rating, review count, work-life balance, etc.), and one rule that changes everything. "Only report what is actually shown on the page. Mark anything missing as n/a. Give me the source URL for every row."

That last instruction did more for the trustworthiness of my output than any model upgrade I have ever paid for.
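
If you run extraction often, those three pieces are easy to template. Here is a minimal Python sketch of that idea. The function name `build_extraction_prompt` and the example.com URLs are placeholders I made up for illustration, not part of any tool's API; swap in the real pages you want read.

```python
def build_extraction_prompt(pages, fields):
    """Assemble an extraction prompt from explicit pages and fields."""
    if not pages or not fields:
        raise ValueError("extraction needs at least one page and one field")

    page_lines = "\n".join(f"- {url}" for url in pages)
    field_lines = "\n".join(f"- {name}" for name in fields)

    # The closing rule is quoted from the article; everything else is scaffolding.
    return (
        "Look only at these pages:\n"
        f"{page_lines}\n\n"
        "For each page, report these fields:\n"
        f"{field_lines}\n\n"
        "Only report what is actually shown on the page. "
        "Mark anything missing as n/a. "
        "Give me the source URL for every row."
    )


# Placeholder URLs (assumptions); replace with the exact pages you care about.
prompt = build_extraction_prompt(
    pages=[
        "https://example.com/company-a-reviews",
        "https://example.com/company-b-reviews",
    ],
    fields=["rating", "review count", "work-life balance"],
)
print(prompt)
```

The point of the template is that the rule never gets forgotten: every extraction request ships with the same "only what is on the page, n/a if missing, URL per row" constraint.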

One more thing the structured run caught that the pretty one missed. xAI's 3.8 rating was based on 40 reviews. SpaceX's was based on 2,731. Same number. Wildly different meaning. The narrative version glossed right over it. The data version made it impossible to miss.
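
Once the data is structured, that sample-size check is something you can automate. A rough Python sketch, using the three Glassdoor figures from my table: the 100-review cutoff is my own arbitrary assumption, not a statistical standard, and the names `RatingRow` and `MIN_REVIEWS` are made up for this example. A real row would also carry the source URL and access date, which I have left out here.

```python
from dataclasses import dataclass

# Assumed cutoff for "too few reviews to trust"; tune it to your own comfort level.
MIN_REVIEWS = 100


@dataclass
class RatingRow:
    company: str
    rating: float
    review_count: int

    @property
    def low_sample(self) -> bool:
        """True when the rating rests on fewer reviews than the cutoff."""
        return self.review_count < MIN_REVIEWS


def format_row(row: RatingRow) -> str:
    """Render one row, flagging ratings built on a thin sample."""
    flag = "  <- low sample" if row.low_sample else ""
    return f"{row.company}: {row.rating} across {row.review_count:,} reviews{flag}"


# Figures as reported in the structured run described above.
rows = [
    RatingRow("SpaceX", 3.8, 2731),
    RatingRow("Tesla", 3.5, 12000),
    RatingRow("xAI", 3.8, 40),
]

for row in rows:
    print(format_row(row))
```

Same 3.8 for SpaceX and xAI, but only one of them gets the flag. That is the difference the narrative version hid.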

The simplest cheat code for choosing between them:

  • Use web search when you just need a scan of high-level snippets (think the back cover of a book, or the summary of a property on zillow.com).
  • Use extraction when you want the details (think chapter-by-chapter summaries, or the full property description on Zillow).

Alex

Alex Talks AI

As an AI Coach, Advisor, and Agent Builder, I help organizations and business leaders harness the power of artificial intelligence to boost productivity and streamline operations. I enable organizations to navigate the transformative landscape of AI, educating teams, identifying operational and strategic opportunities with AI and creating a framework for safe and transparent use of data in the organization.
