🔎 The next $10B AI company won't win with scale. It'll win with weird data.

Everyone's chasing the next foundation model.
But defensibility in AI today isn’t about who has 100B parameters.
It’s about who owns the messiest, most overlooked, most painful-to-collect datasets.

Think:

Real-world driving miles (Tesla)

3D surgical imaging data (Medivis)

Small freight operator logistics (Channel19)

Behavioral clickstreams (Klaviyo)

These aren't "off the shelf".
They're earned through workflow integration, patient data collection, and relentless operational grind.

Friction is the moat.
The more annoying the data is to get, the more valuable it becomes.

The smartest investors evaluating AI startups today may not be asking, "What’s your model?" They're more likely asking:

What proprietary signal improves with every user interaction?

What painful-to-replicate feedback loop are you capturing?

How hard would it be for a competitor to recreate your dataset?

Because the next $10B outcome?
It probably won’t come from another chatbot.

It’ll come from a startup logging signals no one else even notices, until suddenly it’s the only one with the data that matters.

📥 I wrote more about this in today’s newsletter, linked below.

Would love to hear: **what's the weirdest dataset you wish you could own?**👇


This post was originally shared on LinkedIn.