Sesame AI’s Conversational Speech Model (link in first comment) is trained on over 1 million hours of audio. Its AI voices don’t just sound real. They feel real. Emotional depth, context awareness, and nuance make it a step beyond anything else out there.
This is more than AI-generated speech. It is the first step toward presence. The voices have emotional intelligence, natural cadence, and full customization of pitch, speed, and tone. As Tobi Lütke, founder & CEO of Shopify, put it, "Sesame’s voice model is absolutely insane."
This is a major shift for businesses.
⚡ Game developers can create NPCs instantly
🎧 Audiobooks become 10 times cheaper
📞 Call centers deliver better customer experiences
💡 Brainstorming sessions become interactive
Sean Hollister (senior editor and founding member of The Verge) said, "[It’s the] first voice assistant I’ve wanted to talk to more than once."
Sesame’s model comes in different sizes: Tiny (1B), Small (3B), and Medium (8B). I had a 15-minute conversation with it and never hit an uncanny moment. The team hasn’t published benchmarks yet, but it already feels next level.
Sesame is a shift from speech to presence. Pair it with AI glasses, and you have a companion that speaks, sees, and understands. Pair it with a game, and you have a more immersive, lifelike experience. The list goes on…
This is a preview of where AI is going. Voice-first AI could transform daily life sooner than we think, or it could prove too uncanny to gain traction.
Have you had a chance to try it out? Where do you see this going?
This post was originally shared on LinkedIn.