Genie – Voice-Powered Akinator Clone

I Built a Voice-Powered Akinator Clone with a Talking Avatar

Remember Akinator, the creepy-clever web genie that always seemed to read your mind? You’d think of a character, answer a handful of yes-or-no questions, and somehow the thing would guess SpongeBob, your obscure anime protagonist, or Napoleon Bonaparte with eerie accuracy. I loved it as a kid. So I rebuilt it — but this time, the genie talks back.

Meet Genie: a voice-driven guessing game powered by LiveKit Agents, animated by a lemonslice avatar, and backed by hosted speech-to-text, language-model, and text-to-speech APIs. Instead of clicking buttons on a web page, you have an actual conversation with an animated, theatrical genie who asks you questions in real time. It’s part demo, part experiment, and part pure fun.

Why Voice?

Text-based AI interfaces are everywhere. But there’s something fundamentally more engaging about talking to something — especially for a game. Akinator’s whole appeal is the illusion that something mystical is reading your mind. Voice amplifies that illusion dramatically. When a genie character asks you “Is this person real?” in an expressive, spoken voice, the experience shifts from a web form into something that feels almost magical.

That’s the core thesis behind this project: voice + avatar = a totally different feel, even with the same underlying logic.

The Tech Stack

Under the hood, Genie is stitched together from several real-time AI services, all orchestrated by LiveKit Agents:

  • Deepgram handles speech-to-text, transcribing everything you say in real time
  • OpenAI’s gpt-4o-mini drives the Genie’s reasoning — deciding which questions to ask and, eventually, making its guess
  • Cartesia Sonic converts the Genie’s responses back into expressive speech
  • lemonslice brings the whole thing to life with a lip-synced animated avatar that reacts as the Genie speaks
  • Silero VAD and LiveKit’s multilingual turn detection work out when you have finished speaking, so the conversation flows naturally without awkward interruptions

LiveKit ties all of these together over WebRTC, keeping the whole pipeline low-latency and real-time.
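
To make the hand-offs concrete, here is a minimal sketch of one conversational turn passing through the stages listed above. It uses only the standard library. The stage functions are stand-ins invented for illustration, not the Deepgram, OpenAI, or Cartesia clients, and the real pipeline streams audio instead of running each step to completion.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Turn:
    """Everything produced while handling one thing the player said."""
    transcript: str
    reply_text: str
    reply_audio: bytes


def run_turn(
    audio_in: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> Turn:
    """Run speech-to-text, then the language model, then text-to-speech."""
    transcript = stt(audio_in)
    reply_text = llm(transcript)
    reply_audio = tts(reply_text)  # in the real app, the avatar lip-syncs to this audio
    return Turn(transcript, reply_text, reply_audio)


# Hypothetical stand-ins so the sketch runs without any API keys.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    if "ready" in text.lower():
        return "Is your character real?"
    return "Tell me when you are ready."


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


turn = run_turn(b"I'm ready", fake_stt, fake_llm, fake_tts)
print(turn.reply_text)
```

Keeping each stage behind a plain callable is only a convenience for the sketch; in the actual project LiveKit Agents handles this orchestration.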

How the Game Works

The flow is simple and familiar:

  1. You join a LiveKit room where the Genie agent is running
  2. The Genie greets you and asks you to think of any real or fictional person or character
  3. Once you say you’re ready, the questioning begins — short, targeted yes-or-no questions
  4. Within 20 questions, the Genie attempts its guess
  5. Win or lose, it offers to play again

All of the Genie’s personality and game logic live in a single system prompt inside src/agent.py, which makes it easy to tweak the character, adjust the strategy, or change the guessing rules entirely.
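
For a sense of what such a prompt might contain, here is a hypothetical sketch that turns the five steps above into instructions. The wording, the `build_instructions` helper, and the `MAX_QUESTIONS` constant are my own illustration, not the actual text in src/agent.py.

```python
MAX_QUESTIONS = 20  # the question limit described in the game flow


def build_instructions(max_questions: int = MAX_QUESTIONS) -> str:
    """Return a hypothetical system prompt for the Genie (not the project's real prompt)."""
    return "\n".join([
        "You are the Genie, a theatrical mind reader in a guessing game.",
        "Greet the player and ask them to think of a real or fictional person or character.",
        "Wait until the player says they are ready before asking anything.",
        "Ask one short yes-or-no question at a time.",
        f"Make your guess within {max_questions} questions.",
        "Whether you win or lose, offer to play again.",
    ])


instructions = build_instructions()
print(instructions)
```

Pulling the question limit into a parameter is one way to change the guessing rules without rewriting the rest of the prompt.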

What I Learned

A few things stood out during the build:

Prompt design matters a lot for game logic. Getting the Genie to ask smart, information-theoretic questions, meaning ones that split the remaining candidates roughly in half rather than random ones, required careful prompt engineering. The model needs to understand the goal, which is a binary search over the space of possible characters, not just “ask questions.”
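
The binary-search framing has simple arithmetic behind it: 20 perfectly balanced yes-or-no answers can distinguish at most 2^20 = 1,048,576 candidates, and every lopsided question wastes some of that budget. The toy sketch below scores each question by the entropy of its yes/no split and picks the most balanced one. The candidate table and trait names are made up for illustration; Genie itself relies on the language model, not an explicit table like this.

```python
from math import log2

# Hypothetical toy knowledge base: character -> yes/no traits.
CANDIDATES = {
    "SpongeBob": {"real": False, "human": False, "animated": True},
    "Napoleon Bonaparte": {"real": True, "human": True, "animated": False},
    "Sherlock Holmes": {"real": False, "human": True, "animated": False},
    "Marie Curie": {"real": True, "human": True, "animated": False},
}


def split_entropy(yes: int, total: int) -> float:
    """Expected information, in bits, from a question that `yes` of `total` candidates answer yes to."""
    if total == 0 or yes in (0, total):
        return 0.0  # the answer is already known, so the question teaches nothing
    p = yes / total
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def best_question(candidates: dict[str, dict[str, bool]]) -> str:
    """Return the trait whose yes/no split over the candidates is closest to 50/50."""
    traits = sorted({t for profile in candidates.values() for t in profile})

    def score(trait: str) -> float:
        yes = sum(1 for profile in candidates.values() if profile.get(trait))
        return split_entropy(yes, len(candidates))

    return max(traits, key=score)


print(best_question(CANDIDATES))
```

With this table, "real" splits the four candidates two and two (1 bit), while "human" and "animated" split them three to one (about 0.81 bits), so "real" is the question to ask first.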

Avatar latency is real. Keeping lemonslice’s lip sync tight with Cartesia’s audio output required thinking carefully about buffering and streaming. LiveKit’s architecture made this tractable, but it’s non-trivial.

Voice turn detection is hard. Natural conversation has pauses, filler words, and overlapping speech. Silero VAD combined with LiveKit’s multilingual turn detection handles most cases gracefully, but edge cases still crop up.
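
To show why pauses are the tricky part, here is a deliberately simplified sketch of silence-based endpointing. It is not Silero's or LiveKit's algorithm. It assumes a voice activity detector has already produced one speech probability per audio frame, and it declares the turn over only after several consecutive quiet frames. The function name, threshold, and frame counts are illustrative.

```python
from typing import Optional, Sequence


def end_of_turn_frame(
    speech_probs: Sequence[float],
    threshold: float = 0.5,
    min_silence_frames: int = 3,
) -> Optional[int]:
    """Return the index of the frame where the turn ends, or None if it has not ended yet."""
    heard_speech = False
    silent_run = 0
    for i, prob in enumerate(speech_probs):
        if prob >= threshold:
            heard_speech = True
            silent_run = 0  # any speech resets the silence counter
        elif heard_speech:
            silent_run += 1
            if silent_run >= min_silence_frames:
                return i
    return None


# Leading silence, speech, a one-frame pause, more speech, then sustained silence.
probs = [0.1, 0.9, 0.8, 0.2, 0.9, 0.1, 0.1, 0.1]
print(end_of_turn_frame(probs))
```

The one-frame pause at index 3 does not end the turn, but the three quiet frames at the end do. A fixed silence window like this is exactly what struggles with filler words and mid-sentence hesitation, which is where a model-based turn detector helps.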

Try It Yourself

The project is fully open source. Setup takes a few minutes — you’ll need a LiveKit deployment (their free cloud tier works great), plus API keys for Deepgram, OpenAI, Cartesia, and lemonslice. Everything runs from a single Python entrypoint:

uv run python src/agent.py dev
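
Before running that command, it can help to confirm that every credential is set. The sketch below is a hypothetical pre-flight check. The environment variable names are assumptions based on common naming conventions, so confirm the exact names against the project's README.

```python
import os
from typing import Mapping, Optional

# Assumed variable names; the project's README is the source of truth.
REQUIRED_VARS = [
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "CARTESIA_API_KEY",
    "LEMONSLICE_API_KEY",
]


def missing_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variables that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```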

Check out the full README and source on GitHub, and watch the demo on YouTube to see Genie in action.


Voice AI is moving fast. Tools like LiveKit Agents make it genuinely easy to wire together best-in-class models into a coherent, real-time experience in just a few hundred lines of Python. Genie is a small demo, but it points at something bigger: the future of conversational AI interfaces probably looks a lot less like a chat box, and a lot more like this.

— Pete Haas
