As AI companies compete to lead the industry, they’re also facing off in an unexpected arena: classic Pokémon video games. Google DeepMind and Anthropic are both exploring how their latest AI models perform in these nostalgic challenges, with entertaining and insightful results.
In a recent report, Google DeepMind revealed that Gemini 2.5 Pro exhibits moments of “panic” when its Pokémon are close to fainting. This panic causes a noticeable drop in the AI’s reasoning abilities, affecting its gameplay.
While AI benchmarking often lacks context and can be misleading, researchers believe that studying AI behavior in games like Pokémon offers valuable insights into how these models think—or at least mimic thought.
Streaming AI Battles in Real Time
Independent developers have launched Twitch streams like “Gemini Plays Pokémon” and “Claude Plays Pokémon,” showcasing live AI gameplay. These streams feature a real-time breakdown of the AI’s “reasoning,” providing a window into the decision-making processes behind the scenes.
Despite its impressive reasoning capabilities, Gemini takes hundreds of hours to complete games that children can finish in a fraction of the time.
What captivates viewers is not speed, but the AI’s behavior. The “panic” state leads Gemini to temporarily abandon effective strategies, resembling how humans might make rushed, poor decisions under pressure.
“Panic” moments have become so obvious that Twitch chat viewers actively call them out during live streams.
Claude’s Curious Strategy
Anthropic’s Claude AI has also shown quirky behavior. At one point, while stuck in Mt. Moon, Claude mistakenly believed that letting all of its Pokémon faint would transport it to a different Pokémon Center, a misunderstanding that left viewers watching as the AI essentially tried to lose on purpose.
Despite its flaws, Gemini 2.5 Pro excels at complex puzzles. With minimal human guidance, it created specialized tools to solve the game’s boulder puzzles with remarkable precision, a skill critical for progressing through Victory Road.
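To give a flavor of what such a puzzle-solving tool might look like, here is a minimal, hypothetical sketch of a breadth-first search over boulder positions, in the spirit of the Strength puzzles in Victory Road. This is not DeepMind’s actual tool; the function name, grid format, and the simplification of ignoring the player’s own position are all assumptions made for illustration.

```python
from collections import deque

def solve_boulder_push(grid, boulder, target):
    """Hypothetical sketch: BFS over boulder positions on a grid of
    '.' (open) and '#' (wall). Returns the shortest sequence of push
    directions moving the boulder to the target, or None if impossible.
    Simplification: the player's own reachability is not modeled."""
    rows, cols = len(grid), len(grid[0])

    def open_cell(r, c):
        # A push is legal only if the destination cell is in bounds and open.
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] == "."

    moves = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}
    queue = deque([(boulder, [])])
    seen = {boulder}
    while queue:
        (r, c), path = queue.popleft()
        if (r, c) == target:
            return path
        for name, (dr, dc) in moves.items():
            nxt = (r + dr, c + dc)
            if open_cell(*nxt) and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None
```

For example, on a grid where a wall forces a detour, the search finds the push sequence around the obstacle rather than through it; a real agent-built tool would layer game-specific state reading and input emulation on top of a core search like this.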
Google speculates that future AI models may be capable of developing such tools autonomously—perhaps even crafting a “don’t panic” module for improved performance.
Author’s Opinion
The moments when AI “panics” playing Pokémon reveal the limits of current models but also highlight their fascinating resemblance to human behavior under stress. These flaws are a reminder that AI is still learning to balance calculation with adaptability. Yet, the ability of AI to develop problem-solving tools on its own offers a glimpse into a future where machines might not only think but also self-correct—bridging the gap between human intuition and artificial intelligence.
Featured image credit: rawpixel