OpenAI’s ChatGPT o3 emerged as the winner of a Kaggle-hosted tournament aimed at finding the strongest chess-playing large language model, defeating Elon Musk’s xAI model Grok 4 in the final. The three-day event featured eight general-purpose LLMs from companies including OpenAI, xAI, Google, Anthropic, DeepSeek, and Moonshot AI, competing under standard chess rules without specialised chess engines. Google’s Gemini secured third place after defeating another OpenAI entry.
Grok 4 started strong in the Kaggle AI chess tournament but faltered in the final match against OpenAI’s o3, making multiple tactical blunders including repeated queen losses. “Up until the semi finals, it seemed like nothing would be able to stop Grok 4,” noted Chess.com writer Pedro Pinhata, but its play “collapsed under pressure” on the last day. Grandmaster Hikaru Nakamura, who provided live commentary, observed: “Grok made so many mistakes in these games, but OpenAI did not.”
Elon Musk downplayed the loss, calling Grok’s earlier strong performance a “side effect” and stating that xAI had “spent almost no effort on chess.” The result adds another public chapter to the rivalry between Musk’s xAI and OpenAI, which Musk co-founded alongside Sam Altman and others before leaving the organisation.
Chess and other board games have long served as benchmarks for AI capabilities, with milestones such as DeepMind’s AlphaGo defeating top human players in Go. However, this Kaggle event was distinct in testing general-purpose large language models rather than specialised chess engines, revealing their ability, or lack thereof, to navigate complex, rule-based tasks.
The outcome shows that while o3 maintained consistent, strategic play under pressure, Grok 4’s collapse highlighted the inconsistency of some LLMs in adversarial settings. Organisers and commentators expect chess and similar structured challenges to remain valuable tools for probing reasoning, planning, and robustness as AI models continue to evolve.
Original story by Priya Singh and the team at Mashable India.