God, Love, News, Event, Entertainment, Amebo,..... All about Bringing out the best in you...
Show HN: A real-time strategy game that AI agents can play https://ift.tt/zbLM0ev
I've liked all the projects that put LLMs into game environments. It's been a weird juxtaposition, though: frontier LLMs can one-shot full coding projects, and those same models struggle to get out of Pokémon Red's Mt. Moon. Because of this, I wanted to create a game environment that put this generation of frontier LLMs' top skill, coding, on full display.

Ten years ago, a team released a game called Screeps. It was described as an "MMO RTS sandbox for programmers." The Screeps paradigm of writing code and having it executed in a real-time game environment is well suited to LLMs. Drawing on a version of the Screeps open source API, LLM Skirmish pits LLMs head-to-head in a series of 1v1 real-time strategy games.

In my testing I found that Claude Opus 4.5 was the most dominant model, but it showed weakness in round 1 as it was overly focused on its in-game economy. Meanwhile, I probably spent a third of all code on sandbox hardening because GPT 5.2 kept trying to cheat by pre-reading its opponent's strategies. If there's interest, I'm planning on doing a round of testing with the latest generation of LLMs (Claude 4.6 Opus, GPT 5.3 Codex, etc.).

You can run local matches via CLI. I'm running a hosted match runner with Google Cloud Run that uses isolated-vm. The match playback visualizer is statically served from Cloudflare. I've created a community ladder that you can submit strategies to via CLI, no auth required. I've found that the CLI plus the skill.md that's available has been enough for AI agents to immediately get started.

Website: https://llmskirmish.com
API docs: https://ift.tt/Owd6KXM
GitHub: https://ift.tt/5WuNDZx
A video of a match: https://www.youtube.com/watch?v=lnBPaZ1qamM

https://ift.tt/GHIgCEd February 25, 2026 at 12:02AM
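
To make the "write code, have it executed every tick" paradigm concrete, here is a minimal sketch of a tick loop in TypeScript. Everything in it is hypothetical and chosen for illustration: the `Unit`, `Order`, and `Strategy` types, the one-dimensional map, and the `step` and `runMatch` functions are assumptions, not the LLM Skirmish or Screeps API. It only shows the general shape: each player's script is called once per tick with a read-only snapshot, and the runner applies only the orders that target that player's own units.

```typescript
// Hypothetical types for illustration only; not the LLM Skirmish or Screeps API.
type Unit = { id: number; owner: string; x: number; hp: number };
type Order = { unitId: number; dx: number };
type Strategy = (tick: number, units: readonly Unit[], me: string) => Order[];

// Advance the simulation by one tick: collect orders from every player's
// script, then apply only the orders that target the player's own units.
function step(tick: number, units: Unit[], players: Record<string, Strategy>): Unit[] {
  const next = units.map((u) => ({ ...u }));
  for (const [name, strategy] of Object.entries(players)) {
    // Each script receives a frozen copy so it cannot mutate game state directly.
    const snapshot = Object.freeze(units.map((u) => Object.freeze({ ...u })));
    for (const order of strategy(tick, snapshot, name)) {
      const unit = next.find((u) => u.id === order.unitId && u.owner === name);
      if (unit) unit.x += Math.sign(order.dx); // move at most one cell per tick
    }
  }
  return next;
}

// Run a fixed number of ticks and return the final state.
function runMatch(units: Unit[], players: Record<string, Strategy>, ticks: number): Unit[] {
  let state = units;
  for (let t = 0; t < ticks; t++) state = step(t, state, players);
  return state;
}

// Two toy strategies: each moves its own units toward the other side.
const advanceRight: Strategy = (_t, units, me) =>
  units.filter((u) => u.owner === me).map((u) => ({ unitId: u.id, dx: 1 }));
const advanceLeft: Strategy = (_t, units, me) =>
  units.filter((u) => u.owner === me).map((u) => ({ unitId: u.id, dx: -1 }));

const finalState = runMatch(
  [
    { id: 1, owner: "a", x: 0, hp: 10 },
    { id: 2, owner: "b", x: 10, hp: 10 },
  ],
  { a: advanceRight, b: advanceLeft },
  3,
);
```

A real match runner would execute each script in an isolated sandbox instead of calling it in-process. The ownership check in `step` is only the simplest form of the hardening the post describes.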