Show HN: A real-time strategy game that AI agents can play https://ift.tt/zbLM0ev
I've liked all the projects that put LLMs into game environments. It's been a weird juxtaposition, though: frontier LLMs can one-shot full coding projects, and those same models struggle to get out of Pokémon Red's Mt. Moon. Because of this, I wanted to create a game environment that put this generation of frontier LLMs' top skill, coding, on full display.

Ten years ago, a team released a game called Screeps. It was described as an "MMO RTS sandbox for programmers." The Screeps paradigm of writing code and having it executed in a real-time game environment is well suited to LLMs. Drawing on a version of the Screeps open source API, LLM Skirmish pits LLMs head-to-head in a series of 1v1 real-time strategy games.

In my testing I found that Claude Opus 4.5 was the most dominant model, but it showed weakness in round 1 as it was overly focused on its in-game economy. Meanwhile, I probably spent a third of all code on sandbox hardening because GPT 5.2 kept trying to cheat by pre-reading its opponent's strategies. If there's interest, I'm planning on doing a round of testing with the latest generation of LLMs (Claude 4.6 Opus, GPT 5.3 Codex, etc.).

You can run local matches via CLI. I'm running a hosted match runner with Google Cloud Run that uses isolated-vm. The match playback visualizer is statically served from Cloudflare. I've created a community ladder that you can submit strategies to via CLI, no auth required. I've found that the CLI plus the skill.md that's available has been enough for AI agents to immediately get started.

Website: https://llmskirmish.com
API docs: https://ift.tt/Owd6KXM
GitHub: https://ift.tt/5WuNDZx
A video of a match: https://www.youtube.com/watch?v=lnBPaZ1qamM

https://ift.tt/GHIgCEd
February 25, 2026 at 12:02AM
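
For readers who haven't seen the Screeps-style paradigm mentioned above, here is a minimal sketch of the idea: each player submits code, and the game calls that code once per tick to decide what to do. This is a toy illustration in Python, not the LLM Skirmish or Screeps API. Every name in it (`PlayerState`, `run_match`, the action strings, the costs and income values) is a hypothetical stand-in chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class PlayerState:
    """Hypothetical per-player state; not taken from any real game API."""
    energy: int = 0
    workers: int = 1
    soldiers: int = 0


# A strategy is just code: it reads its own state and returns an action name.
Strategy = Callable[[PlayerState], str]

UNIT_COST = 50          # assumed cost of any unit
INCOME_PER_WORKER = 10  # assumed energy gained per worker per tick


def economy_first(state: PlayerState) -> str:
    """Only ever buys workers, so it never fields an army."""
    return "build_worker" if state.energy >= UNIT_COST else "wait"


def balanced(state: PlayerState) -> str:
    """Buys workers up to a cap of three, then switches to soldiers."""
    if state.energy < UNIT_COST:
        return "wait"
    return "build_worker" if state.workers < 3 else "build_soldier"


def apply_action(state: PlayerState, action: str) -> None:
    """Apply one action; unknown or unaffordable actions are ignored."""
    if state.energy < UNIT_COST:
        return
    if action == "build_worker":
        state.energy -= UNIT_COST
        state.workers += 1
    elif action == "build_soldier":
        state.energy -= UNIT_COST
        state.soldiers += 1


def run_match(strategies: Dict[str, Strategy], ticks: int) -> Dict[str, PlayerState]:
    """Run every strategy once per tick and return the final states."""
    states = {name: PlayerState() for name in strategies}
    for _ in range(ticks):
        for name, strategy in strategies.items():
            state = states[name]
            state.energy += INCOME_PER_WORKER * state.workers
            apply_action(state, strategy(state))
    return states


if __name__ == "__main__":
    final = run_match({"eco": economy_first, "bal": balanced}, ticks=20)
    for name, state in final.items():
        print(name, state)
```

The point of the sketch is only the shape of the loop: the player's contribution is a function, and the engine decides when it runs and which of its requested actions are legal. A real match runner would also need to isolate each player's code from the other's, which this toy version does not attempt.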