God, Love, News, Event, Entertainment, Amebo,..... All about Bringing out the best in you...
Show HN: Multi-Agent-Coder Is #12 on Stanford's TBench. Beats Claude Code https://ift.tt/riGaL45
This weekend I built a multi-agent coding system which, quite unexpectedly, beat Claude Code on Stanford's Terminal Bench!

The architecture is straightforward: an orchestrator agent deploys explorer and coder subagents to complete complex terminal-based tasks, utilising an intelligent context-sharing mechanism along the way, which makes it all work.

The repo has a lot of technical details, plus all the code and prompts for you to play around with if you'd like! I had a lot of fun making this, and I hope you have fun reading the README, using it yourself, or even extending it!

As always, a huge thanks to the great team behind Terminal Bench. It is a great benchmark.

Thanks for reading,
Dan

https://ift.tt/0ZLljp8

September 2, 2025 at 10:04PM