God, Love, News, Event, Entertainment, Amebo,..... All about Bringing out the best in you...
Show HN: Flight Risk: Can you break an AI agent? https://ift.tt/CJaF4Lp
I built a security game that lets you try to break an AI support agent. I work in security engineering, and it's incredibly hard to defend against an attack that you don't know how to perform yourself. There's also next to nowhere to practice and improve your skills. I'd heard all about fooling AI agents with just "IGNORE ALL PREVIOUS INSTRUCTIONS", but I'd never actually put that into practice, and it turns out it's harder than you'd expect!

Just as basic security skills are important for all software engineers, anyone working with AI should know what prompt injection looks like and should be thinking about how to prevent it. Flight Risk lets you practice your AI agent manipulation skills: it has your standard prompt injection and social engineering, but more than that too, each one a real vulnerability.

Think you could crack it? Every engineer I've given it to has been surprised by the challenge! You can use the hints, but they affect your score ;) Give it a try, and let me know how you do! https://ift.tt/YhLr1uA

April 21, 2026 at 02:28AM