We broke GPT-5.4 safety with 10 examples and 2 words using a new attack technique — IICL
OpenAI’s newest flagship is more vulnerable to our attack than GPT-5 or GPT-5-mini. Newer doesn’t mean safer. Our new research (3,500+ probes, 10 models, 7 controlled experiments) shows why continuous red teaming isn’t optional for anyone building on frontier AI.

TL;DR

We ran 3,500+ controlled probes across every model in ...