Research


IICL involuntary in-context learning attack technique

April 23, 2026


Research + LLM Security · admin

We broke GPT-5.4 safety with 10 examples and 2 words using a new attack technique — IICL

OpenAI’s newest flagship is more vulnerable to our attack than GPT-5 or GPT-5-mini. Newer doesn’t mean safer. Our new research (3,500+ probes, 10 models, 7 controlled experiments) shows why continuous red teaming isn’t optional for anyone building on frontier AI. TL;DR We ran 3,500+ controlled probes across every model in ...

September 11, 2025


Research · admin

AI Reasoning Leakage Vulnerability: Self-betrayal attack on UAE MBZUAI G42 K2 Think

AI Reasoning Leakage Vulnerability: Self-betrayal attack on UAE MBZUAI G42 K2 Think Executive Summary A critical vulnerability has been identified in the advanced reasoning system of K2 Think, the just-released reasoning model from the UAE’s Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) in collaboration with G42, where the model’s internal thought process inadvertently exposes ...

Grok 3 AI Red Teaming

February 18, 2025


Research + LLM Security · admin

Grok 3 Jailbreak and AI Red Teaming

Grok 3 Jailbreak and AI Red Teaming In this article, we demonstrate how Grok 3 responds to different hacking techniques, including jailbreaks and prompt-leaking attacks. Our initial study on AI Red Teaming different LLM models using various approaches focused on LLM models released before the so-called “Reasoning Revolution”, ...

January 31, 2025


Research + LLM Security · admin

DeepSeek Jailbreaks

DeepSeek Jailbreaks In this article, we demonstrate how DeepSeek responds to different jailbreak techniques. Our initial study on AI Red Teaming different LLM models using various approaches focused on LLM models released before the so-called “Reasoning Revolution”, offering a baseline for security assessments before the emergence of advanced reasoning-based ...