Human-in-the-loop isn’t enough: New attack turns AI safeguards into exploits
Human-in-the-loop (HITL) safeguards that AI agents rely on can be subverted, allowing attackers to weaponize them to run malicious code, new research from CheckMarx shows.
HITL dialogs are a safety backstop (a final “are you sure?”) that the agents run before executing sensitive actions like running code, modifying files, or touching system resources.
Checkmarx researchers described it as an HITL dialog forging technique they’re calling Lies-in-the-Loop (LITL), where malicious instructions are embedded into AI prompts in ways that mislead users reviewing approval dialogs.
The research findings reveal that keeping a human in the loop is not enough to neutralize prompt-level abuse. Once users can’t reliably trust what they’re being asked to approve, HITL stops being a guardrail and becomes an attack surface.
“The Lies-in-the-Loop (LITL) attack exploits the trust users place in these approval dialogs,” CheckMarx researchers said in a blog post. “By manipulating what the dialog displays, attackers turn the safeguard into a weapon — once the prompt looks safe, users approve it without question.”
https://www.csoonline.com/article/4108592/human-in-the-loop-isnt-enough-new-attack-turns-ai-safeguards-into-exploits.html?utm_date=20251228152159&utm_campaign=Computerworld%20Business%20Critical&utm_content=slotno-6-title-Human-in-the-loop%20isn%E2%80%99t%20enough%3A%20New%20attack%20turns%20AI%20safeguards%20into%20exploits&utm_term=Computerworld%20US%20Editorial%20Newsletters&utm_medium=email&utm_source=Adestra&aid=17849992&huid=fc3d92ce-1b66-42b1-a11d-1383a1a768a3














