Prompt Injection
3 articles in archive
Introducing Lockdown Mode and Elevated Risk labels in ChatGPT
Introducing Lockdown Mode and Elevated Risk labels in ChatGPT to help organizations defend against prompt injection and AI-driven data exfiltration.
OpenAI Blog · 35d ago
Continuously hardening ChatGPT Atlas against prompt injection
OpenAI is strengthening ChatGPT Atlas against prompt injection attacks using automated red teaming trained with reinforcement learning. This proactive discover-and-patch loop helps identify novel exploits early and harden the browser agent’s defenses as AI becomes more agentic.
OpenAI Blog · 89d ago
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
Today's LLMs are susceptible to prompt injections, jailbreaks, and other attacks that allow adversaries to override a model's original instructions with their own malicious prompts.
OpenAI Blog · 700d ago
