OpenAI rolls out Lockdown Mode to curb prompt-injection data theft risks

OpenAI has begun rolling out a new security feature called ‘Lockdown Mode’ across its products, making the feature available to eligible Free, Go, Plus, Pro, and self-serve ChatGPT Business accounts. The company said the optional setting targets people and organisations that handle sensitive information and seek stronger protection against data exfiltration risks linked to prompt-injection attacks. The feature restricts several web-connected capabilities, including Deep Research, Agent Mode, and live web browsing.

However, OpenAI acknowledged that the feature is not a complete solution. In its FAQ, the company said Lockdown Mode is designed to “substantially reduce the risk of prompt injection-based data exfiltration†but “does not guarantee data exfiltration cannot happenâ€. OpenAI added that risks may still arise through third-party apps that remain enabled, cached data, unforeseen combinations of capabilities, or newly discovered attack techniques.

What is prompt injection? Prompt injection is a technique in which attackers hide malicious instructions inside content that an AI system reads, such as webpages, documents, emails, PDFs, or database records. If the model follows those instructions, it may ignore its original directions, reveal sensitive information to attackers, perform unintended actions, or generate misleading outputs.

What Lockdown Mode restricts: OpenAI said that Lockdown Mode disables or limits several features that can connect to the web or external services.

At the same time, OpenAI clarified that Lockdown Mode does not change memory settings, file uploads, conversation sharing, or how the company uses conversations to improve its models.

How Lockdown Mode works and app risk categories: OpenAI said Lockdown Mode builds on existing protections such as sandboxing, URL-based data exfiltration safeguards, monitoring systems and enterprise controls. Rather than blocking prompt injections outright, it aims to prevent the final stage of an attack by limiting outbound network requests that could transfer sensitive information to an attacker.

The company also categorised apps and actions by risk level:

Why does this matter? OpenAI said prompt injection is “not currently a major risk,†yet security researchers increasingly view it as one of the most serious threats facing AI systems. OWASP ranks prompt injection as the top security risk for LLM applications, warning that attackers can hide malicious instructions in websites, documents or emails that AI models later process. In one case, a user tricked a Chevrolet dealership chatbot into selling them a car for $1. The UK’s National Cyber Security Centre (NCSC) has also warned that prompt injection attacks may never be fully mitigated because LLMs do not inherently distinguish between instructions and data.

Also read:

Back To Top