OpenAI has patched a flaw that could have allowed hackers to manipulate ChatGPT into leaking private information from a victim’s Gmail inbox. The vulnerability was discovered and reported by cybersecurity vendor Radware. The problem was found in ChatGPT’s “deep research” function, a premium tool that can handle complex tasks such as browsing the web and analyzing messages and files in a user’s inbox. With a user’s permission, deep research can connect to various services, including Gmail, Google Drive, and Microsoft’s OneDrive. The vulnerability arose when a user asked ChatGPT to perform a deep research query on their Gmail inbox. Radware found that ChatGPT could be manipulated into scanning and leaking the user’s private information if it encountered a hacker-written email containing secret instructions to tamper with the chatbot.
How the Attack Was Executed
The proof-of-concept attack was not easy to develop. It required a lengthy, specially crafted phishing email that appeared to be about HR processes but contained malicious instructions hidden within the code. This email was designed to dupe ChatGPT into executing the instructions, which involved extracting relevant names and addresses from the user’s inbox and sending them to a hacker. Although the attack was dependent on the user initiating a deep research query on a specific topic, once triggered, ChatGPT would gather the sensitive information and send it to a hacker-controlled web page “without user confirmation and without rendering anything in the UI,” according to Radware.
The same attack was also difficult for traditional cybersecurity tools to detect because the data exfiltration originated from OpenAI’s own cloud infrastructure, rather than the user’s device or browser. Radware noted that this makes it nearly impossible for traditional defenses to intercept the data.
Mitigating the Persistent Threat
OpenAI told a publication that it takes steps to reduce the risk of malicious use and that it is continually improving safeguards. According to Radware, OpenAI patched the flaw in August before publicly acknowledging it in September. The findings highlight the persistent threat of hackers planting hidden instructions in web content to manipulate chatbots into executing malicious actions. Last month, both Anthropic and Brave Software warned about similar threats potentially affecting AI-powered browsers and browser extensions. To defend against such threats, Radware suggests safeguards could include “sanitizing” emails to remove hidden AI instructions and better monitoring of chatbot actions.
Author’s Opinion
This vulnerability, while patched, signals a new and complex era in cybersecurity. Traditional defenses are ill-equipped to handle threats that originate from a chatbot’s infrastructure rather than a user’s device. As AI agents become more integrated into our digital lives and gain access to our most sensitive information, the responsibility for security is shifting. The discovery of this flaw is a crucial wake-up call, demonstrating that the integrity of our data now depends not just on the security of our own devices, but on the robustness of the AI models that we grant access to them. The tech industry, regulators, and users must all adapt to a world where AI itself can become both a tool and a target in the cybersecurity war.
Featured image credit: Jonathan Kemper via Unsplash
For more stories like it, click the +Follow button at the top of this page to follow us.