OpenAI has introduced a new general-purpose AI agent within ChatGPT, designed to perform a wide range of computer-based tasks on behalf of users. The agent can automatically manage calendars, create and edit presentations, run code, and more, with the aim of going beyond simply answering questions.
Combining Powerful Features for Smarter Assistance
Named simply ChatGPT agent, the tool builds on capabilities from OpenAI’s previous agentic products. It combines features like Operator’s web navigation and Deep Research’s multi-source information synthesis to deliver comprehensive research reports. Users interact with the agent via natural language prompts, simplifying complex workflows.
The agent is available starting Thursday for subscribers on OpenAI’s Pro, Plus, and Team plans. Activation is as easy as selecting “agent mode” from ChatGPT’s tool dropdown menu.
ChatGPT agent leverages ChatGPT connectors, enabling it to access apps such as Gmail and GitHub to pull relevant data. It also has access to a command-line terminal and can call APIs to interact with various software tools.
OpenAI highlights use cases including planning and purchasing groceries for a Japanese breakfast, or analyzing competitors and preparing slide decks. These tasks require parsing websites, planning actions, and using tools, going far beyond simple question answering.
Leading Performance on Complex Benchmarks
The model underlying ChatGPT agent delivers state-of-the-art results. It scored 41.6% on Humanity’s Last Exam (pass@1, meaning a single attempt per question), a challenging test spanning more than a hundred subjects. That is nearly double the score of prior OpenAI mini models. On FrontierMath, a difficult math benchmark, it scored 27.4% with tool access, well above previous records.
Recognizing the risks of its expanded capabilities, OpenAI classified ChatGPT agent as “high capability” in sensitive domains such as biological and chemical weapons, indicating potential for misuse. Although there is no direct evidence of malicious use, OpenAI has implemented real-time monitoring that flags content related to biological threats.
The agent’s memory feature has been disabled to prevent data leakage via prompt injections, though OpenAI may reconsider this in the future.
Author’s Opinion
ChatGPT agent marks an important step toward AI systems that truly help users accomplish complex tasks end-to-end. However, the real-world performance of AI agents remains unproven, and the potential misuse risks are significant. OpenAI’s proactive safety measures are encouraging but should be paired with transparent public oversight. The success of agentic AI hinges on balancing capability with responsibility, and it’s clear the technology is only just beginning its journey.
Featured image credit: ITPro