DMR News

OpenAI Introduces Jalapeño, Its First Custom AI Inference Processor

By Jolyen

Jun 25, 2026

OpenAI and Broadcom have introduced Jalapeño, OpenAI’s first custom processor designed specifically for running large language models.

The companies describe the chip as an “Intelligence Processor” built around the requirements of OpenAI’s inference systems. It is the first product in a planned multi-generation computing platform intended to make AI models faster, more efficient, and less expensive to operate.

Jalapeño Focuses on AI Inference

Inference is the process through which a trained AI model responds to prompts, generates code, creates content, or completes other tasks for users. These workloads run continuously after a model has been developed and can account for a substantial share of an AI company’s computing costs.

Jalapeño has been optimised for large language model inference rather than every type of AI workload. OpenAI said early testing showed stronger performance per watt than existing leading alternatives, although it did not publish detailed benchmark results.
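Because no benchmark figures were published, the comparison can only be illustrated with invented numbers. The short Python sketch below shows how performance per watt is typically calculated for inference hardware and how it feeds into energy cost per million tokens. The chip names, throughput, power draw, and electricity price are all hypothetical placeholders, not figures from OpenAI or Broadcom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    name: str
    tokens_per_second: float  # sustained inference throughput
    power_watts: float        # average power draw under load


def tokens_per_joule(chip: Accelerator) -> float:
    """Performance per watt: tokens/s divided by watts (J/s) gives tokens per joule."""
    return chip.tokens_per_second / chip.power_watts


def energy_cost_per_million_tokens(chip: Accelerator, usd_per_kwh: float) -> float:
    """Electricity cost in USD to generate one million tokens on this chip."""
    joules = 1_000_000 / tokens_per_joule(chip)
    kwh = joules / 3_600_000  # 1 kWh = 3.6 million joules
    return kwh * usd_per_kwh


# Hypothetical figures for illustration only; not published benchmarks.
baseline = Accelerator("general-purpose GPU (hypothetical)", 10_000.0, 1_000.0)
custom = Accelerator("custom inference chip (hypothetical)", 12_000.0, 800.0)

if __name__ == "__main__":
    for chip in (baseline, custom):
        cost = energy_cost_per_million_tokens(chip, usd_per_kwh=0.10)
        print(f"{chip.name}: {tokens_per_joule(chip):.1f} tokens/J, "
              f"${cost:.5f} per million tokens")
```

With these made-up inputs, the custom chip gains on both sides of the ratio: it produces more tokens each second while drawing less power, so its energy cost per token falls even though raw throughput rises only modestly.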

The company highlighted real-time coding models as one area where lower latency and operating costs could be particularly valuable. OpenAI is developing coding products such as Codex, which require models to generate and revise software quickly while interacting with users and development tools.

More demanding work such as pre-training frontier models may continue to rely on systems from Nvidia and other chip suppliers. OpenAI has separate agreements to deploy large quantities of Nvidia and AMD hardware alongside its custom processors.

OpenAI Models Assisted With Chip Development

OpenAI said its own AI models contributed to the chip’s development. The company did not provide detailed information about which stages used AI, but chipmakers increasingly apply machine learning to tasks such as architecture design, verification, optimisation, and debugging.

OpenAI contributed its knowledge of model behaviour and inference workloads, while Broadcom provided semiconductor design, networking, connectivity, and manufacturing expertise.

The companies first announced their strategic collaboration in October 2025. Their agreement covers 10 gigawatts of custom accelerators and associated networking systems, with deployment expected to begin in the second half of 2026 and continue through 2029.

Custom Chips Give OpenAI More Control

Google and Amazon already use custom AI accelerators to reduce their dependence on general-purpose hardware and optimise costs for their own services. OpenAI’s move gives it similar control over the infrastructure supporting its models and products.

The company said it is now working across the entire AI computing stack, including processor architecture, memory, networking, software kernels, scheduling, deployment systems, models, and customer products.

Designing these layers together could allow OpenAI to optimise them around the same workloads. Jalapeño remains in testing, and the company has not disclosed when it will begin supporting ChatGPT, Codex, or other commercial services.


Featured image credits: Roboflow Universe

