Chinese startup DeepSeek, which sent Silicon Valley into a frenzy when it launched its R1 model seemingly out of nowhere in January 2025, has released a new experimental model, DeepSeek-V3.2-Exp. The model, an experimental version of its current V3.1-Terminus, builds on the company’s mission to increase efficiency in AI systems. The big improvement is a new feature called “DeepSeek Sparse Attention” (DSA), which makes the AI better at handling long documents and conversations. According to Adina Yakefu, Chinese community lead at Hugging Face, the new model also cuts the cost of running the AI in half compared to the previous version without a noticeable drop in performance.

The Pros and Cons of ‘Sparse Attention’

An AI model makes decisions based on its training data and the new information it is given. With sparse attention, the model factors in only the parts of that input it judges important for the task at hand, rather than weighing every piece against every other, which drastically reduces the computing resources needed to run it. This is a boon for efficiency and the ability to scale AI. However, critics such as Ekaterina Almasque, a cofounder of BlankPage Capital, have raised concerns that sparse attention could make models less reliable, because there is little oversight of how the model decides which information to filter out. “The reality is, they have lost a lot of nuances,” she said, adding that it may not be the “optimal one or the safest” AI model to use.
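To make the trade-off concrete, here is a minimal sketch of one generic form of sparse attention, in which a query keeps only its top-scoring inputs and ignores the rest. It is an illustration only and is not DeepSeek’s DSA implementation, whose details the article does not describe. The function names and the cutoff parameter `k` are assumptions made for this example.

```python
import math


def _scores(query, keys):
    """Scaled dot-product score between one query and every key."""
    scale = math.sqrt(len(query))
    return [sum(q * x for q, x in zip(query, key)) / scale for key in keys]


def _mix(scores, values, kept):
    """Softmax over the kept positions only, then a weighted sum of their values."""
    top = max(scores[i] for i in kept)
    weights = {i: math.exp(scores[i] - top) for i in kept}
    total = sum(weights.values())
    dim = len(values[0])
    return [sum(weights[i] * values[i][d] for i in kept) / total for d in range(dim)]


def dense_attention(query, keys, values):
    """Baseline: every input position contributes to the output."""
    scores = _scores(query, keys)
    return _mix(scores, values, list(range(len(keys))))


def topk_sparse_attention(query, keys, values, k):
    """Hypothetical top-k variant: only the k highest-scoring positions contribute.

    Everything outside the top k is dropped entirely, which is where the
    savings come from and also where nuance can be lost.
    """
    scores = _scores(query, keys)
    ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:k])
    return _mix(scores, values, kept)


if __name__ == "__main__":
    query = [1.0, 0.0]
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
    print("dense: ", dense_attention(query, keys, values))
    print("sparse:", topk_sparse_attention(query, keys, values, k=1))
```

The two outputs differ because the sparse version discards the two lower-scoring inputs, which is the kind of lost nuance critics point to. Note that this toy version still scores every input before choosing which to keep, so it saves work only in the final mixing step; a practical system would need a cheaper way to make that selection.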

Despite these concerns, DeepSeek says the experimental model performs on par with its previous version. Yakefu also noted that DeepSeek’s models work “right out of the box” with Chinese-made AI chips, such as Ascend and Cambricon, which allows them to run locally on domestic hardware without any extra setup. The company has also published the code needed to run the experimental model, allowing others to learn from it and build their own improvements.

Author’s Opinion

DeepSeek’s new model, with its focus on efficiency and cost reduction, is a strategic move that could democratize access to powerful AI. While competitors are locked in a race for raw power and size, DeepSeek is carving out a niche by making AI more practical and affordable for developers, researchers, and smaller companies. This approach, combined with its open-source nature and compatibility with Chinese hardware, positions the company as a key player in the global AI competition. This is a long-term play that shows DeepSeek’s leadership is thinking about the sustainability and accessibility of AI, not just the speed and power of its models.

