DeepSeek Unveils V3.2-Exp, Slashes API Costs Again
DeepSeek is taking another swing at the AI heavyweights with the launch of DeepSeek-V3.2-Exp, an experimental version of its flagship model rolled out Monday on Hugging Face.
The new release builds on the recently released V3.1-Terminus and debuts DeepSeek Sparse Attention, a mechanism designed to make training and inference more efficient when working with long stretches of text. The company called it a step toward its next-generation architecture, saying the research focuses on squeezing more efficiency out of transformer models without sacrificing performance.
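The core idea behind sparse attention is that each query token attends only to a small subset of keys instead of the full context, shrinking the quadratic cost of standard attention. DeepSeek has not published its exact kernel in this announcement, so the following is a generic top-k sparse attention sketch in NumPy, purely illustrative of the technique and not DeepSeek's implementation (real kernels avoid materializing the dense score matrix at all):

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    """Illustrative sparse attention: each query keeps only its k
    highest-scoring keys; all other attention weights become zero.
    Q: (n_q, d), K: (n_k, d), V: (n_k, d_v)."""
    # Scaled dot-product scores (dense here, for clarity only)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    # Indices of the (n_k - k) smallest scores per query row
    drop = np.argpartition(scores, -k, axis=-1)[:, :-k]
    # Mask them out so softmax assigns them zero weight
    np.put_along_axis(scores, drop, -np.inf, axis=-1)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Example: 4 queries over a 16-token context, attending to 4 keys each
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(16, 8))
V = rng.normal(size=(16, 8))
out = topk_sparse_attention(Q, K, V, k=4)
```

With k fixed, the score computation a production kernel performs scales linearly in context length rather than quadratically, which is where the training and inference savings on long documents come from.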
To sweeten the deal, DeepSeek also cut API pricing. Inputs now cost as little as $0.028 per million tokens on a cache hit, down from $0.07, while outputs dropped to $0.42 per million tokens from $1.68. That pricing move underscores the startup's pitch: powerful models at a fraction of the cost of U.S. rivals.
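Those quoted figures work out to a 60% cut on cache-hit input and a 75% cut on output, as a quick check confirms:

```python
# Price cuts implied by the announced per-million-token rates
old_input, new_input = 0.07, 0.028    # cache-hit input, $/1M tokens
old_output, new_output = 1.68, 0.42   # output, $/1M tokens

input_cut = 1 - new_input / old_input
output_cut = 1 - new_output / old_output
print(f"input cut: {input_cut:.0%}, output cut: {output_cut:.0%}")
```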
DeepSeek has been turning heads with its lean training approach. Its reasoning-focused R1 model, for instance, cost just $294,000 to train using slightly more than 500 Nvidia (NVDA) H800 GPUs. By comparison, reports suggest Microsoft (MSFT)-backed OpenAI's GPT-4 burned through more than 10,000 GPUs.