DeepSeek-V3.2-Exp - Long-context efficiency with DeepSeek Sparse Attention
DeepSeek-V3.2-Exp is a new experimental model that introduces DeepSeek Sparse Attention (DSA). The sparse attention mechanism improves long-context efficiency in both training and inference while keeping performance on par with V3.1-Terminus, and API prices have been cut by more than 50%.
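For anyone wondering what "sparse attention" means in practice, here is a minimal, hypothetical top-k sparse attention sketch in NumPy. It is not DeepSeek's actual DSA (the function name, shapes, and top_k value are illustrative assumptions); it only shows the general idea that each query attends to a small selected subset of keys instead of the full context, which is where long-context savings come from.

```python
import numpy as np

def topk_sparse_attention(q, k, v, top_k=64):
    """q, k, v: (L, d) arrays for one head. Returns the (L, d) attention output.
    Illustrative sketch only; not DeepSeek's DSA."""
    L, d = q.shape
    # Full score matrix for clarity; a real sparse-attention system avoids
    # materializing this and selects keys with a much cheaper scoring pass.
    scores = q @ k.T / np.sqrt(d)
    # Causal mask: position i may only attend to positions <= i.
    scores[np.triu(np.ones((L, L), dtype=bool), k=1)] = -np.inf

    out = np.zeros_like(v)
    for i in range(L):
        kk = min(top_k, i + 1)                       # can't select more keys than exist
        idx = np.argpartition(scores[i], -kk)[-kk:]  # indices of the kk largest scores
        s = scores[i, idx]
        w = np.exp(s - s.max())                      # softmax over the selected keys only
        w /= w.sum()
        out[i] = w @ v[idx]                          # weighted sum over the selected values
    return out

# Toy usage: 512 tokens, 64-dim head, each query attends to at most 64 keys.
rng = np.random.default_rng(0)
L, d = 512, 64
q, k, v = (rng.standard_normal((L, d)) for _ in range(3))
print(topk_sparse_attention(q, k, v).shape)  # (512, 64)
```

Note that this sketch still computes the dense score matrix, so it only saves work in the softmax and value aggregation; the point of an approach like DSA is to make the key-selection step itself cheap enough that the full quadratic cost is avoided.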

Replies
Flowtica Scribe
Hi everyone!
DeepSeek's new experimental model, V3.2-Exp, is here!
It's built on V3.1-Terminus but introduces their new DeepSeek Sparse Attention (DSA).
This release is all about efficiency. They've found a smarter way to handle long context, which means training and inference are faster and cheaper without sacrificing the quality of the previous version.
Good news for developers: API prices are down by over 50%.
After their paper landed in Nature, it's clear DeepSeek still has architectural innovations up their sleeve.
NerdyNotes
We were waiting for this