Key Points
- DeepSeek’s new AI model, DeepSeek-V3.1, uses UE8M0 FP8 precision, explicitly designed for domestic Chinese chips, signaling a strategic co-development with China’s semiconductor industry.
- DeepSeek-V3.1 introduces an “agent era” with enhanced Programming Agent and Search Agent capabilities through post-training optimization.
- A new “Hybrid Inference” mode lets a single model switch between “Quick Response” (deepseek-chat) and “Deep Thinking” (deepseek-reasoner); the thinking mode matches the quality of the previous R1-0528 model while emitting 20%–50% fewer output tokens.
- API prices are rising: uncached input now costs ¥4 RMB ($0.55 USD) and output ¥12 RMB ($1.64 USD) per million tokens, and both API modes support a 128K context window.
- V3.1 was trained on an additional 840 billion tokens, is open-source on Hugging Face and ModelScope (魔搭), and now supports the Anthropic API format.

A new AI model from DeepSeek (深度求索) is making waves, but the biggest story isn’t just about smarter agents or faster responses.
The real headline is a quiet technical detail with massive implications: the model utilizes UE8M0 FP8 scale parameter precision, a format explicitly designed for the next generation of domestic Chinese chips.
What’s the Big Deal with UE8M0 FP8 and Domestic Chips?
This is more than just a model update; it’s a strategic move.
On August 21, DeepSeek dropped a comment on its official WeChat (Weixin 微信) account that sent a clear signal to the tech world.
The company confirmed that the new UE8M0 FP8 format in their DeepSeek-V3.1 model is a forward-looking decision, specifically tailored for upcoming homegrown hardware.
This suggests a tight-knit, co-development strategy between China’s leading AI labs and its burgeoning semiconductor industry.
They’re not just building AI software; they’re building an entire, self-reliant ecosystem from the silicon up.

Meet DeepSeek-V3.1: The “Agent Era” Begins
While the hardware angle is the long-term play, the new DeepSeek-V3.1 model is a significant step forward in its own right.
DeepSeek is calling it “our first step toward the agent era.”
Here’s the breakdown of what’s new:
1. Beefed-Up Agent Capabilities
Through clever post-training optimization, V3.1 shows major gains in tasks that require action and tool use.
- Programming Agent: Improved performance in coding-related tasks.
- Search Agent: Better at using search tools to find and synthesize information.
Essentially, the model is getting much better at doing things, not just answering questions.
2. The “Hybrid Inference” Mode is Genius
This is one of the coolest features.
V3.1 has a single model that can operate in two modes:
- Quick Response (deepseek-chat): For simple, fast answers.
- Deep Thinking (deepseek-reasoner): For complex problems that require comprehensive analysis.
You can toggle between them with a “Deep Thinking” button in the app and on the web platform.
As one user on X (formerly Twitter) put it, “Hybrid inference is fantastic. Having a model that can switch between deep thinking and quick responses feels like the future of practical AI.”
This dual-mode approach is incredibly efficient, avoiding wasted compute on simple tasks while dedicating power where it’s needed most.
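Because DeepSeek’s API is OpenAI-compatible, switching between the two modes from code comes down to choosing the model name. Here is a minimal sketch; the `pick_model` helper and `build_request` function are illustrative conveniences, not part of DeepSeek’s API, and the endpoint URL should be verified against DeepSeek’s current docs:

```python
import json

# OpenAI-compatible chat endpoint (verify against DeepSeek's API docs).
API_URL = "https://api.deepseek.com/chat/completions"

def pick_model(deep_thinking: bool) -> str:
    # "deepseek-chat" = Quick Response, "deepseek-reasoner" = Deep Thinking.
    return "deepseek-reasoner" if deep_thinking else "deepseek-chat"

def build_request(prompt: str, deep_thinking: bool = False) -> str:
    """Return the JSON body for a chat completion in the chosen mode."""
    return json.dumps({
        "model": pick_model(deep_thinking),
        "messages": [{"role": "user", "content": prompt}],
    })

# POST the body to API_URL with an "Authorization: Bearer <key>" header,
# e.g. via urllib.request, or point the openai SDK at
# base_url="https://api.deepseek.com" and call chat.completions.create().
```

The same prompt can be routed to either mode at runtime, which is what makes the hybrid approach practical: cheap requests stay on `deepseek-chat`, and only genuinely hard ones pay for `deepseek-reasoner`.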
3. Higher Thinking Efficiency
The new “thinking” mode (V3.1-Think) is not only smart but also fast.
DeepSeek confirmed that it delivers response quality comparable to their previous R1-0528 model but with a 20%-50% reduction in the number of output tokens.
This is thanks to Chain-of-Thought (CoT) compression training, making the model more concise without sacrificing performance.
The non-thinking mode is also more efficient, giving you the same quality answers with a shorter output length compared to the V3 model.

The Price of Progress: API Costs are Going Up
With new features comes a new pricing structure.
DeepSeek is adjusting its API call costs, and the night-time discount will be discontinued starting September 6th.
Here’s the new pricing for the DeepSeek-V3.1 API:
- Input (Uncached): ¥4 RMB ($0.55 USD) per million tokens (up from ¥2 RMB ($0.27 USD) for V3).
- Input (Cached): ¥0.5 RMB ($0.07 USD) per million tokens.
- Output: ¥12 RMB ($1.64 USD) per million tokens (up from ¥8 RMB ($1.09 USD) for V3).
Both API modes now support a generous 128K context window.
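Those per-million-token rates translate directly into a quick cost estimator. A sketch using the RMB figures above (the helper and its flat cached/uncached split are illustrative; always check current pricing):

```python
# Rough cost estimator for DeepSeek-V3.1 API calls, using the RMB prices
# quoted above, per one million tokens. Illustrative only.
PRICE_RMB_PER_M = {
    "input_uncached": 4.0,
    "input_cached": 0.5,
    "output": 12.0,
}

def estimate_cost_rmb(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimated RMB cost of one call; cached tokens billed at the cache rate."""
    uncached = input_tokens - cached_input_tokens
    cost = (uncached * PRICE_RMB_PER_M["input_uncached"]
            + cached_input_tokens * PRICE_RMB_PER_M["input_cached"]
            + output_tokens * PRICE_RMB_PER_M["output"]) / 1_000_000
    return round(cost, 6)

# Example: 10K input tokens (half of them cache hits) plus 2K output tokens:
# (5000*4 + 5000*0.5 + 2000*12) / 1e6 = 46500 / 1e6 = ¥0.0465
```

Note how output tokens dominate the bill at ¥12 per million, which is why the 20%–50% output-token reduction from CoT compression matters in practice.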

More Training, Open Source, and a Nod to Anthropic
DeepSeek isn’t keeping all this progress to itself.
The company shared a few more important updates:
- More Data: The V3.1 base model was trained on an additional 840 billion tokens on top of the original V3 foundation.
- Open Source: Both the base model and the post-trained model are now available on Hugging Face and ModelScope (魔搭) for the community to use and build upon.
- Anthropic API Support: In a smart move “to meet user demand,” DeepSeek now supports the Anthropic API format. This means developers can easily integrate V3.1’s power into the Claude Code framework.
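In practice, Anthropic-format support means you can point Anthropic-style tooling such as Claude Code at DeepSeek’s endpoint instead of Anthropic’s. A sketch of the environment configuration, assuming DeepSeek exposes the format at an `/anthropic` base path and that the tool reads the standard `ANTHROPIC_BASE_URL` / `ANTHROPIC_AUTH_TOKEN` variables (verify both against DeepSeek’s API docs):

```python
import os

def deepseek_anthropic_env(api_key: str) -> dict:
    """Env vars pointing Anthropic-format tools (e.g. Claude Code) at DeepSeek.
    The /anthropic base path is an assumption based on DeepSeek's announcement;
    confirm it in the current API documentation before relying on it."""
    return {
        "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
        "ANTHROPIC_AUTH_TOKEN": api_key,
    }

# os.environ.update(deepseek_anthropic_env("sk-..."))
# With these set, an Anthropic-format client launched from this process
# should route its requests to DeepSeek-V3.1 instead of Anthropic's API.
```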

The Takeaway
DeepSeek-V3.1 is a powerful iterative update that pushes the boundaries of AI agent capabilities and efficiency.
But the real story is the strategic alignment with future hardware.
By optimizing for China’s next-gen processors with DeepSeek’s UE8M0 FP8 precision, the company isn’t just building a model; it’s building a moat and signaling a future where Chinese AI runs best on Chinese silicon.
