Key Points
- DeepSeek’s Aggressive Pricing: DeepSeek-V4-Pro offers input tokens at ¥0.25 RMB ($0.035 USD) per million (cache hit), roughly 850x lower than the $30 USD per million that OpenAI charges for GPT-5.5 Pro input, disrupting an industry where others are raising prices.
- Technological Advantage: DeepSeek achieves these low prices through a Hybrid Attention Architecture (CSA, HCA, Sliding Window Attention) that dramatically reduces computational overhead, with V4-Pro using only 27% of the inference FLOPs and 10% of the KV cache of V3.2 in one-million-token contexts.
- Domestic Chip Integration: DeepSeek leverages Chinese domestic chips like Huawei’s Ascend and Cambricon, allowing for optimized performance and bypassing supply chain vulnerabilities, creating a strong ecosystem advantage.
- Industry Trend Reversal: While companies like Alibaba Cloud, Baidu AI Cloud, Tencent Cloud, and Zhipu AI are increasing prices due to surging demand and costs, DeepSeek is purposefully driving prices down, signaling a shift towards volume over margin.
- Broader Implications: This move highlights the growing importance of vertical integration, the maturing of Chinese domestic tech ecosystems, and could redefine the pricing floor for the entire AI industry.
Promotional Pricing at a Glance (per million tokens)
- Input (Cache Hit): ¥0.25 RMB ($0.035 USD)
- Input (Cache Miss): ¥3 RMB ($0.42 USD)
- Output: ¥6 RMB ($0.84 USD)
- Promotion Period: April 26 – May 5

The AI landscape just shifted.
While the rest of the industry is hiking prices, DeepSeek (Shendu Qiusuo 深度求索) just dropped a bombshell: a 75% discount on its latest model that’s making competitors look expensive.
We’re talking about a price war that could fundamentally change how companies think about AI infrastructure costs.
The Numbers That Matter: DeepSeek’s Aggressive Pricing Strategy
On April 26, DeepSeek announced limited-time promotional pricing for its DeepSeek-V4-Pro model API.
Here’s what you’re actually paying:
- Input (cache hit): ¥0.25 RMB ($0.035 USD) per million tokens
- Input (cache miss): ¥3 RMB ($0.42 USD) per million tokens
- Output: ¥6 RMB ($0.84 USD) per million tokens
- Promotion valid through: May 5
To put this in perspective—those prices are wild compared to what you’re paying elsewhere.
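To make the rates concrete, here is a minimal sketch that prices a workload at the promotional rates listed above. The function name and the example token counts are hypothetical; only the three per-million rates come from the announcement.

```python
# Promotional DeepSeek-V4-Pro rates from the list above, in RMB per million tokens.
RATES_RMB_PER_MILLION = {
    "input_cache_hit": 0.25,
    "input_cache_miss": 3.0,
    "output": 6.0,
}


def estimate_cost_rmb(hit_tokens, miss_tokens, output_tokens):
    """Estimate the RMB cost of a workload (hypothetical helper, not an official API)."""
    total = (
        hit_tokens * RATES_RMB_PER_MILLION["input_cache_hit"]
        + miss_tokens * RATES_RMB_PER_MILLION["input_cache_miss"]
        + output_tokens * RATES_RMB_PER_MILLION["output"]
    )
    return total / 1_000_000


# Assumed workload: 8M cached input tokens, 2M uncached input tokens, 1M output tokens.
cost = estimate_cost_rmb(8_000_000, 2_000_000, 1_000_000)
print(f"Estimated cost: {cost:.2f} RMB")  # Estimated cost: 14.00 RMB
```

Note how much the cache-hit share matters: in this assumed workload the 8M cached tokens cost ¥2, while the 2M uncached tokens cost ¥6.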

The Competitive Gap: Why DeepSeek is Disrupting the Market
Let’s do a side-by-side comparison of what the industry’s heaviest hitters are charging:
OpenAI’s Pricing (GPT Models)
- GPT-5.5 Pro: $30 USD (¥214.50 RMB) input | $180 USD (¥1,287 RMB) output
- GPT-5.5 (Standard): $5 USD (¥35.75 RMB) input | $30 USD (¥214.50 RMB) output
- GPT-5.4: Comparable to GPT-5.5 Pro rates
That means OpenAI’s GPT-5.5 Pro input cost ($30 USD per million tokens) is roughly 857x higher than DeepSeek V4 Pro’s discounted cache-hit rate ($0.035 USD per million tokens).
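The multiple can be checked directly from the listed prices. The sketch below uses only the USD figures quoted in this article; the helper name is hypothetical. It also shows the cache-miss comparison, since the headline multiple depends on which DeepSeek input rate is used as the baseline.

```python
# USD per million input tokens, as quoted in this article.
GPT_55_PRO_INPUT_USD = 30.0
DEEPSEEK_V4_PRO_CACHE_HIT_USD = 0.035
DEEPSEEK_V4_PRO_CACHE_MISS_USD = 0.42


def price_multiple(expensive, cheap):
    """How many times more expensive the first price is than the second."""
    return expensive / cheap


hit_multiple = price_multiple(GPT_55_PRO_INPUT_USD, DEEPSEEK_V4_PRO_CACHE_HIT_USD)
miss_multiple = price_multiple(GPT_55_PRO_INPUT_USD, DEEPSEEK_V4_PRO_CACHE_MISS_USD)

print(round(hit_multiple))   # 857
print(round(miss_multiple))  # 71
```

Against the cache-miss rate the gap is closer to 71x, which is still large but a very different number from the cache-hit comparison.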
Other Major Players
- Anthropic (Anzhuo Pike 安卓皮克) Claude Opus series: $12-25 USD output per million tokens
- Google (Gu Ge 谷歌) Gemini 3.1 Pro: $12-25 USD output per million tokens
Every single one of them sits significantly higher than DeepSeek’s adjusted rates.
This isn’t just competitive pricing—it’s a strategic move to grab market share.

The Industry Context: Everyone Else is Raising Prices (Except DeepSeek)
Here’s where it gets interesting.
While DeepSeek is dropping prices, the rest of the industry is doing the opposite.
This creates a fascinating dynamic: DeepSeek is doubling down on its “AI price reduction” philosophy precisely when computing costs are skyrocketing everywhere else.
The Cloud Infrastructure Price Hikes Timeline
April 13 — Alibaba Cloud (Ali Yun 阿里云)
Alibaba Cloud announced changes to DataWorks, its Big Data Development and Governance platform, starting April 14, 2026:
- Removed daily API call limits for Standard and Professional users
- Standard Edition now includes 100,000 free API calls monthly
- Professional Edition includes 500,000 free calls monthly
- Overage fees follow a pay-as-you-go model
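As a rough illustration of the free-quota-plus-overage model described above, the sketch below computes how many calls would be billable in a month. The monthly quotas come from the list above; the function names are hypothetical, and the per-call overage rate is deliberately left as a caller-supplied parameter because the article does not state it.

```python
# Free monthly API calls per DataWorks edition, as listed above.
FREE_CALLS_PER_MONTH = {
    "standard": 100_000,
    "professional": 500_000,
}


def billable_calls(edition, calls_this_month):
    """Calls beyond the free monthly quota (hypothetical helper)."""
    return max(0, calls_this_month - FREE_CALLS_PER_MONTH[edition])


def overage_fee(edition, calls_this_month, rate_per_call):
    """Pay-as-you-go fee; rate_per_call is an assumption supplied by the caller."""
    return billable_calls(edition, calls_this_month) * rate_per_call


print(billable_calls("standard", 130_000))      # 30000
print(billable_calls("professional", 130_000))  # 0
```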
March 18 — Baidu AI Cloud (Baidu Zhineng Yun 百度智能云)
Baidu issued an official notice attributing price increases to surging global AI demand and rising hardware/infrastructure costs:
- AI computing power prices increased by 5-30% starting April 18
- Parallel file storage services increased by approximately 30%
- The company framed this as necessary to ensure “long-term stability and service quality”
March 11 & April 9 — Tencent Cloud (Tengxun Yun 腾讯云)
Tencent announced two consecutive price increases in 2026 alone:
- First increase on March 11 for certain models
- Second increase announced April 9, effective May 9, 2026
- Affected products: AI computing power, container services, and Elastic MapReduce (EMR)
- Justification: “Surging global demand and supply chain costs”
The pattern is clear: infrastructure providers are raising prices due to supply chain pressures and demand.
DeepSeek’s move is essentially a middle finger to that trend.

The Model Layer: Downstream Providers Are Also Hiking
It’s not just infrastructure—the model companies themselves are getting more expensive too.
Zhipu AI (Zhipu Huazhang 智谱华章), a major domestic large model manufacturer, has raised prices three times already in 2026:
Zhipu’s Price Increase Roadmap
February 12 — GLM Coding Plan Restructuring
- Overall price increase starting at 30%
- Rationale: “Sustained strong growth in market demand and rapid increase in user scale and call volume”
March 16 — GLM-5-Turbo Release
- New model optimized for the “OpenClaw” (Longxia 龙虾) agent scenario
- API price increase of 20%
April 8 — GLM-5.1 Official Release
- Another 10% price increase
- The cumulative effect: cache hit token pricing for GLM-5.1 in coding scenarios now approaches Anthropic’s Claude Sonnet 4.6 levels
So Zhipu’s pricing trajectory is essentially: up, up, and up again.
Meanwhile, DeepSeek is going down.
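For a sense of scale, the sketch below compounds the three increases. This rests on a strong assumption that the article does not confirm: that all three increases (30%, 20%, 10%) applied to the same product and stacked on one another. They were announced for different plans and models, so treat the result as an upper-bound illustration rather than Zhipu's actual price change.

```python
# Increases announced on February 12, March 16, and April 8 (as fractions).
INCREASES = [0.30, 0.20, 0.10]


def compounded_multiplier(increases):
    """Price multiplier if every increase stacked on the same base price."""
    multiplier = 1.0
    for increase in increases:
        multiplier *= 1.0 + increase
    return multiplier


multiplier = compounded_multiplier(INCREASES)
print(f"{multiplier:.3f}x the original price")        # 1.716x the original price
print(f"{(multiplier - 1) * 100:.1f}% total increase")  # 71.6% total increase
```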

The Technical Edge: Why DeepSeek Can Afford Lower Prices
This price war isn’t just marketing—it’s backed by legitimate technological advantages.
DeepSeek’s innovation centers on a Hybrid Attention Architecture that dramatically reduces computational overhead.
The Architecture Breakdown
DeepSeek V4 uses two alternating compressed attention mechanisms, plus a local sliding-window branch:
- Compressed Sparse Attention (CSA): Handles fine-grained, medium-range information
- Heavy Compression Attention (HCA): Manages coarse-grained, ultra-long-range information
- Sliding Window Attention: A local branch in each layer that focuses on the most recent 128 tokens, preserving fine details that might get lost in compression
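The local branch is the easiest of the three to picture. Below is a minimal sketch of a causal sliding-window attention mask, assuming each token may attend to itself and the tokens immediately before it, up to the window size. This is a generic illustration of the technique, not DeepSeek's implementation, and it does not model CSA or HCA; the function name is hypothetical.

```python
def sliding_window_mask(seq_len, window=128):
    """Return mask[i][j] == True when query position i may attend to key position j.

    Assumes a causal window that includes the current token, so position i
    sees positions max(0, i - window + 1) through i.
    """
    return [
        [max(0, i - window + 1) <= j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Small example so the pattern is visible: 6 tokens, window of 3.
mask = sliding_window_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# x.....
# xx....
# xxx...
# .xxx..
# ..xxx.
# ...xxx
```

Each row has at most `window` allowed positions regardless of sequence length, which is why the cost of this branch stays flat per token as the context grows.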
The result?
Massive efficiency gains in ultra-long context scenarios (one million tokens):
Computational Efficiency Comparison (V3.2 vs. V4)
- V4-Pro: Uses only 27% of inference computation (FLOPs) and 10% of KV cache (“working memory”)
- V4-Flash: Even more aggressive—10% inference computation and 7% KV cache
Translation: DeepSeek can process more tokens with fewer resources.
Lower computational costs = ability to offer lower prices while maintaining margins.
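The percentages above can be inverted to see the implied headroom. The sketch below uses only the four figures quoted in this section (V4 relative to V3.2 at one-million-token context). Reading the reciprocal as a capacity gain assumes that compute or KV cache is the binding constraint, which is a simplification; real serving cost also depends on hardware, batching, and utilization.

```python
# Resource use of V4 relative to V3.2 at one-million-token context, as quoted above.
RELATIVE_USAGE = {
    "V4-Pro": {"flops": 0.27, "kv_cache": 0.10},
    "V4-Flash": {"flops": 0.10, "kv_cache": 0.07},
}


def reduction_factor(fraction):
    """Turn 'uses X of the resource' into 'needs 1/X times less of it'."""
    return 1.0 / fraction


for model, usage in RELATIVE_USAGE.items():
    flops = reduction_factor(usage["flops"])
    kv = reduction_factor(usage["kv_cache"])
    print(f"{model}: ~{flops:.1f}x less compute, ~{kv:.1f}x less KV cache")
# V4-Pro: ~3.7x less compute, ~10.0x less KV cache
# V4-Flash: ~10.0x less compute, ~14.3x less KV cache
```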

The Domestic Chip Angle: DeepSeek’s Infrastructure Advantage
Here’s another strategic piece of the puzzle: DeepSeek’s integration with Chinese domestic chips.
This is huge because it bypasses potential supply chain vulnerabilities and builds an entire ecosystem around homegrown technology.
Huawei’s Support
Huawei (Huawei 华为) Computing announced that its Ascend (Shengteng 昇腾) ultra-node products fully support DeepSeek V4 through close collaboration between chip and model developers.
The Ascend 950, specifically, is optimized for DeepSeek through:
- Kernel fusion and multi-stream parallel technology to reduce Attention computation
- Decreased memory access overhead
- Quantization algorithms enabling high-throughput, low-latency deployment
The Ascend A3 ultra-node series has also been fully adapted, along with training reference implementations.
Cambricon’s Day 0 Support
Cambricon (Hanwuji 寒武纪) also announced “Day 0” adaptation for both DeepSeek-V4-Flash and DeepSeek-V4-Pro based on the vLLM inference framework, with the code open-sourced on GitHub.
This ecosystem approach—where chip manufacturers and model developers move in lockstep—creates competitive advantages that aren’t available to companies relying on external hardware.

What This Means For You: The Broader Implications
This price war signals something fundamental shifting in the AI industry:
1. Volume over margin is becoming the play.
DeepSeek is betting on capturing massive volume at razor-thin margins, which forces competitors to either match prices or risk losing customers.
2. Vertical integration wins.
Companies that control their entire stack—chips, software, infrastructure—can outprice those relying on third-party components.
3. Domestic tech ecosystems are maturing.
Chinese chip makers and model developers are demonstrating that they can compete globally, which is reshaping geopolitical tech competition.
4. Watch what others do next.
If other model providers are forced to match DeepSeek’s pricing, it validates a new pricing floor for the industry.
If they don’t, it suggests DeepSeek has found a temporary advantage it can exploit before margins compress across the board.

The Bottom Line
DeepSeek’s cache-hit input price of ¥0.25 RMB ($0.035 USD) per million tokens is more than a promotional stunt; it’s a calculated move that reveals:
- Superior efficiency through innovative architecture
- Ecosystem advantages via domestic chip integration
- Strategic positioning in a price-hiking industry
- A willingness to fight for market dominance through aggressive unit economics
Whether this pricing holds post-promotion or becomes the new industry standard remains to be seen.
But one thing’s certain: the AI price war is real, and DeepSeek just made it impossible to ignore.




