- Supply Certainty: Ensuring a stable flow of hardware regardless of geopolitical shifts or export controls.
- Bargaining Power: Using multiple suppliers to create competition and gain leverage during negotiations.
- Cost Optimization: Selecting different chips for specific workloads (Training vs. Inference) to lower the Total Cost of Ownership (TCO).
- Regulatory Compliance: Meeting “Xinchuang” (Information Technology Application Innovation) requirements for domestic substitution in critical sectors.

The ByteDance-Iluvatar CoreX Deal: What’s Actually Happening
Let’s break down what’s on the table.
The chips under discussion are primarily designed for inference workloads, specifically the Iluvatar CoreX “Zhikai” (Zhikai 智铠) series of cloud-based inference GPUs.
For context: inference is when an AI model processes queries and generates responses. It’s different from training, which is how models learn in the first place.
Here’s the strategic layer that matters:
- Training workloads: ByteDance uses Huawei (Huawei 华为) Ascend and Cambricon (Hanwuji 寒武纪) chips
- Inference workloads: ByteDance would now add the Iluvatar CoreX Zhikai series to the mix
If this deal closes, Iluvatar CoreX becomes ByteDance’s third GPU supplier.
Neither ByteDance nor Iluvatar CoreX has officially commented as of press time, but the pattern is unmistakable.

Why This Matters: The Shift From Training to Inference
This isn’t just about filling a supply gap.
What we’re witnessing is a fundamental structural shift in how the entire industry thinks about computing power.
Three things are happening simultaneously:
- Demand for AI computing power is undergoing a total transformation
- Companies are pursuing independent and controllable computing strategies
- The separation of training and inference hardware is entering large-scale implementation
ByteDance’s approach is the clearest blueprint:
- Huawei Ascend focuses on cluster training and ultra-large-scale model pre-training
- Cambricon handles mid-to-high-end inference and private deployments for specific industries
- Iluvatar CoreX Zhikai would be the main supplier for high-traffic online inference
Translation: different chips for different jobs, all under one company’s control.

The Industry-Wide Computing Power Arms Race
ByteDance isn’t alone in this aggressive push.
Every major Chinese tech company is laying down serious capital for independent data center infrastructure.
Baidu (Baidu 百度): Deploying AI computing clusters at the 10,000-card scale nationwide.
Alibaba (Alibaba 阿里巴巴): Capital expenditures exceeded ¥38 billion RMB ($5.32 billion USD) in a single quarter of fiscal year 2026, with an estimated ¥380 billion RMB ($53.2 billion USD) earmarked over the next three years for cloud and intelligent computing hardware.
Tencent (Tengxun 腾讯): Constructing “Tencent Cloud (Tengxun Yun 腾讯云) HCC” high-performance AI clusters in multiple locations, with plans to bring in domestic computing power on a large scale in the second half of 2026 to support Hunyuan (Hunyuan 混元) large model MaaS, game AIGC, and video generation services.
This isn’t panic spending.
It’s strategic positioning.

Why Domestic Chips Now? The Real Reason
According to Li Yuxuan (Li Yuxuan 李宇轩), head of AI Infra technology at ModelBest (Mianbi Zhineng 面壁智能), tech giants are building multi-supplier systems based on three core factors:
- Supply certainty: Control over your own destiny
- Bargaining power: Multiple vendors = leverage
- Cost structures: Playing suppliers against each other
Here’s the critical insight: in large corporate environments, the volume of inference far exceeds the volume of training.
And here’s the key breakthrough: domestic chips have reached “usable” status for inference work.
The requirements for inference chips (interconnectivity, memory bandwidth, ecosystem maturity) are lower than what’s needed for training.
This means multiple domestic suppliers are now a viable engineering choice, not just a strategic concept.
There’s also the compliance angle: the “Information Technology Application Innovation” (Xinchuang 信创) initiative makes domestic substitution mandatory for work with government bodies, state-owned enterprises (SOEs), and central enterprises.
According to Xie Siyuan (Xie Siyuan 谢思远), Managing Director of Yijing Capital (Yijing Ziben 沂景资本), because internet firms serve major SOE clients and critical industries, they prioritize domestic capabilities during infrastructure builds.
Parallel technical routes also solve a real problem: companies aren’t tethered to a single vendor’s hardware or ecosystem iteration pace.
But obstacles remain.
Ecosystem matching between tech giants and domestic chip makers is still being worked out, and both sides are in active adjustment mode.
Lu Qiang (Lu Qiang 卢强), Senior Vice President of TsingMicro (Qingwei Zhineng 清微智能), laid out the real motivations: the core reason for diversifying the supply chain isn’t just domestic substitution, but a combination of demand, supply safety, and multi-dimensional security.
The demand for large model inference is exploding.
The supply of high-end overseas chips remains uncertain due to export controls.
And crucially: as domestic chips improve in cost-performance ratios, delivery controllability, and localized service, they’ve transitioned from Proof of Concept (PoC) verification to large-scale deployment.

The New Competition Metric: Token Cost
From a market economy perspective, this massive investment in domestic data centers is driven by real, growing demand for AI computing power.
Wang Zhan (Wang Zhan 王湛), Co-CEO of Sunrise (Xiwang 曦望), made a crucial observation:
As Chinese large models entered an application explosion in 2026, and models like DeepSeek V4 triggered surges in the Token market and rapid popularization of AI Agents, the key to industry competition has become “whose Token cost is lower.”
Domestic inference chips are showing strong performance-per-watt and cost-effectiveness in specific scenarios:
- High-concurrency inference clusters: Search recommendations, intelligent customer service, short-video multimodal generation
- MoE (Mixture of Experts) architectures: As MoE becomes popular, demand for computing scheduling and local inference capabilities has skyrocketed
These applications consume trillions of Tokens daily.
They’re currently the largest buyers of domestic inference chips.
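To make the "whose Token cost is lower" comparison concrete, here is a minimal sketch of how a per-token cost could be worked out for a single accelerator. The function name, the inputs, and every number below are hypothetical and chosen only for illustration; none of them come from the companies or chips discussed here.

```python
def cost_per_million_tokens(hourly_cost, tokens_per_second, utilization=1.0):
    """Cost of serving one million tokens on a single accelerator.

    hourly_cost       -- all-in hourly cost of the card (any currency)
    tokens_per_second -- sustained throughput at full load
    utilization       -- fraction of each hour spent serving traffic, in (0, 1]
    """
    if hourly_cost < 0 or tokens_per_second <= 0 or not 0 < utilization <= 1:
        raise ValueError("cost must be >= 0, throughput > 0, utilization in (0, 1]")
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return hourly_cost / tokens_per_hour * 1_000_000


# Hypothetical figures, chosen only to illustrate the comparison.
fast_card = cost_per_million_tokens(hourly_cost=12.0, tokens_per_second=2000, utilization=0.5)
cheap_card = cost_per_million_tokens(hourly_cost=6.0, tokens_per_second=1200, utilization=0.5)

print(f"fast card:  {fast_card:.2f} per million tokens")
print(f"cheap card: {cheap_card:.2f} per million tokens")
```

With these made-up inputs the slower but cheaper card comes out ahead (about 2.78 versus 3.33 per million tokens), which is the kind of trade-off a Token-cost comparison is meant to surface.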

The Inference Boom: Market Size and Projections
The numbers back up the urgency.
According to a Global AI Inference Chips report by China Insights Consultancy (Zhuoshi Zixun 灼识咨询), the AI chip industry is undergoing a fundamental shift from training-centric to inference-centric.
Demand for AI inference chips is expected to grow dramatically:
- By 2030: Global AI inference chip market projected to reach ¥3.0696 trillion RMB ($429.7 billion USD)
- China’s domestic market by 2030: Expected to reach ¥1.1664 trillion RMB ($163.3 billion USD)
That’s not a typo: China alone would account for more than a third of the global market, roughly 38%.
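The share implied by the two 2030 projections above can be checked with a few lines of arithmetic. This is only a sanity check on the figures as quoted; the variable names are illustrative, and the exchange rate is simply what the RMB and USD figures imply, not an official rate.

```python
# 2030 projections as quoted above, in trillions of RMB and billions of USD.
global_rmb_trillion = 3.0696
china_rmb_trillion = 1.1664
global_usd_billion = 429.7

# China's share of the projected global inference chip market.
china_share = china_rmb_trillion / global_rmb_trillion

# RMB-per-USD rate implied by the global figure (trillions -> billions first).
implied_rate = global_rmb_trillion * 1000 / global_usd_billion

print(f"China share of global market: {china_share:.1%}")
print(f"Implied RMB per USD: {implied_rate:.2f}")
```

The ratio works out to about 38%, and the conversion implied by the quoted figures is about 7.14 RMB to the dollar.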
Liu Hua (Liu Hua 刘华), Deputy General Manager of the Emerging Business Department at UCloud (Youkede 优刻得), predicted demand for office scenarios and business AI integration will continue to grow, keeping demand high for the next 3 to 5 years.

The Scramble for Cards: Supply Crunch Reality
The surge in inference demand has created a real problem: a temporary supply-demand imbalance.
Wang Zhan observed a “scramble for cards” in the market.
Leading enterprises are racing to grab GPU cards, purchase memory, rent data centers, and expand inference capacity.
The pressure is real:
- In Q1 this year, computing power rental costs rose by roughly 30% to 40%
- For the full year, AI inference demand is estimated to be 4 to 5 times that of training
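A "times training" multiple is easier to read as a share of total demand. The sketch below does that conversion, assuming inference and training are measured in the same units and together make up all AI computing demand; that assumption and the function name are ours, not the source's.

```python
def inference_share(ratio):
    """Share of combined demand taken by inference, given inference = ratio x training."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return ratio / (1 + ratio)


low = inference_share(4)   # inference at 4x training
high = inference_share(5)  # inference at 5x training

print(f"Inference share of combined demand: {low:.1%} to {high:.1%}")
```

Under that assumption, a 4x to 5x multiple means inference would account for roughly 80% to 83% of combined demand.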
Lu Qiang confirmed shortages exist across multiple areas:
- High-end AI accelerator cards
- HBM (High Bandwidth Memory)
- Advanced packaging
- Complete servers
Delivery cycles are lengthening.
Finished server prices are fluctuating upward based on supply and memory costs.
Xie Siyuan predicted that as competition intensifies in the second half of the year, homogenization will become more apparent.
A price war focused on Token costs is expected, which may actually drive overall prices downward.

Why Inference Dominates Energy Consumption
The energy picture tells another part of this story.
Based on research from The Hong Kong Polytechnic University (Xianggang Ligong Daxue 香港理工大学), inference energy consumption already accounts for 60%–90% of total AI energy use in ultra-large-scale cloud settings.
Why?
High-frequency requests from billions of users.
The Chinese Academy of Engineering (Zhongguo Gongchengyuan 中国工程院) added more context: in Q1 2026, China’s inference demand reached eight times the demand for training.
This structural change has massive implications for data center operators.
A research report from BOCOM International (Jiaoyin Guoji 交银国际) concluded that for data center operators, high-density, low-latency computing power for large-scale inference tasks will be the main driver of growth.
With imminent expansion of domestic GPU production and rolling release of orders from hyper-scale cloud vendors, the pace of project implementation in the second half of 2026 is expected to accelerate further.

The Domestic Ecosystem Playing Catch-Up
Let’s be real about where things stand.
Domestic cards are not yet available in massive quantities due to production capacity and adaptation issues.
But the volume is gradually increasing.
Liu Hua acknowledged short-term pressure exists.
However, in the medium to long term, pressure on the supply of Nvidia (Yingweida 英伟达) hardware will promote the growth of domestic computing power.
Constrained by geopolitical export controls, domestic industries still need to procure high-end Nvidia computing power as a supplement in the short term.
But a domestic computing ecosystem with substitution capabilities has entered the stage of large-scale construction.

The Long Game: In-House Research and Strategic Independence
Chinese internet firms are pursuing a three-layer strategy:
- Short term: Procurement to solve immediate needs
- Medium term: Multi-vendor domestic strategies to mitigate risk and costs
- Long term: In-house research to retain profits
This follows the logic of Google’s TPU strategy.
The core goal of self-developed chips isn’t to sell them to others.
It’s to escape a passive strategic position.
Here’s why a multi-supplier system makes practical sense:
Large tech firms have diverse internal businesses and rich scenarios.
Their requirements for training, fine-tuning, high-concurrency inference, and image processing differ significantly.
A multi-supplier system allows optimization:
- Use “Chip A” for large training
- Use “Chip B” for long-text inference
- Use “Chip C” for lightweight edge computing (Bianyuan Jisuan 边缘计算)
This optimizes the Total Cost of Ownership (TCO).
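The "Chip A / Chip B / Chip C" idea can be sketched as a simple cost comparison: route each workload to whichever chip serves it most cheaply, then compare the total with running everything on one chip. All cost and demand figures below are hypothetical placeholder units, and the workload and chip names are only labels for illustration.

```python
# Hypothetical cost per unit of work for each (workload, chip) pair.
COST = {
    "large_training":      {"chip_a": 10, "chip_b": 16, "chip_c": 40},
    "long_text_inference": {"chip_a": 14, "chip_b": 9,  "chip_c": 25},
    "edge_computing":      {"chip_a": 20, "chip_b": 12, "chip_c": 5},
}

# Hypothetical units of work demanded per workload.
DEMAND = {"large_training": 100, "long_text_inference": 300, "edge_computing": 50}


def cheapest_assignment(cost):
    """Pick the lowest-cost chip for each workload independently."""
    return {workload: min(chips, key=chips.get) for workload, chips in cost.items()}


def total_cost(assignment, cost, demand):
    """Total cost of serving all demand under a workload -> chip assignment."""
    return sum(cost[w][chip] * demand[w] for w, chip in assignment.items())


mixed = cheapest_assignment(COST)
mixed_total = total_cost(mixed, COST, DEMAND)

# Baseline: everything runs on a single chip.
single_totals = {
    chip: total_cost({w: chip for w in COST}, COST, DEMAND)
    for chip in ("chip_a", "chip_b", "chip_c")
}

print("mixed assignment:", mixed, "total:", mixed_total)
print("single-supplier totals:", single_totals)
```

With these placeholder numbers the mixed assignment costs 3950 units against 4900 for the best single chip. A real TCO model would also have to account for software adaptation and the overhead of operating several hardware stacks, which this sketch ignores.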
Xie Siyuan emphasized: the core consideration for building a multi-supplier system is cost-effectiveness.
In the context of restricted supply, reducing dependence on a single vendor is a necessary risk management tool.
However, the factor that truly determines procurement scale remains economic performance after actual deployment.

Traditional Computing Centers vs. Commercial Players
There’s an important distinction worth noting.
Traditional Intelligent Computing Centers have taken on the role of nurturing the industry and building infrastructure.
Their contribution to the revenue growth of domestic chip companies has been more significant than their actual utilization rates.
The construction of computing networks by internet cloud vendors, however, remains market-driven commercial behavior.
That’s the difference: infrastructure plays versus real-world usage.

The Pivotal Moment: From Verification to Large-Scale Usage
The industry is at an inflection point.
We’re moving from “usability verification” to “large-scale usage.”
In the past: customers looked at single-card metrics.
Now: customers look at the stability of 1,000-card or 10,000-card clusters and the unit cost.
Large orders will significantly improve a manufacturer’s:
- Revenue
- Cash flow
- Bargaining power
But the industry landscape won’t be decided by one or two orders alone.
It will ultimately be determined by:
- Product iteration
- Ecosystem support
- Delivery reliability
- Customer retention
The large-scale data center layouts by internet giants remain a critical milestone in the computing power market.
What happens next will reshape the entire global AI infrastructure landscape.






