Key Points
- Global AI token consumption reached 28.9 trillion tokens for the week ending May 24, 2026, marking a 7.4% week-over-week increase and five consecutive weeks of growth.
- Chinese AI models are now leading global token consumption, surpassing U.S. models for four consecutive weeks, with 9.22 trillion tokens per week compared to 4.93 trillion for U.S. models.
- DeepSeek-V4-Flash is the top-ranked global AI model by call volume on OpenRouter, an overseas-focused platform, suggesting potentially larger domestic Chinese market figures.
- China’s average daily token call volume exceeded 140 trillion in March 2026, with platforms like Doubao seeing usage double to 120 trillion tokens daily within three months.
- The emergence of “Token Factories” by major Chinese telecommunication companies signifies a shift to token-based consumption billing, commoditizing AI compute and decoupling revenue from time-based rental.

The world’s appetite for AI tokens just hit a new high.
We’re watching something remarkable unfold in real-time: token consumption is accelerating at a pace that’s reshaping the entire computing power industry.
And it’s not just a blip on the radar—this is the fifth consecutive week of rising call volumes, signaling that we’re entering a sustained period of accelerated growth across AI infrastructure.
—
The Numbers Tell the Story: Global Token Consumption Explodes
Let’s start with the raw data, because it’s genuinely impressive.
According to OpenRouter (the AI model aggregation and call platform that tracks this stuff with transparency), global token consumption reached 28.9 trillion tokens for the week ending May 24, 2026.
That’s a 7.4% week-over-week increase.
But here’s what matters more: this isn’t a one-time spike.
We’re looking at five consecutive weeks of growth, which tells us demand for model interactions is being released at a genuinely rapid pace.
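To put the 7.4% figure in absolute terms, here’s a minimal sketch that backs out the prior week’s volume from the two numbers quoted above. The function name is just illustrative, and the result is an implied estimate (roughly 26.9 trillion tokens), not a figure OpenRouter reported directly.

```python
def previous_week_volume(current: float, wow_growth: float) -> float:
    """Back out last week's volume from this week's volume and WoW growth."""
    return current / (1.0 + wow_growth)


current_trillions = 28.9  # week ending May 24, 2026 (from the article)
wow_growth = 0.074        # 7.4% week-over-week (from the article)

prior = previous_week_volume(current_trillions, wow_growth)
added = current_trillions - prior

print(f"Implied prior week: {prior:.1f} trillion tokens")
print(f"Implied weekly increase: {added:.1f} trillion tokens")
```

In other words, a single week added about 2 trillion tokens of demand.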
The shift is driven by two major forces:
- AI Agents are moving from experimental to practical
- Advanced applications are multiplying token usage across industries
—

China’s AI Models Are Now Leading Global Token Rankings
Here’s the plot twist nobody saw coming (or maybe everyone did): Chinese AI models are now consuming more tokens globally than U.S. models.
For the same week, here’s the breakdown:
- Chinese AI models: 9.22 trillion tokens per week (up 19.89% month-over-month)
- U.S. AI models: 4.93 trillion tokens per week (up 16.27% month-over-month)
And it’s not even close anymore.
Chinese models have now surpassed U.S. models for four consecutive weeks, putting China firmly ahead in token consumption on OpenRouter’s rankings.
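For a sense of scale, here’s a quick sketch comparing the two weekly figures with each other and with the 28.9 trillion global total. One assumption to flag: it treats the country figures and the global total as measured on the same basis, which the source doesn’t explicitly confirm.

```python
# Weekly token volumes in trillions, as quoted in the article.
china = 9.22
us = 4.93
global_total = 28.9  # assumed to be on the same basis as the figures above

ratio = china / us
china_share = china / global_total
us_share = us / global_total

print(f"China vs. U.S.: {ratio:.2f}x")
print(f"China share of global total: {china_share:.1%}")
print(f"U.S. share of global total: {us_share:.1%}")
```

On these numbers, Chinese models handle close to twice the U.S. volume and a bit under a third of the global total.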
DeepSeek-V4-Flash Takes the Crown
At the top of the OpenRouter global AI model call rankings sits DeepSeek-V4-Flash.
DeepSeek (Shendu Qiusuo 深度求索) has dominated the conversation around efficient, high-performance AI models, and the data backs it up.
What’s wild is that OpenRouter’s user base is primarily overseas, with Chinese developers accounting for only about 6% of its traffic.
Translation: the figures we’re seeing are likely just scratching the surface.
The domestic Chinese market is almost certainly even larger than these global rankings suggest.
—

Inside China’s AI Token Consumption Boom
If global numbers are impressive, what’s happening inside China is downright explosive.
According to the National Bureau of Statistics (Guojia Tongjiju 国家统计局), China’s average daily token call volume exceeded 140 trillion in March 2026 alone.
Let that sink in.
That’s just one country, one month.
Doubao’s Meteoric Rise
ByteDance’s (Zijie Tiaodong 字节跳动) Doubao (豆包) is showing just how fast these platforms can scale.
Doubao’s daily usage doubled to 120 trillion tokens within just three months.
That’s not gradual adoption—that’s vertical hockey-stick growth.
—

The Multiplier Effect: How AI Agents Are About to Reshape Token Consumption
Here’s where things get really interesting for investors and founders.
CICC (Zhongjin Gongsi 中金公司), one of China’s leading investment banks, is projecting something significant about the relationship between AI Agents and token consumption.
According to their estimates in a medium-use scenario:
Once AI Agent penetration reaches 8%, the total token consumption from Agents alone will equal the token consumption of traditional chatbots.
But it doesn’t stop there.
The real multiplier effect kicks in as three factors converge:
- Task complexity increases
- Usage duration extends
- Penetration rates climb
When all three move together, CICC projects that daily token consumption could grow by more than five times.
This isn’t just incremental growth.
This is transformational.
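One way to read the 8% break-even claim is as an implied per-user multiplier. The sketch below is a simplification of ours, not CICC’s model: it assumes penetration is measured against the same user base as chatbots and that every user in that base uses a chatbot at the same average rate.

```python
def breakeven_multiplier(agent_penetration: float) -> float:
    """Per-user token multiplier at which agent usage equals chatbot usage.

    Simplified model (our assumption, not CICC's):
      chatbot_tokens = users * t
      agent_tokens   = agent_penetration * users * m * t
    Setting the two equal gives m = 1 / agent_penetration.
    """
    if not 0.0 < agent_penetration <= 1.0:
        raise ValueError("agent_penetration must be in (0, 1]")
    return 1.0 / agent_penetration


multiplier = breakeven_multiplier(0.08)  # 8% penetration, from the article
print(f"Implied tokens per agent user vs. chatbot user: {multiplier:.1f}x")
```

Under those assumptions, each agent user would burn on the order of 12 to 13 times the tokens of a chatbot user, which is why even small gains in penetration move total consumption so sharply.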
—

The Rise of “Token Factories”: How Telecom Giants Are Monetizing AI Compute
As consumption accelerates, a new business model is crystallizing: the “Token Factory.”
Major telecommunications companies in China have recognized the opportunity and are launching standardized token services at scale.
This is significant because it represents the shift from AI compute as a specialty service to AI compute as a commoditized utility.
Here’s What the Big Three Telcos Are Doing Right Now
China Mobile (Zhongguo Yidong 中国移动)
- Launched a Token computing power service for individual users on April 21
- Supports DeepSeek and Qwen (Tongyi Qianwen 通义千问) models
- Entry-level packages start as low as ¥5.99 RMB ($0.83 USD)
China Telecom (Zhongguo Dianxin 中国电信)
- Officially launched trial commercial Token packages on May 17
- Entry-level version for small and medium-sized enterprises priced at ¥39.9 RMB ($5.51 USD) per month
- Currently bidding on a “Token Factory” generation capability service procurement project
China Unicom (Zhongguo Liantong 中国联通)
- Shanghai branch announced Token services for Shanghai OPC customers on May 16
What’s important to notice: all three are moving simultaneously.
This isn’t random—it’s a signal that the market has reached escape velocity.
—

The Bigger Shift: From Computing Power to Token-Based Billing
- Old Model: Bare Metal Rental – Fixed monthly costs for server access regardless of actual usage.
- New Model: Token-Based Consumption – Billing based on actual units of data processed, aligning cost with demand.
- Economic Impact – Revenue decoupling from time-based rental; direct capture of AI demand growth.
- Primary Beneficiaries – High-end chip resource owners and providers with significant scale elasticity.
Behind all this activity, something structural is happening to the computing power industry itself.
Token Factories represent a fundamental shift in how computing power gets valued and priced.
According to CITIC Securities (Zhongxin Zhengquan 中信证券), this marks a transition from one model to another entirely:
Old model: Bare metal server rental — you pay by the month for hardware access, regardless of actual usage.
New model: Token-based consumption billing — you pay for what you actually use, measured in tokens processed.
This is huge for service providers because it decouples revenue from time-based rental and ties it directly to demand growth.
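To make the decoupling concrete, here’s a toy comparison of the two billing models. Every number in it (the flat fee, the per-token price, the demand curve) is hypothetical and chosen only for illustration; none of it comes from CITIC Securities or any provider’s price list.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    monthly_rental: float            # hypothetical flat fee per month
    price_per_million_tokens: float  # hypothetical per-token price


def rental_revenue(pricing: Pricing, months: int) -> list[float]:
    """Bare metal rental: the same fee every month, whatever the usage."""
    return [pricing.monthly_rental] * months


def token_revenue(pricing: Pricing, demand_millions: list[float]) -> list[float]:
    """Token billing: revenue follows the tokens actually processed."""
    return [d * pricing.price_per_million_tokens for d in demand_millions]


pricing = Pricing(monthly_rental=300.0, price_per_million_tokens=2.0)
demand = [100.0, 200.0, 500.0]  # hypothetical monthly demand, millions of tokens

flat = rental_revenue(pricing, len(demand))
metered = token_revenue(pricing, demand)

for month, (f, m) in enumerate(zip(flat, metered), start=1):
    print(f"Month {month}: rental={f:.0f}, token-based={m:.0f}")
```

In this toy setup, demand grows fivefold and token-based revenue grows with it, while rental revenue stays flat. The flip side is that metered revenue also starts lower when demand is light.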
Who Wins in This New Reality?
CITIC Securities is bullish on leading computing power providers with significant high-end chip resources.
Here’s why:
- They can fully capture the dividends of expanding token demand
- They benefit from high penetration across all AI application scenarios
- They show the most growth elasticity under token-driven expansion
The current supply-demand mismatch in the domestic market is creating a favorable environment for companies with the right infrastructure in place.
—

What This Means for Founders, Investors, and Builders
If you’re building in the AI space, here’s what’s actually happening:
Token consumption is transitioning from an edge-case concern to an existential business driver.
The emergence of Token Factories means:
- Infrastructure providers have a new monetization layer
- Model developers need to optimize for token efficiency
- Applications that can run with fewer tokens gain a competitive advantage
- The AI compute market is consolidating around clear winners with scale
For investors, the math is straightforward: if token consumption grows by 5x as agents penetrate, and if leading providers capture that growth through better infrastructure, then computing power providers with scale look exceptionally well-positioned.
And for founders, the message is clear: token efficiency matters more than ever.
Building lean matters.
Optimizing for consumption matters.
Because when compute becomes a commodity and billing moves to tokens, the winners will be the ones who figure out how to do more with less.
—

References
- Global Token Consumption Continues to Climb: DeepSeek Models Top the Call Rankings – CLS (Cailianshe 财联社)
- Model Rankings and Usage Statistics – OpenRouter
- DeepSeek Official Website – DeepSeek (Shendu Qiusuo 深度求索)
- National Bureau of Statistics of China Official Website – National Bureau of Statistics (Guojia Tongjiju 国家统计局)





