Not the token in cryptocurrency, but the token in AI: every word in every answer a large model produces is assembled from tokens.
NVIDIA's Jensen Huang directly stated at the 2026 GTC conference: “Tokens are the new commodity!”
Data centers are now token factories: put in electricity + data, spit out tokens, make money!
He also provided a simple formula:
Income = number of tokens per watt × available gigawatts
What it means: under a fixed electricity cap, whoever squeezes more tokens out of each watt makes the money. No matter how powerful the hardware, it cannot escape the laws of physics: a 1-gigawatt factory will never turn into a 2-gigawatt one.
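Huang's back-of-envelope formula can be sketched in a few lines. Note the efficiency figure and the per-million-token price below are hypothetical placeholders, chosen only to show how the two levers interact under a fixed power cap:

```python
def daily_revenue(tokens_per_joule: float,
                  power_gw: float,
                  price_per_million_tokens: float) -> float:
    """Hypothetical 'token factory' revenue model.

    tokens_per_joule         -- efficiency: tokens produced per joule (assumed)
    power_gw                 -- available power in gigawatts (the hard physical cap)
    price_per_million_tokens -- selling price in dollars per 1M tokens (assumed)
    """
    joules_per_day = power_gw * 1e9 * 86_400        # watts * seconds in a day
    tokens_per_day = tokens_per_joule * joules_per_day
    return tokens_per_day / 1e6 * price_per_million_tokens

# With power fixed at 1 GW, doubling efficiency doubles revenue:
base = daily_revenue(tokens_per_joule=0.5, power_gw=1.0, price_per_million_tokens=0.2)
fast = daily_revenue(tokens_per_joule=1.0, power_gw=1.0, price_per_million_tokens=0.2)
print(fast / base)  # -> 2.0
```

The point of the sketch: power is the fixed factor, so revenue scales linearly with tokens-per-watt efficiency.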
Liu Liehong, director of the National Data Bureau of China, also recently stated: As of March this year, China's daily average token call volume has exceeded 140 trillion!
Compared to 100 billion at the beginning of 2024, it has grown more than 1000 times; compared to 100 trillion at the end of 2025, it has increased by over 40% in three months.
This thing is no longer science fiction; it is becoming a real new industrial chain.
What exactly is the token economy?
In simple terms: the smallest unit of information processing for AI large models is a token.
When you ask AI a question, it first breaks your words into a bunch of tokens, and after processing, it stitches them back into a sentence for you.
Every time a token is generated, it consumes GPU computing power and electricity.
So the token is a natural unit of measurement, just like the kilowatt-hour for electricity or the barrel for oil.
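The splitting step can be illustrated with a toy tokenizer. Real models use learned subword schemes (byte-pair encoding and similar), so actual token counts differ; this sketch only shows the idea that text is processed and billed as discrete units:

```python
import re

def toy_tokenize(text: str) -> list[str]:
    """Toy tokenizer: splits text into words and punctuation marks.

    Not how production models tokenize (they use learned subword
    vocabularies), but enough to illustrate 'text in, tokens out'.
    """
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("Tokens are the new commodity!")
print(tokens)       # -> ['Tokens', 'are', 'the', 'new', 'commodity', '!']
print(len(tokens))  # -> 6
```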
In the past, tokens were just a cost, and everyone was competing on parameters and training data.
Now AI has entered the inference era: every user chat and every task an Agent executes burns through tokens.
AI companies charge by tokens; the more they consume, the more they sell, turning tokens from 'cost' into a product that can be mass-produced, priced, and traded.
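Token billing itself is simple arithmetic. The per-million-token prices below are made-up placeholders, not any vendor's real rate card:

```python
def request_cost(prompt_tokens: int, completion_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost of one API call, with prices quoted per 1M tokens (hypothetical)."""
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000

# A chat turn: 1,200 prompt tokens in, 800 generated tokens out,
# at assumed prices of $0.50 (input) / $1.50 (output) per million tokens:
cost = request_cost(1200, 800, input_price=0.5, output_price=1.5)
print(f"${cost:.6f}")  # -> $0.001800
```

Tiny per request, but multiplied by 140 trillion calls a day, the arithmetic becomes an industry.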
OpenAI CEO Sam Altman has also said: Our (and all AI model providers') business is essentially selling tokens.
Old Huang calls the data center a 'Token Factory': the raw materials are data and electricity, the product is tokens, and the core metric is tokens per watt. Whoever is more efficient has lower costs and wins.
'Token Factory' industry chain breakdown (four stages, easy to follow)
A token factory is quite similar to a traditional factory and is divided into these segments:
1. Production phase (most costly, benefits the earliest)
AI chips & servers (NVIDIA, Huawei, Haiguang, etc.)
AIDC (Artificial Intelligence Data Center) facilities
Liquid cooling, power supply systems (power equipment, green electricity)
The limits on factory capacity are determined by hardware and electricity. With the same kilowatt-hour, whoever can produce more tokens has a lower cost.
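That "same kilowatt-hour, more tokens" comparison is just division. The electricity price and throughput numbers below are illustrative, not measured figures from any real facility:

```python
def cost_per_million_tokens(tokens_per_kwh: float, price_per_kwh: float) -> float:
    """Electricity cost per 1M tokens produced (all inputs hypothetical)."""
    return price_per_kwh / tokens_per_kwh * 1_000_000

# Two factories buying power at the same assumed $0.08/kWh;
# the one producing twice the tokens per kWh pays half per token:
slow = cost_per_million_tokens(tokens_per_kwh=2_000_000, price_per_kwh=0.08)
fast = cost_per_million_tokens(tokens_per_kwh=4_000_000, price_per_kwh=0.08)
print(slow, fast)  # fast is half the cost of slow
```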
2. Optimization phase (can greatly increase output without adding hardware)
Inference optimization algorithms, scheduling systems
Optical modules, networks, etc.
Old Huang cited an example: Fireworks AI and Lynn didn't change the hardware at all, just updated the NVIDIA software stack, and token generation speed jumped from 700 per second to nearly 5000, roughly a sevenfold increase!
Powerful algorithms can help fixed-power factories earn more money.
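The Fireworks example above is plain throughput arithmetic; the before/after figures come straight from the text, the rest is derived:

```python
before_tps = 700   # tokens per second before the software-stack update (from the text)
after_tps = 5000   # tokens per second after the update (from the text)

speedup = after_tps / before_tps
print(f"{speedup:.1f}x")  # -> 7.1x, on unchanged hardware

# Over a day, the same fixed-power machine produces that many more sellable tokens:
seconds_per_day = 86_400
extra_tokens = (after_tps - before_tps) * seconds_per_day
print(extra_tokens)  # -> 371520000 extra tokens per day, at zero extra hardware cost
```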
3. Circulation phase (quickly delivering tokens to users)
CDN (Content Delivery Network) — last mile delivery
Cross-border dedicated networks, submarine cables — international logistics
Token production and delivery happen almost simultaneously, with very low latency.
The inference cost of domestic models is only 1/6 to 1/10 of that overseas, relying on APIs for large-scale 'token exports', turning China's computing power and electricity into 'digital exports'.
4. Application phase (where monetization happens)
Large model manufacturers (OpenAI, ByteDance, Alibaba, etc.)
Agent applications
Vertical industry SaaS, multi-modal generation platforms
In the future, every SaaS will become Agent-as-a-Service.
Engineers will have an annual token budget; the more they use, the better.
Consumption scenarios expand from chatting to financial analysis, content generation, automated decision-making... The more consumption, the more upstream needs to expand production, forming a positive flywheel.
From an investment perspective, who gets the first bite?
Institutions are currently most optimistic about computing power infrastructure (production phase): AI chips, data centers, liquid cooling, power supply, green electricity.
ByteDance's token consumption is said to double every three months; domestic large cloud vendors will feel the computing power gap when daily consumption hits 30-60 trillion.
By 2026, the computing power industry chain is expected to enter a 'full-chain inflation' cycle, with prosperity transmitted from chips to AIDC, cloud services, and power equipment.
Token exports + computing power leasing are also hot directions, with clear advantages in China's electricity and algorithm costs.
Research reports summarize the price increase logic as: overseas demand explosion → storage and computing hardware shortage → energy/infrastructure bottlenecks → full-chain cost reassessment.
Priorities are roughly divided into stages:
Short-term: Storage and memory (VRAM/HBM) — the biggest supply-demand mismatch
Mid-term: Computing power chips & servers
Long-term: Power equipment, green electricity + leading players with real landing and overseas capabilities
In summary: the token economy converts the electricity of the physical world into the intelligence of the digital world, and then sells it like industrial products.
This wave is not about speculation; it is a real industrial revolution 2.0 — completely shifting from training large models to large-scale inference and the Agent era.
Brothers, tokens are on fire, are you ready?
Should we keep watching from the sidelines, or start positioning in the computing power, electricity, algorithm, and overseas plays tied to the Token Factory?
Feel free to discuss your views in the comments~
