Telcos Chase the Agent Economy With Token Metering and Noise-Free Calls
ZTE's AIR Core prototype shows how carriers plan to sell AI traffic as a service, moving beyond bandwidth into computing, identity, and real-time inference markets.

The Pitch: From Gigabytes to Tokens
At MWC Shanghai 2026, ZTE demonstrated a core-network prototype that reflects a bet the entire telecoms industry is now making: that the next decade of revenue will come not from selling connectivity, but from selling inference, identity services, and compute scheduling to AI agents. The company calls the platform AIR Core, and it represents one of the first concrete attempts to build what carriers are starting to term an Agent Service Network.
At DailyTechWire, we have tracked the rising urgency among Asia-Pacific operators to move beyond wholesale bandwidth. The stakes are high. Mobile data revenue per gigabyte has fallen by double digits in mature markets over the past five years, even as traffic has surged. AIR Core tries to address that squeeze by introducing three new revenue instruments: token-based metering for generative-AI workloads, experience-based pricing for latency-sensitive applications, and agent-identity management as a billable service.
The architecture integrates what ZTE describes as AI-UPF, a user-plane function that embeds large-model inference directly into the packet path. That allows the network to identify AI traffic patterns in real time, classify them by modality and burstiness, and assign differentiated quality-of-service policies on a per-session basis. The goal is to make token consumption, not just byte volume, the unit of sale.
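To make the idea concrete, here is a minimal sketch of how a session might be classified by modality and burstiness, mapped to a QoS class, and charged by token count. ZTE has not published how AI-UPF does this. Every name, label, threshold, and price below is a hypothetical chosen for illustration, and a simple peak-to-mean ratio stands in for whatever model the product actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionStats:
    """Per-session counters; all field names are illustrative assumptions."""
    modality: str      # hypothetical labels: "text", "audio", "video"
    peak_kbps: float   # highest observed rate in the window
    mean_kbps: float   # average rate over the same window
    tokens: int        # model tokens consumed by the session


def burstiness(stats: SessionStats) -> float:
    """Peak-to-mean ratio; 1.0 means a perfectly steady flow."""
    if stats.mean_kbps <= 0:
        return 1.0
    return stats.peak_kbps / stats.mean_kbps


def assign_qos(stats: SessionStats, burst_threshold: float = 4.0) -> str:
    """Map a session to a QoS class (class names are assumptions)."""
    if stats.modality in ("audio", "video"):
        return "realtime"
    if burstiness(stats) >= burst_threshold:
        return "burst-tolerant"
    return "best-effort"


def token_charge(stats: SessionStats, price_per_1k_tokens: float) -> float:
    """Bill by tokens consumed rather than bytes transferred."""
    return stats.tokens / 1000 * price_per_1k_tokens


chat = SessionStats(modality="text", peak_kbps=800, mean_kbps=100, tokens=2500)
print(assign_qos(chat), round(token_charge(chat, 0.02), 4))
```

The point of the sketch is the shift in billing unit: the byte counters only influence which QoS class a session lands in, while the charge depends on tokens alone.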
Voice Gets a Micro-Model Makeover
One of the more tangible demonstrations involved an AI-enhanced voice-calling service. ZTE embedded a noise-cancellation micro-model that runs in the network rather than on the device, claiming 99.9 percent suppression of ambient sound. The model is small enough to execute with sub-50-millisecond latency, which keeps it within the real-time budget for voice paths.
The same voice session can now invoke what the company terms AI Assistant Calling. A user can summon a persistent agent during a call to handle tasks such as appointment booking, document retrieval, or language translation. Because the assistant runs server-side and shares context with the call session, it can close the loop on multi-step requests without requiring the user to switch apps or re-authenticate.
This is part of a broader shift carriers are exploring: treating the phone call not as a static audio pipe but as a session that can carry multimodal data and agent interactions. For operators, the value proposition is twofold. First, it creates a premium tier above standard voice. Second, it opens a channel to monetize natural-language processing and task automation, services that today flow entirely over-the-top through messaging apps and voice assistants controlled by platform companies.
Precision QoE and the High-Speed-Rail Problem
ZTE also highlighted integration between its Network Data Analytics Function (NWDAF) and the AI-UPF to deliver what it calls experience monetization. The system performs real-time quality-of-experience assessment and adjusts policy in under one second. The use case most often cited by Asian carriers is high-speed rail, where handover frequency, Doppler shift, and variable cell load create volatile QoE.
In the demonstration, NWDAF uses a spatiotemporal large model to build user profiles on the fly, predicting which passengers are likely to stream video or join video calls during a journey. The network then pre-allocates slice resources and tunes handover parameters for those sessions. The result is a service tier that passengers can purchase at the point of ticket booking, bundled with rail travel.
This approach reflects a broader ambition: to make network performance perceptible and therefore saleable. For years, carriers have struggled to differentiate premium mobile plans because users cannot easily perceive the difference between 100 Mbps and 300 Mbps on a smartphone screen. By tying QoS guarantees to specific scenarios like travel, gaming, or live commerce, operators hope to create price tiers that map to real user intent.
Device-Network Collaboration and the Edge Compute Play
The token-metering piece of AIR Core is aimed squarely at generative-AI workloads. ZTE argues that AI traffic is fundamentally different from traditional mobile data: it is multimodal, bursty, symmetric in uplink and downlink, and sensitive to tail latency. The AI-UPF is designed to detect these patterns and establish dedicated PDU sessions with token-aware QoS.
More interesting is the device-network collaboration layer. AIR Core allows the network to offload portions of an agent's inference workload from the device to edge compute nodes. The decision to offload is made dynamically, based on battery state, thermal headroom, network latency, and the complexity of the prompt. The network meters both the tokens consumed and the compute cycles delivered, enabling a billing model that charges for inference as a service rather than raw data transfer.
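A rough sketch of that offload decision and the dual metering might look like the following. It is not ZTE's algorithm, which the company has not disclosed. The rule-based policy, the field names, and every threshold are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceState:
    """Inputs named in the article; field names and units are assumptions."""
    battery_pct: float         # remaining battery, 0-100
    thermal_headroom_c: float  # degrees Celsius before throttling
    network_rtt_ms: float      # round trip to the edge node
    prompt_tokens: int         # crude proxy for prompt complexity


def should_offload(
    state: DeviceState,
    rtt_budget_ms: float = 20.0,     # hypothetical latency budget
    low_battery_pct: float = 30.0,   # hypothetical threshold
    min_headroom_c: float = 5.0,     # hypothetical threshold
    large_prompt_tokens: int = 2000, # hypothetical threshold
) -> bool:
    """Offload only when the network is fast enough and the device is strained."""
    if state.network_rtt_ms > rtt_budget_ms:
        return False  # the edge is too far away to help
    return (
        state.battery_pct < low_battery_pct
        or state.thermal_headroom_c < min_headroom_c
        or state.prompt_tokens > large_prompt_tokens
    )


@dataclass
class UsageRecord:
    """Meters both billing dimensions: tokens and edge compute time."""
    tokens: int = 0
    compute_ms: int = 0

    def add(self, tokens: int, compute_ms: int) -> None:
        self.tokens += tokens
        self.compute_ms += compute_ms


state = DeviceState(battery_pct=15, thermal_headroom_c=10,
                    network_rtt_ms=12, prompt_tokens=300)
usage = UsageRecord()
if should_offload(state):
    usage.add(tokens=300, compute_ms=40)
print(usage)
```

A production system would more plausibly score these inputs jointly than apply fixed cut-offs, but the sketch shows why the bill needs two counters: offloading moves compute to the operator's hardware even when token counts stay the same.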
This is a direct play for a slice of the edge-AI market that hyperscalers and chipmakers are also targeting. If carriers can offer sub-20-millisecond inference with guaranteed uptime and built-in identity, they believe they can compete with cloud-based API endpoints, especially for latency-critical applications like autonomous vehicles, industrial robotics, and real-time translation.
From Humans to Agents: Expanding the Addressable Market
AIR Core's Agent Service Network concept extends the definition of a network subscriber. In addition to humans and IoT devices, the network is designed to serve AI entities such as home robots, digital humans, and virtual assistants. Each agent receives a network identity, authenticated through the core, which allows it to establish sessions, consume resources, and be billed independently.
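In data-model terms, treating an agent as a subscriber could be as simple as the sketch below: each agent gets its own identifier and its own usage ledger, separate from the human who owns it. The class, method names, and identifier format are hypothetical. Real authentication through the core network is far more involved and is not modeled here.

```python
import uuid


class AgentRegistry:
    """Toy registry: one identity and one usage ledger per agent."""

    def __init__(self) -> None:
        self._agents: dict[str, tuple[str, str]] = {}  # id -> (owner, kind)
        self._tokens: dict[str, int] = {}              # id -> tokens used

    def register(self, owner: str, kind: str) -> str:
        """Issue a new agent identity (the ID format is an assumption)."""
        agent_id = f"agent-{uuid.uuid4().hex[:12]}"
        self._agents[agent_id] = (owner, kind)
        self._tokens[agent_id] = 0
        return agent_id

    def record_tokens(self, agent_id: str, tokens: int) -> None:
        """Charge usage to the agent itself; unknown identities are rejected."""
        if agent_id not in self._agents:
            raise KeyError(f"unknown agent: {agent_id}")
        self._tokens[agent_id] += tokens

    def usage(self, agent_id: str) -> int:
        return self._tokens[agent_id]


registry = AgentRegistry()
robot = registry.register(owner="household-1", kind="home-robot")
assistant = registry.register(owner="household-1", kind="virtual-assistant")
registry.record_tokens(robot, 500)
print(registry.usage(robot), registry.usage(assistant))
```

Both agents belong to the same household, yet their usage accrues separately, which is what lets an operator bill each one independently.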
This is more than a technical curiosity. In South Korea and China, the number of registered smart-home devices per household already exceeds three, and that figure is expected to double by 2028. Many of those devices will soon run local or hybrid AI models that need to communicate not just with cloud back-ends but with each other and with external services. Carriers see an opportunity to become the identity and orchestration layer for that machine-to-machine economy.
ZTE also outlined support for Non-Terrestrial Networks, integrating satellite connectivity into the same core. Intelligent routing logic, aware of dynamic orbital topology, coordinates handoffs between low-earth-orbit satellites and terrestrial cells. The primary use cases are maritime, aviation, remote infrastructure, and disaster response, where terrestrial coverage is sparse or unavailable.
Data as a First-Class Asset
AIR Core introduces what ZTE calls a Unified Data Plane, a centralized subsystem for real-time data collection, storage, and analysis. The platform aggregates telemetry from user sessions, application flows, and agent interactions, then feeds it into profiling and monetization engines.
This reflects a strategic shift. For years, operators collected vast amounts of signaling and session data but lacked the analytics infrastructure to extract value at scale. The new architecture treats data as a first-class product, enabling use cases such as behavioral segmentation, churn prediction, and targeted offer generation. It also opens the door to data marketplaces, where anonymized insights can be sold to third parties, though regulatory and privacy constraints remain significant in most Asia-Pacific jurisdictions.
Cloud Native to AI Native: An Architectural Bet
ZTE describes AIR Core as a leap from cloud-native to AI-native architecture. In practice, that means embedding inference engines, vector databases, and real-time feature stores directly into core-network functions. It also means designing control and user planes to handle the statistical characteristics of generative workloads: long-tailed latency distributions, token-based resource consumption, and the need for stateful context across multi-turn interactions.
The platform is currently in pilot and early commercial deployment with a handful of operators, though ZTE has not disclosed names or geographies. The broader question is whether the telco industry can move quickly enough to capture value in the agent economy before hyperscalers and device makers lock in distribution and margin.
At DailyTechWire, we have watched similar platform bets play out over the past decade, from NFV to network slicing to Open RAN. The pattern is familiar: operators invest heavily in new infrastructure, hoping to unlock revenue streams beyond connectivity. Success has been mixed. The difference this time may be timing. Generative AI is creating genuine demand for low-latency, identity-aware, token-metered services, and carriers control the last mile. Whether they can translate that control into margin remains the open question.
The Margin Question
The economics of AIR Core will ultimately hinge on whether operators can price AI services at a premium sufficient to cover the cost of deploying inference engines, training micro-models, and operating edge compute clusters. Early pilots in Japan and South Korea suggest that consumers are willing to pay for guaranteed QoE in specific contexts, such as live sports streaming or multi-party video calls, but willingness-to-pay data for token-metered agent services does not yet exist at scale.
There is also the question of fragmentation. If every vendor and every operator builds a proprietary Agent Service Network, interoperability will suffer, and the ecosystem will struggle to achieve the network effects that made the mobile internet valuable. Standards bodies are beginning to address agent identity and session management, but consensus is slow.
ZTE's demonstration at MWC Shanghai offers a concrete vision of what an AI-native core network might look like. Whether that vision translates into a sustainable business model for carriers, or simply shifts more value to the platform and model providers who sit above the network layer, will become clear over the next two to three years as pilots scale and pricing models solidify.


