Enhanced-BLE: A Hybrid BLE-ESB Framework for Dynamically Reconfigurable and Energy-Efficient 2.4 GHz IoT Communication
Authors: Ziyao Zhou, Chen Shen, Tiancheng Cao, Hen-Wei Huang
First: 2026-05-20T14:58:41+00:00 · Latest: 2026-05-20T14:58:41+00:00
Abstract
Bluetooth Low Energy (BLE) is widely used in IoT systems because of its low power consumption, interoperability, and reliable bidirectional communication. However, its connection-oriented architecture introduces trade-offs among wake-up latency, throughput, and energy efficiency, limiting its suitability for burst-mode and on-demand sensing applications. Enhanced ShockBurst (ESB), a lightweight connectionless protocol supported by the same 2.4 GHz Nordic Semiconductor hardware, enables fast wake-up and efficient data transmission, but does not provide BLE-level robustness for sustained bidirectional communication. This work systematically benchmarks BLE and ESB on a unified Nordic nRF54L15 platform and proposes Enhanced-BLE, a hybrid framework that integrates the two protocols to extend conventional BLE operation. Experimental results show that ESB nearly halves packet transmission time and energy compared with BLE, doubles the achievable forward throughput, and reduces wake-up latency and energy by nearly twentyfold during intermittent operation. However, ESB reverse transmission may suffer packet loss, whereas BLE maintains reliable bidirectional communication. Enhanced-BLE addresses this trade-off through adaptive radio scheduling and coexistence-aware connection management, combining ESB-based high-throughput forward transmission with BLE-based reliable reverse communication. The framework enables BLE-to-ESB handover within approximately 18 ms and restores BLE operation within 49 ms from standby mode. Enhanced-BLE also achieves approximately twofold higher forward throughput than BLE while reducing wake-up latency. These results demonstrate a practical and hardware-compatible strategy for low-latency, high-throughput, energy-efficient, and reliable 2.4 GHz IoT communication.
High-speed Networking for Giga-Scale AI Factories
Authors: Sajy Khashab, Albert Gran Alcoz, Alon Gal, Jacky Romano, Rani Abboud, Yonatan Piasetzky, Lior Maman, Amit Nishry, Barak Gafni, Omer Shabtai, Matty Kadosh, Dror Goldenberg, Gilad Shainer, Mark Silberstein
First: 2026-05-20T13:52:47+00:00 · Latest: 2026-05-20T13:52:47+00:00
Abstract
As distributed model training scales to span hundreds of thousands of GPUs, scale-out networks face unprecedented performance and efficiency demands. NVIDIA Spectrum-X Ethernet has been designed from the ground up to achieve predictable and stable network performance with high utilization and low latency. This paper presents the Spectrum-X multiplane architecture, which replaces hierarchical depth with topological parallelism, and introduces hardware-accelerated load balancing in NICs and switches as the key architectural approach to provide fast reaction to highly dynamic network conditions at the microsecond timescales that AI training workloads demand. We describe the motivation, design principles, evaluation methodology, and performance on state-of-the-art benchmarks, as well as the lessons we learned from deploying and debugging Spectrum-X networks in large-scale systems. Our evaluation highlights production-grade AI infrastructure performance across three core dimensions: 98% of the theoretical line rate with low, jitter-free latency; strong cross-tenant isolation for concurrent workloads; robust, capacity-proportional bisection bandwidth and a 7% latency increase for 10% fabric link failures; and rapid reaction to host and fabric link flaps during LLM training workloads.
TrimCaching: Parameter-sharing Edge Caching for AI Model Downloading
Authors: Guanqiao Qu, Zheng Lin, Qian Chen, Jian Li, Fangming Liu, Xianhao Chen, Kaibin Huang
First: 2024-04-22T14:13:36+00:00 · Latest: 2026-05-20T02:47:49+00:00
Comments: 19 pages, 13 figures. Part of this work has been accepted by ICDCS 2024
Abstract
Next-generation mobile networks are expected to facilitate fast AI model downloading to end users. By caching models on edge servers, mobile networks can deliver models to end users with low latency, resulting in a paradigm of edge model caching. In this paper, we develop a novel model placement framework, called parameter-sharing model caching (TrimCaching). TrimCaching exploits the key observation that a wide range of AI models, such as convolutional neural networks or large language models, can share a significant proportion of parameter blocks containing reusable knowledge, thereby improving storage efficiency. To this end, we formulate a parameter-sharing model placement problem to maximize the cache hit ratio in multi-edge wireless networks by balancing the fundamental tradeoff between storage efficiency and service latency. We show that the formulated problem is a submodular maximization problem with submodular constraints, for which no polynomial-time approximation algorithm exists. To tackle this challenge, we study an important special case, where a small fixed number of parameter blocks are shared across models, which often holds in practice. In such a case, a polynomial-time algorithm with a $(1-\epsilon)/2$-approximation guarantee is developed. Subsequently, we address the original problem for the general case by developing a greedy algorithm. Simulation results demonstrate that the proposed TrimCaching framework significantly improves the cache hit ratio compared with state-of-the-art content caching without exploiting shared parameters in AI models.
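The storage benefit of parameter sharing described in the abstract above can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a single edge server, ignores service latency, and uses a simple popularity-per-added-storage greedy rule. The model names, block names, sizes, and popularities are all hypothetical.

```python
# Illustrative only: a single-server greedy placement where models that
# share parameter blocks pay for each shared block once.
# All names and numbers below are hypothetical.

def marginal_cost(model_blocks, cached_blocks, block_size):
    """Storage needed to add a model, counting only blocks not yet cached."""
    return sum(block_size[b] for b in model_blocks if b not in cached_blocks)


def greedy_place(models, popularity, block_size, capacity):
    """Repeatedly cache the model with the best popularity per added storage."""
    cached_blocks = set()
    placed = []
    used = 0
    remaining = set(models)
    while remaining:
        best, best_score, best_cost = None, -1.0, 0
        for name in sorted(remaining):  # sorted for deterministic tie-breaking
            cost = marginal_cost(models[name], cached_blocks, block_size)
            if used + cost > capacity:
                continue  # does not fit even after sharing
            score = popularity[name] / cost if cost > 0 else float("inf")
            if score > best_score:
                best, best_score, best_cost = name, score, cost
        if best is None:
            break  # nothing else fits
        placed.append(best)
        cached_blocks |= models[best]
        used += best_cost
        remaining.remove(best)
    return placed, used


block_size = {"backbone": 6, "head_a": 1, "head_b": 1, "solo": 7}
models = {
    "model_a": {"backbone", "head_a"},
    "model_b": {"backbone", "head_b"},
    "model_c": {"solo"},
}
popularity = {"model_a": 0.40, "model_b": 0.25, "model_c": 0.35}

placed, used = greedy_place(models, popularity, block_size, capacity=8)
hit_ratio = sum(popularity[name] for name in placed)
```

In this toy instance every model needs 7 storage units on its own, so a cache of 8 units holds only one model without sharing. With the shared backbone counted once, two models fit in the same space.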
Intent-First Aerial V2V for Tactical Coordination and Separation: Protocol and Performance Under Density and Disturbance
Authors: Mehrnaz Sabet
First: 2026-05-20T01:04:17+00:00 · Latest: 2026-05-20T01:04:17+00:00
Comments: Submitted to IEEE Transactions on Intelligent Transportation Systems
Abstract
Dense low-altitude aerial operations require more than pre-flight route coordination and last-resort collision avoidance. Once aircraft are airborne, disturbances can emerge on timescales shorter than strategic reauthorization can absorb, while collision avoidance is too late and disruptive to serve as routine traffic management. Although tactical separation is recognized as the intermediate layer, realizing it at scale requires a deployable neighborhood communication mechanism that provides fresh, trusted information for local coordination. This paper presents what is, to our knowledge, the first controller-coupled characterization of an all-airborne, sidelink-class, intent-first vehicle-to-vehicle (V2V) tactical neighborhood exchange stack for dense Unmanned Aircraft System Traffic Management (UTM) operations. Unlike awareness-only broadcast, the proposed exchange combines refreshed state and intent beacons for local awareness, cooperative perception, and degraded-mode assessment with event-triggered messages for yielding, sequencing, release, and contingency coordination. We implement and evaluate this model on an all-airborne V2V stack using sidelink-class C-V2X modules with authenticated freshness checks. Evaluation uses a scenario-driven, high-volume stress campaign supported by real-time, field-anchored infrastructure. Results show that V2V reduces stale-belief divergence, preserves observability through cooperative perception, rejects invalid tactical messages, suppresses false local inference, and structures shared-resource coordination. The implemented stack provides a viable communication layer for tactical separation in lower-to-moderate regimes, but transitions toward guarded fallback as density, impairment, and complexity increase. These findings position intent-first aerial V2V as a bounded enabler for scaling tactical coordination in disturbance-driven urban airspace.
Detecting Data Exfiltration through I2P Anonymity Networks: A Two-Phase Machine Learning Approach
Authors: Siddique Abubakr Muntaka, Muntaka Mohammed, Mansuru Mikail Azindo, Ibrahim Tanko, Franco Osei-Wusu, Edward Danso Ansong, Benjamin Yankson, Oliver Kornyo, Foster Yeboah, Jones Yeboah, Richmond Adams, Pulcheria Serwaa
First: 2026-05-19T22:46:22+00:00 · Latest: 2026-05-19T22:46:22+00:00
Abstract
The Invisible Internet Project (I2P) provides strong anonymity through garlic routing and distributed network architecture, making it attractive for legitimate privacy needs. Nevertheless, the same properties can be exploited by malicious actors to steal sensitive information from corporate networks without detection. Current network security measures often fail to detect I2P traffic, and existing literature has focused primarily on protocol-level traffic identification without addressing behavioral threat assessment. This paper proposes a two-stage machine-learning model for I2P traffic analysis using the SafeSurf Darknet 2025 dataset comprising 184,548 network flows. Phase 1 achieved 99.96% accuracy in distinguishing I2P traffic from normal network traffic using a Random Forest classifier, with only 2 false positives among 32,318 normal flows. Phase 2 performed behavioral analysis on traffic identified as I2P, classifying it as either exfiltration or legitimate activity, achieving 91.11% accuracy using XGBoost. The system demonstrates that tree-based ensemble methods substantially outperform deep neural networks and support vector machines for this task. Feature importance analysis indicates that the most discriminative features are packet timing and flow duration. These findings establish that accurate I2P traffic detection and threat prioritization are achievable in operational network environments, enabling security teams to focus resources on high-risk events rather than monitoring all encrypted traffic.
A Meshtastic-based LoRa Mesh System for Smart Campus Applications: From Solar-Powered Sensing to Containerized Data Management
Authors: Rafael Garzon Andosilla, José de Jesús Rugeles Uribe
Venue: 6th CATAÏ Workshop (Bogotá, Colombia, May 7-8, 2026)
First: 2026-05-19T18:30:48+00:00 · Latest: 2026-05-19T18:30:48+00:00
Comments: 14 pages, 4 figures, 5 tables. To appear in the proceedings of the 6th CATAÏ Workshop (Bogotá, Colombia, May 7-8, 2026). Workshop website: https://www.catai.fr/catai2026.html
Abstract
This work presents the design, implementation, and evaluation of a LoRa-based mesh network using the Meshtastic protocol for Smart Campus applications at Universidad Militar Nueva Granada (UMNG). The system integrates heterogeneous hardware nodes including a solar-powered ecological sensing node built around a Raspberry Pi Pico and a Semtech SX1262 transceiver, and mobile trackers based on the Seeed SenseCAP T1000-E managed through a containerized edge gateway running on a Raspberry Pi 4. A Docker Compose microservices stack handles data ingestion via Node-RED, time-series storage in InfluxDB, and real-time visualization through Grafana dashboards. The architecture's performance was evaluated under realistic propagation scenarios at the UMNG Cajicá campus, characterizing link quality using Received Signal Strength Indicator (RSSI) and Signal-to-Noise Ratio (SNR) metrics. Experimental results demonstrate robust mesh connectivity across key university facilities, including an extended-range link of approximately 2.47 km linking the campus gateway to a remote station at Mirador La Cumbre (n = 62 packets received, mean RSSI = -110 dBm, mean SNR = +2.75 dB). This architecture demonstrates that open-source mesh protocols combined with containerized microservices offer an autonomous, highly reproducible infrastructure for environmental monitoring and asset tracking, supporting the transition toward data-driven "Smart Campus" ecosystems without reliance on centralized commercial LoRaWAN operators.
Fair-Aurora: Comparing Fairness Strategies for Reinforcement Learning-Based Congestion Control in Multi-Flow Environments
Authors: Thomas Mbrice, Yuyu Liu
First: 2026-05-19T14:38:12+00:00 · Latest: 2026-05-19T14:38:12+00:00
Abstract
Reinforcement learning (RL) has emerged as a promising paradigm for Internet congestion control, achieving higher link utilization than classical heuristics. However, RL-based controllers trained in single-flow environments are not guaranteed to share bandwidth equitably when deployed in multi-flow networks. This paper investigates the fairness properties of Aurora (Jay et al., 2019), a state-of-the-art deep RL congestion controller, and evaluates three post-hoc fairness strategies that preserve Aurora's RL architecture: reward shaping (Strategy A), observation augmentation (Strategy B), and loss-sensitivity tuning (Strategy C). Using a custom shared-bottleneck simulator and Jain's fairness index as the primary metric, we find that modest reward shaping achieves the best fairness while preserving aggregate throughput. All strategies maintain the total bandwidth budget, with fairness being achieved through redistribution, not reduction. Beyond the 2-flow homogeneous setting, an extended evaluation across mixed Aurora-CUBIC competition and dynamic flow entry/exit scenarios shows that Strategy C's loss-sensitivity emerges as the most TCP-friendly mechanism, while Strategy B is the most stable through dynamic flow-set changes.
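For reference, Jain's fairness index for n flows with throughputs x_1..x_n is (sum of x_i)^2 / (n * sum of x_i^2). It equals 1 when all flows receive the same share and falls to 1/n when a single flow takes everything. The sketch below computes it for hypothetical 2-flow throughput splits that all use the same total bandwidth, mirroring the abstract's point that fairness changes through redistribution rather than reduction. The numbers are illustrative and are not results from the paper.

```python
def jain_fairness_index(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2)."""
    if not throughputs:
        raise ValueError("need at least one flow")
    total = sum(throughputs)
    squares = sum(x * x for x in throughputs)
    if squares == 0:
        # All flows idle: treated as trivially fair (a convention chosen here).
        return 1.0
    return (total * total) / (len(throughputs) * squares)


# Three hypothetical splits of the same 10-unit bottleneck between two flows.
equal_split = jain_fairness_index([5.0, 5.0])
skewed_split = jain_fairness_index([9.0, 1.0])
starved_flow = jain_fairness_index([10.0, 0.0])
```

The total throughput is identical in all three cases. Only the division between the flows, and therefore the index, differs.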
Security Analysis of Bitcoin's V2 Transport Protocol: Exploiting Design Implications for Sustained Eclipse and Downgrade Attacks
Authors: Charmaine Ndolo, Florian Tschorsch
First: 2026-05-19T11:50:57+00:00 · Latest: 2026-05-19T11:50:57+00:00
Comments: 34 pages, 16 figures, 2 tables
Abstract
Bitcoin recently introduced a new protocol for the encryption of peer-to-peer (P2P) communication. The protocol, known as V2 P2P transport, represents a big step towards securing the overlay network against various previously-known attack vectors. Based on an analysis of V2 P2P transport, this work examines the current viability of said attacks and concludes that while they are now remediated, alternative attacks and paths to similar objectives exist. The identified shortcomings are conceptual (and not implementation bugs) and even applicable to other P2P networks. We show how a network-level attacker can identify application messages using the length of TCP payloads, can eclipse a target node by taking advantage of how encrypted communication channels work and can downgrade all of a node's connections to the unencrypted protocol by using the mechanisms designed for compatibility. We validate our contributions using a combination of network measurements, emulations and simulations. Finally, we propose a series of short-term and long-term countermeasures towards securing Bitcoin's P2P network. To the best of our knowledge, we are the first to study Bitcoin's security under V2 P2P transport.
How Helpful is LLM Assistance in Network Operations? A Case Study at a Large Demonstration Network
Authors: Ryo Nakamura, Koshi Eguchi
First: 2026-05-19T10:06:02+00:00 · Latest: 2026-05-19T10:06:02+00:00
Abstract
This paper reports on a real-world case study in which over 100 network engineers assessed how a Large Language Model (LLM) can assist in building and operating a network. The versatility of LLMs has accelerated their adoption across a wide range of domains, and assisting network operations is one such promising application. LLMs are probabilistic models, unlike deterministic protocols and configurations; therefore, clarifying their capabilities (how and to what extent LLMs can help in network operations) is a crucial step toward adopting LLMs. To offer practical insights into this issue, we conducted an extensive experiment on a large demonstration network built for a public exhibition, consisting of 21 racks with heterogeneous network devices. In the experiment, a total of 105 network engineers used an LLM-based chatbot while building and operating the network. The chatbot was equipped with three external functions: retrieval-augmented generation for domain-specific knowledge, CLI control of network devices running on the network, and access to a ticket system. The participants gave evaluations for the chatbot's responses on a best-effort basis. Analysis of the chat histories shows that 68.1% of the evaluations were positive, indicating a quantitative baseline of the LLM's helpfulness in network operations. Our results also demonstrate that understanding the capabilities of the chatbot is important for eliciting better responses. Moreover, we provide detailed use case analyses while sharing actual user-chatbot interactions.
Sample-Efficient Misconfiguration Classification for Network Resilience in Wireless Communications
Authors: Xin Hao, Chenhan Zhang, Massimo Piccardi, Vijaya Durga Chemalamarri, Qiwen Jiang, Wei Ni, Raymond Owen
First: 2026-05-19T03:29:28+00:00 · Latest: 2026-05-19T03:29:28+00:00
Abstract
As modern wireless communication networks grow increasingly complex, network outages driven by the inconsistency between dynamic topologies and protocol configurations have become a critical concern. To solve this issue, we mathematically formulate a protocol misconfiguration classification problem as a graph-based learning task and solve it with our proposed EtaGATv2 algorithm, an edge-type-aware graph attention network with dynamic attention. EtaGATv2 addresses two critical challenges: i) it captures non-uniform symptom propagation for protocol misconfiguration classification tasks, where certain network paths and nodes become critical for diagnosis, and ii) it extracts protocol-specific features from heterogeneous routing protocols with distinct message-passing behaviors by utilizing edge-type-aware transformations. Experiments across diverse and real-world topologies demonstrate that EtaGATv2 reaches state-of-the-art performance with 50% of the training samples, making it particularly suitable for networks with dynamic topologies and limited negative-labeled data.
Enabling Agile Ambient IoT Networking via a Parameterized Hybrid Radio
Authors: Jiazhen Lei, Fengyuan Zhu, Tianze Cao, Yuxin Sha, Linling Zhong, Wenhui Li, Bingbing Wang, Zeming Yang, Jinyang Sun, Yibin Deng, Xiaohua Tian
First: 2026-05-18T12:31:30+00:00 · Latest: 2026-05-18T12:31:30+00:00
Comments: 14 pages, 23 figures
Abstract
The emergence of Ambient IoT signals a paradigm shift toward massive batteryless networking. However, the absence of an agile physical layer substrate remains a fundamental barrier to research and standardization. Current testbeds are hindered by decoupled radio paths, high static power, and cumbersome control methods, which stifle rapid protocol prototyping. In this paper, we present Janus, the first hybrid active-passive configurable radio architected for agile Ambient IoT networking. Janus introduces a parameterized architecture that unifies passive and active transmission into a single RF front end, abstracting complex physical layer behaviors into concise parameters. This design enables a system-level control plane for dynamic mode transitions and an energy management plane for fine-grained harvesting across multiple sources. We implement a compact PCB prototype and evaluate its performance across diverse protocol landscapes, including 3GPP A-IoT, IEEE 802.11 AMP, and Bluetooth SIG. Our experimental results demonstrate that Janus achieves communication performance on par with dedicated radios while significantly reducing configuration overhead. Ultimately, Janus serves as a versatile enabler for validating emerging protocols and accelerating the standardization of next-generation low-power networks.
ZeroSiam: An Efficient Asymmetry for Test-Time Entropy Optimization without Collapse
Authors: Guohao Chen, Shuaicheng Niu, Deyu Chen, Jiahao Yang, Zitian Zhang, Mingkui Tan, Pengcheng Wu, Zhiqi Shen
First: 2025-09-27T08:37:47+00:00 · Latest: 2026-05-18T11:34:53+00:00
Abstract
Test-time entropy minimization helps adapt a model to novel environments and incentivize its reasoning capability, unleashing the model's potential during inference by allowing it to evolve and improve in real time using its own predictions, achieving promising performance. However, pure entropy minimization can favor non-generalizable shortcuts, such as inflating the logit norm and driving all predictions to a dominant class to reduce entropy, risking collapsed solutions (e.g., constant one-hot outputs) that trivially minimize the objective without meaningful learning. In this paper, we reveal asymmetry as a key mechanism for collapse prevention and introduce ZeroSiam, an efficient asymmetric Siamese architecture tailored for test-time entropy minimization. ZeroSiam prevents collapse through asymmetric divergence alignment, efficiently achieved by a learnable predictor and a stop-gradient operator before the classifier. We provide empirical and theoretical evidence that ZeroSiam not only prevents collapse, but also regularizes biased learning signals, enhancing performance even when no collapse occurs. Despite its simplicity, extensive results show that ZeroSiam performs more stably than prior methods with negligible overhead, demonstrating efficacy on both vision adaptation and large language model reasoning tasks across challenging test scenarios and diverse models, including particularly collapse-prone tiny models.
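The logit-norm shortcut mentioned in the abstract above can be seen in a few lines. This sketch is not ZeroSiam itself. It only shows, for a hypothetical logit vector, that multiplying the logits by a larger constant lowers the softmax entropy without changing which class is predicted, so the objective can fall with no improvement in the prediction.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


logits = [1.0, 0.5, -0.5]   # hypothetical classifier output
scales = (1.0, 4.0, 16.0)   # inflating the logit norm by a constant factor

entropies = [entropy(softmax([s * z for z in logits])) for s in scales]
predictions = [
    max(range(len(logits)), key=lambda i: s * logits[i]) for s in scales
]
```

The predicted class is the same at every scale while the entropy shrinks toward zero. That is why entropy alone can be minimized by a degenerate solution.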
Enhancing Network Resilience via Graph-Based Anomaly Detection in Sovereign Functions
Authors: Xin Hao, Wei Ni, Chenhan Zhang, Massimo Piccardi, Raymond Owen
First: 2026-05-18T00:36:10+00:00 · Latest: 2026-05-18T00:36:10+00:00
Abstract
Sovereign network functions, e.g., routing protocols, are becoming increasingly complex and susceptible to failures arising from protocol configuration anomalies. This paper interprets the protocol configuration anomaly detection problem as detection of structural inconsistencies of connected nodes and edges in a bipartite graph that captures both physical network entities and logical protocol states. A graph structural inconsistency detector (GSID) model is proposed to solve the problem efficiently. To handle the heterogeneous nature of protocol configuration parameters, GSID employs an adaptive configuration encoder (ACE) that dynamically selects encoding strategies per parameter to preserve fine-grained numerical discrepancies. To expose the subtle inconsistencies of connected nodes and edges in the bipartite graph, GSID uses an inconsistency dynamic attention (IDA) mechanism that scores edges by drawing asymmetric attention from both ends: rule compliance from one end and route connectivity from the other. It is demonstrated experimentally that GSID outperforms state-of-the-art baselines by threefold in F1 score and by 23.2% in accuracy. Ablation studies validate the effectiveness of both the ACE and IDA modules. Tests on unseen network scales and real-world network topologies show the superior adaptability of GSID compared to the baselines.
Cross-Domain Query Translation for Network Troubleshooting: A Multi-Agent LLM Framework with Privacy Preservation and Self-Reflection
Authors: Nguyen Phuc Tran, Brigitte Jaumard, Karthikeyan Premkumar, Salman Memon
Venue: EuCNC & 6G Summit 2026
First: 2026-04-14T23:23:46+00:00 · Latest: 2026-05-17T15:44:10+00:00
Comments: Accepted for publication in EuCNC & 6G Summit 2026. Copyright 2026 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
Abstract
This paper presents a hierarchical multi-agent LLM architecture to bridge communication gaps between non-technical end users and telecommunications domain experts in private network environments. We propose a cross-domain query translation framework that leverages specialized language models coordinated through multi-agent reflection-based reasoning. The resulting system addresses three critical challenges: (1) accurately classifying user queries related to telecommunications network issues using a dual-stage hierarchical approach, (2) preserving user privacy through the anonymization of semantically relevant personally identifiable information (PII) while maintaining diagnostic utility, and (3) translating technical expert responses into user-comprehensible language.
Our approach employs ReAct-style agents enhanced with self-reflection mechanisms for iterative output refinement, semantic-preserving anonymization techniques respecting $k$-anonymity and differential privacy principles, and few-shot learning strategies designed for limited training data scenarios. The framework was comprehensively evaluated on 10,000 previously unseen validation scenarios across various vertical industries.
A Multihop Rendezvous Protocol for Cognitive Radio-based Emergency Response Network
Authors: Zahid Ali, Saritha Unnikrishnan, Eoghan Furey, Ian McLoughlin, Saim Ghafoor
First: 2026-02-18T11:09:58+00:00 · Latest: 2026-05-16T17:47:35+00:00
Comments: Accepted for publication in the Proceedings of IEEE MeditCom 2026
Abstract
This paper addresses the challenge of efficient rendezvous in multihop cognitive radio networks, where existing channel-hopping algorithms designed for single-hop scenarios incur increased delay and coordination inefficiencies in multinode topologies. To overcome these limitations, we propose a Multihop Dual Modular Clock Algorithm (M-DMCA), which systematically extends modular clock-based rendezvous to multihop environments while preserving efficient channel coordination. The proposed scheme enables dual-channel selection per timeslot and incorporates a lightweight three-way handshake mechanism to improve coordination among intermediate nodes. Simulation results under worst-case conditions, including high primary user activity, asymmetric channel availability, and dense network settings, demonstrate that M-DMCA significantly reduces rendezvous time compared to existing approaches, achieving up to 24% improvement. These results demonstrate the suitability of M-DMCA for timely node discovery in dynamic emergency response scenarios.
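As background for the modular clock idea that M-DMCA extends, the sketch below shows a basic single-hop modular clock hop sequence. It does not model M-DMCA's dual-channel selection per timeslot, its three-way handshake, or multihop behavior. It also assumes that both nodes use the same prime and the same channel set, and it maps out-of-range indices back with a deterministic `index % num_channels` rule as a simplification. All parameter values are hypothetical.

```python
def modular_clock_sequence(num_channels, prime, start, rate, slots):
    """Channel visited in each of `slots` consecutive timeslots.

    The node steps an index by `rate` modulo `prime` (prime >= num_channels).
    Indices beyond the channel range are folded back with a modulo, a
    deterministic simplification chosen for this sketch.
    """
    if prime < num_channels or not 0 < rate < prime:
        raise ValueError("need prime >= num_channels and 0 < rate < prime")
    sequence = []
    index = start % prime
    for _ in range(slots):
        sequence.append(index % num_channels)
        index = (index + rate) % prime
    return sequence


def time_to_rendezvous(seq_a, seq_b):
    """First slot in which both nodes sit on the same channel, else None."""
    for slot, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a == b:
            return slot
    return None


NUM_CHANNELS, PRIME = 5, 7  # hypothetical: 5 channels, smallest prime >= 5 is 5; 7 also works
node_a = modular_clock_sequence(NUM_CHANNELS, PRIME, start=0, rate=2, slots=PRIME)
node_b = modular_clock_sequence(NUM_CHANNELS, PRIME, start=3, rate=5, slots=PRIME)
ttr = time_to_rendezvous(node_a, node_b)
```

Under these assumptions, two nodes with different rates must land on the same index within `prime` slots, because the difference between their indices cycles through every residue modulo the prime. Two nodes that pick the same rate with different starting indices never align, which is one reason practical schemes re-draw the rate periodically.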
Escape from Callback Hell! A New Programming Paradigm for Network Simulation
Authors: Yuanyi Zhu, Zijian Li, Xin Ai, Zixuan Chen, Sen Liu, Yang Xu
First: 2026-05-16T09:17:23+00:00 · Latest: 2026-05-16T09:17:23+00:00
Abstract
Network simulation plays a crucial role in both networking research and industry. Commonly used Discrete Event Simulations (DES) are based on callback mechanisms for discrete events (DE). However, because callbacks cannot naturally model network events, network simulation programs cannot be written as a sequential workflow. This leads to inherent complexity and poor maintainability, resulting in stack ripping and callback hell. These problems significantly increase simulation development workloads and introduce substantial cognitive load in programming and debugging.
To enable more efficient development of network simulation and facilitate the rapid evaluation and evolution of network functions, we propose a novel development paradigm for network simulation named "CoDES" (Coroutine-based DES). To the best of our knowledge, we are the first to use the coroutine mechanism to optimize the network simulation development process rather than performance. We implement a new network simulation framework based on CoDES that is capable of naturally simulating network events and effectively addresses key system challenges related to correctness, functionality, compatibility, and overhead. It enables developers to create sequential workflows for network programs and simplifies the code structure, thus reducing development workloads while enhancing code readability and maintainability.
We apply this paradigm to a commonly used network simulator, NS-3, to implement Message Passing Interface (MPI), High Precision Congestion Control (HPCC), and Routing Information Protocol (RIP), achieving up to 62.3% and 82.6% reductions in code volume and structure complexity, respectively, without sacrificing simulation accuracy, extending execution time, or increasing the runtime memory of the simulation.
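The contrast between callback-driven and coroutine-driven simulation can be sketched with Python generators. This is a hypothetical toy scheduler, not CoDES and not the NS-3 API. The `Simulator` class, the `sender` process, and the timing values are all invented for illustration. The point is that "send, wait for the round trip, handle the ACK, repeat" is written as one sequential function instead of a chain of scheduled callbacks.

```python
import heapq


class Simulator:
    """Toy discrete-event scheduler that resumes generator-based processes."""

    def __init__(self):
        self.now = 0.0
        self._queue = []   # entries: (wake_time, tie_breaker, process)
        self._counter = 0  # tie-breaker so generators are never compared

    def spawn(self, process, delay=0.0):
        """Schedule a generator to be resumed `delay` time units from now."""
        heapq.heappush(self._queue, (self.now + delay, self._counter, process))
        self._counter += 1

    def run(self):
        """Resume processes in timestamp order until none remain."""
        while self._queue:
            self.now, _, process = heapq.heappop(self._queue)
            try:
                delay = next(process)  # run until the process yields a wait
            except StopIteration:
                continue               # process finished
            self.spawn(process, delay)


def sender(sim, log, packets, rtt):
    """Sequential workflow: each `yield` suspends for simulated time."""
    for seq in range(packets):
        log.append((sim.now, f"send {seq}"))
        yield rtt                      # wait one round trip
        log.append((sim.now, f"ack {seq}"))


sim = Simulator()
log = []
sim.spawn(sender(sim, log, packets=2, rtt=1.5))
sim.run()
```

In a callback style, the loop in `sender` would be split across separate handler functions, with the loop counter carried between them by hand. Here the local variable `seq` survives each suspension because the generator keeps its own stack frame.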
SpaceMoE: Towards Orbital General Intelligence with Distributed Mixture-of-Experts Inference
Authors: Qian Chen, Xianhao Chen, Min Sheng, Kaibin Huang
First: 2026-05-16T07:15:47+00:00 · Latest: 2026-05-16T07:15:47+00:00
Comments: 7 pages, 5 figures
Abstract
As satellite networks evolve to support increasingly diverse services and artificial general intelligence (AGI), large language models (LLMs) are emerging as a critical foundation for future space systems. However, deploying LLMs on satellites is hindered by stringent constraints on onboard memory, computation, and energy. In this context, the mixture-of-experts (MoE) architecture emerges as a promising solution, leveraging sparse expert activation to enable scalable model inference. By harnessing the architectural advantages of MoE, this article provides a comprehensive overview of SpaceMoE, a new paradigm for distributed MoE inference in satellite networks. We first review recent industrial progress and emerging standardization trends that motivate the evolution toward space AGI systems. Then, we introduce the fundamentals and architectural evolution of SpaceMoE. Subsequently, we discuss three fundamental design problems in SpaceMoE, namely expert placement, expert selection, and hidden-state transmission and routing, highlighting how satellite-specific factors such as dynamic topology, battery degradation, and thermal limits fundamentally reshape their solutions. Finally, we outline promising research directions for realizing scalable, efficient, and sustainable on-orbit MoE inference in future satellite networks.
Summary / 总结
As satellite networks evolve to support increasingly diverse services and artificial general intelligence (AGI), large language models (LLMs) are emerging as a critical foundation for future space systems.
Against the Monolithic Wireless World Model: Why NextG Needs Composable and Agentic Intelligence
Authors: Aladin Djuhera, Farhan Ahmed, Vlad C. Andrei, Swanand Ravindra Kadhe, Alecio Binotto, Haris Gacanin, Holger Boche
First: 2026-05-15T22:56:11+00:00 · Latest: 2026-05-15T22:56:11+00:00
Abstract
AI-native 6G visions increasingly invoke wireless foundation models, large multimodal models, and wireless world models as the natural endpoint of AI-native networking, drawing an analogy to recent developments in large language models (LLMs). We argue that this analogy is structurally incomplete. The success of LLMs is based on a broad, reusable, and largely self-contained tokenized data substrate, whereas the wireless domain lacks an equivalent data foundation. Unlike text, code, or images, wireless data such as CSI tensors, IQ samples, or scheduler logs are not self-contained: their meaning is configuration-dependent, simulator-conditioned, task-disaggregated, and weakly grounded in operational feedback, all structural bottlenecks that undermine current pre- and post-training recipes. We therefore argue that monolithic models, including mixture-of-experts (MoE) and wireless world models, are not the most realistic near-term path toward deployable AI-native networks. Instead, emerging evidence points toward composable and agentic network architectures, where general reasoning models orchestrate specialized signal processing models, classical algorithms, digital twins, standards-aware retrieval, and safety checks through explicit programmable interfaces.
Summary / 总结
AI-native 6G visions increasingly invoke wireless foundation models, large multimodal models, and wireless world models as the natural endpoint of AI-native networking, drawing an analogy to recent developments in large language models (LLMs).
End-to-End Simulation of 5G NR Integrated Access and Backhaul Networks for Remote Maritime Connectivity
Authors: Alessandro Traspadini, Matteo Pagin, Raphaël Ihamouine, Rupert Lucas, Andrew Noren, Michele Zorzi, Marco Giordani
First: 2026-05-15T18:28:58+00:00 · Latest: 2026-05-15T18:28:58+00:00
Comments: 13 pages, 7 figures, 1 table. This paper has been accepted for publication at IEEE Transactions on Communications, 2026. Please cite it as: A. Traspadini, M. Pagin, R. Ihamouine, R. Lucas, A. Noren, M. Zorzi, and M. Giordani, "End-to-End Simulation of 5G NR Integrated Access and Backhaul Networks for Remote Maritime Connectivity," IEEE Transactions on Communications, to appear, 2026
Abstract
Millimeter wave (mmWave) 5th generation (5G) networks offer high data rates but face coverage challenges due to severe path loss and blockage. These problems motivate the use of Integrated Access and Backhaul (IAB) as a flexible wireless backhaul solution that extends connectivity to cell boundaries and unfibered areas, including maritime environments. This paper overviews the latest 3GPP specifications for IAB networks in Releases 16 through 18. Then, it presents an ns-3 module for IAB, featuring a complete end-to-end protocol stack, including the backhaul adaptation protocol (BAP) layer, flexible slot and control configurations, and multiplexing schemes based on both time and frequency division. We test the IAB module via extensive system-level simulations in a custom maritime scenario where vessels, equipped with IAB-nodes, can simultaneously act as access points and relays, forming dynamic multi-hop networks that maintain connectivity via wireless backhaul to shore-based stations. We evaluate different topologies and channel conditions, providing insights into the design and deployment of mmWave IAB networks in offshore environments.
Summary / 总结
Millimeter wave (mmWave) 5th generation (5G) networks offer high data rates but face coverage challenges due to severe path loss and blockage.
HOPPER: A Hop-by-hop Entanglement Distribution Protocol for Asynchronous Quantum Networks
Authors: Claudio Cicconetti
First: 2026-05-15T11:36:01+00:00 · Latest: 2026-05-15T11:36:01+00:00
Comments: Accepted for oral presentation at IEEE ICCCN 2026
Abstract
The quantum Internet relies on the ability to distribute entangled quantum bits (ebits) between quantum memories at the end nodes, to perform applications like blind or distributed quantum computing that are impossible if end nodes are connected via a classical, i.e., non-quantum network. This need creates new challenges due to the fragile nature of entanglement, which decoheres over short timescales and cannot be amplified, buffered, or retransmitted. Two broad categories of approaches have been proposed in the scientific literature to realize such an entanglement distribution in a given path: one relying on a synchronous time-slotted model, and another one where intermediate nodes interact asynchronously. However, both of them implicitly assume a serial operation, where one ebit is established and made available to the application on end nodes before creating a new one. This is inefficient in long-range networks, with high transmission latencies, if the intermediate nodes have multiple memory qubits that could be used in parallel. To overcome this limitation, in this paper, we study the implications of multiplexing concurrent ebit requests on the same quantum, for both synchronous and asynchronous operation. Furthermore, for the latter, we define a novel distribution protocol, called HOPPER, where the intermediate nodes make autonomous and hop-by-hop decisions on the use of their local resources when establishing an ebit. With numerical simulations, we show that HOPPER is effective in handling multiple ebit requests in parallel, and it exhibits significantly better performance than a synchronous alternative in different scenarios.
Summary / 总结
The quantum Internet relies on the ability to distribute entangled quantum bits (ebits) between quantum memories at the end nodes, to perform applications like blind or distributed quantum computing that are impossible if end nodes are connected via a classical, i.e., non-quantum network.
The Internet Runs on Names
Authors: Geoff Huston, Lixia Zhang
First: 2026-05-15T06:01:30+00:00 · Latest: 2026-05-15T06:01:30+00:00
Abstract
The Internet's TCP/IP architecture was designed for resilient packet delivery between hosts identified by IP addresses. Over time, however, the consolidation of applications and services into large-scale platforms built on that universal packet-delivery substrate drove deployment practices that fundamentally changed the Internet's operational model: the network now operates primarily on names. DNS names have become the basis for service identity, reachability, load balancing, and trust, while IP addresses have become ephemeral routing locators. This change was driven by application needs and platform consolidation in the absence of any overarching plan. The resulting mismatch between the original address-based design and the current name-based operation leads to serious consequences: operational complexity that grows with each new layer of indirection, fragility, and vulnerability - as seen in recent high-profile outages. This paper exposes this mismatch as a necessary first step toward understanding its consequences and addressing the risks of continuing on the same path.
Summary / 总结
The Internet's TCP/IP architecture was designed for resilient packet delivery between hosts identified by IP addresses.
A Techno-Economic Framework for Cost Modeling and Revenue Opportunities in Open and Programmable AI-RAN
Authors: Gabriele Gemmi, Michele Polese, Tommaso Melodia
First: 2026-03-30T16:59:15+00:00 · Latest: 2026-05-14T23:18:45+00:00
Comments: Accepted for publication at the 35th International Conference on Computer Communications and Networks (ICCCN 2026)
Abstract
The large-scale deployment of 5G networks has not delivered the expected return on investment for mobile network operators, raising concerns about the economic viability of future 6G rollouts. At the same time, surging demand for Artificial Intelligence (AI) inference and training workloads is straining global compute capacity. AI-RAN architectures, in which Radio Access Network (RAN) platforms accelerated on Graphics Processing Units (GPUs) share idle capacity with AI workloads during off-peak periods, offer a potential path to improved capital efficiency. However, the economic case for such systems remains unsubstantiated. In this paper, we present a techno-economic analysis of AI-RAN deployments by combining publicly available benchmarks of 5G Layer-1 processing on heterogeneous platforms -- from x86 servers with accelerators for channel coding to modern GPUs -- with realistic traffic models and AI service demand profiles for Large Language Model (LLM) inference. We construct a joint cost and revenue model that quantifies the surplus compute capacity available in GPU-based RAN deployments and evaluates the returns from leasing it to AI tenants. Our results show that, across a range of scenarios encompassing token depreciation, varying demand dynamics, and diverse GPU serving densities, the additional capital and operational expenditures of GPU-heavy deployments are offset by AI-on-RAN revenue, yielding a return on investment of up to 8x. These findings strengthen the long-term economic case for accelerator-based RAN architectures and future 6G deployments.
Summary / 总结
The large-scale deployment of 5G networks has not delivered the expected return on investment for mobile network operators, raising concerns about the economic viability of future 6G rollouts.
LatencyScope: A System-Level Mathematical Framework for 5G RAN Latency
Authors: Arman Maghsoudnia, Aoyu Gong, Raphael Cannatà, Dan Mihai Dumitriu, Haitham Hassanieh
First: 2025-11-26T11:09:43+00:00 · Latest: 2026-05-14T19:25:41+00:00
Comments: 23 pages
Abstract
This paper presents LatencyScope, a mathematical framework for computing one-way uplink and downlink latency in fifth-generation radio access networks across diverse system configurations. LatencyScope models latency sources across the protocol stack, including radio interfaces, scheduling decisions, processing delays, frame structures, and hardware and software constraints, while capturing dependencies among configuration parameters and stochastic sources of delay. The framework also includes a configuration analyzer that uses these models to search billions of candidate settings and identify those that satisfy latency-reliability targets under user-specified constraints. We validate LatencyScope on two open-source fifth-generation radio access network testbeds, as well as on measurements from a public commercial fifth-generation network. The results show that LatencyScope closely matches empirical latency distributions, captures observed lower and upper latency bounds, and substantially outperforms prior analytical models and widely used fifth-generation network simulators. LatencyScope can determine whether ultra-reliable low-latency communication targets are feasible for a given deployment and, when they are feasible, efficiently find satisfying configurations, helping network operators reason about latency modeling, configuration analysis, and system-level bottlenecks.
Summary / 总结
This paper presents LatencyScope, a mathematical framework for computing one-way uplink and downlink latency in fifth-generation radio access networks across diverse system configurations.
Investigating the Suitability of Delay Tolerant Networks for Broadcasting Tsunami Warnings in Palu, Indonesia
Authors: Adam Graham, Milena Radenkovic
First: 2026-05-14T17:21:20+00:00 · Latest: 2026-05-14T17:21:20+00:00
Abstract
On the 28th of September, 2018, a tsunami hit the city of Palu in Indonesia, killing 4,340 people. The earthquake preceding the tsunami crippled communication lines and may have rendered the transmission of tsunami warning messages using traditional end-to-end approaches impossible. This paper proposes an alternative approach using Delay Tolerant Networks (DTNs) for tsunami warning message routing given their resilience to disruptions and sparse connections. Both Epidemic and Spray and Wait routing protocols were simulated in a pseudo-realistic environment to evaluate their effectiveness for transmitting tsunami warning messages in Palu. Results indicated that these protocols are not suitable for the tight time constraints of post-earthquake tsunami warnings with the currently available technology. However, they may have promising applications for the earthquakes that precede tsunamis.
Summary / 总结
On the 28th of September, 2018, a tsunami hit the city of Palu in Indonesia, killing 4,340 people.
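For readers unfamiliar with the routing protocols named in the abstract above, the following sketch illustrates the handover rule of the binary variant of Spray and Wait. It is a minimal illustration only, not the authors' simulation setup: the function name `on_contact`, the copy-count representation, and the choice of the binary variant are all assumptions made here.

```python
def on_contact(copies_a: int, copies_b: int, b_is_destination: bool) -> tuple[int, int, bool]:
    """Binary Spray and Wait rule applied when carrier A meets node B.

    Returns (copies left at A, copies now at B, delivered?).
    Hypothetical helper for illustration; not taken from the paper.
    """
    if copies_a == 0:
        # A carries nothing, so the contact changes nothing.
        return copies_a, copies_b, False
    if b_is_destination:
        # Any carrier delivers directly when it meets the destination.
        return copies_a, copies_b, True
    if copies_a > 1 and copies_b == 0:
        # Spray phase: hand over half of the copies (rounded down).
        handed = copies_a // 2
        return copies_a - handed, handed, False
    # Wait phase: a node holding a single copy keeps it until it
    # meets the destination itself.
    return copies_a, copies_b, False
```

Epidemic routing, by contrast, would copy the message to every node encountered, which trades higher overhead for faster spreading.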
A Resource-Driven Framework for Configurable Entanglement in Quantum Networks
Authors: Francesco Mazza, Claudio Pellitteri, Angela Sara Cacciapuoti, Marcello Caleffi
First: 2026-05-14T16:26:00+00:00 · Latest: 2026-05-14T16:26:00+00:00
Comments: This work has been funded by the European Union under the ERC grant QNattyNet, n. 101169850
Abstract
Shared multipartite entanglement defines a ``whatever channel'', i.e., a latent communication substrate that does not determine a priori which end-to-end entangled links are activated, but can be configured to support different entanglement-connectivity graphs through Local Operations and Classical Communication (LOCC). Building on this, we propose a resource-driven framework in which multipartite entanglement is treated as a programmable resource that induces a space of admissible entanglement-graph configurations. Within this framework, connectivity provisioning emerges as a particular instance of a more general resource reconfiguration process. To support this paradigm, we introduce a set of structural design parameters that characterize the operational degrees of freedom of the resource and define the admissible transformations independently of the specific mechanism used to realize them. We then formalize Entanglement Rolling as a measurement-based protocol that operates over the induced configuration space, enabling the systematic reconfiguration of the shared resource across a family of multipartite states. Finally, we analyze the proposed framework under realistic noise conditions. Leveraging the Noisy Stabilizer Formalism (NSF), we derive closed-form noise maps that characterize the effect of noise on the resource transformations and show that the proposed approach maintains reliable performance under relevant noise processes.
Summary / 总结
Shared multipartite entanglement defines a ``whatever channel'', i.e., a latent communication substrate that does not determine a priori which end-to-end entangled links are activated, but can be configured to support different entanglement-connectivity graphs through Local Operations and Classical Communication (LOCC).
A Tutorial on Cognitive Biases in Agentic AI-Driven 6G Autonomous Networks
Authors: Hatim Chergui, Farhad Rezazadeh, Merouane Debbah, Christos Verikoukis
First: 2025-10-22T19:05:04+00:00 · Latest: 2026-05-14T08:57:53+00:00
Comments: 26 pages, 18 figures, 4 tables, link to source code available. Accepted at IEEE OJCOMS
Abstract
The path to higher network autonomy in 6G lies beyond the mere optimization of key performance indicators (KPIs), requiring systems that perceive and reason over the network environment as it is. This can be achieved through agentic AI, where large language model (LLM)-powered agents utilize multimodal telemetry, memory, and cross-domain negotiation to achieve multi-objective goals. However, deploying such agents introduces cognitive biases inherited from human design, which can severely distort reasoning and actuation. This paper provides a comprehensive tutorial on well-known cognitive biases, detailing their taxonomy, mathematical formulation, emergence in telecom systems, and tailored mitigation strategies. We validate these concepts through two distinct use-cases in 6G management. First, we tackle anchoring bias in inter-slice resource negotiation. To overcome the prohibitive execution delays of cloud-based LLMs, this use-case deploys a locally hosted 1B-parameter model on an RTX A4000 GPU, successfully achieving sub-second inference latencies compatible with near-real-time operations. By replacing fixed heuristic anchors with a Truncated Weibull randomized anchor strategy, the agents dismantle rigid biases, intelligently consume SLA slack, and dynamically double the system-wide energy savings (peaking at 25\%) without violating strict latency limits. Second, we mitigate temporal and confirmation biases in RAN-Edge cross-domain negotiation by designing an unbiased collective memory. By integrating semantic/temporal decay and an inflection bonus that actively highlights past negotiation failures, agents are prevented from over-relying on recent data or repeating past mistakes. Grounding decisions in this richer, debiased historical context yields highly robust agreements, achieving a $\times 5$ latency reduction and roughly 40\% higher energy savings compared to memoryless baselines.
Summary / 总结
The path to higher network autonomy in 6G lies beyond the mere optimization of key performance indicators (KPIs), requiring systems that perceive and reason over the network environment as it is.
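The abstract above mentions a Truncated Weibull randomized anchor strategy without giving details. As a rough sketch of what drawing from a truncated Weibull distribution can look like, the snippet below uses inverse-transform sampling. The function name, the parameter values, and the idea of using the sample directly as an opening offer are assumptions for illustration, not the authors' implementation.

```python
import math
import random


def truncated_weibull(shape, scale, low, high, rng=random):
    """Draw one sample from a Weibull(shape, scale) restricted to [low, high].

    Uses inverse-transform sampling: pick u uniformly between the CDF
    values at the two bounds, then invert the Weibull CDF.
    Requires 0 <= low < high and a finite high.
    """
    def cdf(x):
        return 1.0 - math.exp(-((x / scale) ** shape))

    u = rng.uniform(cdf(low), cdf(high))
    x = scale * (-math.log(1.0 - u)) ** (1.0 / shape)
    # Clamp to guard against floating-point rounding at the bounds.
    return min(max(x, low), high)


# Hypothetical usage: a randomized anchor between 2 and 20 (arbitrary units).
anchor = truncated_weibull(shape=1.5, scale=10.0, low=2.0, high=20.0)
```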
Characterizing AI-Assisted Bot Traffic in Darknet Data: Implications for ICS and IIoT Security
Authors: Alex Carbajal, Caleb Faultersack, Jonahtan Vasquez, Shereen Ismail, Asma Jodeiri Akbarfam
First: 2026-05-14T00:07:20+00:00 · Latest: 2026-05-14T00:07:20+00:00
Comments: This work has been accepted for publication at IEEE AIIIoT 2026
Abstract
The rise of automated scanning tools and AI-assisted reconnaissance agents has significantly altered internet background traffic patterns, threatening the baseline assumptions underlying intrusion detection systems (IDS) deployed in critical infrastructure networks. This paper characterizes the evolution of automated bot traffic by analyzing a longitudinal dataset of 192 million passive darknet packets captured across 2021 and 2025 from the Merit ORION Network Telescope. A modular analysis pipeline was developed to compute metrics including average packet rate, global Shannon entropy, inter-arrival time (IAT) burstiness, geographic attribution, and destination port targeting across key industrial protocols. Results reveal a highly distributed yet focused reconnaissance landscape, with traffic targeting ICS-relevant ports nearly doubling from 0.82% to 1.51% over the four-year period. Furthermore, burstiness analysis exposes intentional micro-pacing behaviors (1 ms to 100 ms delays) that allow modern botnets to artificially smooth their overall volume. Our simulated anomaly-based IDS demonstrates that these evasion techniques enable 97.47% of modern bot traffic to bypass standard volumetric thresholds undetected. Compensatory sensitivity tuning triggers a 68.10% false-positive rate, highlighting fundamental visibility and alerting gaps in operational technology (OT) environments.
Summary / 总结
The rise of automated scanning tools and AI-assisted reconnaissance agents has significantly altered internet background traffic patterns, threatening the baseline assumptions underlying intrusion detection systems (IDS) deployed in critical infrastructure networks.
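Two of the metrics named in the abstract above, Shannon entropy and inter-arrival time (IAT) burstiness, can be sketched in a few lines. The abstract does not say which burstiness definition the authors use; the snippet assumes the common coefficient B = (sigma - mu) / (sigma + mu) over inter-arrival times, and the function names are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`,
    for example destination ports or source addresses."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def burstiness(timestamps):
    """Burstiness coefficient of inter-arrival times (assumed definition).

    Approaches -1 for perfectly regular traffic, is near 0 for
    Poisson-like traffic, and approaches +1 for highly bursty traffic.
    Needs at least two timestamps that are not all identical.
    """
    iats = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mu, sigma = mean(iats), pstdev(iats)
    return (sigma - mu) / (sigma + mu)
```

Under this definition, evenly paced packets score close to -1, which is consistent with the smoothing effect the abstract attributes to micro-pacing.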
Mesh Augmentation of LoRaWAN-based IoT Networks
Authors: Ram Ramanathan, Dmitrii Dugaev, Liang Tan, Warren Ramanathan
First: 2025-11-28T19:02:10+00:00 · Latest: 2026-05-13T20:50:26+00:00
Abstract
LoRaWAN is a leading standard and technology for low-power, long-range Internet-of-Things (IoT) communications. However, its single-hop architecture results in limited effective range and excessive power consumption for end devices, especially when deployed in large, remote and RF-challenged environments. Existing solutions are either incompatible with LoRaWAN, or limit relaying to a single hop. We present LIMA, a protocol for augmenting an existing or new LoRaWAN deployment with a mesh network of LIMA Routers. LIMA increases the effective coverage range well beyond the maximum LoRa range via multi-hopping, and significantly reduces the energy consumed by end devices. LIMA requires no changes to the end device, the servers or the LoRaWAN standard. LIMA builds routes using reverse path forwarding, tunnels LoRaWAN messages over LIMA, provides transparent extension of the existing Adaptive Data Rate (ADR), and suppresses duplicate forwarding if the device is directly reachable from the Gateway. Simulations using Network Simulator 3 (ns-3) show that LIMA improves the delivery rate, scalability, and end-device (ED) energy consumption by up to 5x, 8x and 12.6x, respectively, and reduces latency by up to 2.3x. Table-top and outdoor testing with a prototype constructed using a commercial gateway as a starting point confirm that LIMA can be successfully deployed within an existing LoRaWAN system, and can provide range and energy gains transparently.
Summary / 总结
LoRaWAN is a leading standard and technology for low-power, long-range Internet-of-Things (IoT) communications.
WirelessSenseLLM: Zero-Shot Human Activity Understanding by Bridging Wireless Signals and Human Language
Authors: Mahmuda Keya, Sneh Pillai, Jiawei Yuan, Kai Zeng, Long Jiao
First: 2026-05-13T19:47:07+00:00 · Latest: 2026-05-13T19:47:07+00:00
Comments: Accepted at IEEE SECON 2026. 9 pages, 11 figures
Abstract
There is growing interest in enabling wireless sensing systems to interpret human motion from unsegmented wireless signals; however, existing CSI-based applications rely heavily on accurate signal segmentation and predefined action labels, limiting their applicability in zero-shot scenarios. We present WirelessSenseLLM, a language-driven framework that leverages large language models (LLMs) to enable zero-shot human motion understanding from unsegmented Wi-Fi Channel State Information (CSI). To bridge the modality gap between time-series CSI and discrete language representations, we introduce a CSI-to-Language Adapter and a cross-modal projection mechanism that maps CSI features into a language-aligned semantic space. This design enables the generation of fine-grained natural language descriptions of sequential and overlapping human motions, supporting downstream reasoning without segmented training data. We address two core technical challenges: modality mismatch between CSI features and language embeddings, and overlapping actions in unsegmented CSI streams. Extensive experiments demonstrate strong performance in zero-shot action understanding (92% accuracy and 91% F1-score), language-based reasoning quality (30% factual and 15% reasoning improvements), and multi-person motion explanation with an average 12.33% improvement over prior methods. These results highlight WirelessSenseLLM's effectiveness for robust and interpretable human motion understanding from CSI signals.
Summary / 总结
There is growing interest in enabling wireless sensing systems to interpret human motion from unsegmented wireless signals; however, existing CSI-based applications rely heavily on accurate signal segmentation and predefined action labels, limiting their applicability in zero-shot scenarios.
StormShield: Fingerprint-Based Detection and Mitigation of RRC Signaling Storms in O-RAN 5G RANs
Authors: Noemi Giustini, Andrea Lacava, Leonardo Bonati, Stefano Maxenti, Michele Polese, Tommaso Melodia, Francesca Cuomo
First: 2026-05-13T18:45:50+00:00 · Latest: 2026-05-13T18:45:50+00:00
Comments: 11 pages, 9 figures, 6 tables, 19th ACM WiSec26
Abstract
5G networks provide low latency, high throughput, and massive connectivity, yet the control plane remains exposed to several security threats. Among the most common and impactful threats are Denial-of-Service (DoS) attacks, with Radio Resource Control (RRC) signaling storms being particularly effective and difficult to mitigate. In this attack, a malicious User Equipment (UE) aims to exhaust Next Generation Node B (gNB) resources, preventing legitimate UEs from establishing a connection. Existing defenses are typically limited to detection, only evaluated through numerical simulations, and cannot discern between high-load network conditions and attacks. Most of them also assume static setups and do not take mobility into account. In this paper, we first evaluate the feasibility of the signaling storm attack by using the OpenAirInterface (OAI) 5G protocol stack. Then, we propose StormShield, a signaling storm attack detection and mitigation technique implemented as an xApp on an O-RAN Near-Real-Time (near-RT) RAN Intelligent Controller (RIC). It fingerprints and blocks malicious UEs (MUEs) before gNB resources are exhausted. We prototyped our solution on an Over-The-Air (OTA) testbed with OAI, NVIDIA Aerial, and two different gNB setups. The first one leverages a USRP X410 Software-Defined Radio (SDR) with 8.1 functional split; the second uses a commercial Foxconn Radio Unit (RU) with 7.2 functional split. Our experimental evaluation demonstrates that StormShield effectively prevents gNB resource exhaustion, identifying and blocking MUEs with an average detection accuracy of 97.6% within 106.5 ms from the beginning of the attack.
Summary / 总结
5G networks provide low latency, high throughput, and massive connectivity, yet the control plane remains exposed to several security threats.
KVServe: Service-Aware KV Cache Compression for Communication-Efficient Disaggregated LLM Serving
Authors: Zedong Liu, Xinyang Ma, Dejun Luo, Hairui Zhao, Bing Lu, Wenjing Huang, Yida Gu, Xingchen Liu, Zheng Wei, Jinyang Liu, Dingwen Tao, Guangming Tan
First: 2026-05-13T16:12:33+00:00 · Latest: 2026-05-13T16:12:33+00:00
Comments: Accepted by SIGCOMM 2026
Abstract
LLMs are widely adopted in production, pushing inference systems to their limits. Disaggregated LLM serving (e.g., PD separation and KV state disaggregation) improves scalability and cost efficiency, but it also turns KV into an explicit payload crossing network and storage boundaries, making KV a dominant end-to-end bottleneck. Existing KV compression methods are typically static runtime configurations, even though the production service context varies over time in workload mix, bandwidth, and SLO/quality budgets. As a result, a fixed choice can be suboptimal or even increase latency. We present \emph{KVServe}, the first service-aware and adaptive KV communication compression framework for disaggregated LLM serving: KVServe (1) unifies KV compression into a modular strategy space with new components and cross-method recomposition; (2) introduces a Bayesian Profiling Engine that efficiently searches this space and distills a 3D Pareto candidate set, reducing offline search overhead by $50\times$; and (3) deploys a Service-Aware Online Controller that combines an analytical latency model with a lightweight bandit to select profiles under constraints and correct offline-to-online mismatch. Integrated into vLLM and evaluated across datasets, models, GPUs and networks, KVServe achieves up to $9.13\times$ JCT speedup in PD-separated serving and up to $32.8\times$ TTFT reduction in KV-disaggregated serving.
Summary / 总结
LLMs are widely adopted in production, pushing inference systems to their limits.
Identifying AI Web Scrapers Using Canary Tokens
Authors: Steven Seiden, Triss Ren, Caroline Zhang, Taein Kim, Enze Liu, Emily Wenger
First: 2026-05-13T15:53:57+00:00 · Latest: 2026-05-13T15:53:57+00:00
Abstract
From pre-training to query-time augmentation, web-scraped data helps to improve the quality and contextual relevancy of content generated by large language models (LLMs). However, large-scale web scraping to feed LLMs can affect site stability and raise legal, privacy, or ethics concerns. If website owners wish to limit LLM-related web scraping on their site, due to these or other concerns, they may turn to scraper access control mechanisms like the Robots Exclusion Protocol. To be most effective, such mechanisms require site owners to first identify the scrapers that they wish to restrict (e.g., via User-Agent strings). Existing mechanisms to identify LLM-related scrapers rely on voluntary disclosure by companies, one-off experiments by researchers, or crowd-sourced reports -- methods that are neither reliable nor scalable. This paper proposes a novel technique for accurately and automatically inferring LLM-related scrapers. We host dynamic websites that serve unique canary tokens to each visiting scraper, then prompt LLMs for information about our sites. If an LLM consistently generates outputs containing tokens unique to a scraper, it provides evidence of exposure to that scraper. Via experiments across 22 production LLM systems, we demonstrate that our approach can reliably identify which scrapers feed which LLM, including several that are not publicly known or disclosed by the companies. Our approach provides a promising avenue for unprivileged third parties to infer which scrapers serve data to which LLMs, potentially enabling better control over unwanted scraping.
Summary / 总结
From pre-training to query-time augmentation, web-scraped data helps to improve the quality and contextual relevancy of content generated by large language models (LLMs).
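The canary-token idea described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: deriving the token from the User-Agent string with an HMAC, the `SECRET_KEY` value, and the function names `canary_token` and `scrapers_exposed` are all hypothetical choices made here.

```python
import hashlib
import hmac

# Hypothetical private key held by the site operator.
SECRET_KEY = b"replace-with-a-private-key"


def canary_token(user_agent: str, length: int = 12) -> str:
    """Derive a stable token that is unique to one scraper identity.

    Keying the hash keeps tokens unguessable to third parties.
    """
    digest = hmac.new(SECRET_KEY, user_agent.encode("utf-8"), hashlib.sha256)
    return "canary-" + digest.hexdigest()[:length]


def scrapers_exposed(llm_output: str, known_agents: list[str]) -> list[str]:
    """Return the scraper identities whose token appears in an LLM's output."""
    return [ua for ua in known_agents if canary_token(ua) in llm_output]
```

A single match is weak evidence; the abstract stresses that an LLM must consistently reproduce a scraper's token before exposure is inferred, so in practice the check would be repeated over many prompts.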
MultiPath Memory Access: Breaking Host-GPU Bandwidth Bottlenecks in LLM Services
Authors: Lingfeng Tang, Daoping Zhang, Junjie Chen, Peihao Huang, Feng Jin, Chengguang Xu, Yuxin Chen, Feiqiang Sun, Guo Chen
First: 2025-12-18T00:45:00+00:00 · Latest: 2026-05-13T15:33:38+00:00
Abstract
Host-GPU data movement has become a latency-critical bottleneck in LLM serving, surfacing in common paths such as model-weight movement and KV cache offload/fetch. Today, each host-GPU copy is effectively confined to the PCIe path of the target GPU, even though modern multi-GPU servers contain additional PCIe links on peer GPUs and high-bandwidth GPU interconnects. This leaves substantial intra-server I/O capacity unused. To address this issue, we present Multipath Memory Access (MMA), a software-defined multipath memory access system for host--GPU data transfer. To the best of our knowledge, MMA is the first software-defined system to enable efficient multipath host--GPU data transfer within a single multi-GPU server. MMA expands a single host--GPU copy across available direct and relay paths without hardware, driver, or application changes. It preserves CUDA stream semantics with a dependency-preserving Dummy Task, coordinates distributed micro-transfer completion through a lightweight synchronization mechanism, and uses queue backpressure to route traffic without explicit link-state feedback. On an 8-GPU NVIDIA H20 server, MMA achieves 245 GB/s peak host-to-GPU bandwidth, a 4.62x improvement over native CUDA copies, and reduces TTFT for KV cache fetching by 1.14-2.38x and model wake-up/switching latency by 1.12-2.48x.
Summary / 总结
Host-GPU data movement has become a latency-critical bottleneck in LLM serving, surfacing in common paths such as model-weight movement and KV cache offload/fetch.
Avoiding Cross-Datacenter Collective Congestion via Disaggregated Buffering
Authors: Mariano Scazzariello, Noga H. Rotman, Dima Gavrilenko, Sajy Khashab, Alexander Shpiner, Matty Kadosh, Marco Chiesa, Dejan Kostic, Mark Silberstein
First: 2026-05-12T09:38:25+00:00 · Latest: 2026-05-13T14:24:02+00:00
Abstract
LLM training at the scale of tens of thousands of GPUs now spans multiple datacenters (DC), making cross-DC collectives over long-haul links unavoidable. A critical and overlooked bottleneck arises when these collectives collide with intra-DC traffic at the destination - a common pattern in real workloads. The multi-millisecond congestion control loop is too slow to react, triggering severe packet loss and congestion collapse.
We present Spillway, a transparent in-network mechanism that buffers dropped packets in switch-disaggregated buffers in a destination data center and drains them once congestion subsides. Through large-scale end-to-end simulations and a hardware prototype, we show that Spillway eliminates performance degradation from collective collisions, reducing iteration time by up to 14%, without changes to end hosts or training frameworks.
Summary / 总结
LLM training at the scale of tens of thousands of GPUs now spans multiple datacenters (DC), making cross-DC collectives over long-haul links unavoidable.
Toward Practical Age-of-Information Scheduling in 5G Cellular
Authors: Zhuoyi Zhao, Igor Kadota
First: 2026-05-13T05:04:52+00:00 · Latest: 2026-05-13T05:04:52+00:00
Abstract
We consider a 5G cellular network where a gNB schedules time-sensitive uplink transmissions from multiple UEs and forwards received packets to remote destinations. In practical 5G networks, the gNB does not directly observe the destination-side Age of Information (AoI) and must make scheduling decisions under stringent slot-level runtime constraints. In this paper, we develop a low-complexity AoI-aware scheduling policy for 5G cellular under limited observability. We first design a low-complexity estimator that infers UE-side packet timestamps and destination-side AoI from gNB-visible observations. Based on these estimates, we propose and implement a Max-Weight policy (MW-LC) in NetSim, a 5G emulator with a standards-compatible protocol stack, to showcase its performance against baseline 5G scheduling policies. Furthermore, we use MATLAB simulations to show that the LC estimator and MW-LC achieve performance close to a richer estimator-based AoI policy from the literature. The estimator may be of independent interest to the community, enabling AoI-aware algorithms beyond 5G scheduling.
Summary / 总结
We consider a 5G cellular network where a gNB schedules time-sensitive uplink transmissions from multiple UEs and forwards received packets to remote destinations.
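To make the Max-Weight idea in the abstract above concrete, here is a minimal sketch of a generic age-aware max-weight rule. It is not the paper's MW-LC policy: the scoring formula (weight times success probability times estimated age) and the names `max_weight_pick` and `step` are assumptions for illustration, and the ages are passed in directly rather than produced by an estimator.

```python
def max_weight_pick(ages, weights, success_probs):
    """Return the index of the UE with the largest weight * p * age score."""
    scores = [w * p * a for a, w, p in zip(ages, weights, success_probs)]
    return max(range(len(scores)), key=scores.__getitem__)


def step(ages, chosen, delivered):
    """Advance one slot: the served UE resets to age 1, all others grow by 1."""
    return [1 if (i == chosen and delivered) else a + 1
            for i, a in enumerate(ages)]


if __name__ == "__main__":
    ages = [3, 5, 2]
    chosen = max_weight_pick(ages, weights=[1, 1, 4], success_probs=[0.9, 0.5, 0.8])
    print("scheduled UE:", chosen)
    print("next ages:", step(ages, chosen, delivered=True))
```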
Dynamic Transaction Scheduling and Pricing in the Ethereum Mempool
Authors: Fatemeh Fardno, S. Rasoul Etesami
First: 2026-05-12T22:20:29+00:00 · Latest: 2026-05-12T22:20:29+00:00
Abstract
The Ethereum blockchain utilizes the EIP-1559 algorithm to manage transaction inclusion and block assembly. However, EIP-1559 and much of the existing literature study this problem from a static perspective, focusing on price evolution without modelling transaction dynamics within the mempool. Motivated by this limitation, we study a dynamic transaction scheduling problem in which transactions with heterogeneous sizes and per-unit values arrive over time and remain in the mempool until scheduled. To capture the stochastic mempool evolution, we formulate the problem as a Markov Decision Process (MDP) whose state represents the mempool configuration and whose actions correspond to block prices. We first provide a primal-dual interpretation of the static EIP-1559 mechanism, showing that block prices arise naturally as dual variables of a social-welfare maximization problem. Building on this perspective, we extend the framework to the dynamic setting and formulate an objective that maximizes long-run discounted reward while incorporating holding costs and overshoot penalties. We then employ a Natural Policy Gradient (NPG) algorithm to compute the optimal policy. Our results show that dynamic pricing stabilizes the mempool while maximizing long-run discounted reward. In particular, as the overshoot penalty increases, the average scheduled transaction volume converges to the target block capacity, and the resulting NPG updates closely resemble the EIP-1559 price update rule. Finally, we study two special cases of the MDP formulation: homogeneous transactions and uniform arrivals. In the homogeneous setting, where the protocol directly controls scheduled volume, we show that the optimal policy has a threshold structure. We then propose a bang-bang pricing mechanism for uniform arrivals and derive a lower bound on the block capacity needed to ensure system stability.
Summary / 总结
The Ethereum blockchain utilizes the EIP-1559 algorithm to manage transaction inclusion and block assembly.
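For readers unfamiliar with the EIP-1559 price update rule that the abstract above compares against, the following sketch reproduces the base-fee update as described in the EIP-1559 specification, using integer arithmetic and a maximum change of one eighth per block. The function and parameter names are chosen for this example; it illustrates only the static rule, not the paper's MDP or NPG policy.

```python
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8  # caps the per-block change at 12.5%


def next_base_fee(parent_base_fee: int, gas_used: int, gas_target: int) -> int:
    """Compute the next block's base fee from the parent block's usage."""
    if gas_used == gas_target:
        return parent_base_fee
    if gas_used > gas_target:
        delta = gas_used - gas_target
        change = parent_base_fee * delta // gas_target // BASE_FEE_MAX_CHANGE_DENOMINATOR
        # The fee rises by at least 1 whenever the block is above target.
        return parent_base_fee + max(change, 1)
    delta = gas_target - gas_used
    change = parent_base_fee * delta // gas_target // BASE_FEE_MAX_CHANGE_DENOMINATOR
    return parent_base_fee - change


if __name__ == "__main__":
    target = 15_000_000
    print(next_base_fee(1000, 30_000_000, target))  # full block
    print(next_base_fee(1000, 0, target))           # empty block
```

A full block (twice the target) raises the fee by one eighth, and an empty block lowers it by one eighth.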
Large Language Models for Agentic NetOps and AIOps: Architectures, Evaluation, and Safety
Authors: Muhammad Bilal, Jon Crowcroft, Ruizhi Wang, Xiaolong Xu, Schahram Dustdar
First: 2026-05-12T20:31:41+00:00 · Latest: 2026-05-12T20:31:41+00:00
Comments: 50 pages, 15 figures, 6 tables; survey article
Abstract
Large language models are increasingly being used to support network operations (NetOps) and artificial intelligence for IT operations (AIOps), including incident investigation, root-cause analysis, configuration synthesis, and limited self-healing. In both NetOps and AIOps, this shift is changing how tasks are managed. Agent-based operations work as workflows, from gathering evidence to taking action, following permissions, policies, and checks, and providing rollback options when necessary. This is crucial because operational decisions can have instant impacts. To make the argument concrete, we organise the relevant literature around the hierarchy of autonomy, tool scope, evidence traces, and assurance contracts. These contracts define what an agent may observe, propose, and execute. They also define the checks that must pass before any action is allowed. A consistent pattern appears across work on telemetry query recommendation, diagnosis, root-cause analysis, configuration synthesis, change planning, and limited self-healing. Operational reliability does not come chiefly from the model itself. It depends on the machinery around the model. We also argue that evaluation should go beyond static question answering. Agentic NetOps and AIOps systems require workflow-centred evaluation, including trace quality, bounded tool use, safe proposal generation, replay in sandboxed environments, and canary trials with rollback-aware scoring. Without these measures, a system may appear robust yet remain too fragile. Finally, we examine security, privacy, and governance risks that become acute when agents sit close to operational control surfaces. Taken together, the survey concludes that progress in intelligent NetOps and AIOps will depend on treating autonomy as a constrained operational control problem, whose outputs must be reliable, auditable, and securely deployable.
Summary / 总结
Large language models are increasingly being used to support network operations (NetOps) and artificial intelligence for IT operations (AIOps), including incident investigation, root-cause analysis, configuration synthesis, and limited self-healing.
netFound: Principled Design for Network Foundation Models
Authors: Sylee Beltiukov, Satyandra Guthula, Haarika Manda, Jaber Daneshamooz, Wenbo Guo, Walter Willinger, Arpit Gupta, Inder Monga
First: 2023-10-25T22:04:57+00:00 · Latest: 2026-05-12T17:50:15+00:00
Abstract
Network foundation models promise reusable representations for diverse traffic analysis tasks, but recent diagnostic works have revealed fundamental problems: models exploit dataset shortcuts rather than learning genuine traffic patterns, produce collapsed embedding spaces, and fail to capture the exogenous network conditions that shape real-world behavior. We translate these diagnostic insights into four concrete design principles: protocol-aware tokenization, operational context embedding, burst-flow hierarchical attention, and privacy-by-construction input design, and build netFound, a network foundation model whose architecture is motivated by this failure analysis. We pretrain netFound on a billion-token-scale corpus over 5000 GPU hours, and demonstrate that it produces high-quality representations with lower anisotropy, significantly higher alignment with domain-expert features, and an F1 of 0.95 on exogenous context discrimination where existing state-of-the-art models score below 0.62, while preserving privacy by excluding payload and IP addresses. netFound demonstrates significant improvements in frozen-encoder evaluation, showing that pretrained embeddings themselves carry useful structure, and remains the top performer across all benchmarks in end-to-end fine-tuned settings. We release full open-source code, weights for three model sizes on HuggingFace, a containerized pipeline from raw PCAPs to downstream inference, and the full 4.2 billion flows pretraining dataset to facilitate reproducibility and further research.
Summary / 总结
Network foundation models promise reusable representations for diverse traffic analysis tasks, but recent diagnostic works have revealed fundamental problems: models exploit dataset shortcuts rather than learning genuine traffic patterns, produce collapsed embedding spaces, and fail to capture the exogenous network conditions that shape real-world behavior.
Capacity Scalability of LEO Constellations With Dynamic Link Failures
Authors: Wei Li, Min Sheng
First: 2026-05-12T14:03:37+00:00 · Latest: 2026-05-12T14:03:37+00:00
Abstract
Dynamic link failures disrupt the connectivity and geometric symmetry of the constellation structure, thereby increasing protocol overhead and degrading the effective capacity for traffic transport. The fundamental relationship between constellation size and effective capacity under protocol overhead constraints remains unclear. To this end, we define capacity scalability as the ratio of constellation capacity under non-failure conditions to protocol overhead. Specifically, if ISL states follow a two-state discrete Markov chain and the maintenance period is $k \geq 1$, the upper bound of capacity scalability under the uniform traffic pattern is $O(1/n)$, where $n$ is the number of satellites. With perfect information about the constellation topology, the upper bound can be achieved via shortest-path routing. For any given protocol, there exists an optimal constellation deployment scale in terms of capacity scalability. When the constellation size is below this optimum scale, capacity scalability increases with constellation size, thereby improving effective capacity. Increasing the maintenance period $k$ can improve capacity scalability, but it does not change the fact that the capacity scalability converges to zero when the constellation size exceeds the optimal scale.
Summary / 总结
Dynamic link failures disrupt the connectivity and geometric symmetry of the constellation structure, thereby increasing protocol overhead and degrading the effective capacity for traffic transport.
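The two-state discrete Markov chain assumed for ISL states in the abstract above can be illustrated with a short sketch. The transition probabilities `p_fail` (up to down) and `p_repair` (down to up) are hypothetical parameter names, and the code only computes long-run link availability; it does not reproduce the paper's capacity-scalability bound.

```python
def stationary_up_probability(p_fail: float, p_repair: float) -> float:
    """Closed-form long-run fraction of time a link is up."""
    return p_repair / (p_fail + p_repair)


def iterate_up_probability(p_fail: float, p_repair: float,
                           steps: int, start_up: float = 1.0) -> float:
    """Propagate P(link is up) through the chain for a number of steps."""
    up = start_up
    for _ in range(steps):
        up = up * (1.0 - p_fail) + (1.0 - up) * p_repair
    return up


if __name__ == "__main__":
    print(stationary_up_probability(0.1, 0.3))
    print(iterate_up_probability(0.1, 0.3, steps=200))
```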
Combinatorics of nondeterministic walks
Authors: Élie de Panafieu, Michael Wallner
First: 2023-11-06T16:24:51+00:00 · Latest: 2026-05-12T10:32:36+00:00
Comments: 51 pages plus 5 pages of appendix
Abstract
This paper introduces nondeterministic walks, a new variant of one-dimensional discrete walks. The main difference from classical walks is that its nondeterministic steps consist of sets of steps from a predefined set such that all possible extensions are explored in parallel. We discuss in detail the most natural nondeterministic step sets (Dyck and Motzkin step sets), and show that several nondeterministic classes of lattice paths, such as nondeterministic bridges, excursions, and meanders, are algebraic. The key concept is the generalization of the ending point of a walk to its reachable points, i.e., a set of ending points. We extend our results to general step sets: We show that nondeterministic bridges and several subclasses of nondeterministic meanders are always algebraic. We conjecture the same is true for nondeterministic excursions, and we present Python and Maple packages to support our conjecture. This research is motivated by the study of networks involving encapsulation and decapsulation of protocols. Our results are obtained using generating functions, analytic combinatorics, and additive combinatorics.
Keywords. Random walks, analytic combinatorics, generating functions, limit laws, networking, encapsulation.
Summary / 总结
This paper introduces nondeterministic walks, a new variant of one-dimensional discrete walks.
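The notion of reachable points described in the abstract above can be sketched in a few lines. This is an illustration only, not the authors' Python package: it assumes that a nondeterministic step is a nonempty set of classical steps (for the Dyck case, a nonempty subset of {-1, +1}) and that every choice within a step is explored in parallel. The name `reachable_points` is made up for this example.

```python
def reachable_points(walk, start=0):
    """Return the set of end points reachable by a nondeterministic walk.

    `walk` is a sequence of nondeterministic steps, each a nonempty set of
    integer steps; every combination of choices is followed in parallel.
    """
    points = {start}
    for nstep in walk:
        points = {x + s for x in points for s in nstep}
    return points


if __name__ == "__main__":
    # Two fully nondeterministic Dyck steps.
    print(sorted(reachable_points([{-1, 1}, {-1, 1}])))
    # A forced up-step followed by a nondeterministic step.
    print(sorted(reachable_points([{1}, {-1, 1}])))
```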
Agents Should Replace Narrow Predictive AI as the Orchestrator in 6G AI-RAN
Authors: Pranshav Gajjar, Vijay K Shah
First: 2026-05-12T04:39:34+00:00 · Latest: 2026-05-12T04:39:34+00:00
Abstract
This position paper argues that to achieve Level 5 autonomous 6G networks, the next generation of Artificial Intelligence in Radio Access Networks (AI-RAN) should transition away from fragmented, narrow predictive models and instead adopt multimodal Large Language Models (LLMs) as central reasoning agents. Current AI-RAN architectures rely on disjointed Deep Neural Networks (DNNs) and Deep Reinforcement Learning (DRL) agents that operate in isolated domains. These narrow models suffer from siloed knowledge, severe brittleness to out-of-distribution dynamics, and a fundamental inability to bridge the intent gap, that is, the semantic disconnect between high-level, unstructured operator directives and rigid numerical network configurations. We propose elevating LLMs, or domain-adapted Large Telecom Models (LTMs), to act as the cognitive operating system situated within the RAN Intelligent Controller (RIC), the control and orchestration layer of AI-RAN. In this architecture, LLMs do not replace narrow models but orchestrate them as executable subroutines, dynamically translating human intent into concrete policies and utilizing Retrieval-Augmented Generation (RAG) to autonomously diagnose complex, multi-vendor network anomalies. To make this architectural shift a reality, we call upon the machine learning community to prioritize critical foundational research tailored to the strict constraints of telecommunications, specifically focusing on continuous alignment via network-driven feedback (RLNF), extreme sub-8-bit edge quantization, neuro-symbolic verification to curb hallucinations, and securing orchestration frameworks against adversarial prompt injections.
Summary / 总结
This position paper argues that to achieve Level 5 autonomous 6G networks, the next generation of Artificial Intelligence in Radio Access Networks (AI-RAN) should transition away from fragmented, narrow predictive models and instead adopt multimodal Large Language Models (LLMs) as central reasoning agents.
More Than Meets the Eye: A Semantics-Aware Traffic Augmentation Framework for Generalizable Website Fingerprinting
Authors: Youquan Xian, Xueying Zeng, Lingjia Meng, Lei Cui, Runhan Song, Wei Wang, Zhengquan Ding, Peng Liu, Zhiyu Hao
First: 2026-05-12T01:48:05+00:00 · Latest: 2026-05-12T01:48:05+00:00
Comments: 18 pages, 19 figures, Submitted to NDSS 2027
Abstract
Deep learning-based website fingerprinting has emerged as an effective technique for inferring the websites users visit. Although existing methods achieve strong performance on closed-world datasets, they often fail to generalize to real-world environments, especially under geographic and temporal shifts. This limitation fundamentally stems from the coupled effects of two key challenges: application-layer resource composition variability and observable feature instability induced by cross-layer encapsulation. Intertwined, these factors induce systematic shifts between underlying application semantics and observable traffic features. To address the above challenges, we propose SATA, a semantics-aware traffic augmentation framework. Specifically, SATA first performs application-layer semantic augmentation based on protocol rules, expanding the resource composition patterns within each flow and frame sequence patterns under protocol constraints. Based on these augmented frame sequences, we further introduce a cross-layer feature alignment mechanism via knowledge distillation. It aligns frame sequence with packet-length sequence features, enabling cross-layer feature alignment between enhanced semantics and observable sequences. Extensive experiments show that SATA successfully generates traffic patterns that are absent from the training set but genuinely exist in the test set, and significantly improves the performance of mainstream models across diverse and complex scenarios. In particular, in open-world settings, SATA improves ACC by 90.81% and AUROC by 48.37%. The source code of the prototype system is available at https://anonymous.4open.science/r/SATA-B6C2/.
Summary / 总结
Deep learning-based website fingerprinting has emerged as an effective technique for inferring the websites users visit.
Large Spectrum Models (LSMs): Decoder-Only Transformer-Powered Spectrum Activity Forecasting via Tokenized RF Data
Authors: Mohammad Mosiur Lunar, Mehmet C. Vuran
First: 2026-05-11T16:43:55+00:00 · Latest: 2026-05-11T16:43:55+00:00
Abstract
Dynamic spectrum access (DSA) has become a key pillar of next-generation wireless systems to address the spectrum scarcity due to the rapid growth of connected devices. Accurate short-term spectrum forecasting is critical for DSA, where data-driven approaches have proven most effective. Recent advances in and widespread adoption of large language model (LLM) architectures present new opportunities for spectrum prediction. In this paper, foundational large spectrum models (LSMs) are presented. A novel RF tokenizer is introduced to convert raw IQ measurements into token sequences by mapping each power-spectral density value to a fixed vocabulary along with embedding gain, frequency, FFT bin, and timestamp information. Five established open-source LLM architectures (Gemma-2B, GPT-2, LLaMA-7B, Mistral-7B, and Phi-1) are trained on this tokenized spectrum data for the task of spectrum forecasting, yielding LSMs. To leverage the scaling gains of LSMs, a fully automated outdoor wireless testbed is employed to collect over 22 TB of raw spectrum data across 33 sub-GHz frequency bands, yielding 8.4B tokens in total. Across all 33 bands, the best model (LSM-Mistral) achieves a root-mean-square error of 3.25 dB and 97% of predictions have a mean absolute error below 5 dB. Generalization of LSMs is illustrated by fine-tuning the models on data collected in different locations, where RMSE is maintained below 3.7 dB. These results demonstrate that widespread decoder-only transformer architectures can serve as effective predictive models for large-scale RF spectrum forecasting.
Summary / 总结
Dynamic spectrum access (DSA) has become a key pillar of next-generation wireless systems to address the spectrum scarcity due to the rapid growth of connected devices.
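The abstract above describes mapping each power-spectral density value to a fixed vocabulary. A minimal way to picture that step is uniform quantization, sketched below. This is an assumption, not the paper's tokenizer: the dB range, the vocabulary size, and the function name `psd_to_token` are all hypothetical, and the embedding of gain, frequency, FFT bin, and timestamp information is left out.

```python
def psd_to_token(psd_db: float, lo_db: float = -120.0,
                 hi_db: float = -20.0, vocab_size: int = 256) -> int:
    """Map a PSD value in dB to a token id by uniform quantization."""
    clipped = min(max(psd_db, lo_db), hi_db)      # clamp to the assumed range
    frac = (clipped - lo_db) / (hi_db - lo_db)    # position in [0, 1]
    return min(int(frac * vocab_size), vocab_size - 1)


def tokenize_sweep(psd_values_db):
    """Tokenize one sweep of PSD values, one token per FFT bin."""
    return [psd_to_token(v) for v in psd_values_db]


if __name__ == "__main__":
    print(tokenize_sweep([-130.0, -120.0, -70.0, -20.0]))
```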
A Case for CATS: A Conductor-driven Asymmetric Transport Scheme for Semantic Prioritization
Authors: Syed Muhammad Aqdas Rizvi
Venue: 2025 6th International Conference on Innovative Computing (ICIC)
First: 2026-03-14T13:36:15+00:00 · Latest: 2026-05-11T12:00:36+00:00
Comments: Extended version. Contains additional mathematical formalization of the deadlock resolution constraint, detailed ns-3 simulation parameters, and further details on possible future work and extensions not present in the IEEE conference proceedings. 7 pages, 3 figures, 2 tables. Code available at https://github.com/smarizvi110/cats
Abstract
Standard transport protocols like TCP operate as a blind, FIFO conveyor belt for data, a model that is increasingly suboptimal for latency-sensitive and interactive applications. This paper challenges this model by introducing CATS (Conductor-driven Asymmetric Transport Scheme), a framework that provides TCP with the semantic awareness necessary to prioritize critical content. By centralizing scheduling intelligence in a transport-native "Conductor", CATS significantly improves user-perceived performance by delivering essential data first. This architecture directly confronts a cascade of historical performance workarounds and their limitations, including the high overhead of parallel connections in HTTP/1.1, the transport-layer Head-of-Line blocking in HTTP/2, and the observed implementation heterogeneity of prioritization in HTTP/3 over QUIC. Built upon TCP BBR, our ns-3 implementation demonstrates this principle by reducing the First Contentful Paint by over 78% in a representative webpage download configured as a deliberate worst-case scenario, with no penalty to total page load time compared to the baseline.
Summary / 总结
Standard transport protocols like TCP operate as a blind, FIFO conveyor belt for data, a model that is increasingly suboptimal for latency-sensitive and interactive applications.
Agentic Performance at the Edge: Insights from Benchmarking
Authors: Shiqiang Wang, Herbert Woisetschläger
First: 2026-05-11T11:24:20+00:00 · Latest: 2026-05-11T11:24:20+00:00
Comments: Accepted to AutoEdge workshop, co-located with MobiSys 2026
Abstract
Agentic artificial intelligence (AI) is a natural fit for Internet of Things (IoT) and edge systems, but edge deployments are often constrained to models around 8 billion parameters or smaller. An important question is: How much agentic-task quality is lost when model size is constrained by memory, power, and latency budgets? To address this question, in this paper, we provide an initial empirical study considering edge-focused model scaling, general-purpose versus coder-oriented model effects, and tool-enabled execution under a fixed protocol. We introduce a domain-conditioned evaluation methodology, an implementation-grounded analysis of model-tool interactions, practical guidance for model selection under constraints, and an analysis of failure modes that reveals distinct semantic versus execution failure patterns across model families. Our core finding is that edge-agent quality is not a simple function of parameter count. Robust deployment depends on the joint design of model choice and tool workflow. Domain-conditioned analysis reveals Pareto fronts in the accuracy-latency space that can guide strategy selection based on operational priorities.
Summary / 总结
Agentic artificial intelligence (AI) is a natural fit for Internet of Things (IoT) and edge systems, but edge deployments are often constrained to models around 8 billion parameters or smaller.
Is DRL-based MAC Ready for Underwater Acoustic Networks? Exploring Its Practicality in Real Field Experiments
Authors: Jiani Guo, Bingwen Huangfu, Shanshan Song, Nan Sun, Miao Pan, Guangjie Han
First: 2026-05-11T07:53:30+00:00 · Latest: 2026-05-11T07:53:30+00:00
Abstract
Medium Access Control (MAC) protocols rely on neighbor and environment information to design collision-free access rules for Underwater Acoustic Networks (UANs). Acquiring this information suffers from high communication overhead due to the unique underwater acoustic channel characteristics, such as long propagation delay, spatiotemporal variations in communication quality, and high attenuation. Deep Reinforcement Learning (DRL) is promising to circumvent the UANs' physical constraints and provide a low-overhead solution for underwater MAC protocols, since it can decide access rules based on real-time observation without extra information exchange. However, the unique underwater acoustic channel characteristics impose significant challenges on observation acquisition, training time, and the balance of multiple reward factors for DRL-based MAC protocols. Most existing methods remain at the theoretical level: (1) they design partial intelligent agents failing to achieve fully autonomous access; (2) they assume unreasonable simulation scenarios, weakening the effects of underwater acoustic channel characteristics on MAC protocols. To enhance the practicality of DRL-based MAC protocols, we first analyze the application challenges of DRL in UANs through real field experiments. Based on the above challenges, we propose a DRL-based MAC protocol that considers observation loss and balances multiple reward factors to achieve efficient Entire Autonomous access in the UAN (EA-MAC). To further explore the feasibility of DRL-based MAC protocols, we implement EA-MAC and other state-of-the-art protocols on underwater acoustic modems and evaluate their performance in real field experiments. Experimental results demonstrate that EA-MAC can adaptively determine the scheduling sequence for each node, enabling high-throughput and fair communication in a straightforward manner for UANs.
Summary / 总结
Medium Access Control (MAC) protocols rely on neighbor and environment information to design collision-free access rules for Underwater Acoustic Networks (UANs).
GELATO: Generative Entropy- and Lyapunov-based Adaptive Token Offloading for Device-Edge Speculative LLM Inference
Authors: Zengzipeng Tang, Yuxuan Sun, Wei Chen, Jianwen Ding, Bo Ai
First: 2026-05-11T07:38:56+00:00 · Latest: 2026-05-11T07:38:56+00:00
Comments: This work has been submitted to the IEEE for possible publication
Abstract
The recent growth of on-device Large Language Model (LLM) inference has driven significant interest in device-edge collaborative LLM inference. As a promising architecture, Speculative Decoding (SD) is increasingly adopted, where a lightweight draft model rapidly generates candidate tokens to be verified by a powerful target model. However, a fundamental challenge lies in achieving per-token resource scheduling to effectively adapt the SD paradigm to resource-constrained edge environments. This paper proposes a Generative Entropy- and Lyapunov-based Adaptive Token Offloading framework, named GELATO, to maximize decoding throughput under energy constraints in a device-edge collaborative SD system. Specifically, an outer drift-plus-penalty loop makes online decisions to establish a reference drafting budget, managing the long-term energy-throughput trade-off. Further, a nested entropy-driven generation mechanism executes early exiting to adapt to per-token dynamic generative uncertainty. Theoretical analysis establishes a rigorous performance bound on long-term throughput for GELATO. Extensive evaluations demonstrate that GELATO achieves a globally optimal trade-off, outperforming state-of-the-art distributed SD architectures by 64.98% in token throughput and reducing energy consumption by 47.47% under resource-constrained environments, while preserving LLM decoding quality.
Summary / 总结
The recent growth of on-device Large Language Model (LLM) inference has driven significant interest in device-edge collaborative LLM inference.
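The entropy-driven early exit mentioned in the abstract above can be pictured with a small sketch: drafting continues until the draft model's next-token distribution becomes too uncertain or the drafting budget runs out. This is a simplified assumption, not GELATO's algorithm. The threshold, the function names, and the choice to stop before the uncertain token are hypothetical, and the budget is passed in as a constant rather than set by a drift-plus-penalty loop.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def draft_length(step_distributions, budget, entropy_threshold):
    """Count how many tokens to draft before exiting early.

    Drafting stops at the first step whose entropy exceeds the threshold,
    or once `budget` tokens have been drafted.
    """
    drafted = 0
    for probs in step_distributions[:budget]:
        if entropy(probs) > entropy_threshold:
            break
        drafted += 1
    return drafted


if __name__ == "__main__":
    confident = [0.97, 0.01, 0.01, 0.01]
    uncertain = [0.25, 0.25, 0.25, 0.25]
    steps = [confident, confident, uncertain, confident]
    print(draft_length(steps, budget=4, entropy_threshold=0.5))
```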
DQN-Driven Adaptive Neighbor Discovery for Directional Aerial Networks
Authors: Md Asif Ishrak Sarder, Murat Yuksel, Elizabeth Bentley
First: 2026-05-11T03:49:48+00:00 · Latest: 2026-05-11T03:49:48+00:00
Comments: Accepted at IEEE ICC 2025. This is the author-accepted manuscript
Abstract
Directional antenna systems are gaining substantial traction for aerial networks due to their higher gain, extended transmission range, and enhanced security. However, the requirement of beam alignment makes the task of finding and reaching neighbors challenging, particularly in a mobile setting. For wireless networks, privacy concerns play an equally critical role. However, the problem of ensuring network-wide connectivity while maintaining limited exposure when probing around is still unexplored. We address this trade-off by proposing an adaptive transceiver selection protocol based on the Deep Q-Network (DQN) framework. Each node acts as an independent DQN agent and interacts with the environment to learn how to balance the trade-off. Since the directional nodes operate only based on local observations, we adopt a weighted mechanism that guides them in prioritizing either high reachability or privacy by adaptively tuning the probing patterns. Results show that the DQN framework surpasses the Random and Q-Learning baselines. Weights favoring discovery provide higher probing efficiency and reachability, while weights prioritizing privacy ensure limited exposure at the cost of low reachability, eventually attaining a higher objective value.
Summary / 总结
Directional antenna systems are gaining substantial traction for aerial networks due to their higher gain, extended transmission range, and enhanced security.
Mixed-Criticality Flow Scheduling with Low Delay and Limited Bandwidth in TSN
Authors: Wenyan Yan, Sijing Duan, Dongsheng Wei
First: 2026-05-11T02:25:09+00:00 · Latest: 2026-05-11T02:25:09+00:00
Comments: 7 pages
Abstract
Time-Sensitive Networking (TSN) is a promising Ethernet protocol with time determinism, widely used in time-critical systems such as industrial automation, automotive networks, and avionics. By allocating dedicated time windows for time-sensitive flows, TSN enables deterministic transmission; however, as network traffic grows, multiple flows may contend for the same window, causing large delays. Frame aggregation can mitigate this by combining multiple small frames into a larger one, thereby reducing the number of frames and required time windows, but existing approaches typically handle only single-priority traffic and cannot fully utilize pre-allocated time windows. To address this limitation, we propose MCFS-2L, a mixed-criticality flow scheduling scheme with low delay and limited bandwidth usage. MCFS-2L first aggregates critical and non-critical frames with the same source and destination nodes and harmonic periods into a single frame, and then applies a dynamic reassembly and scheduling method that selectively disaggregates non-critical frames from unschedulable aggregated frames. Experimental results show that MCFS-2L increases the acceptance ratio of critical and non-critical flows by up to 4.78% and 8.58%, respectively, while reducing bandwidth utilization by up to 11.88%.
Summary / 总结
Time-Sensitive Networking (TSN) is a promising Ethernet protocol with time determinism, widely used in time-critical systems such as industrial automation, automotive networks, and avionics.
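The aggregation condition stated in the abstract above (same source and destination nodes, harmonic periods) can be sketched as a grouping step. This is not MCFS-2L itself: the flow dictionaries, the field names, the greedy grouping order, and the reading of "harmonic" as "each period divides the next larger one" are assumptions for this example, and the dynamic reassembly stage is not modelled.

```python
from collections import defaultdict


def harmonic(periods):
    """True if, in sorted order, every period divides the next one."""
    ordered = sorted(periods)
    return all(b % a == 0 for a, b in zip(ordered, ordered[1:]))


def aggregation_groups(flows):
    """Greedily group flows sharing a route whose periods stay harmonic."""
    by_route = defaultdict(list)
    for flow in flows:
        by_route[(flow["src"], flow["dst"])].append(flow)

    groups = []
    for route_flows in by_route.values():
        route_groups = []
        for flow in sorted(route_flows, key=lambda f: f["period_us"]):
            for group in route_groups:
                candidate = [m["period_us"] for m in group] + [flow["period_us"]]
                if harmonic(candidate):
                    group.append(flow)
                    break
            else:
                # No compatible group on this route: start a new one.
                route_groups.append([flow])
        groups.extend(route_groups)
    return groups


if __name__ == "__main__":
    flows = [
        {"name": "a", "src": "A", "dst": "B", "period_us": 500},
        {"name": "b", "src": "A", "dst": "B", "period_us": 750},
        {"name": "c", "src": "A", "dst": "B", "period_us": 1000},
        {"name": "d", "src": "C", "dst": "D", "period_us": 500},
    ]
    for group in aggregation_groups(flows):
        print([f["name"] for f in group])
```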
Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live
Authors: Hanchen Li, Runyuan He, Qiuyang Mang, Qizheng Zhang, Huanzhi Mao, Xiaokun Chen, Hangrui Zhou, Alvin Cheung, Joseph Gonzalez, Ion Stoica
First: 2025-11-04T03:43:05+00:00 · Latest: 2026-05-11T02:12:30+00:00
Abstract
KV cache management is essential for efficient LLM inference. To maximize utilization, existing inference engines evict finished requests' KV cache if new requests are waiting. This policy breaks for agentic workloads, which interleave LLM calls with tools, introducing pauses that prevent effective KV reuse across turns. Since many tool calls are much shorter than human response times in multi-turn chatbots, it is promising to retain the KV cache in GPU memory during these tool calls. However, many challenges remain. First, we need to consider both the potential cost of recomputation or reloading (if offloading is enabled) and the increasing queueing delays after eviction from the GPU. Second, due to the internal variance of tool call durations, the method needs to remain robust under limited predictability of tool call durations.
We present Continuum, a serving system that optimizes job completion time for multi-turn agent workloads by introducing a time-to-live (TTL) mechanism for KV cache retention. For requests that generate tool calls, Continuum selectively pins the KV cache in GPU memory with a time-to-live value determined by the reload cost and potential queueing delay induced by eviction. When the TTL expires, the KV cache can be automatically evicted to free up GPU memory, providing robust performance under edge cases. When combined with program-level first-come-first-serve, Continuum preserves multi-turn continuity and reduces delay for agentic workflows. Evaluations on real-world agents (SWE-Bench, BFCL, OpenHand) with Llama-3.1 8B/70B, Gemma-3 12B, and GLM-4.5 355B show that Continuum improves the average job completion times by over 8x while improving throughput.
Summary / 总结
KV cache management is essential for efficient LLM inference.
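The time-to-live pinning described in the abstract above can be sketched with a small bookkeeping class. This is not Continuum's implementation: the class and method names are hypothetical, and setting the TTL to the sum of reload cost and queueing delay is an assumption made for this example, since the abstract only says the TTL is determined by those two quantities.

```python
class TtlKvPins:
    """Toy tracker for KV caches pinned in GPU memory during tool calls."""

    def __init__(self):
        self._expiry = {}  # request id -> absolute expiry time in seconds

    def pin(self, request_id, now, reload_cost_s, queueing_delay_s):
        """Pin a request's KV cache and return the TTL that was assigned."""
        ttl = reload_cost_s + queueing_delay_s  # assumed combination rule
        self._expiry[request_id] = now + ttl
        return ttl

    def is_pinned(self, request_id, now):
        """True while the request's TTL has not yet expired."""
        return self._expiry.get(request_id, float("-inf")) > now

    def evict_expired(self, now):
        """Drop all pins whose TTL has expired and return their ids."""
        expired = [rid for rid, t in self._expiry.items() if t <= now]
        for rid in expired:
            del self._expiry[rid]
        return expired


if __name__ == "__main__":
    pins = TtlKvPins()
    pins.pin("r1", now=0.0, reload_cost_s=0.5, queueing_delay_s=1.5)
    print(pins.is_pinned("r1", now=1.0))
    print(pins.evict_expired(now=2.5))
```

If the tool call returns before the TTL expires, the next turn can reuse the pinned cache; otherwise the cache becomes evictable and GPU memory is freed for other requests.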