arXiv Paper Digest

2026-06-03 05:49
Snapshot: 20260603_0549
RadioMaster: Multi-Agent System for Autonomous Radio Signal Generation
Authors: Jiazhen Lei, Tianze Cao, Yuxin Sha, Sihan Wang, Bingbing Wang, Fengyuan Zhu, Zeming Yang, Xiaohua Tian
First: 2026-06-01T08:13:07+00:00 · Latest: 2026-06-01T08:13:07+00:00
Abstract
Translating user intents into physical radio signals represents the critical yet notoriously tedious final step in wireless prototyping, as it requires intricate knowledge of physical layer details and presents immense implementation challenges. Large Language Models (LLMs) and multi-agent systems have revolutionized conventional software engineering, raising the compelling question of whether they can resolve these formidable difficulties. However, our investigations reveal that current models experience significant limitations and fail to accomplish this task when applied to radio signal generation. This performance degradation primarily stems from severe domain ignorance and a fundamental insensitivity to physical hardware constraints. To bridge this gap, we introduce RadioMaster, a fully autonomous multi-agent framework designed to seamlessly translate user input into real-world wireless emissions. RadioMaster operates on three synergistic pillars: RadioWiki for domain-specific knowledge retrieval, RadioAgent for collaborative I/Q sample generation alongside hardware configuration, and RadioEmulator for closed-loop physical layer verification. Furthermore, we construct RadioBench, the first comprehensive benchmark tailored specifically for the radio signal generation domain. Extensive real-world evaluations demonstrate that RadioMaster significantly outperforms state-of-the-art (SOTA) baselines regarding configuration viability and signal fidelity.
Summary
Translating user intents into physical radio signals represents the critical yet notoriously tedious final step in wireless prototyping, as it requires intricate knowledge of physical layer details and presents immense implementation challenges.
Move the Query, Not the Cache: Characterizing Cross-Instance Latent Attention Redistribution Across GPU Fabrics
Authors: Bole Ma, Jan Eitzinger, Harald Köstler, Gerhard Wellein
First: 2026-05-31T23:53:24+00:00 · Latest: 2026-05-31T23:53:24+00:00
Abstract
Frontier LLMs increasingly decide what a query attends to with a sparse-attention indexer that picks a few KV-cache blocks per query: attention's unit is now a small, reusable chunk. Agentic workloads hammer it: many sub-agents query one large codebase, reusing the same blocks. When that corpus outgrows one GPU it is partitioned across instances, so a query and the blocks it selects often sit on different GPUs: answering it means attention across instances. The reflex of prior cross-instance KV systems is to move the cache: pull the selected blocks to the requester. Multi-head Latent Attention inverts the arithmetic, compressing each token's key and value into one narrow vector, so a routed query row is only ~1 KB, smaller than the chunk it attends; routing the query is then often cheaper than moving the cache. Which primitive wins, over which fabric and request shape, is uncharted, least of all on device-initiated RDMA that makes per-request cross-node transfers cheap. We characterize cross-instance MLA attention on a real multi-node H100 cluster, distilling two reusable artifacts: a topology-aware cost model (probe / transfer / compute / return / merge) and a closed-form route/fetch/local predicate, whose constants we measure on real IBGDA, where the model tracks batched round-trips to within ~7%. At decode it routes the query, trading the cost of moving the cache (a ~3 ms re-adaptation splice for a contiguous chunk, or a scattered gather under selection) for a tens-of-microsecond round trip, and picks the fabric by probe latency, not peak bandwidth. We instantiate the cost model and predicate for MLA, but neither is MLA-specific: they apply wherever compression or sparse selection shrinks attention to small chunks (DeepSeek-V3.2, V4, and GLM-5.1 today). Extending them to a new architecture requires measuring just two coefficients: the routed payload and fetch's move-the-cache cost.
Summary
Frontier LLMs increasingly decide what a query attends to with a sparse-attention indexer that picks a few KV-cache blocks per query: attention's unit is now a small, reusable chunk.
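The abstract above names a five-stage cost model (probe / transfer / compute / return / merge) and a route/fetch/local decision. The sketch below only illustrates that shape; it is not the paper's closed-form predicate. The class `StageCosts`, the function `choose_primitive`, and every number in the usage are hypothetical, loosely echoing the quoted orders of magnitude (a tens-of-microsecond round trip versus a ~3 ms cache move).

```python
from dataclasses import dataclass


@dataclass
class StageCosts:
    """Per-stage cost in microseconds for one candidate primitive (hypothetical)."""
    probe_us: float = 0.0
    transfer_us: float = 0.0
    compute_us: float = 0.0
    return_us: float = 0.0
    merge_us: float = 0.0

    def total_us(self) -> float:
        # Simple additive model: the stages are assumed not to overlap.
        return (self.probe_us + self.transfer_us + self.compute_us
                + self.return_us + self.merge_us)


def choose_primitive(candidates: dict) -> str:
    """Return the name of the candidate with the lowest total cost."""
    return min(candidates, key=lambda name: candidates[name].total_us())


# Illustrative numbers only, not measurements from the paper.
remote_case = {
    "route": StageCosts(probe_us=10, transfer_us=5, compute_us=20,
                        return_us=5, merge_us=2),
    "fetch": StageCosts(transfer_us=3000, compute_us=20),
}
decision = choose_primitive(remote_case)
```

With these made-up inputs the routed query wins because its payload is small; a real deployment would substitute measured coefficients for each fabric.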
A Reproducible UAV-Assisted VANET Dataset Generator for Fragmentation Risk Analysis in Intelligent Transportation Systems
Authors: Bappa Muktar, Justin Moskolaï Ngossaha, Adama Nouboukpo
First: 2026-05-31T23:04:41+00:00 · Latest: 2026-05-31T23:04:41+00:00
Abstract
Vehicular Ad Hoc Networks (VANETs) are a key component of Intelligent Transportation Systems, enabling cooperative communication among vehicles and between vehicles and roadside infrastructure. However, their highly dynamic topology makes them vulnerable to network fragmentation, particularly in highway scenarios, low-density traffic conditions, localized accident zones, and communication-stressed environments. Although Unmanned Aerial Vehicles (UAVs) have been increasingly investigated as temporary aerial relays for improving VANET connectivity, reusable, future-labeled, and reproducible datasets designed to support short-term fragmentation risk analysis remain limited. This paper proposes a reproducible UAV-assisted VANET dataset generator for short-term fragmentation risk prediction. The proposed framework simulates a two-lane highway scenario in which vehicles move in opposite directions while UAVs operate as aerial support nodes. It incorporates multiple data collection profiles, including free-flow traffic, localized accidents, sparse extended topologies, dense bursty traffic, and mixed stress conditions. During each simulation episode, the generator periodically extracts mobility, topology, UAV coverage, and communication-window features, then assigns each sample a future fragmentation label based on the network state observed after a configurable prediction horizon. An illustrative generated dataset is descriptively characterized in terms of scenario balance, UAV policy balance, future-label distribution, scenario-specific label behavior, and representative feature ranges. By providing a modular, extensible, and reproducible ns-3-based data-generation framework, this work offers a practical basis for future supervised learning studies and connectivity management strategies in UAV-assisted VANETs.
Summary
Vehicular Ad Hoc Networks (VANETs) are a key component of Intelligent Transportation Systems, enabling cooperative communication among vehicles and between vehicles and roadside infrastructure.
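The future-labeling step described in the abstract above (each sample is labeled by the network state observed after a prediction horizon) can be sketched as follows. This is a simplified illustration, not the ns-3 generator itself: it assumes vehicles on a one-dimensional road, a fixed communication range, and "fragmented" meaning the connectivity graph has more than one component. The names `is_fragmented` and `future_labels` are hypothetical.

```python
def is_fragmented(positions, comm_range):
    """True if nodes on a line split into more than one connected group.

    On a line with a uniform range, the network is connected exactly when
    every gap between neighboring nodes is at most comm_range.
    """
    if len(positions) < 2:
        return False
    ordered = sorted(positions)
    return any(b - a > comm_range for a, b in zip(ordered, ordered[1:]))


def future_labels(snapshots, comm_range, horizon_steps):
    """Label snapshot i with the fragmentation state at step i + horizon_steps.

    The last horizon_steps snapshots have no future state and get no label.
    """
    labels = []
    for i in range(len(snapshots) - horizon_steps):
        future = snapshots[i + horizon_steps]
        labels.append(int(is_fragmented(future, comm_range)))
    return labels


# Hypothetical positions in meters, one list per sampling instant.
snapshots = [
    [0, 100, 200],
    [0, 100, 350],
    [0, 100, 200],
    [0, 400, 500],
]
labels = future_labels(snapshots, comm_range=150, horizon_steps=1)
```

Features would be extracted from snapshot i while the label comes from snapshot i + horizon, which is what makes the resulting dataset suitable for short-term prediction rather than detection.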
FLUID: Slack-based Low-latency Delivery
Authors: Michael Luby
First: 2026-05-05T16:40:28+00:00 · Latest: 2026-05-31T16:43:18+00:00
Comments: 22 pages, 3 figures, 3 tables, 18 references
Abstract
We introduce FLUID (Fountain LiqUId Delivery), a protocol that uses fountain coding and receiver feedback for low-latency delivery of data blocks over lossy networks. Idealized Automatic Repeat reQuest (ARQ) protocols are bandwidth-optimal, but must deliver every packet in a block and therefore can require additional rounds under packet loss. FLUID uses a controlled amount of slack to relax this all-packets requirement, allowing delivery to finish once enough encoded packets have been received. This yields substantially tighter delivery latency while remaining deterministically close to the ARQ bandwidth optimum. FLUID is controlled by a slack parameter $ε$. Under the Loss-Product Rule, delivery finishes once the product of packet loss fractions across transmission rounds falls below $ε$. Thus, FLUID can finish delivery in a small number of rounds even when every round experiences packet loss, while $ε$ controls the gap between FLUID and bandwidth-optimal ARQ.
Summary
We introduce FLUID (Fountain LiqUId Delivery), a protocol that uses fountain coding and receiver feedback for low-latency delivery of data blocks over lossy networks.
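The Loss-Product Rule stated in the abstract above (delivery finishes once the product of per-round packet loss fractions falls below the slack parameter) can be illustrated with a short sketch. The function name `rounds_to_finish` and the example loss fractions are hypothetical; the rest of the protocol (fountain coding, feedback) is not modeled here.

```python
def rounds_to_finish(loss_fractions, epsilon):
    """Return the 1-based round in which the running product of per-round
    loss fractions first drops below epsilon, or None if it never does."""
    product = 1.0
    for round_index, loss in enumerate(loss_fractions, start=1):
        if not 0.0 <= loss <= 1.0:
            raise ValueError("each loss fraction must lie in [0, 1]")
        product *= loss
        if product < epsilon:
            return round_index
    return None


# Hypothetical example: 20% loss in round 1, 30% loss in round 2.
finish_round = rounds_to_finish([0.2, 0.3], epsilon=0.1)
```

Here the product after two rounds is about 0.06, below the slack of 0.1, so delivery would finish in the second round even though both rounds lost packets.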
A Communication-Centric 6G-LLM Architecture for Scalable Tactical Autonomous Defense Vehicle Networks
Authors: Kiran Khurshid, Shumaila Javaid, Nasir Saeed
Venue: K. Khurshid, S. Javaid and N. Saeed, "A Communication-Centric 6G-LLM Architecture for Scalable Tactical Autonomous Defense Vehicle Networks," in IEEE Network, Early access, 2026
First: 2026-05-31T16:00:14+00:00 · Latest: 2026-05-31T16:00:14+00:00
Comments: 10 pages, accepted in IEEE Network Magazine
Abstract
The integration of Artificial Intelligence (AI) and emerging 6G networks introduces new opportunities for scalable coordination in tactical autonomous vehicle systems. This paper proposes a communication-centric hierarchical architecture for Tactical Autonomous Defense Vehicle Networks (TADVNs) that models the integration of edge-assisted Large Language Model (LLM) reasoning with 6G-enabled connectivity and semantic communication. The framework is designed to improve coordination efficiency, reduce communication overhead, and enhance latency resilience under increasing fleet-scale operation. Unlike conventional task-specific AI pipelines that rely on structured feature processing and rule-based coordination, the proposed approach incorporates semantic abstraction and context-aware decision support within a layered edge-cloud communication architecture. We evaluate communication and coordination performance via Monte Carlo simulations across fleet sizes of 5-30 vehicles under contested network conditions. Results indicate that at a 30-vehicle scale, the 6G-LLM configuration achieves 75.2% latency reduction (29.1 ms vs. 117.5 ms), a 68.7 percentage point increase in mission success rate (82.9% vs. 14.2%), and an 88.6% reduction in communication overhead compared to a 5G-based conventional AI baseline. These findings demonstrate measurable benefits in coordination and communication when semantic reasoning is combined with low-latency 6G connectivity.
Summary
The integration of Artificial Intelligence (AI) and emerging 6G networks introduces new opportunities for scalable coordination in tactical autonomous vehicle systems.
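The headline figures in the abstract above mix a relative reduction (latency) with an absolute percentage-point gain (mission success rate). The short check below recomputes both from the quoted raw values; the helper names are hypothetical.

```python
def percent_reduction(baseline, new):
    """Relative reduction of new versus baseline, in percent."""
    return (baseline - new) / baseline * 100.0


def percentage_point_gain(new_pct, baseline_pct):
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - baseline_pct


# Values quoted in the abstract for the 30-vehicle scale.
latency_cut = round(percent_reduction(117.5, 29.1), 1)
success_gain = round(percentage_point_gain(82.9, 14.2), 1)
```

Both results agree with the abstract: a 75.2% latency reduction and a 68.7 percentage point increase in mission success rate.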
AI-IoT-Robotics Integration: Survey of Frameworks, Emerging Trends, and the Path Toward Connected Robotics
Authors: Ranulfo Bezerra, Satoshi Tadokoro, Kazunori Ohno
Venue: IEEE Internet of Things Journal, vol. 13, no. 10, pp. 20398-20412, 15 May 2026
First: 2026-05-31T05:10:34+00:00 · Latest: 2026-05-31T05:10:34+00:00
Comments: 15 pages, 3 figures, 3 tables. Published in IEEE Internet of Things Journal
Abstract
The convergence of Artificial Intelligence, the Internet of Things, and Robotics is no longer a futuristic vision; it is rapidly becoming the foundation of real-time, intelligent, and context-aware systems. AI enables perception and reasoning, IoT provides scalable sensing and communication, and robotics delivers embodied actuation. Despite significant progress in pairwise combinations such as AIoT and the Internet of Robotic Things (IoRT), there remains a lack of unified design frameworks that fully integrate all three. This survey synthesizes the state-of-the-art across these domains, emphasizing the emerging role of Small Language Models (SLMs) at the edge and Large Language Models (LLMs) in the cloud for distributed cognition and autonomous decision-making. We propose a modular system architecture that aligns with these trends, analyze persistent gaps in interoperability and feedback control, and classify existing work by integration depth. Our review highlights how hybrid SLM-LLM systems, when coupled with IoT infrastructure and robotic agents, can address challenges in real-time adaptation, scalability, and reliability. This work offers a conceptual and technical roadmap for designing next-generation AI-IoT-Robotic ecosystems that are modular, interpretable, and capable of learning within dynamic environments, paving the way for the emerging paradigm of Connected Robotics and Physical AI.
Summary
The convergence of Artificial Intelligence, the Internet of Things, and Robotics is no longer a futuristic vision; it is rapidly becoming the foundation of real-time, intelligent, and context-aware systems.
Make a Video Call with LLM: A Measurement Campaign over Six Mainstream Apps
Authors: Jiayang Xu, Xiangjie Huang, Zijie Li, Antariksh Verma, Zili Meng
First: 2025-10-01T04:03:51+00:00 · Latest: 2026-05-30T15:21:07+00:00
Abstract
In 2025, Large Language Model (LLM) services have launched a new feature -- AI video chat -- allowing users to interact with AI agents via real-time video communication (RTC), just like chatting with real people. Despite its significance, no systematic study has characterized the performance of existing AI video chat systems. To address this gap, this paper proposes a comprehensive benchmark across four dimensions: quality, latency, internal mechanisms, and system overhead. Using custom testbeds, we further evaluate six mainstream AI video chatbots with this benchmark. We also build an online platform for user studies. The measurements lead to interesting findings that could benefit future optimizations. For example, network latency matters less for AI video chat than for human video chat, and the capabilities of the AI agents matter most to the user experience. Our benchmarking results also open up several research questions for future optimizations of AI video chatbots. Availability: https://callarena.net/ for the online evaluation platform and our open-sourced dataset and testbed.
Summary
In 2025, Large Language Model (LLM) services have launched a new feature -- AI video chat -- allowing users to interact with AI agents via real-time video communication (RTC), just like chatting with real people.
AgentxGCore: Agentic AI for Next-Generation Mobile Core Network
Authors: Maria Katarine Santana Barbosa, Kelvin L. Dias
First: 2026-05-29T23:13:46+00:00 · Latest: 2026-05-29T23:13:46+00:00
Comments: This paper has been accepted for publication in IEEE Network
Abstract
To meet the stringent requirements of emerging applications and the increasingly complex network management and operation, the Next Generation Mobile Networks (NextG), or 6G, will adopt an AI-native architecture on the Core Network (CN). In this movement, the Third Generation Partnership Project (3GPP) has extended the cellular CN with new functions as a first step toward integrating analytics, Artificial Intelligence (AI), and machine learning. However, those new functionalities are constrained by a centralized approach and managerial complexity. Furthermore, with the rise of Large Language Models (LLMs), a new era in network orchestration and management begins, leveraging and empowering the Intent-based Networking (IBN) paradigm. In addition, AI agents and Agentic AI integrate Reasoning and Acting (ReAct), enabling the usage of such intents to continuously interact with the network. Unlike state-of-the-art approaches that primarily employ Agentic AI to mitigate deployment and configuration complexity in the CN, this paper introduces AgentxGCore, which leverages an Agentic AI-Native layer to extend the 3GPP architecture and enable a system based on the existing APIs across the Beyond Next Generation Core (xGC) domain. This proposal establishes an AI-driven closed loop for continuous optimization based on real-time information, enabling self-organization and self-adaptation. Our approach involves a specialized multi-agent system, divided into a network planner agent, capable of visualizing the network state and developing a plan to meet the intents, and a network executor, responsible for critiquing and executing the plan. To validate the proposed solution, an environment was built using an open-source CN and heterogeneous datasets, and different LLMs were employed to demonstrate its effectiveness.
Summary
To meet the stringent requirements of emerging applications and the increasingly complex network management and operation, the Next Generation Mobile Networks (NextG), or 6G, will adopt an AI-native architecture on the Core Network (CN).
KISS: Keeping it Simple and Slotted when Learning to Communicate over Wireless
Authors: Kamil Szczech, Maksymilian Wojnar, Krzysztof Rusek, Katarzyna Kosek-Szott, Szymon Szott
First: 2026-05-29T18:56:52+00:00 · Latest: 2026-05-29T18:56:52+00:00
Abstract
A long-standing challenge in distributed wireless systems is ensuring efficient and fair random channel access. Existing solutions often address specific constraints related to timing, periodicity, or centralization, but they typically rely on fixed heuristics. Motivated by recent advances in machine learning (ML), we investigate whether ML agents can autonomously learn efficient and fair access strategies, and whether such learning can offer new insights into medium access control (MAC) design. Rather than proposing a deployable protocol, our aim is to examine whether decentralized learning can rediscover or approximate theoretically efficient random-access mechanisms under minimal assumptions. To this end, we deploy an off-policy Double Deep Q-Network (DDQN) with Bayesian inference to train agents operating over a slotted channel. The resulting method is fully online (no pre-training), fully distributed (independent multi-agent learners), stochastic (non-periodic), and requires no coordination or explicit communication. Extensive simulations show that the learned strategy adapts to varying network conditions and achieves near-theoretical efficiency while maintaining fairness. Ablation studies further reveal that the learned behavior resembles slotted ALOHA with a dynamically adjusted transmission probability, leading us to refer to the method as KISS: Keeping It Simple and Slotted.
Summary
A long-standing challenge in distributed wireless systems is ensuring efficient and fair random channel access.
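The abstract above reports that the learned behavior resembles slotted ALOHA with a dynamically adjusted transmission probability. As background, the sketch below uses the standard textbook throughput expression for slotted ALOHA with n saturated stations each transmitting with probability p; it is not taken from the paper, and the function names are hypothetical.

```python
def slotted_aloha_throughput(n, p):
    """Probability that exactly one of n stations transmits in a slot,
    when each station transmits independently with probability p."""
    return n * p * (1.0 - p) ** (n - 1)


def best_transmission_probability(n):
    """Transmission probability that maximizes the expression above."""
    return 1.0 / n


n_stations = 10
p_star = best_transmission_probability(n_stations)
peak = slotted_aloha_throughput(n_stations, p_star)
```

For ten stations the peak is roughly 0.39 successful transmissions per slot, and it approaches 1/e as the number of stations grows, which is why a good access strategy must lower its transmission probability as contention increases.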
An efficient Progressive Swapping to the Middle distribution protocol adapted to imperfect quantum memories in quantum networks
Authors: Claire Mesny, Fabrice Guillemin, Claire Goursaud
First: 2026-05-29T16:15:01+00:00 · Latest: 2026-05-29T16:15:01+00:00
Comments: Presented at 2026 EuCNC & 6G Summit, 2-5 June, Malaga, Spain
Abstract
The distribution of entangled pairs of photons on the links composing a quantum network, combined with Bell state measurements and teleportation, is the basic apparatus to transfer quantum bits (qubits) over long distances. Entanglement distribution establishes an end-to-end entangled pair while consuming intermediate pairs on links and holding them for a certain time period. The technical literature identifies two main kinds of protocols, parallel and sequential ones, the latter having an advantage in resource consumption over the former. In this paper, we introduce an efficient swapping protocol called Progressive Swapping to the Middle (PSM) as it combines the existing Progressive Swapping (PS) protocol from both extremities of a path that meet in the middle where the received pairs are swapped. We compare PSM with two parallel protocols and PS; in our evaluation, we take into account imperfect memories and fidelity degradation. We demonstrate that PSM yields a much better link probability than PS while keeping a reasonable link fidelity, and shows an advantage in resource consumption over other protocols.
Summary
The distribution of entangled pairs of photons on the links composing a quantum network, combined with Bell state measurements and teleportation, is the basic apparatus to transfer quantum bits (qubits) over long distances.
Entanglement distribution protocols under imperfect fidelity and quantum memory conditions
Authors: Claire Mesny, Fabrice Guillemin, Claire Goursaud
First: 2026-05-29T14:26:09+00:00 · Latest: 2026-05-29T14:26:09+00:00
Comments: Presented at the Second Workshop on Quantum Networked Applications and Protocols (QuNAP 2026), organized in conjunction with IEEE International Conference on Computer Communications, May 18, 2026
Abstract
The rapid development of quantum computers and sensors calls for the development of a quantum Internet capable of transmitting quantum bits over long distances. Photons used for quantum data transfer are fragile over time and sensitive to their environment, so they cannot be used directly over long distances. To remedy this problem, long-distance paths are segmented into shorter links, and entangled pairs of photons are distributed over these links and swapped to create end-to-end entangled pairs over long distances, eventually used for teleportation. In this paper, we extend an existing protocol to take fidelity and imperfect memories into account. We shorten the execution time and thus increase the link success probability, creating the so-called Locally Heralded Distribution (LHD). It turns out that the proposed protocol outperforms some previous protocols. We benchmark through simulation the performance of the protocols considered in this paper, using a blind entanglement protocol as a baseline.
Summary
The rapid development of quantum computers and sensors calls for the development of a quantum Internet capable of transmitting quantum bits over long distances.
MeshGuard: MUD-Based Network Access Control for Large-Scale Thread-Powered IoT Networks
Authors: Dominik Roy George, Wouter van Hoof, Habib Mostafaei, Savio Sciancalepore
Venue: IEEE/IFIP DSN 2026 - 56th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
First: 2026-05-29T14:00:42+00:00 · Latest: 2026-05-29T14:00:42+00:00
Comments: Accepted at IEEE/IFIP DSN 2026 - 56th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
Abstract
The IETF standard Manufacturer Usage Description (MUD) enables manufacturers to equip IoT devices with certified URLs that provide traffic profiles for those devices, helping administrators enforce network access control. However, MUD assumes devices operate on full IP stacks and therefore does not account for constrained IoT devices running Thread--the dominant low-power mesh networking standard--which lacks complete TCP/IP functionality. While prior work proposes extensions to support MUD in Thread environments, these approaches are limited to simple topologies with a single border router and do not scale to realistic deployments with multiple, heterogeneous border routers. We introduce MeshGuard, a framework enabling MUD-based access control in complex Thread networks, with any number of border routers. MeshGuard extends the Mesh Link Establishment (MLE) protocol to deliver MUD information from constrained devices to border routers regardless of network topology. Moreover, MeshGuard leverages Software-Defined Networking (SDN) to synchronize access control lists across all routers. Experiments on our proof-of-concept with real devices (nRF5340, nRF52833, Raspberry-Pi 3) demonstrate enhanced security, minimal overhead, and linear scalability compared to state-of-the-art approaches.
Summary
The IETF standard Manufacturer Usage Description (MUD) enables manufacturers to equip IoT devices with certified URLs that provide traffic profiles for those devices, helping administrators enforce network access control.
Thou Shall Not Pass: Gatekeeping Outbound TLS Connections
Authors: Henrique B. Brum, Matteo Franzil, Riccardo Germenia, Salvatore Manfredi, Domenico Siracusa, Luis A. Dias Knob
First: 2026-05-29T08:53:24+00:00 · Latest: 2026-05-29T08:53:24+00:00
Comments: 13 pages, 10 figures. This manuscript has been submitted to IEEE Transactions on Information Forensics and Security
Abstract
Despite the widespread use of Transport Layer Security (TLS), its security guarantees are frequently compromised by outdated versions and misconfigurations. To analyze this problem, we collected more than 50 million TLS handshakes over a two-week period at our research institution, Fondazione Bruno Kessler, and analyzed three server-selected parameters against the recommendations of four TLS guidelines. Our analysis shows that while the use of insecure or outdated options is minimal, it remains persistent. More importantly, servers are adopting the latest TLS advancements much faster than official guidelines can be updated to provide directives for them. These findings, combined with the difficulty of configuring TLS clients due to their ephemeral, ubiquitous and server-dependent nature, leave users vulnerable to non-standard or outright insecure connections. To address this, we present TLSGatekeeper, a real-time, network-based tool that transparently monitors handshakes, analyzes server parameters, and, based on organizational policy, reports non-compliant connections without requiring client-side modifications. Unlike Next-Generation Firewalls, TLSGatekeeper preserves end-to-end privacy by validating only handshakes, and offers greater flexibility in defining undesired configurations. Our evaluation shows that TLSGatekeeper sustains traffic rates of up to 100 Gbps while preventing insecure connections, with an average added processing delay of 671 ns (TLS 1.3) and 795 ns (TLS 1.2) per handshake packet, making enforcement feasible at scale.
Summary
Despite the widespread use of Transport Layer Security (TLS), its security guarantees are frequently compromised by outdated versions and misconfigurations.
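The abstract above describes checking server-selected handshake parameters against an organizational policy. The sketch below shows that idea in its simplest form and is not TLSGatekeeper's implementation. It assumes the handshake has already been parsed into a dictionary; the function name, field names, and policy layout are all hypothetical, and it covers only two parameters (protocol version and cipher suite), whereas the paper analyzes three that the abstract does not enumerate.

```python
def non_compliant_reasons(server_hello, policy):
    """Return a list of policy violations for one parsed ServerHello.

    server_hello: dict with 'version' and 'cipher_suite' (hypothetical layout).
    policy: dict with 'allowed_versions' and 'forbidden_cipher_suites'.
    """
    reasons = []
    if server_hello["version"] not in policy["allowed_versions"]:
        reasons.append("version " + server_hello["version"] + " not allowed")
    if server_hello["cipher_suite"] in policy["forbidden_cipher_suites"]:
        reasons.append("cipher suite " + server_hello["cipher_suite"] + " forbidden")
    return reasons


# Hypothetical organizational policy.
policy = {
    "allowed_versions": {"TLS 1.2", "TLS 1.3"},
    "forbidden_cipher_suites": {"TLS_RSA_WITH_RC4_128_SHA"},
}

good = {"version": "TLS 1.3", "cipher_suite": "TLS_AES_128_GCM_SHA256"}
bad = {"version": "TLS 1.0", "cipher_suite": "TLS_RSA_WITH_RC4_128_SHA"}

good_reasons = non_compliant_reasons(good, policy)
bad_reasons = non_compliant_reasons(bad, policy)
```

Because only handshake metadata is inspected, a check of this kind never needs to decrypt application data, which matches the privacy argument made in the abstract.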
HetCCL: Enabling Collective Communication For Mixed-Vendor Heterogeneous Clusters
Authors: Yuejie Wang, Tao Chang, Yuanyuan Zhao, Yulong Ao, Zeyu Gu, Zhiyu Li, Yanmin Jia, Yan Zhang, Mingjun Zhang, He Liu, Yongzhe He, Yonghua Lin, Guyue Liu
First: 2026-05-29T08:34:49+00:00 · Latest: 2026-05-29T08:34:49+00:00
Abstract
Training Large Language Models (LLMs) on heterogeneous clusters presents significant challenges for collective communication, as hardware from multiple vendors introduces diverse network and computational characteristics. Existing collective communication frameworks (e.g., NCCL, RCCL) designed for homogeneous environments fail to address mixed-hardware setups, while communication libraries with heterogeneous support (e.g., Gloo, OpenMPI) incur heavy overhead in the data path. This paper presents HetCCL, a framework that enables heterogeneous collective communication through efficient P2P transport across heterogeneous devices (e.g., GPUs), eliminating the host-device memory copy overhead while offloading the control to the CPUs. For combining collectives (e.g., AllReduce, ReduceScatter), HetCCL introduces a border-communicator mechanism that achieves vendor independence by using the intrinsic reduction of the combining collectives in vendor collective communication libraries. Building on efficient heterogeneous P2P transport and a portable reduction mechanism, HetCCL proposes a hierarchical topology abstraction for heterogeneous clusters, dissecting collective communication into cluster-level primitives that guarantee optimal cross-cluster data transfer volume and optimal bandwidth utilization. We implement HetCCL with support for 4 different vendors and evaluate it in 4 heterogeneous settings with benchmarks and end-to-end LLM tasks. Our evaluation shows that HetCCL achieves 17-19x higher bandwidth than Gloo in heterogeneous communications, and speeds up end-to-end training by up to 16.9% in per-step time.
Summary
Training Large Language Models (LLMs) on heterogeneous clusters presents significant challenges for collective communication, as hardware from multiple vendors introduces diverse network and computational characteristics.
A distributed routing protocol for sending data from things to the cloud leveraging fog technology in the large-scale IoT ecosystem
Authors: Mohammad Reza Akbari, Hamid Barati, Ali Barati
First: 2025-10-03T21:40:13+00:00 · Latest: 2026-05-28T21:58:16+00:00
Comments: The authors are withdrawing this manuscript due to technical inaccuracies identified after submission. A revised and corrected version may be submitted in the future
Abstract
Fog computing integrates cloud and edge resources. According to an intelligent and decentralized method, this technology processes data generated by IoT sensors to seamlessly integrate physical and cyber environments. Internet of Things uses wireless and smart objects. They communicate with each other, monitor the environment, collect information, and respond to user requests. These objects have limited energy resources since they use batteries to supply energy. Also, they cannot replace their batteries. As a result, the network lifetime is limited and short. Thus, reducing energy consumption and accelerating the data transmission process are very important challenges in IoT networks to reduce the response time. In the data transmission process, selecting an appropriate cluster head node is very important because it can reduce the delay when sending data to the fog. In this paper, cluster head nodes are selected based on several important criteria such as distance, residual energy, received signal strength, and link expiration time. Then, objects send the processed data to the server hierarchically through a balanced tree. The simulation results show that the proposed method outperforms the energy-efficient centroid-based routing protocol (EECRP) and the Emergency Response IoT based on Global Information Decision (ERGID) in terms of packet delivery rate, delay, response time, and network lifetime.
Summary
Fog computing integrates cloud and edge resources.
AtlasRAN: Timing-Aware Evaluation of Open-source 5G Platforms for Integrated Wireless Testbeds
Authors: Ryan Barker, Tolunay Seyfi, Alireza Ebrahimi Dorcheh, Julia Boone, Fatemeh Afghah, Joseph Boccuzzi
First: 2026-03-15T23:34:49+00:00 · Latest: 2026-05-28T20:53:00+00:00
Comments: 6 pages, 4 figures, 2 tables
Abstract
Open-source 5G and O-RAN experimentation now spans discrete-event simulators, host-OS emulators, SDR hardware-in-the-loop testbeds, O-RU/Open Fronthaul deployments, wireless digital twins, and accelerator-backed RAN runtimes. These environments may expose similar protocol interfaces while preserving very different timing, I/O, synchronization, buffering, transport, and observability behavior. Thus, studies that appear to measure the same network property may instead measure different execution harnesses: functional compatibility is not timing fidelity. This paper presents AtlasRAN, a timing-aware evaluation framework for deciding what an open-source 5G platform can credibly measure. AtlasRAN provides two reference architectures: a CPU-centric path spanning software emulation, SDR/HIL, and O-RU/OFH execution, and an accelerator/twin path spanning offline modeling, code-realistic twins, and real-time AI-RAN runtimes, plus a compact claim-to-capability matrix. We ground the framework in a CU--DU uplink load study comparing OpenAirInterface RFSim with the Sionna Research Kit, which offloads LDPC decoding to CUDA while retaining much of the surrounding OAI host-OS emulation path. As UE concurrency increases, OAI goodput falls from 114.59 Mb/s at one UE to 16.35 Mb/s in the degraded twelve-UE region, while Sionna-RK falls from 103.34 Mb/s to 16.15 Mb/s. Fairness remains near ideal, CPU/GPU utilization falls with load, and the RFSim real-time factor drops below unity, indicating that the accelerated decoder is under-fed by host-OS inter-process communication and timing effects rather than saturated. AtlasRAN therefore argues that integrated wireless testbeds and digital twins should report timing discipline, transport path, memory movement, and observability as first-class experimental variables.
Summary
Open-source 5G and O-RAN experimentation now spans discrete-event simulators, host-OS emulators, SDR hardware-in-the-loop testbeds, O-RU/Open Fronthaul deployments, wireless digital twins, and accelerator-backed RAN runtimes.
TraceCodec: A Compiler-Backed Neural Codec for Stateful Multi-Flow Network Traffic Traces
Authors: Junhui Ding, Xinchen Zhang, Xiaohui Xie, Shinan Liu
First: 2026-05-28T13:52:40+00:00 · Latest: 2026-05-28T13:52:40+00:00
Abstract
Critical networking workflows require high-fidelity packet captures (PCAPs) for testing, security analysis, and protocol validation, not just statistical flow-level summaries. Recent packet generators have demonstrated protocol-constrained PCAP synthesis, but they universally decode directly to raw packet fields. That interface entangles learned behavioral choices with deterministic protocol consequences, which forces packet realization to depend on post-hoc heuristic repair. We identify this decode interface as the fundamental bottleneck and present TraceCodec, a state-aware neural codec for stateful multi-flow traces. TraceCodec lifts each packet into a timed packet action with explicit flow slots and transport cues, then learns a continuous per-packet latent. A deterministic compiler lowers decoded actions back to PCAPs, owning endpoint assignment, TCP state, legality constraints, and packet rendering. The latent layer exposes a generator-facing sequence space, so downstream traffic models can operate on packet-action latents rather than raw header fields. On CICIDS2017 Monday, TraceCodec matches packet count, protocol composition, and flow population to within 0.03%. Raw-field baselines under the same non-repair policy distort flow counts and TCP state by orders of magnitude. Structural diagnostics show that TraceCodec preserves TCP state transitions and multi-flow interleaving that raw-field decoders fragment. This work establishes a new foundation for high-fidelity packet-trace generation.
Summary / 总结
Critical networking workflows require high-fidelity packet captures (PCAPs) for testing, security analysis, and protocol validation, not just statistical flow-level summaries.
A Comprehensive Protocol Stack for Quantum Networks with a Global Entanglement Module
Authors: Xiaojie Fan, C. R. Ramakrishnan, Himanshu Gupta
First: 2025-09-20T21:39:25+00:00 · Latest: 2026-05-28T01:41:34+00:00
Abstract
The development of large-scale quantum networks requires not only advances in physical-layer technologies but also a comprehensive protocol stack that integrates communication, control, and resource management across all layers. We present the first such protocol stack, which introduces a Global Entanglement Module (GEM) that maintains a consistent, network-wide view of entanglement resources through distributed synchronization strategies. By enabling real-time adaptive execution of entanglement distribution plans, GEM bridges the gap between static planning and dynamic operation. The stack naturally supports pre-distributed entanglement, purification, and multi-partite state generation, making it applicable to a broad range of quantum networking applications. We design and evaluate multiple adaptive heuristics for real-time execution and show that a lightweight scoring-based strategy consistently achieves the best performance, improving entanglement generation rates by about 20% over a globally optimal but non-adaptive fixed-tree baseline and achieving more than a two-fold improvement relative to recent connectionless approaches. Across all scenarios, including pre-distribution and fidelity analysis, GEM consistently enables lower latency and robust operation. These results establish a practical pathway toward scalable, adaptive quantum internet systems.
Summary / 总结
The development of large-scale quantum networks requires not only advances in physical-layer technologies but also a comprehensive protocol stack that integrates communication, control, and resource management across all layers.
Dyna-5G: Dynamic Role Switching for Self-Organizing 5G M2M Networks
Authors: Evangelos Bitsikas, Adam Belfki, Aanjhan Ranganathan
First: 2024-06-21T23:11:45+00:00 · Latest: 2026-05-27T19:53:50+00:00
Abstract
M2M deployments such as drone swarms demand mission-critical communication: km-scale range, strong per-device identity and mutual authentication, and deterministic QoS for bandwidth-intensive payloads. Cellular 5G uniquely satisfies all of these, yet it has seen limited adoption in autonomous fleets. The barrier is not capability but resilience: today's 5G networks assume fixed infrastructure, and when the base station fails, recovery is uniquely complex. Unlike simpler wireless protocols where devices can transparently switch nodes, 5G failure requires reconstructing distributed state such as authentication contexts, QoS bindings, tunnels, and RRC state machines across the fleet, a process that no existing system automates. We present Dyna-5G, which makes this happen. Dyna-5G is the first 5G Standalone-compliant framework for dynamic role switching in M2M fleets, where any device can assume the role of 5G Core, RAN, or UE at runtime. It orchestrates failure detection, leader selection, and coordinated state teardown and re-establishment, all without modifying 3GPP protocols. We evaluate Dyna-5G on a high-fidelity software emulation testbed, with Open5GS and srsRAN, across hundreds of trials with up to 10 drones. Control-plane overhead averages 0.47 Mb/s (approximately 0.47% of a 100 Mb/s bearer), while failure recovery completes in about 2.5 s, of which approximately 86% is due to stack-dependent cellular procedures. Dyna-5G's orchestration logic itself adds only about 175 ms per reconfiguring role. All tested missions complete successfully, even under injected leader crashes.
Summary / 总结
M2M deployments such as drone swarms demand mission-critical communication: km-scale range, strong per-device identity and mutual authentication, and deterministic QoS for bandwidth-intensive payloads.
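The figures quoted in the Dyna-5G abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract (0.47 Mb/s overhead, a 100 Mb/s bearer, about 2.5 s recovery, about 86% stack-dependent); the variable names are illustrative, and the "remaining" time is simply the difference, not a figure the authors report.

```python
# Figures quoted in the abstract above; variable names are illustrative.
control_overhead_mbps = 0.47
bearer_mbps = 100.0
recovery_s = 2.5
stack_share = 0.86  # fraction attributed to stack-dependent cellular procedures

# Control-plane overhead as a percentage of the bearer capacity.
overhead_pct = control_overhead_mbps / bearer_mbps * 100

# Split the recovery time into the stack-dependent part and the remainder.
stack_s = recovery_s * stack_share
remaining_s = recovery_s - stack_s

print(f"control-plane overhead: {overhead_pct:.2f}% of the bearer")
print(f"stack-dependent recovery: {stack_s:.2f} s")
print(f"remaining recovery time: {remaining_s:.2f} s")
```

Under these numbers, roughly 2.15 s of the recovery is spent in the cellular stack and about 0.35 s is left for everything else.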
OpenURMA: A Clean-Room Open Implementation of the Unified Bus Protocol
Authors: Bojie Li
First: 2026-05-27T16:38:57+00:00 · Latest: 2026-05-27T16:38:57+00:00
Abstract
Modern datacenter RDMA is bottlenecked at the network interface, not the wire. A NIC running RoCE or InfiniBand holds per-connection state for every (application, remote-endpoint) pair - hundreds of megabytes at 1024-application fanout - and pays a four-traversal PCIe round trip on a 64-byte operation, inflating latency an order of magnitude beyond the wire. Both follow from the Queue Pair over PCIe abstraction RDMA inherits from InfiniBand. Huawei's Unified Bus (UB), a public 2025 specification, changes the abstraction: it decouples per-application endpoint state from per-host transport state so connection context grows additively, exposes ordering as opt-in, and reaches remote memory through native CPU load/store to an on-chip-bus controller. UB ships in Huawei's closed Ascend 950 silicon. OpenURMA is the first clean-room open implementation of UB's transport and transaction layers, realised at three tiers - synthesisable RTL on Alveo U50, a cycle-level two-node SystemC simulator, and a gem5 full-system scaffold - each with a matched OpenRoCE (RoCEv2 RC) baseline. The contribution is the implementation, harness, and controlled comparison closed silicon does not admit. On the canonical 64-byte remote fetch - LOAD on UB-spec Sec.8.3, READ on RoCEv2 RC - UB's load/store path delivers ~500 ns end-to-end, 4.37x below the matched baseline (2186 ns), sustains 2.80x higher throughput, and fits in ~14% of a U50's LUTs.
Summary / 总结
Modern datacenter RDMA is bottlenecked at the network interface, not the wire.
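The scaling argument in the OpenURMA abstract (per-pair connection state versus additively growing context) can be illustrated with a small count. This is a minimal sketch under stated assumptions: it assumes exactly one context per (application, remote-endpoint) pair in the Queue Pair model and one context per application plus one per remote host in the decoupled model. The function names and the choice of 1024 remote endpoints are hypothetical and are not taken from the UB specification or the paper.

```python
def queue_pair_contexts(applications: int, remote_endpoints: int) -> int:
    """Assumed Queue Pair model: one context per (application, remote-endpoint) pair."""
    return applications * remote_endpoints


def decoupled_contexts(applications: int, remote_hosts: int) -> int:
    """Assumed decoupled model: per-application state plus per-host transport state."""
    return applications + remote_hosts


apps = 1024      # fanout mentioned in the abstract
remotes = 1024   # hypothetical number of remote endpoints, for illustration only

qp_total = queue_pair_contexts(apps, remotes)
ub_total = decoupled_contexts(apps, remotes)

print(f"per-pair contexts:  {qp_total}")
print(f"additive contexts:  {ub_total}")
```

With these illustrative inputs the per-pair model needs 1,048,576 contexts while the additive model needs 2,048, which is the multiplicative-versus-additive gap the abstract describes.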
Scaling Multi-agent Systems: A Smart Middleware for Improving Agent Interactions
Authors: Charles Fleming, Guillaume De Saint Marc, Ramana Kompella, Peter Bosch, Vijoy Pandey
First: 2026-04-03T19:58:01+00:00 · Latest: 2026-05-27T15:58:08+00:00
Abstract
As Large Language Model (LLM) based Multi-Agent Systems (MAS) evolve from experimental pilots to complex, persistent ecosystems, the limitations of direct agent-to-agent communication have become increasingly apparent. Current architectures suffer from fragmented context, stochastic hallucinations, rigid security boundaries, and inefficient topology management. This paper introduces Cognitive Fabric Nodes (CFN), a novel middleware layer that creates an omnipresent "Cognitive Fabric" between agents. Unlike traditional message queues or service meshes, CFNs are not merely pass-through mechanisms; they are active, intelligent intermediaries. Central to this architecture is the elevation of Memory from simple storage to an active functional substrate that informs four other critical capabilities: Topology Selection, Semantic Grounding, Security Policy Enforcement, and Prompt Transformation. We propose that each of these functions be governed by learning modules utilizing Reinforcement Learning (RL) and optimization algorithms to improve system performance dynamically. By intercepting, analyzing, and rewriting inter-agent communication, the Cognitive Fabric ensures that individual agents remain lightweight while the ecosystem achieves coherence, safety, and semantic alignment. We evaluate the effectiveness of the CFN on the HotPotQA and MuSiQue datasets in a multi-agent environment and demonstrate that the CFN improves performance by more than 10% on both datasets over direct agent-to-agent communication.
Summary / 总结
As Large Language Model (LLM) based Multi-Agent Systems (MAS) evolve from experimental pilots to complex, persistent ecosystems, the limitations of direct agent-to-agent communication have become increasingly apparent.
Efficient and Quantum-safe Internet Key Exchange Protocols for Satellite Communications
Authors: Davide De Zuane, Marco Baldi, Paolo Santini, Grégoire Anchelergues, Daniele Romano, Alessandro Cammarano, Juan José Grosso
First: 2026-05-27T15:58:04+00:00 · Latest: 2026-05-27T15:58:04+00:00
Comments: 6 pages, accepted for presentation at IEEE LANMAN 2026
Abstract
This paper studies cryptographic key exchange in satellite communications, which requires specific solutions because the satellite context presents unique challenges, particularly concerning onboard resource constraints and long transmission latency. We address these challenges by considering the Internet Key Exchange (IKE) protocol, which is widely used in terrestrial networks, and studying its applicability in the satellite context. This requires addressing two main issues: i) its efficiency in terms of the resources and bandwidth required to adapt to satellite terminals, and ii) its resistance even to attackers equipped with a quantum computer, in order to resist obsolescence and defend against harvest-now-decrypt-later attacks. We study these aspects from both a design and experimental point of view, defining and assessing some protocol variants characterized by low complexity and quantum resistance. To address the need to manage the transition from classic cryptographic primitives to post-quantum ones, we also consider the possibility of using hybrid cryptographic solutions that combine them both.
Summary / 总结
This paper studies cryptographic key exchange in satellite communications, which requires specific solutions because the satellite context presents unique challenges, particularly concerning onboard resource constraints and long transmission latency.
Automated Heuristic Design for Network Operations
Authors: Reza Namvar, José Gallego, Jose A. Ayala-Romero, Livia Elena Chatzieleftheriou, Andres Garcia-Saavedra, Albert Banchs, Marco Fiore
First: 2026-05-27T09:20:51+00:00 · Latest: 2026-05-27T09:20:51+00:00
Abstract
Network operation relies on heuristics to solve many tasks rapidly and efficiently across the protocol stack. These heuristics are the result of thorough human-driven design rooted in expert knowledge of the target system and problem. Recently, approaches powered by Artificial Intelligence have shown promising results in devising solutions that outperform long-established heuristics in classical problems. We explore the possibility of applying such Automated Heuristic Design (AHD) frameworks to network environments by (i) discussing the general integration of AHD with network operation and the associated challenges, as well as (ii) proposing a practical implementation of AHD for a specific networking task, i.e., 5G decoding. Initial results show how modern AHD tools can devise heuristics for Low-Density Parity Check decoding on par with state-of-the-art solutions implemented in production systems.
Summary / 总结
Network operation relies on heuristics to solve many tasks rapidly and efficiently across the protocol stack.
Kernel-Level Per-Slice UPF Latency Measurement in Containerised 5G Core Networks
Authors: Akhil Dev Mishra, Mayank Pandey
First: 2026-05-27T09:08:43+00:00 · Latest: 2026-05-27T09:08:43+00:00
Comments: 4 pages, 3 figures, dataset and code at https://github.com/MP-Akhil-5G/open5gs-slice-measurement
Abstract
The 5G Core User Plane Function is responsible for packet forwarding, GTP-U decapsulation, and quality of service enforcement for every user data session. How the UPF behaves under simultaneous multi-slice workloads remains empirically uncharacterised in the open literature. Specifically, how its forwarding latency responds to load, how well it isolates one slice from another, and what timing budgets remain available for intelligent control are all open questions. This paper presents a measurement study conducted on a containerised Open5GS deployment with three concurrent network slices. We design and implement a namespace-aware TC-BPF instrumentation framework that resolves the fundamental obstacle preventing existing tools from attributing latency observations to individual containerised network functions. We deploy eMBB, URLLC, and mMTC slices with realistic application traffic under light, medium, and heavy load conditions and collect approximately 28 million matched N3 to N6 forwarding delay pairs. The gathered results reveal that eMBB forwarding delay is load-sensitive with the 99th percentile growing from 574 to 1,243 microseconds across load conditions. URLLC delay is load-insensitive, confirming per-UPF process isolation. mMTC exhibits wide-tail TCP behaviour. On this platform, N4 PFCP session modification latency remains consistently below 200 microseconds regardless of data-plane load, suggesting substantial timing headroom within the two-millisecond budget assumed by AI-driven UPF orchestration designs. The instrumentation framework, experiment scripts, and dataset schema are released at https://github.com/MP-Akhil-5G/open5gs-slice-measurement.
Summary / 总结
The 5G Core User Plane Function is responsible for packet forwarding, GTP-U decapsulation, and quality of service enforcement for every user data session.
A Preliminary Assessment of Midhaul Links at 140 GHz using Ray-Tracing
Authors: Sravan Reddy Chintareddy, Marco Mezzavilla, Sundeep Rangan, Morteza Hashemi
First: 2026-05-26T23:42:53+00:00 · Latest: 2026-05-26T23:42:53+00:00
Abstract
The ever-growing demand for mobile data necessitates a transport network architecture that can withstand the 5G-and-beyond multi-Gbps traffic requirements. To cater for such unprecedented demand, studies are being conducted to incorporate terahertz (THz) communications in future mobile networks. In this paper, we consider an urban environment and evaluate the feasibility of THz wireless midhaul links for the transport networks between the Central Units (CU) and Distributed Units (DU) in a disaggregated 5G network architecture with functional splits. Our goal is to study the feasibility of midhaul links at 140 GHz by minimizing the number of required CUs to serve all the DUs. To this end, we define several policies for selecting CU and DU nodes in order to determine the peak data rate that can be supported over each link between a CU and DU. Our numerical results based on ray-tracing suggest that wireless links at 140 GHz with 3GPP option 2 as High Layer Split (HLS) represent a promising technology for midhaul transport networks.
Summary / 总结
The ever-growing demand for mobile data necessitates a transport network architecture that can withstand the 5G-and-beyond multi-Gbps traffic requirements.
RouteProfile: Graph-Based Profiling for Cold-Start LLM Routing
Authors: Jingjun Xu, Hongji Pu, Tao Feng, Haozhen Zhang, Jiaxuan You, Ge Liu
First: 2026-04-30T19:56:08+00:00 · Latest: 2026-05-26T23:15:01+00:00
Abstract
LLM routing is increasingly important for selecting suitable models under diverse user needs and deployment constraints, but its practical effectiveness depends on continual adaptation to emerging queries and newly released models. New-LLM integration is particularly challenging, as newly released models lack the query-response-reward interactions required for router training and cannot be profiled as directly as new queries via semantic embeddings. Existing profiles are limited: LLM-generated descriptions are often coarse, while interaction-based embeddings are costly to construct. To address this problem, we propose RouteProfile, a graph-based profiling framework that constructs LLM profiles from public signals in technical reports or model cards, including model family, model description, reported benchmark scores, and benchmark domains. RouteProfile organizes these heterogeneous signals into a graph and studies profile construction along four dimensions: organizational form, representation type, aggregation depth, and learning configuration. We evaluate RouteProfile in training-free cold-start routing and new-LLM integration settings. Experiments show that: (1) structured profiles outperform flat baselines in training-free cold-start routing; (2) model family metadata is more reliable than benchmark domain information; and (3) effective new-LLM integration requires profile-router co-design. Overall, our findings highlight the importance of profile design for enabling routing systems to adapt to the evolving model ecosystem.
Summary / 总结
LLM routing is increasingly important for selecting suitable models under diverse user needs and deployment constraints, but its practical effectiveness depends on continual adaptation to emerging queries and newly released models.
Characterizing the Configuration of Starlink Queuing
Authors: Johan Garcia, Simon Sundberg, Anna Brunstrom
First: 2026-05-26T21:43:56+00:00 · Latest: 2026-05-26T21:43:56+00:00
Comments: This is an author-supplied definitive version of a paper accepted at IMC'26 cycle 1 on 2nd Feb 2026. First submitted to IMC'25 on 15th May 2025 and accepted with one-shot-revision on 15th Aug 2025. Please cite with ACM reference format once the ACM DOI is active. 10 pages, 7 figures
Abstract
In all networking systems, queuing is important to ensure appropriate resource utilization in the presence of bursty traffic and varying traffic demands. The Starlink access network is also dynamic in terms of the capacity it can provide, and thus queuing plays an even greater role in ensuring appropriate communication performance for end-users while maintaining high resource utilization. However, for Starlink most system design details, along with the setup of the internal queuing, are private and not publicly available. To address this, we have developed a high-precision, burst-pattern-controlled traffic generation approach that allows us to precisely measure the one-way delay for Starlink. By analyzing the delay and loss in conjunction with a queue simulator, we find that Starlink does not employ per-flow fair queuing or drop-tail buffers, but it does use drop-front buffer management. While drop-front reduces delay, it may also interfere with the assumptions made by loss-based congestion controls, potentially contributing to throughput degradation.
Summary / 总结
In all networking systems, queuing is important to ensure appropriate resource utilization in the presence of bursty traffic and varying traffic demands.
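The difference between the two buffer-management policies named in the Starlink abstract can be shown with a tiny bounded FIFO. This is a minimal illustrative sketch, not the authors' queue simulator: the function names, the capacity of 3, and the six-packet burst are assumptions chosen only to make the behaviour visible.

```python
from collections import deque


def enqueue(queue, packet, capacity, policy):
    """Add a packet to a bounded FIFO; return the dropped packet, or None."""
    if len(queue) < capacity:
        queue.append(packet)
        return None
    if policy == "drop-tail":
        # Buffer full: the arriving packet is discarded.
        return packet
    if policy == "drop-front":
        # Buffer full: the oldest queued packet is discarded to make room.
        dropped = queue.popleft()
        queue.append(packet)
        return dropped
    raise ValueError(f"unknown policy: {policy}")


def run(policy, arrivals=6, capacity=3):
    """Feed a burst of sequence numbers with no departures and record drops."""
    queue, drops = deque(), []
    for seq in range(arrivals):
        dropped = enqueue(queue, seq, capacity, policy)
        if dropped is not None:
            drops.append(dropped)
    return list(queue), drops


tail_queue, tail_drops = run("drop-tail")
front_queue, front_drops = run("drop-front")
print("drop-tail :", tail_queue, "dropped", tail_drops)
print("drop-front:", front_queue, "dropped", front_drops)
```

In this burst, drop-tail keeps the oldest packets (0, 1, 2) and drops the newest, while drop-front keeps the newest (3, 4, 5) and drops the oldest. Keeping fresher packets is what lowers queuing delay, and losing the head of the queue is the pattern that can surprise loss-based congestion control.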
GENESIS: Harnessing AI Agents for Autonomous 6G RAN Synthesis, Research, and Testing
Authors: Tamerlan Aghayev, Maxime Elkael, Michele Polese, Minh Dat Nguyen, Gabriele Gemmi, Andrea Lacava, Ali Saeizadeh, Reshma Prasad, Paolo Testolina, Angelo Feraudo, Soumendra Nanda, Pedram Johari, Salvatore D'Oro, Tommaso Melodia
First: 2026-05-26T17:58:43+00:00 · Latest: 2026-05-26T17:58:43+00:00
Comments: 18 pages, 16 figures
Abstract
Cellular research and development (R&D) is throttled by six structural processes that each consume months of manual engineering work per iteration: (i) synthesizing new features from standards or research papers into production code; (ii) conformance and interoperability testing; (iii) hardening against field anomalies and diverse deployment environments; (iv) data-driven optimization of network functionalities; (v) discovering and prototyping novel waveforms, functionalities, and capabilities for future standards; and (vi) securing the stack against vulnerabilities. Although Large Language Models (LLMs) have compressed comparable R&D work in general software engineering from days to minutes, their known pitfalls worsen on Radio Access Network (RAN) use cases: they hallucinate Application Programming Interfaces (APIs) and mis-read specifications, which kills interoperability of RAN components at the first mistake, and they heavily rely on simulations for designing algorithms, which is notorious for breaking when transferred to real hardware. To address these challenges, we present GENESIS, an agentic Artificial Intelligence (AI) framework that converts intents (e.g., a specification clause, a telemetry anomaly, or a research hypothesis) into solutions validated with over-the-air experiments, fed back into a persistent knowledge base. GENESIS is built on three composable primitives (agents, skills, hooks) and a knowledge layer (SYNAPSE) that doubles as the source of ground truth and the recipient of every artifact the framework produces, making capabilities compound across runs.
Summary / 总结
Cellular research and development (R&D) is throttled by six structural processes that each consume months of manual engineering work per iteration: (i) synthesizing new features from standards or research papers into production code; (ii) conformance and interoperability testing; (iii) hardening against field anomalies and diverse deployment environments; (iv) data-driven optimization of network functionalities; (v) discovering and prototyping novel waveforms, functionalities, and capabilities for future standards; and (vi) securing the stack against vulnerabilities.
Latency in Real-Time 3D Volumetric Streaming: A Comprehensive Study
Authors: Seungwoo Hong, Hosun Yoon, Seong Moon, Inayat Ali
First: 2026-05-21T08:05:45+00:00 · Latest: 2026-05-26T02:39:59+00:00
Comments: 6 pages, 11 figures
Abstract
Real-time 3D volumetric streaming is a transformative technology that enables the seamless transmission and rendering of high-fidelity 3D models, enhancing applications in virtual reality (VR), augmented reality (AR), gaming, telepresence, and remote collaboration. However, latency remains a major challenge, affecting immersion, causing motion sickness, and disrupting real-time interactions. Addressing these latency issues is essential for improving user experience and ensuring system efficiency. This study conducts a comprehensive latency measurement and analysis within a real-time volumetric streaming environment. We systematically break down the streaming process into three key layers: the application layer, the transport protocol layer, and the network layer. By evaluating each layer in a real-world system, we identify latency bottlenecks, quantify their impact, and uncover the underlying causes of delay. Based on these findings, we propose targeted optimization strategies to mitigate latency and enhance system responsiveness. Through this research, we establish best practices and innovative solutions to improve the efficiency, scalability, and overall user experience of real-time 3D volumetric streaming. Our insights contribute to advancing the field, paving the way for more immersive and responsive digital environments.
Summary / 总结
Real-time 3D volumetric streaming is a transformative technology that enables the seamless transmission and rendering of high-fidelity 3D models, enhancing applications in virtual reality (VR), augmented reality (AR), gaming, telepresence, and remote collaboration.
Beyond Traffic Matrix: DELTA -- A DAG-Aware OCS Logical Topology Optimization for AIDCs
Authors: Niangen Ye, Jingya Liu, Guofu Zhu, Weiqiang Sun, Weisheng Hu
First: 2026-03-30T06:54:48+00:00 · Latest: 2026-05-26T00:48:38+00:00
Abstract
The rapid scaling of large language models (LLMs) exacerbates communication bottlenecks in AI data centers (AIDCs). To overcome this, optical circuit switches (OCS) are increasingly adopted for their superior bandwidth capacity and energy efficiency. However, their reconfiguration overhead precludes intra-iteration topology update, necessitating a priori engineering of a static topology to absorb time-varying LLM traffic. Existing methods engineer these topologies based on traffic matrices. However, this representation obscures the bursty concurrent bandwidth demands dictated by parallelization strategies and fails to account for the independent channels required for concurrent communication. To address this, we propose DELTA, an efficient logical topology optimization framework for AIDCs that leverages the computation-communication directed acyclic graph (DAG) to encode time-varying traffic patterns into a Mixed-Integer Linear Programming (MILP) model, while exploiting the temporal slack of non-critical tasks to save optical ports without penalizing iteration makespan. By pioneering a variable-length time interval formulation, DELTA significantly reduces the solution space compared to the fixed-time-step formulation. To scale to thousand-GPU clusters, we design a dual-track acceleration strategy that combines search space pruning (reducing complexity from quadratic to linear) with heuristic hot-starting. Evaluations on large-scale LLM workloads show that DELTA reduces communication time by up to 17.5% compared to state-of-the-art traffic-matrix-based baselines. Furthermore, the framework reduces optical port consumption by at least 20%; dynamically reallocating these surplus ports to bandwidth-bottlenecked workloads reduces their performance gap relative to ideal non-blocking electrical networks by up to 26.1%, ultimately enabling most workloads to achieve near-ideal performance.
Summary / 总结
The rapid scaling of large language models (LLMs) exacerbates communication bottlenecks in AI data centers (AIDCs).
Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live
Authors: Hanchen Li, Runyuan He, Qiuyang Mang, Qizheng Zhang, Huanzhi Mao, Xiaokun Chen, Hangrui Zhou, Alvin Cheung, Joseph Gonzalez, Ion Stoica
First: 2025-11-04T03:43:05+00:00 · Latest: 2026-05-25T23:34:23+00:00
Abstract
KV cache management is essential for efficient LLM inference. To maximize utilization, existing inference engines evict finished requests' KV cache if new requests are waiting. This policy breaks for agentic workloads, which interleave LLM calls with tools, introducing pauses that prevent effective KV reuse across turns. Since many tool calls are much shorter than human response times in multi-turn chatbots, it is promising to retain the KV cache in GPU memory during these tool calls. However, many challenges remain. First, we need to consider both the potential cost of recomputation or reloading (if offloading is enabled) and the increasing queueing delays after eviction from the GPU. Second, due to the internal variance of tool call durations, the method needs to remain robust under limited predictability of tool call durations. We present Continuum, a serving system that optimizes job completion time for multi-turn agent workloads by introducing a time-to-live (TTL) mechanism for KV cache retention. For requests that generate tool calls, Continuum selectively pins the KV cache in GPU memory with a time-to-live value determined by the reload cost and potential queueing delay induced by eviction. When the TTL expires, the KV cache can be automatically evicted to free up GPU memory, providing robust performance under edge cases. When combined with program-level first-come-first-serve, Continuum preserves multi-turn continuity and reduces delay for agentic workflows. Evaluations on real-world agents (SWE-Bench, BFCL, OpenHand) with Llama-3.1 8B/70B, Gemma-3 12B, and GLM-4.5 355B show that Continuum improves average job completion time by over 8x while improving throughput.
Summary / 总结
KV cache management is essential for efficient LLM inference.
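The TTL-based pinning idea in the Continuum abstract can be sketched as a small bookkeeping class. This is a minimal sketch under stated assumptions, not Continuum's implementation: the abstract says only that the TTL is determined by the reload cost and the queueing delay induced by eviction, so the additive rule in `ttl_seconds`, the class `KVCachePins`, and all parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PinnedEntry:
    request_id: str
    expires_at: float  # time after which the pinned KV cache may be evicted


def ttl_seconds(reload_cost_s: float, queueing_delay_s: float) -> float:
    # Hypothetical rule: hold the cache for as long as eviction would cost.
    return reload_cost_s + queueing_delay_s


class KVCachePins:
    """Tracks which requests keep their KV cache pinned during a tool call."""

    def __init__(self):
        self._pins = {}

    def pin(self, request_id, now, reload_cost_s, queueing_delay_s):
        ttl = ttl_seconds(reload_cost_s, queueing_delay_s)
        self._pins[request_id] = PinnedEntry(request_id, now + ttl)

    def is_pinned(self, request_id, now):
        entry = self._pins.get(request_id)
        return entry is not None and now < entry.expires_at

    def evict_expired(self, now):
        """Release every pin whose TTL has elapsed; return the released ids."""
        expired = [rid for rid, e in self._pins.items() if e.expires_at <= now]
        for rid in expired:
            del self._pins[rid]
        return expired


pins = KVCachePins()
pins.pin("req-a", now=0.0, reload_cost_s=0.5, queueing_delay_s=1.5)
still_pinned = pins.is_pinned("req-a", now=1.0)   # tool call returns in time
released = pins.evict_expired(now=2.0)            # TTL of 2.0 s has elapsed
pinned_after = pins.is_pinned("req-a", now=2.0)
```

A tool call that returns before the TTL finds its cache still resident, while a slow one lets the entry expire so the memory can be reclaimed. That bounded hold time is what keeps the policy robust when tool-call durations are hard to predict.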
Reexamining Paradigms of End-to-End Data Movement
Authors: Chin Fang, Timothy Stitt, Michael J. McManus, Toshio Moriya
First: 2025-12-17T02:38:06+00:00 · Latest: 2026-05-25T20:03:47+00:00
Comments: 33 pages and 15 figures
Abstract
The pursuit of high-performance data transfer often focuses on raw network bandwidth. International links of 100 Gbps or higher are frequently considered the primary enabler. While necessary, this network-centric view is incomplete. It equates provisioned link speeds with practical, sustainable data movement capabilities. It is a common observation that lower-than-desired data rates manifest even on 10 Gbps links, with higher-speed networks only amplifying their visibility. We investigate six paradigms -- from network latency and TCP congestion control to host-side factors such as CPU performance and virtualization -- that critically impact data movement workflows. These paradigms represent widely accepted engineering assumptions that inform system design, procurement decisions, and operational practices in production data movement environments. We introduce the Drainage Basin Pattern conceptual model for reasoning about end-to-end data flow constraints across heterogeneous hardware and software components at varying desired data rates to address the fidelity gap between raw bandwidth and application-level throughput. Our findings are validated through rigorous production-scale deployments, from 10 Gbps links to U.S. DOE ESnet technical evaluations and transcontinental production trials over 100 Gbps operational links. The results demonstrate that principal bottlenecks often reside outside the network core, and that a holistic hardware-software co-design enables consistent, predictable performance for demanding data transports (bulk and streaming). The key goal is to transform a demanding data transfer from a struggle with unknown outcomes into a predictable, guaranteed line-rate, routine operation that anyone can do. Another goal is to rectify the general misconception that conflates complexity with expertise.
Summary / 总结
The pursuit of high-performance data transfer often focuses on raw network bandwidth.
Intelligent Detection and Mitigation of Carpet-Bombing DDoS Attacks in SDN Using Retrieval-Augmented Generation and Large Language Models
Authors: Mohammed N. Swileh, Shengli Zhang, Kai Lei
First: 2026-05-25T19:58:45+00:00 · Latest: 2026-05-25T19:58:45+00:00
Abstract
Software-Defined Networking (SDN) provides flexible and programmable network management; however, its centralized control architecture remains highly vulnerable to Distributed Denial-of-Service (DDoS) attacks, particularly Carpet-Bombing DDoS attacks that distribute malicious traffic across multiple targets to evade conventional detection mechanisms. In this paper, a Retrieval-Augmented Generation (RAG)-based framework is proposed for real-time detection and mitigation of Carpet-Bombing DDoS attacks in SDN environments. The proposed framework combines interface-level traffic feature representation, semantic embedding generation, FAISS-based similarity retrieval, and Large Language Model (LLM)-driven contextual inference to classify traffic behavior without requiring conventional supervised model training or retraining. To evaluate the effectiveness of the proposed framework, extensive experiments were conducted under multiple Carpet-Bombing DDoS attack scenarios with different attack intensities. In addition, two traffic representation strategies, namely structured JSON-based representation and natural language-based representation (NLR), were investigated using multiple state-of-the-art LLMs. The experimental results demonstrate that the proposed framework achieved highly accurate and stable attack detection performance, while the framework configuration utilizing the Gemma-4-31B-IT model achieved the strongest overall detection results. Furthermore, real-time experiments confirmed the capability of the proposed framework to rapidly detect and mitigate Carpet-Bombing DDoS attacks while maintaining stable SDN network operation. The obtained results highlight the effectiveness of integrating RAG mechanisms with LLMs for intelligent and adaptive SDN security analysis.
Summary / 总结
Software-Defined Networking (SDN) provides flexible and programmable network management; however, its centralized control architecture remains highly vulnerable to Distributed Denial-of-Service (DDoS) attacks, particularly Carpet-Bombing DDoS attacks that distribute malicious traffic across multiple targets to evade conventional detection mechanisms.
Neural Router: Semantic Content Matching for Agentic AI
Authors: Lauri Lovén, Abhishek Kumar, Alexander Engelhardt, Alaa Saleh, Roberto Morabito, Xiaoli Liu, Naser Hossein Motlagh, Sasu Tarkoma
First: 2026-05-25T10:58:53+00:00 · Latest: 2026-05-25T10:58:53+00:00
Comments: 35 pages, 12 figures. Combined main paper and electronic supplement, folded into one document for arXiv
Abstract
Large language models (LLMs) can serve as the semantic-matching engine of a content-based publish/subscribe broker for agentic AI across the edge-cloud computing continuum, bridging the vocabulary and modality gaps that defeat keyword and embedding filters. Framed as offline multi-label retrieval over three public datasets spanning social-media, legal, and smart-home sensor domains (six LLMs, seven baselines), our central contribution is a two-crossover cost-accuracy characterisation: an analytical context-window crossover below which a CoverAndMerge compression pipeline reduces LLM invocations, and an empirical discrimination-capacity crossover above which matching accuracy collapses independently of context budget, by a model-dependent factor of parameter count and training generation. Two findings carry practical weight: above the discrimination crossover, compression cannot recover accuracy and only frontier-scale models clear large subscription sets; and there backend choice dominates configuration choice, so model selection, not pipeline tuning, is the primary operator lever. We accompany this with three composable algorithms and a per-cluster Quality-of-Experience framework for autonomic LLM-tier selection.
Summary / 总结
Large language models (LLMs) can serve as the semantic-matching engine of a content-based publish/subscribe broker for agentic AI across the edge-cloud computing continuum, bridging the vocabulary and modality gaps that defeat keyword and embedding filters.
Communication-Efficient Hybrid Language Model via Uncertainty-Aware Opportunistic and Compressed Transmission
Authors: Seungeun Oh, Jinhyuk Kim, Jihong Park, Seung-Woo Ko, Jinho Choi, Tony Q. S. Quek, Seong-Lyun Kim
First: 2025-05-17T02:10:34+00:00 · Latest: 2026-05-25T03:16:45+00:00
Comments: 17 pages, 13 figures, 5 tables; This article has been accepted for publication in IEEE Transactions on Communications. This is the author's accepted version; the final published version will be available via IEEE Xplore
Abstract
To support emerging language-based applications using dispersed and heterogeneous computing resources, the hybrid language model (HLM) offers a promising architecture, where an on-device small language model (SLM) generates draft tokens that are validated and corrected by a remote large language model (LLM). However, the original HLM suffers from substantial communication overhead, as the LLM requires the SLM to upload the full vocabulary distribution for each token. Moreover, both communication and computation resources are wasted when the LLM validates tokens that are highly likely to be accepted. To overcome these limitations, we propose communication-efficient and uncertainty-aware HLM (CU-HLM). In CU-HLM, the SLM transmits truncated vocabulary distributions only when its output uncertainty is high. We validate the feasibility of this opportunistic transmission by discovering a strong correlation between SLM's uncertainty and LLM's rejection probability. Furthermore, we theoretically derive optimal uncertainty thresholds and optimal vocabulary truncation strategies. Simulation results show that, compared to standard HLM, CU-HLM achieves up to 206$\times$ higher token throughput by skipping 74.8% transmissions with 97.4% vocabulary compression, while maintaining 97.4% accuracy.
Summary / 总结
To support emerging language-based applications using dispersed and heterogeneous computing resources, the hybrid language model (HLM) offers a promising architecture, where an on-device small language model (SLM) generates draft tokens that are validated and corrected by a remote large language model (LLM).
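The opportunistic, compressed upload described in the abstract above can be illustrated with a minimal sketch. It assumes Shannon entropy as the uncertainty measure and top-k truncation as the compression step; the abstract does not state which measure or truncation rule CU-HLM uses, and the names `maybe_transmit`, `threshold`, and `k` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def truncate_top_k(probs, k):
    """Keep the k most likely token ids and renormalise their probabilities."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return {i: probs[i] / mass for i in top}


def maybe_transmit(probs, threshold, k):
    """Return a truncated distribution to upload, or None to skip the upload.

    Assumption: the SLM uploads only when its entropy exceeds `threshold`.
    """
    if entropy(probs) <= threshold:
        return None  # SLM is confident: skip transmission and LLM validation
    return truncate_top_k(probs, k)


confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.40, 0.30, 0.20, 0.10]
skipped = maybe_transmit(confident, threshold=0.5, k=2)
uploaded = maybe_transmit(uncertain, threshold=0.5, k=2)
```

Here the confident draft is kept locally, while the uncertain one is uploaded as a two-entry distribution instead of the full vocabulary.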
TIP: A Decentralized Intent-Based Protocol for Declarative IoT Interoperability and Sandboxed Schema Adaptation
Authors: Yeison David Mejia Mosquera
First: 2026-05-25T01:28:12+00:00 · Latest: 2026-05-25T01:28:12+00:00
Comments: 12 pages, 3 figures
Abstract
Heterogeneous Internet of Things (IoT) systems suffer from fragmentation across hardware architectures, networking stacks, and data serialization formats. Existing standards (such as MQTT, CoAP, and DDS) rely on address-bound, imperative routing models that require hardcoded configurations and leave no flexibility for runtime schema translation. This paper presents TIP (The Intent Protocol), a decentralized, declarative network protocol. Instead of addressing specific physical endpoints, nodes submit abstract intents specifying desired capabilities, schemas, and Quality of Service (QoS) constraints. The TIP Engine resolves matching nodes using a hybrid discovery mechanism combining local multicast DNS (mDNS) with Kademlia Distributed Hash Tables (DHT). Selection is optimized via a multi-criteria scoring algorithm incorporating network latency, historical reputation, and contract compliance. Mismatched data representations are reconciled on-the-fly inside isolated WebAssembly (WASM) sandboxes compiled dynamically from TOML specifications. Security is enforced through Ed25519 signatures, X25519 key exchanges, and ChaCha20-Poly1305 payload encryption. Evaluation of our reference implementation in Rust and C++ shows sub-millisecond translation overhead and robust resilience under industrial conditions.
Summary / 总结
Heterogeneous Internet of Things (IoT) systems suffer from fragmentation across hardware architectures, networking stacks, and data serialization formats.
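A multi-criteria selection of the kind the abstract above mentions (latency, reputation, contract compliance) could look like the following weighted-sum sketch. The abstract does not give the actual scoring formula, so the weights, the latency normalisation, and the names `Candidate`, `score`, and `select` are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    node_id: str
    latency_ms: float   # measured network latency to the node
    reputation: float   # historical reputation, assumed to lie in [0, 1]
    compliance: float   # contract compliance, assumed to lie in [0, 1]


def score(c, max_latency_ms=100.0, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the three criteria; higher is better.

    Assumption: latency is mapped linearly to [0, 1], with 0 ms scoring 1
    and anything at or above max_latency_ms scoring 0.
    """
    latency_term = max(0.0, 1.0 - c.latency_ms / max_latency_ms)
    w_lat, w_rep, w_comp = weights
    return w_lat * latency_term + w_rep * c.reputation + w_comp * c.compliance


def select(candidates):
    """Pick the candidate with the highest score."""
    return max(candidates, key=score)


near = Candidate("near", latency_ms=10.0, reputation=0.5, compliance=0.5)
far = Candidate("far", latency_ms=80.0, reputation=1.0, compliance=1.0)
best = select([near, far])
```

With these example weights the nearby node wins despite its lower reputation; shifting weight toward reputation or compliance would reverse the choice.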
Device Context Protocol: A Compact, Safety-First Architecture for LLM-Driven Control of Constrained Devices
Authors: Dongxu Yang
First: 2026-05-24T12:37:19+00:00 · Latest: 2026-05-24T12:37:19+00:00
Comments: 15 pages, 5 figures. Reference implementation, Python package (pip install pydcp), and reproduction scripts at https://github.com/device-context-protocol/dcp
Abstract
Large language models are increasingly used as orchestrators of external tools via the Model Context Protocol (MCP), but MCP is built for software services with megabytes of memory and does not descend to the microcontrollers that dominate the long tail of physical devices. Recent work (IoT-MCP) ports MCP to edge gateways at 74 KB peak memory; this still excludes the smallest commodity MCUs and, critically, does not address the safety problem of giving an unreliable caller (an LLM that may hallucinate or be prompt-injected) direct control of physical hardware. We present the Device Context Protocol (DCP): a sub-50-byte typical frame (6-byte header + CBOR payload + optional 16-byte HMAC), a manifest schema in which capability scoping, range and type checks, dry-run evaluation, and units-as-types are protocol-layer primitives, and a host-side Bridge that rejects malformed or hallucinated calls before any byte reaches the device. Reference firmware measures 27.6 KB flash / 0.6 KB RAM on ESP32; the Python Bridge, ESP32 firmware, and a language-neutral conformance suite are MIT-licensed and public. An empirical study -- 675 tool calls produced by five LLMs across four vendors (DeepSeek, Alibaba, Zhipu, MiniMax) against six categories of adversarial prompts, with the injection category instantiating AgentDojo's attack templates -- shows DCP rejects 100% of capability-escalation attempts and 78% of prompt-injection attempts, versus 0--1% for Raw MCP and IoT-MCP, matching the expressiveness of a well-formed OpenAPI 3 schema at three orders of magnitude less firmware footprint. We position DCP as the missing layer between MCP (which is moving toward enterprise SaaS connectivity) and the physical devices it does not reach.
Summary / 总结
Large language models are increasingly used as orchestrators of external tools via the Model Context Protocol (MCP), but MCP is built for software services with megabytes of memory and does not descend to the microcontrollers that dominate the long tail of physical devices.
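The frame shape given in the abstract above (6-byte header, CBOR payload, optional 16-byte HMAC) can be sketched with the Python standard library. The header field layout, the flag bit, and the use of HMAC-SHA256 truncated to 16 bytes are assumptions made for illustration; the abstract specifies only the sizes, and this is not the `pydcp` API.

```python
import hashlib
import hmac
import struct

# Hypothetical 6-byte header: version, flags, capability id, payload length.
HEADER = struct.Struct(">BBHH")
FLAG_HMAC = 0x01   # assumed flag bit marking an authenticated frame
TAG_LEN = 16       # tag length stated in the abstract


def build_frame(capability_id, payload, key=None):
    """Pack header + payload, appending a truncated HMAC tag if a key is given."""
    flags = FLAG_HMAC if key is not None else 0
    frame = HEADER.pack(1, flags, capability_id, len(payload)) + payload
    if key is not None:
        frame += hmac.new(key, frame, hashlib.sha256).digest()[:TAG_LEN]
    return frame


def parse_frame(frame, key=None):
    """Validate a frame and return (capability_id, payload)."""
    _version, flags, capability_id, length = HEADER.unpack_from(frame)
    body_end = HEADER.size + length
    payload = frame[HEADER.size:body_end]
    if len(payload) != length:
        raise ValueError("truncated payload")
    if flags & FLAG_HMAC:
        if key is None:
            raise ValueError("frame is authenticated but no key was supplied")
        tag = frame[body_end:body_end + TAG_LEN]
        expected = hmac.new(key, frame[:body_end], hashlib.sha256).digest()[:TAG_LEN]
        if not hmac.compare_digest(tag, expected):
            raise ValueError("HMAC mismatch")
    return capability_id, payload


key = b"k" * 16
payload = bytes.fromhex("a10115")  # CBOR encoding of the map {1: 21}
frame = build_frame(7, payload, key=key)
```

This example frame is 25 bytes long (6 + 3 + 16), which is consistent with the sub-50-byte typical frame the abstract reports.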
Clustering as Reasoning: A $k$-Means Interpretation of Chain-of-Thought Graph Learning
Authors: Xuanting Xie, Zhaochen Guo, Bingheng Li, Xingtong Yu, Zhifei Liao, Zhao Kang, Yuan Fang
Venue: ICML 2026
First: 2026-05-24T04:58:44+00:00 · Latest: 2026-05-24T04:58:44+00:00
Comments: Accepted by ICML 2026
Abstract
Chain-of-Thought (CoT) prompting has shown promise in enhancing the reasoning capabilities of large language models (LLMs) on text-attributed graphs (TAGs). This work reframes CoT-based graph learning through the principle of clustering as reasoning, offering a $k$-means interpretation of how iterative reasoning operates over graph-structured data. We observe that existing graph CoT methods rely on disjoint architectures and fixed graph representations, limiting step-by-step semantic-topological interaction and interpretability. To overcome this limitation, we propose a unified framework named KCoT that integrates CoT reasoning with graph representation learning. Our key theoretical result reveals a formal mathematical correspondence between a Transformer block and the $k$-means algorithm, allowing reasoning to be interpreted as iterative assignment and update steps. Based on this insight, we introduce a Semantic Discriminating Prompt that explicitly formulates these steps as structured CoT reasoning, together with a structure-grounded alignment strategy to fuse topological priors with evolving thought-conditioned representations. Experiments on standard benchmarks demonstrate consistent improvements over state-of-the-art methods, validating clustering as a principled mechanism for CoT-based graph learning.
Summary / 总结
Chain-of-Thought (CoT) prompting has shown promise in enhancing the reasoning capabilities of large language models (LLMs) on text-attributed graphs (TAGs).
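For readers unfamiliar with the iterative assignment and update steps that the abstract above refers to, here is a minimal plain $k$-means sketch. It is the textbook algorithm only; it does not implement KCoT or the Transformer correspondence claimed in the paper, and the function names are illustrative.

```python
def assign(points, centroids):
    """Assignment step: index of the nearest centroid for each point."""
    def dist2(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    return [
        min(range(len(centroids)), key=lambda j: dist2(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            new_centroids.append(old)  # keep an empty cluster's centroid
            continue
        new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
    return new_centroids


def kmeans(points, centroids, iters=10):
    """Alternate assignment and update for a fixed number of iterations."""
    for _ in range(iters):
        labels = assign(points, centroids)
        centroids = update(points, labels, centroids)
    return assign(points, centroids), centroids


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centroids = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```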
ReclaimNet: Reclaim-Aware Network Protocols for Voluntary GPU Sharing on Campus
Authors: Wenyang Jia, Jingjing Wang, Xianneng Zou, Kai Lei
First: 2026-05-23T22:00:35+00:00 · Latest: 2026-05-23T22:00:35+00:00
Abstract
University campuses host abundant but fragmented GPU resources whose voluntary sharing is blocked by a mismatch between revocable, autonomous ownership and migration mechanisms that assume stationary failure hazards, homogeneous interconnects, and unbounded transfer windows. We present ReclaimNet, a network-layer migration protocol suite that treats provider reclaim as a first-class contract rather than a failure case, combining three mechanisms: (i) reclaim-aware checkpoint scheduling that jointly adapts to time-varying departure hazards and contended bandwidth across co-resident jobs; (ii) volatility-aware destination selection integrating topology, survival probability, and notice-window feasibility; and (iii) deadline-aware migration traffic control with edge enforcement and a submillisecond TC BPF kill-switch. A two-month deployment on a 54-node heterogeneous campus testbed reduces work loss by 66% over Slurm preempt-and-requeue and 38% over pipeline-redundancy checkpointing, with 38% shorter downtime and under 3% degradation of background research traffic. The prototype is open-sourced at the anonymous repository https://anonymous.4open.science/r/ICNP2026-ReclaimNet/.
Summary / 总结
University campuses host abundant but fragmented GPU resources whose voluntary sharing is blocked by a mismatch between revocable, autonomous ownership and migration mechanisms that assume stationary failure hazards, homogeneous interconnects, and unbounded transfer windows.
ScaleAcross Explorer: Exploring Communication Optimization for Scale-Across AI Model Training
Authors: Minghao Li, Alicia Golden, Samuel Hsia, Michael Kuchnik, Adi Gangidi, Xu Zhang, Ashmitha Jeevaraj Shetty, Zachary DeVito, Weiwei Chu, Dong He, Haoci Zhang, Yuchen Hao, Ruoming Pang, James Hongyi Zeng, Ying Zhang, Minlan Yu, Carole-Jean Wu
First: 2026-05-23T01:11:19+00:00 · Latest: 2026-05-23T01:11:19+00:00
Comments: 28 pages, 27 figures
Abstract
The rapid scaling of large language model training requires distributing GPU resources across multiple data center buildings and regions. We refer to this paradigm as "scale-across" training. As infrastructure expands, the system design space becomes increasingly intricate, encompassing new model architectures, hardware heterogeneity, and evolving communication patterns. Drawing from Meta's production experience, we highlight the complexities of deploying training jobs across a few data centers housing hundreds of thousands of GPUs. To accelerate exploration of the large design space and to enable efficient training for frontier model development, we conduct in-depth characterization of three key design dimensions: parallelism placement, parallelism scheduling, and network layer technologies. We then propose ScaleAcross Explorer, an optimizer that considers the interplay of design dimensions and holistically optimizes scale-across training. Testbed experiments and simulations demonstrate up to 64.62% training speedups over production configuration and up to 37.59% training speedups over the state-of-the-art baseline across a wide range of design points.
Summary / 总结
The rapid scaling of large language model training requires distributing GPU resources across multiple data center buildings and regions.
Network Digital Twin for Congestion-Aware Predictive Traffic Routing using Graph MPNNs
Authors: Umer Iqbal, Ashiq Anjum, Anthony S Conway, Mathias Kern, Anasol Pena Rios
First: 2026-05-23T00:53:07+00:00 · Latest: 2026-05-23T00:53:07+00:00
Abstract
Telecom networks scale with growing users and data-intensive applications, generating heavy traffic that causes congestion, reducing throughput, increasing delay, and raising computational costs. Traditional routing protocols act only after performance degradation, making them unsuitable for dynamic traffic and topological changes. Addressing these challenges requires a routing approach that adapts in real time, scales with network growth, operates without disrupting active services, and provides continuous feedback for congestion-aware traffic optimisation. The Network Digital Twin (NDT) addresses these needs by mirroring global network behaviour using Message Passing Neural Networks (MPNNs) through bidirectional communication with the physical network. To align the NDT with physical network behaviour, synthetic traffic is generated with increasing load across topological structures that incrementally scale as routers are added. These topologies are created by graph-generating models such as Erdos-Renyi, Barabasi-Albert, and Watts-Strogatz, customised with vertex degree limitations. The NDT collects performance metrics from routers and links, and MPNNs classify edges based on local vertex and global network behaviours. Based on these classifications, feedback is sent as Policy-Based Routing (PBR) protocol commands to each router, enabling optimal traffic distribution across links of the physical network.
Summary / 总结
Telecom networks scale with growing users and data-intensive applications, generating heavy traffic that causes congestion, reducing throughput, increasing delay, and raising computational costs.
Adversarial Network Imagination: Causal LLMs and Digital Twins for Proactive Telecom Mitigation
Authors: Vignesh Sriram, Yuqiao Meng, Luoxi Tang, Zhaohan Xi
First: 2026-01-09T15:15:05+00:00 · Latest: 2026-05-22T20:00:54+00:00
Abstract
Telecommunication networks experience complex failures such as fiber cuts, traffic overloads, and cascading outages. Existing monitoring and digital twin systems are largely reactive, detecting failures only after service degradation occurs. We propose Adversarial Network Imagination, a closed-loop framework that integrates a Causal Large Language Model (LLM), a Knowledge Graph, and a Digital Twin to proactively generate, simulate, and evaluate adversarial network failures. The Causal LLM produces structured failure scenarios grounded in network dependencies encoded in the Knowledge Graph. These scenarios are executed within a Digital Twin to measure performance degradation and evaluate mitigation strategies. By iteratively refining scenarios based on simulation feedback, the framework shifts network operations from reactive troubleshooting toward anticipatory resilience analysis.
Summary / 总结
Telecommunication networks experience complex failures such as fiber cuts, traffic overloads, and cascading outages.
BShare: Packet Queueing Delay-Driven Buffer Sharing for Datacenter Switches
Authors: Krishna Agarwal, Muhamad Rizka Maulana, Vamsi Addanki, Habib Mostafaei
First: 2026-05-22T19:57:33+00:00 · Latest: 2026-05-22T19:57:33+00:00
Abstract
Modern datacenter switches share packet buffers across ports to boost overall throughput and reduce packet loss. However, as buffer availability per-port-per-bandwidth unit continues to decrease, existing buffer-sharing strategies face increasing performance challenges. Recent efforts have attempted to integrate Buffer Management (BM) with Active Queue Management (AQM) to harness the advantages of both BM and AQM approaches to improve performance. While these hybrid solutions show promise, their complexity of dynamically calculating multiple factors for integration hinders generalization and efficiency. This paper presents BShare, a simple buffer sharing mechanism that uses packet queueing delay. BShare requires only a single operator-configurable parameter. Our simulation results show that BShare improves the flow completion time (FCT) performance of advanced transport protocols, such as PowerTCP, by up to 45.07% compared to ABM, particularly under burst-heavy datacenter workloads.
Summary / 总结
Modern datacenter switches share packet buffers across ports to boost overall throughput and reduce packet loss.
EnCoR: An end-to-end architecture for simplifying cellular networks
Authors: Wesley Woo, Zhuowei Wen, Monniiesh Velmurugan, Richard Raad, Sylvia Ratnasamy, Scott Shenker, Shaddi Hasan
First: 2026-05-21T14:16:33+00:00 · Latest: 2026-05-22T19:50:13+00:00
Abstract
Since their creation, cellular networks have made in-network mobility support a key feature of their service model. While this approach provides seamless connectivity for legacy traffic, it has the side effects of inflating end-user latency and increasing complexity and operational overhead for operators. Yet modern applications and transport protocols are increasingly mobility-tolerant, prompting us to revisit the assumption that mobility must be provided as an in-network service. In this paper, we propose EnCoR (End-to-End Core and RAN), a deployable cellular network architecture that removes mobility from the core entirely. Leveraging end-to-end mobility, EnCoR eliminates tunnel-based IP anchoring while preserving compatibility with existing authentication, charging, and QoS techniques. We demonstrate that EnCoR works with unmodified phones while providing performance equivalent to that of traditional LTE networks for real applications including video and voice calling and video streaming. We show that EnCoR not only allows network operators to reduce end-to-end latency, but can also reduce the capital cost of providing low-latency service to users by more than 90% compared to 3GPP networks, based on cost estimates for cellular network core and border router infrastructure provided by the FCC. Finally, we demonstrate that these gains are achieved while reducing the amount of overall handover control messaging, allowing the EnCoR core network to handle a greater number of mobility handover events than an LTE core under identical hardware constraints, achieving a 2.6x lower handover latency under load.
Summary / 总结
Since their creation, cellular networks have made in-network mobility support a key feature of their service model.
XWind: A Cross-site Router for Large Language Model Inference Serving at Renewable Energy Farms
Authors: Tella Rajashekhar Reddy, Atharva Deshmukh, Liangcheng Yu, Chaojie Zhang, Mike Shepperd, Rohan Gandhi, Anjaly Parayil, Srinivasan Iyengar, Ajay Manchepalli, Debopam Bhattacherjee
First: 2026-05-22T08:08:47+00:00 · Latest: 2026-05-22T08:08:47+00:00
Abstract
AI power demand is growing at an unprecedented rate while power grids are often ailing and struggle to keep up. Grid expansion comes with high capital expenditure and long-distance transmission losses, yet there is abundant renewable energy at the source, just not matched to demand. This paper proposes a complementary AI infrastructure deployment model, AI Greenferencing, that brings modular AI compute to renewable energy sources, focusing on wind, allowing AI footprint expansion, generating local behind-the-meter demand for renewable sites, and helping ease the growing strain on power utilities. Our feasibility analysis shows that 890+ GW of wind capacity lies within 50 ms network round trip time of Azure data centers, and that site-wise right-sizing combined with spatial complementarity of wind energy keeps aggregate fleet utilization on par with traditional deployments. To serve inference requests under variable wind power, we build XWind, a lightweight, reactive, and workload-agnostic AI inference router that uses only real-time signals: inference latency, KV-cache utilization, and queue depth, to dynamically configure sites and distribute requests. Evaluated on a real 64-GPU A100 testbed emulating three wind-powered sites with Azure production traces, XWind reduces P99 end-to-end latency by up to 52% over the strongest contender (also our idea) and by up to 98% over baselines such as power-capping and GPU idling, with consistent gains across workload types, load levels, and GPU generations.
Summary / 总结
AI power demand is growing at an unprecedented rate while power grids are often ailing and struggle to keep up.
Purification Strategy Optimization for Entanglement Routing in Quantum Networks
Authors: Javier Vecino Peñas, Ana Fernández-Vilas, Rebeca P. Díaz-Redondo, Sergio Gándara Gándara, Manuel Fernández-Veiga
First: 2026-05-22T07:45:50+00:00 · Latest: 2026-05-22T07:45:50+00:00
Comments: Accepted in IEEE qCCL'26
Abstract
Quantum networks rely on the efficient distribution of entanglement to enable long-distance quantum communication and information processing. A key challenge in these networks is the design of routing protocols capable of maintaining high quality entanglement in the presence of noise, decoherence, and imperfect operations, which progressively degrade the fidelity of entangled states through entanglement swapping. Entanglement purification provides an effective mechanism to mitigate this degradation at the cost of additional resources. In this work, we study purification-aware quantum routing and formulate the problem of selecting optimal purification strategies as an optimization task. By employing dynamic programming techniques, we identify strategies that optimally balance resource consumption and end-to-end fidelity, demonstrating the effectiveness of our approach across different scenarios.
Summary / 总结
Quantum networks rely on the efficient distribution of entanglement to enable long-distance quantum communication and information processing.
On the Performance of DCF in Full Duplex WLANs with Hidden Terminals
Authors: Anastasios C. Politis, Constantinos S. Hilas, Hristos T. Anastassiu
First: 2026-05-22T06:29:59+00:00 · Latest: 2026-05-22T06:29:59+00:00
Comments: 7 pages, 7 figures, 2022 IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom)
Abstract
Full Duplex (FD) technology is considered one of the next big leaps in the evolution of modern WLANs. Allowing a node to simultaneously transmit a data frame while in receive mode can theoretically double the system throughput. However, several requirements must be fulfilled in order for FD operation to manifest. One obvious prerequisite is that the Medium Access Control (MAC) mechanism must allow two nodes to access the shared medium simultaneously. In modern WLANs the standard MAC layer mechanism is the Distributed Coordination Function (DCF), which is specifically designed to avoid such situations. FD communications may also take place when the physical placement of the communicating parties involves the existence of hidden terminals which, in standard Half Duplex (HD) communications, poses a significant problem. This paper investigates the performance of the Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) protocol, which constitutes the basis of the DCF mechanism, in FD WLANs with hidden terminals, and compares it with the standard HD case. Our analysis is based on performance modelling. Results indicate that, under the DCF regime, FD technology exhibits an exiguous performance improvement, in terms of saturation throughput, when compared with its half duplex counterpart.
Summary / 总结
Full Duplex (FD) technology is considered one of the next big leaps in the evolution of modern WLANs.
Relay-Based Synchronization of Replicated Data Types in Opportunistic Networks
Authors: Frédéric Guidec, Yves Mahéo
First: 2026-05-21T13:44:50+00:00 · Latest: 2026-05-21T13:44:50+00:00
Comments: 33 pages
Abstract
In Opportunistic Networks (OppNets), the dissemination of information can only rely on transient pairwise radio contacts between mobile devices (peers). Designing distributed applications that can run in such conditions is a challenge, but replicated data types, and in particular Conflict-free Replicated Data Types (CRDTs), can help meet this challenge. A CRDT is an inherently replicated data type whose replicas can be updated locally, yet eventually converge thanks to an anti-entropy algorithm that allows all replicas to synchronize in the background. Whether the replicas of a CRDT can actually converge in an OppNet, and how fast they can converge, depend on the occurrence of radio contacts between mobile devices. In this paper we investigate the idea of using mobile relays as a means to boost the convergence of state-based CRDT replicas in an OppNet. New protocols are presented that allow the synchronization of replicas and relays, and new metrics are defined to observe and characterize the convergence of replicas. Simulation results show that using relays can significantly improve this convergence, and even make it possible in scenarios where the replicas alone would be unable to converge.
Summary / 总结
In Opportunistic Networks (OppNets), the dissemination of information can only rely on transient pairwise radio contacts between mobile devices (peers).
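The idea of a relay carrying state between replicas that never meet directly can be sketched with a grow-only counter (G-Counter), a standard state-based CRDT whose merge is a per-replica maximum. This is a generic illustration, not the protocols proposed in the paper; treating the relay as a passive state carrier that merges but never increments is an assumption.

```python
class GCounter:
    """Grow-only counter: a standard state-based CRDT."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments observed from that replica

    def increment(self, n=1):
        """Local update: only this replica's own entry grows."""
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        """Join of two states: element-wise max (commutative, idempotent)."""
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


# Replicas a and b never come into radio contact with each other.
a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)

# A mobile relay meets a, then b, then a again, merging state at each contact.
relay = GCounter("relay")
relay.merge(a)   # contact 1: relay learns a's state
b.merge(relay)   # contact 2: b learns a's state ...
relay.merge(b)   # ... and the relay learns b's state
a.merge(relay)   # contact 3: a learns b's state
```

After the three contacts both replicas report the same value even though they never exchanged state directly.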
SpaceMoE: Realizing Distributed Mixture-of-Experts Inference over Space Networks
Authors: Zhanwei Wang, Huiling Yang, Min Sheng, Khaled B. Letaief, Kaibin Huang
First: 2026-05-01T08:40:31+00:00 · Latest: 2026-05-21T12:32:36+00:00
Abstract
Leveraging continuous solar energy harvesting at high efficiency, space data centers are envisioned as a promising platform for executing energy-intensive large language models (LLMs). Recognizing this advantage, space and AI conglomerates (e.g., SpaceX, Google) are actively investing in this vision. One key challenge, however, is the efficient distributed deployment of a large-scale LLM in a satellite network due to the limited onboard computing and communication resources. This gives rise to a placement problem that involves partitioning and mapping model components to satellites such that the fundamentally different model architecture and network topology can be reconciled to ensure low-latency token generation. To address this problem, we present the Space Network of Mixture-of-Experts (SpaceMoE) framework targeting the distributed execution of a popular mixture-of-experts (MoE) model in space. The proposed placement strategies are two-level: (1) layer placement, which assigns MoE layers to satellite subnets; and (2) intra-layer expert placement, which assigns individual experts to satellites associated with the same layer/subnet. For layer placement, we exploit the ring-like communication pattern of autoregressive inference to partition the satellite constellation along the orbiting direction into subnets arranged on a ring, each hosting one MoE layer. Based on this architecture, we formulate and solve an optimization problem for intra-layer expert placement to map experts with heterogeneous activation probabilities onto satellites. The derived strategy reveals an intuitive principle: a frequently activated expert should be mapped to a satellite on a routing path with low expected latency. Experiments over a thousand-satellite constellation show that SpaceMoE achieves at least a threefold latency reduction compared with conventional random and ablation-based placement strategies.
Summary / 总结
Leveraging continuous solar energy harvesting at high efficiency, space data centers are envisioned as a promising platform for executing energy-intensive large language models (LLMs).
Eliminating Premature Termination in Multihop Rendezvous for Cognitive Radio-based Emergency Response Network
Authors: Zahid Ali, Saritha Unnikrishnan, Eoghan Furey, Ian McLoughlin, Saim Ghafoor
First: 2026-05-21T11:14:11+00:00 · Latest: 2026-05-21T11:14:11+00:00
Comments: Submitted to Results in Engineering, Elsevier
Abstract
In post-disaster environments, damaged communication infrastructure severely limits coordination among emergency response teams. Cognitive radio networks (CRNs) enable rapidly deployable communication by allowing nodes to opportunistically access available spectrum. However, existing multihop rendezvous protocols typically rely on N-1 termination conditions, which can lead to premature termination, resulting in incomplete neighbour discovery and invalid network topology formation. This work identifies this limitation as a previously overlooked issue in multihop rendezvous protocols. This paper proposes a Multihop Reliable Dual-Modular Clock Algorithm (MR-DMCA) that eliminates premature termination and ensures reliable network formation. The proposed protocol introduces a coordinate-assisted neighbour validation mechanism and an autonomous termination strategy that guarantees complete neighbour and topology discovery before protocol termination. Although implemented within MR-DMCA, the proposed validation and termination approach is applicable to a wider class of multihop rendezvous protocols. Extensive simulations demonstrate that, in a worst-case scalable scenario with 20 nodes and 20 channels under high primary radio activity (m=2), MR-DMCA achieves 100% accurate neighbour and topology discovery while reducing rendezvous time by up to 76%, 37%, and 17% compared with baseline protocols. The results highlight that addressing premature termination is critical for reliable multihop rendezvous in cognitive radio-based emergency communication networks.
Summary / 总结
In post-disaster environments, damaged communication infrastructure severely limits coordination among emergency response teams.