arXiv Paper Digest

Latest digest

Mitigating Anchoring Bias in LLM-Based Agents for Energy-Efficient 6G Autonomous Networks

Authors: Hatim Chergui, Claudia Carballo González, Farhad Rezazadeh, Merouane Debbah

First: 2026-06-05T08:29:03+00:00 · Latest: 2026-06-18T09:22:45+00:00

Comments: 7 pages, 4 figures

Abstract

This paper presents an autonomous agentic resource negotiation framework designed to enable zero-touch network slicing in 6G architectures using Large Language Model (LLM) agents. While LLMs offer powerful reasoning capabilities, we demonstrate that such agents inherently suffer from anchoring bias, rigidly adhering to initial heuristic proposals and causing severe network over-provisioning. To systematically mitigate this cognitive bias, we propose a novel randomized anchoring strategy modeled via a Truncated 3-Parameter Weibull distribution. This mathematically bounded approach seamlessly integrates with burst-aware Digital Twins (DTs) employing Conditional Value at Risk (CVaR) to rigorously guarantee strict Service Level Agreement (SLA) tail-latencies. To validate our methodology, we introduce and prove the Bimodal Constraint-Avoidance Utility Theorem, demonstrating that while feasible negotiations follow classical convex bounds, highly constrained scenarios undergo a phase transition governed by an inverse rational decay envelope. Empirical results generated using a locally hosted 1B-parameter model, otel-llm-1b-it, confirm these dual-regime bounds. Our cognitive de-biasing successfully dismantles rigid negotiation patterns, forcing agents into active exploration to safely ride SLA boundaries and boost system energy savings up to 25%. Crucially, the lightweight 1B LLM achieves sub-second inference latencies (0.95 s mean), ensuring our multi-agent framework is compatible with the operational timescales of the O-RAN non-Real-Time RAN Intelligent Controller (non-RT RIC). Our source code is available for non-commercial use at https://github.com/HatimChergui.
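The two statistical ingredients named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the parameter values, and the convention that larger samples are worse (as for latency) are assumptions made for illustration. The sampler draws from a 3-parameter Weibull (shape, scale, location) truncated to an interval by inverting the CDF, and `cvar` averages the worst tail of an empirical sample.

```python
import math
import random


def weibull_cdf(x, shape, scale, loc):
    """CDF of a 3-parameter Weibull (shape, scale, location)."""
    if x <= loc:
        return 0.0
    return 1.0 - math.exp(-(((x - loc) / scale) ** shape))


def sample_truncated_weibull(shape, scale, loc, low, high, rng):
    """Draw one value from the Weibull restricted to [low, high] (inverse-CDF method)."""
    f_low = weibull_cdf(low, shape, scale, loc)
    f_high = weibull_cdf(high, shape, scale, loc)
    # Uniform draw between the CDF values of the two truncation points.
    u = min(rng.uniform(f_low, f_high), 1.0 - 1e-12)
    x = loc + scale * (-math.log1p(-u)) ** (1.0 / shape)
    # Clamp to absorb floating-point rounding at the interval edges.
    return min(max(x, low), high)


def cvar(samples, alpha):
    """Empirical CVaR: mean of the worst (1 - alpha) fraction, larger = worse."""
    ordered = sorted(samples)
    start = min(int(math.floor(alpha * len(ordered))), len(ordered) - 1)
    tail = ordered[start:]
    return sum(tail) / len(tail)
```

A randomized anchor could then be drawn with `sample_truncated_weibull(1.5, 2.0, 1.0, 1.5, 4.0, random.Random(0))`; these numbers are placeholders, not values from the paper.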

Oranits: Mission Assignment and Task Offloading in Open RAN-based ITS using Metaheuristic and Deep Reinforcement Learning

Authors: Ngoc Hung Nguyen, Nguyen Van Thieu, Quang-Trung Luu, Anh Tuan Nguyen, Senura Wanasekara, Nguyen Cong Luong, Fatemeh Kavehmadavani, Van-Dinh Nguyen

Venue: IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2026

First: 2025-07-25T23:13:09+00:00 · Latest: 2026-06-18T09:02:58+00:00

Comments: 16 pages, 13 figures

Abstract

In this paper, we explore mission assignment and task offloading in an Open Radio Access Network (Open RAN)-based intelligent transportation system (ITS), where autonomous vehicles leverage mobile edge computing for efficient processing. Existing studies often overlook the intricate interdependencies between missions and the costs associated with offloading tasks to edge servers, leading to suboptimal decision-making. To bridge this gap, we introduce Oranits, a novel system model that explicitly accounts for mission dependencies and offloading costs while optimizing performance through vehicle cooperation. To achieve this, we propose a twofold optimization approach. First, we develop a metaheuristic-based evolutionary computing algorithm, namely the Chaotic Gaussian-based Global ARO (CGG-ARO), serving as a baseline for one-slot optimization. Second, we design an enhanced reward-based deep reinforcement learning (DRL) framework, referred to as the Multi-agent Double Deep Q-Network (MA-DDQN), that integrates both multi-agent coordination and multi-action selection mechanisms, significantly reducing mission assignment time and improving adaptability over baseline methods. Extensive simulations reveal that CGG-ARO improves the number of completed missions and overall benefit by approximately 7.1% and 7.7%, respectively. Meanwhile, MA-DDQN achieves even greater improvements of 11.0% in terms of mission completions and 12.5% in terms of the overall benefit. These results highlight the effectiveness of Oranits in enabling faster, more adaptive, and more efficient task processing in dynamic ITS environments.
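For readers unfamiliar with the Double DQN idea that MA-DDQN builds on, the standard single-agent target computation can be sketched as follows. This is the generic Double DQN rule only; the multi-agent coordination and multi-action selection described in the abstract are not modeled, and the function and argument names are illustrative assumptions.

```python
def double_dqn_target(reward, gamma, done, q_online_next, q_target_next):
    """Standard Double DQN target for one transition.

    q_online_next / q_target_next: lists of Q-values for the next state,
    one entry per action, from the online and target networks.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # The online network selects the action ...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ... and the target network evaluates it, which reduces overestimation.
    return reward + gamma * q_target_next[best_action]
```

Decoupling action selection from action evaluation is the point of the technique: a plain DQN target would take the maximum over `q_target_next` directly.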

From Privacy to Workflow Integrity: Communication-Graph Metadata in Autonomous Agent Interoperability

Authors: Bijaya Dangol

First: 2026-06-05T11:07:55+00:00 · Latest: 2026-06-17T17:01:06+00:00

Comments: 22 pages, 7 figures, 6 tables

Abstract

Agent-interoperability protocols such as A2A and MCP standardize what agents say to one another but assume address-based transport. Whether over HTTP(S) or a content-protecting binding such as MLS-based SLIM, these transports protect message content yet leave the communication graph exposed: which agent contacts which, when, and how often. In agent systems this graph is more consequential than a privacy framing suggests. Endpoints are capability-labeled, workflows are structured and chained, and interactions are coupled to actions, so an observer recovers more than past relationships: it can recognize a recurring pending workflow from its opening and, at machine speed, act on it before it completes. The threat is one of workflow integrity, not privacy alone. We give a threat model for the communication graph and locate what makes its metadata distinctively consequential: not stronger fingerprinting but exposure across independent trust domains, coupled to autonomous action. We define transport- and bootstrap-layer privacy properties, give them an indistinguishability-game semantics, evaluate transports, and give an A2A case study where a metadata-protecting binding surfaces its implicit identity assumptions. On a corpus of real multi-agent A2A traffic from the official reference agents, on a live A2A binding, and with a generative model as a controlled instrument, a label-blind classifier recovers a task's class from passive metadata at 6x chance, and from only its opening; a defense-aware adversary does not overturn this, and only the full set of properties drives recovery toward chance. Acting on the leak is distinct from recoverability: under a fixed budget an adversary captures 0.63 of a clairvoyant attacker's advantage on the corpus (0.41 from a workflow's opening), governed by top-ranked precision rather than overall accuracy, so integrity and privacy come apart under defense.

A Technical Taxonomy of LLM Agent Communication Protocols

Authors: Linus Sander, Habtom Kahsay Gidey, Alexander Lenz, Alois Knoll

First: 2026-06-17T14:45:20+00:00 · Latest: 2026-06-17T14:45:20+00:00

Abstract

As large language models (LLMs) advance and multi-agent systems aim to overcome the limits of standalone agents, robust communication protocols are becoming essential infrastructure for distributed agent networks. Nonetheless, the fragmented protocol landscape presents a significant interoperability challenge. This study develops a technical taxonomy to classify and analyze LLM agent communication protocols. Following an established iterative method, we defined the taxonomy's purpose, meta-characteristic, and ending conditions, then performed five iterations, three empirical-to-conceptual and two conceptual-to-empirical, on nine actively maintained open-source protocols with demonstrable adoption. The taxonomy comprises five dimensions: counterparty, payload, interaction state, discovery mechanism, and schema flexibility. Classification reveals recurring architectural patterns: all sampled agent-to-agent protocols combine hybrid payloads with session-state persistence; most protocols support multiple predefined schemas, and two negotiate schemas at runtime, indicating a trend toward schema flexibility; decentralized discovery remains rare. Analysis suggests short-term convergence pressure toward protocols unifying agent-to-agent and agent-to-context (tool and data) communication. Long-term, however, no single protocol is likely to maximize versatility, efficiency, and portability simultaneously. The field will more likely evolve toward a federated, layered protocol stack. The framework guides protocol selection and highlights open research gaps such as privacy and policy enforcement.

OmniPlan: An Adaptive Framework for Timely and Near-Optimal Network Planning Optimization

Authors: Longlong Zhu, Jiashuo Yu, Zedi Chen, Yuhan Wu, Zhifan Jiang, Yuchen Xian, Yimeng Liu, Jiajie Su, Shaopeng Zhou, Xingyuan Li, Hongyan Liu, Xuan Liu, Dong Zhang, Chunming Wu, Xiang Chen

Venue: KDD 2026

First: 2026-06-16T16:06:51+00:00 · Latest: 2026-06-17T07:29:09+00:00

Comments: Accepted by ACM KDD 2026

Abstract

Network planning optimization is a fundamental problem across diverse domains, including transportation systems, communication networks, and power grids. It requires simultaneous optimization of multiple competing objectives under complex constraints. Existing network planning optimization frameworks rely on mixed integer programming (MIP) solvers, heuristics, and deep reinforcement learning (DRL) models to compute planning decisions. However, they lack effective adaptability to diverse and dynamic user intents, thus leading to the trade-off between execution time and optimality. In this paper, we propose OmniPlan, an adaptive framework that achieves both timeliness and near-optimality in network planning optimization. To achieve the adaptability lacking in existing solutions, OmniPlan employs a large language model (LLM)-based interpreter to convert heterogeneous natural-language intents into a unified and quantifiable user-preference vector. Then it employs a mixture-of-experts architecture that integrates MIP solvers, heuristics, and DRL models as specialized experts, where OmniPlan adapts to diverse intents by dynamically selecting timely and near-optimal experts. Finally, it incorporates a DRL-based expert configuration module that fine-tunes optimization objective weights to align planning decisions with user-specific preferences. We evaluate OmniPlan with a representative real-world workload, i.e., distributed machine learning (ML), where we leverage OmniPlan to offload a wide spectrum of ML inference tasks, e.g., decision trees, SVM, naive Bayes, XGBoost, and random forests, onto a network of hardware devices. Our experiments on a real-world testbed indicate that OmniPlan achieves near-optimal and low-execution-time offloading for real-world ML inference tasks, reducing latency by up to 97.8% and network device resource consumption by up to 11.5%.

The Multipath Reliable Connection (MRC) Transport

Authors: Rip Sohan, Eric Spada, Eric Davis, Mark Handley, Idan Burstein, Tony Hurson, Jithin Jose, Vivek Kashyap, Rong Pan, Sayantan Sur, Sreevatsa Anantharamu, Aviv Barnea, Adrian Caulfield, Elazar Cohen, Elliot Edmunds, Yamin Friedman, Mahdieh Ghazi, Murali Guramali, Torsten Hoefler, Vipin Jain, Abdul Kabbani, Noam Katz, Yanfang Le, Charlie Mbariky, Guglielmo Morandin, Masoud Moshref, Shane O'Neil, Michael Papamichael, Jonas Pfefferle, Siva Santosh Pyla, Costin Raiciu, David Riddoch, Karen Schramm, Yuval Shpigelman, Shahaf Shuler, Shy Shyman, Raghava Sivaramu, Amin Tootoonchian, Yang Wang

First: 2026-06-16T17:07:00+00:00 · Latest: 2026-06-16T17:07:00+00:00

Abstract

MRC is an open, production-grade transport designed for large-scale AI/ML training over best-effort Ethernet. It extends RoCEv2 with explicit, composable primitives for per-packet multipath and sender-based congestion control, decouples packet delivery from semantic processing, adds multiple new capabilities for accelerated packet-loss recovery and adds resilience against port and path failures. This paper presents MRC and details its core capabilities and mechanisms.

FlowCLIP: Contrastive Pretraining Using Domain Names for Encrypted Traffic Classification

Authors: Eun Hun Choi

First: 2026-06-16T10:06:34+00:00 · Latest: 2026-06-16T10:06:34+00:00

Abstract

Network traffic classification enables website fingerprinting, intrusion detection, and Quality of Service management. However, developing methods that capture stable and generalizable traffic patterns under realistic deployment conditions remains challenging. We introduce FlowCLIP, a contrastive pretraining framework for domain name prediction from encrypted traffic using only side-channel features: packet inter-arrival times, packet sizes, and packet directions. FlowCLIP uses raw domain names as textual supervision by aligning traffic flow representations with domain name representations through a CLIP-style contrastive objective. The pretrained traffic encoder is then frozen and evaluated through linear probing on canonicalized domain name labels. We evaluate FlowCLIP on a large-scale QUIC traffic dataset using a time-based protocol, where models are trained on Week 1 traffic and evaluated on traffic from Weeks 2-4. FlowCLIP outperforms competitive machine learning baselines across later evaluation weeks, suggesting that raw domain names provide a textual supervision signal for learning transferable encrypted traffic representations.
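The CLIP-style contrastive objective mentioned in this abstract can be sketched in plain Python as a symmetric cross-entropy over a flow-by-name similarity matrix. This is a generic illustration rather than FlowCLIP's code: the function names, the list-of-lists embeddings, and the temperature of 0.07 are assumptions, and row i of each input is taken to be a matching flow/domain-name pair.

```python
import math


def _normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _diagonal_cross_entropy(logits):
    """Mean of -log softmax(row)[i] over rows, i.e. the target of row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def clip_loss(flow_emb, name_emb, temperature=0.07):
    """Symmetric contrastive loss between paired flow and domain-name embeddings."""
    flows = [_normalize(v) for v in flow_emb]
    names = [_normalize(v) for v in name_emb]
    # Cosine similarity of every flow with every name, scaled by the temperature.
    logits = [[sum(a * b for a, b in zip(f, n)) / temperature for n in names]
              for f in flows]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the flow-to-name and name-to-flow directions.
    return 0.5 * (_diagonal_cross_entropy(logits) + _diagonal_cross_entropy(logits_t))
```

The loss is small when each flow embedding is closest to its own domain-name embedding and large when pairs are mismatched, which is what pushes the two encoders into a shared space.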

Integration of 5G and Industrial Digital Models: A Case Study with AGVs

Authors: J. Cañete-Martín, J. Gómez-Jerez, M. C. Lucas-Estañ, J. Gozálvez

Venue: Proceedings of 2024 IEEE International Conference on Emerging Technologies and Factory Automation (IEEE ETFA 2024), September, 2024, Padova, Italy

First: 2026-06-16T08:14:24+00:00 · Latest: 2026-06-16T08:14:24+00:00

Abstract

5G is a fundamental technology for the digitalization of smart manufacturing. Smart manufacturing relies on the use of digital models to optimize industrial processes before implementation on the manufacturing plants. These models should account for the impact of 5G communications to adequately dimension and optimize 5G-based industrial processes. This paper presents the first integration of industrial digital models with a 5G digital model, implemented as an Asset Administration Shell (AAS) of a 5G system. The two models are interconnected using an OPC UA-based interface. We evaluate the impact of the integrated model using a use case where Automated Guided Vehicles (AGVs) transport material from a warehouse to production lines. The AGVs periodically exchange their positions over 5G to avoid potential collisions. If the communications fail, the AGVs stop for safety reasons until a reliable 5G connection can be guaranteed. We demonstrate that, by integrating 5G and industrial digital models, it is possible to account for, and quantify, the impact of 5G communications on the operation and productivity of industrial processes. This result highlights the importance and necessity of integrating 5G into industrial digital models for their joint design and optimization.

LLM-Aided Joint Secrecy Precoding and Trajectory for RSMA-Based Heterogeneous UAV Networks

Authors: Lijie Zheng, Ji He, Shih Yu Chang, Yulong Shen

First: 2025-07-23T04:22:57+00:00 · Latest: 2026-06-16T07:54:03+00:00

Abstract

This paper investigates secure communications in rate-splitting multiple access (RSMA) enabled heterogeneous UAV networks, where multiple UAVs collaboratively serve ground terminals in the presence of eavesdroppers. By jointly considering secrecy rate maximization and propulsion energy consumption minimization, we formulate a multi-objective optimization problem involving UAV trajectory design, service association, power allocation, and secrecy precoding under mobility, collision-avoidance, service-capacity, and communication constraints. The formulated problem is highly non-convex due to the coupling among UAV trajectories, RSMA transmission variables, and secrecy constraints. To address the resulting non-convex and highly coupled optimization problem, we propose a hierarchical optimization framework. The inner layer uses a semidefinite relaxation (SDR)-based S2DC algorithm combining penalty functions and difference-of-convex (D.C.) programming to solve the secrecy precoding problem with fixed UAV positions. The outer layer introduces a Large Language Model (LLM)-guided heuristic multi-agent reinforcement learning approach (LLM-HeMARL) for trajectory optimization. LLM-HeMARL efficiently incorporates an LLM-generated expert heuristic policy, enabling UAVs to learn energy-aware, security-driven trajectories without the inference overhead of real-time LLM calls. The simulation results show that our method outperforms existing baselines in secrecy rate and energy efficiency, with consistent robustness across varying UAV swarm sizes and random seeds.

Distributed General-Purpose Agent Networks: Architecture, Key Mechanisms, and Prototypes

Authors: Shengli Zhang, Deen Ma, Zibin Lin, Taotao Wang

First: 2026-06-15T23:57:13+00:00 · Latest: 2026-06-15T23:57:13+00:00

Abstract

Large language models have accelerated the transition from passive conversational assistants to autonomous agents that can understand goals, plan actions, invoke tools, and execute multi-step tasks. Yet the capability of a single agent remains constrained by its local data, tool permissions, runtime environment, and governance boundary. This paper studies distributed general-purpose agent networks: open peer-to-peer networks in which heterogeneous agents deployed on personal devices, edge nodes, or autonomous computing environments can discover one another, establish trust, negotiate cooperation rules, and execute open-ended tasks. We argue that such networks cannot be obtained by simply combining existing peer-to-peer overlays with conventional multi-agent systems. Unlike traditional P2P networks, agent networks must propagate semantic declarations about intentions, capabilities, states, and cooperation constraints. We therefore propose a layered architecture centered on a protocol adaptation layer that connects upper-level task semantics with lower-level network operations. Based on this architecture, the paper identifies three core mechanism problems: semantic announcement propagation for collaborator discovery, verifiable identity and multi-topic reputation for cooperation governance, and semantic-gradient mechanism design for open task execution. For each problem, we present a technical route, including bodyless gossip with sequential logs, BAID-based identity binding with MG-EigenTrust reputation, and a Stackelberg-style mechanism-generation loop driven by semantic attribution feedback. We further report prototype overhead results for BAID-style tiered verification and mechanism-level simulations of MG-EigenTrust under cross-topic disguise-collusion attacks. The resulting framework provides a system-level foundation for open, trustworthy, and scalable agent collaboration.

Large Language Models for Agentic NetOps and AIOps: Architectures, Evaluation, and Safety

Authors: Muhammad Bilal, Jon Crowcroft, Ruizhi Wang, Xiaolong Xu, Schahram Dustdar

First: 2026-05-12T20:31:41+00:00 · Latest: 2026-06-15T20:14:07+00:00

Comments: 49 pages, 15 figures, 6 tables; survey article

Abstract

Large language models are increasingly being used to support network operations (NetOps) and artificial intelligence for IT operations (AIOps), including incident investigation, root-cause analysis, configuration synthesis, and limited self-healing. In both NetOps and AIOps, this shift is changing how tasks are managed. Agent-based operations work as workflows, from gathering evidence to taking action, following permissions, policies, and checks, and providing rollback options when necessary. This is crucial because operational decisions can have instant impacts. To make the argument concrete, we organise the relevant literature around the hierarchy of autonomy, tool scope, evidence traces, and assurance contracts. These contracts define what an agent may observe, propose, and execute. They also define the checks that must pass before any action is allowed. A consistent pattern appears across work on telemetry query recommendation, diagnosis, root-cause analysis, configuration synthesis, change planning, and limited self-healing. Operational reliability does not come chiefly from the model itself. It depends on the machinery around the model. We also argue that evaluation should go beyond static question answering. Agentic NetOps and AIOps systems require workflow-centred evaluation, including trace quality, bounded tool use, safe proposal generation, replay in sandboxed environments, and canary trials with rollback-aware scoring. Without these measures, a system may appear robust yet remain too fragile. Finally, we examine security, privacy, and governance risks that become acute when agents sit close to operational control surfaces. Taken together, the survey concludes that progress in intelligent NetOps and AIOps will depend on treating autonomy as a constrained operational control problem, whose outputs must be reliable, auditable, and securely deployable.

STEPS: Semantic Contract-Guided Scheduling for LLM-Assisted Natural Language-Driven Edge AI Services

Authors: Houyi Qi, Minghui Liwang, Xianbin Wang, Xinlei Yi, Seyyedali Hosseinalipour

First: 2026-06-08T14:19:47+00:00 · Latest: 2026-06-15T16:32:11+00:00

Abstract

Edge user/service scheduling has become a cornerstone of distributed AI systems, determining where and how AI services are executed under limited communication and computing resources. Existing edge scheduling frameworks usually assume that service requirements are given as numerical constraints, such as latency bounds or energy budgets. In practice, users often express service expectations through ambiguous and context-dependent natural language, creating a gap between user intent and scheduling decisions. To bridge this semantic-to-optimization gap, we propose semantic contract-guided edge potential scheduling (STEPS), a natural language-driven scheduling framework that introduces semantic contracts as executable interfaces between user-side semantics and edge-side decision making. In STEPS, a large language model (LLM)-assisted parser interprets natural language requests and extracts semantic service requirements with confidence scores, which are converted into service requirements and semantic uncertainty. Based on this information, STEPS formulates edge scheduling as a contract-guided potential game that jointly determines execution-node selection, computing-resource provisioning, and bandwidth allocation. STEPS further uses feedback signals to support adaptive scheduling under evolving service and network conditions. We characterize the exact potential game structure, establish the existence of a pure-strategy Nash equilibrium, and prove convergence and stability properties of the scheduling and adaptation processes. Extensive experiments show that STEPS improves semantic contract fulfillment, reduces contract-guided service loss, and maintains robust adaptation under ambiguous natural language requests in non-stationary networked AI environments.

Single-Connection Mixed-Criticality Transport with CATS: Bounded Guarantees, Three Structural Limits, and a QUIC Escape

Authors: Syed Muhammad Aqdas Rizvi

First: 2026-06-15T16:27:07+00:00 · Latest: 2026-06-15T16:27:07+00:00

Comments: 9 pages, 4 figures, 1 table

Abstract

Mixed-criticality applications, such as satellite terminals, industrial telemetry, embedded systems, and tactical and other constrained links, often multiplex a small, latency-critical message class and bulk traffic over a single commodity transport connection. A single FIFO connection can starve the critical class under load. The obvious alternative, opening parallel connections, costs an additional five-tuple (often blocked by carrier-grade NAT, port budgets, and operator policy) and is not always available; when the critical class is light, two connections can also be bandwidth-fair only in aggregate rather than single-flow fair. We present CATS (Conductor-driven Asymmetric Transport Scheme), a sender-side, receiver-transparent transport-layer priority scheme over TCP: a Conductor assigns each message a priority class and just-in-time sequence numbers, using a credit-based shaper. CATS provides the one combination its alternatives cannot: deterministic non-starvation together with single-flow fairness, plus a provable bounded per-class delay. We then show that, crucially, CATS-over-TCP is not a tail-latency mechanism, and why. Three structural barriers bound in-band priority: the in-order sequence space (head-of-line blocking), the shared congestion window (cross-class coupling), and the per-flow granularity of network QoS (in-band priority is invisible to it). These barriers explain why fair-queuing and even the modern low-latency standard L4S cannot help a single connection, and why two parallel connections reduce the latency tail at the cost of an additional flow. We give CATS-over-QUIC as the principled escape: independent streams with per-stream isolation under aggregate-coupled congestion control self-isolate at the endpoint, attaining the guarantees on one fair flow. An ns-3 evaluation and QUIC proof-of-concept support the findings.
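The general shape of a credit-based shaper that feeds two message classes into one ordered stream can be sketched as below. This is a hypothetical toy, not the CATS Conductor: the class name, the one-message-per-tick model, and the credit parameters are all assumptions. It shows only how credits let the critical class jump ahead without starving bulk traffic, with sequence numbers assigned at transmit time.

```python
from collections import deque


class CreditShaper:
    """Toy two-class sender: critical messages go first while they hold credit."""

    def __init__(self, credit_per_tick, max_credit):
        self.credit_per_tick = credit_per_tick
        self.max_credit = max_credit
        self.credit = 0.0
        self.critical = deque()
        self.bulk = deque()
        self.next_seq = 0

    def enqueue(self, payload, critical):
        (self.critical if critical else self.bulk).append(payload)

    def tick(self):
        """Transmit one message this slot; returns (seq, class, payload) or None."""
        self.credit = min(self.max_credit, self.credit + self.credit_per_tick)
        if self.critical and (self.credit >= 1.0 or not self.bulk):
            # Spend a credit when available; otherwise the link was idle anyway.
            self.credit = max(0.0, self.credit - 1.0)
            cls, payload = "critical", self.critical.popleft()
        elif self.bulk:
            cls, payload = "bulk", self.bulk.popleft()
        else:
            return None
        # Sequence numbers are assigned just in time, at transmission.
        seq = self.next_seq
        self.next_seq += 1
        return seq, cls, payload
```

With `credit_per_tick=0.5`, a backlog of both classes is served in alternation, so neither class waits indefinitely.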

Conflict-Aware Federated Fine-Tuning of Large Language Models with Mixture-of-Experts

Authors: Yijun Lu, Zihan Fang, Pengpeng Qiao, Zheng Lin, Jing Yang, Yuxin Zhang, Por Lip Yee, Zhe Chen, Jun Luo

First: 2026-06-14T06:22:09+00:00 · Latest: 2026-06-14T06:22:09+00:00

Comments: 6 pages, 4 figures

Abstract

The continuous scaling of large language models (LLMs) incurs prohibitive computational costs, making Mixture-of-Experts (MoE) a scalable alternative for efficient fine-tuning via sparse activation. While federated learning (FL) emerges as the paradigm for privacy-preserving collaborative optimization, integrating MoE into FL under data heterogeneity may trigger conflicting expert optimizations. Client-specific data distributions force same-indexed experts to optimize under inconsistent or even conflicting feature-label correlations. This mismatch induces destructive interference during aggregation, thus destabilizing the optimization trajectory and degrading model performance. To address this issue, we propose FC-MoE, a federated conflict-aware framework for MoE fine-tuning. It employs an importance-aware weighting scheme to prioritize reliable local updates and utilizes gradient consensus projection to suppress conflicting updates, ensuring a stable global optimization path. Moreover, a local knowledge retention mechanism further preserves specialized client expertise by re-anchoring domain-specific residuals. Extensive experiments demonstrate that FC-MoE accelerates convergence and enhances both global and local model performance in non-IID federated environments.

Solyx AI Grid: Hardware-Telemetry-Aware Routing Across Geographically Distributed GPU Clusters

Authors: Aleks Bernhard, Nithin Katla

First: 2026-06-13T01:40:29+00:00 · Latest: 2026-06-13T01:40:29+00:00

Comments: 15 pages, 9 tables

Abstract

As GPU capacity fragments across geographically distributed sites, single-cluster LLM inference routing assumptions break down in measurable ways. We present Solyx AI Grid, a cross-site inference routing control plane that integrates GPU hardware telemetry (DCGM), vLLM application metrics, and real-time WAN signals (RTT, jitter) into per-request placement decisions via a 10-signal weighted pressure scorer. Across two empirical campaigns (six H100/H200 SXM GPUs and nine RTX PRO 6000 Blackwell SE GPUs spanning three US datacenters, eight workload classes, and a 216-cell SLO matrix), Solyx AI Grid delivers 1.56 to 1.75x throughput at tier-2 SLO over round-robin across all eight classes, cuts capability-mismatch leakage to 0.43% (versus 32% for standard routers), and reroutes around failures at a p99 of 1,247 ms versus 4,226 ms. We further find that GPU hardware telemetry leads application-layer SLO breach by 11.2 seconds on average, enabling proactive traffic drain before user-facing latency impact. To our knowledge, this is the first public empirical study of live physical multi-site LLM inference routing combining hardware telemetry, application metrics, and active WAN path signals.
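A weighted pressure scorer of the kind described here can be sketched as a normalized weighted sum over per-site signals, with each request routed to the site under the least pressure. The abstract does not list the ten signals or their weights, so the three signal names, the weights, and the assumption that every signal is pre-normalized to [0, 1] are hypothetical.

```python
def pressure(signals, weights):
    """Weighted average of normalized signals (0 = idle, 1 = saturated)."""
    total_weight = sum(weights.values())
    return sum(weights[name] * signals[name] for name in weights) / total_weight


def pick_site(sites, weights):
    """Return the name of the site with the lowest pressure score."""
    return min(sites, key=lambda name: pressure(sites[name], weights))


# Hypothetical signals and weights, for illustration only.
WEIGHTS = {"gpu_util": 2.0, "queue_depth": 1.0, "wan_rtt": 1.0}
SITES = {
    "site_a": {"gpu_util": 0.9, "queue_depth": 0.5, "wan_rtt": 0.1},
    "site_b": {"gpu_util": 0.4, "queue_depth": 0.6, "wan_rtt": 0.8},
}
```

Here `pick_site(SITES, WEIGHTS)` favors `site_b`: its farther WAN path is outweighed by the heavily loaded GPUs at `site_a`.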

StreamRTPS: Increasing DDS Bandwidth Efficiency by Reducing Protocol Overhead

Authors: David Philipp Klüner, Stefan Kowalewski, Alexandru Kampmann

First: 2026-06-12T07:51:41+00:00 · Latest: 2026-06-12T07:51:41+00:00

Comments: 8 pages, 8 figures

Abstract

In this paper, we propose three extensions to the Real-Time Publish Subscribe (RTPS) wire protocol, on which Data Distribution Service (DDS) is based, to improve bandwidth efficiency. First, a stream negotiation mechanism exchanges static header information during discovery, replacing the full RTPS header at runtime with a compact 2 B identifier. Second, a payload aggregation scheme aggregates samples for the same locator into single UDP packets, reducing IP and UDP header costs. Third, a predictive heartbeat suppression strategy reduces control traffic by omitting heartbeats for periodic communication patterns, falling back upon detected loss or timing violations. All three mechanisms preserve RTPS compatibility by extending DDS discovery to activate these features when supported. Experimental results show that stream headers reduce bandwidth consumption by up to 27.9 % compared to conventional RTPS under best-effort transport, and that heartbeat suppression yields a further 22.7 % reduction on top of stream headers under reliable transport, while preserving transmission latency in both cases.
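The first two mechanisms lend themselves to a back-of-the-envelope byte count, sketched below. The model is a simplification under stated assumptions: IPv4 without options (20 B), UDP (8 B), only the fixed 20 B RTPS message header replaced by the 2 B identifier, and submessage headers and link-layer framing ignored. It is not meant to reproduce the paper's measured figures.

```python
IPV4_HEADER = 20   # bytes, IPv4 without options (assumption)
UDP_HEADER = 8     # bytes
RTPS_HEADER = 20   # bytes: protocol id, version, vendor id, GUID prefix
STREAM_ID = 2      # bytes: compact identifier named in the abstract


def bytes_per_sample(payload, samples_per_packet=1, stream_header=False):
    """Average on-wire bytes per sample under the simplified model above."""
    message_header = STREAM_ID if stream_header else RTPS_HEADER
    packet = IPV4_HEADER + UDP_HEADER + message_header + samples_per_packet * payload
    return packet / samples_per_packet
```

For a 32 B payload, this model gives 80 B per sample with the full header, 62 B with the stream identifier, and 39.5 B when four samples also share one packet. Small payloads benefit most because header bytes dominate the packet.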

The Internet Runs on Names

Authors: Geoff Huston, Lixia Zhang

First: 2026-05-15T06:01:30+00:00 · Latest: 2026-06-12T02:53:07+00:00

Abstract

The Internet's TCP/IP architecture was designed for resilient packet delivery between hosts identified by IP addresses. Over time, however, the consolidation of applications and services into large-scale platforms built on that universal packet-delivery substrate drove deployment practices that fundamentally changed the Internet's operational model: the network now operates primarily on names. DNS names have become the basis for service identity, reachability, load balancing, and trust, while IP addresses have become ephemeral routing locators. This change was driven by application needs and platform consolidation in the absence of any overarching plan. The resulting mismatch between the original address-based design and the current name-based operation leads to serious consequences: operational complexity that grows with each new layer of indirection, fragility, and vulnerability - as seen in recent high-profile outages. This paper exposes this mismatch as a necessary first step toward understanding its consequences and addressing the risks of continuing on the same path.

Defending the Core: A Centrality-Based Protection Strategy for Supply Chain Security in npm Dependency Network

Authors: Zixin Wang

First: 2026-06-12T02:22:44+00:00 · Latest: 2026-06-12T02:22:44+00:00

Abstract

The modern software supply chain, exemplified by the Node Package Manager (npm) dependency network, relies heavily on shared open-source dependencies. While this promotes rapid development, it also introduces systemic vulnerabilities. To assess this risk, we analyze the npm dependency network by modeling 53,481 packages and 78,520 dependency edges, and classify the network as a scale-free topology. We then demonstrate its inherent vulnerability to targeted attacks on high-degree hubs. To mitigate this, we propose and evaluate a dual-pronged defense strategy consisting of Centrality-Based Node-Hardening and a Dependency Weight Warning system. Moreover, by simulating the network under various attack scenarios, we show that applying strict security protocols to just the top 1% of nodes, combined with pruning 30% of structurally trivial edges, prevents catastrophic network collapse and neutralizes cascading malware infections. The source code can be found at https://github.com/5tarWhee1/Centrality-Based-Protection-Strategy-for-Supply-Chain-Security-in-npm-Dependency-Network.
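
Why hub-dominated networks are fragile under targeted attack can be seen on a toy graph. The sketch below is not the authors' code: it treats dependencies as undirected edges for simplicity, uses a made-up hub-and-spoke graph (`core`, `pkg0` to `pkg8`), and measures damage as the size of the largest connected component after node removal.

```python
from collections import deque


def largest_component(adj: dict, removed: set) -> int:
    """Size of the largest connected component after deleting `removed` nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over surviving nodes
            node = queue.popleft()
            size += 1
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best


def top_by_degree(adj: dict, k: int) -> list:
    """The k highest-degree nodes, i.e. the candidates for hardening."""
    return sorted(adj, key=lambda node: len(adj[node]), reverse=True)[:k]


# Hypothetical toy graph: one hub package that nine leaf packages depend on.
LEAVES = [f"pkg{i}" for i in range(9)]
GRAPH = {"core": set(LEAVES)}
GRAPH.update({leaf: {"core"} for leaf in LEAVES})

if __name__ == "__main__":
    print("hub:", top_by_degree(GRAPH, 1))
    print("after removing the hub:", largest_component(GRAPH, {"core"}))
    print("after removing one leaf:", largest_component(GRAPH, {"pkg0"}))
```

Removing the single hub shatters the graph into isolated nodes, while removing a leaf barely changes it; this asymmetry is what a centrality-based hardening strategy targets.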

The Bilateral Efficiency of Ethernet: Recalibrating Metcalfe and Boggs After Fifty Years

Authors: Paul Borrill

First: 2026-03-19T18:57:48+00:00 · Latest: 2026-06-11T23:39:08+00:00

Comments: 15 pages, including an appendix on the Open Aethernet fault model. 50th anniversary of Metcalfe and Boggs (1976). v2: renamed Open Aethernet; quantum/TSVF framing removed; added bilateral-zigzag history, intra-rack scope (<= 1 m), and the fault-model appendix

Abstract

In July 1976, Metcalfe and Boggs published their foundational paper on Ethernet in Communications of the ACM. Their efficiency model -- E = (P/C)/(P/C + W*T) -- measures the fraction of Ether time carrying good forward packets under contention. For fifty years this model has framed how the community thinks about Ethernet performance. We argue it is silent on the question that matters for modern intra-rack interconnect: bilateral transaction efficiency -- the fraction of link time that produces committed agreements between sender and receiver. Metcalfe and Boggs themselves planted the seed in their EFTP "end-dally" protocol (Section 7.2.2), and the deeper anchor is older still: Abramson's Alohanet carried positive acknowledgments at the link layer -- a bilateral mechanism Metcalfe consciously removed in 1973 to obtain Ethernet's simple, ACK-free packet format. The result is a fifty-year bilateral zigzag: Aloha (bilateral) to Ethernet (unilateral) to the EFTP end-dally (bilateral) to TCP (unilateral-with-bilateral-above). We formalize bilateral efficiency, connect it to the back-to-back Shannon channel with Perfect Information Feedback, and -- scoping the claim explicitly to intra-rack distances of one meter or less -- describe how the Open Aethernet link recovers mutual knowledge at the link layer. The correction to Table 1 is not a different set of numbers. It is a different question.
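
For reference, the efficiency model quoted in the abstract can be evaluated numerically. In the sketch below, P is the packet length in bits, C the channel capacity in bit/s, T the slot time, and W the mean number of slots spent in contention. The expressions A = (1 - 1/Q)^(Q-1) and W = (1 - A)/A for Q contending stations are recalled from the 1976 paper rather than stated in the abstract, and the parameter values (3 Mbit/s, 16 microsecond slot, 4096-bit packet) are illustrative assumptions.

```python
def acquisition_probability(q: int) -> float:
    """Probability A that exactly one of q stations acquires a slot.

    Assumes each station transmits in a slot with probability 1/q.
    """
    return (1 - 1 / q) ** (q - 1)


def efficiency(q: int, packet_bits: float, capacity_bps: float, slot_s: float) -> float:
    """E = (P/C) / (P/C + W*T), with W = (1 - A) / A mean contention slots."""
    a = acquisition_probability(q)
    w = (1 - a) / a
    transmit_s = packet_bits / capacity_bps
    return transmit_s / (transmit_s + w * slot_s)


if __name__ == "__main__":
    # Illustrative parameters, not taken from the abstract.
    for q in (1, 2, 16, 256):
        print(q, round(efficiency(q, 4096, 3e6, 16e-6), 4))
```

Note that this quantity only counts forward packet time under contention; it says nothing about whether sender and receiver have reached agreement, which is the gap the abstract points at.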

A Tutorial on IEEE 802.11bn Multi-AP Coordination for Wi-Fi 8: From Standardization to Performance Evaluation

Authors: Francesc Wilhelmi, Boris Bellalta, Giovanni Geraci, Lorenzo Galati-Giordano, Francesca Meneghello, Aleksandra Kijanka, Iñaki Val, David López-Pérez

First: 2026-06-11T16:44:37+00:00 · Latest: 2026-06-11T16:44:37+00:00

Abstract

The IEEE 802.11bn amendment defines significant modifications to the standard by establishing Ultra High Reliability (UHR) targets in Wireless Local Area Networks (WLANs). This is expected to deliver substantial enhancements over previous standards, including modes of operation that increase throughput, reduce the 95th percentile of the latency distribution, and decrease MAC Protocol Data Unit (MPDU) loss (all by at least 25%) compared to Extremely High Throughput (EHT) operations defined in the 802.11be amendment. A fundamental innovation for achieving these ambitious goals is the introduction of Multi-Access Point Coordination (MAPC), an unprecedented feature whereby APs will be able to coordinate among themselves to enhance spectrum utilization and improve reliability. This paper provides a comprehensive overview and analysis of this key framework. We begin by reviewing existing AP coordination solutions that precede the 802.11bn standard, which serve as a foundation for understanding the transition to the current framework. We then describe the technical 802.11bn MAPC framework as defined by the task group. A detailed overview of each candidate MAPC feature is provided, contextualized with the relevant state-of-the-art. Furthermore, we introduce Kom8ndor, an open-source Wi-Fi 8 simulation tool, to evaluate these candidate MAPC features and showcase their potential to achieve UHR goals. Finally, we outline the future of MAPC beyond 802.11bn, exploring promising directions such as more advanced coordination schemes (e.g., Joint Transmission (JT)) and new ideas.

Measurement-Based Performance Evaluation of SmartRSUs with Heterogeneous Antenna Architectures for V2X Communications

Authors: Marco Savarese, Gaetano Orazio Cauchi, Salvatore Iandolo, Antonio Solida, Martin Klapez, Maurizio Casoni, Micaela Verucchi, Enrico Vincenzi, Ignacio Sanudo Olmedo, Marko Bertogna, Carlo Augusto Grazia

First: 2026-06-11T13:26:05+00:00 · Latest: 2026-06-11T13:26:05+00:00

Comments: Accepted for publication at the 2026 IEEE International Workshop on Metrology for Automotive (MetroAutomotive 2026)

Abstract

This paper presents a measurement-based performance evaluation of two custom Smart Roadside Units (SmartRSUs) featuring different V2X antenna architectures. The first configuration integrates GNSS and communication antennas into an all-in-one rooftop module, whereas the second uses external dual ITS-G5 (IEEE 802.11p) antennas operating at 5.9 GHz and a dedicated GNSS antenna. Both systems are built upon a proprietary On-Board Unit (OBU) platform adapted for infrastructure deployment. The experimental campaign evaluates key V2X communication metrics, including coverage, received signal strength indicator (RSSI), packet loss, and end-to-end latency in both transmission (OBU-to-infrastructure) and reception (infrastructure-to-OBU) directions. To ensure objective validation, a commercial off-the-shelf V2X Roadside Unit is co-located on the same infrastructure and used as a performance benchmark, providing ground-truth reference measurements under identical environmental conditions. Results highlight the impact of antenna design and placement on communication reliability and latency, revealing trade-offs between integrated and external antenna configurations in real-world deployment scenarios. The findings provide practical insights for the design and optimization of next-generation SmartRSUs in cooperative intelligent transportation systems (C-ITS).

Feasibility Assessment of Remote Driving via Latency Analysis of ITS-G5 and Cellular Networks in the MASA Living Lab

Authors: Gaetano Orazio Cauchi, Antonio Solida, Salvatore Iandolo, Marco Savarese, Martin Klapez, Enrico Rossini, Marcello Pietri, Marco Picone, Marco Mamei, Maurizio Casoni, Carlo Augusto Grazia

First: 2026-06-11T12:47:32+00:00 · Latest: 2026-06-11T12:47:32+00:00

Comments: Accepted for publication at the IEEE 2026 Vehicular Technology Conference (VTC2026-Spring)

Abstract

Remote driving has gained increasing attention as a key enabler for connected and automated vehicles. Yet its practical deployment hinges on wireless networks' ability to guarantee low, predictable latency. In this paper, we present an extensive latency analysis of ITS-G5 and cellular (5G) technologies within the Modena Automotive Smart Area (MASA), a real-world, city-scale testbed equipped with a distributed intelligent transportation infrastructure. By conducting controlled experiments under varying network loads and traffic conditions, we measure network and end-to-end latency components relevant to remote driving, in which the uplink consists of a continuous video stream transmitted from the vehicle to the remote operator, and the downlink conveys control commands back to the car. Measurements conducted under diverse conditions reveal how latency and variability differ across the two technologies and how infrastructure coverage impacts video-stream transmission performance. Based on the observed latency distributions and reliability metrics, we assess the practical feasibility and safety margins of remote driving in mixed network environments. The results provide actionable insights for future teleoperation deployments and motivate hybrid communication strategies that combine the strengths of ITS-G5 and cellular networks.

LNTest: A Testbed for Evaluating Bitcoin Lightning Network-Based Botnets

Authors: Thomas Bakaysa, Ahmet Kurt, Abdul-Salem Beibitkhan, Jesus Maria Romo Diaz de Leon, Tag Kalat, Joshua Kramer, Estela Rodriguez, Abraham Watkins, Abdullah Aydeger

First: 2026-06-11T04:29:49+00:00 · Latest: 2026-06-11T04:29:49+00:00

Comments: Accepted at the 21st International Conference on Availability, Reliability and Security (ARES 2026)

Abstract

Bitcoin's Lightning Network (LN) can be exploited as a covert, low-cost command-and-control (C&C) channel for botnets, as demonstrated by the LNBot and D-LNBot designs. However, both remain proof-of-concept prototypes evaluated only through simulation, leaving key questions about real-world topology formation, propagation complexity, and resilience to takedowns unanswered. We present LNTest, the first reusable testbed for LN-based botnets, built from Core Lightning nodes containerized with Docker over a shared Bitcoin Core regtest chain. LNTest supports three overlay topology modes (a deterministic chain, autonomous peer discovery, and user-supplied graphs), enabling controlled experiments across different botnet structures. Using LNTest, we report three main findings. First, D-LNBot's autonomous formation protocol does not produce the uniform chain from its design; instead, it creates a clustered chain in which cliques are linked by bridge nodes whose removal fragments the network. Second, command propagation scales linearly with botnet size ($Θ(n)$), not the $O(m \log n)$ previously claimed, and gains nothing from higher neighbor connectivity. Third, the overlay topology determines the effectiveness of takedown strategies: uniform-degree chains resist targeted removal but fragment under random failure, scale-free topologies show the opposite pattern, and the autonomous clustered chain is fragile under both, making it the most vulnerable of the three. LNTest is released as open source, with a script that reproduces all our experiments, to support reproducible research on LN-based botnet defenses.

The Internet of Agentic AI: Communication, Coordination, and Collective Intelligence at Scale

Authors: Quanyan Zhu

First: 2026-06-11T03:02:59+00:00 · Latest: 2026-06-11T03:02:59+00:00

Abstract

The rapid emergence of autonomous AI agents is transforming artificial intelligence from isolated model inference into distributed systems of reasoning, communication, and action. This paper develops the vision of the Internet of Agentic AI (IoAI): an open ecosystem in which heterogeneous agents discover one another, negotiate responsibilities, exchange context, invoke tools, and execute workflows across cloud, edge, device, organizational, and cyber-physical environments. We synthesize foundations from single-agent agentic AI, multi-agent systems, distributed computing, communication networks, game theory, and security engineering to characterize the architectures and mechanisms required for scalable agent ecosystems. The paper examines agent deployment models, workflow lifecycles, communication protocols, interoperability layers, resource-management challenges, and trust architectures, with case studies in adaptive manufacturing and distributed operational coordination. The resulting framework highlights the central research challenges of controlled emergence, semantic interoperability, secure identity, incentive-compatible coordination, resource-aware orchestration, and governance for large-scale networks of autonomous agents.

Compact LLM Deployment and World Model Assisted Offloading in Mobile Edge Computing

Authors: Ruichen Zhang, Xiaofeng Luo, Jiayi He, Jiawen Kang, Zehui Xiong, Shiwen Mao

First: 2026-02-14T06:37:29+00:00 · Latest: 2026-06-10T14:16:41+00:00

Comments: 16 pages, 10 figures

Abstract

This paper investigates compact large language model (LLM) deployment and world-model-assisted inference offloading in mobile edge computing (MEC) networks. We first propose an edge compact LLM deployment (ECLD) framework that jointly applies structured pruning, low-bit quantization, and knowledge distillation to construct edge-deployable LLM variants, and we evaluate these models using four complementary metrics: accessibility, energy consumption, hallucination rate, and generalization accuracy. Building on the resulting compact models, we formulate an MEC offloading optimization problem that minimizes the long-term average inference latency subject to per-device energy budgets and LLM-specific quality-of-service constraints on effective accuracy and hallucination. To solve this problem under unknown and time-varying network dynamics, we develop a world model-proximal policy optimization (PPO) algorithm, which augments an on-policy PPO algorithm with a learned recurrent world model that provides improved value targets and short imagination rollouts. Extensive experiments on Llama-3.1-8B, Qwen3-8B, and Mistral-12B show that ECLD compresses base models by about 70-80% in storage (e.g., from 15.3 GB to 3.3 GB for Llama-3.1-8B) and reduces per-query energy consumption by up to 50%, while largely preserving accuracy and often lowering hallucination compared with quantization-only or pruning-only baselines. The experiments also show that world model-PPO speeds up convergence by about 50%, improves the final reward by 15.8% over vanilla PPO, and reduces average inference latency by 12-30% across different user populations, while satisfying the accuracy and hallucination constraints and approaching the generation quality of always-offloading with much of the efficiency of local execution.

LLM-Enabled NWDAF: A Step Toward AI-Native 6G Network Intelligence

Authors: Henok Daniel, Omar Alhussein, Cheng Li, Jie Liang, Ernesto Damiani

First: 2026-06-10T10:00:26+00:00 · Latest: 2026-06-10T10:00:26+00:00

Comments: 20 pages

Abstract

The Network Data Analytics Function (NWDAF) is central to enabling zero-touch network management in fifth-generation (5G) networks by supporting real-time analytics and closed-loop automation. Despite its critical role, open-source NWDAF implementations remain limited in scope and accessibility. In this paper, we develop an open-source NWDAF, compatible with the open-source core network Free5GC, that collects network data via subscriptions to Network Functions (NFs), and also includes an integrated Large Language Model (LLM) interface that enables natural language interaction with human operators. The interface processes user intents, encodes them using a semantic embedding model, and maps them to one of seven predefined intent categories to trigger analytics queries or event subscription commands. This architecture abstracts the complexity of traditional interfaces, allowing non-expert users to manage network analytics and subscriptions with ease. The system supports Access and Mobility Management Function (AMF) and Session Management Function (SMF) event subscriptions, real-time monitoring, and analytics retrieval via Prometheus, all accessible through a conversational interface. By bridging AI-driven intent recognition with standardized network analytics, our implementation enhances operator usability and provides a foundation towards AI-native 6G networks. The source code and datasets generated during the current study are available in the GitHub repository, https://github.com/HenokDanielbfg/testbed.

Resource-Aware LLM Reasoning for Mobile Edge General Intelligence

Authors: Mingyi Luo, Ruichen Zhang, Xiangwang Hou, Jun Du, Chunxiao Jiang, Yong Ren, Shiwen Mao

First: 2025-09-27T10:53:48+00:00 · Latest: 2026-06-10T07:43:17+00:00

Abstract

The rapid advancement of large language models (LLMs) has enabled the emergence of agentic artificial intelligence (AI) with powerful reasoning and autonomous decision-making capabilities. Integrating agentic AI with edge computing has led to the development of Mobile Edge General Intelligence (MEGI), which brings real-time, privacy-preserving reasoning to the network edge. However, deploying LLM-based agentic AI reasoning in MEGI environments poses significant challenges due to the high computational demands of reasoning and the limited resources of edge devices. To address these challenges, we propose a joint optimization framework for efficient LLM reasoning deployment in MEGI. First, we systematically review enhancement methods to identify mechanisms suitable for edge adaptation. Subsequently, we present a distributed framework that synergizes reasoning enhancement via adaptive chain-of-thought (CoT) prompting with scalable deployment through a distributed mixture-of-experts (MoE) architecture. An important innovation of this approach involves modeling reasoning depth as a dynamic network resource variable, which is optimized jointly with expert activation and transmission power. This mechanism allows the system to dynamically regulate expert networks and reasoning complexity according to task requirements and device capabilities. Experimental evaluations in mobile edge environments demonstrate that the proposed framework effectively balances reasoning quality and resource efficiency. The results show that with less than one second of additional inference time, both accuracy and latency satisfaction rate can reach 90%, validating the practical viability of deploying sophisticated LLM reasoning in resource-constrained MEGI systems.

Tiara: A Programmable Line-Rate ISA for Remote Memory Access

Authors: Bojie Li

First: 2026-06-10T05:01:17+00:00 · Latest: 2026-06-10T05:01:17+00:00

Abstract

RDMA one-sided verbs are the natural primitive for memory disaggregation, but they require the client to supply the exact remote address. The 1-RTT performance breaks down when the target address depends on data that must first be read from remote memory, a pattern we call the Indirection Wall. Indirection is pervasive: graph traversals follow pointers hop by hop, address translation walks multi-level page tables, distributed coordination requires conditional multi-host logic, and disaggregated LLM inference must resolve paged KV caches through block-table lookups. Each level of indirection costs one sequentially dependent network round-trip, yet offloading to existing RDMA NICs either consumes remote CPU cycles or has limited throughput. We present Tiara, a compact, statically verifiable instruction set that executes on the memory-side NIC. Tiara operators are pre-registered programs, analogous to eBPF programs in the kernel, that resolve indirection locally, collapsing multi-RTT dependent chains into a single round-trip. On an FPGA-based prototype, Tiara reduces 10-hop graph-traversal latency by 2.85x over one-sided RDMA while sustaining 3.4x higher throughput, cuts page-table walk latency by 62%, reduces uncontended distributed-lock latency by 2.9x, achieves 2.8x throughput for disaggregated PagedAttention at 8 KB blocks, and 1.88x MoE expert-gather latency at 32 experts.
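
The round-trip cost of dependent reads can be made concrete with a toy model. The sketch below is not Tiara's instruction set or API: `RemoteMemory`, `read`, and `chase` are hypothetical names, remote memory is modeled as a dictionary of pointers, and the only thing counted is network round trips.

```python
class RemoteMemory:
    """Toy remote memory in which every network operation costs one round trip."""

    def __init__(self, cells: dict):
        self.cells = cells
        self.round_trips = 0

    def read(self, addr: int) -> int:
        """One-sided read: the client must already know the address."""
        self.round_trips += 1
        return self.cells[addr]

    def chase(self, addr: int, hops: int) -> int:
        """Hypothetical memory-side operator: follows pointers locally."""
        self.round_trips += 1  # one request and one reply, regardless of hops
        for _ in range(hops):
            addr = self.cells[addr]
        return addr


def client_side_traversal(mem: RemoteMemory, addr: int, hops: int) -> int:
    """Each hop depends on the previous read, so hops are serialized round trips."""
    for _ in range(hops):
        addr = mem.read(addr)
    return addr


# A four-hop pointer chain: 0 -> 3 -> 7 -> 1 -> 9 (made-up addresses).
CHAIN = {0: 3, 3: 7, 7: 1, 1: 9}

if __name__ == "__main__":
    one_sided = RemoteMemory(CHAIN)
    print(client_side_traversal(one_sided, 0, 4), one_sided.round_trips)
    offloaded = RemoteMemory(CHAIN)
    print(offloaded.chase(0, 4), offloaded.round_trips)
```

Both paths return the same final address, but the client-side traversal pays one round trip per hop while the memory-side operator pays one in total.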

Generative Explainability for Next-Generation Networks: LLM-Augmented XAI with Mutual Feature Interactions

Authors: Kiarash Rezaei, Omran Ayoub, Sebastian Troia, Francesco Lelli, Paolo Monti, Carlos Natalino

Venue: Proc. WiMob, Marrakesh, Morocco, 2025

First: 2026-06-09T14:48:26+00:00 · Latest: 2026-06-09T14:48:26+00:00

Comments: 7 pages, with one page for appendix. Accepted for publication at the 2025 21st International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob)

Abstract

As artificial intelligence and machine learning (AI/ML) models become integral to network operations, their lack of transparency poses a significant barrier to operator trust. Existing explainable artificial intelligence (XAI) techniques often fail to bridge this gap for non-specialists, producing technical outputs that are difficult to translate into actionable insights. This paper presents a framework specifically designed to address this shortcoming. It leverages a moderately sized large language model (LLM) and extends beyond the standard use of SHapley Additive exPlanations (SHAP) feature influence values. The framework employs a structured prompt enriched with mutual feature interaction data to generate human-understandable natural language explanations. To validate our framework, we performed an empirical evaluation on an optical quality of transmission (QoT) estimation use case with human evaluators. We collected independent performance evaluations from specialists, which showed a high inter-evaluator agreement. Compared to a state-of-the-art baseline that uses only SHAP feature influence values in a straightforward prompt, our approach improves the explanation usefulness and scope by 12.2% and 6.2%, while achieving 97.5% correctness.

TraGe: A Generic Packet Representation for Traffic Classification Based on Header-Payload Differences

Authors: Chungang Lin, Yilong Jiang, Weiyao Zhang, Xuying Meng, Tianyu Zuo, Yujun Zhang

First: 2025-06-17T03:27:44+00:00 · Latest: 2026-06-09T12:46:23+00:00

Comments: This paper has been accepted by IWQoS 2025. The code is available at https://github.com/lincgcg/TraGe

Abstract

Traffic classification has a significant impact on maintaining the Quality of Service (QoS) of the network. Since traditional methods heavily rely on feature extraction and large-scale labeled data, some recent pre-trained models manage to reduce the dependency by utilizing different pre-training tasks to train generic representations for network packets. However, existing pre-trained models typically adopt pre-training tasks developed for image or text data, which are not tailored to traffic data. As a result, the obtained traffic representations fail to fully reflect the information contained in the traffic, and may even disrupt the protocol information. To address this, we propose TraGe, a novel generic packet representation model for traffic classification. Based on the differences between the header and the payload, the two fundamental components of a network packet, we perform differentiated pre-training according to the byte sequence variations (continuous in the header vs. discontinuous in the payload). A dynamic masking strategy is further introduced to prevent overfitting to fixed byte positions. Once the generic packet representation is obtained, TraGe can be fine-tuned for diverse traffic classification tasks using limited labeled data. Experimental results demonstrate that TraGe significantly outperforms state-of-the-art methods on two traffic classification tasks, with up to a 6.97% performance improvement. Moreover, TraGe exhibits superior robustness under parameter fluctuations and variations in sampling configurations.

High-Speed Generation of Periodic Traffic Patterns on P4TG for DDoS and Burst-Load Evaluation

Authors: Fabian Ihle, Etienne Zink, Michael Menth

First: 2026-06-09T11:11:59+00:00 · Latest: 2026-06-09T11:11:59+00:00

Comments: Accepted for publication at 12th IEEE International Conference on Network Softwarization (NetSoft 2026)

Abstract

Traffic generators are essential tools for evaluating the robustness and performance of networked systems. P4TG is an open-source, hardware-accelerated traffic generator implemented in P4 for the Intel Tofino ASIC. It has been adopted by researchers and industry due to its flexibility, its multi-terabit generation capability, and its low cost compared to other traffic generators. However, like most existing generators, it primarily produces constant bit rate traffic, which does not reflect the highly time-varying behavior observed in real networks, such as flash crowds and microbursts. Such patterns are difficult to emulate at scale with current tools. We present a data plane mechanism for P4TG that shapes periodic, time-varying traffic patterns, including patterns representative of DDoS attacks and burst-load scenarios. Pattern shaping in P4TG can be applied to its generated traffic at an aggregate throughput of up to 4 Tbit/s. We evaluate pattern accuracy and analyze scalability across different sampling resolutions and periods. Further, we demonstrate practical use cases, including zero-loss throughput determination and buffer capacity measurement. Finally, we present microburst-based attack scenarios that overload UDP receivers, switch buffers, and degrade TCP throughput on shared links while remaining undetectable to conventional rate monitoring.

CAMASA: A CAM-based Dataset from the MASA Living Lab

Authors: Salvatore Iandolo, Marco Savarese, Gaetano Orazio Cauchi, Antonio Solida, Martin Klapez, Maurizio Casoni, Angelo Porrello, Carlo Augusto Grazia

First: 2026-06-09T09:45:51+00:00 · Latest: 2026-06-09T09:45:51+00:00

Comments: Accepted for publication at the IEEE 2026 Vehicular Technology Conference (VTC2026-Fall). Dataset will be available at netlab.unimore.it/MASA

Abstract

Trajectory prediction is a key enabler of autonomous and cooperative driving systems. However, most existing benchmarks are either sensor-centric, geographically constrained, or based on synthetic mobility traces that do not capture real-world V2X communication dynamics. This paper introduces CAMASA, a large-scale infrastructure-based dataset derived from Cooperative Awareness Messages (CAMs) and Decentralized Environmental Notification Messages (DENMs) collected within the Modena Automotive Smart Area (MASA). The dataset comprises more than 40 million CAMs and 2 million DENMs recorded under authentic urban traffic conditions over multiple months. We present a rigorous preprocessing pipeline that includes filtering, pseudonym reconciliation to account for ETSI privacy-driven stationID changes, and temporal normalization to 10 Hz trajectories, suitable for motion forecasting and time-series analysis. With over 14,000 km of reconstructed vehicle paths and tens of thousands of unique station IDs, CAMASA provides a statistically significant empirical foundation for research on Cooperative Intelligent Transportation Systems (C-ITS). Beyond trajectory prediction, the dataset enables calibration of microscopic urban traffic simulators (e.g., SUMO) and supports the development of realistic Intelligent Transportation Systems (ITS) Digital Twins by jointly modeling mobility patterns and V2X communication coverage in real deployments.
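
Temporal normalization to a fixed 10 Hz grid, as mentioned in the abstract, could look roughly like the sketch below. The abstract does not say which interpolation the authors use, so linear interpolation is an assumption here, as are the function name `resample_10hz` and the `(t_ms, x, y)` tuple layout. Pseudonym reconciliation and filtering are not covered.

```python
def resample_10hz(samples: list, step_ms: int = 100) -> list:
    """Linearly interpolate (t_ms, x, y) samples onto a regular time grid.

    Assumes integer millisecond timestamps that are strictly increasing.
    """
    if len(samples) < 2:
        return list(samples)
    out = []
    # First grid point at or after the first sample (integer ceiling division).
    t = -(-samples[0][0] // step_ms) * step_ms
    i = 0
    while t <= samples[-1][0]:
        # Advance to the segment [samples[i], samples[i + 1]] that contains t.
        while samples[i + 1][0] < t:
            i += 1
        (t0, x0, y0), (t1, x1, y1) = samples[i], samples[i + 1]
        f = (t - t0) / (t1 - t0)
        out.append((t, x0 + f * (x1 - x0), y0 + f * (y1 - y0)))
        t += step_ms
    return out


# Irregularly timed positions in metres (made-up values).
TRACK = [(0, 0.0, 0.0), (250, 2.5, 1.0), (400, 4.0, 1.0)]

if __name__ == "__main__":
    for point in resample_10hz(TRACK):
        print(point)
```

Integer millisecond timestamps keep the grid exact; with floating-point seconds the `t <= samples[-1][0]` comparison could drop or add a final point.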

From MWM to iSLIP: A Linear-Algebraic Tutorial on Input-Queued Switch Scheduling

Authors: Xiaotong Yuan, An Guo

First: 2026-06-09T06:43:01+00:00 · Latest: 2026-06-09T06:43:01+00:00

Abstract

This paper uses three objects -- the queue matrix Q, the matching matrix P, and the Lyapunov energy function V = ||Q||^2 -- as a shared mathematical language to explain, within a single framework, the scheduling objective of maximum weight matching (MWM), queue stability under admissible traffic (per-port loads strictly below 1), and the mechanics of iSLIP's Grant-Accept row-column decoupling together with the long-run average service matrix P-bar. The setting throughout is an N-by-N SoC crossbar, where each clock cycle permits at most one cell transfer per input-output port pair. For the experimental comparison, we built a C++ discrete-event simulator and used exact MWM (solved by the Hungarian algorithm) as the performance reference. All three approximate algorithms are given a fixed iteration budget: r = 3 rounds per cycle for iSLIP and for spectral scheduling, and r_sink = 10 Sinkhorn normalization rounds for entropy-regularized optimal transport (OT). Throughput and average cell delay are measured across four traffic patterns. Spectral scheduling and entropy-regularized OT track MWM closely in both throughput and delay across most tested conditions. iSLIP, by contrast, hits a throughput ceiling of roughly 80% under non-uniform admissible traffic at high load (unbalanced pattern w = 0.5, rho_load >= 0.9), with bottleneck queues growing without bound and delays reaching two orders of magnitude above MWM. Under uniform traffic this breakdown does not occur: at rho_load = 0.99 iSLIP delay is about 3.7x that of MWM. The performance gains of spectral scheduling and OT come at an additional per-cycle compute cost on the order of O(r*N^2) multiply-accumulate or exponential operations; whether this overhead is feasible in real hardware -- in terms of die area, power, and timing closure -- remains to be evaluated.
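
The three objects named in the abstract can be illustrated on a 3-by-3 example. The sketch below is not the authors' simulator: it finds the maximum weight matching by brute force over permutations instead of the Hungarian algorithm (fine only for tiny N), and it assumes the entropy-regularized OT step uses the kernel exp(Q/eps) with alternating row and column normalization. The value eps = 2.0 and the queue matrix are made up.

```python
import math
from itertools import permutations


def lyapunov(q: list) -> float:
    """V = ||Q||^2, the sum of squared queue lengths."""
    return sum(v * v for row in q for v in row)


def mwm(q: list) -> tuple:
    """Maximum weight matching by brute force; p[i] is the output served by input i."""
    n = len(q)
    return max(permutations(range(n)),
               key=lambda p: sum(q[i][p[i]] for i in range(n)))


def sinkhorn(q: list, eps: float = 2.0, rounds: int = 10) -> list:
    """Alternately normalize rows and columns of exp(Q/eps).

    The result approaches a doubly stochastic matrix that favors long queues.
    """
    n = len(q)
    k = [[math.exp(v / eps) for v in row] for row in q]
    for _ in range(rounds):
        k = [[v / sum(row) for v in row] for row in k]                # rows sum to 1
        col = [sum(k[i][j] for i in range(n)) for j in range(n)]
        k = [[k[i][j] / col[j] for j in range(n)] for i in range(n)]  # columns sum to 1
    return k


# Made-up queue matrix: Q[i][j] cells wait at input i for output j.
Q = [[3, 0, 1],
     [0, 2, 0],
     [1, 0, 4]]

if __name__ == "__main__":
    print("V =", lyapunov(Q))
    print("MWM =", mwm(Q))
    for row in sinkhorn(Q):
        print([round(v, 3) for v in row])
```

For this matrix the heaviest matching is the diagonal, and the Sinkhorn output concentrates most of its mass on the same entries, which is the sense in which the relaxed schedule tracks MWM.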

Secrets Best Not Shared: DNS Privacy Enhancements for the Constrained IoT

Authors: Martine S. Lenders, Thomas C. Schmidt, Matthias Wählisch

First: 2026-06-08T19:22:30+00:00 · Latest: 2026-06-08T19:22:30+00:00

Comments: 20 pages, 20 figures, 2 tables

Abs · PDF · Code1 · Code2

Abstract

Attackers often identify DNS traffic to disrupt or compromise Internet services. While prior work has focused on encrypting queries using DNS over TLS, HTTPS, or QUIC to counter such attacks, we consider IETF protocols designed for resource-constrained IoT devices and empirically analyze the potential of obfuscating DNS traffic in addition to encryption. We create a dataset of machine-to-machine-compatible data objects along with the corresponding DNS resolution processes, evaluating 296 deployment scenarios of resolving host names, including DNS over the Constrained Application Protocol (CoAP) and an onion routing flavor of CoAP under varying link-layer conditions. We compare them to DNS over HTTPS. Using Random Forest and a header field analysis, we identify the fields that leak the most information. Our findings show that DNS over CoAP with equalized packet lengths, block-wise transfer, and header compression reduces the accuracy of identifying DNS frames to 86% and further to 77% with payload compression. Our approach outperforms DNS over HTTPS, where classifiers always identify DNS frames based on IP addresses. The dataset is publicly available.

Autonomous Incident Resolution at Hyperscale: An Agentic AI Architecture for Network Operations

Authors: Arun Malik

First: 2026-06-08T07:15:53+00:00 · Latest: 2026-06-08T07:15:53+00:00

Comments: 7 pages, 6 figures

Abs · PDF · Code1 · Code2

Abstract

Cloud network infrastructure at hyperscale presents unique operational challenges where traditional human-driven incident response cannot keep pace with the volume, velocity, and complexity of failures. This paper presents an agentic AI architecture for autonomous incident resolution in large-scale network operations. Our system employs a multi-agent orchestration framework where specialized AI agents collaborate to detect, diagnose, and remediate network incidents without human intervention. We describe the architectural principles, including hierarchical agent decomposition, skills-based tool invocation via standardized protocols, structured knowledge encoding from operational runbooks, progressive autonomy with safety boundaries, and closed-loop verification. The architecture has been deployed in production at a major cloud provider, demonstrating that agentic AI systems can achieve autonomous resolution rates exceeding 90% for common incident categories while maintaining safety guarantees through layered authorization and rollback mechanisms. We discuss design tradeoffs, failure modes, and lessons learned from operating autonomous AI agents at scale.

Multi-SPIN: Multi-Access Speculative Inference for Cooperative Token Generation at the Edge

Authors: Haotian Zheng, Zhanwei Wang, Mingyao Cui, Chang Cai, Hongyang Du, Kaibin Huang

First: 2026-06-03T08:16:14+00:00 · Latest: 2026-06-07T06:19:01+00:00

Abs · PDF · Code1 · Code2

Abstract

Speculative inference (SPIN) was originally developed as an efficient architecture to accelerate Large Language Models (LLMs). In this work, we propose its distributed deployment to enable cooperative token generation in a multiuser edge system; its advantage is to effectively balance computational loads between resource-constrained devices and servers. The resulting architecture, termed Multi-access SPIN (Multi-SPIN), utilizes on-device small language models to generate and upload candidate token drafts, while an edge server operates the LLM to verify them in parallel batches. Given the severe heterogeneity in users' computation and communication capabilities, the draft length emerges as a critical control variable that influences node-level computation loads and multi-access latency, thereby governing the sum token goodput. Consequently, considering frequency-division multiple access, we investigate the problem of multi-access draft control, a joint optimization of draft-length control and bandwidth allocation to maximize sum token goodput. We examine two cases: (1) homogeneous draft lengths across users to facilitate server-side batching, and (2) heterogeneous draft lengths to introduce a new dimension for goodput enhancement. By developing decomposition methods, we reduce these complex optimizations into tractable sub-problems, which allow efficient draft control algorithms to be derived in closed form. Our analysis shows that the optimal bandwidth allocation compensates users with weaker computation-and-communication capabilities in the homogeneous case due to the batching synchronization requirements, whereas its heterogeneous-case counterpart rewards users with higher acceptance rates by relaxing such requirements. Experiments using Llama-2 and Qwen3.5 model pairs across diverse tasks demonstrate that Multi-SPIN improves goodput by up to 88% over heterogeneity-agnostic baselines.
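To make the role of the draft length concrete, here is a minimal single-user sketch. It assumes each draft token is accepted independently with probability alpha (the usual simplification in speculative decoding analyses) and a simple additive latency model; both are assumptions for illustration, not the paper's formulation, and bandwidth allocation is left out entirely. All function and parameter names are hypothetical.

```python
def expected_tokens(alpha, draft_len):
    """Expected tokens produced per verification round.

    Assumption: i.i.d. acceptance with probability alpha, and the verifier
    always contributes one token of its own after the accepted prefix.
    """
    if alpha >= 1.0:
        return draft_len + 1
    return (1 - alpha ** (draft_len + 1)) / (1 - alpha)

def goodput(alpha, draft_len, t_draft, t_upload, t_verify):
    """Tokens per second under an additive per-round latency model (assumed)."""
    latency = draft_len * (t_draft + t_upload) + t_verify
    return expected_tokens(alpha, draft_len) / latency

def best_draft_length(alpha, t_draft, t_upload, t_verify, max_len=16):
    """Grid search over draft lengths 1..max_len."""
    return max(range(1, max_len + 1),
               key=lambda n: goodput(alpha, n, t_draft, t_upload, t_verify))

# Users with higher acceptance rates benefit from longer drafts.
short = best_draft_length(0.3, t_draft=0.01, t_upload=0.0, t_verify=0.05)
long_ = best_draft_length(0.9, t_draft=0.01, t_upload=0.0, t_verify=0.05)
```

Under these toy numbers the low-acceptance user is best served by the shortest draft, while the high-acceptance user prefers a longer one, which is the kind of heterogeneity the abstract refers to.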

AI-Native Closed-Loop Security for 6G-Enabled Cyber-Physical Systems: From Edge Detection to Network-Wide Mitigation

Authors: Bilal Hussain, Muhammad Bilal, Tan Li, Haris Pervaiz, Xiao Tang, Qinghe Du, Fawad Ahmad, Muhammad Azhar, Jun Zhang

First: 2026-06-06T13:36:59+00:00 · Latest: 2026-06-06T13:36:59+00:00

Comments: 30 pages, 12 figures, survey paper, submitted to IEEE Communications Surveys & Tutorials (IEEE COMST)

Abs · PDF · Code1 · Code2

Abstract

In sixth-generation (6G) networks, billions of cyber-physical systems (CPSs) - autonomous vehicles, smart grids, industrial robots, and remote-surgical equipment - will run over ultra-reliable low-latency slices, collapsing the gap between a remote breach and physical harm to milliseconds, a budget perimeter firewalls and centralised security operations centres cannot meet. This survey reframes 6G CPS security as a closed-loop, AI-native pipeline that senses at the multi-access edge computing (MEC) tier, using minute-scale call-detail records (CDRs) for baseline learning and sub-millisecond RAN/Open-RAN (O-RAN) telemetry for the latency-critical path. It decides locally with compressed deep models, mitigates network-wide via SDN, NFV, and O-RAN controllers, and retrains through federated learning (FL) and digital-twin (DT) replay. We formalise a per-slice, tail-bounded latency contract on the sense, detect, and mitigate stages, enforced at a slice-dependent tail percentile (p99 for safety-critical URLLC slices). Organising 128 peer-reviewed studies (2017-2026) under a PRISMA 2020 protocol, we (i) map the 6G/CPS threat surface to MITRE ATT&CK and a CDR-observable feature space; (ii) unify edge anomaly detection and DDoS classification across twelve datasets and statistical, graph, and transformer models; (iii) synthesise SDN/NFV/O-RAN primitives into one closed-loop reference architecture; (iv) treat FL, large language models (LLMs), DT, post-quantum cryptography (PQC), zero-trust architecture (ZTA), and explainable AI as cross-cutting enablers, not parallel pillars; and (v) consolidate open problems into five directions spanning data, latency, trust, standardisation, and evaluation.

Agentic World Modeling for 6G: Near-Real-Time Generative State-Space Reasoning

Authors: Farhad Rezazadeh, Amir Ashtari Gargari, Hatim Chergui, Sandra Lagen, Merouane Debbah, Houbing Song, Lingjia Liu

First: 2025-11-04T17:22:22+00:00 · Latest: 2026-06-05T08:12:49+00:00

Comments: 13 Pages, 3 Figures, 4 Tables

Abs · PDF · Code1 · Code2

Abstract

We argue that sixth-generation (6G) intelligence is not fluent token prediction but the capacity to imagine and choose -- to simulate future scenarios, weigh trade-offs, and act with calibrated uncertainty. We reframe open radio access network (O-RAN) near-real-time (Near-RT) control via counterfactual dynamics and a world modeling (WM) paradigm that learns an action-conditioned generative state space. This enables quantitative "what-if" forecasting beyond large language models (LLMs) as the primary modeling primitive. Actions such as physical resource blocks (PRBs) are treated as first-class control inputs in a causal world model, and both aleatoric and epistemic uncertainty are modeled for prediction and what-if analysis. An agentic, model predictive control (MPC)-based cross-entropy method (CEM) planner operates over short horizons, using prior-mean rollouts within data-driven PRB bounds to maximize a deterministic reward. The model couples multi-scale structured state-space mixtures (MS3M) with a compact stochastic latent to form WM-MS3M, summarizing key performance indicators (KPIs) histories and predicting next-step KPIs under hypothetical PRB sequences. On realistic O-RAN traces, WM-MS3M cuts mean absolute error (MAE) by 1.69% versus MS3M with 32% fewer parameters and similar latency, and achieves 35-80% lower root mean squared error (RMSE) than attention/hybrid baselines with 2.3-4.1x faster inference, enabling rare-event simulation and offline policy screening.

Scalable Joint Resource Allocation for SLO-Constrained LLM Inference in Heterogeneous GPU Clouds

Authors: Jiaming Cheng, Duong Tung Nguyen

First: 2026-04-08T18:11:09+00:00 · Latest: 2026-06-05T07:23:00+00:00

Abs · PDF · Code1 · Code2

Abstract

Serving large language model (LLM) inference in cloud environments requires jointly optimizing model selection, GPU provisioning, parallelism configuration, and workload routing under latency, accuracy, memory, and budget constraints. While mixed-integer linear programming (MILP) can model this problem, its computational cost limits frequent re-optimization under demand variability. Existing heuristics often optimize individual components separately and may become infeasible when system-wide constraints are enforced. This paper presents a scalable framework for SLO-constrained LLM inference. We formulate the problem as an MILP with a two-phase delay model capturing both prefill and autoregressive decoding under tensor and pipeline parallelism. To solve it efficiently, we develop two constraint-aware heuristics: a Greedy Heuristic (GH) and an Adaptive Greedy Heuristic (AGH). AGH extends GH through multi-start construction, local search, and GPU consolidation. Both methods maintain feasibility through parallelism-aware filtering, cost-based ranking, and adaptive parallelism scaling. Experiments based on the Azure LLM Inference Trace show that GH generates feasible solutions within one second, while AGH achieves near-optimal performance within three seconds and scales to large instances where exact solvers fail to converge. Under out-of-sample stress with up to 1.5x delay and accuracy inflation, AGH degrades gracefully through provisioned headroom, yielding substantially lower cost and SLO violations than cost-minimal MILP solutions. Across synthetic and real Azure workloads, AGH maintains SLO compliance at significantly lower cost than exact MILP solutions. These results demonstrate that high-quality allocations provide substantial robustness to demand variability while enabling rapid adaptation to workload changes.

Natural Language Access Control (NLAC): From Help Desk Requests to Structured Policies

Authors: Jonas Wessner, Tobias Meuser, Janek Schoffit, Dennis Eisermann, Johannes Deger, Björn Scheuermann, Frank Kargl

First: 2026-06-04T21:22:08+00:00 · Latest: 2026-06-04T21:22:08+00:00

Abs · PDF · Code1 · Code2

Abstract

Configuring network access control policies in large, complex networks is error-prone and requires significant expert effort. LLMs offer a promising interface for expressing such policies in natural language, but their capability for translating user requests into access policies, and the system architectures best suited to leverage LLMs, remain underexplored. We present an architecture for natural-language access control (NLAC) that uses LLMs to translate user requests into access policies, and introduce NLACBench, a benchmark for evaluating LLM-based intent translation systems in large-scale networks. Our evaluation across multiple state-of-the-art models shows that top-performing LLMs achieve up to 96.9% accuracy in small-network settings, but performance degrades substantially (below 20% for some models) as network size increases. To address this limitation, we identify relevant network components via embedding similarity and construct compact subgraphs that are passed to the LLM. This approach enables scaling to larger networks with up to 98.7% accuracy, while simultaneously reducing inference time, hardware requirements, and operating costs to a constant resource budget. Finally, a case study indicates that top-performing models exhibit largely complementary error patterns, suggesting that intent translation accuracy may be further improved through multi-LLM architectures.
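The subgraph step described in this abstract can be sketched roughly as follows: rank network components by cosine similarity to the request embedding, keep the top k, and pass only the edges among them to the LLM. The embeddings, component names, and the choice of a fixed top-k cut are assumptions made for illustration; the abstract does not specify how the paper selects or sizes the subgraph.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def select_components(request_vec, component_vecs, k):
    """Names of the k components most similar to the request embedding."""
    ranked = sorted(component_vecs,
                    key=lambda name: cosine(request_vec, component_vecs[name]),
                    reverse=True)
    return ranked[:k]

def induced_subgraph(edges, keep):
    """Edges whose endpoints are both in the selected component set."""
    keep = set(keep)
    return [(a, b) for a, b in edges if a in keep and b in keep]

# Hypothetical toy embeddings and topology.
components = {
    "printer-vlan": [1.0, 0.0, 0.0],
    "hr-db": [0.0, 1.0, 0.0],
    "guest-wifi": [0.9, 0.1, 0.0],
}
edges = [("printer-vlan", "guest-wifi"), ("hr-db", "printer-vlan")]
selected = select_components([1.0, 0.0, 0.0], components, k=2)
subgraph = induced_subgraph(edges, selected)
```

Because k is fixed, the prompt size stays bounded no matter how large the full network is, which matches the constant resource budget the abstract mentions.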

DAST: A VLM-LLM Framework for Cross-Interface Anomaly Detection in O-RAN

Authors: Francesco Spinelli, Esteban Municio, Pau Baguer, Gines Garcia-Aviles, Xavier Costa-Perez

First: 2026-06-04T15:05:04+00:00 · Latest: 2026-06-04T15:05:04+00:00

Comments: 7 pages, 5 figures. This work has been submitted to the IEEE for possible publication

Abs · PDF · Code1 · Code2

Abstract

O-RAN enables a disaggregated baseband stack with programmable functions that communicate over standardized open interfaces. The same openness that enables multi-vendor composition also expands the attack surface across logically decoupled tiers that make up the compute continuum. Among these threats, Denial-of-Service and performance-degradation attacks, which account for the majority of catalogued O-RAN threats, are particularly difficult to detect. Traditional Time-Series Anomaly Detection (TSAD) methods fail in this new regime where labelled baselines are scarce, threats evolve faster than detectors can be retrained, and the high-dimensional multivariate telemetry overwhelms monolithic inference models. To address these challenges, we present DAST, a zero-shot multi-agent framework for cross-interface anomaly detection in O-RAN that chains a three-stage VLM → LLM → VLM pipeline. DAST converts multivariate KPI streams into visual representations, scores textual per-interface descriptions against O-RAN domain knowledge, and verifies suspects on high-resolution heatmaps to output the problematic interfaces, the anomalous time intervals, an indicative O-RAN WG11-aligned operational impact rating and the decision rationale. We evaluate DAST on real network traces collected from an O-RAN testbed under representative performance degradation scenarios, achieving 0.910 F1-Score and 0.843 Accuracy, outperforming state-of-the-art TSAD baselines.

Efficient Asynchronous Federated Evaluation with Strategy Similarity Awareness for Intent-Based Networking in Industrial Internet of Things

Authors: Shaowen Qin, Jianfeng Zeng, Haodong Guo, Xiaohuan Li, Jiawen Kang, Qian Chen

First: 2025-11-28T09:03:26+00:00 · Latest: 2026-06-04T13:26:29+00:00

Comments: 12 pages with 7 figures and 4 tables

Abs · PDF · Code1 · Code2

Abstract

Intent-Based Networking (IBN) offers a promising paradigm for intelligent and automated network control in Industrial Internet of Things (IIoT) environments by translating high-level user intents into executable network strategies. However, frequent strategy deployment and rollback are impractical due to tightly coupled workflows and high downtime costs, while node heterogeneity and privacy constraints further complicate centralized strategy evaluation. To address these challenges, we propose a Federated Evaluation Enhanced Intent-Based Networking framework (FEIBN), which leverages large language models (LLMs) to translate user intents into structured strategy tuples and employs federated learning to support distributed strategy evaluation. To improve training efficiency and reduce communication overhead, we design a Strategy Similarity Aware Federated Learning mechanism (SSAFL), which selects nodes relevant to the task based on strategy similarity and resource status, and triggers asynchronous model uploads only when local updates are significant. Experiments demonstrate that the proposed method improves model accuracy, accelerates convergence, and reduces communication cost compared with the baselines.
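The "upload only when local updates are significant" idea can be sketched as a simple threshold test. The abstract does not say how significance is measured, so the L2 norm of the parameter change and the fixed threshold below are assumptions, and the function names are hypothetical.

```python
import math

def update_norm(last_uploaded, local):
    """L2 norm of the change since the last uploaded model (assumed metric)."""
    return math.sqrt(sum((n - o) ** 2 for o, n in zip(last_uploaded, local)))

def should_upload(last_uploaded, local, threshold):
    """Trigger an asynchronous upload only if the local change is large enough."""
    return update_norm(last_uploaded, local) > threshold

big_change = should_upload([0.0, 0.0], [3.0, 4.0], threshold=4.0)
small_change = should_upload([0.0, 0.0], [0.1, 0.1], threshold=4.0)
```

A node whose parameters have barely moved stays silent, which is where the communication savings would come from in such a scheme.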

Dual-Mode Wireless Devices for Adaptive Pull and Push-Based Communication

Authors: Sara Cavallero, Fabio Saggese, Junya Shiraishi, Israel Leyva-Mayorga, Shashi Raj Pandey, Chiara Buratti, Petar Popovski

First: 2025-07-31T10:52:35+00:00 · Latest: 2026-06-03T12:31:01+00:00

Comments: Submitted to IEEE Transactions on Communications, Copyright might be transferred without notice

Abs · PDF · Code1 · Code2

Abstract

This paper introduces a dual-mode communication framework for wireless devices that integrates query-driven (pull) and event-driven (push) transmissions within a unified time-frame structure. Devices typically respond to information requests in pull mode, but if an anomaly is detected, they preempt the regular response to report the critical condition. Additionally, push-based communication is used to proactively send critical data without waiting for a request. This adaptive approach ensures timely, context-aware, and efficient data delivery across different network conditions. To achieve high energy efficiency, we incorporate a wake-up radio mechanism and we design a tailored medium access control (MAC) protocol that supports data traffic belonging to the different communication classes. A comprehensive system-level analysis is conducted, accounting for the wake-up control operation and evaluating three key performance metrics: the success probability of anomaly reports (push traffic), the success probability of query responses (pull traffic) and the total energy consumption. Numerical results characterize the system's behavior and highlight the inherent trade-off between push and pull success probabilities as a function of allocated communication resources. Our analysis demonstrates that the proposed approach achieves up to a 42% reduction in energy consumption per served packet compared to traditional approaches, while maintaining reliable support for both communication paradigms.

Treat Traffic Like Trees: A Semantic-Preserving Hierarchical Graph-Based Expert Framework for Encrypted Traffic Analysis

Authors: Yuantu Luo, Jun Tao, Linxiao Yu, Guang Cheng

First: 2026-06-03T06:52:29+00:00 · Latest: 2026-06-03T06:52:29+00:00

Comments: This work has been submitted to the IEEE for possible publication

Abs · PDF · Code1 · Code2

Abstract

Graph-based deep learning methods have been widely employed in encrypted traffic analysis to exploit latent correlations across different granularities. However, while complex preprocessing pipelines and sophisticated model structures often achieve strong performance, they may obscure inherent protocol semantics during representation learning. Moreover, the hierarchical structure of protocol layers and their corresponding fields, defined by protocol specifications and routinely utilized in manual traffic analysis, remains underexplored in existing learning frameworks. In this paper, we propose Protocol Tree Graph Attention with Mixture of Experts (PTGAMoE), a semantic-preserving hierarchical graph-based expert framework for encrypted traffic analysis. The field-based graph construction and expert committee design enable PTGAMoE to quantify the model's preferences for specific fields and protocols. Extensive experimental results on representative benchmark datasets under strict no-data-leakage settings demonstrate that PTGAMoE significantly outperforms state-of-the-art (SOTA) models. Furthermore, the semantic-preserving design provides interpretable insights into protocol-level feature importance and expert-level contributions, reflecting the model's decision-making logic in encrypted traffic classification tasks.

vLLM Semantic Router: Signal Driven Decision Routing for Mixture-of-Modality Models

Authors: Xunzhuo Liu, Huamin Chen, Samzong Lu, Yossi Ovadia, Guohong Wen, Hao Wu, Zhengda Tan, Jintao Zhang, Senan Zedan, Yehudit Kerido, Liav Weiss, Haichen Zhang, Bishen Yu, Asaad Balum, Noa Limoy, Abdallah Samara, Baofa Fan, Brent Salisbury, Ryan Cook, Zhijie Wang, Qiping Pan, Rehan Khan, Avishek Goswami, Houston H. Zhang, Shuyi Wang, Ziang Tang, Fang Han, Zohaib Hassan, Jianqiao Zheng, Avinash Changrani, Xue Liu, Bowei He

First: 2026-02-23T15:00:01+00:00 · Latest: 2026-06-03T06:35:07+00:00

Comments: Technical Report

Abs · PDF · Code1 · Code2

Abstract

As large language models (LLMs) diversify across modalities, capabilities, and cost profiles, intelligent request routing, that is, selecting the right model for each query at inference time, has become a critical systems challenge. We present vLLM Semantic Router, a signal-driven decision routing framework for Mixture-of-Modality (MoM) model deployments. The architecture follows two complementary Shannon-inspired views. In the information-theoretic regime, signal extraction reduces the entropy of "which model?" by distilling routing-relevant information from raw queries. In the Boolean-algebraic regime, the decision engine composes functionally complete routing policies from signal conditions. The central innovation is composable signal orchestration: thirteen heterogeneous signal types, spanning sub-millisecond heuristics and neural classifiers for semantics, safety, and modality, are composed through configurable Boolean decision rules into deployment-specific routing policies, so that fundamentally different scenarios (multi-cloud enterprise, privacy-regulated, cost-optimized) are expressed as different configurations over the same architecture. Matched decisions drive semantic model routing via thirteen selection algorithms, while per-decision plugin chains enforce safety constraints including a three-stage HaluGate hallucination detection pipeline and a lightweight episodic memory system with ReflectionGate for personalized multi-turn context. A typed neural-symbolic DSL specifies these routing policies and compiles them to multiple deployment targets, enabling configuration-first adaptation without code changes. Together, these components show that composable signal orchestration enables a single framework to serve diverse deployment scenarios with differentiated cost, privacy, and safety policies.
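The Boolean composition of signals into routing policies might look roughly like the sketch below. This is not the project's DSL or API: the combinators, signal names, and model names are all hypothetical and exist only to show how AND, OR, and NOT over extracted signals can express different deployment policies as data.

```python
def signal(name):
    """Rule that is true when the named signal is present and truthy."""
    return lambda signals: bool(signals.get(name, False))

def all_of(*rules):
    """Logical AND over rules."""
    return lambda signals: all(r(signals) for r in rules)

def any_of(*rules):
    """Logical OR over rules."""
    return lambda signals: any(r(signals) for r in rules)

def negate(rule):
    """Logical NOT of a rule."""
    return lambda signals: not rule(signals)

# Hypothetical privacy-regulated policy: first matching rule wins.
POLICY = [
    (all_of(signal("contains_pii"), negate(signal("on_prem_available"))), "reject"),
    (signal("contains_pii"), "on-prem-model"),
    (any_of(signal("is_code"), signal("is_math")), "reasoning-model"),
]

def route(signals, policy=POLICY, default="small-model"):
    """Return the target of the first rule that matches the extracted signals."""
    for rule, target in policy:
        if rule(signals):
            return target
    return default
```

Swapping in a different `POLICY` list changes the routing behavior without touching `route`, which is the configuration-first idea in miniature.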

Toward Autonomous O-RAN: A Multi-Scale Agentic AI Framework for Real-Time Network Control and Management

Authors: Hojjat Navidan, Mohammad Cheraghinia, Jaron Fontaine, Mohamed Seif, Eli De Poorter, H. Vincent Poor, Ingrid Moerman, Adnan Shahid

First: 2026-02-15T12:34:01+00:00 · Latest: 2026-06-02T22:00:53+00:00

Comments: Submitted to the IEEE Networks Journal

Abs · PDF · Code1 · Code2

Abstract

Open Radio Access Networks (O-RAN) promise flexible 6G network access through disaggregated, software-driven components and open interfaces, but this programmability also increases operational complexity. Multiple control loops coexist across the service management layer and RAN Intelligent Controller (RIC), while independently developed control applications can interact in unintended ways. In parallel, recent advances in generative Artificial Intelligence (AI) are enabling a shift from isolated AI models toward agentic AI systems that can interpret goals, coordinate multiple models and control functions, and adapt their behavior over time. This article proposes a multi-scale agentic AI framework for O-RAN that organizes RAN intelligence as a coordinated hierarchy across the Non-Real-Time (Non-RT), Near-Real-Time (Near-RT), and Real-Time (RT) control loops: (i) a Large Language Model (LLM) agent in the Non-RT RIC translates operator intent into policies and governs model lifecycles; (ii) Small Language Model (SLM) agents in the Near-RT RIC execute low-latency optimization and can activate, tune, or disable existing control applications; and (iii) Wireless Physical-layer Foundation Model (WPFM) agents near the distributed unit provide fast inference close to the air interface. We describe how these agents cooperate through standardized O-RAN interfaces and telemetry. Using a proof-of-concept implementation built on open-source models, software, and datasets, we demonstrate the proposed agentic approach in two representative scenarios: robust operation under non-stationary conditions and intent-driven slice resource control.

Inductive Latent Context Persistence: Closing the Post-Handover Cold Start in 6G Radio Access Networks

Authors: Anubhab Banerjee, Daniyal Amir Awan

Venue: ICML 2026

First: 2026-05-01T12:00:06+00:00 · Latest: 2026-06-02T18:13:32+00:00

Abs · PDF · Code1 · Code2

Abstract

In modern radio access networks (RANs), rule-based handover (HO) decisions (e.g., A3/A5) depend on user equipment (UE) measurements only, so UEs at the same location can receive inconsistent HO outcomes. GNN-based methods improve HO KPIs using richer context than measurements alone. However, recurrent or graph models discard the per-UE recurrent state at HO and reinitialize at the target next-generation Node B (gNB), losing mobility history and forcing the target model to rebuild from post-HO measurements only. We address this post-HO cold start with Inductive Latent Context Persistence (ILCP), compressing the source recurrent state, transporting it on the 3GPP Xn as a 128-byte payload, and adapting it at the target gNB. We model the RAN as a dynamic heterogeneous graph over UE nodes, gNB nodes, measurement edges, and Xn edges. On a Vienna 4G/5G drive-test, ILCP achieves 0.0% ping-pong HOs versus 6.5% for an identical no-transfer baseline and 22.6% for a Transformer baseline; post-HO accuracy improves by +5.1 pp on average (peak +13.3 pp) in the 50-250 ms window. On one NVIDIA GTX 1080 (8 GB), ILCP runs end-to-end at 7.7 ms p99 per handover decision. Under perturbations (shadow fading, NLOS blockage, SSB-burst sparsity), robustly trained ILCP keeps handover failure (HOF) in the 10-13% range. Under the same fixed-reference-label setting, A3/A5 rises from 1.1% to 57-65% HOF when measurements are perturbed, exposing limits of measurement-only rules.

NetKV: Network-Aware Decode Instance Selection for Disaggregated LLM Inference

Authors: Mubarak Adetunji Ojewale

First: 2026-06-02T17:06:57+00:00 · Latest: 2026-06-02T17:06:57+00:00

Abs · PDF · Code1 · Code2

Abstract

Disaggregated LLM inference forces the KV cache to traverse the datacenter network before decoding begins, so transfer time enters directly into the Time to First Token (TTFT) budget. Current schedulers route on compute load and prefix-cache locality alone, ignoring the topological distance and dynamic congestion between prefill and decode instances. We close this gap with a thin operator-to-scheduler interface, the network cost oracle, and we prove that ignoring the network term renders cache-aware-only scheduling arbitrarily suboptimal as context length grows. NetKV, the O(|D|) per-request greedy that consumes this oracle, has tier rankings that are provably robust to stale telemetry. On a 64-GPU four-tier fat-tree simulator driven by Mooncake traces, NetKV reduces mean TTFT by up to 21.2% over round-robin and 17.6% over a tuned cache+load-aware scheduler, lifts SLO attainment by up to 20.1 percentage points, and keeps the Time Between Tokens overhead below 0.5 ms in every condition tested, with no changes to the transport, inference engine, or hardware.
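A per-request greedy of the kind described here can be sketched as a single linear scan over decode instances. The abstract does not give the cost function, so the additive estimate below (queueing delay plus KV transfer time from a network cost oracle) is an assumption, as are the field names, the toy oracle, and the bandwidth figures.

```python
def pick_decode_instance(kv_bytes, instances, network_cost):
    """O(|D|) scan: choose the instance with the lowest estimated TTFT.

    Assumption: TTFT is approximated as queueing delay plus KV transfer time.
    """
    def estimated_ttft(inst):
        return inst["queue_delay_s"] + network_cost(inst["name"], kv_bytes)
    return min(instances, key=estimated_ttft)["name"]

# Hypothetical oracle: transfer time from an assumed per-path bandwidth (bytes/s).
BANDWIDTH = {"same-rack": 50e9, "cross-pod": 5e9}

def oracle(name, kv_bytes):
    return kv_bytes / BANDWIDTH[name]

instances = [
    {"name": "same-rack", "queue_delay_s": 0.030},
    {"name": "cross-pod", "queue_delay_s": 0.010},
]

short_ctx = pick_decode_instance(1e6, instances, oracle)   # small KV cache
long_ctx = pick_decode_instance(1e9, instances, oracle)    # large KV cache
```

For the small cache the lightly loaded remote instance wins, but for the large cache the transfer term dominates and the nearby instance is preferred, which illustrates why a load-only scheduler degrades as context length grows.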

When BBR Meets Live Streaming

Authors: Xu Yan, Tong Li, Bo Wu, Cheng Luo, Jiuxiang Zhu, Laizhong Cui

First: 2026-06-02T10:46:17+00:00 · Latest: 2026-06-02T10:46:17+00:00

Abs · PDF · Code1 · Code2

Abstract

Recently, industrial pioneers like Amazon, Tencent, ByteDance, and Huawei have been adopting BBR as their congestion control algorithm for live-streaming applications, including TikTok Live. However, BBR, originally crafted for bulk data transmission, faces multiple challenges in live-streaming scenarios. In this paper, we first explore two key issues that arise from BBR's inaccurate bandwidth estimation in live-streaming scenarios: (i) BBR cannot easily exit its startup phase, resulting in severe self-inflicted loss; and (ii) BBR sends data at a lower rate than the available bandwidth during its stable phase. We then propose BBR-Copilot, an auxiliary congestion control component that cooperates with BBR so that it adapts better to live-streaming scenarios. BBR-Copilot proactively generates accurate bandwidth measurement samples by creating and sending extra data. We implement a BBR-Copilot prototype on top of QUIC and evaluate it on a testbed. Experimental results show that BBR-Copilot effectively enhances BBR's performance in live-streaming scenarios.


BigDipper: Sharded Censorship Resistant Data Availability for Leader-Based BFT

Authors: Bowen Xue, Samuel Laferriere, Soubhik Deb, Sreeram Kannan

First: 2023-07-03T22:41:27+00:00 · Latest: 2026-06-02T02:19:35+00:00

Abs · PDF · Code1 · Code2

Abstract

Leader-based Byzantine-fault-tolerant (BFT) protocols provide low latency and simple communication structure, but they give the leader short-term control over transaction inclusion. A malicious leader can keep the protocol live while delaying or excluding time-sensitive transactions such as auction bids, oracle updates, liquidations, and bridge messages. Existing responses often build a fixed censorship-resistance, hiding, or ordering mechanism into the protocol path, forcing all transactions to pay for the same protection level. BigDipper follows the end-to-end principle: the consensus layer exposes inclusion primitives rather than hardcoding stronger policies. Higher-layer protocols can then choose their own submission strategies and resources, whether through replication, erasure coding, or other mechanisms, to obtain the censorship-resistance, hiding, ordering, or execution guarantees they need. At the core of BigDipper is censorship-resistant data availability, or DA-CR, which certifies available replica-contributed mini-blocks for use by leader-based consensus. A central design goal is that data remains sharded on the consensus critical path: validators do not reconstruct or execute the full payload before voting, but instead check commitments, availability evidence, and the DA-CR inclusion rule. We define DA-CR guarantees for data-tampering resistance, honest mini-block inclusion, and residual leader influence. We then give concrete constructions based on erasure coding and linear commitments, analyze client-tunable transaction submission, and instantiate BigDipper inside HotStuff-2.

