Adaptive Bandwidth Allocation via Uncertainty-Constrained Deep Reinforcement Learning

Li Wei¹, Wu Yong¹ and Yan Dong¹

Suzhou Suneng Group Co., LTD
215004 Suzhou, China
{liwei sz,wuy11,yandong sz}@js.sgcc.com.cn

Abstract

With the rapid growth of network services, traditional static bandwidth allocation schemes can no longer meet the demands of multi-user, dynamic, and QoS-sensitive applications. Ensuring both efficiency and stability in bandwidth allocation remains a significant challenge, especially under conditions of high variability and uncertainty. To address this, we propose a novel algorithm named Uncertainty-Constrained Stability-aware Deep Reinforcement Learning (UCS-DRL) for dynamic bandwidth allocation. UCS-DRL adopts a dual-policy architecture: a task policy that learns optimal bandwidth allocation decisions, and a stability policy guided by uncertainty-aware value estimation that identifies and mitigates potentially risky or unstable behaviors during deployment. Furthermore, the framework incorporates a curiosity-driven exploration mechanism based on Random Network Distillation, which improves exploration efficiency by encouraging the agent to visit informative and under-explored states. Experimental results show that UCS-DRL achieves high bandwidth utilization and service quality while reducing policy volatility and risky actions, balancing performance and robustness in dynamic bandwidth allocation.
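
The abstract names Random Network Distillation as the exploration mechanism without giving details. As a rough illustration of the general idea only, and not of the authors' implementation, the sketch below computes a curiosity bonus as the error of a trained predictor against a fixed random target network. The state and feature sizes, the linear predictor, the learning rate, and all function names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, FEAT_DIM = 4, 8  # hypothetical sizes, not taken from the paper

# Fixed, randomly initialised target network (never trained).
W_target = rng.normal(size=(STATE_DIM, FEAT_DIM))
# Predictor (linear here for simplicity), trained to imitate the target.
W_pred = np.zeros((STATE_DIM, FEAT_DIM))


def target(s):
    """Features produced by the frozen random target network."""
    return np.tanh(s @ W_target)


def intrinsic_reward(s):
    """Squared prediction error: large for states the predictor has rarely seen."""
    return float(np.sum((s @ W_pred - target(s)) ** 2))


def update_predictor(s, lr=0.05):
    """One gradient step on the squared prediction error for state s."""
    global W_pred
    err = s @ W_pred - target(s)            # shape (FEAT_DIM,)
    W_pred -= lr * 2.0 * np.outer(s, err)   # gradient of sum(err**2) w.r.t. W_pred


visited = np.array([1.0, 0.5, -0.5, 0.2])   # a state seen many times
novel = np.array([-1.0, 2.0, 0.3, -0.7])    # a state never trained on

before = intrinsic_reward(visited)
for _ in range(200):
    update_predictor(visited)
after = intrinsic_reward(visited)
novel_bonus = intrinsic_reward(novel)
```

In this sketch the bonus for the repeatedly visited state shrinks toward zero as the predictor fits it, while the unseen state keeps a larger bonus, which is the property that steers an agent toward under-explored states.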

Key words

Dynamic Resource Allocation, Reinforcement Learning, Uncertainty Estimation, Stability-aware Control, Dual-policy Framework

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS250923008L

Publication information

Volume 23, Issue 1 (January 2026)
Year of Publication: 2026
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Wei, L., Yong, W., Dong, Y.: Adaptive Bandwidth Allocation via Uncertainty-Constrained Deep Reinforcement Learning. Computer Science and Information Systems, Vol. 23, No. 1, 277-297. (2026), https://doi.org/10.2298/CSIS250923008L