A hierarchical reinforcement learning method based on decision frequency and internal reward mechanism
Author Identifier (ORCID)
Jianxin Li: https://orcid.org/0000-0002-9059-330X
Abstract
Wargames require participants to reason and make decisions in real time as the battlefield environment changes, which poses severe challenges because the decision space is large and the battlefield situation shifts rapidly. In recent years, many reinforcement learning algorithms have been applied to wargames to simulate confrontational situations. However, existing methods remain limited and have not yet provided satisfactory solutions for wargames. To address this, we propose a hierarchical decision reinforcement learning algorithm based on decision frequency and an internal reward mechanism. The algorithm divides decisions into three layers, decomposing complex tasks to reduce decision complexity and compressing the action space to accelerate convergence. In addition, decision frequency and an internal reward mechanism are introduced to improve decision stability and guide learning. To verify the performance of the algorithm, we design experiments on wargame scenarios. The experimental results show that the hierarchical structure enables the agent to learn strategies effectively and accelerates convergence.
Keywords
Decision frequency, Internal reward mechanism, Reinforcement learning, Wargames
Document Type
Conference Proceeding
Date of Publication
1-1-2026
Volume
16115 LNCS
Publication Title
Lecture Notes in Computer Science
Publisher
Springer
School
School of Business and Law
Funders
China University of Geosciences (2023080)
Copyright
subscription content
First Page
130
Last Page
145
Comments
Wang, C., Li, M., Wang, Y., Zhang, X., Huang, X., Li, B., Li, J., & Chen, Y. (2026). A hierarchical reinforcement learning method based on decision frequency and internal reward mechanism. Lecture Notes in Computer Science, 16115, 130–145. https://doi.org/10.1007/978-981-95-5719-6_9