Document Type

Journal Article

Publication Title

Energies

Volume

18

Issue

10

Publisher

MDPI

School

School of Engineering

RAS ID

82095

Funders

Edith Cowan University

Comments

Ikram, M., Habibi, D., & Aziz, A. (2025). Networked multi-agent deep reinforcement learning framework for the provision of ancillary services in hybrid power plants. Energies, 18(10), 2666. https://doi.org/10.3390/en18102666

Abstract

Inverter-based resources (IBRs) are becoming more prominent due to the increasing penetration of renewable energy sources that reduce power system inertia, compromising power system stability and grid support services. At present, optimal coordination among generation technologies remains a significant challenge for frequency control services. This paper presents a novel networked multi-agent deep reinforcement learning (N-MADRL) scheme for optimal dispatch and frequency control services. First, we develop a model-free environment consisting of a photovoltaic (PV) plant, a wind plant (WP), and an energy storage system (ESS) plant. The proposed framework uses a combination of multi-agent actor-critic (MAAC) and soft actor-critic (SAC) schemes for optimal dispatch of active power, mitigating frequency deviations, aiding reserve capacity management, and improving energy balancing. Second, frequency stability and optimal dispatch are formulated in the N-MADRL framework using the physical constraints under a dynamic simulation environment. Third, a decentralised coordinated control scheme is implemented in the hybrid power plant (HPP) environment using communication-resilient scenarios to address system vulnerabilities. Finally, the practicality of the N-MADRL approach is demonstrated in a Grid2Op dynamic simulation environment for optimal dispatch, energy reserve management, and frequency control. Results on the IEEE 14-bus network show that, compared to PPO and DDPG, N-MADRL achieves 42.10% and 61.40% higher efficiency for optimal dispatch, along with improvements of 68.30% and 74.48% in mitigating frequency deviations, respectively. The proposed approach outperforms existing methods under partially, fully, and randomly connected scenarios by effectively handling uncertainties, system intermittency, and communication resiliency.

DOI

10.3390/en18102666

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
