Author Identifier (ORCID)

Muhammad Ikram: https://orcid.org/0000-0001-6559-9067

Asma Aziz: https://orcid.org/0000-0003-3538-0536

Daryoush Habibi: https://orcid.org/0000-0002-7662-6830

Abstract

The increasing penetration of inverter-based resources (IBRs) in power grids necessitates novel control techniques to deliver advanced ancillary services such as fast frequency response (FFR), which is crucial for ensuring power system stability. Existing schemes (droop control, virtual synchronous generator (VSG), and hybrid approaches) require full observability and centralized control, which makes them unsuitable for utility-scale hybrid power plants (HPPs) with distributed grid-forming inverters (GFMIs). In addition, they lack adaptive control capabilities for responding to real-time disturbances in accordance with the IEEE 2800-2022 standard. To address these challenges, this paper proposes a novel multi-agent deep reinforcement learning (MADRL) framework for utility-scale, IBR-dominated HPP environments. The proposed framework employs a model-free actor-critic architecture within a centralized training and decentralized execution (CTDE) mechanism, incorporating non-linear dynamics through GRU and LSTM agents. Adaptive control of droop gains, virtual inertia, damping coefficients, and FFR-sensitive control parameters is demonstrated using four MADRL algorithms (MAA2C, IPPO, MAPPO, and the novel FreqMAPPO) to assess agent-level and system-level frequency stability. Extensive simulations in the modified IEEE 118-bus Grid2Op dynamic environment with 80% IBR penetration are conducted to observe the frequency response following grid disturbances. The proposed framework enhances frequency stability by optimizing GFMI control dynamics to minimize frequency deviation, reduce the rate of change of frequency (RoCoF), and enable optimal FFR contribution. In the evaluated disturbance suite, FreqMAPPO performs best on FFR metrics, achieving a 96.41% success rate compared with 89.62% for MAPPO, 78.23% for IPPO, and 65.66% for MAA2C. Specifically, FreqMAPPO achieves the lowest frequency deviation (0.041 Hz), the smallest RoCoF magnitude (−0.479 Hz/s), and the highest FFR capacity (3.41 MW), compared with MAPPO (0.078 Hz, −0.781 Hz/s, and 3.17 MW), IPPO (0.132 Hz, −0.876 Hz/s, and 2.89 MW), and MAA2C (0.164 Hz, −0.857 Hz/s, and 2.76 MW). A sensitivity analysis further validates the robustness, adaptability, and scalability of the proposed framework, showing consistent performance for 10% to 20% variations in FFR parameters and highlighting its resilience in real-time frequency stabilization and optimal power dispatch under dynamic disturbance scenarios.
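For readers unfamiliar with the two frequency metrics quoted above, the following is a minimal illustrative sketch of how a maximum frequency deviation and a worst-case RoCoF could be computed from a uniformly sampled frequency trace. It is not the authors' evaluation code: the function name `frequency_metrics`, the 50 Hz nominal frequency, the finite-difference RoCoF estimate, and the sample trace are all assumptions made for illustration.

```python
def frequency_metrics(freq_hz, dt_s, nominal_hz=50.0):
    """Return (max absolute deviation in Hz, largest-magnitude RoCoF in Hz/s).

    freq_hz    : list of frequency samples in Hz (hypothetical trace)
    dt_s       : sampling interval in seconds (assumed uniform)
    nominal_hz : nominal grid frequency; 50 Hz is an assumption here
    """
    if len(freq_hz) < 2:
        raise ValueError("need at least two samples to estimate RoCoF")
    if dt_s <= 0:
        raise ValueError("sampling interval must be positive")

    # Largest excursion from nominal, regardless of direction.
    max_dev = max(abs(f - nominal_hz) for f in freq_hz)

    # Finite-difference estimate of df/dt between consecutive samples.
    rocof = [(b - a) / dt_s for a, b in zip(freq_hz, freq_hz[1:])]

    # Keep the sign of the steepest change (negative for a frequency drop).
    worst_rocof = max(rocof, key=abs)
    return max_dev, worst_rocof


# Hypothetical post-disturbance trace sampled every 0.1 s.
trace = [50.0, 49.97, 49.96, 49.97, 49.98]
deviation, rocof = frequency_metrics(trace, dt_s=0.1)
print(f"max deviation: {deviation:.3f} Hz, worst RoCoF: {rocof:.3f} Hz/s")
```

Under this convention a frequency drop yields a negative RoCoF, which matches the sign of the values reported in the abstract; the measurement window and filtering used in the paper may differ.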

Keywords

fast frequency response, grid-forming inverters, hybrid power plants, inverter-based resources, multi-agent deep reinforcement learning, proximal policy optimization

Document Type

Journal Article

Date of Publication

6-1-2026

Volume

46

Publication Title

Sustainable Energy, Grids and Networks

Publisher

Elsevier

School

School of Engineering

RAS ID

94349

Funding Information

The first author would like to thank Edith Cowan University (ECU) for the award of an ECU Higher Degree by Research (ECU-HDR) scholarship to support this research project.

Creative Commons License

Creative Commons Attribution 4.0 License

Comments

Ikram, M., Aziz, A., & Habibi, D. (2026). A novel multi-agent deep reinforcement learning framework for fast frequency response in inverter-based hybrid power plants. Sustainable Energy, Grids and Networks, 46, 102282. https://doi.org/10.1016/j.segan.2026.102282

Link to publisher version (DOI)

10.1016/j.segan.2026.102282