Electric Power ›› 2025, Vol. 58 ›› Issue (10): 14-26.DOI: 10.11930/j.issn.1004-9649.202503014

• Key Technologies for the Coordinated Planning and Operation of Power Sources, Grids, Loads and Storage in the "15th Five-Year Plan" Period •

Energy Management Strategy for Microgrid Cluster Based on Improved Double Deep Q-Network

HE Jintao1,2, WANG Can1,2, WANG Mingchao1,2, CHENG Bentao1,2, LIU Yuzheng1,2, CHANG Wenhan1,2, WANG Rui3, YU Han4

    1. College of Electrical Engineering and New Energy, China Three Gorges University, Yichang 443002, China
    2. Hubei Provincial Engineering Technology Research Center for Microgrid (China Three Gorges University), Yichang 443002, China
    3. Wuhan Great Sea Hi-tech Co., Ltd., Wuhan 430223, China
    4. State Grid Hubei Central China Technology Development of Electric Power Co., Ltd., Wuhan 430077, China
  • Received: 2025-03-07   Online: 2025-10-23   Published: 2025-10-28
  • Supported by:
    This work is supported by the National Natural Science Foundation of China (No. 52107108).

Abstract:

To address the value overestimation bias and poor decision accuracy of conventional energy management methods for microgrid clusters, an energy management strategy based on an improved double deep Q-network is proposed. First, a dual-target value network framework is constructed based on clipped double Q-learning. The temporal difference (TD) targets of the two value networks are computed in parallel and the higher TD target is clipped, which suppresses value overestimation bias and improves decision-making precision. Second, a dynamic greedy strategy is adopted: the value function of every possible action is calculated from the current state, and persistent exploitation of the greedy action is avoided, which ensures sufficient exploration and prevents premature convergence of the agent. Finally, a case study of a microgrid cluster with three sub-microgrids is conducted for verification. The simulation results show that, compared with energy management strategies based on model predictive control and the conventional double deep Q-network, the proposed method achieves better optimization performance and convergence characteristics and reduces system operating costs by 44.62% and 26.39%, respectively.
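
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. All names (clipped_td_target, epsilon_at, select_action), the discount factor, and the linear decay schedule for the exploration rate are assumptions made for illustration; the abstract does not specify how the "dynamic" greedy strategy varies, nor how the action used in the TD targets is chosen (here it is selected by an online network, as in standard double DQN).

```python
import random


def clipped_td_target(reward, online_next_q, target1_next_q, target2_next_q,
                      gamma=0.95, done=False):
    """Clipped double-Q TD target (illustrative sketch).

    The next action is chosen by the online network (double DQN style),
    evaluated by two target networks, and the smaller of the two TD
    targets is kept, which clips the higher, possibly overestimated one.
    """
    if done:
        return reward
    # Greedy next action according to the online network's Q-values.
    a_star = max(range(len(online_next_q)), key=online_next_q.__getitem__)
    # One TD target per target network, computed independently.
    y1 = reward + gamma * target1_next_q[a_star]
    y2 = reward + gamma * target2_next_q[a_star]
    return min(y1, y2)


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=1000):
    """Exploration rate with an assumed linear decay over decay_steps."""
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)


def select_action(q_values, step, rng=random):
    """Dynamic epsilon-greedy choice over the Q-values of all actions."""
    if rng.random() < epsilon_at(step):
        # Explore: pick a uniformly random action index.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the largest estimated value.
    return max(range(len(q_values)), key=q_values.__getitem__)
```

Taking the minimum of the two targets biases the estimate downward rather than upward, which is the usual rationale for clipped double Q-learning. The decaying exploration rate keeps the agent from committing to the greedy action too early in training.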

Key words: microgrid cluster, energy management, improved double deep Q-network, clipped double Q values, greedy strategy
