中国电力 (Electric Power) ›› 2025, Vol. 58 ›› Issue (10): 14-26. DOI: 10.11930/j.issn.1004-9649.202503014

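• Key Technologies for Coordinated Source-Grid-Load-Storage Planning and Operation of Power Systems in the 15th Five-Year Plan Period •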

Energy Management Strategy for Microgrid Cluster Based on Improved Double Deep Q-Network

HE Jintao1,2, WANG Can1,2, WANG Mingchao1,2, CHENG Bentao1,2, LIU Yuzheng1,2, CHANG Wenhan1,2, WANG Rui3, YU Han4

  1. College of Electrical Engineering and New Energy, China Three Gorges University, Yichang 443002, China
  2. Hubei Provincial Engineering Technology Research Center for Microgrid (China Three Gorges University), Yichang 443002, China
  3. Wuhan Great Sea Hi-tech Co., Ltd., Wuhan 430223, China
  4. State Grid Hubei Central China Technology Development of Electric Power Co., Ltd., Wuhan 430077, China
  • Received: 2025-03-07 Online: 2025-10-23 Published: 2025-10-28
  • About the authors:
    HE Jintao (2001), male, master's student, engaged in research on the optimal operation and control of microgrids. E-mail: hejintao1017@163.com
    WANG Can (1987), male, associate professor, engaged in research on the optimal operation of integrated energy systems and on the coordinated control and optimal operation of microgrids. E-mail: xfcancan@163.com
    WANG Mingchao (2001), male, corresponding author, engaged in research on the optimal operation and control of microgrids. E-mail: 2949948561@qq.com
  • Supported by:
    This work is supported by the National Natural Science Foundation of China (No. 52107108).

Abstract:

To address the overestimation bias and limited decision accuracy of conventional energy management methods for microgrid clusters, an energy management strategy based on an improved double deep Q-network is proposed. First, a dual target value network framework is constructed based on the idea of clipped double Q-learning: the temporal difference (TD) targets of the two value networks are computed in parallel and the higher TD target is clipped, which suppresses overestimation of the value function and improves decision accuracy. Second, a dynamic greedy strategy is adopted that computes the value function of all possible actions in the current state and avoids frequently selecting the action with the maximum Q-value, so that the agent explores the action space sufficiently and does not converge prematurely. Finally, the strategy is verified in a case study of a microgrid cluster with three sub-microgrids. Simulation results show that, compared with energy management strategies based on model predictive control and the conventional double deep Q-network, the proposed method achieves better optimization performance and convergence, and reduces system operating cost by 44.62% and 26.39%, respectively.

Key words: microgrid cluster, energy management, improved double deep Q-network, clipped double Q values, greedy strategy
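
The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact target formula or exploration schedule, so the function names, the discount factor `gamma`, and the linearly decaying epsilon-greedy rule used here as a stand-in for the "dynamic greedy strategy" are all assumptions made for illustration.

```python
import numpy as np

def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Return the smaller of two TD targets (clipped double Q idea).

    q1_next, q2_next: next-state action values from two hypothetical
    target networks, each an array of shape (n_actions,).
    """
    q1_next = np.asarray(q1_next, dtype=float)
    q2_next = np.asarray(q2_next, dtype=float)
    # Compute one TD target per network; no bootstrapping at terminal states.
    y1 = reward + gamma * (1.0 - done) * q1_next.max()
    y2 = reward + gamma * (1.0 - done) * q2_next.max()
    # Clip the higher target by keeping the minimum of the two.
    return float(min(y1, y2))

def decayed_epsilon(step, eps_start=1.0, eps_end=0.05, decay_steps=1000):
    """Assumed schedule: epsilon decays linearly, then stays at eps_end."""
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)

def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    q_values = np.asarray(q_values, dtype=float)
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = clipped_double_q_target(1.0, 0.0, [1.0, 3.0], [2.0, 2.5], gamma=0.5)
    action = epsilon_greedy([0.1, 0.9, 0.3], decayed_epsilon(step=500), rng)
    print(target, action)
```

Taking the minimum of the two targets biases the estimate downward rather than upward, which is the usual motivation for clipping. The decaying epsilon keeps early action selection mostly random and later selection mostly greedy.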
