Two-layer Optimization Scheduling for Off-grid Microgrids Based on Multi-agent Deep Policy Gradient

doi:10.11930/j.issn.1004-9649.202408092

Abstract

High-penetration integration of distributed renewable energy causes voltage limit violations and bidirectional power flow. To address these problems, this paper proposes a two-layer method for the coordinated optimal dispatch of active and reactive power in off-grid microgrids, which keeps the system secure and stable while improving operating economy. The lower-level model optimizes slow-regulating discrete devices using mixed-integer second-order cone programming (MISOCP), and the upper-level model optimizes fast-regulating continuous devices using a multi-agent deep policy gradient algorithm. The two-layer model coordinates the active and reactive power flows of the microgrid, monitors the microgrid's status in real time, and makes online decisions on device regulation without relying on precise power flow models or complex communication systems. The feasibility and effectiveness of the two-layer optimization model are validated on a modified IEEE 33-bus microgrid system.

Key words: off-grid microgrid, multi-agent, deep reinforcement learning, mixed integer linear programming, multi-time scale, active-reactive power cooperative optimization
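
The two time scales described in the abstract can be pictured as a nested loop: discrete settings are fixed once per slow period, and continuous setpoints are updated at every fast step inside it. The following is a minimal sketch of that structure only, not the paper's implementation. The names `SlowSchedule`, `slow_layer`, `make_actor`, and `run` are hypothetical, the threshold rule merely stands in for the MISOCP stage, and the proportional rule stands in for a trained policy-gradient actor. Having each agent act on its own local measurement is an assumption suggested by the abstract's remark about not needing complex communication systems.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class SlowSchedule:
    """Discrete settings held fixed for one slow period (hypothetical encoding)."""
    oltc_tap: int   # transformer tap position
    sc_steps: int   # capacitor banks switched in


def slow_layer(load_forecast: Sequence[float]) -> SlowSchedule:
    """Stand-in for the MISOCP stage: a threshold rule, not an optimizer."""
    heavy = sum(load_forecast) / len(load_forecast) > 0.8
    return SlowSchedule(oltc_tap=1 if heavy else 0, sc_steps=2 if heavy else 1)


def make_actor(q_min: float, q_max: float, gain: float = 10.0) -> Callable[[float], float]:
    """Stand-in for one trained actor: local voltage (p.u.) -> clipped setpoint."""
    def actor(v_local: float) -> float:
        q = gain * (1.0 - v_local)        # inject when voltage is low, absorb when high
        return max(q_min, min(q_max, q))  # respect the device's rating
    return actor


def run(
    forecasts: Sequence[Sequence[float]],
    voltages: Sequence[Dict[str, float]],
    actors: Dict[str, Callable[[float], float]],
    fast_per_slow: int,
) -> List[Tuple[int, SlowSchedule, Dict[str, float]]]:
    """Nest the fast loop inside the slow loop and log every decision."""
    log = []
    for k, forecast in enumerate(forecasts):
        schedule = slow_layer(forecast)        # slow time scale
        for j in range(fast_per_slow):         # fast time scale
            t = k * fast_per_slow + j
            # Each agent sees only its own local measurement.
            actions = {name: act(voltages[t][name]) for name, act in actors.items()}
            log.append((t, schedule, actions))
    return log


# Ratings follow Table 1 (SVC1: -1 to 1, SVC2: -2 to 2); all other numbers are made up.
actors = {"SVC1": make_actor(-1.0, 1.0), "SVC2": make_actor(-2.0, 2.0)}
forecasts = [[0.5, 0.6], [0.9, 1.0]]
voltages = [
    {"SVC1": 0.97, "SVC2": 1.02},
    {"SVC1": 0.85, "SVC2": 0.85},
    {"SVC1": 1.00, "SVC2": 0.99},
    {"SVC1": 1.20, "SVC2": 1.03},
]
for t, schedule, actions in run(forecasts, voltages, actors, fast_per_slow=2):
    print(t, schedule, actions)
```

In this sketch the schedule changes only when the slow period changes, while the setpoints change at every step and saturate at the device ratings. In a real system the two placeholder functions would be replaced by a solver call and by trained policy networks.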

FAN Huicong, DUAN Zhiguo, CHEN Zhiyong, ZHU Shijia, LIU Hang, LI Wenxiao, YANG Yang. Two-layer Optimization Scheduling for Off-grid Microgrids Based on Multi-agent Deep Policy Gradient[J]. Electric Power, 2025, 58(5): 11-20, 32.

URL: https://www.electricpower.com.cn/EN/10.11930/j.issn.1004-9649.202408092

https://www.electricpower.com.cn/EN/Y2025/V58/I5/11

Figures/Tables 13

Fig.1 Schematic of two-layer optimal scheduling for off-grid microgrid

Fig.2 Schematic of second-order cone relaxation process

Fig.3 33-node microgrid system

Table 1 Parameters of controllable devices

Device	Parameter	Installation node
ESS	1 MW	13
OLTC	0.95~1.05 p.u.	4, 5, 7, 8
SC	0.2 MV·A	26, 30
SVC1	–1~1 MV·A	18
SVC2	–2~2 MV·A	33

Fig.4 Voltage amplitude of each node before and after optimization

Fig.5 Network losses of microgrids before and after optimization

Fig.6 Results of day-ahead scheduling optimization

Fig.7 Voltage amplitude of each node before and after optimization

Fig.8 Results of intraday scheduling optimization

Table 2 Power quality on multiple testing days

Scenario	Voltage deviation (p.u.)	Power loss/MW
Day-ahead optimization	8.3785×10^–2	9.2554×10^–3
Intraday optimization	3.7125×10^–2	4.7943×10^–3

Fig.9 Voltage deviation on multiple test days

Fig.10 Power loss on multiple test days

Table 3 Decision time of different optimization algorithms

Algorithm	Average decision time/ms
MISOCP	173.2
MADDPG	4.6
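
To put the numbers in Table 2 and Table 3 side by side, the short calculation below derives relative changes from the tabulated values. It is only arithmetic on the figures listed above. The variable names and the helper `relative_reduction` are illustrative, and reading Table 2 as a comparison of the intraday stage against the day-ahead stage is an assumption based on the row labels.

```python
# Values copied from Table 2 and Table 3.
voltage_dev = {"day_ahead": 8.3785e-2, "intraday": 3.7125e-2}  # p.u.
power_loss = {"day_ahead": 9.2554e-3, "intraday": 4.7943e-3}   # MW
decision_ms = {"MISOCP": 173.2, "MADDPG": 4.6}                 # ms


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


dev_cut = relative_reduction(voltage_dev["day_ahead"], voltage_dev["intraday"])
loss_cut = relative_reduction(power_loss["day_ahead"], power_loss["intraday"])
speedup = decision_ms["MISOCP"] / decision_ms["MADDPG"]

print(f"voltage deviation reduction: {dev_cut:.1%}")
print(f"power loss reduction:        {loss_cut:.1%}")
print(f"decision-time ratio:         {speedup:.1f}x")
```

On these values the intraday stage shows roughly 56% lower voltage deviation and 48% lower power loss than the day-ahead stage, and the MADDPG decision time is roughly 38 times shorter than that of MISOCP.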

