中国电力 (Electric Power) ›› 2024, Vol. 57 ›› Issue (1): 91-100. DOI: 10.11930/j.issn.1004-9649.202307006

• Virtual Power Plant Construction and Operation •

• About the authors: Chao ZHANG (1998—), male, master's degree candidate, engaged in research on optimal dispatch of virtual power plants, E-mail: 120212201477@ncepu.edu.cn
    Dongmei ZHAO (1965—), female, professor and doctoral supervisor, engaged in research on power system analysis and control, renewable energy generation, and smart grids, E-mail: zhao-dm@ncepu.edu.cn
    Yu JI (1982—), male, Ph.D., professor-level senior engineer, engaged in research on distributed generation, microgrids, and virtual power plant technology, E-mail: jiyu@epri.sgcc.com.cn
    Ying ZHANG (1994—), female, M.Sc., engineer, engaged in research on optimal dispatch of distributed generation and virtual power plant technology, E-mail: zhangying@epri.sgcc.com.cn

Real-Time Optimal Dispatch of Virtual Power Plant Based on Improved Deep Q-Network

Chao ZHANG1,2, Dongmei ZHAO1, Yu JI2, Ying ZHANG2

  1. School of Electrical and Electronic Engineering, North China Electric Power University, Beijing 102206, China
    2. State Grid Shanghai Energy Internet Research Institute Co., Ltd., Shanghai 200120, China
  • Received: 2023-07-03  Accepted: 2023-11-07  Online: 2024-01-28  Published: 2024-01-23
  • Supported by:
    This work is supported by the National Key R&D Program of China (Key Technologies for Aggregation and Interactive Regulation of Virtual Power Plants with Large-Scale Flexible Resources, No. 2021YFB2401200).


Abstract:

Deep reinforcement learning is data-driven and does not rely on an explicit model, making it well suited to the complexity of virtual power plant (VPP) operation. However, it is difficult for existing algorithms to strictly enforce operational constraints, which limits their application in practical systems. To overcome this problem, an improved deep Q-network (MDQN) algorithm based on deep reinforcement learning is proposed. The algorithm expresses the deep neural network as a mixed-integer programming formulation, ensuring that all operational constraints are strictly enforced within the action space and thus that the resulting dispatch schedules are feasible in actual operation. In addition, a sensitivity analysis is conducted so that hyperparameters can be adjusted flexibly, providing greater latitude for algorithm optimization. Finally, comparative experiments verify the superior performance of the MDQN algorithm, which offers an effective solution to the complexity of VPP operation.
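The abstract's central idea is that the agent may only choose actions satisfying the VPP's operational constraints. The paper achieves this by encoding the trained Q-network as a mixed-integer program; the sketch below conveys the same property with a much simpler stand-in, masking infeasible actions before the greedy argmax. All names and numbers here (the action set, output limits, ramp limit, and Q-values) are hypothetical illustrations, not values from the paper.

```python
# Constraint-aware greedy action selection for a single dispatchable unit.
# Simplified stand-in for the paper's MIP encoding of the Q-network:
# infeasible actions are masked before taking the argmax over Q-values.

# Discrete action space: candidate changes in power set-point (MW).
ACTIONS = [-20.0, -10.0, 0.0, 10.0, 20.0]

P_MIN, P_MAX = 0.0, 50.0   # output limits of the unit (MW), hypothetical
RAMP_LIMIT = 15.0          # max change per dispatch interval (MW), hypothetical

def feasible(p_now: float, delta: float) -> bool:
    """Operational constraints: output limits and ramp limit."""
    p_next = p_now + delta
    return P_MIN <= p_next <= P_MAX and abs(delta) <= RAMP_LIMIT

def constrained_greedy(q_values, p_now: float) -> int:
    """Return the index of the highest-Q action among feasible ones."""
    candidates = [i for i, a in enumerate(ACTIONS) if feasible(p_now, a)]
    if not candidates:
        raise RuntimeError("no feasible action in the current state")
    return max(candidates, key=lambda i: q_values[i])

# Hypothetical Q-values: the unconstrained argmax (+20 MW) violates the
# 15 MW ramp limit, so the constrained policy selects +10 MW instead.
q = [0.1, 0.3, 0.2, 0.6, 0.9]
best = constrained_greedy(q, p_now=30.0)
```

Unlike this enumeration over a small discrete action set, the MIP formulation described in the abstract scales to large action spaces by letting a solver search over the network's input–output map directly, while still guaranteeing that every constraint holds exactly.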

Key words: virtual power plant, real-time optimization, deep reinforcement learning, cloud-edge collaboration, optimal dispatch