中国电力 (Electric Power), 2023, Vol. 56, Issue 9: 127-133. DOI: 10.11930/j.issn.1004-9649.202210068

• Distribution Network Planning and Optimal Operation •

基于对抗性强化学习的配电网融合通信效率提升方法

彭琳钰1, 刘旭1, 汤玮1, 刘晴1, 方浩2, 张光辉3   

  1. 贵州电网有限责任公司,贵州 贵阳 550002;
    2. 贵州电网有限责任公司贵阳供电局,贵州 贵阳 550004;
    3. 贵州电网有限责任公司电力调度控制中心,贵州 贵阳 550002
  • 收稿日期:2022-10-18 修回日期:2023-08-10 发布日期:2023-09-20
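  • About the authors: PENG Linyu (1993-), female, master's degree, assistant engineer, engaged in research on power system communication, E-mail: penglinyu666@163.com; LIU Xu (1984-), male, corresponding author, master's degree, senior engineer, engaged in research on power system communication, E-mail: 290698192@qq.com; TANG Wei (1988-), male, master's degree, senior engineer, engaged in research on power system communication, E-mail: tangwei@gz.csg.cn; LIU Qing (1976-), female, master's degree, senior engineer, engaged in research on power system communication, E-mail: liuqing1949@sina.com; FANG Hao (1976-), male, engineer, engaged in research on distribution network monitoring, E-mail: fanghao@gz.csg.cn; ZHANG Guanghui (1995-), male, assistant engineer, engaged in research on power system communication, E-mail: zgh199509@163.com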
  • 基金资助:
    中国南方电网有限责任公司科技项目(066500GS62200017)。

Adversarial Reinforcement Learning-Based Converged Communication Efficiency Improvement Method for Power Distribution Network

PENG Linyu1, LIU Xu1, TANG Wei1, LIU Qing1, FANG Hao2, ZHANG Guanghui3   

  1. Guizhou Power Grid Co., Ltd., Guiyang 550002, China;
    2. Guiyang Power Supply Bureau of Guizhou Power Grid Co., Ltd., Guiyang 550004, China;
    3. Guizhou Power Dispatching & Communication Center, Guiyang 550002, China
  • Received: 2022-10-18 Revised: 2023-08-10 Published: 2023-09-20
  • Supported by:
    This work is supported by the Science and Technology Project of China Southern Power Grid Corporation (No.066500GS62200017).

摘要: 为了满足海量配电网终端源节点的多样化通信需求,须对配电网融合通信进行合理的编排优化。基于此,首先构建数据传输时延与传输能耗联合优化问题;其次,将优化问题建模为多臂赌博机问题,并提出一种基于对抗性强化学习的配电网融合通信编排算法,利用历史编排信息和感知源节点间的对抗性,动态学习通信编排决策;最后,通过仿真验证所提算法的优越性能。

关键词: 配电网, 强化学习, 对抗性感知, 通信编排

Abstract: To satisfy the diverse communication requirements of the massive number of terminal source nodes in power distribution networks, the converged communication of the distribution network must be properly orchestrated and optimized. First, a joint optimization problem of data transmission delay and transmission energy consumption is formulated. Second, the optimization problem is modeled as a multi-armed bandit problem, and an adversarial reinforcement learning-based converged communication orchestration algorithm for power distribution networks is proposed, which dynamically learns communication orchestration decisions from historical orchestration information and the perceived adversarial relationship among source nodes. Finally, simulations verify the superior performance of the proposed algorithm.

Key words: distribution network, reinforcement learning, adversary awareness, communication orchestration
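
To make the bandit formulation in the abstract concrete, the following minimal sketch applies EXP3, a standard algorithm for adversarial multi-armed bandits, to a toy channel-selection problem. It is an illustration under stated assumptions, not the algorithm proposed in the paper: the weighted-sum cost, the weight `alpha`, the exploration rate `gamma`, and all function names are hypothetical, and delay and energy are assumed to be normalized to [0, 1].

```python
import math
import random


def weighted_cost(delay, energy, alpha=0.5):
    """Hypothetical joint cost: weighted sum of normalized delay and energy."""
    return alpha * delay + (1 - alpha) * energy


def exp3(num_arms, horizon, cost_fn, gamma=0.1, seed=0):
    """Run EXP3; cost_fn(t, arm) must return a cost in [0, 1].

    Returns how many times each arm (e.g. a communication channel) was chosen.
    """
    rng = random.Random(seed)
    weights = [1.0] * num_arms
    pulls = [0] * num_arms
    for t in range(horizon):
        total = sum(weights)
        # Mix the weight-proportional distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / num_arms for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        reward = 1.0 - cost_fn(t, arm)
        # Importance-weighted estimate: only the chosen arm is observed.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / num_arms)
        # Rescale so the largest weight is 1, which avoids overflow.
        top = max(weights)
        weights = [w / top for w in weights]
        pulls[arm] += 1
    return pulls


# Toy scenario (assumed values): three channels with fixed (delay, energy).
CHANNELS = [(0.9, 0.9), (0.2, 0.2), (0.7, 0.5)]


def channel_cost(t, arm):
    delay, energy = CHANNELS[arm]
    return weighted_cost(delay, energy)


pulls = exp3(num_arms=len(CHANNELS), horizon=2000, cost_fn=channel_cost)
print(pulls)
```

In this toy setting the costs are stationary, so the learner should concentrate its choices on the second channel; EXP3 is used here because its guarantees do not rely on stationarity, which matches the adversarial setting the abstract describes.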