Electric Power (中国电力) ›› 2023, Vol. 56 ›› Issue (11): 60-66. DOI: 10.11930/j.issn.1004-9649.202305116

• Low-Power Wireless Sensor Network Technologies and Applications for Power Grid Equipment State Sensing •

DDQN-Based Resource Allocation Algorithm for Power Sensor Networks

Xueqiong ZHU, Chengbo HU, Jinggang YANG, Yongling LU

  1. Electric Power Research Institute of State Grid Jiangsu Electric Power Co., Ltd., Nanjing 211103, Jiangsu, China
  • Received: 2023-05-26  Accepted: 2023-10-12  Online: 2023-11-28  Published: 2023-11-28
  • About the authors: Xueqiong ZHU (1990—), male, Ph.D., senior engineer, engaged in research on intelligent condition sensing of power equipment, E-mail: 18795897606@163.com
    Chengbo HU (1984—), male, M.Sc., senior engineer, engaged in research on intelligent operation and maintenance inspection and IoT communication algorithms for power equipment
    Jinggang YANG (1984—), male, M.Sc., professorate senior engineer, engaged in research on condition assessment, fault diagnosis and detection of high-voltage power transmission and transformation equipment
    Yongling LU (1988—), female, M.Sc., senior engineer, engaged in research on intelligent operation and maintenance inspection and IoT communication algorithms for power equipment
  • Supported by:
    Science and Technology Project of State Grid Corporation of China (Development and Application of a Low-Power Wideband-Narrowband Fusion Wireless Sensor Network System Integrated with Power Sensing Devices, No. 5108-202218280A-2-201-XG).

DDQN-Based Resource Allocation Algorithm for Power Sensor Networks

Xueqiong ZHU, Chengbo HU, Jinggang YANG, Yongling LU

  1. Electric Power Research Institute of State Grid Jiangsu Electric Power Co., Ltd., Nanjing 211103, China
  • Received: 2023-05-26  Accepted: 2023-10-12  Online: 2023-11-28  Published: 2023-11-28
  • Supported by:
    This work is supported by the Science and Technology Project of SGCC (Development and Application of a Low-Power Wideband-Narrowband Fusion Wireless Sensor Network System Integrated with Power Sensing Devices, No. 5108-202218280A-2-201-XG).

Abstract:

A power sensor network collects information on the operating state and working environment of power grid equipment in real time, and therefore plays an important role in the real-time monitoring of and rapid response to grid facilities. To meet the system's specific requirements on data queuing delay and packet loss rate, a reinforcement-learning-based resource allocation scheme for power sensor networks is proposed. Under resource constraints, the queuing delay and packet loss rate of sensor nodes are optimized through a resource allocation algorithm; the optimization problem is modeled as a Markov decision process (MDP) and the objective function is solved with a double deep Q-network (DDQN). Simulation results and numerical analysis show that the proposed scheme outperforms the benchmark schemes in convergence, queuing delay, and packet loss rate.
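For reference, the double deep Q-network mentioned above selects the greedy action with the online network while evaluating it with a separate target network; in the standard DDQN formulation (a generic statement, not reproduced from the paper):

```latex
y_t = r_t + \gamma \, Q\!\left(s_{t+1}, \arg\max_{a'} Q(s_{t+1}, a'; \theta_t);\; \theta_t^{-}\right)
```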

Keywords: power sensor network, resource allocation, Markov decision process, double deep Q-network

Abstract:

The power sensor network can collect information on the working status and working environment of power grid equipment in real time, which plays an important role in the real-time monitoring of and quick response to grid facilities. Aiming at the system's special requirements on data queuing delay and packet loss rate, this paper proposes a reinforcement learning (RL) based resource allocation scheme for power sensor networks. Under resource constraints, the scheme optimizes the queuing delay and packet loss rate of sensor nodes through a resource allocation algorithm; the optimization problem is modeled as a Markov decision process (MDP) and solved by a double deep Q-network (DDQN) algorithm. Simulation results and numerical analysis show that the proposed scheme outperforms the benchmark schemes in convergence, queuing delay and packet loss rate.
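To make the resource-allocation framing concrete, the sketch below shows how a DDQN agent of the kind described in the abstract could compute its learning target from a queuing-delay/packet-loss reward. It is a minimal illustration under assumed names and parameters (node count, reward weights, tabular stand-ins for the Q-networks), not the authors' implementation.

```python
# Minimal sketch of the double-DQN target computation for a resource-allocation
# agent. All names (state layout, reward weights, network sizes) are
# illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 5        # discrete resource-block allocation choices (assumed)
GAMMA = 0.95         # discount factor (assumed)

def reward(queue_delay, loss_rate, w_delay=1.0, w_loss=2.0):
    """Negative weighted cost: smaller queuing delay and packet loss are better."""
    return -(w_delay * queue_delay + w_loss * loss_rate)

# Stand-ins for the online and target Q-networks: fixed random tables
# Q[state_index, action] so the example runs without a deep-learning framework.
q_online = rng.normal(size=(10, N_ACTIONS))
q_target = rng.normal(size=(10, N_ACTIONS))

def ddqn_target(r, s_next, done):
    """Double-DQN target: the online net picks the action, the target net scores it."""
    a_star = int(np.argmax(q_online[s_next]))   # action selection (online network)
    bootstrap = q_target[s_next, a_star]        # action evaluation (target network)
    return r + (0.0 if done else GAMMA * bootstrap)

# One illustrative transition: state indices would come from discretised
# queue lengths / channel states in a full implementation.
s, a, s_next = 3, 2, 7
r = reward(queue_delay=0.8, loss_rate=0.05)
y = ddqn_target(r, s_next, done=False)
td_error = y - q_online[s, a]
q_online[s, a] += 0.1 * td_error                # simple TD update step
print(f"reward={r:.3f}, target={y:.3f}, td_error={td_error:.3f}")
```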

Key words: power sensor network, resource allocation, Markov decision process, double deep Q-network