Electric Power ›› 2023, Vol. 56 ›› Issue (7): 146-155.DOI: 10.11930/j.issn.1004-9649.202302001

• Power System •

The Construction of the Professional Dictionary of Relay Protection Defect Text in a Regional Power Grid and Its Natural Language Characteristics Analysis

LIU Zhongshuo1, ZHENG Shaoming2, TAO Chang1, LIU Yimin2, CHEN Qian1, WANG Shuhong1, YU Yiting1, XUE Ancheng1   

    1. State Key Laboratory of Alternate Electrical Power System with Renewable Energy Source (North China Electric Power University), Beijing 102206, China;
    2. North China Branch of State Grid Corporation of China, Beijing 100053, China
  • Received:2023-02-01 Revised:2023-03-13 Accepted:2023-05-02 Online:2023-07-23 Published:2023-07-28
  • Supported by:
    This work is supported by the Technical Service Project of North China Branch of State Grid Corporation of China (No.SGNC0000DKJS2100369).

Abstract: The massive volume of defect text data from relay protection devices has not been mined with the support of a professional dictionary. As a result, it cannot adequately support the grading, diagnosis, and elimination of relay protection defects, and the need for efficient operation and maintenance goes unmet. This paper proposes a method for constructing a professional dictionary for relay protection device defects and builds such a dictionary, taking a regional power grid as an example. First, relevant defect logs and management protocols are aggregated to form a defect text corpus. Second, a regular-expression-based stop-word identification method is applied to remove irrelevant words from the defect text. Third, a combined machine and manual method is used to build a relay protection defect text dictionary. In addition, latent semantic analysis and decision tree classification are used to merge synonyms. By integrating the stop-word list, the word segmentation lexicon, and the synonym list, a professional dictionary of protection device defects in the regional power grid is constructed. Finally, a Zipf distribution analysis of the professional dictionary and an information entropy analysis of the corpus before and after applying the dictionary are carried out, demonstrating the effectiveness of the professional dictionary.

Key words: relay protection, corpus, defect record, text mining, professional dictionary
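
The evaluation steps named in the abstract (regular-expression filtering of irrelevant words, synonym merging, rank-frequency counts for a Zipf analysis, and corpus information entropy before and after applying the dictionary) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the patterns in STOP_PATTERNS, the entries in SYNONYMS, the function names, and the toy tokens are all hypothetical, and the synonym list here is written by hand rather than derived from latent semantic analysis and decision tree classification.

```python
import math
import re
from collections import Counter

# Hypothetical patterns (assumptions, not taken from the paper):
# dates and bare numbers are treated as carrying no defect semantics.
STOP_PATTERNS = [
    re.compile(r"\d{4}-\d{2}-\d{2}"),  # dates such as 2023-02-01
    re.compile(r"\d+"),                # bare numbers
]

# Hypothetical hand-written synonym list mapping variants to one canonical term.
SYNONYMS = {
    "comm_fail": "communication_failure",
    "comm_break": "communication_failure",
}


def is_stop_word(token):
    """Return True if the whole token matches any stop-word pattern."""
    return any(p.fullmatch(token) for p in STOP_PATTERNS)


def normalize(tokens):
    """Drop stop words, then map each remaining token to its canonical synonym."""
    return [SYNONYMS.get(t, t) for t in tokens if not is_stop_word(t)]


def entropy(tokens):
    """Shannon entropy (in bits) of the empirical token distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rank_frequency(tokens):
    """(rank, frequency) pairs, most frequent first, as input to a Zipf plot."""
    counts = Counter(tokens)
    return [(rank, freq)
            for rank, (_, freq) in enumerate(counts.most_common(), start=1)]


# Toy, already-segmented tokens (hypothetical data).
raw = ["comm_fail", "comm_break", "2023-02-01", "17",
       "communication_failure", "alarm"]
cleaned = normalize(raw)

print(entropy(raw), entropy(cleaned))
print(rank_frequency(cleaned))
```

In this toy example the entropy drops after normalization because three spelling variants collapse into one term and the date and number tokens are removed. Whether and how much entropy changes on a real defect corpus depends on the data and on the dictionary actually used.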