Abstract:
Textual generalization of accident risk factors is an important step in building a knowledge graph of accident risk factors for oil & gas storage and transportation enterprises. Existing event text generalization methods suffer from limited semantic representation and word segmentation errors when applied to the risk factor texts accumulated during production at these enterprises. To address these problems, and to cope with the complicated and changeable wording of safety management texts, a textual generalization method for accident risk factors based on Char-Word feature based AGNES (CW-AGNES) is proposed. First, character feature vectors and binary word feature vectors for oil & gas storage and transportation enterprises are obtained with the Word2vec method, and the accident risk factor texts are vectorized using the pre-trained word vector model. Then, the char-word features of each text are combined and passed to the agglomerative nesting (AGNES) method, which reduces the error caused by word segmentation while retaining the semantic information of the words, thereby generalizing the risk factor texts. The CW-AGNES method was applied to actual safety management texts from oil & gas storage and transportation enterprises and compared with other generalization methods. The results show that CW-AGNES generalizes better, with improvements of 2.44%–5.74% in quantitative evaluation indicators such as AMI, ARI, V-Measure and FMI. The proposed method can therefore support the construction of accident risk knowledge graphs in the field of oil & gas storage and transportation.
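The pipeline summarized above (char-level and word-level features added together, then agglomerative clustering) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, and several things in it are assumptions: simple count vectors stand in for the pre-trained Word2vec embeddings, adjacent character pairs stand in for the binary word features, average linkage over cosine distance is chosen arbitrarily, and the function names `char_word_features`, `cosine_distance` and `agnes` as well as the example texts are hypothetical.

```python
import math
from collections import Counter


def char_word_features(text):
    """Add character counts and adjacent-character-pair counts into one sparse vector.

    Assumption: count vectors replace the Word2vec embeddings used in the paper.
    """
    chars = Counter(text)
    pairs = Counter(text[i:i + 2] for i in range(len(text) - 1))
    return chars + pairs  # keys never collide: length-1 vs length-2 strings


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def agnes(vectors, n_clusters):
    """Bottom-up (agglomerative) clustering with average linkage (assumed)."""
    dist = [[cosine_distance(a, b) for b in vectors] for a in vectors]
    clusters = [[i] for i in range(len(vectors))]  # start with singletons
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters
                d = sum(dist[a][b] for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]  # j > i, so index i is unaffected
    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels


# Hypothetical risk-factor texts, for illustration only
texts = ["valve leak", "valve leaks", "pump fire", "pump fires"]
labels = agnes([char_word_features(t) for t in texts], n_clusters=2)
print(labels)
```

Because the features are built from characters and character pairs rather than from segmented words alone, near-identical phrasings such as "valve leak" and "valve leaks" land close together even if a segmenter would split them differently. With labelled data, the resulting cluster labels could then be scored with indicators such as AMI, ARI, V-Measure and FMI.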