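HUANG Qiaojuan* **, CAO Cungen*, CHEN Zhiwen* **. Integration of patterns and deep learning for extracting causal event triples [J]. Chinese High Technology Letters, 2024, 34(9): 921-934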
Integration of patterns and deep learning for extracting causal event triples
DOI: 10.3772/j.issn.1002-0470.2024.09.002
Keywords: causal event triples, lexical-syntactic pattern, bidirectional long short-term memory-conditional random field (BiLSTM-CRF), multi-feature fusion, deep learning
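HUANG Qiaojuan* ** (*Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190) (**University of Chinese Academy of Sciences, Beijing 100049)
CHEN Zhiwen* **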
Causal event triples play a pivotal role in understanding the logical links between events. To address the lack of high-quality datasets and the limited coverage of causal knowledge in extracting causal event triples from text, this work combines pattern-based methods with deep learning. First, lexical-syntactic patterns that reflect causal relationships are constructed and matched against a Web corpus. Second, inverse document frequency and causal event boundary word strategies filter noise from the pattern matches. Third, the causal events are normalized by rules, yielding a high-quality causal event triple dataset. Finally, characters, words, parts of speech, causal pattern feature words, and causal event boundary words are fused as features in a bidirectional long short-term memory-conditional random field (BiLSTM-CRF) model, together with deep learning strategies. Trained on this dataset, the model performs well in extracting causal event triples from a large-scale Web corpus covering broad domain knowledge. Experimental results show that the F1 score for causal event triples is 92.44% and the precision of boundary word identification is 94.00%. These findings support the effectiveness of integrating patterns with deep learning, the quality of the dataset, and the method's value for extracting causal event triples from Web corpora.
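The first two stages of the pipeline (pattern matching, then noise filtering by inverse document frequency) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not list the actual patterns, so `CAUSAL_PATTERN` is a hypothetical English cue-word pattern, and the mean-IDF threshold in `filter_triples` is one plausible way to apply an IDF filter, not necessarily the rule used in the paper.

```python
import math
import re
from collections import Counter

# Hypothetical lexical pattern: "<cause> causes|leads to|results in <effect>".
# The paper's real lexical-syntactic patterns are not given in the abstract.
CAUSAL_PATTERN = re.compile(
    r"^(?P<cause>.+?)\s+(?:causes|leads to|results in)\s+(?P<effect>.+?)\.?$",
    re.IGNORECASE,
)


def tokenize(text):
    """Lowercase word tokens; punctuation is discarded."""
    return re.findall(r"\w+", text.lower())


def match_triples(sentences):
    """Return (cause, relation, effect) for every sentence the pattern matches."""
    triples = []
    for sentence in sentences:
        m = CAUSAL_PATTERN.match(sentence.strip())
        if m:
            triples.append((m.group("cause"), "causes", m.group("effect")))
    return triples


def idf_table(documents):
    """IDF(w) = ln(N / df(w)), where df counts documents containing w."""
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(tokenize(doc)))
    return {word: math.log(n / count) for word, count in df.items()}


def mean_idf(phrase, idf):
    """Average IDF of the words in a phrase; unseen words count as 0."""
    words = tokenize(phrase)
    if not words:
        return 0.0
    return sum(idf.get(w, 0.0) for w in words) / len(words)


def filter_triples(triples, idf, threshold):
    """Assumed filter: keep triples whose cause and effect are both informative."""
    return [
        (cause, rel, effect)
        for cause, rel, effect in triples
        if mean_idf(cause, idf) >= threshold and mean_idf(effect, idf) >= threshold
    ]


corpus = [
    "Heavy rain causes flooding.",
    "It causes this.",
    "This is it.",
    "It is this one.",
]
candidates = match_triples(corpus)
kept = filter_triples(candidates, idf_table(corpus), threshold=1.0)
print(kept)
```

In this toy corpus the pattern matches two sentences, but the match built from "It" and "this" is discarded because those words appear in most documents and therefore have low IDF. The surviving triples would then go through normalization and serve as training data for the sequence-labeling model.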
