Article Abstract
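Gao Su, Jin Pei, Zhang Dezheng. Research on Named Entity Recognition of TCM Classics Based on Deep Learning[J]. Technology Intelligence Engineering (情报工程), 2019, 5(1): 113-123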
Research on Named Entity Recognition of TCM Classics Based on Deep Learning
  
DOI:10.3772/j.issn.2095-915X.2019.01.010
Keywords: named entity recognition; deep learning; traditional Chinese medicine; Huangdi Neijing
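Funding: National Key R&D Program of China, Special Project on Cloud Computing and Big Data, "Big-Data-Driven Intelligent Auxiliary Diagnosis Service System for Traditional Chinese Medicine" (2017YFB1002300).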
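Authors and affiliations
Gao Su 1. Beijing Normal University Hospital
Jin Pei 2. School of Computer and Communication Engineering, University of Science and Technology Beijing
Zhang Dezheng 2. School of Computer and Communication Engineering, University of Science and Technology Beijing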
Abstract:
      Chinese medical classics pose particular difficulties for named entity recognition: their knowledge system is complex, word segmentation is hard, and the hand-crafted features used by traditional methods are inaccurate. To address these problems, this paper proposes a deep-learning-based method for named entity recognition in TCM classics. Based on the characteristics of the TCM classics corpus and of mainstream deep learning models, the method takes character vectors of the classics as input and uses an entity recognition model built from a bidirectional long short-term memory network and a conditional random field (BiLSTM-CRF). It recognizes five entity types in Huangdi Neijing: TCM cognitive methods, TCM physiology, TCM pathology, TCM nature, and treatment principles and methods. The model achieves a precision of 85.44%, a recall of 85.19%, and an F1 score of 85.32%. Extensive comparative experiments on the same TCM classics corpus verify the effectiveness of the method. The results show that the method effectively improves entity recognition accuracy on TCM classics; it is a valuable attempt at applying deep learning to specialized corpora and has practical significance.
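To make the reported precision, recall, and F1 figures concrete, the following is a minimal sketch of entity-level scoring over character-level tag sequences. It rests on several assumptions that the abstract does not confirm: that labels follow a BIO scheme, that an entity counts as correct only when its type and both boundaries match exactly, and that the label names `COG` and `NAT` (standing in for "TCM cognitive methods" and "TCM nature") are reasonable shorthand. The annotation of the example sentence is invented for illustration and is not taken from the authors' corpus.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); hypothetical representation
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one character-level BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def precision_recall_f1(gold: List[List[str]], pred: List[List[str]]):
    """Exact-match, entity-level precision, recall, and F1 over a corpus."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with the sentence index so spans stay distinct.
        gold_spans |= {(idx, *s) for s in extract_spans(g)}
        pred_spans |= {(idx, *s) for s in extract_spans(p)}
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    chars = list("阴阳者天地之道也")  # one tag per character, no word segmentation
    # Illustrative annotation only (assumed, not from the paper's data).
    gold = [["B-COG", "I-COG", "O", "B-NAT", "I-NAT", "O", "O", "O"]]
    # The prediction truncates the second entity, so it does not count as correct.
    pred = [["B-COG", "I-COG", "O", "B-NAT", "O", "O", "O", "O"]]
    assert len(chars) == len(gold[0]) == len(pred[0])
    print(precision_recall_f1(gold, pred))
```

Tagging per character sidesteps the word segmentation difficulty the abstract mentions, since no tokenizer is needed. Applying the same harmonic-mean formula to the reported precision (85.44%) and recall (85.19%) gives roughly 85.31%, which agrees with the reported F1 of 85.32% up to rounding of the inputs.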