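Lin Xin, Du Ying, Luo Yu. Structure Function Recognition of Scientific Research Project Application Based on Multi-Stage Classification [J]. Digital Library Forum (数字图书馆论坛), 2024, 20(3): 25-33 |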
Structure Function Recognition of Scientific Research Project Application Based on Multi-Stage Classification |
Submitted: 2023-12-21 |
DOI:10.3772/j.issn.1673-2286.2024.03.003 |
Keywords: Scientific Research Project Application; Structure Function Recognition; Multi-Stage Classification; BiLSTM-Attention |
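Funding: This research was supported by the National Social Science Fund of China project "Research on Semantic Annotation and Object Linking of Academic Papers for Multimodal Publishing" (No. 23BTQ083). |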
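Author | Affiliation |
Lin Xin (林鑫) | School of Information Management, Central China Normal University |
Du Ying (杜莹) | School of Information Management, Central China Normal University |
Luo Yu (罗宇) | School of Information Management, Central China Normal University |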
Abstract: |
Research project applications contain rich scientific knowledge and are widely used as basic data for scientific and technological intelligence analysis. Intelligent processing tasks such as duplicate detection and analytical mining require the structure functions of an application to be identified first. This paper therefore proposes a structure function recognition model for research project applications based on multi-stage classification. First, each application is preprocessed: its main body and the multimodal elements it contains are identified, and its text paragraphs are normalized. Then, using a BiLSTM-Attention model, the method successively distinguishes section headings from body text, recognizes the first-level function of the body text from the headings, and finally identifies the fine-grained structure function of the application. Experiments show that the precision and recall of the proposed method reach 93.7% and 93.1%, respectively. The method can support the structured parsing of research project applications and can serve as a reference for structure function recognition in other types of academic text. |
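The control flow of the staged classification described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the three classifiers are passed in as plain callables (in the paper they would be BiLSTM-Attention models), and the names `recognize`, `Labeled`, the `toy_*` rules, and the label strings are all hypothetical stand-ins chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Labeled:
    text: str
    is_heading: bool
    primary: Optional[str] = None  # first-level function, inherited from the heading
    fine: Optional[str] = None     # fine-grained function, body text only


def recognize(
    paragraphs: List[str],
    is_heading: Callable[[str], bool],
    primary_of: Callable[[str], str],
    fine_of: Callable[[str, str], str],
) -> List[Labeled]:
    """Run the three stages in order over already-normalized paragraphs."""
    results: List[Labeled] = []
    current_primary: Optional[str] = None
    for p in paragraphs:
        if is_heading(p):                      # stage 1: heading vs. body text
            current_primary = primary_of(p)    # stage 2: first-level function from the heading
            results.append(Labeled(p, True, current_primary))
        else:
            fine = None
            if current_primary is not None:    # stage 3: fine-grained function
                fine = fine_of(p, current_primary)
            results.append(Labeled(p, False, current_primary, fine))
    return results


# Toy rule-based stand-ins (assumptions) so the sketch runs without a trained model.
def toy_is_heading(p: str) -> bool:
    return len(p) < 40 and not p.endswith(".")


def toy_primary_of(heading: str) -> str:
    return "background" if "background" in heading.lower() else "plan"


def toy_fine_of(p: str, primary: str) -> str:
    suffix = "significance" if "important" in p.lower() else "other"
    return f"{primary}/{suffix}"


paragraphs = [
    "1 Research Background",
    "This problem is important for information analysis.",
    "2 Research Plan",
    "We will collect and annotate proposals.",
]
labeled = recognize(paragraphs, toy_is_heading, toy_primary_of, toy_fine_of)
```

Passing the classifiers in as callables keeps the staging logic separate from the models, so each toy rule could be replaced by a trained classifier without changing `recognize`.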