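Chen Haihua, Huang Yong, Zhang Jiong, Lu Wei. Research on Automatic Summarization of Academic Texts Based on Citation Context [J]. Digital Library Forum, 2016, (8): 43~49 |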
Research on Automatic Summarization of Academic Texts Based on Citation Context |
Research on Citation Context Based on Automatic Summarization of Academic Literature |
|
DOI: |
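Chinese keywords: automatic text summarization; citation context; support vector regression; word embedding |
English keywords: Automatic Summarization; Citation Context; Support Vector Regression; Word Embedding |
Funding: This research was supported by the General Program of the National Natural Science Foundation of China, "Semantic Recognition of Academic Texts and Knowledge Graph Construction Oriented to Lexical Function" (Grant No. 71473183). |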
Author | Affiliation |
Chen Haihua | Wuhan University |
Huang Yong | Wuhan University |
Zhang Jiong | Wuhan University |
Lu Wei | Wuhan University |
|
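Abstract (translated from the Chinese): |
Automatic summarization of academic text means automatically extracting the core content of a given academic document, so that users can write and read literature more efficiently. Current summarization techniques, which rank sentence importance by term frequency, cannot reveal the core content of academic text at the semantic level. Building on existing research, this paper introduces citation context content features and builds a support vector regression model that jointly considers how each feature in the summarization system affects sentence weight, then re-ranks sentences by importance. Evaluation based on WE-ROUGE shows that, compared with traditional methods based on term-frequency statistics and graph models, the proposed algorithm effectively improves the accuracy of automatic summarization. |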
English abstract: |
Automatic summarization of academic literature refers to automatically generating an abstract for a given paper, which helps users write and read academic literature more efficiently. Existing work evaluates and ranks sentences by term frequency and therefore cannot reveal the main idea of an article at a deeper semantic level. Building on previous research, this article introduces citation context as an additional feature. Combining it with other existing features, we automatically re-score each sentence with a support vector regression (SVR) model. A significant improvement over traditional term-frequency-based and graph-based methods, measured with WE-ROUGE, shows the effectiveness of citation context in automatic text summarization. |
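
The abstract does not specify the feature set, the similarity measure, or the SVR configuration, so the following is only a minimal sketch of the general idea: build per-sentence features (a term-frequency score and a citation-context similarity) and rank sentences by a combined score. All function names and the `weights` parameter are hypothetical, and the fixed linear weighting is a stand-in for the trained SVR model described in the paper, not the authors' method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sentence_features(sentences, citation_contexts):
    """Return one (tf_score, citation_sim) pair per sentence.

    tf_score: mean document frequency of the sentence's tokens, scaled to [0, 1].
    citation_sim: cosine similarity to the pooled citation-context text.
    Both features are assumptions made for illustration.
    """
    doc_tf = Counter(t for s in sentences for t in tokenize(s))
    max_tf = max(doc_tf.values(), default=1)
    cite_bag = Counter(t for c in citation_contexts for t in tokenize(c))
    feats = []
    for s in sentences:
        toks = tokenize(s)
        tf_score = sum(doc_tf[t] for t in toks) / (len(toks) * max_tf) if toks else 0.0
        feats.append((tf_score, cosine(Counter(toks), cite_bag)))
    return feats


def rank_sentences(sentences, citation_contexts, weights=(0.5, 0.5)):
    """Rank sentence indices by a fixed linear score.

    The linear weights are a placeholder; the paper instead trains a
    support vector regression model to predict sentence importance.
    """
    feats = sentence_features(sentences, citation_contexts)
    scores = [sum(w * f for w, f in zip(weights, fv)) for fv in feats]
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
```

Calling `rank_sentences(sentences, citation_contexts)` returns sentence indices ordered from most to least important under these assumed features. In a fuller pipeline the feature pairs from `sentence_features` would be the inputs to a trained regressor rather than a hand-set weighting.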