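Xu Qi (许琦). A HowNet-Based Semantic Modeling Method of Document (一种基于知网的文档语义模型构建方法) [J]. 中国科技资源导刊 (China Science & Technology Resources Review), 2010, (4): 55-60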
A HowNet-Based Semantic Modeling Method of Document
Keywords: semantic disambiguation; HowNet; vector space model; similarity calculation
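Funding: Zhejiang Provincial Funding Program for Outstanding Young University Teachers, project "Research and Application of a Knowledge Information Service Platform for Multi-Terminal Devices"; Scientific Research Project of the Zhejiang Provincial Department of Education (Y200909672); Key Institutional Project of Taizhou Vocational and Technical College (2010ZD03).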
Abstract:
This article proposes a method for building a document semantic model based on the semantic knowledge base HowNet and vector space model theory. It discusses the characteristics of HowNet's knowledge description scheme and proposes a sliding-window semantic disambiguation algorithm. The algorithm uses HowNet's sememe hierarchy to process the document model semantically: it determines the sense of each keyword from its context and converts the model's feature terms from keywords into keyword senses. This largely resolves the model deviation caused by semantic relations in natural language such as synonymy, near-synonymy, and hypernymy-hyponymy. Document similarity is then obtained as a weighted combination of sense similarities. Experiments show that the method describes document features well, achieves good clustering results, and is practical.
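The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea under stated assumptions: the sememe tree, the sense inventory, the constant `ALPHA`, the distance-based similarity `ALPHA / (distance + ALPHA)`, and the symmetric best-match weighting are all illustrative choices, not the author's method or real HowNet data. It shows a sliding window picking the sense of each keyword from its neighbours, and a document similarity computed from weighted sense similarities.

```python
from collections import Counter

# Hypothetical toy sememe tree (child -> parent); NOT real HowNet data.
PARENT = {
    "entity": None,
    "social": "entity", "natural": "entity",
    "institution": "social", "money": "social",
    "land": "natural", "water": "natural",
}

# Hypothetical sense inventory: word -> {sense id: primary sememe}.
SENSES = {
    "bank": {"bank#finance": "institution", "bank#river": "land"},
    "loan": {"loan#1": "money"},
    "deposit": {"deposit#1": "money"},
    "river": {"river#1": "water"},
    "shore": {"shore#1": "land"},
}
SEMEME_OF = {s: m for senses in SENSES.values() for s, m in senses.items()}
ALPHA = 1.6  # assumed tuning constant for the distance-based similarity


def ancestors(sememe):
    """Return the path from a sememe up to the root of the toy tree."""
    path = []
    while sememe is not None:
        path.append(sememe)
        sememe = PARENT[sememe]
    return path


def sense_similarity(s1, s2):
    """Similarity of two senses from the tree distance of their sememes."""
    pa, pb = ancestors(SEMEME_OF[s1]), ancestors(SEMEME_OF[s2])
    # The first shared node along pa is the lowest common ancestor.
    distance = next(i + pb.index(n) for i, n in enumerate(pa) if n in pb)
    return ALPHA / (distance + ALPHA)


def disambiguate(tokens, window=2):
    """Pick one sense per known token using neighbours inside the window.

    Tokens missing from SENSES are skipped in the output.
    """
    chosen = []
    for i, word in enumerate(tokens):
        if word not in SENSES:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi)
                   if j != i and tokens[j] in SENSES]

        def score(sense):
            # Sum, over context words, of the best similarity to any of
            # that word's candidate senses.
            return sum(max(sense_similarity(sense, c) for c in SENSES[w])
                       for w in context)

        chosen.append(max(SENSES[word], key=score))
    return chosen


def doc_vector(tokens):
    """Sense-level document vector with normalized frequency weights."""
    counts = Counter(disambiguate(tokens))
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def doc_similarity(va, vb):
    """Symmetric weighted best-match similarity of two sense vectors."""
    if not va or not vb:
        return 0.0

    def directed(x, y):
        return sum(w * max(sense_similarity(s, t) for t in y)
                   for s, w in x.items())

    return 0.5 * (directed(va, vb) + directed(vb, va))


finance_doc = ["bank", "loan", "deposit"]
river_doc = ["river", "bank", "shore"]
print(disambiguate(finance_doc))
print(disambiguate(river_doc))
print(doc_similarity(doc_vector(finance_doc), doc_vector(river_doc)))
```

In this toy setup the ambiguous word "bank" resolves to a different sense in each document, so the two documents end up with low similarity even though they share a surface keyword. A plain keyword-based vector space model would count that shared keyword as a match.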