Argument Analysis of Subcategorization Based on a Weighted Subsequence Kernel Function

Zhu Conghui; Zhao Tiejun; Han Xiwu; Zheng Dequan

Article abstract

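Zhu Conghui, Zhao Tiejun, Han Xiwu, Zheng Dequan. Arguments analysis of Chinese verb subcategorization based on weighted gap subsequence kernel function [J]. Chinese High Technology Letters (高技术通讯), 2010, 20(2): 127-132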

Arguments analysis of Chinese verb subcategorization based on weighted gap subsequence kernel function

Keywords: Chinese verb subcategorization frame (SCF), argument analysis, active learning strategies, weighted gap subsequence kernel

Funding: Supported by the National Natural Science Foundation of China (60773069, 60973169)

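Author	Affiliation
Zhu Conghui	MOE-Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology
Zhao Tiejun	MOE-Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology
Han Xiwu	School of Computer Science and Technology, Heilongjiang University
Zheng Dequan	MOE-Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology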

Abstract:

To improve the analysis of Chinese verb subcategorization frames (SCFs), this paper proposes a new method for subcategorization argument analysis. The method introduces a kernel function based on gap-weighted subsequences and recasts rule derivation as a machine learning problem: the left-hand side of each traditional rule serves as the input (training sample) and the right-hand side as its class label. Because the gap-weighted subsequence kernel captures dependencies between words that are not adjacent, the kernel-based method raises the precision of argument type analysis from 55.16% for the rule-based method to 93.43% on syntactically noisy data. The precision of whole-sentence subcategorization acquisition also improves substantially.
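The gap-weighted subsequence kernel named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard dynamic-programming formulation of that kernel, with a decay factor `lam` and a subsequence length `n`. The paper's actual weighting scheme, features, and parameter values are not stated in this abstract, and the tag sequences used below are hypothetical.

```python
import math


def gap_weighted_kernel(s, t, n, lam):
    """Sum, over all common subsequences of length n, of lam ** (span in s + span in t).

    s and t are sequences of hashable tokens (characters, words, or tags).
    Assumption: this is the textbook gap-weighted subsequence kernel, not
    necessarily the exact variant used in the paper.
    """
    len_s, len_t = len(s), len(t)
    # kp[i][a][b] holds the auxiliary value K'_i for the prefixes s[:a], t[:b].
    kp = [[[0.0] * (len_t + 1) for _ in range(len_s + 1)] for _ in range(n)]
    for a in range(len_s + 1):
        for b in range(len_t + 1):
            kp[0][a][b] = 1.0
    for i in range(1, n):
        for a in range(1, len_s + 1):
            running = 0.0  # K''_i: matches ending at s[a-1], decayed along t
            for b in range(1, len_t + 1):
                match = kp[i - 1][a - 1][b - 1] if s[a - 1] == t[b - 1] else 0.0
                running = lam * running + lam * lam * match
                kp[i][a][b] = lam * kp[i][a - 1][b] + running
    total = 0.0
    for a in range(1, len_s + 1):
        for b in range(1, len_t + 1):
            if s[a - 1] == t[b - 1]:
                total += lam * lam * kp[n - 1][a - 1][b - 1]
    return total


def normalized_kernel(s, t, n, lam):
    """Cosine-style normalization so that a sequence compared with itself scores 1."""
    denom = math.sqrt(gap_weighted_kernel(s, s, n, lam) * gap_weighted_kernel(t, t, n, lam))
    return gap_weighted_kernel(s, t, n, lam) / denom if denom > 0 else 0.0


# Hypothetical tag sequences standing in for rule left-hand sides.
seq_a = ["NP", "ADV", "V", "NP"]
seq_b = ["NP", "V", "NP", "PP"]
similarity = normalized_kernel(seq_a, seq_b, n=2, lam=0.5)
```

Because the extra `ADV` token in `seq_a` only lengthens the span of the shared `NP ... V` subsequence, the match is down-weighted rather than lost, which is the cross-distance behaviour the abstract refers to. A matrix of such kernel values could then be passed to any classifier that accepts a precomputed kernel, with the rule right-hand sides as class labels.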
