Semantic clustering of Web resources based on social annotation

Yang Kun; Ma Huifang; Shi Zhongzhi

Article abstract

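Yang Kun, Ma Huifang, Shi Zhongzhi. Semantic clustering of Web resources based on social annotation [J]. 高技术通讯 (Chinese High Technology Letters, Chinese edition), 2012, 22(1): 48-54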

Semantic clustering of web resources based on social annotation

Revised: 2010-05-24

DOI：

Keywords: social annotation, semantic extraction, semantic clustering algorithm, general correlation

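Funding: Supported by the 863 Program (2007AA01Z132), the National Natural Science Foundation of China (60435010), the 973 Program (2007CB311004), and the National Key Technology R&D Program of China (2006BAC08B06)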

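Author	Affiliation
Yang Kun	Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences; Graduate University of the Chinese Academy of Sciences; National Institute of Metrology, China
Ma Huifang	Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences; Graduate University of the Chinese Academy of Sciences
Shi Zhongzhi	National Institute of Metrology, China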

Abstract:

Building on an in-depth analysis of the correlations among users, tags, and tagged Web resources in social annotation systems, this paper proposes an algorithm that derives semantic descriptions of Web resources from users' tags. Using these descriptions and their correlations with users, an iterative algorithm then clusters the Web resources in a social annotation system by their semantics. Because each iteration reinforces the agreement information among the resources being clustered, the algorithm can mitigate, to some extent, the data sparseness and performance problems that traditional clustering algorithms face. The study shows that analyzing the correlations in a Web resource's environment helps users better understand and work with that resource, especially when the resource's own features are insufficient or difficult to acquire.
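The abstract does not spell out either algorithm, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes annotations arrive as (user, tag, resource) triples, describes each resource by its tag counts, and then lets resource similarity and user similarity reinforce each other over a few rounds. The function names, the `alpha` weight, the number of rounds, and the threshold grouping step are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict
from math import sqrt


def tag_descriptions(annotations):
    """Aggregate (user, tag, resource) triples into a tag-count vector per resource."""
    desc = defaultdict(Counter)
    for _user, tag, resource in annotations:
        desc[resource][tag] += 1
    return dict(desc)


def cosine(a, b):
    """Cosine similarity between two sparse tag-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def _mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def reinforced_similarity(annotations, rounds=3, alpha=0.5):
    """Assumed scheme: blend tag similarity with the similarity of the annotating users."""
    desc = tag_descriptions(annotations)
    resources = sorted(desc)
    users_of = defaultdict(set)      # resource -> users who tagged it
    resources_of = defaultdict(set)  # user -> resources they tagged
    for user, _tag, resource in annotations:
        users_of[resource].add(user)
        resources_of[user].add(resource)
    users = sorted(resources_of)

    base = {(r, s): cosine(desc[r], desc[s]) for r in resources for s in resources}
    res_sim = dict(base)
    for _ in range(rounds):
        # Two users are similar when the resources they tagged are similar.
        user_sim = {
            (u, v): 1.0 if u == v else _mean(
                res_sim[r, s] for r in resources_of[u] for s in resources_of[v])
            for u in users for v in users
        }
        # Two resources are similar when their tags and their users are similar.
        res_sim = {
            (r, s): 1.0 if r == s else alpha * base[r, s] + (1 - alpha) * _mean(
                user_sim[u, v] for u in users_of[r] for v in users_of[s])
            for r in resources for s in resources
        }
    return resources, res_sim


def cluster(resources, sim, threshold=0.4):
    """Single-link grouping: merge resources whose similarity reaches the threshold."""
    label = {r: r for r in resources}
    for r in resources:
        for s in resources:
            if r < s and sim[r, s] >= threshold:
                old, new = label[s], label[r]
                for k in label:
                    if label[k] == old:
                        label[k] = new
    groups = defaultdict(list)
    for r in resources:
        groups[label[r]].append(r)
    return sorted(groups.values())


# Toy data: r3 and r4 share no tags, but the same user annotated both.
annotations = [
    ("u1", "python", "r1"), ("u1", "code", "r1"),
    ("u1", "python", "r2"), ("u2", "code", "r2"),
    ("u3", "recipe", "r3"), ("u3", "cooking", "r4"),
]
resources, sim = reinforced_similarity(annotations)
clusters = cluster(resources, sim)  # [['r1', 'r2'], ['r3', 'r4']]
```

In this toy data, r3 and r4 have a tag-only cosine similarity of zero, yet they end up in the same group because the user correlation contributes to their similarity. This is one simple way that correlation information can compensate for sparse tag data.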