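Article abstract
Zeng Hong (曾洪), Song Aiguo (宋爱国), Lu Wei (卢伟). Research on robust semi-supervised clustering algorithm based on the maximum margin principle and pairwise constraints [J]. 高技术通讯 (Chinese edition), 2013, 23(1):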
Research on a robust semi-supervised clustering algorithm based on the maximum margin principle and pairwise constraints
Revised: 2012-02-16
DOI:
Keywords: semi-supervised clustering, pairwise constraints, maximum margin principle, robust loss function, constrained concave-convex procedure (CCCP)
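Funding: Supported by the National Natural Science Foundation of China (61105048, 60972165, 51175080), the Doctoral Program Foundation of the Ministry of Education (20100092120012, 20110092120034), the Ministry of Personnel's merit-based funding for science and technology activities of returned overseas scholars (6722000008), and the Natural Science Foundation of Jiangsu Province (BK2010240, BK2010423).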
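Authors and affiliations
Zeng Hong (曾洪), School of Instrument Science and Engineering, Southeast University, Nanjing
Song Aiguo (宋爱国), School of Instrument Science and Engineering, Southeast University, Nanjing
Lu Wei (卢伟), School of Instrument Science and Engineering, Southeast University, Nanjing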
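Abstract (translated from the Chinese):
      Existing semi-supervised maximum margin clustering algorithms struggle to improve clustering accuracy when many samples from different classes are very similar. To address this, the following strategy is proposed. First, a robust pairwise-constraint loss function is designed based on the maximum margin principle; even when different classes contain many very similar samples, this function can still effectively detect clustering results that violate the pairwise constraints and penalize them accordingly, which improves clustering performance. Second, an iterative algorithm based on the constrained concave-convex procedure (CCCP) is designed to solve the resulting problem. Building on this strategy, a new clustering algorithm is proposed: the robust pairwise constrained maximum margin clustering (RPCMMC) algorithm. Experimental results show that the algorithm effectively overcomes the shortcomings of existing semi-supervised maximum margin clustering algorithms, and that its clustering error rate is clearly lower than that of traditional semi-supervised clustering algorithms.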
Abstract (English):
      Existing semi-supervised maximum margin clustering algorithms do not work robustly when many very similar samples exist in different categories. To address this problem, this study adopts the following strategy. First, a robust loss function for violations of the pairwise constraints is designed based on the maximum margin principle, so that such violations are penalized robustly. Second, an iterative algorithm based on the constrained concave-convex procedure (CCCP) is designed to solve the resulting problem and improve clustering accuracy. Based on this strategy, a new semi-supervised clustering algorithm, the robust pairwise constrained maximum margin clustering (RPCMMC) algorithm, is put forward. The experimental results demonstrate that the proposed algorithm overcomes the drawbacks of existing semi-supervised maximum margin clustering algorithms and outperforms some representative semi-supervised clustering algorithms.
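
The abstract names CCCP as the solver but gives no details, so the following is only a minimal sketch of the general concave-convex idea on a toy one-dimensional problem, not the RPCMMC algorithm or its loss. The objective `x**4 - 2*x**2`, the function names, and the iteration count are all assumptions chosen for illustration, and the toy problem is unconstrained. At each step the concave part is replaced by its tangent at the current point, and the remaining convex surrogate is minimized exactly.

```python
import math


def objective(x):
    """Toy objective (an assumption): convex part x**4 plus concave part -2*x**2."""
    return x**4 - 2.0 * x**2


def cccp_minimize(x0, iterations=50):
    """Run the concave-convex procedure on the toy objective, starting from x0."""
    x = x0
    for _ in range(iterations):
        # Linearize the concave part -2*x**2 at the current point: its slope is -4*x.
        slope = -4.0 * x
        # Minimize the convex surrogate u**4 + slope*u over u.
        # Setting the derivative 4*u**3 + slope to zero gives u**3 = -slope / 4.
        target = -slope / 4.0
        # Real cube root that preserves the sign of target.
        x = math.copysign(abs(target) ** (1.0 / 3.0), target)
    return x


if __name__ == "__main__":
    x_star = cccp_minimize(0.1)
    print(x_star, objective(x_star))
```

Starting from a positive point the iterates move toward the local minimum at x = 1, and from a negative point toward x = -1; each step does not increase the objective, which is the property that makes this kind of procedure attractive for non-convex losses.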