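Article abstract
Zhan Kang, Wang Yiwen, He Xiongxiong. Density peak clustering algorithm based on data similarity and gravity theory [J]. High Technology Letters (Chinese edition), 2023, 33(1): 88-96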
Density peak clustering algorithm based on data similarity and gravity theory
  
DOI: 10.3772/j.issn.1002-0470.2023.01.009
Keywords: clustering analysis; density peak; data similarity; gravity theory; cluster merging
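Authors and affiliations
Zhan Kang (College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023)
Wang Yiwen (College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023)
He Xiongxiong (College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023)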
Abstract:
      To address the parameter sensitivity, algorithmic discontinuity, and cluster fragmentation of the density peak clustering (DPC) algorithm, this paper proposes SLDPC, a density peak clustering algorithm based on L1-norm data similarity and gravity theory. The algorithm determines local density from data similarity, applies gravity theory to enlarge the difference between cluster centers and non-center data points, selects cluster centers automatically by setting a threshold, and merges fragmented clusters with a merging strategy based on edge distribution. Experiments on 16 data sets compare SLDPC with DPC, K-means, density-based spatial clustering of applications with noise (DBSCAN), and improved DPC algorithms. The results show that the proposed method achieves excellent clustering accuracy and good stability.
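For orientation, the baseline DPC procedure that the abstract takes as its starting point can be sketched as follows. This is not SLDPC: the abstract gives no formulas for the similarity-based density, the gravity step, or the edge-based merging, so none of them is reproduced here. The Gaussian density kernel, the parameter names `d_c` and `gamma_threshold`, and the rule of thresholding the product of density and distance are assumptions chosen for illustration.

```python
import math


def dpc(points, d_c, gamma_threshold):
    """Baseline density peak clustering sketch (not the paper's SLDPC).

    points: list of coordinate tuples.
    d_c: kernel width for the local density (assumed parameter).
    gamma_threshold: points with rho * delta above this become centers (assumed rule).
    Returns (labels, centers), where centers holds point indices.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density with a Gaussian kernel; subtract the self-term exp(0) = 1.
    rho = [sum(math.exp(-(dist[i][j] / d_c) ** 2) for j in range(n)) - 1.0
           for i in range(n)]

    # Visit points from densest to sparsest.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point gets the largest distance to any other point.
            delta[i] = max(dist[i])
            continue
        # Nearest point among those ranked as denser.
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers combine high density with a large distance to any denser point.
    centers = [i for i in order if rho[i] * delta[i] > gamma_threshold]
    if order[0] not in centers:
        centers.insert(0, order[0])  # the densest point always seeds a cluster

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Each remaining point inherits the label of its nearest denser neighbor,
    # which has already been labeled because of the visiting order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


if __name__ == "__main__":
    blob_a = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
    blob_b = [(10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
    labels, centers = dpc(blob_a + blob_b, d_c=1.0, gamma_threshold=5.0)
    print(labels, centers)
```

Both `d_c` and `gamma_threshold` have to be chosen by hand in this sketch, and a single assignment pass can split one natural cluster into several; these correspond to the parameter sensitivity and fragmentation issues the abstract says SLDPC targets.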