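Sun Peng, Han Chengde, Zeng Tao. S-DBSCAN: an algorithm for finding high density clusters based on DBSCAN [J]. Chinese High Technology Letters (高技术通讯, Chinese edition), 2012, 22(6): 589-595 |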
S-DBSCAN: an algorithm for finding high density clusters based on DBSCAN |
Revised: 2011-02-25 |
DOI: |
Keywords: density-based spatial clustering of applications with noise (DBSCAN), S-DBSCAN, high-density clusters, clustering, variable parameters |
Funding: supported by the 863 Program (2009AA12Z220, 2009AA12Z226) |
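Author | Affiliation |
Sun Peng | Graduate University of Chinese Academy of Sciences; Institute of Computing Technology, Chinese Academy of Sciences |
Han Chengde | Institute of Computing Technology, Chinese Academy of Sciences |
Zeng Tao | College of Computer and Information Engineering, Tianjin Normal University |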
Abstract: |
When the density-based spatial clustering of applications with noise (DBSCAN) algorithm is used for interactive data mining, users frequently adjust its parameters to discover knowledge of interest, while the data set itself stays relatively stable. Exploiting these two characteristics, this paper proposes S-DBSCAN, an algorithm based on DBSCAN for finding high-density clusters. It identifies the parameters to be adjusted: ε (Eps), the neighborhood radius of an object, and MinPts, the minimum number of objects that an ε-neighborhood must contain for its center to be a core object. The paper describes three ways of varying ε and MinPts that are suited to S-DBSCAN, proves their correctness, and analyzes the algorithm's time complexity. In tests on real and synthetic data sets, S-DBSCAN was more efficient than DBSCAN. |
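The roles of ε and MinPts can be illustrated with a minimal Python sketch of DBSCAN's core-object test. This is not the S-DBSCAN algorithm, whose details the abstract does not give; the helper names `eps_neighborhood` and `core_objects` and the sample points are hypothetical. The sketch only shows a general property of the DBSCAN definitions: with the data fixed, shrinking ε or raising MinPts can never create new core objects, so the core objects under stricter parameters are a subset of those found under looser ones.

```python
from math import dist  # Euclidean distance, Python 3.8+


def eps_neighborhood(points, i, eps):
    """Indices of all points within distance eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def core_objects(points, eps, min_pts):
    """Indices of points whose eps-neighborhood holds at least min_pts points."""
    return {
        i
        for i in range(len(points))
        if len(eps_neighborhood(points, i, eps)) >= min_pts
    }


# Hypothetical data: four corners of a unit square, its center, and one outlier.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (5.0, 5.0)]

loose = core_objects(points, eps=1.1, min_pts=3)       # baseline parameters
smaller_eps = core_objects(points, eps=0.8, min_pts=3)  # only eps tightened
larger_min = core_objects(points, eps=1.1, min_pts=5)   # only MinPts tightened
both = core_objects(points, eps=0.8, min_pts=5)         # both tightened

# Every stricter setting yields a subset of the baseline core objects.
print(loose, smaller_eps, larger_min, both)
print(smaller_eps <= loose, larger_min <= loose, both <= loose)
```

With this data the baseline marks the four corners and the center as core objects, while each stricter setting keeps only the center, which is the densest point.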