Distance based k-means clustering algorithm for determining number of clusters for high dimensional data

Decision Science Letters 9 (2020) 51-58

Mohamed Cassim Alibuhtto (a) and Nor Idayu Mahat (b)

(a) Department of Mathematical Sciences, Faculty of Applied Sciences, South Eastern University of Sri Lanka, Sri Lanka
(b) Department of Mathematics and Statistics, School of Quantitative Sciences, Universiti Utara Malaysia, Malaysia

CHRONICLE
Article history:
Received March 23, 2019
Received in revised format August 12, 2019
Accepted August 12, 2019
Available online August 12, 2019

Keywords: Clustering; High Dimensional Data; K-means algorithm; Optimal Cluster; Simulation

ABSTRACT
Clustering is one of the most common unsupervised data mining classification techniques for splitting objects into a set of meaningful groups. However, the traditional k-means algorithm is not applicable for retrieving useful information clusters, particularly when there is an overwhelming growth of multidimensional data. Therefore, it is necessary to introduce a new strategy to determine the optimal number of clusters. To improve the clustering task on high dimensional data sets, the distance based k-means algorithm is proposed. The proposed algorithm is tested using eighteen sets of normal and non-normal multivariate simulation data under various combinations. Evidence gathered from the simulation reveals that the proposed algorithm is capable of identifying the exact number of clusters.

© 2020 by the authors; licensee Growing Science, Canada.

1. Introduction

The amount of data collected daily is increasing, but only part of the data can be used to extract valuable information. This has led to data mining, a process of extracting interesting and useful information in the form of relations and .
