Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 52

Knowledge discovery demonstrates intelligent computing at its best and is the most desirable and interesting end product of information technology. Discovering and extracting knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. A great deal of hidden knowledge is waiting to be discovered; this is the challenge created by today's abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

Swagatam Das and Ajith Abraham (page 490)

Consider the following example.

Example 2. The positional coordinates of one particular particle are illustrated below. There are at most five 3-dimensional cluster centers, of which, according to the rule presented in Equation [...], the second (6 7), the third (5) and the fifth (8 4 4) have been activated for partitioning the dataset and marked in bold. The quality of the partition yielded by such a particle can be judged by an appropriate cluster validity index.

[Particle illustration: activation thresholds T followed by the cluster-center coordinates; surviving values: 6 7 5 8 8 4]

During the PSO iterations, if some threshold T in a particle exceeds 1 or becomes negative, it is fixed to 1 or zero, respectively. However, if no flag can be set to one in a particle (all activation thresholds are smaller than [...]), we randomly select 2 thresholds and re-initialize them to a random value between [...] and [...]. Thus the minimum number of possible clusters is 2.

The Fitness Function

One advantage of the proposed algorithm is that it can use any suitable validity index as its fitness function. We have used the kernelized CS measure as the basis of our fitness function, which for the i-th particle can be described as

$f_i = \frac{1}{CS_{kernel_i}(k) + eps}$

where eps is a very small constant (we used [...]).
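The threshold repair, center activation and fitness steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the activation cutoff and the re-initialization range are not recoverable from this extract and are therefore passed in as the parameter `cutoff` (re-initialized values are drawn between `cutoff` and 1), a center is treated as active when its threshold is at least `cutoff`, and the kernelized CS value is assumed to be computed elsewhere and supplied as `cs_value`.

```python
import numpy as np


def repair_thresholds(thresholds, cutoff, rng):
    """Clamp activation thresholds to [0, 1] and guarantee two active flags.

    `cutoff` is an assumed parameter; the source does not preserve its value.
    """
    t = np.clip(np.asarray(thresholds, dtype=float), 0.0, 1.0)
    if not np.any(t >= cutoff):
        # No center would be activated: pick 2 thresholds at random and
        # re-initialize them (assumed range: between cutoff and 1).
        idx = rng.choice(t.size, size=2, replace=False)
        t[idx] = rng.uniform(cutoff, 1.0, size=2)
    return t


def active_centers(thresholds, centers, cutoff):
    """Return the rows of `centers` whose threshold is at least `cutoff`."""
    mask = np.asarray(thresholds, dtype=float) >= cutoff
    return np.asarray(centers, dtype=float)[mask]


def fitness(cs_value, eps):
    """Fitness f_i = 1 / (CS + eps); a smaller CS value gives a larger fitness."""
    return 1.0 / (cs_value + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Illustrative values only (not taken from the source): five thresholds
    # chosen so that the second, third and fifth centers are activated.
    thresholds = repair_thresholds([0.3, 0.6, 0.8, 0.1, 0.9], cutoff=0.5, rng=rng)
    centers = rng.uniform(0.0, 10.0, size=(5, 3))  # five 3-dimensional centers
    print(active_centers(thresholds, centers, cutoff=0.5))
```

Passing `cutoff` and `eps` explicitly keeps the unknown constants visible at every call site instead of hiding guessed defaults inside the functions.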
Maximization of $f_i$ implies a minimization of the kernelized CS measure, leading to the optimal partitioning of the dataset.

Avoiding Erroneous Particles with Empty Clusters or Unreasonable Fitness Evaluation

There is a possibility that, in our scheme, a division by zero may be encountered during computation of the kernelized CS index. For example, a positive-infinity condition (such as [...]) or a not-a-number condition (such as [...]) always occurs when one of the selected cluster centers lies far outside the boundary of the distribution of the dataset. To avoid this problem, we first check whether, in any particle, any cluster has fewer than 2 data points in it. If so, the cluster center [...]
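The check for clusters with fewer than 2 data points can be sketched as below. This is only a sketch of the detection step under stated assumptions: the function name `undersized_clusters` is hypothetical, points are assigned to the nearest active center by plain Euclidean distance (the chapter works with a kernelized measure, so the actual assignment rule may differ), and the repair applied to an offending center is not shown because the text breaks off before describing it.

```python
import numpy as np


def undersized_clusters(data, centers, min_points=2):
    """Return indices of centers that attract fewer than `min_points` points.

    Assumption: nearest-center assignment with Euclidean distance.
    """
    data = np.asarray(data, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Pairwise distances, shape (n_points, n_centers).
    dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Count points per center; centers that attract no point get a count of 0.
    counts = np.bincount(labels, minlength=len(centers))
    return np.flatnonzero(counts < min_points)


if __name__ == "__main__":
    # Illustrative data: the third center lies far outside the data range.
    points = [[0, 0], [0, 1], [1, 0], [10, 10]]
    candidate_centers = [[0, 0], [10, 10], [100, 100]]
    print(undersized_clusters(points, candidate_centers))
```

A particle for which this function returns a non-empty array would be flagged before its fitness is evaluated, which avoids the division-by-zero case described above.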
