Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 17

Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 17. Knowledge discovery demonstrates intelligent computing at its best and is the most desirable and interesting end-product of information technology. Discovering and extracting knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a great deal of hidden knowledge waiting to be discovered; this is the challenge created by today's abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges, and applications of data mining (DM) and knowledge discovery.

Excerpt (Lior Rokach and Oded Maimon, p. 140):

... classes. Stratified random subsampling with a paired t-test is used herein to evaluate accuracy.

Computational Complexity

Another useful criterion for comparing inducers and classifiers is their computational complexity. Strictly speaking, computational complexity is the amount of CPU consumed by each inducer. It is convenient to differentiate between three metrics of computational complexity:

1. Computational complexity for generating a new classifier: this is the most important metric, particularly when there is a need to scale the data mining algorithm to massive data sets. Because most algorithms have a computational complexity that is worse than linear in the number of tuples, mining massive data sets may be prohibitively expensive.
2. Computational complexity for updating a classifier: given new data, what is the computational complexity required to update the current classifier so that the new classifier reflects the new data?
3. Computational complexity for classifying a new instance: this type is generally neglected because it is relatively small. However, in certain methods, such as k-nearest neighbors, or in certain real-time applications, such as anti-missile systems, it can be critical.

Comprehensibility

The comprehensibility criterion, also known as interpretability, refers to how well humans grasp the induced classifier. While the generalization error measures how well the classifier fits the data, comprehensibility measures the mental fit of that classifier. Many techniques, such as neural networks and support vector machines, are designed solely to achieve accuracy. However, because their classifiers are represented by large assemblages of real-valued parameters, they are difficult to understand and are referred to as black-box models. It is often important for the researcher to be able to inspect an induced classifier. For domains such as medical diagnosis, the users must understand how the system makes its decisions.
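The stratified random subsampling with a paired t-test mentioned at the start of the excerpt can be sketched briefly. This is a minimal illustration, assuming scikit-learn and SciPy are available; the breast-cancer data set, the decision-tree and naive-Bayes inducers, and the ten repetitions are illustrative choices, not part of the handbook's text.

```python
# Minimal sketch: stratified random subsampling + paired t-test on accuracies.
# Data set, inducers, and number of splits are assumptions for illustration only.
from scipy.stats import ttest_rel
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Stratified random subsampling: repeated train/test splits that preserve
# the class proportions of the full data set.
splitter = StratifiedShuffleSplit(n_splits=10, test_size=0.3, random_state=0)

acc_tree, acc_nb = [], []
for train_idx, test_idx in splitter.split(X, y):
    tree = DecisionTreeClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    nb = GaussianNB().fit(X[train_idx], y[train_idx])
    acc_tree.append(tree.score(X[test_idx], y[test_idx]))
    acc_nb.append(nb.score(X[test_idx], y[test_idx]))

# Paired t-test: the two inducers are compared on the same splits.
t_stat, p_value = ttest_rel(acc_tree, acc_nb)
print(f"tree acc: {sum(acc_tree) / len(acc_tree):.3f}, "
      f"NB acc: {sum(acc_nb) / len(acc_nb):.3f}, p = {p_value:.3f}")
```

Because both inducers are evaluated on identical splits, the paired test compares per-split accuracy differences rather than two independent samples.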
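To see why the three complexity metrics diverge in practice, the following rough sketch times the cost of generating a classifier against the cost of classifying a single new instance, contrasting a decision tree with k-nearest neighbors. It assumes scikit-learn; the synthetic data set and the chosen classifiers are assumptions for the example, not methods prescribed by the handbook.

```python
# Minimal sketch: training cost vs. single-instance classification cost.
# Synthetic data and classifier choices are illustrative assumptions only.
import time
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=50_000, n_features=20, random_state=0)
new_instance = X[:1]  # stand-in for one new instance to be classified

for name, clf in [("decision tree", DecisionTreeClassifier(random_state=0)),
                  ("k-NN", KNeighborsClassifier(n_neighbors=5))]:
    t0 = time.perf_counter()
    clf.fit(X, y)                      # complexity of generating the classifier
    train_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    clf.predict(new_instance)          # complexity of classifying a new instance
    predict_time = time.perf_counter() - t0

    print(f"{name}: train {train_time:.3f}s, classify one instance {predict_time:.5f}s")
```

A lazy learner such as k-NN "trains" almost for free but pays at classification time, which is exactly the case the text flags as critical for real-time use.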
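As a counterpoint to black-box models, an induced decision tree can be inspected directly. The sketch below, again assuming scikit-learn, trains a shallow tree and prints it as if-then rules; the iris data set and the depth limit are illustrative choices used only to show the idea of "mental fit".

```python
# Minimal sketch: a comprehensible (inspectable) classifier.
# Data set and depth limit are assumptions for illustration only.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# The induced classifier reads as a small set of if-then rules, the kind of
# inspection that large assemblages of real-valued parameters do not allow.
print(export_text(tree, feature_names=list(iris.feature_names)))
```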
