Data Mining and Knowledge Discovery Handbook, 2nd Edition, Part 78

Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. To be able to discover and to extract knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a lot of hidden knowledge waiting to be discovered; this is the challenge created by today's abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

Gary M. Weiss

The learned decision boundaries are displayed in the figures using dashed lines. In one figure, the learned boundary is far off from the true boundary and excludes a substantial portion of P3. The inclusion of additional positive examples in another figure addresses the problem with absolute rarity and causes all of P3 to be covered, although some examples not belonging to P3 will be mistakenly assigned a positive label. A further figure, which includes additional positive and negative examples, corrects this last problem (the learned decision boundary nearly overlaps the true boundary and hence is not shown). These figures demonstrate that additional data can address the problem with absolute rarity. Of course, in practice it is not always possible to obtain additional training data.

Another problem associated with mining rare cases is reflected by the phrase "like a needle in a haystack." The difficulty is not so much due to the needle being small, or there being only one needle, but to the fact that the needle is obscured by a huge number of strands of hay. Similarly, in Data Mining, rare cases may be obscured by common cases (relative rarity).
This is especially a problem when Data Mining algorithms rely on greedy search heuristics that examine one variable at a time, since rare cases may depend on the conjunction of many conditions, and any single condition in isolation may not provide much guidance. As a specific example of the problem with relative rarity, consider the association rule mining problem described earlier, where we want to be able to detect the association between mop and broom. Because this association occurs rarely, it can only be found if the minimum support (minsup) threshold, the number of times the association is found in the data, is set very low. However, setting this threshold low would cause a combinatorial explosion because frequently occurring items will be associated with one another in an ...
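The trade-off around minsup can be illustrated with a small sketch. The code below is not the chapter's algorithm: it is a brute-force itemset counter over hypothetical toy baskets, and the item names other than mop and broom, the basket counts, the `max_size` cap, and the function name `frequent_itemsets` are all assumptions made for illustration. It only shows that the mop and broom association is missed at a moderate threshold and that lowering the threshold enough to find it also admits many more itemsets built from common items.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, minsup, max_size=3):
    """Brute-force count of itemsets (up to max_size items) and
    return those appearing in at least `minsup` transactions."""
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= minsup}


# Hypothetical toy data: six common items, bought three at a time.
common = ["bread", "milk", "eggs", "butter", "cheese", "tea"]
transactions = []
for trio in combinations(common, 3):
    transactions.extend([list(trio)] * 5)  # each trio appears in 5 baskets

# The rare case: mop and broom are bought together only three times.
transactions += [["mop", "broom"]] * 3 + [["mop"], ["broom"]]

high = frequent_itemsets(transactions, minsup=20)  # moderate threshold
low = frequent_itemsets(transactions, minsup=3)    # very low threshold

rare = frozenset({"mop", "broom"})
print("minsup=20:", len(high), "itemsets; mop+broom found:", rare in high)
print("minsup=3: ", len(low), "itemsets; mop+broom found:", rare in low)
```

In this toy data the number of reported itemsets roughly doubles when the threshold drops from 20 to 3, even with only six common items and itemsets capped at three items. With realistic item counts and no size cap, the growth would be far steeper, which is the combinatorial explosion the text describes.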
