Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 15

Knowledge discovery demonstrates intelligent computing at its best and is the most desirable and interesting end-product of information technology. Discovering and extracting knowledge from data is a task that many researchers and practitioners are striving to accomplish: a great deal of hidden knowledge is waiting to be discovered, and this is the challenge created by today's abundance of data. The Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

Excerpt (Irad Ben-Gal, p. 120):

Traditionally, μn and σn are estimated by the sample mean, x̄n, and the sample standard deviation, Sn, respectively. Since these estimates are highly affected by the presence of outliers, many procedures replace them with other, more robust estimates that are discussed in a later section. The multiple-comparison correction is used when several statistical tests are performed simultaneously. While a given α-value may be appropriate for deciding whether a single observation lies in the outlier region (i.e., a single comparison), this is not the case for a set of several comparisons. In order to avoid spurious positives, the α-value needs to be lowered to account for the number of comparisons performed. The simplest and most conservative approach is Bonferroni's correction, which keeps the α-value for the entire set of n comparisons equal to α by setting the α-value for each individual comparison to α/n. Another popular and simple correction uses αn = 1 − (1 − α)^(1/n). Note that the traditional Bonferroni method is quasi-optimal when the observations are independent, which is in most cases unrealistic. The critical value g(n, αn) is often specified by numerical procedures, such as Monte Carlo simulations, for different sample sizes (e.g., Davies and Gather, 1993).

Inward and Outward Procedures. Sequential identifiers can be further classified into inward and outward procedures. In inward testing, or forward-selection methods, at each step of the procedure the most extreme observation, i.e., the one with the largest outlyingness measure, is tested for being an outlier. If it is declared an outlier, it is deleted from the dataset and the procedure is repeated; if it is declared a non-outlying observation, the procedure terminates. Some classical examples of inward procedures can be found in Hawkins (1980) and Barnett and Lewis (1994). In outward testing procedures, the sample of observations is first reduced to a smaller sample (e.g., by a factor of two), while the removed observations …
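
As a concrete illustration of the two corrections above, the following sketch computes the per-comparison α under Bonferroni's rule (α/n) and under the alternative rule αn = 1 − (1 − α)^(1/n). The function name is illustrative, and the use of a Gaussian quantile for the critical value is an assumption made here for simplicity; the excerpt notes that g(n, αn) is usually obtained by Monte Carlo simulation (Davies and Gather, 1993).

```python
from statistics import NormalDist

def corrected_alphas(alpha, n):
    """Per-comparison significance levels for n simultaneous outlier tests."""
    bonferroni = alpha / n                          # Bonferroni's correction
    independent = 1.0 - (1.0 - alpha) ** (1.0 / n)  # alpha_n = 1 - (1 - alpha)^(1/n)
    return bonferroni, independent

alpha, n = 0.05, 100
bonf, alt = corrected_alphas(alpha, n)

# Two-sided normal critical values for each corrected alpha.  A Gaussian
# quantile is assumed here purely for illustration; in practice the critical
# value g(n, alpha_n) is often estimated by Monte Carlo simulation.
print(bonf, NormalDist().inv_cdf(1.0 - bonf / 2))
print(alt, NormalDist().inv_cdf(1.0 - alt / 2))
```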

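The inward (forward-selection) logic described in the excerpt can be sketched as below. The z-score outlyingness measure, the fixed critical value g, and all names are illustrative assumptions rather than the handbook's own formulation.

```python
import numpy as np

def inward_outlier_test(x, g):
    """Inward (forward-selection) testing: repeatedly test the most extreme
    observation; stop as soon as it is declared non-outlying.

    x : 1-D array of observations
    g : critical value for the outlyingness measure (e.g. from Monte Carlo)
    """
    data = np.asarray(x, dtype=float)
    outliers = []
    while data.size > 2:
        # Outlyingness measure: absolute z-score from the remaining sample.
        scores = np.abs(data - data.mean()) / data.std(ddof=1)
        idx = scores.argmax()
        if scores[idx] <= g:          # most extreme point is not an outlier
            break
        outliers.append(data[idx])    # declare outlier, delete it, repeat
        data = np.delete(data, idx)
    return outliers, data

# Example: a clean sample with two gross errors appended
sample = np.r_[np.random.default_rng(0).normal(0, 1, 50), [8.0, -7.5]]
found, cleaned = inward_outlier_test(sample, g=3.5)
```

Replacing the sample mean and standard deviation with more robust estimates, as the excerpt suggests, changes only the line that computes the scores.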