Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 74

Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. Discovering and extracting knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a lot of hidden knowledge waiting to be discovered: this is the challenge created by today's abundance of data. The Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

Vicenç Torra

... one, the results are similar. Some parameterizations of rank swapping (Rank with parameter p in the table) and microaggregation (Micmul with parameter k in the table) are ranked among the best algorithms both in Domingo-Ferrer and Torra (2001b) and here.

The comparison can be extended by evaluating new masking methods and comparing them with the existing scores. For example, the results from Jimenez and Torra (2009) would make it possible to include in this table, with a score lower than 40, some parameterizations of lossy compression using JPEG 2000.

R-U Maps

Duncan et al. (2001) and Duncan et al. (2004) propose R-U maps (Risk-Utility maps). An R-U map is a graphical representation of the two measures: R for risk and U for utility. The figure represents an R-U map for the methods listed in the previous section, each with several parameterizations. Namely, RankXXX corresponds to Rank Swapping, MicXXX are variations of Microaggregation, JPEGXXX corresponds to Lossy Compression using JPEG, and RemuestX is resampling (not described in this chapter). In the figure, DR corresponds to the Disclosure Risk (R, following the standard jargon of R-U maps) and IL to information loss (in our case computed as aPIL). Formally, IL and utility U are related as follows: 1 − U = IL.
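To make the relation between risk, information loss and utility concrete, the following is a minimal sketch of how points on such a map could be compared. It rests on several assumptions that this excerpt does not confirm: that DR and IL are both expressed on a 0 to 100 scale (so utility is obtained after dividing IL by 100), and that the score of a method is the plain average of DR and IL. The method names and numeric values are hypothetical and are not results from the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskedResult:
    """One point on a risk/information-loss map (all values hypothetical)."""
    name: str   # illustrative label, not a method from the chapter
    dr: float   # disclosure risk, assumed to lie in [0, 100]
    il: float   # information loss, assumed to lie in [0, 100]


def utility(il: float) -> float:
    """Utility in [0, 1], using U = 1 - IL with IL rescaled to [0, 1]."""
    return 1.0 - il / 100.0


def score(dr: float, il: float) -> float:
    """Assumed score: the plain average of DR and IL (lower is better)."""
    return 0.5 * dr + 0.5 * il


def rank_methods(results):
    """Sort results from best (lowest score) to worst."""
    return sorted(results, key=lambda r: score(r.dr, r.il))


# Hypothetical data, for illustration only.
results = [
    MaskedResult("method_c", dr=60.0, il=60.0),
    MaskedResult("method_a", dr=20.0, il=30.0),
    MaskedResult("method_b", dr=50.0, il=20.0),
]

for r in rank_methods(results):
    print(f"{r.name}: score={score(r.dr, r.il):.1f}, utility={utility(r.il):.2f}")
```

Under the averaging assumption, all points with the same score s lie on the straight line DR + IL = 2s, so a method closer to the origin has a lower score.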
Note that, in addition to the protection procedures represented in the table, the figure includes all the other methods analyzed in Domingo-Ferrer and Torra (2001b), but with the new measures (DR and aPIL) described above. In this figure, the lines represent scores of 50, 40, 30 and 20. Naturally, the nearer a method is to (0, 0), the better.

Conclusions

In this chapter we have reviewed the major topics concerning privacy in data mining. We have reviewed the major protection methods and discussed how to measure disclosure risk and information loss. Finally, some tools for visualizing such measures and for comparing the methods have been described.

Acknowledgements

Part of the research described in this chapter is supported by the Spanish MEC projects
