680 | Grigorios Tsoumakas, Ioannis Katakis and Ioannis Vlahavas

where $\delta(\lambda) = 1$ if $\lambda \notin Y_i$ and $0$ otherwise.

Coverage evaluates how far, on average, we need to go down the ranked list of labels in order to cover all the relevant labels of the example:

$$\mathit{Cov} = \frac{1}{m}\sum_{i=1}^{m} \max_{\lambda \in Y_i} r_i(\lambda) - 1$$

Ranking loss expresses the number of times that irrelevant labels are ranked higher than relevant labels:

$$\text{R-Loss} = \frac{1}{m}\sum_{i=1}^{m} \frac{1}{|Y_i|\,|\overline{Y_i}|} \left|\left\{ (\lambda_a,\lambda_b) : r_i(\lambda_a) > r_i(\lambda_b),\ (\lambda_a,\lambda_b) \in Y_i \times \overline{Y_i} \right\}\right|$$

where $\overline{Y_i}$ is the complementary set of $Y_i$ with respect to $L$.

Average precision evaluates the average fraction of labels ranked above a particular label $\lambda \in Y_i$ which actually are in $Y_i$:

$$\text{AvgPrec} = \frac{1}{m}\sum_{i=1}^{m} \frac{1}{|Y_i|} \sum_{\lambda \in Y_i} \frac{\left|\{\lambda' \in Y_i : r_i(\lambda') \le r_i(\lambda)\}\right|}{r_i(\lambda)}$$

Hierarchical. The hierarchical loss (Cesa-Bianchi et al., 2006b) is a modified version of the Hamming loss that takes into account an existing hierarchical structure of the labels. It examines the predicted labels in a top-down manner according to the hierarchy, and whenever the prediction for a label is wrong, the subtree rooted at that node is not considered further in the calculation of the loss. Let $\mathit{anc}(\lambda)$ be the set of all the ancestor nodes of $\lambda$. The hierarchical loss is defined as follows:

$$\text{H-Loss} = \frac{1}{m}\sum_{i=1}^{m} \left|\left\{ \lambda : \lambda \in Y_i \,\Delta\, Z_i,\ \mathit{anc}(\lambda) \cap (Y_i \,\Delta\, Z_i) = \emptyset \right\}\right|$$

where $Z_i$ is the predicted label set and $\Delta$ denotes symmetric difference. Several other measures for hierarchical multi-label classification are examined in Moskovitch et al. (2006) and Sun & Lim (2001).
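The measures above can be sketched in a few lines of code. This is a minimal illustration, not the chapter's own implementation: it assumes `rank[i]` is a dict mapping each label to its rank (1 = top-ranked), `Y[i]` and `Z[i]` are sets of labels, `L` is the full label set, and `anc` maps each label to the set of its ancestors — all of these data structures are illustrative choices.

```python
def coverage(rank, Y):
    """Mean of (max rank over an example's relevant labels) - 1, ranks start at 1."""
    return sum(max(rank[i][l] for l in Y[i]) - 1 for i in range(len(Y))) / len(Y)

def ranking_loss(rank, Y, L):
    """Mean fraction of (relevant, irrelevant) label pairs that are ordered wrongly."""
    total = 0.0
    for i, rel in enumerate(Y):
        irr = L - rel  # complementary set of Y_i with respect to L
        wrong = sum(1 for a in rel for b in irr if rank[i][a] > rank[i][b])
        total += wrong / (len(rel) * len(irr))
    return total / len(Y)

def average_precision(rank, Y):
    """Mean over examples and relevant labels of the fraction of labels
    ranked at or above that label which are themselves relevant."""
    total = 0.0
    for i, rel in enumerate(Y):
        total += sum(
            sum(1 for lp in rel if rank[i][lp] <= rank[i][l]) / rank[i][l]
            for l in rel
        ) / len(rel)
    return total / len(Y)

def hierarchical_loss(Y, Z, anc):
    """Count labels in the symmetric difference Y_i ^ Z_i whose ancestors
    are all predicted correctly (anc(l) disjoint from the difference),
    i.e. mistakes inside an already-wrong subtree are not counted again."""
    total = 0
    for yi, zi in zip(Y, Z):
        diff = yi ^ zi  # symmetric difference of true and predicted sets
        total += sum(1 for l in diff if not (anc[l] & diff))
    return total / len(Y)
```

For a single example with ranking a→1, c→2, b→3 and relevant labels {a, b}, coverage is 2, ranking loss is 0.5 (b is ranked below the irrelevant c), and average precision is 5/6 — matching the formulas term by term.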
Related Tasks. One of the most popular supervised learning tasks is multi-class classification, which involves a set of labels $L$ with $|L| > 2$. The critical difference with respect to multi-label classification is that each instance is associated with only one element of $L$, instead of a subset of $L$. Jin and Ghahramani (2002) call multiple-label problems the semi-supervised classification problems where each example is associated with more than one class, but only one of those classes is the true class of the example. This task is not as common in real-world applications as the one we are studying. Multiple-instance, or multi-instance, learning is a variation of supervised learning where labels are