Chapter 2: Data Mining
Faculty of Computer Science and Engineering, HCM City University of Technology, October 2010

Outline
Overview of data mining
Association rules
Classification
Regression
Clustering
Other data mining problems
Applications of data mining

DATA MINING
Data mining refers to the mining or discovery of new information, in the form of patterns or rules, from vast amounts of data. To be practically useful, data mining must be carried out efficiently on large files and databases. This chapter briefly reviews the state of the art of this extensive field. Data mining uses techniques from areas such as machine learning, statistics, neural networks, and genetic algorithms.

OVERVIEW OF DATA MINING
Data Mining as a Part of the Knowledge Discovery Process
Knowledge Discovery in Databases, abbreviated as KDD, encompasses more than data mining. The knowledge discovery process comprises six phases: data selection, data cleansing, enrichment, data transformation or encoding, data mining, and the reporting and display of the discovered information.

Example
Consider a transaction database maintained by a specialty consumer goods retailer. Suppose the client data includes a customer name, zip code, phone number, date of purchase, item code, price, quantity, and total amount. A variety of new knowledge can be discovered by KDD processing on this client database.

During data selection, data about specific items or categories of items, or from stores in a specific region or area of the country, may be selected. The data cleansing process may then correct invalid zip codes or eliminate records with incorrect phone prefixes. Enrichment enhances the data with additional sources of information. For example, given the client names and phone numbers, the store may purchase other data about age, income, and credit rating and append them to each record. Data transformation and encoding may be done to reduce the amount of data.

Example (cont.)
The result of