Chapter 11: Automatic Cluster Detection

Using Thematic Clusters to Adjust Zone Boundaries

The goal of the clustering project was to validate editorial zones that already existed. Each editorial zone consisted of a set of towns, each assigned to one of the four clusters described above. The next step was to manually increase each zone's purity by swapping towns with adjacent zones.

For example, the table of towns in the City and West 1 editorial zones shows that all of the towns in the City zone are in Cluster 1B except Brookline, which is in Cluster 2. In the neighboring West 1 zone, all the towns are in Cluster 2 except for Waltham and Watertown, which are in Cluster 1B. Swapping Brookline into West 1 and Watertown and Waltham into City would make both editorial zones pure, in the sense that all the towns in each zone would share the same cluster assignment. The new West 1 would be all Cluster 2 and the new City would be all Cluster 1B. As can be seen in the map in the accompanying figure, the new zones are still geographically contiguous.

Having editorial zones composed of similar towns makes it easier for the Globe to provide sharper editorial focus in its localized content, which should lead to higher circulation and better advertising sales.

Table: Towns in the City and West 1 Editorial Zones

TOWN         EDITORIAL ZONE   CLUSTER ASSIGNMENT
Brookline    City             2
Boston       City             1B
Cambridge    City             1B
Somerville   City             1B
Needham      West 1           2
Newton       West 1           2
Wellesley    West 1           2
Waltham      West 1           1B
Weston       West 1           2
Watertown    West 1           1B

Lessons Learned

Automatic cluster detection is an undirected data mining technique that can be used to learn about the structure of complex databases. By breaking complex datasets into simpler clusters, automatic clustering can be used to improve the performance of more directed techniques.
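As a small illustration of the zone adjustment described earlier in this section, the effect of the proposed swap can be checked with a few lines of Python. This is only a sketch: the town, zone, and cluster values are copied from the table of City and West 1 towns, but the numeric definition of purity used here (the share of a zone's towns that fall in the zone's most common cluster) is an assumption, since the text only says that a pure zone is one whose towns all share the same cluster. The names `TOWNS`, `zone_purity`, and `move_towns` are hypothetical helpers introduced for this example.

```python
from collections import Counter

# (editorial zone, cluster) for each town, copied from the table above.
TOWNS = {
    "Brookline": ("City", "2"),
    "Boston": ("City", "1B"),
    "Cambridge": ("City", "1B"),
    "Somerville": ("City", "1B"),
    "Needham": ("West 1", "2"),
    "Newton": ("West 1", "2"),
    "Wellesley": ("West 1", "2"),
    "Waltham": ("West 1", "1B"),
    "Weston": ("West 1", "2"),
    "Watertown": ("West 1", "1B"),
}


def zone_purity(assignments):
    """Return {zone: share of its towns in the zone's most common cluster}.

    Assumed definition: a value of 1.0 means every town in the zone
    shares one cluster, which is what the text calls a pure zone.
    """
    clusters_by_zone = {}
    for zone, cluster in assignments.values():
        clusters_by_zone.setdefault(zone, []).append(cluster)
    return {
        zone: Counter(clusters).most_common(1)[0][1] / len(clusters)
        for zone, clusters in clusters_by_zone.items()
    }


def move_towns(assignments, moves):
    """Return a copy of assignments with each town in moves put in a new zone."""
    updated = dict(assignments)
    for town, new_zone in moves.items():
        _, cluster = updated[town]  # the cluster does not change, only the zone
        updated[town] = (new_zone, cluster)
    return updated


# The swap proposed in the text: Brookline to West 1, Waltham and Watertown to City.
SWAP = {"Brookline": "West 1", "Waltham": "City", "Watertown": "City"}

before = zone_purity(TOWNS)
after = zone_purity(move_towns(TOWNS, SWAP))

print("before:", before)
print("after: ", after)
```

Under this assumed measure, City starts with three of its four towns in one cluster and West 1 with four of six, and both zones reach 1.0 after the swap. The check says nothing about geographic contiguity, which the text verifies separately on the map.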
By choosing different distance measures, automatic clustering can be applied to almost any kind of data. It is as easy to find clusters in collections of news stories or insurance claims as in astronomical or .