True numeric variables are interval variables that support addition and other mathematical operations. Monetary amounts and customer tenure (measured in days) are examples of numeric variables. The difference between true numerics and intervals is subtle; however, data mining algorithms treat both of these the same way. Also, note that these measures form a hierarchy. Any ordered variable is also categorical, any interval is also categorical, and any numeric is also interval.

There is a difference between measure and data type. A numeric variable, for instance, might represent a coding scheme, say for account status or even for state abbreviations. Although the values look like numbers, they are really categorical. Zip codes are a common example of this phenomenon.

Some algorithms expect variables to be of a certain measure. Statistical regression and neural networks, for instance, expect their inputs to be numeric. So, if a zip code field is included and stored as a number, then the algorithms treat its values as numeric, which is generally not a good approach. Decision trees, on the other hand, treat all their inputs as categorical or ordered, even when they are numbers.

Measure is one important property. In practice, variables have associated types in databases and file layouts. The following sections talk about data types and measures in more detail.

Numbers

Numbers usually represent quantities and are good variables for modeling purposes. Numeric quantities have both an ordering (which is used by decision trees) and an ability to perform arithmetic (used by other algorithms such as clustering and neural networks). Sometimes, what looks like a number really represents a code or an ID.
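The gap between how a value is stored and what measure it carries can be made concrete with a small sketch. Everything named here is hypothetical and chosen only for illustration: the `Measure` enum, the `can_be_treated_as` helper, the sample `customers` records, and the use of indicator (one-hot) columns as one possible categorical treatment. None of it comes from the text.

```python
from enum import IntEnum
from statistics import mean


class Measure(IntEnum):
    """The four measures, ordered so that each level includes those below it."""
    CATEGORICAL = 1
    ORDERED = 2
    INTERVAL = 3
    NUMERIC = 4


def can_be_treated_as(actual: Measure, required: Measure) -> bool:
    """A variable qualifies for any measure at or below its own in the hierarchy."""
    return actual >= required


# Hypothetical records: both fields are stored as int, but only one is a quantity.
customers = [
    {"zip_code": 10001, "tenure_days": 400},
    {"zip_code": 94105, "tenure_days": 30},
    {"zip_code": 10001, "tenure_days": 710},
]

# The measure is declared separately from the storage type.
measures = {"zip_code": Measure.CATEGORICAL, "tenure_days": Measure.NUMERIC}


def one_hot(records, field):
    """Encode a categorical field as 0/1 indicator columns, one per distinct value."""
    levels = sorted({r[field] for r in records})
    return [[1 if r[field] == level else 0 for level in levels] for r in records]


# Arithmetic is meaningful for tenure, a true numeric.
average_tenure = mean(r["tenure_days"] for r in customers)

# An "average zip code" would not describe any place, so the code is
# turned into indicator columns instead of being fed in as a number.
zip_indicators = one_hot(customers, "zip_code")
```

Keeping the declared measure in its own mapping, rather than inferring it from the storage type, is the design choice that matters here: the integer type of `zip_code` says nothing about whether arithmetic on it makes sense.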
In such cases, it is better to treat the number as a categorical value (discussed in the next two sections), since the ordering and arithmetic properties of the numbers may mislead data mining algorithms attempting to find patterns.

There are many different ways to transform numeric quantities. Figure