Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 88.

850 Shashi Shekhar, Pusheng Zhang, and Yan Huang

The benefits of modeling spatial autocorrelation are many. The residual error will have much lower spatial autocorrelation (i.e., systematic variation); with the proper choice of W, the residual error should, at least theoretically, have no systematic variation. If the spatial autocorrelation coefficient is statistically significant, then SAR will quantify the presence of spatial autocorrelation. It will indicate the extent to which variations in the dependent variable y are explained by the average of neighboring observation values. Finally, the model will have a better fit (i.e., a higher R-squared statistic).

Markov Random Field-based Bayesian Classifiers

Markov random field-based Bayesian classifiers estimate the classification model \hat{f}_C using MRF and Bayes' rule. A set of random variables whose interdependency relationship is represented by an undirected graph (i.e., a symmetric neighborhood matrix) is called a Markov Random Field (Li, 1995). The Markov property specifies that a variable depends only on its neighbors and is independent of all other variables. The location prediction problem can be modeled in this framework by assuming that the class label, l_i = f_C(s_i), of each location s_i constitutes an MRF. In other words, the random variable l_i is independent of l_j if W(s_i, s_j) = 0.
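As a minimal sketch of this neighborhood structure (the grid layout and all values are illustrative assumptions, not from the chapter), the following builds a symmetric contiguity matrix W for a small grid and checks the independence condition W(s_i, s_j) = 0:

```python
import numpy as np

# Hypothetical 2x2 grid of locations s_0..s_3, laid out as:
#   s_0  s_1
#   s_2  s_3
# Build a symmetric 4-neighbor contiguity matrix W:
# W[i, j] = 1 if locations i and j share an edge, else 0.
coords = [(0, 0), (0, 1), (1, 0), (1, 1)]
n = len(coords)
W = np.zeros((n, n), dtype=int)
for i in range(n):
    for j in range(n):
        if i != j:
            dr = abs(coords[i][0] - coords[j][0])
            dc = abs(coords[i][1] - coords[j][1])
            W[i, j] = 1 if dr + dc == 1 else 0

# W is symmetric, so it encodes an undirected graph (an MRF structure).
assert (W == W.T).all()

# Markov property: the label l_i may depend on l_j only when W[i, j] != 0.
# s_0 and s_3 are diagonal (non-)neighbors here, so W[0, 3] == 0 and the MRF
# treats l_0 as conditionally independent of l_3 given the other labels.
print(W)
print("l_0 independent of l_3 given neighbors:", W[0, 3] == 0)
```

The same check generalizes to any symmetric neighborhood matrix: a zero entry in W is exactly the pairwise independence the text describes.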
Bayes' rule can be used to predict l_i from the feature value vector X and the neighborhood class label vector L_i as follows:

\[ Pr(l_i \mid X, L_i) = \frac{Pr(X \mid l_i, L_i)\, Pr(l_i \mid L_i)}{Pr(X)} \]

The solution procedure can estimate Pr(l_i | L_i) from the training data, where L_i denotes the set of labels in the neighborhood of s_i excluding the label at s_i, by examining the ratios of the frequencies of class labels to the total number of locations in the spatial framework. Pr(X | l_i, L_i) can be estimated using kernel functions from the observed values in the training dataset. For reliable estimates, even larger training datasets are needed relative to those needed for Bayesian classifiers without spatial context, since we are estimating a more complex distribution.
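The estimation steps above can be sketched as follows. All data, the summary of L_i by its majority label, and the Gaussian kernel choice are illustrative assumptions, not the chapter's method: Pr(l_i | L_i) is tabulated from label frequency ratios, Pr(X | l_i) from a kernel estimate, and the two are combined by Bayes' rule (Pr(X) cancels in the argmax).

```python
import numpy as np
from collections import Counter

# Hypothetical training data: each location s_i has a binary class label and
# a scalar feature value.
labels = np.array([0, 0, 1, 1, 0, 1, 1, 0])
features = np.array([1.0, 1.2, 3.1, 2.9, 0.8, 3.3, 2.7, 1.1])
# Summarize each neighborhood label vector L_i by its majority label
# (a simplification; the chapter conditions on the full vector L_i).
neighborhood_majority = np.array([0, 0, 1, 1, 0, 1, 1, 1])

def pr_label_given_neighborhood(l, maj):
    """Estimate Pr(l_i = l | L_i summarized as maj) by frequency ratios."""
    mask = neighborhood_majority == maj
    counts = Counter(labels[mask])
    return counts[l] / mask.sum()

def pr_feature_given_label(x, l, bandwidth=0.5):
    """Gaussian-kernel estimate of Pr(X = x | l_i = l) from observed values."""
    xs = features[labels == l]
    k = np.exp(-0.5 * ((x - xs) / bandwidth) ** 2)
    return k.mean() / (bandwidth * np.sqrt(2 * np.pi))

def predict(x, maj):
    """Bayes' rule: argmax_l Pr(X | l) * Pr(l | L); the Pr(X) term cancels."""
    scores = {l: pr_feature_given_label(x, l) * pr_label_given_neighborhood(l, maj)
              for l in (0, 1)}
    return max(scores, key=scores.get)

print(predict(3.0, maj=1))  # feature near the class-1 cluster, class-1 neighborhood
```

Note how the neighborhood term Pr(l_i | L_i) shifts the prediction toward the locally dominant class, which is exactly the spatial context a plain Bayesian classifier lacks.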