Chapter 5

MODEL CHOICE AND SPECIFICATION ANALYSIS

EDWARD E. LEAMER
University of California, Los Angeles

Contents

1. Introduction
2. Model selection with prior distributions
   2.1. Hypothesis testing searches
   2.2. Interpretive searches
3. Model selection with loss functions
   3.1. Model selection with quadratic loss
   3.2. Simplification searches: Model selection with fixed costs
   3.3. Ridge regression
   3.4. Inadmissibility
4. Proxy searches: Model selection with measurement errors
5. Model selection without a true model
6. Data-instigated models
7. Miscellaneous topics
   7.1. Stepwise regression
   7.2. Cross-validation
   7.3. Goodness-of-fit tests
8. Conclusion
References

Helpful comments from David Belsley, Zvi Griliches, Michael Intriligator, and Peter Schmidt are gratefully acknowledged. Work was supported by NSF grant SOC78-09477.

Handbook of Econometrics, Volume I, Edited by Z. Griliches and M. Intriligator. North-Holland Publishing Company, 1983.

1. Introduction

The data banks of the National Bureau of Economic Research contain time-series data on 2000 macroeconomic variables. Even if observations were available since the birth of Christ, the degrees of freedom in a model explaining gross national product in terms of all these variables would not turn positive for another two decades. If annual observations were restricted to the 30-year period from 1950 to 1979, the degrees of freedom deficit would be 1970.
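The degrees-of-freedom arithmetic in the paragraph above can be checked with a short sketch. It assumes one annual observation per year, counts only the 2000 slope coefficients (no intercept, which matches the stated deficit of 1970), and takes 1983, the publication year, as the vantage point for "since the birth of Christ". The function and variable names are illustrative only.

```python
def degrees_of_freedom(n_obs: int, n_regressors: int) -> int:
    """Residual degrees of freedom: observations minus estimated coefficients."""
    return n_obs - n_regressors


N_VARIABLES = 2000  # macroeconomic series in the data bank, as stated in the text

# Annual observations for 1950 through 1979 inclusive: 30 data points.
annual_obs = 1979 - 1950 + 1
deficit = -degrees_of_freedom(annual_obs, N_VARIABLES)

# Assumption: an annual record beginning in year 1 and running through 1983
# would hold 1983 observations; degrees of freedom first turn positive once
# the observation count reaches N_VARIABLES + 1.
obs_since_year_1 = 1983
years_until_positive = (N_VARIABLES + 1) - obs_since_year_1

print(f"observations 1950-1979: {annual_obs}")
print(f"degrees of freedom deficit: {deficit}")
print(f"years until degrees of freedom turn positive: {years_until_positive}")
```

Under these assumptions the wait is a little under two decades, in line with the text's rounded figure.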
A researcher who sought to sublimate the harsh reality of the degrees of freedom deficit and who restricted himself to exactly five explanatory variables could select from a menu of $\binom{2000}{5} \approx 2.65 \times 10^{14}$ equations to be estimated, which, at the cost of ten cents per regression, would consume a research budget of twenty-six trillion dollars. What is going on? Although it is safe to say that economists have not tried anything like $10^{14}$ regressions to explain GNP, I rather think a reasonably large number