6

Regression- and Projection-Based Approaches in Predictive Toxicology

LENNART ERIKSSON and ERIK JOHANSSON
Umetrics AB, Umeå, Sweden

TORBJÖRN LUNDSTEDT
Acurepharma AB, Uppsala, Sweden and BMC, Uppsala, Sweden

© 2005 by Taylor & Francis Group, LLC

OVERVIEW

This chapter outlines regression- and projection-based approaches useful for QSAR analysis in predictive toxicology. The methods discussed and exemplified are multiple linear regression (MLR), principal component analysis (PCA), principal component regression (PCR), and partial least squares projections to latent structures (PLS). Two QSAR data sets, drawn from the fields of environmental toxicology and drug design, are worked out in detail, showing the benefits of these methods. PCA is useful when overviewing a data set and exploring relationships among compounds and relationships among variables. MLR, PCR, and PLS are used for establishing the QSARs. Additionally, the concept of statistical molecular design is considered, which is an essential ingredient for selecting an informative training set of compounds for QSAR calibration.

1. INTRODUCTION

Much of today's activity in medicinal chemistry, molecular biology, predictive toxicology, and drug design is centered around exploring the relationships between X (chemical structure) and Y (measured properties of compounds, such as toxicity, solubility, acidity, enzyme binding, and membrane penetration).
For almost any series of compounds, dependencies between chemistry and biology are usually very complex, particularly when addressing in vivo biological data. To investigate, understand, and use such relationships, we need a sound description (characterization) of the variation in chemical structure of relevant molecules and biological targets, reliable biological and pharmacological data, and possibilities of fabricating new compounds deemed to be of interest. In addition, we need good mathematical tools to establish and express the relationships, as well as informationally optimal .