
Title: In silico prediction of toxicity of non-congeneric industrial chemicals using ensemble learning based modeling approaches

Journal Article · Toxicology and Applied Pharmacology

Ensemble-learning-based decision treeboost (DTB) and decision tree forest (DTF) models are introduced to establish a quantitative structure–toxicity relationship (QSTR) for predicting the toxicity of 1450 diverse chemicals. Eight non-quantum mechanical molecular descriptors were derived. The structural diversity of the chemicals was evaluated using the Tanimoto similarity index. DTB and DTF models, supplemented with stochastic gradient boosting and bagging algorithms, were constructed for classification and function optimization problems using the toxicity end-point in T. pyriformis. Special attention was given to the predictive ability and robustness of the models, which were investigated in both external and 10-fold cross-validation processes. In the complete data, the optimal DTB and DTF models rendered accuracies of 98.90% and 98.83% in two-category and 98.14% and 98.14% in four-category toxicity classifications. Both models further yielded classification accuracies of 100% in external toxicity data of T. pyriformis. The regression models (DTB and DTF) constructed using five descriptors yielded correlation coefficients (R²) of 0.945 and 0.944 between the measured and predicted toxicities, with mean squared errors (MSEs) of 0.059 and 0.064 in the complete T. pyriformis data. Applied to the external toxicity data sets, the T. pyriformis regression models (DTB and DTF) yielded R² values of 0.637 and 0.655 with MSEs of 0.534 and 0.507 (marine bacteria), and R² values of 0.741 and 0.691 with MSEs of 0.155 and 0.173 (algae). The results suggest wide applicability of the inter-species models in predicting the toxicity of new chemicals for regulatory purposes. These approaches provide a useful strategy and robust tools for screening the ecotoxicological risk or environmental hazard potential of chemicals.

Graphical abstract: Importance of input variables in DTB and DTF classification models for (a) two-category and (b) four-category toxicity intervals in T. pyriformis data. Generalization and predictive abilities of the constructed (c) DTB and (d) DTF regression models to predict the T. pyriformis toxicity of diverse chemicals.

Highlights:
• Ensemble learning (EL) based models were constructed for toxicity prediction of chemicals.
• Predictive models used a few simple non-quantum mechanical molecular descriptors.
• EL-based DTB/DTF models successfully discriminated toxic and non-toxic chemicals.
• DTB/DTF regression models precisely predicted toxicity of chemicals across multiple species.
• Proposed EL-based models can be used as a tool to predict toxicity of new chemicals.
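The abstract names the Tanimoto similarity index for assessing structural diversity and reports R² and MSE for the regression models, but gives no formulas. The following is a minimal sketch of how these quantities are commonly computed, not the authors' implementation. It assumes that each chemical is represented by a binary fingerprint stored as a set of "on" bit positions (the fingerprint type is not stated in this record) and that R² denotes the squared Pearson correlation between measured and predicted toxicities. All function names are illustrative.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity between two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def mse(measured, predicted) -> float:
    """Mean squared error between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def r_squared(measured, predicted) -> float:
    """Squared Pearson correlation (assumed meaning of R² in the abstract)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_m * var_p)
```

A Tanimoto value near 0 for most pairs of chemicals would indicate a structurally diverse (non-congeneric) set, while values near 1 indicate close structural analogues.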

OSTI ID: 22423802
Journal Information: Toxicology and Applied Pharmacology, Vol. 275, Issue 3; Other Information: Copyright (c) 2014 Elsevier Science B.V., Amsterdam, The Netherlands, All rights reserved.; Country of input: International Atomic Energy Agency (IAEA); ISSN 0041-008X
Country of Publication: United States
Language: English