Machine-learning based prediction models for assessing skin irritation and corrosion potential of liquid chemicals using physicochemical properties by XGBoost

  • Received : 2022.10.31
  • Accepted : 2022.12.23
  • Published : 2023.04.15

Abstract

Skin irritation testing is an essential part of the safety assessment of chemicals. Recently, computational models that predict skin irritation have drawn attention as alternatives to animal testing. We developed prediction models for the skin irritation/corrosion of liquid chemicals using machine learning algorithms and 34 physicochemical descriptors calculated from chemical structure. A training and test dataset of 545 liquid chemicals with reliable in vivo skin hazard classifications based on the UN Globally Harmonized System [category 1 (corrosive, Cat 1), 2 (irritant, Cat 2), 3 (mild irritant, Cat 3), and no category (nonirritant, NC)] was collected from public databases. After curation of the input data through removal and correlation analysis, every model was constructed to predict the skin hazard classification of liquid chemicals with 22 physicochemical descriptors. Seven machine learning algorithms [logistic regression, Naïve Bayes, k-nearest neighbor, support vector machine, random forest, extreme gradient boosting (XGB), and neural network] were applied to ternary and binary classification of skin hazard. The XGB model demonstrated the highest accuracy (0.73-0.81), sensitivity (0.71-0.92), and positive predictive value (0.65-0.81). The contribution of the physicochemical descriptors to the classification was analyzed using a Shapley Additive exPlanations (SHAP) plot to provide insight into the skin irritation of chemicals.
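The three performance measures reported above (accuracy, sensitivity, and positive predictive value) can be illustrated for the binary case with a minimal sketch. The function name `binary_metrics`, the 1/0 label coding, and the example labels are assumptions made for illustration only. They are not the authors' code or data, and the abstract does not state how the hazard categories were grouped for binary classification.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity and PPV for labels coded 1 (positive) or 0 (negative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn

    nan = float("nan")
    return {
        # Fraction of all chemicals classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Fraction of true positives that the model detects.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Fraction of positive predictions that are correct.
        "ppv": tp / (tp + fp) if (tp + fp) else nan,
    }


# Hypothetical labels: 1 = hazardous (assumed grouping), 0 = no category.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For ternary classification, the same quantities would typically be computed per class (one class against the rest) and then averaged; the abstract does not say which averaging scheme was used.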

Acknowledgement

This study was carried out as part of the Cosmetic Safety Evaluation Project of the Korea Cosmetic Industry Institute (KCII), funded by the Ministry of Health and Welfare, and was supported by the Korea Environment Industry and Technology Institute (KEITI), funded by the Korea Ministry of Environment (MOE) (2021002970001, 1485017976).
