Development of QSAR Model Based on the Key Molecular Descriptors Selection and Computational Toxicology for Prediction of Toxicity of PCBs

  • Kim, Dongwoo (Department of Environmental Science and Engineering, Center for Environmental Studies, College of Engineering, Kyung Hee University)
  • Lee, Seungchel (Department of Environmental Science and Engineering, Center for Environmental Studies, College of Engineering, Kyung Hee University)
  • Kim, Minjeong (Department of Environmental Science and Engineering, Center for Environmental Studies, College of Engineering, Kyung Hee University)
  • Lee, Eunji (Department of Environmental Science and Engineering, Center for Environmental Studies, College of Engineering, Kyung Hee University)
  • Yoo, ChangKyoo (Department of Environmental Science and Engineering, Center for Environmental Studies, College of Engineering, Kyung Hee University)
  • Received: 2016.04.19
  • Reviewed: 2016.07.04
  • Published: 2016.10.01

Abstract

With the introduction of the EU REACH regulation, research on quantitative structure-activity relationships (QSAR), which predict the toxicity and activity of chemicals from their molecular structure, has become increasingly active as a way to obtain toxicity and activity information for a wide range of chemicals, including chemicals found in multiuse facilities. Because many kinds of molecular descriptors can be used in a QSAR model, selecting the descriptors that best represent the properties and activity of the chemicals is a key step in model development. This study proposes a statistical method for selecting key molecular descriptors and a new QSAR model based on partial least squares (PLS). The proposed model is applied to predict the logarithm of the partition coefficient (log P) of 130 polychlorinated biphenyls (PCBs) and the median lethal concentration ($LC_{50}$) of 14 PCBs, and its prediction accuracy is compared with that of the QSAR model provided by the OECD QSAR Toolbox. To select descriptors that correlate strongly with the activity information of the chemicals of interest, the correlation coefficient (r) and variable importance in projection (VIP) are applied; a PLS model is then built from the selected descriptors and the activity information to predict toxicity and activity. Evaluated by the coefficient of determination ($R^2$) and the prediction residual error sum of squares (PRESS), the proposed model improved prediction performance over the OECD QSAR Toolbox model by 26% for log P and 91% for $LC_{50}$. The proposed QSAR approach, grounded in computational toxicology, can improve the prediction of toxicity and activity information and is expected to contribute to human health and environmental risk assessment of toxic chemicals.
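The abstract gives no implementation details, so the following is only a minimal sketch of the workflow it describes: screening descriptors by r and VIP, fitting PLS on the selected descriptors, and scoring the fit with $R^2$ and PRESS. It assumes a single-response NIPALS PLS, hypothetical cutoffs of |r| >= 0.3 and VIP >= 1.0, leave-one-out PRESS, and a small synthetic design in place of real PCB descriptors. None of these choices are taken from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient r between two sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pls1_fit(X, y, n_components):
    """Fit single-response PLS by NIPALS on mean-centered data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                # y residual
    W, P, Q, ssy = [], [], [], []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]                  # unit weight vector
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        W.append(w)
        P.append(load)
        Q.append(q)
        ssy.append(q * q * tt)                     # y variation explained
    return {"x_mean": x_mean, "y_mean": y_mean,
            "W": W, "P": P, "Q": Q, "ssy": ssy}


def pls1_predict(model, x):
    """Predict one sample by projecting and deflating it per component."""
    e = [v - m for v, m in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, load, q in zip(model["W"], model["P"], model["Q"]):
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat


def vip_scores(model):
    """VIP_j = sqrt(p * sum_a(SSY_a * w_ja^2) / sum_a(SSY_a))."""
    p, total = len(model["x_mean"]), sum(model["ssy"])
    return [math.sqrt(p * sum(s * w[j] ** 2
                              for s, w in zip(model["ssy"], model["W"])) / total)
            for j in range(p)]


def r_squared(y, y_hat):
    """Coefficient of determination of fitted values against y."""
    y_mean = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return 1.0 - sse / sum((a - y_mean) ** 2 for a in y)


def press_loo(X, y, n_components):
    """Leave-one-out PRESS: squared error summed over held-out samples."""
    press = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        press += (y[i] - pls1_predict(model, X[i])) ** 2
    return press


# Synthetic stand-in for a descriptor table (NOT PCB data): six hypothetical
# descriptors on a two-level design; only descriptors 0 and 1 drive y.
signs = [(a, b, c) for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)]
X = [[a, 2 * b, c, a * b, a * c, b * c] for a, b, c in signs]
y = [2.0 * a - 1.5 * b + 0.1 * a * b * c for a, b, c in signs]

R_MIN, VIP_MIN = 0.3, 1.0  # assumed cutoffs, not taken from the paper
r = [pearson_r([row[j] for row in X], y) for j in range(len(X[0]))]
vip = vip_scores(pls1_fit(X, y, 2))
selected = [j for j in range(len(r))
            if abs(r[j]) >= R_MIN and vip[j] >= VIP_MIN]

X_sel = [[row[j] for j in selected] for row in X]
model = pls1_fit(X_sel, y, 2)
r2 = r_squared(y, [pls1_predict(model, row) for row in X_sel])
press = press_loo(X_sel, y, 2)
print(selected, round(r2, 4), round(press, 4))
```

On this synthetic design the screen keeps only descriptors 0 and 1. With real descriptor tables the cutoffs and the number of PLS components would need to be chosen by validation rather than fixed in advance.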
