• Title/Summary/Keyword: Quantitative structure-property relationship

A New Variable Selection Method Based on Mutual Information Maximization by Replacing Collinear Variables for Nonlinear Quantitative Structure-Property Relationship Models

  • Ghasemi, Jahan B.;Zolfonoun, Ehsan
    • Bulletin of the Korean Chemical Society / v.33 no.5 / pp.1527-1535 / 2012
  • Selection of the most informative molecular descriptors from the original data set is a key step in developing quantitative structure-activity/property relationship models. Recently, mutual information (MI) has gained increasing attention in feature selection problems. This paper presents an effective mutual information-based feature selection approach, named mutual information maximization by replacing collinear variables (MIMRCV), for nonlinear quantitative structure-property relationship models. The proposed variable selection method was applied to three different QSPR datasets: soil degradation half-lives of 47 organophosphorus pesticides, GC-MS retention times of 85 volatile organic compounds, and water-to-micellar cetyltrimethylammonium bromide partition coefficients of 62 organic compounds. The results showed that using MIMRCV as the feature selection method improves the predictive quality of the developed models compared to conventional MI-based variable selection algorithms.
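
The abstract does not spell out the MIMRCV steps, so the following is only a generic sketch of MI-ranked descriptor selection that skips descriptors collinear with ones already chosen. It is not the authors' algorithm. The histogram MI estimator, the bin count, the correlation cutoff `max_abs_r`, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def discretize(values, bins=4):
    """Map continuous values to equal-width bin indices."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of I(X;Y) in nats (assumed estimator)."""
    xd, yd = discretize(x, bins), discretize(y, bins)
    n = len(xd)
    px, py, pxy = Counter(xd), Counter(yd), Counter(zip(xd, yd))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_descriptors(columns, target, k, max_abs_r=0.9, bins=4):
    """Pick up to k descriptors by MI with the target, skipping collinear ones."""
    ranked = sorted(
        columns,
        key=lambda name: mutual_information(columns[name], target, bins),
        reverse=True,
    )
    chosen = []
    for name in ranked:
        # Keep a descriptor only if it is not collinear with any chosen one.
        if all(abs(pearson(columns[name], columns[c])) < max_abs_r for c in chosen):
            chosen.append(name)
        if len(chosen) == k:
            break
    return chosen


# Toy data (not from the paper): d2 is an exact linear copy of d1.
target = [0, 1, 2, 3, 4, 5, 6, 7]
columns = {
    "d1": [0, 1, 2, 3, 4, 5, 6, 7],
    "d2": [1, 3, 5, 7, 9, 11, 13, 15],
    "d3": [1, 0, 1, 0, 1, 0, 1, 0],
}
print(select_descriptors(columns, target, k=2))
```

In this toy case `d2` is skipped because it duplicates the information in `d1`, which is the kind of redundancy a collinearity-aware selection is meant to avoid.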

QSPR Studies on Impact Sensitivities of High Energy Density Molecules

  • Kim, Chan-Kyung;Cho, Soo-Gyeong;Li, Jun;Kim, Chang-Kon;Lee, Hai-Whang
    • Bulletin of the Korean Chemical Society / v.32 no.12 / pp.4341-4346 / 2011
  • Impact sensitivity, one of the most important screening factors for novel high energy density materials (HEDMs), was predicted using a quantitative structure-property relationship (QSPR) based on the electrostatic potential (ESP) values calculated on the van der Waals molecular surface (MSEP). Among the various 3D descriptors derived from MSEP, we utilized the total and positive variances of MSEP, and devised a new QSPR equation by combining three other parameters. We employed 37 HEDMs bearing a benzene scaffold and nitro substituents, which were also utilized by Rice and Hare. All the molecular structures were optimized at the B3LYP/6-31G(d) level of theory and confirmed as minima by frequency calculations. Our new QSPR equation predicted the impact sensitivities of the molecules in the training set well, including those of the zwitterionic molecules.

Artificial Neural Network Prediction of Normalized Polarity Parameter for Various Solvents with Diverse Chemical Structures

  • Habibi-Yangjeh, Aziz
    • Bulletin of the Korean Chemical Society / v.28 no.9 / pp.1472-1476 / 2007
  • Artificial neural networks (ANNs) are successfully developed for the modeling and prediction of the normalized polarity parameter ($E_T^N$) of 216 solvents with diverse chemical structures using a quantitative structure-property relationship. An ANN with a 5-9-1 architecture is generated using the five molecular descriptors appearing in the multi-parameter linear regression (MLR) model. The most positive charge of a hydrogen atom (q+), total charge in the molecule (qt), molecular volume of the solvent (Vm), dipole moment (μ) and polarizability term (πI) are the input descriptors, and the output is $E_T^N$. It is found that a properly selected and trained neural network with 192 solvents can fairly represent the dependence of the normalized polarity parameter on the molecular descriptors. To evaluate the predictive power of the generated ANN, the optimized network is applied to predict the $E_T^N$ values of 24 solvents in the prediction set, which are not used in the optimization procedure. The correlation coefficient (R) and root mean square error (RMSE) of 0.903 and 0.0887 for the prediction set by the MLR model should be compared with the values of 0.985 and 0.0375 by the ANN model. These improvements are due to the fact that the $E_T^N$ of solvents shows non-linear correlations with the molecular descriptors.
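
To make the "5-9-1" notation concrete, here is a minimal sketch of a forward pass through a network with 5 inputs, 9 hidden units and 1 output. The abstract does not state the activation functions, weight initialization or training procedure, so the sigmoid hidden layer, linear output, random weights and all names below are assumptions. The network is untrained and the input values are placeholders.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) filled with small random values."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden, output):
    """Propagate one input vector through hidden (sigmoid) and output (linear) layers."""
    w_h, b_h = hidden
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w_h, b_h)]
    w_o, b_o = output
    return [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(w_o, b_o)]


rng = random.Random(0)
hidden = init_layer(5, 9, rng)   # 5 descriptors -> 9 hidden units
output = init_layer(9, 1, rng)   # 9 hidden units -> 1 predicted value

# Count adjustable parameters: weights plus biases of both layers.
n_params = (
    sum(len(row) for row in hidden[0]) + len(hidden[1])
    + sum(len(row) for row in output[0]) + len(output[1])
)

x = [0.1, -0.2, 0.3, 0.0, 0.5]   # placeholder scaled descriptor values
print(n_params, forward(x, hidden, output))
```

Under these assumptions the network has 5·9 + 9 + 9·1 + 1 = 64 adjustable parameters, which is worth keeping in mind against a training set of 192 solvents.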

Prediction Acidity Constant of Various Benzoic Acids and Phenols in Water Using Linear and Nonlinear QSPR Models

  • Habibi Yangjeh, Aziz;Danandeh Jenagharad, Mohammad;Nooshyar, Mahdi
    • Bulletin of the Korean Chemical Society / v.26 no.12 / pp.2007-2016 / 2005
  • An artificial neural network (ANN) is successfully presented for prediction of the acidity constant (pKa) of various benzoic acids and phenols with diverse chemical structures using a nonlinear quantitative structure-property relationship. A three-layered feed-forward ANN with back-propagation of error was generated using the six molecular descriptors appearing in the multi-parameter linear regression (MLR) model. The polarizability term $(\pi_1)$, most positive charge of the acidic hydrogen atom $(q^+)$, molecular weight (MW), most negative charge of the acidic oxygen atom $(q^-)$, the hydrogen-bond accepting ability $(\epsilon_B)$ and partial charge weighted topological electronic (PCWTE) descriptors are the inputs, and the output is pKa. It was found that a properly selected and trained neural network with 205 compounds could fairly represent the dependence of the acidity constant on the molecular descriptors. To evaluate the predictive power of the generated ANN, the optimized network was applied to predict the pKa values of 37 compounds in the prediction set, which were not used in the optimization procedure. The squared correlation coefficient $(R^2)$ and root mean square error (RMSE) of 0.9147 and 0.9388 for the prediction set by the MLR model should be compared with the values of 0.9939 and 0.2575 by the ANN model. These improvements are due to the fact that the acidity constant of benzoic acids and phenols in water shows nonlinear correlations with the molecular descriptors.
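
The two figures of merit quoted above can be computed as in the following minimal sketch. It assumes $R^2$ is the squared Pearson correlation between observed and predicted values, which is one common reading of "squared correlation coefficient". The numbers are made up for illustration and are not the paper's data.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    return sop ** 2 / (soo * spp)


# Made-up pKa-like values, for illustration only.
observed = [4.2, 9.9, 7.1, 3.4, 10.3]
predicted = [4.0, 9.5, 7.4, 3.9, 10.0]
print(r_squared(observed, predicted), rmse(observed, predicted))
```

RMSE carries the units of the predicted property (here pKa units), so it is the more direct measure of how far a typical prediction is from the observed value.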

QSPR Study of the Absorption Maxima of Azobenzene Dyes

  • Xu, Jie;Wang, Lei;Liu, Li;Bai, Zikui;Wang, Luoxin
    • Bulletin of the Korean Chemical Society / v.32 no.11 / pp.3865-3872 / 2011
  • A quantitative structure-property relationship (QSPR) study was performed for the prediction of the absorption maxima of azobenzene dyes. The entire set of 191 azobenzenes was divided into a training set of 150 azobenzenes and a test set of 41 azobenzenes according to the Kennard-Stone algorithm. A seven-descriptor model, with a squared correlation coefficient ($R^2$) of 0.8755 and a standard error of estimation (s) of 14.476, was developed by applying stepwise multiple linear regression (MLR) analysis to the training set. The reliability of the proposed model was further illustrated using various evaluation techniques: a leave-many-out cross-validation procedure, randomization tests, and validation through the test set.
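
The Kennard-Stone split mentioned above can be sketched as follows. This is the textbook form of the rule (seed with the two most distant samples, then repeatedly add the sample farthest from the already selected set). The Euclidean distance, the function name and the toy points are assumptions, since the abstract gives no implementation details.

```python
import math


def kennard_stone(points, n_train):
    """Return indices of n_train points chosen by the Kennard-Stone rule."""
    n = len(points)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of points")
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Seed the training set with the two mutually most distant points.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [i0, j0]
    remaining = [i for i in range(n) if i not in selected]

    # Add the point whose nearest selected neighbour is farthest away.
    while len(selected) < n_train:
        nxt = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


# Toy descriptor vectors (hypothetical, two descriptors per compound).
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
train_idx = kennard_stone(points, n_train=3)
test_idx = [i for i in range(len(points)) if i not in train_idx]
print(train_idx, test_idx)
```

Because the rule picks samples that span the descriptor space, the test set ends up inside the region covered by the training set, which is the usual motivation for using it instead of a random split.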

Prediction of retention of uncharged solutes in nanofiltration by means of molecular descriptors

  • Nowaczyk, Alicja;Nowaczyk, Jacek;Koter, Stanislaw
    • Membrane and Water Treatment / v.1 no.3 / pp.181-192 / 2010
  • A linear quantitative structure-property relationship (QSPR) model is presented for the prediction of rejection in permeation through a membrane. The model was produced by applying the multiple linear regression (MLR) technique to a database consisting of retention data for 25 pesticides in 4 different membrane separation experiments. Among the 3224 different physicochemical, topological and structural descriptors that were considered as inputs to the model, only 50 were selected using several elimination criteria. The physical meaning of the chosen descriptors is discussed in detail. The accuracy of the proposed MLR models is illustrated using the following evaluation techniques: leave-one-out cross-validation, leave-many-out cross-validation and Y-randomization.
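
Two of the validation techniques named above can be sketched with a single-descriptor linear model. This is a simplification: the paper uses MLR with several descriptors, and its exact protocol is not given in the abstract. The definition q² = 1 − PRESS/SS_tot (with SS_tot taken about the overall mean), the function names and the toy data are assumptions.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def loo_q2(x, y):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_tot."""
    press = 0.0
    for i in range(len(x)):
        # Refit without sample i, then predict the held-out sample.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    my = sum(y) / len(y)
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - press / ss_tot


def y_randomization(x, y, n_rounds=20, seed=0):
    """q2 values obtained after shuffling the responses (Y-randomization)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = y[:]
        rng.shuffle(shuffled)
        scores.append(loo_q2(x, shuffled))
    return scores


# Toy data (not from the paper): an exactly linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.0 * v + 1.0 for v in x]
print(loo_q2(x, y), max(y_randomization(x, y)))
```

A model is usually considered free of chance correlation when the q² values from the shuffled responses fall well below the q² of the real data.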

Ligand-based QSAR Studies on the Indolinones Derivatives as Inhibitors of the Protein Tyrosine Kinase of Fibroblast Growth Factor Receptor by CoMFA and CoMSIA

  • Hyun, Kwan-Hoon;Kwack, In-Young;Lee, Do-Young;Park, Hyung-Yeon;Lee, Bon-Su;Kim, Chan-Kyung
    • Bulletin of the Korean Chemical Society / v.25 no.12 / pp.1801-1806 / 2004
  • Ligand-based quantitative structure-activity relationship (QSAR) studies were performed on indolinone derivatives as potential inhibitors of the protein tyrosine kinase of fibroblast growth factor receptor (FGFR) by comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) as implemented in the SYBYL package. The X-ray structure of the ligand (Su5402) docked to FGFR was used as the starting point to minimize the 27 training set molecules with the TRIPOS force field. Seven models were generated using CoMFA and CoMSIA with a grid spacing of 2 Å. After the PLS analysis, the best predictive CoMSIA model, with hydrophobicity and hydrogen bond donor and acceptor properties, gave a leave-one-out (LOO) cross-validated value $(r^2_{cv})$ and a non-cross-validated conventional value $(r^2_{ncv})$ of 0.543 and 0.938, respectively.

Searching of the Potent Pig Pheromonal Odorants by Receptor Based Approach (수용체 접근방법에 의한 잠재적인 돼지 페로몬 성 냄새 물질의 탐색)

  • Joo, Sung-Mo;Cho, Yun-Gi;Park, Chang-Sik;Sung, Nack-Do
    • Reproductive and Developmental Biology / v.34 no.3 / pp.117-122 / 2010
  • To search for potent pig pheromonal odorants by a receptor-based approach, molecular docking was performed between 680 Flavomets as substrate molecules and the pig odorant binding proteins OBP (1HQP) and PBP (1GM6) as receptors, and QSPR (quantitative structure-property relationship) analyses relating the physico-chemical parameters of the Flavomets to their docking scores (DS) were carried out and discussed quantitatively. Based on these findings, MSA (molecular surface area; ${\AA}^2$), with an optimal value $(MSA)_{opt.}=407.595\;{\AA}^2$, and RB (number of rotatable bonds) are the parameters of the Flavomets that can increase DS. Therefore, judging from its DS and the type of H-bond between substrate and receptor, stearyl alcohol is expected to show the character of a potent pig pheromonal odorant.

Prediction of Melting Point for Drug-like Compounds Using Principal Component-Genetic Algorithm-Artificial Neural Network

  • Habibi-Yangjeh, Aziz;Pourbasheer, Eslam;Danandeh-Jenagharad, Mohammad
    • Bulletin of the Korean Chemical Society / v.29 no.4 / pp.833-841 / 2008
  • Principal component-genetic algorithm-multiparameter linear regression (PC-GA-MLR) and principal component-genetic algorithm-artificial neural network (PC-GA-ANN) models were applied to predict the melting points of 323 drug-like compounds. A large number of theoretical descriptors were calculated for each compound. The first 234 principal components (PCs) were found to explain more than 99.9% of the variance in the original data matrix. From the pool of these PCs, the genetic algorithm was employed to select the best set of extracted PCs for the PC-MLR and PC-ANN models. The models were generated using fifteen PCs as variables. To evaluate the predictive power of the models, the melting points of 64 compounds in the prediction set were calculated. The root-mean-square errors (RMSE) for the PC-GA-MLR and PC-GA-ANN models are 48.18 and $12.77\,^{\circ}\mathrm{C}$, respectively. Comparison of the results obtained by the models reveals the superiority of PC-GA-ANN over PC-GA-MLR and the recently proposed models (RMSE = $40.7\,^{\circ}\mathrm{C}$). The improvements are due to the fact that the melting points of the compounds show non-linear correlations with the principal components.
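
The statement that the first 234 PCs explain more than 99.9% of the variance corresponds to a cumulative explained-variance cutoff. A minimal sketch of that bookkeeping is shown below, assuming the PCA eigenvalues are already available. The eigenvalues, the function name and the 90% threshold in the example are hypothetical.

```python
def n_components_for(eigenvalues, threshold=0.999):
    """Smallest number of leading PCs whose variance share reaches threshold."""
    total = sum(eigenvalues)
    running = 0.0
    # Each eigenvalue is the variance carried by one principal component.
    for k, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += ev
        if running / total >= threshold:
            return k
    return len(eigenvalues)


# Hypothetical eigenvalues, for illustration only.
eigenvalues = [5.0, 3.0, 1.5, 0.5]
print(n_components_for(eigenvalues, threshold=0.9))
```

This step only bounds the pool of candidate PCs. Choosing which of them enter the model is a separate selection problem, handled in the paper by the genetic algorithm.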

Changes in Mechanical Properties of Wood Due to 1 Year Outdoor Exposure

  • KIM, Gwang-Chul;KIM, Jun-Ho
    • Journal of the Korean Wood Science and Technology / v.48 no.1 / pp.12-21 / 2020
  • For quantitative evaluation of wooden structures, the mechanical performance of members was assessed through outdoor exposure tests. Year-long monitoring was conducted using an SPF species. The specimens were divided into twelve test groups (one per month) to measure moisture content, density and ultimate load. Starting from May, when the moisture content of the test group was at its lowest, simple failure modes were observed more frequently during the first half of the experiment, whereas complex failure modes took over during the second half. Starting from June, when the moisture content of the test group was at its highest, the ultimate load decreased by 30% in the second half compared to the first half. A multiple regression analysis confirmed that, among the various outdoor variables, the moisture content of the test group had the greatest effect on ultimate load, and the estimation equation from a simple regression analysis revealed that moisture content and ultimate load were inversely related. It is thought that correlations involving variables other than moisture content could be established as more data are accumulated through longer outdoor exposure tests.