• Title/Summary/Keyword: Validation Metrics

Isomer Differentiation Using in silico MS2 Spectra. A Case Study for the CFM-ID Mass Spectrum Predictor

  • Milman, Boris L.;Ostrovidova, Ekaterina V.;Zhurkovich, Inna K.
    • Mass Spectrometry Letters / v.10 no.3 / pp.93-101 / 2019
  • Algorithms and software for predicting tandem mass spectra have been developed in recent years. In this work, we explore whether the in silico $MS^2$ spectra predicted for isomers, i.e. compounds having the same formula and similar molecular structures, are distinct enough to differentiate between them. We applied the CFM-ID 2.0/3.0 predictor to (a) test compounds, whose experimental mass spectra had been randomly sampled from the MassBank of North America (MoNA) collection, and (b) the most widespread isomers of the test compounds, found by searching the PubChem database. In the first validation test, the in silico mass spectra constituted a reference library, and library searches were performed for the experimental test spectra of "unknowns". The searches yielded a true positive rate (TPR) of $(46\text{-}48 \pm 10)\%$. In the second test, the in silico and experimental spectra were interchanged, which resulted in a TPR of $(58 \pm 10)\%$. There were no significant differences between the results obtained with different metrics of spectral similarity and different predictor versions. In a comparison of test compounds vs. their isomers, a statistically significant correlation between mass spectral data and structural features was observed. The TPR values obtained should be regarded as reasonable results for predicting tandem mass spectra of related chemical structures.
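The first validation test in the abstract above (searching a reference library of predicted spectra with experimental "unknowns" and counting correct top hits) can be illustrated with a minimal sketch. It assumes cosine similarity as the spectral similarity metric, which is only one of the metrics such a study might use; the spectra, peak values and the names `isomer_A` and `isomer_B` are invented for illustration and are not taken from the paper.

```python
import math

def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def library_search(query, library):
    """Return the name of the library spectrum most similar to the query."""
    return max(library, key=lambda name: cosine_similarity(query, library[name]))

def true_positive_rate(queries, library):
    """Fraction of (label, spectrum) queries whose top hit has the right label."""
    hits = sum(1 for label, spec in queries if library_search(spec, library) == label)
    return hits / len(queries)

# Hypothetical reference library of predicted spectra (toy values).
library = {
    "isomer_A": {91: 100.0, 119: 40.0, 147: 10.0},
    "isomer_B": {91: 20.0, 105: 100.0, 147: 30.0},
}
# Hypothetical experimental spectra with their true labels.
queries = [
    ("isomer_A", {91: 90.0, 119: 50.0}),
    ("isomer_B", {105: 80.0, 147: 40.0}),
    ("isomer_B", {91: 100.0, 119: 30.0}),  # resembles isomer_A, so it is mis-assigned
]
tpr = true_positive_rate(queries, library)
```

Swapping the roles of `library` and `queries` would correspond to the second test, in which experimental spectra form the library and predicted spectra are the unknowns.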

The Quality Evaluation Model for Mobile RPG (모바일 RPG의 품질평가모델)

  • Kim, Giuhyoung;Lee, Nam-yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.10a / pp.457-460 / 2014
  • With the rapid development of mobile technology, the era of one smartphone per person has arrived, and smartphone applications have become part of daily life. Mobile RPGs (role-playing games) in particular have the biggest market and the largest number of users. Applying a quality evaluation model designed for PC RPGs to mobile RPGs cannot give a precise quality assessment. A new quality evaluation model for mobile RPGs that reflects smartphone attributes is therefore needed. In this paper, quality evaluation elements for mobile RPGs are derived from ISO/IEC 9126, the international standard for software quality. Metrics for validating these elements are then redefined, and a new quality evaluation model is proposed.

Machine Learning Methods for Trust-based Selection of Web Services

  • Hasnain, Muhammad;Ghani, Imran;Pasha, Muhammad F.;Jeong, Seung R.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.38-59 / 2022
  • Web service instances can be classified by users into two categories, trusted and untrusted. A web service whose instances have high throughput (TP) and low response time (RT) values is a trusted web service. Web services become untrustworthy when the guaranteed instance values do not match the actual values achieved by users. To select web services based on the TP and RT values that users attain, we need to verify that trusted and untrusted instances of invoked web services are predicted correctly. This accurate prediction of web service instances is then used to select web services. We propose constructing fuzzy rules to label web service instances correctly. This paper presents web service selection using a well-known machine learning algorithm, REPTree, for the correct prediction of trusted and untrusted instances. The performance of REPTree is compared with that of five machine learning models on web services datasets. The experiments on these datasets used 10-fold cross-validation. To evaluate the performance of the REPTree classifier, we used the accuracy metrics sensitivity and specificity. Experimental results showed that web service WS1 gained the top selection score with 47.0588% trusted instances, and web service WS2 was selected the least with 25.00% trusted instances. Evaluation of the proposed web service selection approach gave an asymptotic significance of 0.019, demonstrating a relationship between the final selection and the recommended trust score of web services.
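Sensitivity and specificity, the two metrics named in the abstract above, follow directly from the confusion matrix of a binary classifier. The sketch below assumes "trusted" is the positive class; the label lists are invented toy data, not results from the paper.

```python
def sensitivity_specificity(y_true, y_pred, positive="trusted"):
    """Return (sensitivity, specificity) for a binary labelling problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # trusted instance correctly predicted as trusted
            else:
                fn += 1  # trusted instance missed
        else:
            if pred == positive:
                fp += 1  # untrusted instance wrongly predicted as trusted
            else:
                tn += 1  # untrusted instance correctly rejected
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Toy labels: 4 trusted and 6 untrusted instances (hypothetical).
y_true = ["trusted"] * 4 + ["untrusted"] * 6
y_pred = ["trusted", "trusted", "trusted", "untrusted",
          "untrusted", "untrusted", "untrusted", "untrusted", "trusted", "untrusted"]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Here sensitivity is the share of truly trusted instances that the classifier recovers, and specificity is the share of untrusted instances it correctly rejects.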

Compositional Feature Selection and Its Effects on Bandgap Prediction by Machine Learning (기계학습을 이용한 밴드갭 예측과 소재의 조성기반 특성인자의 효과)

  • Chunghee Nam
    • Korean Journal of Materials Research / v.33 no.4 / pp.164-174 / 2023
  • The bandgap of a semiconductor material is an important factor when using the material in various applications. In this study, based on data provided by AFLOW (Automatic-FLOW for Materials Discovery), the bandgap of a semiconductor material was predicted using only the material's compositional features. The compositional features were generated using the Python modules 'Pymatgen' and 'Matminer'. Pearson's correlation coefficients (PCC) between the compositional features were calculated, and features with a correlation coefficient larger than 0.95 were removed in order to avoid overfitting. The bandgap prediction performance was compared using the $R^2$ score and root-mean-squared error as metrics. With random forest and XGBoost as representative ensemble algorithms, XGBoost was found to give better results after cross-validation and hyper-parameter tuning. To investigate the effect of compositional feature selection on the bandgap prediction of the machine learning model, the prediction performance was studied as a function of the number of features, selected by feature importance methods. Prediction performance did not change significantly beyond an appropriate number of features. Furthermore, artificial neural networks were employed to compare the prediction performance while adjusting the number of features guided by the PCC values, resulting in a best $R^2$ score of 0.811. By comparing and analyzing the bandgap distribution and prediction performance for material groups containing specific elements (F, N, Yb, Eu, Zn, B, Si, Ge, Fe, Al), various information for material design was obtained.
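The correlation-based feature removal described in the abstract above can be sketched as a greedy filter. This is an illustration under assumptions: the abstract does not say which member of a correlated pair is dropped or whether the absolute value of the coefficient is used, so the sketch keeps the first feature encountered and compares absolute values. The feature names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / (sd_x * sd_y)

def drop_correlated(features, threshold=0.95):
    """Keep a feature only if |PCC| with every already-kept feature is <= threshold.

    `features` maps feature name -> list of values; insertion order decides
    which member of a highly correlated pair survives (an assumption).
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical compositional features for five materials.
features = {
    "mean_atomic_mass": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mean_atomic_number": [2.1, 4.0, 6.2, 7.9, 10.1],  # nearly collinear with the first
    "electronegativity_range": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = drop_correlated(features)
```

With real data, the surviving columns would then be passed to the regression model.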

Metric based Performance Measurement of Software Development Methodologies from Traditional to DevOps Automation Culture

  • Poonam Narang;Pooja Mittal
    • International Journal of Computer Science & Network Security / v.23 no.6 / pp.107-114 / 2023
  • Successful implementation of DevOps practices significantly improves software efficiency, collaboration and security. Most organizations are adopting DevOps for faster, higher-quality software delivery. DevOps brings development and operations teams together to overcome the communication gaps responsible for software failures. It relies on different sets of alternative tools to automate the tasks of continuous integration, testing, delivery, deployment and monitoring. Although DevOps is regarded as a reliable and responsible environment for quality software delivery, it lacks quantifiable evidence of its advantage over traditional and agile development methods. This research quantitatively evaluates the performance of DevOps and traditional/agile development methods based on software metrics. It uses three sample projects (code repositories) to quantify the results; for the DevOps integrated tool chain, it uses our earlier proposed and implemented DevOps hybrid model of integrated automation tools. For discussion and validation of the results, tabular and graphical comparisons are included to identify the best-performing model. This comparative and evaluative research will help young researchers and students become well versed in the automation environment of DevOps, the latest buzzword in the software development industry.

Identification of Pb-Zn ore under the condition of low count rate detection of slim hole based on PGNAA technology

  • Haolong Huang;Pingkun Cai;Wenbao Jia;Yan Zhang
    • Nuclear Engineering and Technology / v.55 no.5 / pp.1708-1717 / 2023
  • The grade analysis of lead-zinc ore is the basis for the optimal development and utilization of deposits. In this study, a method combining Prompt Gamma Neutron Activation Analysis (PGNAA) technology and machine learning is proposed for lead-zinc mine borehole logging, which can identify lead-zinc ores of different grades and gangue in the formation, providing real-time grade information qualitatively and semi-quantitatively. Firstly, Monte Carlo simulation is used to obtain a gamma-ray spectrum data set for training and testing machine learning classification algorithms. These spectra are broadened, normalized and separated into inelastic scattering and capture spectra, and then used to fit different classifier models. When the comprehensive grade boundary of high- and low-grade ores is set to 5%, the evaluation metrics calculated by the 5-fold cross-validation show that the SVM (Support Vector Machine), KNN (K-Nearest Neighbor), GNB (Gaussian Naive Bayes) and RF (Random Forest) models can effectively distinguish lead-zinc ore from gangue. At the same time, the GNB model has achieved the optimal accuracy of 91.45% when identifying high- and low-grade ores, and the F1 score for both types of ores is greater than 0.9.
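Two ingredients of the evaluation described in the abstract above, k-fold splitting and the per-class F1 score, can be sketched without any library. The contiguous fold layout and the labels "high" and "low" are assumptions made for illustration; the abstract does not say how the folds were drawn.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the class `positive`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical ore-grade labels and predictions.
y_true = ["high", "high", "high", "low", "low", "low"]
y_pred = ["high", "high", "low", "low", "low", "high"]
f1_high = f1_score(y_true, y_pred, positive="high")
folds = list(k_fold_indices(12, k=5))
```

In a full evaluation, a classifier would be fitted on each `train` index set and scored on the matching `test` set, and the per-fold scores averaged.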

Accuracy Evaluation of Machine Learning Model for Concrete Aging Prediction due to Thermal Effect and Carbonation (콘크리트 탄산화 및 열효과에 의한 경년열화 예측을 위한 기계학습 모델의 정확성 검토)

  • Kim, Hyun-Su
    • Journal of Korean Association for Spatial Structures / v.23 no.4 / pp.81-88 / 2023
  • Numerous factors contribute to the deterioration of reinforced concrete structures. Elevated temperatures significantly alter the composition of the concrete ingredients, consequently diminishing the concrete's strength properties. With the escalation of global CO2 levels, the carbonation of concrete structures has emerged as a critical challenge, substantially affecting concrete durability research. Assessing and predicting concrete degradation due to thermal effects and carbonation are crucial yet intricate tasks. To address this, multiple prediction models for concrete carbonation and compressive strength under thermal impact have been developed. This study employs seven machine learning algorithms (multiple linear regression, decision trees, random forest, support vector machines, k-nearest neighbors, artificial neural networks, and extreme gradient boosting) to formulate predictive models for concrete carbonation and thermal impact. Two distinct datasets, derived from reported experimental studies, were utilized for training these predictive models. Performance evaluation relied on root mean square error, mean square error, mean absolute error, and the coefficient of determination. Hyperparameters were optimized through k-fold cross-validation and grid search. The results show that neural networks and extreme gradient boosting outperform the remaining five machine learning approaches, with strong predictive performance for concrete carbonation and thermal effect modeling.
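Hyperparameter optimization by grid search with k-fold cross-validation, as mentioned in the abstract above, can be sketched with a one-dimensional k-nearest-neighbors regressor (one of the seven algorithm families listed). The data, the candidate grid and the interleaved fold assignment are assumptions chosen to keep the example small; they do not reproduce the study's setup.

```python
def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def cv_mse(xs, ys, k_neighbors, n_folds=5):
    """Cross-validated mean squared error of k-NN regression."""
    n = len(xs)
    squared_error = 0.0
    for fold in range(n_folds):
        # Interleaved folds: sample i is held out in fold i % n_folds.
        train = [i for i in range(n) if i % n_folds != fold]
        test = [i for i in range(n) if i % n_folds == fold]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in test:
            pred = knn_predict(train_x, train_y, xs[i], k_neighbors)
            squared_error += (ys[i] - pred) ** 2
    return squared_error / n

def grid_search(xs, ys, candidates, n_folds=5):
    """Return (best candidate, {candidate: cross-validated MSE})."""
    scores = {k: cv_mse(xs, ys, k, n_folds) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Toy data: a linear trend with an alternating +/-1 perturbation (hypothetical).
xs = [float(i) for i in range(20)]
ys = [2.0 * x + (1.0 if i % 2 else -1.0) for i, x in enumerate(xs)]
best_k, scores = grid_search(xs, ys, candidates=[1, 3, 5, 15])
```

The same loop generalizes to several hyperparameters by iterating over the Cartesian product of their candidate values.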

Development of Machine Learning Based Seismic Response Prediction Model for Shear Wall Structure considering Aging Deteriorations (경년열화를 고려한 전단벽 구조물의 기계학습 기반 지진응답 예측모델 개발)

  • Kim, Hyun-Su;Kim, Yukyung;Lee, So Yeon;Jang, Jun Su
    • Journal of Korean Association for Spatial Structures / v.24 no.2 / pp.83-90 / 2024
  • Machine learning is widely applied in various engineering fields. In structural engineering, machine learning is generally used to predict the structural responses of building structures. The aging deterioration of a reinforced concrete (R.C.) structure affects its structural behavior. Therefore, aging deterioration should be considered in order to predict the seismic responses of the structure accurately. In this study, a machine learning based seismic response prediction model was developed. To this end, four machine learning algorithms were employed and the prediction performance of each algorithm was compared. A 3-story coupled shear wall structure was selected as an example structure for numerical simulation. Artificial ground motions were generated based on domestic site characteristics. Elastic modulus, damping ratio and density were varied to account for concrete degradation due to chloride penetration, carbonation, etc. Various intensity measures were used as input parameters of the training database. Performance evaluation was performed using root mean square error, mean square error, mean absolute error, and the coefficient of determination. Hyperparameters were optimized through k-fold cross-validation and grid search. The analysis results show that neural networks and extreme gradient boosting algorithms present good prediction performance.
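The four regression metrics named in the abstract above have standard definitions, sketched here in plain Python. The response values are invented toy numbers, not data from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and the coefficient of determination (R^2)."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)             # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(r) for r in residuals) / n,
        "r2": 1.0 - ss_res / ss_tot if ss_tot else float("nan"),
    }

# Hypothetical observed and predicted responses.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(y_true, y_pred)
```

RMSE and MAE are in the units of the response, while $R^2$ is dimensionless and equals 1 for a perfect fit.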

Estimation of tunnel boring machine penetration rate: Application of long-short-term memory and meta-heuristic optimization algorithms

  • Mengran Xu;Arsalan Mahmoodzadeh;Abdelkader Mabrouk;Hawkar Hashim Ibrahim;Yasser Alashker;Adil Hussein Mohammed
    • Geomechanics and Engineering / v.39 no.1 / pp.27-41 / 2024
  • Accurately estimating the performance of tunnel boring machines (TBMs) is crucial for mitigating the substantial financial risks and complexities associated with tunnel construction. Machine learning (ML) techniques have emerged as powerful tools for predicting non-linear time series data. In this research, six advanced meta-heuristic optimization algorithms based on long short-term memory (LSTM) networks were developed to predict TBM penetration rate (TBM-PR). The study utilized 1125 datasets, partitioned into 20% for testing, 70% for training, and 10% for validation, incorporating six key input parameters influencing TBM-PR. The performances of these LSTM-based models were rigorously compared using a suite of statistical evaluation metrics. The results underscored the profound impact of optimization algorithms on prediction accuracy. Among the models tested, the LSTM optimized by the particle swarm optimization (PSO) algorithm emerged as the most robust predictor of TBM-PR. Sensitivity analysis further revealed that the orientation of discontinuities, specifically the alpha angle (α), exerted the greatest influence on the model's predictions. This research is significant in that it addresses critical concerns of TBM manufacturers and operators, offering a reliable predictive tool adaptable to varying geological conditions.
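The 70% / 10% / 20% partition of 1125 records mentioned in the abstract above can be sketched as follows. The abstract does not state the exact subset sizes or whether records were shuffled, so the integer rounding rule, the shuffle and the seed here are assumptions; for strictly time-ordered data a chronological split may be more appropriate.

```python
import random

def split_indices(n_samples, train_pct=70, val_pct=10, seed=0):
    """Shuffle indices and split them into train / validation / test lists.

    Percentages are integers so the subset sizes are computed exactly;
    the test set receives whatever remains after train and validation.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

train, val, test = split_indices(1125)
sizes = (len(train), len(val), len(test))  # (787, 112, 226) under this rounding rule
```

The validation subset would typically guide the optimizer that tunes the network, while the test subset is held back for the final comparison of models.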

Online analysis of iron ore slurry using PGNAA technology with artificial neural network

  • Haolong Huang;Pingkun Cai;Xuwen Liang;Wenbao Jia
    • Nuclear Engineering and Technology / v.56 no.7 / pp.2835-2841 / 2024
  • Real-time analysis of metallic mineral grade and slurry concentration is significant for improving flotation efficiency and product quality. This study proposes an online detection method for ore slurry that combines Prompt Gamma Neutron Activation Analysis (PGNAA) technology with an artificial neural network (ANN) and can provide mineral information rapidly and accurately. First, a PGNAA analyzer based on a D-T neutron generator and a BGO detector was used to obtain a gamma-ray spectrum dataset of ore slurry samples, which was used to construct and optimize the ANN model for adaptive analysis. The evaluation metrics calculated by leave-one-out cross-validation indicated that, compared with the weighted library least squares (WLLS) approach, the ANN obtained more precise and stable results, with mean absolute percentage errors of 4.66% and 2.80% for Fe grade and slurry concentration, respectively, and a highest average standard deviation of only 0.0119. Meanwhile, the analytical errors for the samples most affected by matrix effects were reduced to 0.61 and 0.56 times those of the WLLS method, respectively.
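Leave-one-out cross-validation and the mean absolute percentage error (MAPE) reported in the abstract above can be sketched together. A least-squares straight line stands in for the ANN purely to keep the example self-contained, and the signal and grade values are invented; neither reflects the study's model or data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept

def leave_one_out_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        preds.append(predict_line(fit_line(train_x, train_y), xs[i]))
    return preds

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical detector signal vs. Fe grade (%), for illustration only.
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
fe_grade = [20.5, 29.0, 41.0, 50.5, 59.0]
loo_mape = mape(fe_grade, leave_one_out_predictions(signal, fe_grade))
```

Because every sample is held out exactly once, leave-one-out is often used when, as with laboratory slurry samples, the dataset is small.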