• Title/Summary/Keyword: Cross validation technique

Estimating the Important Components in Three Different Sample Types of Soybean by Near Infrared Reflectance Spectroscopy

  • Lee, Ho-Sun;Kim, Jung-Bong;Lee, Young-Yi;Lee, Sok-Young;Gwag, Jae-Gyun;Baek, Hyung-Jin;Kim, Chung-Kon;Yoon, Mun-Sup
    • KOREAN JOURNAL OF CROP SCIENCE / v.56 no.1 / pp.88-93 / 2011
  • This experiment was carried out to find the sample type best suited to accurate, non-destructive prediction when near infrared reflectance spectroscopy (NIRS) is used to estimate the protein, total amino acid, and total isoflavone contents of soybean, by comparing three sample types: single seed, whole seeds, and milled seed powder. The coefficient of determination in calibration ($R^2$) and the coefficient of determination in cross-validation (1-VR) for the three components were highest for the milled powder, followed by the single seed, and lowest for the whole seeds. The coefficient of determination in calibration for the single seed was moderately low ($R^2$ 0.70-0.84), while the calibration equation developed from NIRS data scanned on whole seeds showed the lowest accuracy and reliability of the three sample groups. The scatter plot of NIRS data versus reference data showed the widest data cloud for whole seeds, in contrast to the milled powder, which showed a flatter data cloud. Comparing the NIRS results for total isoflavone, total amino acids, and protein across the three sample types, the powder sample gave the most accurate prediction. However, based on these results, using single-seed samples, which avoids grinding and allows faster, non-destructive prediction, proved to be a promising strategy for estimating soybean components with NIRS.

Prediction of Ultimate Bearing Capacity of Soft Soils Reinforced by Gravel Compaction Pile Using Multiple Regression Analysis and Artificial Neural Network (다중회귀분석 및 인공신경망을 이용한 자갈다짐말뚝 개량지반의 극한 지지력 예측)

  • Bong, Tae-Ho;Kim, Byoung-Il
    • Journal of the Korean Geotechnical Society / v.33 no.6 / pp.27-36 / 2017
  • The gravel compaction pile method is widely used to improve soft ground on land and at sea. The ultimate bearing capacity of ground reinforced by gravel compaction piles is affected by the soil strength, the pile replacement ratio, construction conditions, and other factors, and various equations have been proposed to predict it. However, predictions from the existing models show very large error and variation and are not suitable for practical design. In this study, multiple regression analysis was performed on field loading test results to predict the ultimate bearing capacity of ground reinforced by gravel compaction piles. The most efficient input variables were selected by evaluating the error with leave-one-out cross-validation, and a multiple regression equation for predicting the ultimate bearing capacity was proposed. In addition, the prediction error of an artificial neural network using the selected input variables was evaluated, and the results were compared with those of the existing model.
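
The variable-selection step in this abstract (choose the regression inputs that minimise the leave-one-out cross-validation error) can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`solve`, `fit_ols`, `loocv_rmse`, `best_subset`) and the demo data are hypothetical, and an exhaustive search over subsets is assumed only because the example has three candidate inputs.

```python
from itertools import combinations

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * t for z, t in zip(Z, y)) for i in range(p)]
    return solve(A, b)

def predict(beta, row):
    """Evaluate the fitted regression equation for one sample."""
    return beta[0] + sum(c * v for c, v in zip(beta[1:], row))

def loocv_rmse(X, y, cols):
    """RMSE when every sample is held out once and predicted from the rest."""
    sq = 0.0
    for i in range(len(y)):
        Xtr = [[r[c] for c in cols] for k, r in enumerate(X) if k != i]
        ytr = [t for k, t in enumerate(y) if k != i]
        beta = fit_ols(Xtr, ytr)
        sq += (y[i] - predict(beta, [X[i][c] for c in cols])) ** 2
    return (sq / len(y)) ** 0.5

def best_subset(X, y):
    """Exhaustively pick the column subset with the lowest LOOCV RMSE."""
    n_feat = len(X[0])
    subsets = [c for k in range(1, n_feat + 1)
               for c in combinations(range(n_feat), k)]
    return min(subsets, key=lambda c: loocv_rmse(X, y, c))

# Hypothetical data: columns 0 and 1 drive y, column 2 is unrelated.
X_demo = [[1, 2, 5], [2, 1, 3], [3, 4, 1], [4, 3, 7],
          [5, 6, 2], [6, 5, 4], [7, 8, 6], [8, 7, 0]]
y_demo = [1 + 2 * r[0] + 3 * r[1] for r in X_demo]
for cols in [(0,), (0, 1), (2,)]:
    print(cols, round(loocv_rmse(X_demo, y_demo, cols), 3))
```

With many candidate inputs, a stepwise search would replace the exhaustive loop in `best_subset`; the leave-one-out error computation stays the same.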

Prediction of Protein-Protein Interaction Sites Based on 3D Surface Patches Using SVM (SVM 모델을 이용한 3차원 패치 기반 단백질 상호작용 사이트 예측기법)

  • Park, Sung-Hee;Hansen, Bjorn
    • The KIPS Transactions:PartD / v.19D no.1 / pp.21-28 / 2012
  • Prediction of protein interaction sites for monomer structures can reduce the search space for protein docking and is regarded as highly significant for predicting unknown functions of proteins from interacting proteins whose functions are known. On the other hand, the prediction of interaction sites has been limited by the difficulty of crystallizing weakly interacting complexes, which are transient and not stable enough to yield experimental structures by crystallization or even NMR for the most important protein-protein interactions. This work reports the calculation of 3D surface patches of complex structures and their properties, and a machine learning approach that uses a support vector machine (SVM) to build a predictive model for 3D surface patches in interaction and non-interaction sites. To overcome the classification problems caused by class-imbalanced data, we employed an under-sampling technique. Nine properties of the patches were calculated from amino acid compositions and secondary structure elements. With 10-fold cross-validation, the SVM model achieved an accuracy of 92.7% in classifying 3D patches from 147 complexes into interaction and non-interaction sites.
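
The two evaluation steps named in this abstract, under-sampling the majority class and then scoring with 10-fold cross-validation, can be sketched as follows. This is an illustration under stated assumptions, not the paper's pipeline: a nearest-centroid classifier stands in for the SVM so the example needs only the standard library, and all function names are hypothetical.

```python
import random

def undersample(X, y, seed=0):
    """Randomly drop samples from larger classes until all classes match the smallest."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    keep = []
    for idx in by_class.values():
        keep.extend(rng.sample(idx, n_min))
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]

def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]

def centroid_fit(X, y):
    """Stand-in classifier (NOT an SVM): store the mean vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def centroid_predict(model, row):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(model,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(row, model[c])))

def cv_accuracy(X, y, k=10, seed=0):
    """Pooled accuracy over k folds, each fold serving once as the test set."""
    correct = 0
    for test in kfold_indices(len(y), k, seed):
        held = set(test)
        Xtr = [X[i] for i in range(len(y)) if i not in held]
        ytr = [y[i] for i in range(len(y)) if i not in held]
        model = centroid_fit(Xtr, ytr)
        correct += sum(centroid_predict(model, X[i]) == y[i] for i in test)
    return correct / len(y)
```

Under-sampling is applied before the folds are drawn here for brevity; when the held-out estimate must reflect the original class ratio, balancing only the training part of each fold is the more cautious choice.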

Prediction and Verification of Distribution Potential of the Debris Landforms in the Southwest Region of the Korean Peninsula (한반도 서남부 암설사면지형의 분포가능성 예측 및 검증)

  • Lee, Seong-Ho;Jang, Dong-Ho
    • Journal of The Geomorphological Association of Korea / v.27 no.2 / pp.1-17 / 2020
  • This study evaluated a distribution potential area map of debris landforms in the southwest region of the Korean peninsula. A GIS spatial integration technique and logistic regression were used to produce the map. Seven topographic and environmental factors were considered, and 28 different data sets were combined to obtain the most effective results. For accuracy assessment, the extracted distribution potential areas were evaluated with a cross-validation module. Block streams showed the highest accuracy with combination No. 6, and DEM (digital elevation model) and TWI (topographic wetness index) had relatively high influence on the block stream distribution potential area map. Talus showed the highest accuracy with combination No. 13, and slope, TWI, and geology had relatively high influence on the talus distribution potential area map. In addition, fieldwork confirmed that the input data used in this study were relatively accurate, with similar slope and geology. In terms of angularity, the block streams were composed of sub-rounded and sub-angular systems, and talus differed according to the terrain formation. Although the rebound strain measurements made with a Schmidt hammer did not show any difference with topographic conditions, the rebound strain results were judged to reflect the underlying geological setting.

Data Mining Approach Using Practical Swarm Optimization (PSO) to Predicting Going Concern: Evidence from Iranian Companies

  • Salehi, Mahdi;Fard, Fezeh Zahedi
    • Journal of Distribution Science / v.11 no.3 / pp.5-11 / 2013
  • Purpose - Going concern is one of the fundamental concepts in accounting and auditing, and assessing a company's going concern status is sometimes a difficult process. Various going concern prediction models based on statistical and data mining methods have been suggested in the previous literature to help auditors and stakeholders. Research design - This paper employs a data mining approach to predict the going concern status of Iranian firms listed on the Tehran Stock Exchange using Particle Swarm Optimization. In the first step, stepwise discriminant analysis was used to select the final variables from among 42 variables; in the second step, a grid-search technique with 10-fold cross-validation was applied to find the optimal model. Results - The empirical tests show that the particle swarm optimization (PSO) model reached accuracy rates of 99.92% and 99.28% for the training and holdout data, respectively. Conclusions - The authors conclude that the PSO model is applicable to predicting the going concern status of Iranian listed companies.

Corporate credit rating prediction using support vector machines

  • Lee, Yong-Chan
    • Proceedings of the Korea Intelligent Information System Society Conference / 2005.11a / pp.571-578 / 2005
  • Corporate credit rating analysis has drawn considerable research interest, and recent studies have shown that machine learning techniques achieve better performance than traditional statistical ones. This paper applies support vector machines (SVMs) to the corporate credit rating problem in an attempt to suggest a new model with better explanatory power and stability. To this end, the researcher uses a grid-search technique with 5-fold cross-validation to find the optimal parameter values of the SVM kernel function. In addition, to evaluate the prediction accuracy of the SVM, the researcher compares its performance with that of multiple discriminant analysis (MDA), case-based reasoning (CBR), and three-layer fully connected back-propagation neural networks (BPNs). The experimental results show that the SVM outperforms the other methods.
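
Grid search with k-fold cross-validation, as used here to tune the kernel parameters, amounts to scoring every candidate value on held-out folds and keeping the best one. The sketch below shows the idea for a single RBF kernel width `gamma`. It makes two loud assumptions: an RBF-weighted voting rule stands in for the SVM (no SVM is trained), and the function names and candidate grid are hypothetical rather than taken from the paper.

```python
import math
import random

def rbf(a, b, gamma):
    """Gaussian (RBF) kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((u - v) ** 2 for u, v in zip(a, b)))

def kernel_vote(Xtr, ytr, row, gamma):
    """Stand-in for an SVM: training labels in {-1, +1} vote with RBF weights."""
    score = sum(t * rbf(x, row, gamma) for x, t in zip(Xtr, ytr))
    return 1 if score >= 0 else -1

def cv_accuracy(X, y, gamma, k=5, seed=0):
    """Pooled k-fold accuracy of the voting rule for one value of gamma."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    correct = 0
    for test in folds:
        held = set(test)
        Xtr = [X[i] for i in idx if i not in held]
        ytr = [y[i] for i in idx if i not in held]
        correct += sum(kernel_vote(Xtr, ytr, X[i], gamma) == y[i] for i in test)
    return correct / len(y)

def grid_search(X, y, gammas, k=5):
    """Return (best_gamma, scores); scores maps each gamma to its CV accuracy."""
    scores = {g: cv_accuracy(X, y, g, k) for g in gammas}
    best = max(gammas, key=lambda g: scores[g])
    return best, scores
```

Because every candidate is scored on the same folds (fixed `seed`), differences in accuracy come from the parameter value and not from the split. A real SVM would add at least the penalty parameter as a second grid dimension.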

Hybridized Decision Tree methods for Detecting Generic Attack on Ciphertext

  • Alsariera, Yazan Ahmad
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.56-62 / 2021
  • The surge in generic attacks against ciphertext on computer networks has led to continuous advancement of mechanisms to protect information integrity and confidentiality. An explicit decision tree machine learning algorithm is reported to classify generic attacks more accurately than some multi-classification algorithms, as the multi-classification method suffers from detection oversight. However, there is a need to improve the accuracy and reduce the false alarm rate. Therefore, this study aims to improve generic attack classification by implementing two hybridized decision tree algorithms, namely Naïve Bayes Decision tree (NBTree) and Logistic Model tree (LMT). The proposed hybridized methods were developed using the 10-fold cross-validation technique to avoid overfitting. The generic attack detector produced an accuracy of 99.8%, a false positive rate (FPR) of 0.002, and a Matthews correlation coefficient (MCC) of 0.995. The proposed methods performed better than the existing decision tree method and also outperformed multi-classification methods for detecting generic attacks. Hence, hybridized decision tree methods are recommended for detecting generic attacks on a computer network.

Deep-learning based In-situ Monitoring and Prediction System for the Organic Light Emitting Diode

  • Park, Il-Hoo;Cho, Hyeran;Kim, Gyu-Tae
    • Journal of the Semiconductor & Display Technology / v.19 no.4 / pp.126-129 / 2020
  • We introduce a lifetime assessment technique that uses a deep learning algorithm with complex electrical parameters, such as resistivity, permittivity, and impedance parameters, as integrated indicators for predicting the degradation of the organic molecules. The evaluation system consists of a fully automated in-situ measurement system and a multilayer perceptron learning system with five hidden layers and 1011 perceptrons in each layer. Prediction accuracies are calculated and compared for different physical features and learning hyperparameters. 62.5% of the full time-series data are used for training, with a prediction accuracy (r-square value) of 0.99. The remaining 37.5% of the data are used for testing, with a prediction accuracy of 0.95. With k-fold cross-validation, the stability against instantaneous changes in the measured data is also improved.

Updating finite element model using dynamic perturbation method and regularization algorithm

  • Chen, Hua-Peng;Huang, Tian-Li
    • Smart Structures and Systems / v.10 no.4_5 / pp.427-442 / 2012
  • An effective approach for updating a finite element model is presented which can provide reliable estimates for structural updating parameters from identified operational modal data. On the basis of the dynamic perturbation method, an exact relationship between the perturbation of structural parameters such as stiffness change and the modal properties of the tested structure is developed. An iterative solution procedure is then provided to solve for the structural updating parameters that characterise the modifications of structural parameters at element level, giving optimised solutions in the least squares sense without requiring an optimisation method. A regularization algorithm based on the Tikhonov solution incorporating the generalised cross-validation method is employed to reduce the influence of measurement errors in vibration modal data and then to produce stable and reasonable solutions for the structural updating parameters. The Canton Tower benchmark problem established by the Hong Kong Polytechnic University is employed to demonstrate the effectiveness and applicability of the proposed model updating technique. The results from the benchmark problem studies show that the proposed technique can successfully adjust the reduced finite element model of the structure using only a limited number of frequencies identified from the recorded ambient vibration measurements.
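
For orientation, the standard-form Tikhonov solution and the generalised cross-validation (GCV) function on which such a regularization scheme typically relies can be written as below. The notation is generic and assumed here, not taken from the paper: $A$ is the sensitivity matrix, $b$ the vector of measured modal data residuals, $x$ the vector of updating parameters, and $\lambda$ the regularization parameter. The paper's exact formulation may differ.

```latex
% Tikhonov-regularized least-squares solution (standard form)
x_\lambda = \arg\min_x \left\{ \lVert A x - b \rVert_2^2 + \lambda^2 \lVert x \rVert_2^2 \right\}
          = \left( A^{T} A + \lambda^2 I \right)^{-1} A^{T} b

% GCV function; the regularization parameter is chosen as its minimizer
G(\lambda) = \frac{\lVert A x_\lambda - b \rVert_2^2}
                  {\left[ \operatorname{trace}\!\left( I - A \left( A^{T} A + \lambda^2 I \right)^{-1} A^{T} \right) \right]^2},
\qquad
\lambda_{\mathrm{GCV}} = \arg\min_{\lambda > 0} G(\lambda)
```

The numerator measures the data misfit and the denominator the effective number of degrees of freedom left after regularization, so minimizing $G(\lambda)$ balances fitting the noisy modal data against over-smoothing the parameters, without requiring a prior estimate of the noise level.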

Analysis of market share attraction data using LS-SVM (최소제곱 서포트벡터기계를 이용한 시장점유율 자료 분석)

  • Park, Hye-Jung
    • Journal of the Korean Data and Information Science Society / v.20 no.5 / pp.879-886 / 2009
  • The purpose of this article is to present the application of the Least Squares Support Vector Machine (LS-SVM) to analyzing the existing structure of brands. We estimate the parameters of the Market Share Attraction Model using LS-SVM, a non-parametric technique for function estimation that allows nonlinear regression by constructing a linear regression function in a high-dimensional feature space. This makes LS-SVM a good candidate for solving the Market Share Attraction Model. To illustrate the performance of the proposed method, we use car sales data from South Korea's car market.
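
The claim that LS-SVM performs nonlinear regression through a linear problem in feature space can be made concrete: with a kernel, training reduces to solving one linear system for a bias and a set of dual weights. The sketch below uses the usual LS-SVM regression formulation with an RBF kernel on one-dimensional inputs. It is not the article's market-share model; the function names and the default values of `gamma` and `sigma` are assumptions chosen for illustration.

```python
import math

def rbf(a, b, sigma):
    """Gaussian kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # Partial pivoting keeps the elimination stable.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def lssvm_fit(xs, ys, gamma=100.0, sigma=1.0):
    """Solve the LS-SVM system for the bias and the dual weights alpha.

        [ 0   1^T         ] [ bias  ]   [ 0 ]
        [ 1   K + I/gamma ] [ alpha ] = [ y ]
    """
    n = len(xs)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0]
        for j in range(n):
            k_ij = rbf(xs[i], xs[j], sigma)
            # The ridge term 1/gamma appears on the diagonal only.
            row.append(k_ij + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    sol = solve(A, [0.0] + list(ys))
    return sol[0], sol[1:]

def lssvm_predict(x, xs, bias, alpha, sigma=1.0):
    """Prediction f(x) = bias + sum_i alpha_i * k(x, x_i)."""
    return bias + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, xs))

# Hypothetical usage: fit a nonlinear curve sampled on a grid.
xs_demo = [0.5 * i for i in range(-6, 7)]
ys_demo = [math.sin(x) for x in xs_demo]
bias_demo, alpha_demo = lssvm_fit(xs_demo, ys_demo)
print(lssvm_predict(0.25, xs_demo, bias_demo, alpha_demo))
```

In practice `gamma` and `sigma` would themselves be chosen by cross-validation, which is presumably why this entry appears under the search keyword.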
