• Title/Abstract/Keyword: Validation methods

1,858 search results

APPLICATION AND CROSS-VALIDATION OF SPATIAL LOGISTIC MULTIPLE REGRESSION FOR LANDSLIDE SUSCEPTIBILITY ANALYSIS

  • LEE SARO
    • 대한원격탐사학회:학술대회논문집
    • /
    • 대한원격탐사학회 Proceedings of ISRS 2004
    • /
    • pp.302-305
    • /
    • 2004
  • The aim of this study is to apply and cross-validate a spatial logistic multiple-regression model at Boun, Korea, using a Geographic Information System (GIS). Landslide locations in the Boun area were identified by interpretation of aerial photographs and field surveys. Maps of the topography, soil type, forest cover, geology, and land use were constructed from a spatial database. The factors that influence landslide occurrence, such as slope, aspect, and curvature of the topography, were calculated from the topographic database. Texture, material, drainage, and effective soil thickness were extracted from the soil database; type, diameter, and density of forest were extracted from the forest database. Lithology was extracted from the geological database, and land use was classified from a Landsat TM satellite image. Landslide susceptibility was analyzed from these landslide-occurrence factors by logistic multiple regression. For validation and cross-validation, the result of the analysis was applied both to the study area, Boun, and to another area, Youngin, Korea. The validation and cross-validation results showed satisfactory agreement between the susceptibility map and the existing landslide-location data. The GIS was used to analyze the large volume of data efficiently, and statistical programs were used to maintain specificity and accuracy.
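The paper's GIS data are not reproduced here; as a minimal sketch, fitting a logistic multiple-regression susceptibility model on synthetic, standardized factor values (the factor names, coefficients, and grid size below are illustrative, not the paper's) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for GIS-derived factors (e.g. slope, curvature,
# soil thickness) on a grid of terrain cells.
n_cells = 2000
X = rng.normal(size=(n_cells, 3))            # standardized factor values
true_w = np.array([1.5, -0.8, 0.5])          # synthetic ground truth
p_true = 1 / (1 + np.exp(-(X @ true_w - 1.0)))
y = rng.random(n_cells) < p_true             # 1 = landslide observed

# Fit logistic multiple regression by gradient ascent on the log-likelihood.
w = np.zeros(3)
b = 0.0
lr = 0.1
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w += lr * (X.T @ (y - p)) / n_cells
    b += lr * np.mean(y - p)

# Susceptibility score for each cell = predicted landslide probability;
# the fitted model can then be applied to a second area for cross-validation.
susceptibility = 1 / (1 + np.exp(-(X @ w + b)))
```

Cross-validation in the paper's sense means applying the coefficients fitted on one area (Boun) to the factor maps of another (Youngin) and comparing scores against known landslide locations.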


신경망 학습앙상블에 관한 연구 - 주가예측을 중심으로 - (A Study on Training Ensembles of Neural Networks - A Case of Stock Price Prediction)

  • 이영찬;곽수환
    • 지능정보연구
    • /
    • Vol.5 No.1
    • /
    • pp.95-101
    • /
    • 1999
  • In this paper, a comparison is given between different methods of combining predictions from neural networks: bagging, bumping, and balancing. These methods are based on decomposing the ensemble generalization error into an ambiguity term and a term incorporating the generalization performance of the individual networks. Neural networks are prone to overfitting. One strategy to prevent a neural network from overfitting is to stop training at an early stage of the learning process: the complete data set is split into a training set and a validation set, and training is stopped when the error on the validation set starts increasing. The stability of the networks is highly dependent on the division into training and validation sets, and also on the random initial weights and the chosen minimization procedure. This makes early-stopped networks rather unstable: a small change in the data or different initial conditions can produce large changes in the prediction. Therefore, it is advisable to apply the same procedure several times starting from different initial weights, a technique often referred to as training ensembles of neural networks. In this paper, we present a comparison of three statistical methods for preventing overfitting of neural networks.
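The early-stopping-plus-ensemble procedure described above can be sketched with a simple model in place of a neural network (the polynomial regression, learning rate, and patience value below are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a noisy series to be predicted.
x = rng.uniform(-1, 1, 200)
y = np.sin(3 * x) + 0.3 * rng.normal(size=200)
X = np.vander(x, 8)                      # degree-7 polynomial features

def train_early_stopped(X, y, seed):
    """Gradient descent, stopped when the validation error starts rising."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))        # random train/validation split
    tr, va = idx[:150], idx[150:]
    w = rng.normal(scale=0.01, size=X.shape[1])
    best_w, best_err, patience = w.copy(), np.inf, 0
    for _ in range(5000):
        w -= 0.05 * X[tr].T @ (X[tr] @ w - y[tr]) / len(tr)
        va_err = np.mean((X[va] @ w - y[va]) ** 2)
        if va_err < best_err:
            best_w, best_err, patience = w.copy(), va_err, 0
        else:
            patience += 1
            if patience >= 20:           # validation error keeps rising: stop
                break
    return best_w

# Ensemble: repeat with different splits / initial weights, average predictions.
weights = [train_early_stopped(X, y, s) for s in range(10)]
y_ensemble = np.mean([X @ w for w in weights], axis=0)
```

Averaging over members trained from different splits and initializations damps exactly the instability the abstract describes: the ambiguity (disagreement) among members reduces the ensemble's generalization error relative to a typical single member.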


고성능 액체크로마토그래프 기기의 성능검증을 위한 밸리데이션 가이드라인에 대한 연구 (Investigation of Validation Guidelines for Performance Verification of High Performance Liquid Chromatograph)

  • 윤원남;이범규;이원재
    • 약학회지
    • /
    • Vol.57 No.5
    • /
    • pp.362-368
    • /
    • 2013
  • High-performance liquid chromatograph (HPLC) is the analytical instrument most frequently used in analytical laboratories for pharmaceutical analysis. To provide a high level of assurance that the data generated from HPLC analysis are reliable, performance qualification of the HPLC system is required. For this purpose, the performance of the HPLC system should be regularly monitored by examining the key functions of a typical HPLC system (solvent delivery system, injector system, column oven, and UV-VIS detector system). We investigated validation guidelines for the performance verification of these key modules of the HPLC system, and we proposed and evaluated validation guidelines and related verification methods for pharmaceutical analysis that could be applied in practice in Korea.

A-SMGCS 개발에 따른 적정성 평가와 검증방법에 관한 연구 (A Verification & Validation Methodology Study on the Development of A-SMGCS)

  • 홍승범;최승훈;조영진;최연철
    • 한국항공운항학회지
    • /
    • Vol.22 No.2
    • /
    • pp.81-86
    • /
    • 2014
  • In this paper, we state the verification and validation methodology for the modular A-SMGCS defined in the ICAO Manual on Advanced Surface Movement Guidance and Control Systems. Such systems aim to maintain the declared surface movement rate under all weather conditions while maintaining the required level of safety. Under the complete A-SMGCS concept, air traffic controllers, vehicle drivers, and flight crews are assisted in surface operations with surveillance, control, routing/planning, and guidance tasks. Verification and validation for A-SMGCS development are conducted through three methods: real-time simulation, shadow-mode trials, and operational trials. In this study, the characteristics of, and the need for, each verification method were examined.

Estimating Prediction Errors in Binary Classification Problem: Cross-Validation versus Bootstrap

  • Kim Ji-Hyun;Cha Eun-Song
    • Communications for Statistical Applications and Methods
    • /
    • Vol.13 No.1
    • /
    • pp.151-165
    • /
    • 2006
  • It is important to estimate the true misclassification rate of a given classifier when an independent set of test data is not available. Cross-validation and the bootstrap are two possible approaches in this case. In the related literature, bootstrap estimators of the true misclassification rate were asserted to perform better than cross-validation estimators for small samples. We compare the two estimators empirically when the classification rule is so adaptive to the training data that its apparent misclassification rate is close to zero. We confirm that bootstrap estimators perform better for small samples because of their small variance, and we also find that their bias tends to be significant even for moderate to large samples, in which case cross-validation estimators perform better with less computation.
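The two estimators can be sketched for a 1-nearest-neighbour rule, whose apparent (resubstitution) error is zero, matching the "rule adaptive to training data" setting; the sample sizes, class separation, and replicate count below are illustrative, not the paper's design:

```python
import numpy as np

rng = np.random.default_rng(2)

# Small two-class training sample with overlapping classes.
n = 40
X = np.vstack([rng.normal(0.0, 1, (n // 2, 2)),
               rng.normal(1.5, 1, (n // 2, 2))])
y = np.repeat([0, 1], n // 2)

def one_nn(Xtr, ytr, Xte):
    """Predict with the 1-nearest-neighbour rule."""
    d = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    return ytr[d.argmin(axis=1)]

# Leave-one-out cross-validation estimate of the misclassification rate.
loo_err = np.mean([one_nn(np.delete(X, i, 0), np.delete(y, i),
                          X[i:i + 1])[0] != y[i]
                   for i in range(n)])

# Bootstrap estimate: train on a bootstrap sample, test on the
# out-of-bag observations, average over replicates.
B, errs = 200, []
for _ in range(B):
    idx = rng.integers(0, n, n)
    oob = np.setdiff1d(np.arange(n), idx)
    if len(oob) == 0:
        continue
    errs.append(np.mean(one_nn(X[idx], y[idx], X[oob]) != y[oob]))
boot_err = np.mean(errs)
```

Averaging over many bootstrap replicates gives the estimator its low variance; its bias arises because each replicate trains on only about 63.2% of the distinct observations.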

객체지향 설계 및 시뮬레이션을 이용한 자동 물류 핸들링 시스템의 제어 로직 검증 (Validation of the Control Logic for Automated Material Handling System Using an Object-Oriented Design and Simulation Method)

  • 한관희
    • 제어로봇시스템학회논문지
    • /
    • Vol.12 No.8
    • /
    • pp.834-841
    • /
    • 2006
  • Recently, many enterprises have been installing AMSs (Automated Manufacturing Systems) for competitive advantage. As the level of automation increases, proper design and validation of the control logic becomes an imperative task for the successful operation of an AMS. However, current discrete-event simulation methods mainly focus on performance evaluation, and as a result they lack modeling capabilities for the detailed logic of an automated manufacturing system controller. Proposed in this paper is a method for validating the controller logic of an automated material handling system using object-oriented design and simulation. Using this method, FA engineers can validate the controller logic easily at an early stage of system design, reducing the time needed to correct logic errors and enhancing the productivity of control-program development. The generated simulation model can also be used as a communication tool among FA engineers with different backgrounds and disciplines.
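The idea of catching controller-logic faults by simulation before commissioning can be sketched with a toy object-oriented model (the device, states, and event names below are hypothetical illustrations, not the paper's design):

```python
# A conveyor controller modeled as an object-oriented state machine;
# replaying event traces against it surfaces logic faults as assertion
# failures long before the physical system exists.

class Conveyor:
    def __init__(self):
        self.state = "IDLE"          # IDLE -> LOADED -> MOVING -> IDLE

    def load(self):
        assert self.state == "IDLE", "cannot load while busy"
        self.state = "LOADED"

    def start(self):
        assert self.state == "LOADED", "cannot start without a load"
        self.state = "MOVING"

    def unload(self):
        assert self.state == "MOVING", "cannot unload before arrival"
        self.state = "IDLE"

def simulate(controller, events):
    """Replay an event trace against the control logic; an AssertionError
    reveals a sequencing fault in the controller design."""
    for ev in events:
        getattr(controller, ev)()
    return controller.state

# A legal trace runs to completion; an illegal one (e.g. "start" from
# IDLE) raises immediately, pinpointing the faulty transition.
final = simulate(Conveyor(), ["load", "start", "unload", "load"])
```

Because the simulation model is just objects and transitions, it doubles as the shared communication artifact the abstract mentions: engineers from different disciplines can read and challenge the state machine directly.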

Unsupervised learning algorithm for signal validation in emergency situations at nuclear power plants

  • Choi, Younhee;Yoon, Gyeongmin;Kim, Jonghyun
    • Nuclear Engineering and Technology
    • /
    • Vol.54 No.4
    • /
    • pp.1230-1244
    • /
    • 2022
  • This paper proposes an algorithm for signal validation using unsupervised methods in emergency situations at nuclear power plants (NPPs) when signals are rapidly changing. The algorithm aims to determine the stuck failures of signals in real time based on a variational auto-encoder (VAE), which employs unsupervised learning, and long short-term memory (LSTM). The application of unsupervised learning enables the algorithm to detect a wide range of stuck failures, even those that are not trained. First, this paper discusses the potential failure modes of signals in NPPs and reviews previous studies conducted on signal validation. Then, an algorithm for detecting signal failures is proposed by applying LSTM and VAE. To overcome the typical problems of unsupervised learning processes, such as trainability and performance issues, several optimizations are carried out to select the inputs, determine the hyper-parameters of the network, and establish the thresholds to identify signal failures. Finally, the proposed algorithm is validated and demonstrated using a compact nuclear simulator.
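The paper's VAE-LSTM network is beyond a short sketch; as a much simpler stand-in for the same task, a stuck (frozen) sensor during a fast transient can be flagged when its rolling variability collapses while a redundant reference channel keeps moving (the signals, window, and thresholds below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated transient: two redundant channels tracking a rapidly
# decreasing process variable; channel b sticks at t = 60.
t = np.arange(120)
truth = 100 - 0.5 * t
a = truth + rng.normal(0, 0.2, 120)
b = truth + rng.normal(0, 0.2, 120)
b[60:] = b[59]                                # stuck failure

def stuck_mask(sig, ref, win=10, eps=1e-3, move=0.5):
    """Flag samples where `sig` has been frozen over the last `win` steps
    while the reference channel `ref` has clearly moved."""
    flags = np.zeros(len(sig), dtype=bool)
    for i in range(win, len(sig)):
        frozen = np.ptp(sig[i - win:i]) < eps
        moving = abs(ref[i] - ref[i - win]) > move
        flags[i] = frozen and moving
    return flags

flags = stuck_mask(b, a)
first_alarm = int(np.argmax(flags))
```

The reconstruction-error approach in the paper generalizes this: instead of a hand-set window and threshold, the VAE learns normal plant dynamics, and a persistent gap between measured and reconstructed signals marks the failure, including failure modes not seen in training.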

Consensus Clustering for Time Course Gene Expression Microarray Data

  • Kim, Seo-Young;Bae, Jong-Sung
    • Communications for Statistical Applications and Methods
    • /
    • Vol.12 No.2
    • /
    • pp.335-348
    • /
    • 2005
  • The rapid development of microarray technologies has enabled the monitoring of expression levels of thousands of genes simultaneously. Recently, time-course gene expression data have often been measured to study dynamic biological systems and gene regulatory networks. For such data, biologists attempt to group genes based on the temporal pattern of their expression levels. We apply the consensus clustering algorithm to a time-course gene expression data set in order to infer statistically meaningful information from the measurements. We evaluate consensus clustering and existing clustering methods with various validation measures. In this paper, we consider hierarchical clustering and Diana among existing methods, and consensus clustering with hierarchical, Diana, and mixed hierarchical-Diana methods, and evaluate their performance on a real microarray data set and two simulated data sets.
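The core of consensus clustering is a consensus matrix built from repeated clusterings of perturbed data; a minimal sketch (using plain k-means as the base clusterer rather than the paper's hierarchical/Diana methods, with synthetic temporal patterns) might look like:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic stand-in for time-course expression profiles:
# 30 genes, 12 time points, three underlying temporal patterns.
t = np.linspace(0, 2 * np.pi, 12)
genes = np.vstack([np.sin(t) + rng.normal(0, 0.3, (10, 12)),
                   np.cos(t) + rng.normal(0, 0.3, (10, 12)),
                   np.zeros(12) + rng.normal(0, 0.3, (10, 12))])

def kmeans(X, k, rng, iters=50):
    """Plain k-means, used as the base clusterer for each perturbed run."""
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        lab = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(lab == j):
                centers[j] = X[lab == j].mean(0)
    return lab

# Consensus matrix: among runs where genes i and j were both sampled,
# the fraction of runs in which they fell into the same cluster.
n, runs = len(genes), 50
co = np.zeros((n, n))       # times clustered together
both = np.zeros((n, n))     # times sampled together
for _ in range(runs):
    sub = rng.choice(n, int(0.8 * n), replace=False)   # perturb by subsampling
    lab = kmeans(genes[sub], 3, rng)
    for i, gi in enumerate(sub):
        for j, gj in enumerate(sub):
            both[gi, gj] += 1
            co[gi, gj] += lab[i] == lab[j]
consensus = np.divide(co, both, out=np.zeros_like(co), where=both > 0)
```

Genes sharing a temporal pattern end up with consensus values near one, while pairs from different patterns stay near zero; the final partition is then read off (or re-clustered) from this matrix, which is what makes the result stable against any single run's randomness.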

유전체 코호트 연구의 주요 통계학적 과제 (Statistical Issues in Genomic Cohort Studies)

  • 박소희
    • Journal of Preventive Medicine and Public Health
    • /
    • Vol.40 No.2
    • /
    • pp.108-113
    • /
    • 2007
  • When conducting large-scale cohort studies, numerous statistical issues arise across study design, data collection, data analysis, and interpretation. In genomic cohort studies, these statistical problems become more complicated and need to be dealt with carefully. Rapid technical advances in genomic studies produce enormous amounts of data to be analyzed, and traditional statistical methods are no longer sufficient to handle them. In this paper, we review several important statistical issues that occur frequently in large-scale genomic cohort studies, including measurement error and the relevant correction methods, cost-efficient design strategies for main cohort and validation studies, inflated Type I error, gene-gene and gene-environment interactions, and time-varying hazard ratios. It is very important to employ appropriate statistical methods in order to make the best use of valuable cohort data and to produce valid and reliable study results.

A new extension of Lindley distribution: modified validation test, characterizations and different methods of estimation

  • Ibrahim, Mohamed;Yadav, Abhimanyu Singh;Yousof, Haitham M.;Goual, Hafida;Hamedani, G.G.
    • Communications for Statistical Applications and Methods
    • /
    • Vol.26 No.5
    • /
    • pp.473-495
    • /
    • 2019
  • In this paper, a new extension of the Lindley distribution is introduced. Certain characterizations of the proposed distribution based on truncated moments, the hazard and reverse hazard functions, and conditional expectation are presented. Besides these characterizations, other statistical and mathematical properties of the proposed model are also discussed. Parameter estimation is performed through different classical methods of estimation, and Bayes estimation is computed under a gamma informative prior with the squared-error loss function. The performance of all estimation methods is studied via Monte Carlo simulation in the mean-square-error sense. The potential of the proposed model is analyzed through two data sets. A modified goodness-of-fit test using the Nikulin-Rao-Robson statistic is investigated via two examples, and it is observed that the new extension might serve as an alternative lifetime model.
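The paper's extension is not reproduced here, but for the base Lindley distribution the maximum-likelihood estimator has a known closed form (the root of a quadratic in the sample mean), and its mean-square error can be checked by Monte Carlo in the same spirit as the paper's simulation study; the true parameter, sample size, and replicate count below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

def rlindley(theta, size, rng):
    """Sample Lindley(theta): a mixture of Exp(theta) and Gamma(2, theta)
    with weights theta/(1+theta) and 1/(1+theta)."""
    exp_part = rng.random(size) < theta / (1 + theta)
    out = rng.gamma(2, 1 / theta, size)          # Gamma(shape=2, scale=1/theta)
    out[exp_part] = rng.exponential(1 / theta, exp_part.sum())
    return out

def mle(x):
    """Closed-form Lindley MLE: positive root of xbar*t^2 + (xbar-1)*t - 2 = 0,
    obtained by setting the score 2n/theta - n/(1+theta) - n*xbar to zero."""
    m = x.mean()
    return (-(m - 1) + np.sqrt((m - 1) ** 2 + 8 * m)) / (2 * m)

# Monte Carlo MSE of the MLE at a chosen true parameter value.
theta = 1.5
ests = np.array([mle(rlindley(theta, 100, rng)) for _ in range(1000)])
mse = np.mean((ests - theta) ** 2)
```

The same loop, with the estimator swapped for each of the paper's classical and Bayes estimators, is how an MSE comparison like the one described in the abstract is assembled.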