• Title/Summary/Keyword: Cross-Validation

Development of a Probability Prediction Model for Tropical Cyclone Genesis in the Northwestern Pacific using the Logistic Regression Method

  • Choi, Ki-Seon;Kang, Ki-Ryong;Kim, Do-Woo;Kim, Tae-Ryong
    • Journal of the Korean earth science society
    • /
    • v.31 no.5
    • /
    • pp.454-464
    • /
    • 2010
  • A probability prediction model for tropical cyclone (TC) genesis in the Northwestern Pacific was developed using logistic regression. The model uses five predictors: lower-level relative vorticity, vertical wind shear, mid-level relative humidity, upper-level equivalent potential temperature, and sea surface temperature (SST). The four predictors other than SST were taken as the difference in spatially averaged values between May and January, and the SST effect was represented by the Niño-3.4 index averaged from February to April. In predictions of TC genesis frequency from June to December for 1951 to 2007, the model predicted 21 (22) years with higher (lower) frequency than the normal year, while the observed data contain 28 (29) such years. The overall predictability was about 75%, and the reliability of the model was also verified statistically by cross-validation.
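As a quick consistency check, the reported 75% can be recomputed from the counts in the abstract. This sketch assumes that "overall predictability" means the number of years the model classified correctly divided by the number of observed years; the abstract does not state that definition.

```python
# Counts are taken from the abstract; the hit-rate definition is an assumption.
predicted_high, predicted_low = 21, 22   # years the model placed in each category
observed_high, observed_low = 28, 29     # years observed in each category

def hit_rate(correct, total):
    """Fraction of observed years that the model classified correctly."""
    return correct / total

overall = hit_rate(predicted_high + predicted_low, observed_high + observed_low)
print(f"{overall:.1%}")  # prints 75.4%
```

Under that assumption the ratio is 43/57, which agrees with the "about 75%" quoted above.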

Multiple octave-band based genre classification algorithm for music recommendation (음악추천을 위한 다중 옥타브 밴드 기반 장르 분류기)

  • Lim, Shin-Cheol;Jang, Sei-Jin;Lee, Seok-Pil;Kim, Moo-Young
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.7
    • /
    • pp.1487-1494
    • /
    • 2011
  • This paper proposes a novel genre classification algorithm for music recommendation systems. To improve classification accuracy, the band-pass filter for the octave-based spectral contrast (OSC) feature is designed with the psycho-acoustic model and the actual frequency range of musical instruments in mind. The GTZAN database, which covers 10 genres, was used for 10-fold cross-validation experiments. The proposed multiple-octave OSC improves accuracy by 2.26% over the conventional OSC. A combined feature vector based on the proposed OSC and mel-frequency cepstral coefficients (MFCC) gives even better accuracy.
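The 10-fold protocol mentioned above can be sketched in plain Python. The function name `k_fold_indices` is hypothetical, and the sample count of 1000 is an assumption about the usual GTZAN distribution rather than a figure from the abstract. The split here is not stratified by genre, which the authors may or may not have done.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# Assumed dataset size: 1000 clips (not stated in the abstract).
splits = list(k_fold_indices(1000, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each sample is held out exactly once, so the accuracy averaged over the ten test folds uses every clip for evaluation while never testing on a clip the model was trained on.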

Composing Recommended Route through Machine Learning of Navigational Data (항적 데이터 학습을 통한 추천 항로 구성에 관한 연구)

  • Kim, Joo-Sung;Jeong, Jung Sik;Lee, Seong-Yong;Lee, Eun-seok
    • Proceedings of the Korean Institute of Navigation and Port Research Conference
    • /
    • 2016.05a
    • /
    • pp.285-286
    • /
    • 2016
  • We propose a method for predicting a ship's position by extracting a trajectory model, through pattern recognition, from data collected in real time at VTS centers. A Support Vector Machine algorithm was used for data modeling. The optimal parameters were calculated with k-fold cross-validation and grid search. We expect that the proposed modeling method could support VTS operators' decision making in complex encounter situations.
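The combination of grid search and k-fold cross-validation can be illustrated with a small standard-library sketch. To keep it self-contained, a 1-D k-nearest-neighbour classifier stands in for the Support Vector Machine, and the data are synthetic; all function names, the parameter grid, and the data are assumptions made for illustration, not the authors' setup.

```python
import itertools
import random

def k_folds(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train, x, n_neighbors):
    """Majority vote (labels 0/1) among the nearest 1-D training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:n_neighbors]
    return 1 if 2 * sum(label for _, label in nearest) > len(nearest) else 0

def cv_accuracy(data, n_neighbors, k=5):
    """Accuracy pooled over the k held-out folds."""
    correct = 0
    for test_idx in k_folds(len(data), k):
        held_out = set(test_idx)
        train = [p for j, p in enumerate(data) if j not in held_out]
        correct += sum(knn_predict(train, data[j][0], n_neighbors) == data[j][1]
                       for j in test_idx)
    return correct / len(data)

def grid_search(data, param_grid, k=5):
    """Score every parameter combination by k-fold CV and return the best."""
    names = list(param_grid)
    scores = {}
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        scores[values] = cv_accuracy(data, k=k, **params)
    best = max(scores, key=scores.get)
    return dict(zip(names, best)), scores

# Synthetic toy data: label is 1 when x > 0.5, with about 10% of labels flipped.
rng = random.Random(42)
data = []
for _ in range(100):
    x = rng.random()
    label = int(x > 0.5)
    if rng.random() < 0.1:
        label = 1 - label
    data.append((x, label))

best_params, scores = grid_search(data, {"n_neighbors": [1, 3, 5, 7]})
print(best_params, scores)
```

For an SVM the grid would typically span more than one hyperparameter; `itertools.product` handles that case without changes to `grid_search`, provided the scoring function accepts the extra keyword arguments.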

A Study on Exploration of the Recommended Model of Decision Tree to Predict a Hard-to-Measure Measurement in Anthropometric Survey (인체측정조사에서 측정곤란부위 예측을 위한 의사결정나무 추천 모형 탐지에 관한 연구)

  • Choi, J.H.;Kim, S.K.
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.5
    • /
    • pp.923-935
    • /
    • 2009
  • This study explores a recommended decision tree model for predicting a hard-to-measure measurement in an anthropometric survey. We carry out a cross-validation experiment to obtain the recommended model. We compare three decision tree split rules: CHAID, Exhaustive CHAID, and CART. CART gives the best result on real-world data.

A Study on the Structure of Neural Network for Predicting Defect Size of Steam Generator Tube in Nuclear Power Plant (원전SG 세관 결함크기 예측을 위한 신경회로망 구조에 관한 연구)

  • Jo, Nam-Hoon
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers
    • /
    • v.24 no.1
    • /
    • pp.63-70
    • /
    • 2010
  • In this paper, we study the structure of a neural network for predicting the defect size of steam generator tubes. After features are extracted from the eddy current testing (ECT) signals, multi-layer neural networks are used to predict the defect size. To maximize prediction performance, the structure of the network must be chosen carefully, especially the number of neurons in the hidden layer. We show that, for the prediction of defect size, the number of neurons in the hidden layer can be determined efficiently by cross-validation.

Validation Study of Korean Version of the Rothbart's Children's Behavior Questionnaire (한국판 Rothbart 유아용 기질 척도(Children's Behavior Questionnaire)의 타당화)

  • Lim, Ji-Young;Bae, Yun-Jin
    • Korean Journal of Human Ecology
    • /
    • v.24 no.4
    • /
    • pp.477-497
    • /
    • 2015
  • The purpose of this study was to examine the psychometric properties of the Children's Behavior Questionnaire (CBQ), including reliability, content validity, construct validity, cross validity, and concurrent validity with the EAS (Emotionality, Activity, Sociability) scale. The CBQ is a caregiver-report measure designed to provide a detailed assessment of temperament in children 3 to 7 years of age. Two groups of participants were included to check cross validity. The first group comprised 108 preschoolers, 3 to 7 years of age and attending kindergartens or child care centers, and their mothers. The second group comprised 168 preschoolers and their mothers. The CBQ subscales demonstrate adequate internal consistency. Exploratory and confirmatory factor analyses of the CBQ scale reliably recover a three-factor solution indicating three broad dimensions of temperament: extraversion/surgency, negative affectivity, and effortful control. Evidence for concurrent validity derives from correlation analysis with the EAS scale.

A Study on the Improvement of Annual Runoff Estimation Model (연유출량 추정모형의 개선방안)

  • 이상훈
    • Water for future
    • /
    • v.26 no.1
    • /
    • pp.51-62
    • /
    • 1993
  • The most significant factor in estimating annual runoff must be precipitation. In the previous study, however, the watershed area rather than precipitation was included as an independent variable in the regression model during the process of checking data accuracy. The criterion for accurate data was a runoff ratio in the range of 20% to 100%. In this study, the valid range of evapotranspiration was adopted as the criterion for accurate data, and the same data were reexamined. This yielded the following model, which has a high coefficient of determination and conforms to hydrologic theory: R = -518.25 + 0.8834P, where R is the runoff depth (mm) and P is the precipitation (mm). This regression model was found to be stable by cross-validation and is proposed as an annual runoff estimation model applicable to ungaged small and medium watersheds in Korea.
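The regression equation above can be evaluated directly. The function name `annual_runoff_mm` and the precipitation value of 1200 mm are hypothetical choices for illustration and do not come from the paper.

```python
def annual_runoff_mm(precipitation_mm):
    """Runoff depth R (mm) from annual precipitation P (mm): R = -518.25 + 0.8834 * P."""
    return -518.25 + 0.8834 * precipitation_mm

p = 1200.0  # hypothetical annual precipitation in mm, not a value from the paper
r = annual_runoff_mm(p)
runoff_ratio = r / p
print(round(r, 2), round(runoff_ratio, 3))
```

For P = 1200 mm the equation gives a runoff depth of about 541.83 mm, a runoff ratio of roughly 0.45. Because of the negative intercept, the equation returns negative runoff for precipitation below about 587 mm, so it is only meaningful above that value.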

Noise Removal using Support Vector Regression in Noisy Document Images

  • Kim, Hee-Hoon;Kang, Seung-Hyo;Park, Jai-Hyun;Ha, Hyun-Ho;Lim, Dong-Hoon
    • The Korean Journal of Applied Statistics
    • /
    • v.25 no.4
    • /
    • pp.669-680
    • /
    • 2012
  • Noise removal is a necessary preprocessing step for effective character recognition in document images because it greatly influences both processing speed and recognition performance. Spatial filters such as traditional mean filters and Gaussian filters, as well as wavelet-transform-based methods, have been considered for noise reduction in natural images. However, these methods are not effective for noise removal in document images. In this paper, we present noise removal for document images using support vector regression (SVR). The proposed approach consists of two steps: an SVR training step and an SVR test step. In the training step we construct an optimal prediction model using grid search with cross-validation, and in the test step we apply it to noisy images to remove the noise. We evaluate our SVR-based method both quantitatively and qualitatively on Korean, English, and Chinese character documents, and compare it with some existing methods. Experimental results indicate that the proposed method is more effective and yields satisfactory removal results.

A Study on the Prediction of Traffic Counts Based on Shortest Travel Path (최단경로 기반 교통량 공간 예측에 관한 연구)

  • Heo, Tae-Young;Park, Man-Sik;Eom, Jin-Ki;Oh, Ju-Sam
    • The Korean Journal of Applied Statistics
    • /
    • v.20 no.3
    • /
    • pp.459-473
    • /
    • 2007
  • In this paper, we suggest a spatial regression model to predict annual average daily traffic (AADT). Although Euclidean distances between a monitoring site and its neighboring sites are used in many analyses, we consider the shortest travel path between monitoring sites to predict AADT at unmonitored sites with a spatial regression model. We used the universal Kriging method for prediction and found, by cross-validation, that the overall predictive capability of the spatial regression model based on the shortest travel path is better than that of a model based on multiple regression.

Computational Detection of Prokaryotic Core Promoters in Genomic Sequences

  • Kim Ki-Bong;Sim Jeong Seop
    • Journal of Microbiology
    • /
    • v.43 no.5
    • /
    • pp.411-416
    • /
    • 2005
  • The high-throughput sequencing of microbial genomes has resulted in the relatively rapid accumulation of an enormous amount of genomic sequence data. In this context, the detection of promoters in genomic DNA sequences via computational methods has attracted considerable research attention in recent years. This paper addresses the development of a predictive model, known as the dependence decomposition weight matrix model (DDWMM), which was designed to detect the core promoter region, including the -10 region and the transcription start sites (TSSs), in prokaryotic genomic DNA sequences. This is an issue of some importance for genome annotation efforts. Our predictive model captures the most significant dependencies between positions (allowing for non-adjacent as well as adjacent dependencies) via the maximal dependence decomposition (MDD) procedure, which iteratively decomposes data sets into subsets based on the significant dependence between positions in the promoter region to be modeled. Such dependencies may be intimately related to biological and structural concerns, since promoter elements are present in a variety of combinations, which are separated by various distances. In this respect, the DDWMM may prove appropriate for the detection of core promoter regions and TSSs in long microbial genomic contigs. To demonstrate the effectiveness of our predictive model, we ran 10-fold cross-validation experiments on 607 experimentally verified promoter sequences, which showed good performance in terms of sensitivity.
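Sensitivity, the metric named above, is the share of true promoters that the model detects. The sketch below only illustrates the formula: the total of 607 sequences comes from the abstract, while the detected count is a hypothetical placeholder and not a result reported by the authors.

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity (recall): detected positives divided by all real positives."""
    return true_positives / (true_positives + false_negatives)

n_promoters = 607   # number of verified promoter sequences, from the abstract
detected = 500      # hypothetical count for illustration, NOT a reported result

value = sensitivity(detected, n_promoters - detected)
print(f"{value:.3f}")
```

In a 10-fold experiment, the true-positive and false-negative counts would be accumulated over the ten held-out folds before applying the formula.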