• Title/Abstract/Keywords: mean squared prediction error

Sums-of-Products Models for Korean Segment Duration Prediction

  • Chung, Hyun-Song
    • 음성과학, Vol. 10, No. 4, pp. 7-21, 2003
  • Sums-of-Products models were built for segment duration prediction of spoken Korean, with the aim of applying the results to Korean text-to-speech synthesis systems. A total of 670 read sentences were analyzed and used to train and test the duration models. Traditional sequential rule systems were extended to simple additive, multiplicative and additive-multiplicative models based on Sums-of-Products modelling. The parameters used in the modelling include the properties of the target segment and its neighbors and the target segment's position in the prosodic structure. Two optimisation strategies were used: the downhill simplex method and the simulated annealing method. The performance of the models was measured by the correlation coefficient and the root mean squared prediction error (RMSE) between actual and predicted duration in the test data. The best performance was obtained with the additive-multiplicative models. The correlation for the vowel duration prediction was 0.69 and the RMSE was 31.80 ms, while the correlation for the consonant duration prediction was 0.54 and the RMSE was 29.02 ms. The results were not good enough to be applied to real-time text-to-speech systems. Further investigation of feature interactions is required to improve the performance of the Sums-of-Products models.
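The two evaluation measures named in this abstract, the correlation coefficient and the RMSE between actual and predicted durations, can be computed as in the minimal sketch below. The function names and the duration values are hypothetical and are not taken from the paper; the sketch only illustrates the standard definitions.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / n)


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical segment durations in milliseconds (illustration only).
actual_ms = [80.0, 120.0, 95.0, 150.0]
predicted_ms = [90.0, 110.0, 100.0, 140.0]

print(f"RMSE: {rmse(actual_ms, predicted_ms):.2f} ms")
print(f"correlation: {pearson_r(actual_ms, predicted_ms):.2f}")
```

RMSE is expressed in the unit of the target (here milliseconds), while the correlation is unitless, which is why abstracts like this one usually report both.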

Context-Based Minimum MSE Prediction and Entropy Coding for Lossless Image Coding

  • Musik-Kwon;Kim, Hyo-Joon;Kim, Jeong-Kwon;Kim, Jong-Hyo;Lee, Choong-Woong
    • 한국방송∙미디어공학회:학술대회논문집, 한국방송공학회 1999 KOBA Broadcasting Technology Workshop, pp. 83-88, 1999
  • In this paper, a novel gray-scale lossless image coder combining context-based minimum mean squared error (MMSE) prediction and entropy coding is proposed. To obtain the prediction context, the paper first defines directional differences according to the sharpness of edges and the local gradients of the image data. Classifying four directional differences forms a "geometry context" model that characterizes general two-dimensional image behavior such as directional edge regions, smooth regions or texture. Based on this context model, adaptive DPCM prediction coefficients are calculated in the MMSE sense and the prediction is performed. Applying the MMSE method on a context-by-context basis accords better with the minimum entropy condition, which is one of the major objectives of predictive coding. Context modeling is also useful in the entropy coding stage. To reduce the statistical redundancy of the residual image, many contexts are preset to take full advantage of conditional probability in entropy coding and are then merged efficiently into a small number of contexts to reduce complexity. The proposed lossless coding scheme slightly outperforms CALIC, the state of the art, in compression ratio.
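To make "prediction coefficients calculated in the MMSE sense, context by context" concrete, the sketch below fits a two-tap linear predictor by least squares separately for each context. It is an illustration under stated assumptions, not the coder described in the paper: the two-neighbor predictor (west and north pixels), the context labels, the fallback rule and all function names are hypothetical.

```python
def fit_mmse_coefficients(samples):
    """Least-squares fit of x ~ a*west + b*north for one context.

    samples: list of (west, north, actual) pixel triples.
    Solves the 2x2 normal equations with Cramer's rule.
    """
    sww = sum(w * w for w, n, x in samples)
    snn = sum(n * n for w, n, x in samples)
    swn = sum(w * n for w, n, x in samples)
    swx = sum(w * x for w, n, x in samples)
    snx = sum(n * x for w, n, x in samples)
    det = sww * snn - swn * swn
    if det == 0:
        # Singular system: fall back to plain averaging (an arbitrary choice).
        return 0.5, 0.5
    a = (swx * snn - swn * snx) / det
    b = (sww * snx - swn * swx) / det
    return a, b


def fit_per_context(samples_by_context):
    """Fit one (a, b) coefficient pair for every context label."""
    return {ctx: fit_mmse_coefficients(s) for ctx, s in samples_by_context.items()}


# Hypothetical training triples grouped by a hypothetical context label.
samples_by_context = {
    "edge": [(100, 120, 105), (50, 90, 60), (200, 40, 160)],
    "smooth": [(100, 100, 100), (80, 82, 81), (60, 64, 62)],
}

coefficients = fit_per_context(samples_by_context)
for ctx, (a, b) in coefficients.items():
    print(ctx, round(a, 3), round(b, 3))
```

Fitting the coefficients separately per context lets edge regions and smooth regions use different weights, which is the motivation the abstract gives for a context-based predictor.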

관성 마찰용접 공정에서 심층 신경망을 이용한 업셋 길이와 업셋 시간의 예측 (Prediction of Upset Length and Upset Time in Inertia Friction Welding Process Using Deep Neural Network)

  • 양영수;배강열
    • 한국기계가공학회지, Vol. 18, No. 11, pp. 47-56, 2019
  • A deep neural network (DNN) model was proposed to predict the upset in the inertia friction welding process using a database comprising results from a series of FEM analyses. For the database, the upset length, upset beginning time, and upset completion time were extracted from the results of FEM analyses performed with various axial pressures and initial rotational speeds. A total of 35 training sets were constructed to train the proposed DNN, which has 4 hidden layers with 512 neurons in each layer and relates the input parameters to the welding results. After training, the mean squared error between the predicted and true results was within 1.0e-4. The network model was then tested with another 10 sets of welding input parameters and results for comparison with FEM. The test showed that the relative error of the DNN was within 2.8% for the prediction of upset. The results of the DNN application revealed that the model could provide welding results effectively, in terms of both accuracy and cost, for each combination of welding input parameters.

공공 기상데이터와 기계학습 모델을 이용한 토양수분 예측 (Prediction of Soil Moisture with Open Source Weather Data and Machine Learning Algorithms)

  • 장영빈;장익훈;최영찬
    • 한국농림기상학회지, Vol. 22, No. 1, pp. 1-12, 2020
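  • Soil moisture is an essential resource in agriculture and has been managed by predicting its changes and shortages. Recently, soil moisture prediction using statistical and machine learning algorithms, which are easy to apply in the field and generalize well across regions, has been actively studied. However, studies using data generated in Korea remain scarce. This study therefore examines 1) whether a soil moisture prediction model with sufficient performance can be built from Korean public weather data alone, 2) which machine learning model shows the highest predictive performance for data produced in Korea and its soil environment, and 3) whether a single machine learning model can be applied to various regions. Models predicting soil moisture in Andong, Boseong, Cheorwon, and Suncheon were built using Support Vector Machines (SVM), Random Forest (RF), Extremely Randomized Trees (ET), Gradient Boosting Machines (GBM), and Deep Feedforward Network (DFN) algorithms with synoptic weather observation data and agricultural weather observation data. The GBM model showed the lowest prediction error, with an R2 of 0.96 and a Root Mean Squared Error (RMSE) of 1.8. The GBM model also showed the lowest variance of prediction error across regions, making it the most suitable model for generalization.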

Preliminary Study of Deep Learning-based Precipitation

  • Kim, Hee-Un;Bae, Tae-Suk
    • 한국측량학회지, Vol. 35, No. 5, pp. 423-430, 2017
  • Recently, data analysis research using deep learning has been carried out in various fields such as image interpretation and classification, and various types of algorithms are being developed for many applications. In this paper, we propose a high-accuracy precipitation prediction algorithm based on deep learning in order to mitigate the possible severe damage caused by climate change. Since the geographical and seasonal characteristics of Korea are clearly distinct, the meteorological factors have repetitive patterns in a time series. Because the LSTM (Long Short-Term Memory) is a powerful algorithm for sequential data, it was used to predict precipitation in this study. For the numerical test, we calculated the PWV (Precipitable Water Vapor) based on the tropospheric delay of the GNSS (Global Navigation Satellite System) signals, and then applied the deep learning technique to the precipitation prediction. The GNSS data was processed by scientific software with the troposphere model of Saastamoinen and the Niell mapping function. The LSTM-based precipitation prediction achieved a better RMSE (Root Mean Squared Error) than the ANN (Artificial Neural Network). Adding GNSS-based PWV as a feature considerably reduced over-fitting, a latent problem of deep learning, as discussed in this study.

Variable selection and prediction performance of penalized two-part regression with community-based crime data application

  • Seong-Tae Kim;Man Sik Park
    • Communications for Statistical Applications and Methods, Vol. 31, No. 4, pp. 441-457, 2024
  • Semicontinuous data are characterized by a mixture of a point probability mass at zero and a continuous distribution of positive values. This type of data is often modeled using a two-part model, where the first part models the probability of the dichotomous outcome (zero or positive) and the second part models the distribution of positive values. Despite the two-part model's popularity, variable selection in this model has not been fully addressed, especially for high-dimensional data. The objective of this study is to investigate the variable selection and prediction performance of penalized regression methods in two-part models. The performance of the selected techniques in the two-part model is evaluated via simulation studies. Our findings show that LASSO and ENET tend to select more predictors than SCAD and MCP. Consequently, MCP and SCAD outperform LASSO and ENET in β-specificity, while LASSO and ENET perform better than MCP and SCAD with respect to the mean squared error. We find similar results when applying the penalized regression methods to the prediction of crime incidents using community-based data.

LSTM 오토인코더를 활용한 축산 환경 시계열 데이터의 이상치 탐지: 경계값 설정에 따른 성능 비교 (Anomaly Detection in Livestock Environmental Time Series Data Using LSTM Autoencoders: A Comparison of Performance Based on Threshold Settings)

  • 정세연;김상철
    • 스마트미디어저널, Vol. 13, No. 4, pp. 48-56, 2024
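  • In the livestock industry, detecting environmental anomalies and predicting data are very important tasks. Livestock environmental data is mostly collected as time series, and anomalies in it can indicate abrupt changes in the rearing environment or signs of an unexpected infectious disease, so detecting them quickly is important. Rapid detection and an effective response can minimize livestock stress and allow conditions conducive to disease outbreaks to be discovered early, reducing the economic losses of farms. In this study, two methods for setting the threshold that defines an anomaly in livestock environmental data were tested and their performance was compared. An anomaly detection method based on Mean Squared Error (MSE) and one based on a Dynamic Threshold were used to analyze variability relative to the mean of the given preceding data and so identify abnormal situations. The MSE-based method showed an accuracy of 94.98%, while the Dynamic Threshold method based on standard deviation showed an accuracy of 99.66%, confirming its superior performance.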
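The difference between a fixed error threshold and a dynamic, standard-deviation-based threshold can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a list of per-point reconstruction errors is already available (for example from an autoencoder), and the function names, the window size and the multiplier `k` are hypothetical choices.

```python
import statistics


def fixed_threshold_flags(errors, threshold):
    """Flag every point whose error exceeds one global threshold."""
    return [e > threshold for e in errors]


def dynamic_threshold_flags(errors, window=5, k=3.0):
    """Flag a point when its error exceeds mean + k * std of the preceding window."""
    flags = []
    for i, e in enumerate(errors):
        history = errors[max(0, i - window):i]
        if len(history) < 2:
            # Not enough history to estimate a standard deviation yet.
            flags.append(False)
            continue
        mu = statistics.mean(history)
        sigma = statistics.stdev(history)
        flags.append(e > mu + k * sigma)
    return flags


# Hypothetical reconstruction errors with one obvious spike at index 5.
errors = [0.10, 0.12, 0.11, 0.09, 0.10, 0.50, 0.11, 0.10]

print(fixed_threshold_flags(errors, threshold=0.3))
print(dynamic_threshold_flags(errors, window=5, k=3.0))
```

A fixed threshold has to be tuned once for the whole series, whereas the dynamic version adapts to the recent level and spread of the errors. Note that in this sketch a large spike stays in the history window and temporarily raises the threshold for the points that follow it.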

Modelling of dissolved oxygen (DO) in a reservoir using artificial neural networks: Amir Kabir Reservoir, Iran

  • Asadollahfardi, Gholamreza;Aria, Shiva Homayoun;Abaei, Mehrdad
    • Advances in environmental research, Vol. 5, No. 3, pp. 153-167, 2016
  • We applied multilayer perceptron (MLP) and radial basis function (RBF) neural networks to the upstream and downstream water quality stations of the Karaj Reservoir in Iran. For both neural networks, the inputs were pH, turbidity, temperature, chlorophyll-a, biochemical oxygen demand (BOD) and nitrate, and the output was dissolved oxygen (DO). We used an MLP neural network with two hidden layers: 15 and 33 neurons in the first and second layers, respectively, for the upstream station, and 16 and 21 neurons for the downstream station, the configurations that gave the minimum error. During learning, 6-fold cross validation was applied to avoid overfitting. The best results were obtained from the RBF model, for which the mean bias error (MBE) and root mean squared error (RMSE) were 0.063 and 0.10 for the upstream station and 0.0126 and 0.099 for the downstream station. The coefficient of determination ($R^2$) between the observed and predicted data for the upstream and downstream stations was 0.801 and 0.904, respectively, for the MLP and 0.962 and 0.97, respectively, for the RBF network. The MLP neural network gave acceptable results; however, the results of the RBF network were more accurate. A sensitivity analysis for the MLP neural network indicated that temperature was the most influential parameter, pH the second and nitrate the least influential in predicting DO concentrations. The results demonstrated the workability and accuracy of the RBF model in the prediction of DO.

FitRec 기반 달리기 심박수 예측 시스템 (Prediction System of Running Heart Rate based on FitRec)

  • 김진욱;김광현;선준호;이승우;김수현;김진영
    • 한국인터넷방송통신학회논문지, Vol. 22, No. 6, pp. 165-171, 2022
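  • A person's heart rate is an important indicator used as a criterion for measuring exercise intensity. If heart rate can be predicted, exercise intensity can be adjusted in advance during a workout, making the exercise more efficient. This paper proposes a FitRec-based model that predicts the heart rate of a user who is running. Data from Endomondo is used to train the prediction model. For performance comparison, the time-series algorithms LSTM (long short-term memory) and GRU (gated recurrent unit) were used. Training FitRec on running data only, out of the aerobic exercises, improved both MAE (mean absolute error) and RMSE (root mean squared error) compared with a model trained on data from all of the aerobic exercises.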

원인균별 식중독 발생 건수 예측 (Prediction of the Number of Food Poisoning Occurrences by Microbes)

  • 여인권
    • 응용통계연구, Vol. 26, No. 6, pp. 923-932, 2013
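  • This paper proposes a method for predicting the number of food poisoning occurrences in Korea by causative microbe. When the weekly number of food poisoning occurrences reported in Korea is broken down by causative microbe, the data contain many zero observations and the occurrences are dependent on one another. To model this, the paper predicts the total number of food poisoning occurrences with an autoregressive model and estimates the probability of occurrence for each microbe with a multinomial logit model. The predicted total is then multiplied by the estimated per-microbe probabilities to predict the number of occurrences by microbe. To check the validity of the proposed method, it is compared with a zero-inflated model using the mean squared error and the mean absolute deviation.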
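The combination step described here, a forecast of the total count multiplied by per-cause probabilities, can be sketched as below. This is an illustration under stated assumptions rather than the paper's model: an AR(1) form is assumed for the autoregressive part, the linear scores fed to the multinomial logit are given directly instead of being estimated, and every name and number is hypothetical.

```python
import math


def ar1_forecast(last_count, intercept, phi):
    """One-step-ahead forecast from an assumed AR(1) model."""
    return intercept + phi * last_count


def multinomial_logit_probs(scores):
    """Convert per-cause linear scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the maximum for numerical stability
    exp_scores = {cause: math.exp(s - top) for cause, s in scores.items()}
    total = sum(exp_scores.values())
    return {cause: v / total for cause, v in exp_scores.items()}


def counts_by_cause(total_forecast, probs):
    """Split the forecast total across causes in proportion to the probabilities."""
    return {cause: total_forecast * p for cause, p in probs.items()}


# Hypothetical inputs (illustration only).
total = ar1_forecast(last_count=10, intercept=2.0, phi=0.5)
probs = multinomial_logit_probs({"cause_a": 1.0, "cause_b": 0.0, "cause_c": -1.0})
predicted = counts_by_cause(total, probs)

for cause, count in predicted.items():
    print(cause, round(count, 2))
```

Because the probabilities sum to one, the per-cause predictions always add up to the forecast total, and a cause with a small probability gets a prediction close to zero without needing a separate zero-inflation component.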