• Title/Summary/Keyword: Vector data model

Forecasting volatility via conditional autoregressive value at risk model based on support vector quantile regression

  • Shim, Joo-Yong;Hwang, Chang-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.22 no.3
    • /
    • pp.589-596
    • /
    • 2011
  • The conditional autoregressive value at risk (CAViaR) model is useful for risk management because it does not require the assumption that the conditional distribution is fixed over time apart from its volatility. However, it does not provide volatility forecasts, which are needed for important applications such as option pricing and portfolio management. For a variety of probability distributions, it is known that there is a constant relationship between the standard deviation and the distance between symmetric quantiles in the tails of the distribution. This motivates the use of support vector quantile regression (SVQR) to forecast volatility from the distance between CAViaR forecasts of symmetric quantiles. A simulated example and a real example are provided to illustrate the usefulness of the proposed volatility forecasting method.
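The constant relationship between the standard deviation and the distance between symmetric tail quantiles can be checked numerically. The sketch below is only an illustration for the Gaussian case, using Python's standard library; the function name `quantile_spread_ratio` and the choice of the 5% and 95% quantiles are assumptions, not taken from the paper.

```python
from statistics import NormalDist


def quantile_spread_ratio(sigma, theta=0.05):
    """Distance between the theta and (1 - theta) quantiles, divided by sigma."""
    dist = NormalDist(mu=0.0, sigma=sigma)
    spread = dist.inv_cdf(1.0 - theta) - dist.inv_cdf(theta)
    return spread / sigma


# For a Gaussian, the ratio is the same whatever the volatility is,
# so a forecast of the quantile spread can be rescaled into a volatility forecast.
ratios = [quantile_spread_ratio(s) for s in (0.5, 1.0, 2.0, 10.0)]
print(ratios)
```

For other distributions the constant differs, which is one reason to estimate the mapping from data rather than fix it in advance.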

Machine learning-based Predictive Model of Suicidal Thoughts among Korean Adolescents (머신러닝 기반 한국 청소년의 자살 생각 예측 모델)

  • YeaJu JIN;HyunKi KIM
    • Journal of Korea Artificial Intelligence Association
    • /
    • v.1 no.1
    • /
    • pp.1-6
    • /
    • 2023
  • This study developed models using decision forest, support vector machine, and logistic regression methods to predict and prevent suicidal ideation among Korean adolescents. The study sample consisted of 51,407 individuals after removing missing data from the raw data of the 18th (2022) Youth Health Behavior Survey conducted by the Korea Centers for Disease Control and Prevention. Analysis was performed in MS Azure using Two-Class Decision Forest, Two-Class Support Vector Machine, and Two-Class Logistic Regression. The decision forest model achieved an accuracy of 84.8% and an F1-score of 36.7%, the support vector machine model an accuracy of 86.3% and an F1-score of 24.5%, and the logistic regression model an accuracy of 87.2% and an F1-score of 40.1%. Applying the logistic regression model with SMOTE to address data imbalance resulted in an accuracy of 81.7% and an F1-score of 57.7%. Although the accuracy decreased slightly, recall, precision, and F1-score improved, indicating better overall performance. These findings have significant implications for the development of prediction models for suicidal ideation among Korean adolescents and can contribute to the prevention of youth suicide.
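The gap between high accuracy and low F1-score reported above is typical of imbalanced data. The following minimal sketch computes both metrics from confusion-matrix counts; the counts are hypothetical and chosen only for illustration, they are not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical imbalanced sample: 130 positives among 1000 cases.
# Most negatives are classified correctly, so accuracy looks high,
# but most positives are missed, so recall and F1 stay low.
metrics = classification_metrics(tp=30, fp=20, fn=100, tn=850)
print(metrics)
```

Oversampling the minority class, as SMOTE does, usually trades some accuracy for recall, which matches the direction of the change reported in the abstract.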

Fuzzy c-Regression Using Weighted LS-SVM

  • Hwang, Chang-Ha
    • Korean Data and Information Science Society: Conference Proceedings (한국데이터정보과학회:학술대회논문집)
    • /
    • 2005.10a
    • /
    • pp.161-169
    • /
    • 2005
  • In this paper we propose a fuzzy c-regression model based on the weighted least squares support vector machine (LS-SVM), which can be used to detect outliers in the switching regression model while simultaneously yielding estimates of the outputs together with a fuzzy c-partition of the data. It can be applied to nonlinear regression in which the regression function does not have an explicit form. We illustrate the new algorithm with examples that show how it can be used to detect outliers and to fit mixed data to nonlinear regression models.

Censored varying coefficient regression model using Buckley-James method

  • Shim, Jooyong;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.5
    • /
    • pp.1167-1177
    • /
    • 2017
  • Censored regression using the pseudo-response variable proposed by Buckley and James is one of the best-known models for censored data. Recently, the varying coefficient regression model has received a great deal of attention as an important modeling tool. In this paper we propose a censored varying coefficient regression model using the Buckley-James method to handle situations where the regression coefficients are not constant but change with the smoothing variables. By using the least squares support vector machine (LS-SVM) formulation, the coefficient estimators of the proposed model can be obtained easily from simple linear equations. Furthermore, a generalized cross validation function can be easily derived. We evaluate the proposed method and demonstrate its adequacy on simulated and real data sets.

Research on prediction and analysis of supercritical water heat transfer coefficient based on support vector machine

  • Ma Dongliang;Li Yi;Zhou Tao;Huang Yanping
    • Nuclear Engineering and Technology
    • /
    • v.55 no.11
    • /
    • pp.4102-4111
    • /
    • 2023
  • To improve thermal hydraulic calculation and analysis for supercritical water reactors, a support vector machine (SVM) model was trained on experimental data to predict the heat transfer coefficient of supercritical water. The effect on prediction accuracy of the regularization penalty parameter C, the slack variable epsilon, and the Gaussian kernel function parameter gamma is analyzed. The predictions of the SVM model obtained after parameter optimization are verified against the experimental test data. The results show that normalization of the data has a great influence on the prediction results, that the slack variable has a relatively small influence on the accuracy of the predicted heat transfer coefficient, and that gamma has the greatest impact on that accuracy. Compared with traditional empirical formula methods, the trained SVM model has a smaller average error and standard deviation. With the trained SVM model, the heat transfer coefficient of supercritical water can be effectively predicted and analyzed.
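To make the roles of gamma and epsilon concrete, here is a minimal sketch that assumes the standard epsilon-insensitive SVM regression formulation, in which epsilon is the half-width of the tube inside which errors are ignored and gamma scales the Gaussian kernel. The function names and numeric values are illustrative assumptions, not the paper's code or data.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors smaller than epsilon cost nothing; larger ones cost the excess."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# A larger gamma makes the kernel more local: the same two points
# look less and less similar as gamma grows.
x, z = [0.0, 0.0], [1.0, 1.0]
similarities = {g: rbf_kernel(x, z, g) for g in (0.1, 1.0, 10.0)}

# A larger epsilon tolerates more error before any penalty is charged.
losses = {eps: epsilon_insensitive_loss(10.0, 10.3, eps) for eps in (0.1, 0.5)}
print(similarities, losses)
```

In this formulation C multiplies the summed loss in the training objective, so it sets how strongly errors outside the tube are penalized relative to model smoothness.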

A Study on the Validation of Vector Data Model for River-Geospatial Information and Building Its Portal System (하천공간정보의 벡터데이터 모델 검증 및 포털 구축에 관한 연구)

  • Shin, Hyung-Jin;Chae, Hyo-Sok;Hwang, Eui-Ho
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.17 no.2
    • /
    • pp.95-106
    • /
    • 2014
  • In this study, the applicability of a standard vector model was evaluated using RIMGIS vector data, and a portal-based river-geospatial information web service system was developed using XML- and JSON-based data linkage between the server and the client. The RIMGIS vector data, including points, lines, and polygons, were converted to the Geospatial Data Model (GDM) developed in this study and validated layer by layer. After the conversion, the attribute data of the shapefiles were found to be preserved without loss. The GeoServer GDB (GeoDataBase), which manages the database in the portal, was developed as a management module. The XML-based Geography Markup Language (GML) standard of the OGC was used for accessing and managing vector layers and for encoding spatial data. The separation of data content from presentation in GML allows the same data to be presented in different ways, makes revision and update convenient, and improves extensibility. In the future, access to, exchange of, and storage of river-geospatial information should be improved through user-customized services and better Internet accessibility.
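The separation of data content from its encoding can be sketched as follows: one in-memory point feature is serialized both as JSON and as a GML point. This is a minimal sketch using Python's standard library. The feature record, the wrapper element `Feature`, and the coordinate order are hypothetical; only the `gml:Point` and `gml:pos` element names and the GML namespace URI come from the GML standard, and nothing here reproduces the portal's actual schema.

```python
import json
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
ET.register_namespace("gml", GML_NS)

# Hypothetical feature record; the field names are illustrative only.
feature = {"id": "gauge-001", "name": "Hypothetical gauge", "lon": 127.0, "lat": 37.5}


def to_json(f):
    """Encode the feature as a GeoJSON-style document."""
    return json.dumps({
        "type": "Feature",
        "id": f["id"],
        "geometry": {"type": "Point", "coordinates": [f["lon"], f["lat"]]},
        "properties": {"name": f["name"]},
    })


def to_gml(f):
    """Encode the same feature with a GML point geometry."""
    root = ET.Element("Feature", {"id": f["id"]})  # hypothetical wrapper element
    ET.SubElement(root, "name").text = f["name"]
    point = ET.SubElement(root, f"{{{GML_NS}}}Point")
    # Axis order depends on the coordinate reference system; "lon lat" is assumed here.
    ET.SubElement(point, f"{{{GML_NS}}}pos").text = f"{f['lon']} {f['lat']}"
    return ET.tostring(root, encoding="unicode")


print(to_json(feature))
print(to_gml(feature))
```

Because both encoders read the same record, the data can be revised in one place while each client receives the format it expects.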

REGRESSION WITH CENSORED DATA BY LEAST SQUARES SUPPORT VECTOR MACHINE

  • Kim, Dae-Hak;Shim, Joo-Yong;Oh, Kwang-Sik
    • Journal of the Korean Statistical Society
    • /
    • v.33 no.1
    • /
    • pp.25-34
    • /
    • 2004
  • In this paper we propose a prediction method for the regression model when the training data set contains randomly censored observations. Least squares support vector machine regression is used to predict the regression function, with a weight assessed for each observation incorporated into the optimization problem. Numerical examples are given to show the performance of the proposed prediction method.
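How per-observation weights enter an LS-SVM fit can be sketched with the usual weighted LS-SVM linear system, in which observation i adds 1/(C v_i) to the diagonal of the kernel matrix. This is a sketch under assumptions: the weights v_i below are arbitrary placeholders rather than the censoring-based weights of the paper, and the names `fit_weighted_lssvm`, `C`, and `gamma` are illustrative.

```python
import math


def rbf(x, z, gamma):
    """Gaussian kernel for scalar inputs."""
    return math.exp(-gamma * (x - z) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_weighted_lssvm(xs, ys, weights, C=1000.0, gamma=2.0):
    """Solve [[0, 1^T], [1, K + diag(1 / (C v_i))]] [b; alpha] = [0; y]."""
    n = len(xs)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(xs[i], xs[j], gamma) for j in range(n)]
        row[i + 1] += 1.0 / (C * weights[i])  # small weight = observation trusted less
        A.append(row)
    sol = solve(A, [0.0] + list(ys))
    b, alpha = sol[0], sol[1:]
    return lambda x: b + sum(a * rbf(x, xi, gamma) for a, xi in zip(alpha, xs))


xs = [0.5 * i for i in range(7)]
ys = [math.sin(x) for x in xs]
model = fit_weighted_lssvm(xs, ys, weights=[1.0] * len(xs))
print(model(1.25), math.sin(1.25))
```

The whole fit reduces to a single linear solve, which is the practical appeal of the least squares formulation over the quadratic program of a standard SVM.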

Concept Drift Based on CNN Probability Vector in Data Stream Environment

  • Kim, Tae Yeun;Bae, Sang Hyun
    • Journal of Integrative Natural Science
    • /
    • v.13 no.4
    • /
    • pp.147-151
    • /
    • 2020
  • In this paper, we propose a method to detect concept drift by applying a Convolutional Neural Network (CNN) in a data stream environment. The conventional method compares only the final output value of the CNN and reports a concept drift whenever it differs, so it reacts sensitively to the input values of the data stream and incorrectly reports concept drift even when there is no significant difference. To reduce such errors, this paper uses not only the output value of the CNN but also its probability vector. First, the data entering the data stream are converted into patterns and learned by the neural network model. Then the output value and probability vector of the current data are compared with those of the historical data under the learned model to detect concept drift. The proposed method was confirmed to reduce detection errors compared with detecting concept drift from the CNN output values alone.
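One possible reading of this idea is sketched below: drift is flagged only when the predicted label changes and the probability vectors also differ substantially, so that a near-tie flipping between two classes is not reported. The L1 distance, the threshold value, and the function names are assumptions made for illustration; the abstract does not specify how the vectors are compared.

```python
def argmax(probs):
    """Index of the largest probability, i.e. the predicted class."""
    return max(range(len(probs)), key=probs.__getitem__)


def l1_distance(p, q):
    """Sum of absolute differences between two probability vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def detect_drift(reference_probs, current_probs, threshold=0.5):
    """Flag drift when the label changes AND the probability vectors differ a lot."""
    label_changed = argmax(reference_probs) != argmax(current_probs)
    return label_changed and l1_distance(reference_probs, current_probs) > threshold


# Near-tie: the label flips, but the vectors are almost identical.
near_tie = detect_drift([0.51, 0.49], [0.49, 0.51])
# Clear shift: the label flips and the vectors are far apart.
clear_shift = detect_drift([0.90, 0.10], [0.10, 0.90])
print(near_tie, clear_shift)
```

A detector that looks only at the label would report drift in both cases; adding the probability vector filters out the first one.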

Semiparametric support vector machine for accelerated failure time model

  • Hwang, Chang-Ha;Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.21 no.4
    • /
    • pp.765-775
    • /
    • 2010
  • Considerable effort has been devoted to developing effective estimation methods for the accelerated failure time (AFT) model. The AFT model assumes a linear relationship between the logarithm of the event time and the covariates. In this paper we propose a semiparametric support vector machine for situations where the functional form of the effect of one or more covariates is unknown. The proposed estimating equation can be solved by quadratic programming and a linear equation. We study the effect of several covariates on a censored response variable with an unknown probability distribution. We also provide a generalized approximate cross-validation method for choosing the hyper-parameters that affect the performance of the proposed approach. The proposed method is evaluated through simulations on an artificial example.
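For reference, the linear AFT assumption described above can be written as the first equation below. The second equation is one common way to write a semiparametric extension, with an unknown smooth function g for the covariates whose effect has no assumed form. The notation (T_i, x_i, z_i, beta, g) is an assumption for illustration and may differ from the paper's.

```latex
\begin{align}
  \log T_i &= x_i^{\top}\beta + \varepsilon_i
    && \text{(linear AFT model)} \\
  \log T_i &= x_i^{\top}\beta + g(z_i) + \varepsilon_i
    && \text{(semiparametric form, } g \text{ unknown)}
\end{align}
```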

The Development of a Fault Diagnosis Model Based on Principal Component Analysis and Support Vector Machine for a Polystyrene Reactor (주성분 분석과 서포트 벡터 머신을 이용한 폴리스티렌 중합 반응기 이상 진단 모델 개발)

  • Jeong, Yeonsu;Lee, Chang Jun
    • Korean Chemical Engineering Research
    • /
    • v.60 no.2
    • /
    • pp.223-228
    • /
    • 2022
  • In chemical processes, unintended faults can cause serious accidents. To address them, fault diagnosis models should be designed to identify the root cause of faults, which requires analyzing the process and its data. However, most previous research on fault diagnosis only handles data sets from benchmark processes simulated with commercial programs, which indicates how hard it is to obtain new data sets from real processes. In this study, real fault conditions of an industrial polystyrene process are tested. In this process a runaway reaction occurred and caused a large loss because the operators became aware of the accident too late. To design a proper fault diagnosis model, we analyzed this process and a real accident data set. First, a mode classification model based on a support vector machine (SVM) was trained, and a principal component analysis (PCA) model was constructed for each mode under normal operating conditions. The results show that the proposed model can quickly diagnose the occurrence of a fault and can therefore reduce the potential loss.