• Title/Abstract/Keyword: Prediction of variables


Determination of Carbon Equivalent Equation by Using Neural Network for Roll Force Prediction in Hot Strip Mill (신경망을 이용한 열간 압연하중 예측용 탄소당량식의 개발)

  • 김필호;문영훈;이준정
    • Transactions of Materials Processing
    • /
    • v.6 no.6
    • /
    • pp.482-488
    • /
    • 1997
  • A new carbon equivalent equation for better prediction of roll force in a continuous hot strip mill has been formulated by applying a neural network method. In predicting the roll force of a steel strip, the carbon equivalent equation, which normalizes the effects of various alloying elements into a single carbon equivalent content, is critical for accurate prediction. To handle the complex relationships between alloying elements and operational variables such as temperature, strain, and strain rate, a neural network method, which is effective for multi-variable analysis, was adopted in the present work as a tool to determine a proper carbon equivalent equation. Applying the newly formulated carbon equivalent equation increased the prediction accuracy of the roll force significantly, confirming the effectiveness of the neural network method.
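The abstract does not reproduce the derived equation. As an illustration only, here is the conventional IIW carbon-equivalent formula that such equations generalize — this is the standard textbook form, not the paper's neural-network-derived equation:

```python
def carbon_equivalent_iiw(c, mn, cr=0.0, mo=0.0, v=0.0, ni=0.0, cu=0.0):
    """Standard IIW carbon equivalent (element contents in wt%).
    The paper derives its own equation via a neural network; this sketch
    only shows how alloying elements are folded into one carbon-equivalent
    content, which is the quantity the roll-force model consumes."""
    return c + mn / 6.0 + (cr + mo + v) / 5.0 + (ni + cu) / 15.0
```

For example, a 0.10 wt% C, 0.60 wt% Mn steel gives a carbon equivalent of 0.20.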


Development of the Machine Learning-based Employment Prediction Model for Internship Applicants (인턴십 지원자를 위한 기계학습기반 취업예측 모델 개발)

  • Kim, Hyun Soo;Kim, Sunho;Kim, Do Hyun
    • Journal of the Semiconductor & Display Technology
    • /
    • v.21 no.2
    • /
    • pp.138-143
    • /
    • 2022
  • The employment prediction model proposed in this paper uses 16 independent variables, including the self-introductions of M University students who applied for IPP and work-study internships, and a dependent variable with three classes: large company, mid-sized company, and unemployment. The employment prediction model for large companies was developed using Random Forest and Word2Vec, achieving a weighted F1 score of 82.4%. The employment prediction model for medium-sized companies and above was developed using Logistic Regression and Word2Vec, achieving a weighted F1 score of 73.24%. These two models can be actively used to predict employment in large and medium-sized companies for M University students in the future.
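The weighted F1 metric reported for both models has a standard definition — per-class F1 averaged by class support — which can be sketched in plain Python (this is the generic definition, not the study's code):

```python
from collections import Counter

def f1_weighted(y_true, y_pred):
    """Support-weighted F1: per-class F1 scores averaged with weights
    proportional to each class's frequency in y_true."""
    classes = sorted(set(y_true))
    support = Counter(y_true)
    total = 0.0
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        total += support[c] / len(y_true) * f1
    return total
```

With a three-class label (large / mid-sized / unemployed) this weighting keeps the score from being dominated by the rarest class.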

Development of the Plywood Demand Prediction Model

  • Kim, Dong-Jun
    • Journal of Korean Society of Forest Science
    • /
    • v.97 no.2
    • /
    • pp.140-143
    • /
    • 2008
  • This study compared the plywood demand prediction accuracy of econometric and vector autoregressive models using Korean data. The econometric model of plywood demand was specified with three explanatory variables: own price, construction permit area, and a dummy. The vector autoregressive model was specified with a lagged endogenous variable, own price, construction permit area, and the dummy. The dummy variable reflected the abrupt decrease in plywood consumption in the late 1990s. The prediction accuracy was estimated on the basis of Residual Mean Squared Error, Mean Absolute Percentage Error, and Theil's Inequality Coefficient. The results showed that plywood demand can be predicted more accurately by the econometric model than by the vector autoregressive model.
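The three accuracy measures named above have standard definitions, which can be sketched as plain-Python functions (generic formulas, not the authors' code):

```python
import math

def rmse(actual, pred):
    """Root mean squared error of a forecast."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))

def mape(actual, pred):
    """Mean absolute percentage error (assumes nonzero actuals)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)

def theil_u(actual, pred):
    """Theil's inequality coefficient: 0 for a perfect forecast,
    bounded above by 1; lower is better."""
    den = (math.sqrt(sum(a * a for a in actual) / len(actual))
           + math.sqrt(sum(p * p for p in pred) / len(pred)))
    return rmse(actual, pred) / den if den else 0.0
```

A forecast of [110, 180] against actuals [100, 200] gives a MAPE of 10%.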

Prediction of Quantitative Traits Using Common Genetic Variants: Application to Body Mass Index

  • Bae, Sunghwan;Choi, Sungkyoung;Kim, Sung Min;Park, Taesung
    • Genomics & Informatics
    • /
    • v.14 no.4
    • /
    • pp.149-159
    • /
    • 2016
  • With the success of the genome-wide association studies (GWASs), many candidate loci for complex human diseases have been reported in the GWAS catalog. Recently, many disease prediction models based on penalized regression or statistical learning methods were proposed using candidate causal variants from significant single-nucleotide polymorphisms of GWASs. However, there have been only a few systematic studies comparing existing methods. In this study, we first constructed risk prediction models, such as stepwise linear regression (SLR), least absolute shrinkage and selection operator (LASSO), and Elastic-Net (EN), using a GWAS chip and GWAS catalog. We then compared the prediction accuracy by calculating the mean square error (MSE) value on data from the Korea Association Resource (KARE) with body mass index. Our results show that SLR provides a smaller MSE value than the other methods, while the numbers of selected variables in each model were similar.
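The penalized methods compared above (LASSO, Elastic-Net) shrink coefficients via the soft-thresholding operator, and all models are ranked by MSE. Both pieces can be sketched with their standard definitions (not the study's implementation):

```python
def soft_threshold(z, lam):
    """LASSO shrinkage operator S(z, lam) = sign(z) * max(|z| - lam, 0),
    the update used inside coordinate-descent LASSO/Elastic-Net solvers.
    Coefficients with |z| <= lam are zeroed out, which is how LASSO
    performs variable selection among GWAS candidate variants."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def mse(y, yhat):
    """Mean squared error, the comparison criterion used in the study."""
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)
```

A smaller penalty `lam` keeps more variants in the model; the study's finding is that plain stepwise linear regression still achieved the lowest MSE on the KARE body mass index data.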

Prediction Models and Decision Trees for Clearance Times of Unexpected Incidents According to Traffic Accident Classifications on Highways (고속도로 사고등급별 돌발상황 처리시간 예측모형 및 의사결정나무 개발)

  • Ha, Oh-Keun;Park, Dong-Joo;Won, Jai-Mu;Jung, Chul-Ho
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.9 no.1
    • /
    • pp.101-110
    • /
    • 2010
  • In this study, a prediction model for incident clearance time was developed to cope with the increasing demand for information on accident response times. For this, incidents were classified by grade into A, B, and C, and fifteen independent variables were utilized, including traffic volume, the number of accident-related vehicles, and the accident time zone. As a result, traffic volume, the presence of heavy vehicles, and the accident time zone were found to be important variables, and the model showed a reasonable degree of explanatory power. In addition, when the CHAID technique was applied, an Answer Tree was constructed based on the variables included in the prediction model. Using the developed Answer Tree model, accidents were first classified into grades A, B, and C, and then grouped according to traffic volume. This study is expected to contribute to providing expressway users with quicker and more effective traffic information through the clearance time prediction model and the Answer Tree when incidents happen on an expressway.
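The two-level structure described (split on incident grade first, then on traffic volume) can be sketched as a simple rule tree. All thresholds and clearance minutes below are illustrative placeholders, not values from the paper:

```python
def predicted_clearance_minutes(grade, traffic_volume_vph):
    """Toy two-level rule tree mirroring the described Answer Tree shape:
    primary split on incident grade (A/B/C), secondary split on traffic
    volume (vehicles per hour). Thresholds and outputs are hypothetical."""
    if grade == "A":                       # most severe incidents
        return 120 if traffic_volume_vph > 3000 else 90
    if grade == "B":
        return 70 if traffic_volume_vph > 3000 else 50
    return 30                              # grade C: minor incidents
```

A CHAID-built tree would choose these split points from the data rather than fixing them by hand.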

Movie Box-office Prediction using Deep Learning and Feature Selection : Focusing on Multivariate Time Series

  • Byun, Jun-Hyung;Kim, Ji-Ho;Choi, Young-Jin;Lee, Hong-Chul
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.6
    • /
    • pp.35-47
    • /
    • 2020
  • Box-office prediction is important to movie stakeholders, so it is necessary to predict the box-office accurately and to select the important variables. In this paper, we propose a multivariate time series classification and important-variable selection method to improve the accuracy of box-office prediction. We collected daily data from KOBIS and NAVER for South Korean movies, selected important variables using Random Forest, and predicted the multivariate time series using deep learning. Based on the Korean screen quota system, deep learning was used to compare the accuracy of box-office predictions on the 73rd day from release using the important variables versus all variables, and the results were tested for statistical significance. As deep learning models, a Multi-Layer Perceptron, Fully Convolutional Neural Networks, and a Residual Network were used. Among them, the model using the important variables and the Residual Network had the highest prediction accuracy, at 93%.
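Of the three deep models compared, the Multi-Layer Perceptron is the simplest; its forward pass can be sketched in a few lines of plain Python. This is a toy illustration of the architecture class, not the paper's network:

```python
def relu(x):
    """Rectified linear activation."""
    return x if x > 0.0 else 0.0

def mlp_forward(x, w1, b1, w2, b2):
    """Forward pass of a one-hidden-layer MLP with a scalar output.
    x: input feature vector; w1: one weight row per hidden unit;
    b1: hidden biases; w2: output weights; b2: output bias.
    All weights here would normally come from training."""
    hidden = [relu(sum(wi * xi for wi, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(wo * h for wo, h in zip(w2, hidden)) + b2
```

A real box-office classifier would stack such layers over the multivariate daily time series and train the weights by backpropagation.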

Development of Thermal Error Model with Minimum Number of Variables Using Fuzzy Logic Strategy

  • Lee, Jin-Hyeon;Lee, Jae-Ha;Yang, Seong-Han
    • Journal of Mechanical Science and Technology
    • /
    • v.15 no.11
    • /
    • pp.1482-1489
    • /
    • 2001
  • Thermally induced errors in machine tools have received significant attention recently because high-speed, precise machining is now the principal trend in manufacturing processes using CNC machine tools. Since the thermal error model is generally a function of temperature, the thermal error compensation system contains as many temperature sensors as there are temperature variables. Minimizing the number of variables in the thermal error model affects the economic efficiency of, and the risk of unexpected sensor faults in, an error compensation system. This paper presents a thermal error model with a minimum number of variables using a fuzzy logic strategy. Unlike numerical analysis techniques, the proposed method does not require any information about the characteristics of the plant, yet the developed thermal error model guarantees good prediction performance. The proposed modeling method can also be applied to any type of CNC machine tool once a combination of the possible input variables is determined, because the error model parameters are calculated mathematically based only on the number of temperature variables.
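A fuzzy-logic error model of this kind typically maps a temperature reading through membership functions to a weighted-average (Sugeno-style) error estimate. The sketch below shows that generic mechanism with hypothetical rules; it is not the paper's model:

```python
def tri_membership(x, a, b, c):
    """Triangular membership function: 0 outside [a, c], peaking at 1 when x == b."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_thermal_error(temp, rules):
    """Weighted-average defuzzification over rules of the form
    (a, b, c, error_estimate): each rule fires with the membership of
    `temp` in its triangle, and the outputs are blended by firing strength.
    Rule parameters here are placeholders, not identified from a machine."""
    fired = [(tri_membership(temp, a, b, c), err) for a, b, c, err in rules]
    total = sum(w for w, _ in fired)
    return sum(w * e for w, e in fired) / total if total else 0.0
```

Because each rule consumes one temperature variable, pruning rules directly reduces the number of sensors the compensation system needs, which is the economy the abstract emphasizes.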


CNN-LSTM Coupled Model for Prediction of Waterworks Operation Data

  • Cao, Kerang;Kim, Hangyung;Hwang, Chulhyun;Jung, Hoekyung
    • Journal of Information Processing Systems
    • /
    • v.14 no.6
    • /
    • pp.1508-1520
    • /
    • 2018
  • In this paper, we propose an improved model to provide users with better long-term prediction of waterworks operation data. Existing prediction models, such as multiple linear regression, have been studied while considering time-of-day, day-of-week, and seasonal characteristics, but their accuracy for demand fluctuations and long-term prediction is insufficient. Among deep learning models in particular, the long short-term memory (LSTM) model has been applied to predict water purification plant data because its time series predictions are highly reliable. However, it is necessary to reflect the correlations among various related factors, and a supplementary model is needed to improve long-term predictability. In this paper, a convolutional neural network (CNN) model is introduced to select input variables that have the necessary correlations and to improve the long-term prediction rate, increasing accuracy through a structure that combines the CNN output with the LSTM predictive value. In addition, a multiple linear regression model is applied to compile the predictions of the CNN and the LSTM, which then yields the final predicted outcome.
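The final compilation step described above — a multiple linear regression over the CNN and LSTM outputs — amounts to an ordinary least-squares fit with two predictors. A minimal self-contained sketch (generic OLS via normal equations, not the paper's implementation):

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= f * M[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x

def combine_predictions(cnn, lstm, y):
    """Fit y ~ w0 + w1*cnn + w2*lstm by OLS: build the normal equations
    X'X w = X'y over the two model outputs plus an intercept, then solve.
    This mirrors the described compilation of CNN and LSTM predictions."""
    X = [[1.0, c, l] for c, l in zip(cnn, lstm)]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
    return solve3(XtX, Xty)
```

The fitted weights tell you how much the final forecast trusts each component model; a large `w1` relative to `w2` would mean the CNN output dominates.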

Building battery deterioration prediction model using real field data (머신러닝 기법을 이용한 납축전지 열화 예측 모델 개발)

  • Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.243-264
    • /
    • 2018
  • Although the worldwide battery market is currently spurring the development of lithium secondary batteries, lead-acid batteries (rechargeable batteries), which perform well and can be reused, are consumed in a wide range of industrial fields. However, lead-acid batteries have a serious problem: once degradation begins in even one of the several cells packed in a battery, deterioration of the whole battery progresses quickly. To overcome this problem, previous research has attempted to identify the mechanism of battery deterioration in many ways. However, most previous studies analyzed this mechanism using data obtained in a laboratory rather than data obtained in the real world, although using real data can increase the feasibility and applicability of a study's findings. Therefore, this study aims to develop a model which predicts battery deterioration using data obtained in the real world. To this end, we collected data representing changes in battery state by attaching sensors that monitor battery condition in real time to dozens of golf carts operated on a real golf course, obtaining a total of 16,883 samples. We then developed a model which predicts a precursor phenomenon of battery deterioration by analyzing the sensor data using machine learning techniques.
As initial independent variables, we used 1) inbound time of a cart, 2) outbound time of a cart, 3) duration (from outbound time to charge time), 4) charge amount, 5) used amount, 6) charge efficiency, 7) lowest temperature of battery cells 1 to 6, 8) lowest voltage of battery cells 1 to 6, 9) highest voltage of battery cells 1 to 6, 10) voltage of battery cells 1 to 6 at the beginning of operation, 11) voltage of battery cells 1 to 6 at the end of charge, 12) used amount of battery cells 1 to 6 during operation, 13) used amount of the battery during operation (Max-Min), 14) duration of battery use, and 15) highest current during operation. Since the per-cell variables (lowest temperature, lowest voltage, highest voltage, voltage at the beginning of operation, voltage at the end of charge, and used amount during operation of cells 1 to 6) take similar values across cells, we conducted principal component analysis with varimax orthogonal rotation to mitigate the multicollinearity problem. Based on the results, we created new variables by averaging the values of the independent variables clustered together and used them as the final independent variables in place of the original ones, thereby reducing the dimensionality. We used decision tree, logistic regression, and Bayesian network algorithms to build prediction models, and also built models using bagging and boosting of each of them, as well as Random Forest. Experimental results show that the model using bagging of decision trees yields the best accuracy, 89.3923%. This study has some limitations in that additional variables which affect battery deterioration, such as weather (temperature, humidity) and driving habits, were not considered; we would like to consider them in future research.
However, the battery deterioration prediction model proposed in the present study is expected to enable effective and efficient management of batteries used in the field and to dramatically reduce the cost incurred by failing to detect battery deterioration.
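The dimension-reduction step described above — replacing each PCA-identified cluster of correlated per-cell variables with the average of its members — can be sketched directly. The variable names below are placeholders, not the study's actual column names:

```python
def composite_features(row, clusters):
    """Replace each cluster of correlated variables with its mean, as the
    study does after PCA with varimax rotation groups the per-cell variables.
    row: dict mapping variable name -> value for one sample;
    clusters: dict mapping a new composite name -> list of member names."""
    clustered = {name for members in clusters.values() for name in members}
    out = {k: v for k, v in row.items() if k not in clustered}
    for new_name, members in clusters.items():
        out[new_name] = sum(row[m] for m in members) / len(members)
    return out
```

For example, six per-cell lowest-voltage readings would collapse into one `lowest_v_mean` feature, shrinking the input dimension while keeping the shared signal.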

Development of Predictive Models for Rights Issues Using Financial Analysis Indices and Decision Tree Technique (경영분석지표와 의사결정나무기법을 이용한 유상증자 예측모형 개발)

  • Kim, Myeong-Kyun;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.4
    • /
    • pp.59-77
    • /
    • 2012
  • This study focuses on predicting which firms will increase capital by issuing new stocks in the near future. Many stakeholders, including banks, credit rating agencies, and investors, perform a variety of analyses of firms' growth, profitability, stability, activity, productivity, etc., and regularly report the firms' financial analysis indices. In this paper, we develop predictive models for rights issues using these financial analysis indices and data mining techniques. This study approaches the model building from the perspective of two different analyses. The first is the analysis period: we divide it into before and after the IMF financial crisis and examine whether there is a difference between the two periods. The second is the prediction horizon: in order to predict when firms will increase capital by issuing new stocks, the prediction time is categorized as one, two, or three years later. In total, therefore, six prediction models are developed and analyzed. We employ the decision tree technique to build the prediction models for rights issues. The decision tree is a widely used prediction method which builds trees to label or categorize cases into a set of known classes. In contrast to neural networks, logistic regression, and SVM, decision tree techniques are well suited for high-dimensional applications and have strong explanation capabilities. Among well-known decision tree induction algorithms such as CHAID, CART, QUEST, and C5.0, we use C5.0, the most recently developed of these algorithms, which yields better performance than the others. We obtained rights issue and financial analysis data from TS2000 of the Korea Listed Companies Association. A record of financial analysis data consists of 89 variables: 9 growth indices, 30 profitability indices, 23 stability indices, 6 activity indices, and 8 productivity indices.
For model building and testing, we used 10,925 financial analysis records from a total of 658 listed firms. PASW Modeler 13 was used to build C5.0 decision trees for the six prediction models. A total of 84 variables among the financial analysis data were selected as the input variables of each model, and the rights issue status (issued or not issued) was defined as the output variable. To develop the prediction models using the C5.0 node (Node Options: Output type = Rule set, Use boosting = false, Cross-validate = false, Mode = Simple, Favor = Generality), we used 60% of the data for model building and 40% for model testing. The experimental results show that the prediction accuracies for data after the IMF financial crisis (68.78% to 71.41%) are about 10 percentage points higher than those for data before the crisis (59.04% to 60.43%). These results indicate that since the IMF financial crisis, the reliability of financial analysis indices has increased and firms' intention to conduct rights issues has become more discernible. The experiments also show that stability-related indices have a major impact on conducting a rights issue in the case of short-term prediction, whereas long-term prediction of rights issues is affected by financial analysis indices on profitability, stability, activity, and productivity. All the prediction models include the industry code as a significant variable, meaning that companies in different industries show different patterns of rights issues. We conclude that it is desirable for stakeholders to take into account stability-related indices for short-term prediction and a wider range of financial analysis indices for long-term prediction. The current study has several limitations. First, the differences in accuracy should be compared using other data mining techniques such as neural networks, logistic regression, and SVM.
Second, new prediction models should be developed and evaluated that include variables which research on capital structure theory has identified as relevant to rights issues.
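The 60/40 holdout evaluation described in the experiment setup can be sketched as follows (a generic shuffle-and-split with an accuracy metric, not the PASW Modeler workflow itself):

```python
import random

def split_60_40(records, seed=0):
    """Shuffle and split records into 60% for model building and 40% for
    testing, mirroring the described experimental setup. The seed makes
    the split reproducible."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.6)
    return shuffled[:cut], shuffled[cut:]

def accuracy(y_true, y_pred):
    """Fraction of test cases where the predicted rights-issue status
    (issued / not issued) matches the actual status."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

The reported 59–71% figures are accuracies of this kind computed on the 40% test portion for each of the six models.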