• Title/Summary/Keyword: Data prediction


Pixel-level prediction of velocity vectors on hull surface based on convolutional neural network (합성곱 신경망 기반 선체 표면 유동 속도의 픽셀 수준 예측)

  • Jeongbeom Seo;Dayeon Kim;Inwon Lee
    • Journal of the Korean Society of Visualization / v.21 no.1 / pp.18-25 / 2023
  • High-dimensional data prediction based on neural networks now shows compelling results in many fields, including engineering. In particular, many variants of the convolutional neural network are widely used to develop pixel-level prediction models for high-dimensional data such as images or physical field values from sensors. In this study, the velocity vector field of ideal flow on a ship's hull surface is estimated at the pixel level by U-Net. First, potential flow analysis was conducted for a set of hull forms generated by a hull form transformation method. Thereafter, four different neural networks with a U-shaped structure were configured and trained on the velocity vectors at the node positions of the pre-processed hull form data. For the test hull forms, the network with short skip connections gave the most accurate predictions of streamlines and velocity magnitude, and the results agreed well with the potential flow analysis. However, in some cases that have nothing in common with the training data in terms of speed or shape, the network shows relatively high error in regions of large curvature.

Bankruptcy Prediction Modeling Using Qualitative Information Based on Big Data Analytics (빅데이터 기반의 정성 정보를 활용한 부도 예측 모형 구축)

  • Jo, Nam-ok;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.33-56 / 2016
  • Many researchers have focused on developing bankruptcy prediction models using statistical methods, such as multiple discriminant analysis (MDA) and logit analysis, or artificial intelligence techniques, such as artificial neural networks (ANN), decision trees, and support vector machines (SVM), to improve performance. Most bankruptcy prediction models in academic studies have used financial ratios as the main input variables. The bankruptcy of a firm is associated with both its financial state and the external economic situation. However, the inclusion of qualitative information, such as the economic atmosphere, has not been actively discussed, even though relying only on financial ratios has drawbacks. Accounting information, such as financial ratios, is based on past data, and it is usually determined one year before bankruptcy. Thus, a time lag exists between the closing of financial statements and the point of credit evaluation. In addition, financial ratios do not contain environmental factors, such as external economic situations. Therefore, financial ratios alone may be insufficient for constructing a bankruptcy prediction model, because they essentially reflect past internal accounting information while neglecting recent information. Qualitative information must therefore be added to the conventional bankruptcy prediction model to supplement accounting information. Because an analytic mechanism for obtaining and processing qualitative information from various sources has been lacking, previous studies have made only limited use of qualitative information. Recently, however, big data analytics, such as text mining techniques, has drawn much attention in academia and industry as the amount of unstructured text data available on the web increases. A few previous studies have sought to adopt big data analytics in business prediction modeling.
Nevertheless, the use of qualitative information on the web for business prediction modeling is still deemed to be in the primary stage, restricted to limited applications, such as stock prediction and movie revenue prediction applications. Thus, it is necessary to apply big data analytics techniques, such as text mining, to various business prediction problems, including credit risk evaluation. Analytic methods are required for processing qualitative information represented in unstructured text form due to the complexity of managing and processing unstructured text data. This study proposes a bankruptcy prediction model for Korean small- and medium-sized construction firms using both quantitative information, such as financial ratios, and qualitative information acquired from economic news articles. The performance of the proposed method depends on how well information types are transformed from qualitative into quantitative information that is suitable for incorporating into the bankruptcy prediction model. We employ big data analytics techniques, especially text mining, as a mechanism for processing qualitative information. The sentiment index is provided at the industry level by extracting from a large amount of text data to quantify the external economic atmosphere represented in the media. The proposed method involves keyword-based sentiment analysis using a domain-specific sentiment lexicon to extract sentiment from economic news articles. The generated sentiment lexicon is designed to represent sentiment for the construction business by considering the relationship between the occurring term and the actual situation with respect to the economic condition of the industry rather than the inherent semantics of the term. The experimental results proved that incorporating qualitative information based on big data analytics into the traditional bankruptcy prediction model based on accounting information is effective for enhancing the predictive performance. 
The sentiment variable extracted from economic news articles had an impact on corporate bankruptcy. In particular, a negative sentiment variable improved the accuracy of corporate bankruptcy prediction because the corporate bankruptcy of construction firms is sensitive to poor economic conditions. The bankruptcy prediction model using qualitative information based on big data analytics contributes to the field, in that it reflects not only relatively recent information but also environmental factors, such as external economic conditions.
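The keyword-based sentiment index described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the lexicon entries, the `sentiment_index` function, and the sample articles are all hypothetical, and the index here is simply the share of positive and negative lexicon hits.

```python
from collections import Counter

# Hypothetical domain lexicon: term -> polarity (+1 positive, -1 negative).
# The entries are illustrative only and are not taken from the paper.
LEXICON = {"recovery": 1, "growth": 1, "default": -1, "slump": -1, "unsold": -1}


def sentiment_index(articles):
    """Return (positive_share, negative_share) of lexicon hits across articles."""
    counts = Counter()
    for text in articles:
        for token in text.lower().split():
            # Strip surrounding punctuation before the lexicon lookup.
            polarity = LEXICON.get(token.strip(".,;:!?\"'()"))
            if polarity is not None:
                counts[polarity] += 1
    total = counts[1] + counts[-1]
    if total == 0:
        return 0.0, 0.0
    return counts[1] / total, counts[-1] / total


# Made-up sample articles for demonstration.
articles = [
    "Housing slump deepens as unsold units pile up.",
    "Builders expect growth after policy change.",
]
pos_share, neg_share = sentiment_index(articles)
```

The two shares could then be appended to the financial ratios as extra input variables of a bankruptcy classifier; how the paper actually combines them is not stated in the abstract.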

Effect of Heterogeneous Variance by Sex and Genotypes by Sex Interaction on EBVs of Postweaning Daily Gain of Angus Calves

  • Oikawa, T.;Hammond, K.;Tier, B.
    • Asian-Australasian Journal of Animal Sciences / v.12 no.6 / pp.850-853 / 1999
  • Angus postweaning daily gain (PWDG) was analyzed to investigate the effects of heterogeneous variance and genotype-by-sex interaction on the prediction of EBVs, using data sets of various environmental levels. The whole data set (16,239 records) was divided into six data sets according to averages of the best linear unbiased estimator (BLUE) of herd environment. Comparison of the prediction models showed that a single-trait model is adequate for most of the data sets, except for the poor-environment data set for both bulls and heifers, where heterogeneity of variance and genotype-by-sex interaction exist. In prediction with the low-environment data set, the bulls' EBVs from single-trait models had high product-moment correlations with the bulls' male EBVs from the multitrait model, whereas the heifers' EBVs had moderate correlations with female EBVs from the multitrait model. This moderate correlation seems to result from the heterogeneity of variance and the low heritability of the heifers' PWDG. The prediction models with heterogeneity of variance had little effect on the prediction of EBVs for the data sets with moderate to high genetic correlations.

Efficient Training Data Construction Scheme for Prediction of Transferring Students

  • Lee, Ji-Young;Song, Gyu-Moon;Kim, Tae-Yoon
    • Journal of the Korean Data and Information Science Society / v.14 no.3 / pp.481-488 / 2003
  • Kim et al. (2003) studied a prediction model for students likely to transfer. In their study they claim that one training data construction scheme, which trains the neural network on data from the year immediately before the prediction year, is better than other schemes. One problem with their claim is that it rests on a rather high prediction error rate. In this paper we establish a sounder comparison of various training data construction schemes and check the validity of their claim. It turns out that the favored scheme has sufficient advantages over the other schemes.


A Study on Data Availability Improvement using Mobility Prediction Technique with Location Information (위치 정보와 이동 예측 기법을 이용한 데이터 가용성 향상에 관한 연구)

  • Yang, Hwan Seok
    • Journal of Korea Society of Digital Industry and Information Management / v.8 no.4 / pp.143-149 / 2012
  • A MANET is a network that is very useful for building a network environment where network infrastructure is difficult to deploy. However, the nodes that make up a MANET have difficulty retrieving data because of their limited resources and mobility. A caching scheme is therefore required to improve the accessibility and availability of frequently accessed data. In this paper, we propose a technique that uses mobility prediction of nodes to retrieve desired information quickly and improve data availability. Mobility prediction of nodes is performed through distance calculation using location information. To maintain data consistency and reduce query latency, the cluster head manages a global cluster table and a local member table. We compared the proposed scheme with COCA and CacheData, and the experiments confirmed its performance and efficiency.
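Distance-based mobility prediction from location information might look like the sketch below. The abstract does not give the prediction rule, so the linear extrapolation of the last two positions, the function names, and the `radio_range` parameter are all assumptions made for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two (x, y) positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def predict_leaving(prev_pos, curr_pos, head_pos, radio_range):
    """Predict whether a node will be out of the cluster head's range next step.

    Assumption: the node keeps its last displacement (linear extrapolation).
    """
    next_pos = (2 * curr_pos[0] - prev_pos[0], 2 * curr_pos[1] - prev_pos[1])
    return distance(next_pos, head_pos) > radio_range


# A node moving away from a cluster head at the origin (hypothetical values).
moving_away = predict_leaving((0.0, 0.0), (60.0, 0.0), (0.0, 0.0), 100.0)
# A node moving back toward the cluster head.
moving_closer = predict_leaving((80.0, 0.0), (60.0, 0.0), (0.0, 0.0), 100.0)
```

A cluster head could use such a prediction to decide whether cached data held by a departing node should be replicated elsewhere before the node leaves.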

Basic Study on Safety Accident Prediction Model Using Random Forest in Construction Field (랜덤 포레스트 기법을 이용한 건설현장 안전재해 예측 모형 기초 연구)

  • Kang, Kyung-Su;Ryu, Han-Guk
    • Proceedings of the Korean Institute of Building Construction Conference / 2018.11a / pp.59-60 / 2018
  • The purpose of this study is to predict and classify accident types based on KOSHA (Korea Occupational Safety & Health Agency) data and weather data. We also attempt to suggest key management measures for each accident type by deriving feature importance. We designed two models: model (a), based on accident data and weather data, and model (b), based on weather data only. With the random forest method, model (b) lacked prediction accuracy, whereas model (a) produced more accurate predictions. We therefore presented a safety management plan based on the results. In future work, we will supplement real-time accident and weather data to predict accident types in real time and prevent safety accidents.


Optimizing Artificial Neural Network-Based Models to Predict Rice Blast Epidemics in Korea

  • Lee, Kyung-Tae;Han, Juhyeong;Kim, Kwang-Hyung
    • The Plant Pathology Journal / v.38 no.4 / pp.395-402 / 2022
  • To predict rice blast, many machine learning methods have been proposed. As the quality and quantity of input data are essential for machine learning techniques, this study develops three artificial neural network (ANN)-based rice blast prediction models by combining two ANN models, the feed-forward neural network (FFNN) and long short-term memory, with diverse input datasets, and compares their performance. The Blast_Weather_FFNN model had the highest recall score (66.3%) for rice blast prediction. This model requires two types of input data: blast occurrence data for the last 3 years and weather data (daily maximum temperature, relative humidity, and precipitation) between January and July of the prediction year. This study showed that the performance of an ANN-based disease prediction model was improved by applying suitable machine learning techniques together with the optimization of hyperparameter tuning involving input data. Moreover, we highlight the importance of the systematic collection of long-term disease data.
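For readers unfamiliar with the recall score reported above, it is the fraction of actual positive cases (here, blast occurrences) that the model predicts correctly. A minimal sketch with made-up labels, not the study's data:

```python
def recall(y_true, y_pred):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels: 1 = blast occurred, 0 = no blast.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0]
score = recall(y_true, y_pred)  # 2 of the 3 actual occurrences are detected
```

Recall is a natural headline metric for disease forecasting because a missed epidemic (a false negative) is usually costlier than a false alarm.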

A Real-Time Data Mining for Stream Data Sets (연속발생 데이터를 위한 실시간 데이터 마이닝 기법)

  • Kim Jinhwa;Min Jin Young
    • Journal of the Korean Operations Research and Management Science Society / v.29 no.4 / pp.41-60 / 2004
  • Stream data is a data set that accumulates continuously in data storage from a data source over time. The size of this data set, in many cases, becomes increasingly large over time. Mining information from this massive data takes considerable resources such as storage, memory, and time. These characteristics make it difficult and expensive to use the large volume of data accumulated over time. On the other hand, if only recent data or part of the whole data is used to mine information or patterns, potentially useful information can be lost. To avoid this problem, we suggest a method that efficiently accumulates information, in the form of rule sets, over time. It takes much less storage than traditional mining methods. The accumulated rule sets are later used as prediction models. According to the theory of ensemble approaches, a combination of many prediction models, here in the form of systematically merged rule sets, performs better than a single prediction model. This study uses a customer data set to predict the buying power of customers from their information. The performance of the suggested method is tested on this data set along with general prediction methods, and their performances are compared.
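The idea of keeping compact rule sets instead of the raw stream, then combining them as an ensemble, can be sketched as below. This is an assumption-laden toy: the single-feature threshold rule, the majority vote, and the sample chunks are hypothetical and stand in for whatever rule induction and merging the paper actually uses.

```python
def learn_rule(chunk):
    """Learn a one-feature threshold from a chunk of (income, is_buyer) pairs.

    Assumes each chunk contains at least one example of each class.
    """
    pos = [x for x, y in chunk if y == 1]
    neg = [x for x, y in chunk if y == 0]
    # Threshold halfway between the two class means.
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(rules, x):
    """Majority vote over all accumulated threshold rules."""
    votes = sum(1 for threshold in rules if x > threshold)
    return 1 if votes * 2 > len(rules) else 0


# Hypothetical stream arriving in three chunks.
stream = [
    [(20, 0), (30, 0), (70, 1), (80, 1)],
    [(25, 0), (35, 0), (60, 1), (90, 1)],
    [(10, 0), (40, 0), (65, 1), (75, 1)],
]
rules = []
for chunk in stream:
    rules.append(learn_rule(chunk))  # only the rule is kept, not the raw chunk
```

Storage grows with the number of rules rather than the number of records, which is the saving the abstract points to.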

Gradient Descent Training Method for Optimizing Data Prediction Models (데이터 예측 모델 최적화를 위한 경사하강법 교육 방법)

  • Hur, Kyeong
    • Journal of Practical Engineering Education / v.14 no.2 / pp.305-312 / 2022
  • In this paper, we focus on education in creating and optimizing a basic data prediction model, and we propose a method for teaching gradient descent, a machine learning technique widely used to optimize data prediction models. The method visually shows the entire operation of gradient descent as it optimizes the parameter values of a data prediction model by applying differentiation, and it teaches the effective use of mathematical differentiation in machine learning. To explain the entire operation of gradient descent visually, we implement gradient descent software in a spreadsheet. First, a two-variable gradient descent training method is presented, and the accuracy of the two-variable data prediction model is verified by comparison with the least squares method. Second, a three-variable gradient descent training method is presented and the accuracy of a three-variable data prediction model is verified. Finally, directions for gradient descent optimization practice are presented, and the educational effect of the proposed method is analyzed through non-majors' reported satisfaction with the education.
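The comparison between gradient descent and least squares can be reproduced in a few lines. The sketch below assumes that the "two-variable" model is a line with two parameters, a slope `w` and an intercept `b`, fitted by minimizing mean squared error. The data, learning rate, and step count are illustrative choices, and the paper itself uses a spreadsheet rather than Python.

```python
def fit_gradient_descent(xs, ys, lr=0.01, steps=20000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def fit_least_squares(xs, ys):
    """Closed-form least squares solution for the same line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return w, my - w * mx


# Made-up data roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
w_gd, b_gd = fit_gradient_descent(xs, ys)
w_ls, b_ls = fit_least_squares(xs, ys)
```

With a small enough learning rate the two fits agree closely, which is the kind of check the abstract describes for verifying the gradient descent result.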

Prediction Model of Real Estate Transaction Price with the LSTM Model based on AI and Bigdata

  • Lee, Jeong-hyun;Kim, Hoo-bin;Shim, Gyo-eon
    • International Journal of Advanced Culture Technology / v.10 no.1 / pp.274-283 / 2022
  • Korea is facing a number of difficulties arising from rising housing prices. As housing takes the lion's share of personal assets, many difficulties are expected to arise from fluctuating housing prices. The purpose of this study is to create a housing price prediction model to prevent such risks and encourage reasonable real estate purchases. This study made several attempts to understand real estate instability and to create an appropriate housing price prediction model. It predicted and validated housing prices using LSTM, a deep learning technique. LSTM is a network that adds a cell state, which plays the role of a conveyor belt, to the hidden state of the existing RNN, and recursively computes both the cell state and the hidden state. The real sale prices of apartments in autonomous districts from January 2006 to December 2019 were collected through the Ministry of Land, Infrastructure, and Transport's real sale price open system, and basic apartment and commercial district information was collected through the Public Data Portal and the Seoul Metropolitan City Data. The collected real sale price data were scaled based on the monthly average sale price, and a total of 168 data points were organized by preprocessing the data by address. To predict prices, the LSTM was implemented with a training period of 29 months (April 2015 to August 2017), a validation period of 13 months (September 2017 to September 2018), and a test period of 13 months (December 2018 to December 2019), following the time series data set. As a result, this study obtained a prediction similarity of 76 percent. The final prediction model of real estate transaction prices was built from the collected time series data, which showed that a model with 76 percent similarity can be achieved. This supports the reliability of predicting the rate of return with the LSTM method.
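The recursive update of cell state and hidden state mentioned in this abstract can be illustrated with a scalar LSTM cell. This sketch uses the standard LSTM gate equations, not the authors' code; the `params` values and the input sequence are arbitrary placeholders rather than trained weights or real prices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps gate name -> (w_x, w_h, bias)."""

    def pre(gate):
        w_x, w_h, b = p[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre("input"))        # how much of the candidate to write
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate value for the cell state
    c = f * c_prev + i * g           # cell state: the "conveyor belt"
    h = o * math.tanh(c)             # hidden state passed to the next step
    return h, c


# Arbitrary placeholder weights shared by all gates (assumption).
params = {
    name: (0.5, 0.1, 0.0) for name in ("forget", "input", "output", "candidate")
}
h, c = 0.0, 0.0
for scaled_price in [0.2, 0.4, 0.1]:  # hypothetical scaled monthly prices
    h, c = lstm_step(scaled_price, h, c, params)
```

Each step feeds the previous `h` and `c` back in, so information from earlier months can persist in `c` as long as the forget gate stays near 1.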