• Title/Abstract/Keywords: Prediction of variables

Search results: 1,845 items (processing time 0.029 seconds)

Development and Comparison of Data Mining-based Prediction Models of Building Fire Probability

  • 홍성관;정승렬
    • 인터넷정보학회논문지
    • /
    • Vol. 19, No. 6
    • /
    • pp.101-112
    • /
    • 2018
  • Considerable manpower and budget are spent on fire prevention, yet only a small portion of the data generated in the process is used for disaster prevention activities. This study develops a data mining-based prediction model of fire occurrence probability so that these data can be used more actively for disaster prevention. To this end, variables for predicting the fire occurrence probability of various buildings were selected, and data from the construction administrative system, the national fire information system, and the Korea Fire Insurance Association were collected and integrated into a single data set. After appropriate data cleansing and preprocessing, data mining methods including artificial neural networks, decision trees, SVM, and naive Bayes were used to develop prediction models of the fire occurrence probability of buildings. The most accurate of the derived models was the linear SVM model, with an accuracy of 68.42% on the experimental data and 63.54% on the verification data, making it the best model for predicting the fire occurrence probability of buildings. Because this study developed the prediction models using only setting values within specific ranges, future studies may explore more opportunities to use setting values not covered here.

Water Demand Forecasting by Characteristics of City Using Principal Component and Cluster Analyses

  • Choi, Tae-Ho;Kwon, O-Eun;Koo, Ja-Yong
    • Environmental Engineering Research
    • /
    • Vol. 15, No. 3
    • /
    • pp.135-140
    • /
    • 2010
  • Because urban characteristics vary from city to city, existing water demand prediction based on average liters per capita per day cannot be accurate, as it fails to consider several variables. This study therefore considered social and industrial factors of 164 local cities, in addition to population and other directly influential factors, and used principal component and cluster analyses to develop a more efficient water demand prediction model that reflects the unique local characteristics of each city. After clustering, multiple regression models were developed: the $R^2$ value of the inclusive multiple regression model was 0.59, whereas those of Clusters A and B were 0.62 and 0.74, respectively. Thus, the cluster-specific multiple regression models were considered more reasonable and valid than the inclusive multiple regression model. In summary, the water demand prediction model that uses principal component and cluster analyses as the standards for classifying localities has a better coefficient of determination than the inclusive multiple regression model, which does not consider localities.
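
The comparison between one inclusive regression and separate per-cluster regressions can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single predictor instead of multiple regression, the cluster labels are taken as given rather than derived from principal component and cluster analyses, and all data values and names (`cluster_a`, `cluster_b`) are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(xs, ys):
    """Coefficient of determination of the fitted line on its own data."""
    a, b = fit_line(xs, ys)
    my = mean(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: (predictor, water demand) for two city clusters
# whose demand responds differently to the same predictor.
cluster_a = ([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
cluster_b = ([1, 2, 3, 4, 5], [9.0, 9.6, 9.9, 10.6, 11.0])

# Inclusive model: one regression over all cities.
r2_pooled = r_squared(cluster_a[0] + cluster_b[0], cluster_a[1] + cluster_b[1])
# Cluster models: one regression per cluster.
r2_a = r_squared(*cluster_a)
r2_b = r_squared(*cluster_b)

print(f"pooled R^2={r2_pooled:.3f}, A R^2={r2_a:.3f}, B R^2={r2_b:.3f}")
```

With data like these, the pooled fit has to compromise between two different slopes, so its $R^2$ is lower than that of either per-cluster fit.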

Forecasting common mackerel auction price by artificial neural network in Busan Cooperative Fish Market before introducing TAC system in Korea

  • 황강석;최정화;오택윤
    • 수산해양기술연구
    • /
    • Vol. 48, No. 1
    • /
    • pp.72-81
    • /
    • 2012
  • Using an artificial neural network (ANN) technique, auction prices for common mackerel were forecast from daily total sale and auction price data at the Busan Cooperative Fish Market before the introduction of the Total Allowable Catch (TAC) system, when catches in Korea were not limited. Virtual input data generated from the actual data were used to improve prediction accuracy, and a suitable neural network was derived for the prediction. We tested 35 networks, retained 10, and found a well-performing network with a regression ratio of 0.904 and a determination coefficient of 0.695. There were significant variations between the training and verification errors in this network. Ideally, more training cases would be needed to avoid over-learning, which would improve performance and make the results more reliable. The precision of prediction improved when environmental factors, including physical and biological variables, were added. This network for predicting price and catch was considered applicable to other fish species.

Utilization of Electrical Conductivity to Improve Prediction Accuracy of Cooking Loss of Pork Loin

  • Kyung Jo;Seonmin Lee;Hyun Gyung Jeong;Dae-Hyun Lee;Sangwon Yoon;Yoonji Chung;Samooel Jung
    • 한국축산식품학회지
    • /
    • Vol. 43, No. 1
    • /
    • pp.113-123
    • /
    • 2023
  • This study investigated the predictability of the cooking loss of pork loin from quality properties that can be measured relatively easily and quickly. The pH, color, moisture, protein content, and cooking loss of 100 pork loins were measured. The explanatory variables included in all linear regression models with an adjusted $R^2$ value of ≥0.5 were pH and protein content. In the linear regression models predicting cooking loss, the highest adjusted $R^2$ value was 0.7, with pH, CIE L*, CIE b*, moisture, and protein content as the explanatory variables. In 30 pork loins, electrical conductivity was additionally measured; in the linear regression analysis for predicting cooking loss, the highest adjusted $R^2$ value was 0.646, with electrical conductivity measured at 40 Hz, pH, and color as the explanatory variables. Ordinal logistic regression analysis was performed to predict the three grades (low, middle, and high) of loin cooking loss using pH, color, and 40 Hz electrical conductivity as the explanatory variables, and the percent concordance was 93.8%. In conclusion, adding electrical conductivity as an explanatory variable did not increase the prediction accuracy of the linear regression model for cooking loss; however, the study demonstrated that the cooking loss grade of pork loin can be predicted and classified from quality properties that can be measured quickly and easily.
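
The adjusted coefficient of determination reported above penalizes plain $R^2$ for the number of explanatory variables relative to the sample size. A minimal sketch of the standard formula follows. The sample sizes (100 and 30) and the count of five explanatory variables come from the abstract, but the unadjusted $R^2$ of 0.72 is a hypothetical value chosen only for illustration.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors + 1")
    return 1 - (1 - r2) * (n_samples - 1) / dof


# Hypothetical unadjusted R^2; the same value is penalized more
# heavily when only 30 samples support five explanatory variables.
raw_r2 = 0.72
adj_100 = adjusted_r2(raw_r2, n_samples=100, n_predictors=5)
adj_30 = adjusted_r2(raw_r2, n_samples=30, n_predictors=5)

print(f"n=100: {adj_100:.3f}, n=30: {adj_30:.3f}")
```

The penalty grows as the sample shrinks, which is one reason adjusted values from the 100-loin and 30-loin analyses are not directly comparable on raw fit alone.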

Model for Unplanned Self Extubation of ICU Patients Using System Dynamics Approach

  • 송유길;윤은경
    • 대한간호학회지
    • /
    • Vol. 45, No. 2
    • /
    • pp.280-292
    • /
    • 2015
  • Purpose: In this study, a system dynamics methodology was used to identify the correlations and nonlinear feedback structure among factors affecting unplanned extubation (UE) of ICU patients and to construct and verify a simulation model. Methods: Factors affecting UE were identified from a theoretical background established by reviewing the literature and preceding studies and by referencing various statistical data. Related variables were decided through verification of content validity by an expert group. A causal loop diagram (CLD) was drawn based on the variables. Stock & Flow modeling using Vensim PLE Plus Version 6.0b was performed to establish a model for UE. Results: Based on the literature review and expert verification, 18 variables associated with UE were identified and a CLD was prepared. From the prepared CLD, a model was developed by converting it to a Stock & Flow Diagram. Results of the simulation showed that patient stress, patient agitation, restraint application, patient movability, and individual intensive nursing were the variables with the greatest effect on UE probability. To verify agreement of the UE model with real situations, a simulation with 5 cases was performed. An equation check and a sensitivity analysis on TIME STEP were executed to validate model integrity. Conclusion: The results show that identification of a proper model enables prediction of UE probability. This prediction allows for adjustment of related factors and provides basic data to develop nursing interventions to decrease UE.
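
A sensitivity check on TIME STEP can be sketched on a toy stock-and-flow structure. This is not the authors' Vensim model: it assumes a single stock with a constant inflow and a proportional outflow, integrated with the Euler method, and every name and parameter value (`simulate_stock`, `inflow`, `outflow_rate`) is hypothetical.

```python
import math


def simulate_stock(initial, inflow, outflow_rate, time_step, end_time):
    """Euler integration of d(stock)/dt = inflow - outflow_rate * stock."""
    stock = initial
    steps = round(end_time / time_step)
    for _ in range(steps):
        net_flow = inflow - outflow_rate * stock
        stock += net_flow * time_step
    return stock


# Hypothetical parameters for a single stock.
initial, inflow, outflow_rate, end_time = 10.0, 2.0, 0.5, 10.0

# Same model run with a coarse and a fine TIME STEP.
coarse = simulate_stock(initial, inflow, outflow_rate, 1.0, end_time)
fine = simulate_stock(initial, inflow, outflow_rate, 0.125, end_time)

# Closed-form solution of this linear model, used as the reference.
equilibrium = inflow / outflow_rate
exact = equilibrium + (initial - equilibrium) * math.exp(-outflow_rate * end_time)

print(f"coarse={coarse:.4f}, fine={fine:.4f}, exact={exact:.4f}")
```

If results change noticeably when the step is reduced, the larger step is too coarse for the model's dynamics; here the finer step lands closer to the closed-form value.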

Study on the Prediction Model for Employment of University Graduates Using Machine Learning Classification

  • 이동훈;김태형
    • 한국정보시스템학회지:정보시스템연구
    • /
    • Vol. 29, No. 2
    • /
    • pp.287-306
    • /
    • 2020
  • Purpose: Youth unemployment is a social problem that continues to emerge in Korea. In this study, we create models that predict the employment of college graduates using decision tree, random forest, and artificial neural network techniques from machine learning, and compare the performance of the models through their prediction results. Design/methodology/approach: Data processing was performed first, including acquisition of the college graduates' vocational path survey data, selection of independent variables, and setting up of dependent variables. We used R to create decision tree, random forest, and artificial neural network models and predicted whether college graduates were employed with each model. Finally, the performance of each model was compared and evaluated. Findings: The results showed that the random forest model had the highest performance, and the artificial neural network model differed only narrowly in performance from the decision tree model. In the decision tree model, the key nodes selected were whether graduates receive economic support from their families, major affiliation, the route of obtaining job information at universities, the importance of working income when choosing jobs, and the location of the graduating university. In the random forest model, the variables identified as important were whether graduates receive economic support from their families, major, the route of obtaining job information, the degree of irritable feelings over a month, and the location of the graduating university.

The Individual Variables, Family and School Environmental Variables That Affect Victimization by Peer Aggression among Adolescents

  • 이영선;이경님
    • 한국생활과학회지
    • /
    • Vol. 13, No. 5
    • /
    • pp.659-672
    • /
    • 2004
  • This study examines individual, family, and school environmental variables that affect victimization by peer aggression among adolescents. The sample consists of 868 seventh and eighth graders. Statistics and methods for data analysis include Cronbach's alpha, percentages, means, standard deviations, Pearson correlation, multiple regression, and hierarchical regression. The major findings are as follows. First, both withdrawn and aggressive adolescents have lower achievement in school work. Boys experience more direct victimization by peer aggression. Adolescents, especially boys, often experience indirect victimization by peer aggression when they become withdrawn, have lower self-esteem, and have lower achievement in school work. Second, adolescents experience more direct victimization by peer aggression when their parents neglect them. Adolescents also appear to be exposed to indirect victimization by peer aggression when they receive more physical and emotional abuse and neglect from their parents. Third, adolescents experience more victimization by peer aggression, whether direct or indirect, when they cannot adjust to peer relations or obtain teachers' supervision. Fourth, for direct victimization by peer aggression, withdrawal, one of the individual variables, is the most reliable predictor, followed in order by gender, neglect, adaptability in peer relations, aggression, and teachers' supervision. For indirect victimization by peer aggression, withdrawal is the most reliable predictor, followed in order by adaptability in peer relations, gender, physical and emotional abuse, and neglect.

Identifying Factors for Corn Yield Prediction Models and Evaluating Model Selection Methods

  • Chang Jiyul;Clay David E.
    • 한국작물학회지
    • /
    • Vol. 50, No. 4
    • /
    • pp.268-275
    • /
    • 2005
  • Early predictions of crop yields can provide information that helps producers take advantage of market opportunities, assess national food security, and give early warning of food shortages. The objectives of this study were to identify the most useful parameters for estimating yields and to compare two model selection methods for finding the 'best' model developed by multiple linear regression. This research was conducted in two 65 ha corn/soybean rotation fields located in east central South Dakota. Data used to develop the models were small temporal variability information (STVI: elevation, apparent electrical conductivity $(EC_a)$, slope), large temporal variability information (LTVI: inorganic N, Olsen P, soil moisture), and remote sensing information (green, red, and NIR bands, normalized difference vegetation index (NDVI), and green normalized difference vegetation index (GDVI)). Second-order Akaike's Information Criterion (AICc) and stepwise multiple regression were used to develop the best-fitting equations in each system (information group). Models with $\Delta_i\leq2$ were selected, giving 22 and 37 models at Moody and Brookings, respectively. The most useful variables for estimating corn yield differed between fields. Elevation and $EC_a$ were consistently the most useful variables in both fields and in most of the systems. Model selection differed between fields, and different numbers of variables were selected in different fields. These results might be attributed to the different landscapes and management histories of the study fields. The most common variables selected by AICc and stepwise regression were different. In validation, stepwise regression was slightly better than AICc at Moody, and AICc was slightly better than stepwise regression at Brookings. The results suggest that the AICc approach can be used to identify the most useful information and select the 'best' yield models for production fields.
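
The $\Delta_i\leq2$ rule can be sketched as follows. The sketch assumes the common least-squares form of AICc, $n\ln(RSS/n) + 2k + 2k(k+1)/(n-k-1)$, where $k$ counts all estimated parameters. The abstract does not state which form the authors used. The candidate models, residual sums of squares, and sample size are hypothetical.

```python
import math


def aicc(rss, n, k):
    """Second-order AIC for a least-squares fit.

    rss: residual sum of squares, n: sample size,
    k: number of estimated parameters (assumed to include the
    intercept and the error variance).
    """
    if n - k - 1 <= 0:
        raise ValueError("sample size too small for this many parameters")
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)


def select_models(candidates, n, threshold=2.0):
    """Return {name: delta_i} for models with delta_i <= threshold."""
    scores = {name: aicc(rss, n, k) for name, (rss, k) in candidates.items()}
    best = min(scores.values())
    return {name: s - best for name, s in scores.items() if s - best <= threshold}


# Hypothetical candidates: name -> (residual sum of squares, parameter count)
candidates = {
    "elevation": (120.0, 3),
    "elevation + EC_a": (100.0, 4),
    "elevation + EC_a + slope": (98.0, 5),
    "NDVI": (150.0, 3),
}

selected = select_models(candidates, n=50)
for name, delta in sorted(selected.items(), key=lambda item: item[1]):
    print(f"{name}: delta = {delta:.2f}")
```

Keeping every model within two AICc units of the best one, rather than only the minimum, is what yields a set of plausible models, such as the 22 and 37 models reported for the two fields.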

Relationship between Stream Geomorphological Factors and the Vegetation Abundance - With a Special Reference to the Han River System -

  • 이광우;김태균;심우경
    • 한국조경학회지
    • /
    • Vol. 30, No. 3
    • /
    • pp.73-85
    • /
    • 2002
  • The purpose of this study was to develop prediction models of plant species abundance for stream restoration. Stream plants are generally affected by stream geomorphology, so the relationship between vegetation abundance and stream geomorphology was modeled by multiple regression analysis. The stream characteristics used were longitudinal slope, transectional slope, micro-landforms along the longitudinal direction, riparian width, geometric mean diameter and largest diameter of bed material, and cumulative weight proportions of coarse and fine sand. The Pyungchang River, with a mountainous watershed, and the Kyungan and Bokha streams, in an agricultural region, were selected, and vegetation species abundance and stream characteristics were documented at 2~3 km intervals from the upper to the lower reaches. The models for predicting vegetation abundance were developed by multiple regression analysis using the SPSS statistics package. The linear relationship between the dependent (species abundance) and independent (stream characteristics) variables was tested by a graphical method. Longitudinal and transectional slope had a nonlinear relationship with species abundance. In the next step, independence among the independent variables was tested, and the correlation between the independent and dependent variables was tested by the Pearson bivariate correlation test. The selected independent variables were transectional slope, riparian width, and cumulative fine sand weight proportion. From the multiple regression analysis, the $R^2$ values for the Pyungchang River, Kyungan stream, and Bokha stream were 0.651, 0.512, and 0.240, respectively. The natural stream configuration of the Pyungchang River gave the best result, and the lower $R^2$ values for the Kyungan and Bokha streams were due to human impact, which disturbed the natural ecosystem. The lowest $R^2$, for the Bokha stream, was due to the shifting sandy bed. If the stream bed is unstable, the prediction model may not be valid. Using the multiple regression models, vegetation abundance after stream restoration could be predicted from stream characteristics such as transectional slope, riparian width, and cumulative fine sand weight proportion.

Input Variable Decision of the Predictive Model for the Optimal Starting Moment of the Cooling System in Accommodations

  • 백용규;윤연주;문진우
    • KIEAE Journal
    • /
    • Vol. 15, No. 4
    • /
    • pp.105-110
    • /
    • 2015
  • Purpose: This study aimed to find the optimal input variables of an artificial neural network (ANN)-based predictive model for optimal control of the indoor temperature environment. By applying the optimal input variables to the predictive model, the time required to restore the indoor temperature from the setback period to the normal setpoint temperature can be calculated more precisely for the cooling season. Precise prediction results will support advanced operation of the cooling system so that the indoor temperature is conditioned comfortably and in a more energy-efficient manner. Method: Two major steps employing numerical computer simulation were conducted to develop an ANN model and find the optimal input variables. In the first step, the initial ANN model was intuitively determined to have input neurons that seemed to be related to the output neuron. The second step was conducted to find the statistical relationship between the initial input variables and the output variable. Result: Based on the statistical analysis, the optimal input variables were determined.