• Title/Summary/Keyword: cost prediction

Preliminary Inspection Prediction Model to select the on-Site Inspected Foreign Food Facility using Multiple Correspondence Analysis (차원축소를 활용한 해외제조업체 대상 사전점검 예측 모형에 관한 연구)

  • Hae Jin Park;Jae Suk Choi;Sang Goo Cho
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.121-142 / 2023
  • As the number and weight of imported food items steadily increase, safety management of imported food to prevent food safety accidents is becoming more important. The Ministry of Food and Drug Safety conducts on-site inspections of foreign food facilities before customs clearance, as well as import inspections at the customs clearance stage. However, because time, cost, and resources are limited, a data-based safety management plan for imported food is needed. In this study, we sought to increase the efficiency of on-site inspection by building a machine learning prediction model that pre-selects the facilities expected to fail before the on-site inspection takes place. We collected basic information on 303,272 foreign food facilities and processing businesses from the Integrated Food Safety Information Network, together with 1,689 on-site inspection records from 2019 to April 2022. After preprocessing the foreign food facility data, only the records subject to on-site inspection were extracted using the foreign food facility_code, yielding a dataset of 1,689 records and 103 variables. Of the 103 variables, those with a value of '0' on the Theil-U index were removed, and after dimensionality reduction by Multiple Correspondence Analysis, 49 feature variables were finally derived. We built eight different models, tuned their hyperparameters through 5-fold cross-validation, and evaluated the performance of the resulting models. The purpose of selecting facilities for on-site inspection is to maximize recall, which is the probability of judging nonconforming facilities as nonconforming. After applying various machine learning algorithms, the Random Forest model, which had the highest Recall_macro, AUROC, Average PR, F1-score, and Balanced Accuracy, was evaluated as the best model. Finally, we apply Kernel SHAP (SHapley Additive exPlanations) to present the reasons why individual facilities are selected as nonconforming, and discuss applicability to the on-site inspection facility selection system. Based on the results of this study, establishing an imported food management system built on a data-based scientific risk management model is expected to contribute to the efficient use of limited resources such as manpower and budget.
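The recall criterion named in the abstract above can be made concrete with a small calculation. The following is a minimal sketch, not the authors' code: the labels are hypothetical, and the function simply averages per-class recall, which is how macro recall is commonly defined (for this unweighted form it coincides with balanced accuracy).

```python
from collections import defaultdict


def recall_macro(y_true, y_pred):
    """Unweighted mean of the per-class recall values."""
    hits = defaultdict(int)    # correctly predicted samples per true class
    totals = defaultdict(int)  # number of samples per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical labels: 1 = nonconforming facility, 0 = conforming facility.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

if __name__ == "__main__":
    print("macro recall:", recall_macro(y_true, y_pred))
```

Because every class contributes equally, a model cannot score well on this metric by predicting only the majority (conforming) class, which is presumably why it suits a setting with few nonconforming cases.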

The Prognostic Value of the Seventh Day APACHE III Score in Medical Intensive Care Unit (내과계 중환자들의 예후 판정에 었어서 제 7병일 APACHE III 점수의 임상적 유용성)

  • Kim, Mi-Ok;Yun, Soo-Mi;Park, Eun-Joo;Sohn, Jang-Won;Yang, Seok-Chul;Yoon, Ho-Joo;Shin, Dong-Ho;Park, Sung-Soo
    • Tuberculosis and Respiratory Diseases / v.50 no.2 / pp.236-244 / 2001
  • Background: Most current research using prognostic scoring systems in critically ill patients has focused on prediction using first intensive care unit (ICU) day data or daily updated data. The mean ICU length of stay in Korea is usually longer than in the Western world. Consequently, a more cost-effective and practical prognostic parameter is required. The principal aim of this study was to assess the prognostic value of the seventh day (7th day: the average ICU length of stay) APACHE III score in a medical intensive care unit. Methods: 241 medical ICU patients from July 1997 to April 1998 were enrolled. The $1^{st}$ and $7^{th}$ day scores were measured using the APACHE III scoring system and compared between survivors and non-survivors. Logistic regression analysis was performed to determine the relationship between the $1^{st}$ and $7^{th}$ day APACHE III scores and the mortality risk. Results: 1) The mean length of stay in the ICU was $10.3{\pm}13.8$ days. 2) The mean $1^{st}$ and $7^{th}$ day APACHE III scores were $59.7{\pm}30.9$ and $37.9{\pm}27.7$. 3) The mean $1^{st}$ day APACHE III score was significantly lower in survivors than in non-survivors ($49.9{\pm}23.8$ vs $86.3{\pm}32.3$, P<0.0001). 4) The mean $7^{th}$ day APACHE III score was significantly lower in survivors than in non-survivors ($30.1{\pm}18.5$ vs $80.1{\pm}30.4$, P<0.0001). 5) The odds ratios relating the $1^{st}$ and $7^{th}$ day APACHE III scores to the mortality rate were 1.0507 and 1.0779, respectively. Conclusion: These results suggest that the seventh day APACHE III score is as useful in predicting the outcome as the first day APACHE III score. Therefore, in comparison to the daily APACHE III score, measuring the $1^{st}$ and $7^{th}$ day APACHE III scores is also useful for predicting the prognosis of critically ill patients in terms of cost-effectiveness. It is suggested that the $7^{th}$ day APACHE III score is useful for predicting the clinical outcome.
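To see what the reported odds ratios imply, it can help to scale them to a larger score difference. The sketch below assumes (the abstract does not say so explicitly) that 1.0507 and 1.0779 are odds ratios per one-point increase in the APACHE III score from a logistic regression; under that assumption the odds multiply by the ratio raised to the number of points. The 10-point difference is an illustrative choice.

```python
import math


def odds_multiplier(odds_ratio_per_point, delta_points):
    """Factor by which the odds change for a score difference of delta_points.

    Assumes a logistic model in which each extra point multiplies the odds
    by odds_ratio_per_point.
    """
    return odds_ratio_per_point ** delta_points


def logit_coefficient(odds_ratio_per_point):
    """Logistic regression coefficient corresponding to an odds ratio."""
    return math.log(odds_ratio_per_point)


# Odds ratios as reported in the abstract; the per-point reading is an assumption.
REPORTED = {"day 1": 1.0507, "day 7": 1.0779}

if __name__ == "__main__":
    for day, ratio in REPORTED.items():
        print(day,
              "coefficient:", round(logit_coefficient(ratio), 4),
              "odds factor for +10 points:", round(odds_multiplier(ratio, 10), 2))
```

Read this way, the same score difference shifts the odds of death more on day 7 than on day 1, which is in line with the abstract's conclusion that the seventh day score is at least as informative.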

A Prediction of N-value Using Artificial Neural Network (인공신경망을 이용한 N치 예측)

  • Kim, Kwang Myung;Park, Hyoung June;Goo, Tae Hun;Kim, Hyung Chan
    • The Journal of Engineering Geology / v.30 no.4 / pp.457-468 / 2020
  • Problems arising during pile design for plant construction, civil, and architectural work mostly stem from uncertainty in geotechnical characteristics. In particular, the N-value measured through the Standard Penetration Test (SPT) is the most important data to obtain. However, it is difficult to obtain N-values by drilling investigation throughout the entire target area because of constraints such as licensing, time, cost, equipment access, and residents' complaints. For overseas projects, it is impossible to obtain geotechnical characteristics through drilling investigation within a short bidding period. The geotechnical characteristics at points without drilling investigation are usually determined by the engineer's empirical judgment, which can lead to errors in pile design and quantity calculation, causing construction delays and cost increases. This problem could be overcome if the N-value could be predicted at points without drilling investigation using a limited minimum of drilling investigation data. This study was conducted to predict the N-value using an Artificial Neural Network (ANN), which is one of the Artificial Intelligence (AI) methods. An Artificial Neural Network processes a limited amount of geotechnical data in a way modeled on biological logic processes, providing more reliable results for the input variables. The purpose of this study is to predict the N-value at points without drilling investigation from patterns learned by multi-layer perceptron and error back-propagation algorithms using minimal geotechnical data. We reviewed the reliability of the values predicted by the AI method against the measured values and were able to confirm high reliability. To resolve geotechnical uncertainty, in the next steps we will perform a sensitivity analysis of the input variables to increase the learning effect, which may require some technical updates to the program. We hope that our study will be helpful to design work in the future.

Corporate Credit Rating based on Bankruptcy Probability Using AdaBoost Algorithm-based Support Vector Machine (AdaBoost 알고리즘기반 SVM을 이용한 부실 확률분포 기반의 기업신용평가)

  • Shin, Taek-Soo;Hong, Tae-Ho
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.25-41 / 2011
  • Recently, support vector machines (SVMs) have been recognized as competitive tools compared with other data mining techniques for solving pattern recognition or classification decision problems. Furthermore, many studies have shown them to be more powerful than traditional artificial neural networks (ANNs) (Amendolia et al., 2003; Huang et al., 2004; Huang et al., 2005; Tay and Cao, 2001; Min and Lee, 2005; Shin et al., 2005; Kim, 2003). A classification decision made by any classifier (i.e., data mining technique), whether a binary or a multi-class decision problem, is highly cost-sensitive, particularly in financial classification problems such as credit rating: if credit ratings are misclassified, investors or financial decision makers may suffer severe economic losses. Therefore, it is necessary to convert the outputs of the classifier into multi-class credit ratings based on well-calibrated posterior probabilities, according to the bankruptcy probabilities. However, SVMs do not inherently provide such probabilities, so a method is required to create them (Platt, 1999; Drish, 2001). This paper applied AdaBoost algorithm-based support vector machines (SVMs) to bankruptcy prediction, as a binary classification problem, for IT companies in Korea, and then performed multi-class credit rating of the companies by forming a normal-distribution shape of posterior bankruptcy probabilities from the loss functions extracted from the SVMs. Our proposed approach also showed that it can minimize misclassification problems by adjusting the credit grade interval ranges, on the condition that each credit grade for credit loan borrowers has its own credit risk, i.e., bankruptcy probability.
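The step from raw SVM outputs to graded probabilities can be illustrated with the sigmoid mapping from the cited Platt (1999) work. This is a sketch of that general idea only, not the AdaBoost-based procedure the paper proposes: the sigmoid parameters `a` and `b`, the grade cutoffs, and the grade labels are all hypothetical values chosen for illustration.

```python
import math
from bisect import bisect_right


def platt_probability(decision_value, a, b):
    """Platt-style sigmoid: P(y = 1 | f) = 1 / (1 + exp(a * f + b)).

    In practice a and b are fitted on held-out data; here they are supplied
    by the caller as illustrative constants.
    """
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


# Hypothetical upper bounds of bankruptcy probability for grades A, B, C;
# anything at or above the last cutoff falls into grade D.
GRADE_CUTOFFS = [0.05, 0.20, 0.50]
GRADE_LABELS = ["A", "B", "C", "D"]


def credit_grade(bankruptcy_probability):
    """Map a bankruptcy probability to a credit grade via interval lookup."""
    return GRADE_LABELS[bisect_right(GRADE_CUTOFFS, bankruptcy_probability)]


if __name__ == "__main__":
    # Hypothetical SVM decision values (positive = closer to the bankrupt class).
    for f in (-2.0, -0.5, 0.0, 1.0):
        p = platt_probability(f, a=-2.0, b=0.0)
        print(f, round(p, 3), credit_grade(p))
```

Adjusting `GRADE_CUTOFFS` changes how many borrowers land in each grade, which is the kind of interval tuning the abstract describes for reducing misclassification.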

Determination of Grades and Design Strengths of Machine Graded Lumber in Korea (국내 기계등급구조재의 등급구분체계 및 기준설계값 결정방법 연구)

  • Hong, Jung-Pyo;Lee, Jun-Jae;Park, Moon-Jae;Yeo, Hwanmyeong;Pang, Sung-Jun;Kim, Chul-Ki;Oh, Jung-Kwon
    • Journal of the Korean Wood Science and Technology / v.43 no.4 / pp.446-455 / 2015
  • Based on comparative studies of the standards and grading procedures for machine graded lumber in Korea and other countries, this study proposed a procedure for determining the grade classification and design strengths of domestic machine graded lumber. Differences between machine stress rated lumber and E-rated laminations were detailed in order to clarify the need for improving the procedure. For this improvement, the use of an average MOE requirement for grading was introduced in place of the fixed minimum MOE requirement currently used in the Korean standards. The fixed minimum MOE requirement method was found to be easier for an inspector to apply but less efficient as a strength predictor than the average MOE requirement method. The advantages of the average MOE requirement method are its statistical, MOR-MOE regression-based MOR prediction and its high efficiency in quality control, although it requires a computer-aided operation system at initial setup. A major weakness of the current Korean grading system was found to be that the different strength characteristics of different wood species were not reflected in the grade classification or the tabulated allowable design stresses. The proposed procedures were developed by combining the respective merits of both methods and were based on MOR-MOE regression analysis. Through this procedure, the grades of machine stress rated lumber should be revised to become interchangeable with E-rated laminations, which would benefit the cost competitiveness of the domestic machine graded lumber and glued laminated timber industry.

Analysis of Environmental Design Data for Growing Pleurotus eryngii (큰 느타리버섯 재배사의 환경설계용 자료 분석)

  • Yoon, Yong-Cheol;Suh, Won-Myung;Lee, In-Bok
    • Journal of Bio-Environment Control / v.14 no.2 / pp.95-105 / 2005
  • This study was carried out to compile the energy use effects and energy requirements that serve as environmental design data for Pleurotus eryngii growing houses. Heating and cooling Degree-Hours (D-H) were calculated and compared for several Pleurotus eryngii growing houses of sandwich-panel (permanent) or arch-roofed (simple) type structures, modified and suggested through field survey and analysis. Thermal resistance (R-value) was also calculated for the heat-insulating and covering materials of the permanent and simple types, which were made of polyurethane or polystyrene panels and heat conservation cover walls of 7 to 8 layers. The variation of heating and cooling D-H simulated for the Jinju area was nearly linearly proportional to the inside set temperature. The variation of cooling D-H was much more sensitive than that of heating D-H. Therefore, it was expected that the variation of required energy with the set temperature, or with the actual temperature maintained inside the cultivation house, could be estimated, and that the estimated heating and cooling D-H could be used effectively for the verification of environmental simulations as well as for the calculation of required energy amounts. When the cultivation floor areas are all equal, panel-type houses constructed from various combinations of materials were found to be far more effective than simple-type pipe houses in terms of energy conservation and maintenance, apart from some additional initial investment cost. The energy effectiveness of multi-span houses compared with single-span houses could also be estimated, together with the prediction of energy requirements depending on the insulation level of the wall and roof areas. Additionally, structural as well as environmental optimization is expected to be possible by calculating periodic and/or seasonal energy requirements for the various combinations of insulation level, different climate conditions, etc.
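The degree-hour quantity discussed in the abstract above can be sketched in a few lines. This assumes the common definition in which heating D-H sums, hour by hour, how far the outdoor temperature falls below the inside set temperature, and cooling D-H sums how far it rises above it; the abstract does not give the exact formula used, and the hourly temperatures below are hypothetical.

```python
def heating_degree_hours(outdoor_temps_c, set_temp_c):
    """Sum of hourly shortfalls of outdoor temperature below the set temperature."""
    return sum(max(0.0, set_temp_c - t) for t in outdoor_temps_c)


def cooling_degree_hours(outdoor_temps_c, set_temp_c):
    """Sum of hourly excesses of outdoor temperature above the set temperature."""
    return sum(max(0.0, t - set_temp_c) for t in outdoor_temps_c)


# Hypothetical hourly outdoor temperatures in degrees Celsius.
HOURLY_TEMPS = [2.0, 5.0, 11.0, 18.0, 24.0, 15.0]

if __name__ == "__main__":
    for set_temp in (14.0, 16.0, 18.0):
        print(set_temp,
              heating_degree_hours(HOURLY_TEMPS, set_temp),
              cooling_degree_hours(HOURLY_TEMPS, set_temp))
```

Under this definition, raising the set temperature by one degree adds one degree-hour for every hour that is already below it, which is one way to read the near-linear relationship with set temperature reported in the abstract.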

Analysis of the Elderly Travel Characteristics and Travel Behavior with Daily Activity Schedules (the Case of Seoul, Korea) (활동 스케줄 분석을 통한 고령자의 통행특성과 통행행태에 관한 연구)

  • Seo, Sang-Eon;Jeong, Jin-Hyeok;Kim, Sun-Gwan
    • Journal of Korean Society of Transportation / v.24 no.5 s.91 / pp.89-108 / 2006
  • Korea has been an ageing society since 2000, with the population aged over 65 accounting for more than 7% of the total. An ageing society needs transportation facilities that take elderly people's travel behavior into account. This study aims to understand elderly people's travel behavior using recent data from Korea. The activity schedule approach starts from the premise that travel outcomes are part of an activity scheduling decision. For this approach, we used discrete choice models (especially the Nested Logit Model) to address the basic modeling problem of capturing decision interactions among the many choice dimensions of the immense activity schedule choice set. The day activity schedule is viewed as a set of tours and at-home activity episodes tied together by an overarching day activity pattern, using data from the Seoul Metropolitan Area Transportation Survey, which was conducted in June 2002. Decisions about a specific tour in the schedule are conditioned by the choice of day activity pattern. The day activity scheduling model estimated in this study consists of tours interrelated in a day activity pattern. The day activity pattern model represents the basic decision of activity participation and priorities and places each activity in a configuration of tours and at-home episodes. Each pattern alternative is defined by the primary activity of the day, whether the primary activity occurs at home or away, and the type of tour for the primary activity. In particular, in the travel mode choice of the elderly and non-workers, travel cost was found to be important in understanding interpersonal variations in mode choice behavior, whereas travel time was found to be a less important factor in choosing a travel mode. In addition, although the elderly were generally likely to choose transit modes, private modes were preferred by those over 75 years old, owing to weakened physical health that makes activities such as going up and down stairs difficult. Therefore, as Korea enters the ageing society, transit modes should receive heavy investment in transportation facility planning to improve transportation services for the elderly. Although the model has not yet been validated in before-and-after prediction studies, this study gives strong evidence of its behavioral soundness, current practicality, and potential for improving the reliability of transportation projects beyond that of the best existing systems in Korea.
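The nested logit structure mentioned in the abstract above can be sketched as a two-level choice: first a nest, then an alternative within it. This is a generic textbook formulation under stated assumptions, not the model estimated in the paper; the alternatives, nests, utilities, and scale parameters below are all hypothetical.

```python
import math


def nested_logit_probabilities(utilities, nests, scales):
    """Choice probabilities for a two-level nested logit model.

    utilities: {alternative: systematic utility V}
    nests:     {nest name: [alternatives in the nest]}
    scales:    {nest name: scale parameter lambda, typically in (0, 1]}
    """
    # Inclusive value (logsum) of each nest.
    inclusive = {
        nest: math.log(sum(math.exp(utilities[a] / scales[nest]) for a in alts))
        for nest, alts in nests.items()
    }
    denominator = sum(math.exp(scales[n] * inclusive[n]) for n in nests)

    probabilities = {}
    for nest, alts in nests.items():
        p_nest = math.exp(scales[nest] * inclusive[nest]) / denominator
        for a in alts:
            p_within = math.exp(utilities[a] / scales[nest] - inclusive[nest])
            probabilities[a] = p_nest * p_within
    return probabilities


# Hypothetical mode-choice example: transit modes share a nest.
UTILITIES = {"bus": -1.0, "subway": -0.8, "car": -0.5}
NESTS = {"transit": ["bus", "subway"], "private": ["car"]}
SCALES = {"transit": 0.6, "private": 1.0}

if __name__ == "__main__":
    for mode, p in nested_logit_probabilities(UTILITIES, NESTS, SCALES).items():
        print(mode, round(p, 3))
```

When every scale parameter equals 1, the formula collapses to the ordinary multinomial logit, so values below 1 are what express correlation among alternatives in the same nest.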

Risk Ranking Analysis for the City-Gas Pipelines in the Underground Laying Facilities (지하매설물 중 도시가스 지하배관에 대한 위험성 서열화 분석)

  • Ko, Jae-Sun;Kim, Hyo
    • Fire Science and Engineering / v.18 no.1 / pp.54-66 / 2004
  • In this article, we suggest a hazard-assessment method for underground pipelines and identify cost-efficient pipeline maintenance schemes. Three methods are applied as approaches to ranking the hazards of underground pipelines. The first is RBI (Risk Based Inspection), which first assesses the effects of the neighboring population and of the dimensions, thickness, and working time of the pipe, and enables us to estimate the risk exposure quantitatively. The second is a scoring system based on the environmental factors of the buried pipelines. Finally, we quantify the frequency of releases using the present THOMAS' theory. In this work, as a result of assessing the hazard using the SPC scheme, the hazard scores related to how the gas pipelines erode range from 30 to 70, which means that the assessment criteria define the relative hazards of actual pipelines well. Therefore, even if a pipeline region has a relatively low score, it can have a high frequency of leakage due to its longer length. The acceptable limit of the pipeline release frequency ranges from 2.50E-2 to 1.00E-1/yr, so appropriate actions must be taken to keep the consequence below the acceptable region. The prediction of total frequency using regression analysis shows that the limit operating time of a pipeline is in the range of 11 to 13 years, which is consistent with that of actual pipelines. In conclusion, the hazard-ranking scheme suggested in this research can be applied very effectively to the maintenance of underground pipelines.

Estimation and Mapping of Soil Organic Matter using Visible-Near Infrared Spectroscopy (분광학을 이용한 토양 유기물 추정 및 분포도 작성)

  • Choe, Eun-Young;Hong, Suk-Young;Kim, Yi-Hyun;Zhang, Yong-Seon
    • Korean Journal of Soil Science and Fertilizer / v.43 no.6 / pp.968-974 / 2010
  • We assessed the feasibility of applying the discrete wavelet transform (DWT) to spectral processing in order to enhance the estimation quality of soil organic matter from visible-near infrared spectra, and mapped its distribution with a block Kriging model. Continuum removal and the $1^{st}$ derivative transform, as well as Haar and Daubechies DWT, were used to enhance spectral variation related to soil organic matter content, and the resulting spectra were fed into a PLSR (Partial Least Squares Regression) model. Estimation results using raw reflectance and transformed spectra showed similar quality, with $R^2$ > 0.6 and RPD > 1.5. These values indicate an approximate prediction of soil organic matter content. The poor estimation performance with DWT spectra might be caused by the coarser approximation of DWT, which is not enough to express the spectral variation associated with soil organic matter content. The distribution maps of soil organic matter were drawn with a spatial information model, Kriging. The organic matter contents of the soil samples followed a Gaussian distribution centered at around 20 g $kg^{-1}$, and the values in the map were distributed with similar patterns. The estimated organic matter contents had a distribution similar to that of the measured values, even though some parts of the estimated value map showed slightly higher values. If the estimation quality is improved further, the estimation model and spectroscopy-based mapping may be applied to global soil mapping, soil classification, and remote sensing data analysis as a rapid and cost-effective method.
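The two quality measures quoted in the abstract above, $R^2$ and RPD, can be computed as follows. This is a sketch under common definitions: $R^2$ as one minus the ratio of residual to total sum of squares, and RPD as the standard deviation of the measured values divided by the root mean square error of prediction. Definitions of RPD vary slightly between authors, and the sample values are hypothetical.

```python
import math
import statistics


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observations / RMSE of prediction."""
    rmse = math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )
    return statistics.stdev(observed) / rmse


# Hypothetical soil organic matter contents (g/kg): measured vs. estimated.
MEASURED = [10.0, 20.0, 30.0, 40.0]
ESTIMATED = [12.0, 18.0, 33.0, 39.0]

if __name__ == "__main__":
    print("R^2:", round(r_squared(MEASURED, ESTIMATED), 3))
    print("RPD:", round(rpd(MEASURED, ESTIMATED), 3))
```

An RPD only slightly above 1 means the prediction error is nearly as large as the natural spread of the samples, which is why thresholds such as the 1.5 quoted above are read as marking rough rather than precise prediction.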

A Long-term Variability of the Extent of East Asian Desert (동아시아 사막 면적의 경년변화분석)

  • Han, Hyeon-Gyeong;Lee, Eunkyung;Son, Sanghun;Choi, Sungwon;Lee, Kyeong-Sang;Seo, Minji;Jin, Donghyun;Kim, Honghee;Kwon, Chaeyoung;Lee, Darae;Han, Kyung-Soo
    • Korean Journal of Remote Sensing / v.34 no.6_1 / pp.869-877 / 2018
  • The area of desert in East Asia is increasing every year, causing great social damage costs. Because deserts are widely distributed and difficult for people to access, remote sensing using satellites is commonly used. However, studies comparing the desert areas calculated from different satellite sensors are insufficient. It is important to recognize the characteristics of the desert area data calculated for each sensor, because the calculated desert area may differ depending on the choice of sensor and may affect climate prediction and desertification prevention measures. In this study, the desert area of Northeast Asia in 2001-2013 was calculated and compared using the Moderate Resolution Imaging Spectroradiometer (MODIS) and Vegetation. As a result of the comparison, the desert area from Vegetation increased by $3,020km^2/year$, while that from MODIS decreased by $20,911km^2/year$. Because it is difficult to obtain actual data, we performed an indirect validation by analyzing the correlation with the occurrence frequency of Asian dust, which is affected by desert area change. As a result, MODIS showed a relatively low correlation with R = 0.2071, and Vegetation showed a relatively high correlation with R = 0.4837. It is considered that Vegetation provided a more accurate desert area calculation for the Northeast Asian desert area.
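The two kinds of figures reported in the abstract above, an area trend in $km^2/year$ and a correlation coefficient R, can be obtained from yearly series as sketched below. The abstract does not state how the trend was fitted, so the ordinary least-squares slope used here is an assumption, and the yearly values are hypothetical placeholders rather than data from the study.

```python
import math


def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical yearly desert area (km^2) and Asian dust occurrence counts.
YEARS = [2001, 2002, 2003, 2004, 2005]
DESERT_AREA = [1_500_000.0, 1_504_000.0, 1_503_000.0, 1_509_000.0, 1_512_000.0]
DUST_EVENTS = [12, 15, 11, 16, 14]

if __name__ == "__main__":
    print("trend (km^2/year):", linear_trend(YEARS, DESERT_AREA))
    print("R with dust frequency:", round(pearson_r(DESERT_AREA, DUST_EVENTS), 4))
```

Running the same two functions on the series from each sensor would give directly comparable trend and correlation values of the kind quoted in the abstract.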