• Title/Summary/Keyword: Prediction Algorithms

Search Result 996, Processing Time 0.028 seconds

Preliminary Inspection Prediction Model to select the on-Site Inspected Foreign Food Facility using Multiple Correspondence Analysis (차원축소를 활용한 해외제조업체 대상 사전점검 예측 모형에 관한 연구)

  • Hae Jin Park;Jae Suk Choi;Sang Goo Cho
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.1
    • /
    • pp.121-142
    • /
    • 2023
  • As the number and weight of imported food are steadily increasing, safety management of imported food to prevent food safety accidents is becoming more important. The Ministry of Food and Drug Safety conducts on-site inspections of foreign food facilities before customs clearance as well as import inspection at the customs clearance stage. However, a data-based safety management plan for imported food is needed due to time, cost, and limited resources. In this study, we tried to increase the efficiency of the on-site inspection by preparing a machine learning prediction model that pre-selects the companies that are expected to fail before the on-site inspection. Basic information of 303,272 foreign food facilities and processing businesses collected in the Integrated Food Safety Information Network and 1,689 cases of on-site inspection information data collected from 2019 to April 2022 were collected. After preprocessing the data of foreign food facilities, only the data subject to on-site inspection were extracted using the foreign food facility_code. As a result, it consisted of a total of 1,689 data and 103 variables. For 103 variables, variables that were '0' were removed based on the Theil-U index, and after reducing by applying Multiple Correspondence Analysis, 49 characteristic variables were finally derived. We build eight different models and perform hyperparameter tuning through 5-fold cross validation. Then, the performance of the generated models are evaluated. The research purpose of selecting companies subject to on-site inspection is to maximize the recall, which is the probability of judging nonconforming companies as nonconforming. As a result of applying various algorithms of machine learning, the Random Forest model with the highest Recall_macro, AUROC, Average PR, F1-score, and Balanced Accuracy was evaluated as the best model. Finally, we apply Kernal SHAP (SHapley Additive exPlanations) to present the selection reason for nonconforming facilities of individual instances, and discuss applicability to the on-site inspection facility selection system. Based on the results of this study, it is expected that it will contribute to the efficient operation of limited resources such as manpower and budget by establishing an imported food management system through a data-based scientific risk management model.

Development of a Stock Trading System Using M & W Wave Patterns and Genetic Algorithms (M&W 파동 패턴과 유전자 알고리즘을 이용한 주식 매매 시스템 개발)

  • Yang, Hoonseok;Kim, Sunwoong;Choi, Heung Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.63-83
    • /
    • 2019
  • Investors prefer to look for trading points based on the graph shown in the chart rather than complex analysis, such as corporate intrinsic value analysis and technical auxiliary index analysis. However, the pattern analysis technique is difficult and computerized less than the needs of users. In recent years, there have been many cases of studying stock price patterns using various machine learning techniques including neural networks in the field of artificial intelligence(AI). In particular, the development of IT technology has made it easier to analyze a huge number of chart data to find patterns that can predict stock prices. Although short-term forecasting power of prices has increased in terms of performance so far, long-term forecasting power is limited and is used in short-term trading rather than long-term investment. Other studies have focused on mechanically and accurately identifying patterns that were not recognized by past technology, but it can be vulnerable in practical areas because it is a separate matter whether the patterns found are suitable for trading. When they find a meaningful pattern, they find a point that matches the pattern. They then measure their performance after n days, assuming that they have bought at that point in time. Since this approach is to calculate virtual revenues, there can be many disparities with reality. The existing research method tries to find a pattern with stock price prediction power, but this study proposes to define the patterns first and to trade when the pattern with high success probability appears. The M & W wave pattern published by Merrill(1980) is simple because we can distinguish it by five turning points. Despite the report that some patterns have price predictability, there were no performance reports used in the actual market. The simplicity of a pattern consisting of five turning points has the advantage of reducing the cost of increasing pattern recognition accuracy. In this study, 16 patterns of up conversion and 16 patterns of down conversion are reclassified into ten groups so that they can be easily implemented by the system. Only one pattern with high success rate per group is selected for trading. Patterns that had a high probability of success in the past are likely to succeed in the future. So we trade when such a pattern occurs. It is a real situation because it is measured assuming that both the buy and sell have been executed. We tested three ways to calculate the turning point. The first method, the minimum change rate zig-zag method, removes price movements below a certain percentage and calculates the vertex. In the second method, high-low line zig-zag, the high price that meets the n-day high price line is calculated at the peak price, and the low price that meets the n-day low price line is calculated at the valley price. In the third method, the swing wave method, the high price in the center higher than n high prices on the left and right is calculated as the peak price. If the central low price is lower than the n low price on the left and right, it is calculated as valley price. The swing wave method was superior to the other methods in the test results. It is interpreted that the transaction after checking the completion of the pattern is more effective than the transaction in the unfinished state of the pattern. Genetic algorithms(GA) were the most suitable solution, although it was virtually impossible to find patterns with high success rates because the number of cases was too large in this simulation. We also performed the simulation using the Walk-forward Analysis(WFA) method, which tests the test section and the application section separately. So we were able to respond appropriately to market changes. In this study, we optimize the stock portfolio because there is a risk of over-optimized if we implement the variable optimality for each individual stock. Therefore, we selected the number of constituent stocks as 20 to increase the effect of diversified investment while avoiding optimization. We tested the KOSPI market by dividing it into six categories. In the results, the portfolio of small cap stock was the most successful and the high vol stock portfolio was the second best. This shows that patterns need to have some price volatility in order for patterns to be shaped, but volatility is not the best.

A Study on the Impact of Artificial Intelligence on Decision Making : Focusing on Human-AI Collaboration and Decision-Maker's Personality Trait (인공지능이 의사결정에 미치는 영향에 관한 연구 : 인간과 인공지능의 협업 및 의사결정자의 성격 특성을 중심으로)

  • Lee, JeongSeon;Suh, Bomil;Kwon, YoungOk
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.231-252
    • /
    • 2021
  • Artificial intelligence (AI) is a key technology that will change the future the most. It affects the industry as a whole and daily life in various ways. As data availability increases, artificial intelligence finds an optimal solution and infers/predicts through self-learning. Research and investment related to automation that discovers and solves problems on its own are ongoing continuously. Automation of artificial intelligence has benefits such as cost reduction, minimization of human intervention and the difference of human capability. However, there are side effects, such as limiting the artificial intelligence's autonomy and erroneous results due to algorithmic bias. In the labor market, it raises the fear of job replacement. Prior studies on the utilization of artificial intelligence have shown that individuals do not necessarily use the information (or advice) it provides. Algorithm error is more sensitive than human error; so, people avoid algorithms after seeing errors, which is called "algorithm aversion." Recently, artificial intelligence has begun to be understood from the perspective of the augmentation of human intelligence. We have started to be interested in Human-AI collaboration rather than AI alone without human. A study of 1500 companies in various industries found that human-AI collaboration outperformed AI alone. In the medicine area, pathologist-deep learning collaboration dropped the pathologist cancer diagnosis error rate by 85%. Leading AI companies, such as IBM and Microsoft, are starting to adopt the direction of AI as augmented intelligence. Human-AI collaboration is emphasized in the decision-making process, because artificial intelligence is superior in analysis ability based on information. Intuition is a unique human capability so that human-AI collaboration can make optimal decisions. In an environment where change is getting faster and uncertainty increases, the need for artificial intelligence in decision-making will increase. In addition, active discussions are expected on approaches that utilize artificial intelligence for rational decision-making. This study investigates the impact of artificial intelligence on decision-making focuses on human-AI collaboration and the interaction between the decision maker personal traits and advisor type. The advisors were classified into three types: human, artificial intelligence, and human-AI collaboration. We investigated perceived usefulness of advice and the utilization of advice in decision making and whether the decision-maker's personal traits are influencing factors. Three hundred and eleven adult male and female experimenters conducted a task that predicts the age of faces in photos and the results showed that the advisor type does not directly affect the utilization of advice. The decision-maker utilizes it only when they believed advice can improve prediction performance. In the case of human-AI collaboration, decision-makers higher evaluated the perceived usefulness of advice, regardless of the decision maker's personal traits and the advice was more actively utilized. If the type of advisor was artificial intelligence alone, decision-makers who scored high in conscientiousness, high in extroversion, or low in neuroticism, high evaluated the perceived usefulness of the advice so they utilized advice actively. This study has academic significance in that it focuses on human-AI collaboration that the recent growing interest in artificial intelligence roles. It has expanded the relevant research area by considering the role of artificial intelligence as an advisor of decision-making and judgment research, and in aspects of practical significance, suggested views that companies should consider in order to enhance AI capability. To improve the effectiveness of AI-based systems, companies not only must introduce high-performance systems, but also need employees who properly understand digital information presented by AI, and can add non-digital information to make decisions. Moreover, to increase utilization in AI-based systems, task-oriented competencies, such as analytical skills and information technology capabilities, are important. in addition, it is expected that greater performance will be achieved if employee's personal traits are considered.

A Study on the Effect of Network Centralities on Recommendation Performance (네트워크 중심성 척도가 추천 성능에 미치는 영향에 대한 연구)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.1
    • /
    • pp.23-46
    • /
    • 2021
  • Collaborative filtering, which is often used in personalization recommendations, is recognized as a very useful technique to find similar customers and recommend products to them based on their purchase history. However, the traditional collaborative filtering technique has raised the question of having difficulty calculating the similarity for new customers or products due to the method of calculating similaritiesbased on direct connections and common features among customers. For this reason, a hybrid technique was designed to use content-based filtering techniques together. On the one hand, efforts have been made to solve these problems by applying the structural characteristics of social networks. This applies a method of indirectly calculating similarities through their similar customers placed between them. This means creating a customer's network based on purchasing data and calculating the similarity between the two based on the features of the network that indirectly connects the two customers within this network. Such similarity can be used as a measure to predict whether the target customer accepts recommendations. The centrality metrics of networks can be utilized for the calculation of these similarities. Different centrality metrics have important implications in that they may have different effects on recommended performance. In this study, furthermore, the effect of these centrality metrics on the performance of recommendation may vary depending on recommender algorithms. In addition, recommendation techniques using network analysis can be expected to contribute to increasing recommendation performance even if they apply not only to new customers or products but also to entire customers or products. By considering a customer's purchase of an item as a link generated between the customer and the item on the network, the prediction of user acceptance of recommendation is solved as a prediction of whether a new link will be created between them. As the classification models fit the purpose of solving the binary problem of whether the link is engaged or not, decision tree, k-nearest neighbors (KNN), logistic regression, artificial neural network, and support vector machine (SVM) are selected in the research. The data for performance evaluation used order data collected from an online shopping mall over four years and two months. Among them, the previous three years and eight months constitute social networks composed of and the experiment was conducted by organizing the data collected into the social network. The next four months' records were used to train and evaluate recommender models. Experiments with the centrality metrics applied to each model show that the recommendation acceptance rates of the centrality metrics are different for each algorithm at a meaningful level. In this work, we analyzed only four commonly used centrality metrics: degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality. Eigenvector centrality records the lowest performance in all models except support vector machines. Closeness centrality and betweenness centrality show similar performance across all models. Degree centrality ranking moderate across overall models while betweenness centrality always ranking higher than degree centrality. Finally, closeness centrality is characterized by distinct differences in performance according to the model. It ranks first in logistic regression, artificial neural network, and decision tree withnumerically high performance. However, it only records very low rankings in support vector machine and K-neighborhood with low-performance levels. As the experiment results reveal, in a classification model, network centrality metrics over a subnetwork that connects the two nodes can effectively predict the connectivity between two nodes in a social network. Furthermore, each metric has a different performance depending on the classification model type. This result implies that choosing appropriate metrics for each algorithm can lead to achieving higher recommendation performance. In general, betweenness centrality can guarantee a high level of performance in any model. It would be possible to consider the introduction of proximity centrality to obtain higher performance for certain models.

Multi-day Trip Planning System with Collaborative Recommendation (협업적 추천 기반의 여행 계획 시스템)

  • Aprilia, Priska;Oh, Kyeong-Jin;Hong, Myung-Duk;Ga, Myeong-Hyeon;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.159-185
    • /
    • 2016
  • Planning a multi-day trip is a complex, yet time-consuming task. It usually starts with selecting a list of points of interest (POIs) worth visiting and then arranging them into an itinerary, taking into consideration various constraints and preferences. When choosing POIs to visit, one might ask friends to suggest them, search for information on the Web, or seek advice from travel agents; however, those options have their limitations. First, the knowledge of friends is limited to the places they have visited. Second, the tourism information on the internet may be vast, but at the same time, might cause one to invest a lot of time reading and filtering the information. Lastly, travel agents might be biased towards providers of certain travel products when suggesting itineraries. In recent years, many researchers have tried to deal with the huge amount of tourism information available on the internet. They explored the wisdom of the crowd through overwhelming images shared by people on social media sites. Furthermore, trip planning problems are usually formulated as 'Tourist Trip Design Problems', and are solved using various search algorithms with heuristics. Various recommendation systems with various techniques have been set up to cope with the overwhelming tourism information available on the internet. Prediction models of recommendation systems are typically built using a large dataset. However, sometimes such a dataset is not always available. For other models, especially those that require input from people, human computation has emerged as a powerful and inexpensive approach. This study proposes CYTRIP (Crowdsource Your TRIP), a multi-day trip itinerary planning system that draws on the collective intelligence of contributors in recommending POIs. In order to enable the crowd to collaboratively recommend POIs to users, CYTRIP provides a shared workspace. In the shared workspace, the crowd can recommend as many POIs to as many requesters as they can, and they can also vote on the POIs recommended by other people when they find them interesting. In CYTRIP, anyone can make a contribution by recommending POIs to requesters based on requesters' specified preferences. CYTRIP takes input on the recommended POIs to build a multi-day trip itinerary taking into account the user's preferences, the various time constraints, and the locations. The input then becomes a multi-day trip planning problem that is formulated in Planning Domain Definition Language 3 (PDDL3). A sequence of actions formulated in a domain file is used to achieve the goals in the planning problem, which are the recommended POIs to be visited. The multi-day trip planning problem is a highly constrained problem. Sometimes, it is not feasible to visit all the recommended POIs with the limited resources available, such as the time the user can spend. In order to cope with an unachievable goal that can result in no solution for the other goals, CYTRIP selects a set of feasible POIs prior to the planning process. The planning problem is created for the selected POIs and fed into the planner. The solution returned by the planner is then parsed into a multi-day trip itinerary and displayed to the user on a map. The proposed system is implemented as a web-based application built using PHP on a CodeIgniter Web Framework. In order to evaluate the proposed system, an online experiment was conducted. From the online experiment, results show that with the help of the contributors, CYTRIP can plan and generate a multi-day trip itinerary that is tailored to the users' preferences and bound by their constraints, such as location or time constraints. The contributors also find that CYTRIP is a useful tool for collecting POIs from the crowd and planning a multi-day trip.

Prediction of Potential Species Richness of Plants Adaptable to Climate Change in the Korean Peninsula (한반도 기후변화 적응 대상 식물 종풍부도 변화 예측 연구)

  • Shin, Man-Seok;Seo, Changwan;Lee, Myungwoo;Kim, Jin-Yong;Jeon, Ja-Young;Adhikari, Pradeep;Hong, Seung-Bum
    • Journal of Environmental Impact Assessment
    • /
    • v.27 no.6
    • /
    • pp.562-581
    • /
    • 2018
  • This study was designed to predict the changes in species richness of plants under the climate change in South Korea. The target species were selected based on the Plants Adaptable to Climate Change in the Korean Peninsula. Altogether, 89 species including 23 native plants, 30 northern plants, and 36 southern plants. We used the Species Distribution Model to predict the potential habitat of individual species under the climate change. We applied ten single-model algorithms and the pre-evaluation weighted ensemble method. And then, species richness was derived from the results of individual species. Two representative concentration pathways (RCP 4.5 and RCP 8.5) were used to simulate the species richness of plants in 2050 and 2070. The current species richness was predicted to be high in the national parks located in the Baekdudaegan mountain range in Gangwon Province and islands of the South Sea. The future species richness was predicted to be lower in the national park and the Baekdudaegan mountain range in Gangwon Province and to be higher for southern coastal regions. The average value of the current species richness showed that the national park area was higher than the whole area of South Korea. However, predicted species richness were not the difference between the national park area and the whole area of South Korea. The difference between current and future species richness of plants could be the disappearance of a large number of native and northern plants from South Korea. The additional reason could be the expansion of potential habitat of southern plants under climate change. However, if species dispersal to a suitable habitat was not achieved, the species richness will be reduced drastically. The results were different depending on whether species were dispersed or not. This study will be useful for the conservation planning, establishment of the protected area, restoration of biological species and strategies for adaptation of climate change.

A Prediction of N-value Using Artificial Neural Network (인공신경망을 이용한 N치 예측)

  • Kim, Kwang Myung;Park, Hyoung June;Goo, Tae Hun;Kim, Hyung Chan
    • The Journal of Engineering Geology
    • /
    • v.30 no.4
    • /
    • pp.457-468
    • /
    • 2020
  • Problems arising during pile design works for plant construction, civil and architecture work are mostly come from uncertainty of geotechnical characteristics. In particular, obtaining the N-value measured through the Standard Penetration Test (SPT) is the most important data. However, it is difficult to obtain N-value by drilling investigation throughout the all target area. There are many constraints such as licensing, time, cost, equipment access and residential complaints etc. it is impossible to obtain geotechnical characteristics through drilling investigation within a short bidding period in overseas. The geotechnical characteristics at non-drilling investigation points are usually determined by the engineer's empirical judgment, which can leads to errors in pile design and quantity calculation causing construction delay and cost increase. It would be possible to overcome this problem if N-value could be predicted at the non-drilling investigation points using limited minimum drilling investigation data. This study was conducted to predicted the N-value using an Artificial Neural Network (ANN) which one of the Artificial intelligence (AI) method. An Artificial Neural Network treats a limited amount of geotechnical characteristics as a biological logic process, providing more reliable results for input variables. The purpose of this study is to predict N-value at the non-drilling investigation points through patterns which is studied by multi-layer perceptron and error back-propagation algorithms using the minimum geotechnical data. It has been reviewed the reliability of the values that predicted by AI method compared to the measured values, and we were able to confirm the high reliability as a result. To solving geotechnical uncertainty, we will perform sensitivity analysis of input variables to increase learning effect in next steps and it may need some technical update of program. We hope that our study will be helpful to design works in the future.

MDP(Markov Decision Process) Model for Prediction of Survivor Behavior based on Topographic Information (지형정보 기반 조난자 행동예측을 위한 마코프 의사결정과정 모형)

  • Jinho Son;Suhwan Kim
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.2
    • /
    • pp.101-114
    • /
    • 2023
  • In the wartime, aircraft carrying out a mission to strike the enemy deep in the depth are exposed to the risk of being shoot down. As a key combat force in mordern warfare, it takes a lot of time, effot and national budget to train military flight personnel who operate high-tech weapon systems. Therefore, this study studied the path problem of predicting the route of emergency escape from enemy territory to the target point to avoid obstacles, and through this, the possibility of safe recovery of emergency escape military flight personnel was increased. based problem, transforming the problem into a TSP, VRP, and Dijkstra algorithm, and approaching it with an optimization technique. However, if this problem is approached in a network problem, it is difficult to reflect the dynamic factors and uncertainties of the battlefield environment that military flight personnel in distress will face. So, MDP suitable for modeling dynamic environments was applied and studied. In addition, GIS was used to obtain topographic information data, and in the process of designing the reward structure of MDP, topographic information was reflected in more detail so that the model could be more realistic than previous studies. In this study, value iteration algorithms and deterministic methods were used to derive a path that allows the military flight personnel in distress to move to the shortest distance while making the most of the topographical advantages. In addition, it was intended to add the reality of the model by adding actual topographic information and obstacles that the military flight personnel in distress can meet in the process of escape and escape. Through this, it was possible to predict through which route the military flight personnel would escape and escape in the actual situation. The model presented in this study can be applied to various operational situations through redesign of the reward structure. In actual situations, decision support based on scientific techniques that reflect various factors in predicting the escape route of the military flight personnel in distress and conducting combat search and rescue operations will be possible.

A prediction model for adolescents' skipping breakfast using the CART algorithm for decision trees: 7th (2016-2018) Korea National Health and Nutrition Examination Survey (의사결정나무 CART 알고리즘을 이용한 청소년 아침결식 예측 모형: 제7기 (2016-2018년) 국민건강영양조사 자료분석)

  • Sun A Choi;Sung Suk Chung;Jeong Ok Rho
    • Journal of Nutrition and Health
    • /
    • v.56 no.3
    • /
    • pp.300-314
    • /
    • 2023
  • Purpose: This study sought to predict the reasons for skipping breakfast by adolescents aged 13-18 years using the 7th Korea National Health and Nutrition Examination Survey (KNHANES). Methods: The participants included 1,024 adolescents. The data were analyzed using a complex-sample t-test, the Rao Scott χ2-test, and the classification and regression tree (CART) algorithm for decision tree analysis with SPSS v. 27.0. The participants were divided into two groups, one regularly eating breakfast and the other skipping it. Results: A total of 579 and 445 study participants were found to be breakfast consumers and breakfast skippers respectively. Breakfast consumers were significantly younger than those who skipped breakfast. In addition, breakfast consumers had a significantly higher frequency of eating dinner, had been taught about nutrition, and had a lower frequency of eating out. The breakfast skippers did so to lose weight. Children who skipped breakfast consumed less energy, carbohydrates, proteins, fats, fiber, cholesterol, vitamin C, vitamin A, calcium, vitamin B1, vitamin B2, phosphorus, sodium, iron, potassium, and niacin than those who consumed breakfast. The best predictor of skipping breakfast was identifying adolescents who sought to control their weight by not eating meals. Other participants who had low and middle-low household incomes, ate dinner 3-4 times a week, were more than 14.5 years old, and ate out once a day showed a higher frequency of skipping breakfast. Conclusion: Based on these results, nutrition education targeted at losing weight correctly and emphasizing the importance of breakfast, especially for adolescents, is required. Moreover, nutrition educators should consider designing and implementing specific action plans to encourage adolescents to improve their breakfast-eating practices by also eating dinner regularly and reducing eating out.

Introduction and Evaluation of the Production Method for Chlorophyll-a Using Merging of GOCI-II and Polar Orbit Satellite Data (GOCI-II 및 극궤도 위성 자료를 병합한 Chlorophyll-a 산출물 생산방법 소개 및 활용 가능성 평가)

  • Hye-Kyeong Shin;Jae Yeop Kwon;Pyeong Joong Kim;Tae-Ho Kim
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.6_1
    • /
    • pp.1255-1272
    • /
    • 2023
  • Satellite-based chlorophyll-a concentration, produced as a long-term time series, is crucial for global climate change research. The production of data without gaps through the merging of time-synthesized or multi-satellite data is essential. However, studies related to satellite-based chlorophyll-a concentration in the waters around the Korean Peninsula have mainly focused on evaluating seasonal characteristics or proposing algorithms suitable for research areas using a single ocean color sensor. In this study, a merging dataset of remote sensing reflectance from the geostationary sensor GOCI-II and polar-orbiting sensors (MODIS, VIIRS, OLCI) was utilized to achieve high spatial coverage of chlorophyll-a concentration in the waters around the Korean Peninsula. The spatial coverage in the results of this study increased by approximately 30% compared to polar-orbiting sensor data, effectively compensating for gaps caused by clouds. Additionally, we aimed to quantitatively assess accuracy through comparison with global chlorophyll-a composite data provided by Ocean Colour Climate Change Initiative (OC-CCI) and GlobColour, along with in-situ observation data. However, due to the limited number of in-situ observation data, we could not provide statistically significant results. Nevertheless, we observed a tendency for underestimation compared to global data. Furthermore, for the evaluation of practical applications in response to marine disasters such as red tides, we qualitatively compared our results with a case of a red tide in the East Sea in 2013. The results showed similarities to OC-CCI rather than standalone geostationary sensor results. Through this study, we plan to use the generated data for future research in artificial intelligence models for prediction and anomaly utilization. It is anticipated that the results will be beneficial for monitoring chlorophyll-a events in the coastal waters around Korea.