• Title/Summary/Keyword: Learning by making

A study on improving the accuracy of machine learning models through the use of non-financial information in predicting the Closure of operator using electronic payment service (전자결제서비스 이용 사업자 폐업 예측에서 비재무정보 활용을 통한 머신러닝 모델의 정확도 향상에 관한 연구)

  • Hyunjeong Gong;Eugene Hwang;Sunghyuk Park
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.361-381 / 2023
  • Research on corporate bankruptcy prediction has focused on financial information. Because a company's financial information is updated only quarterly, it lacks the timeliness needed to predict the possibility of business closure in real time. Evaluators who want to overcome this limitation need a way to judge the soundness of a target company using information other than financial information. To this end, as information technology has made it easier to collect non-financial information about companies, research has applied additional variables and various methodologies beyond financial information to corporate bankruptcy prediction, and determining whether such information has an effect has become an important research task. In this study, we examined the impact of electronic payment-related information, a form of non-financial information, on predicting the closure of business operators that use an electronic payment service, and examined how closure prediction accuracy differs according to the combination of financial and non-financial information. Specifically, three research models were designed (a financial information model, a non-financial information model, and a combined model), and closure prediction accuracy was measured with six algorithms, including the Multi-Layer Perceptron (MLP). The model combining financial and non-financial information showed the highest prediction accuracy, followed by the non-financial information model and then the financial information model. Among the six algorithms, XGBoost showed the highest prediction accuracy.
An examination of the relative importance of the 87 variables used to predict business closure confirmed that more than 70% of the top 20 variables with a significant impact on the prediction were non-financial. This confirms that electronic payment-related information, a form of non-financial information, is an important variable in predicting business closure, and the possibility of using non-financial information as an alternative to financial information was also examined. Based on this study, the importance of collecting and utilizing non-financial information to predict business closure is recognized, and a plan for using it in corporate decision-making is proposed.
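The "share of non-financial variables among the top 20" figure reported above can be reproduced from any variable-importance ranking. The following is a minimal sketch only: the variable names and importance values are hypothetical and are not taken from the study, and the function name `non_financial_share` is introduced here purely for illustration.

```python
def non_financial_share(importances, non_financial, k=20):
    """Return the fraction of the k most important variables that are non-financial."""
    ranked = sorted(importances, key=importances.get, reverse=True)[:k]
    if not ranked:
        return 0.0
    return sum(1 for name in ranked if name in non_financial) / len(ranked)


# Hypothetical importances (illustrative names, not the study's 87 variables).
importances = {
    "payment_count_30d": 0.30,
    "refund_ratio": 0.25,
    "debt_ratio": 0.20,
    "days_since_last_payment": 0.15,
    "operating_margin": 0.10,
}
non_financial = {"payment_count_30d", "refund_ratio", "days_since_last_payment"}

# With k=4, three of the four top-ranked variables are non-financial.
share = non_financial_share(importances, non_financial, k=4)
```

In practice the importances would come from whichever model is being inspected; the ranking logic stays the same.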

Ensemble of Nested Dichotomies for Activity Recognition Using Accelerometer Data on Smartphone (Ensemble of Nested Dichotomies 기법을 이용한 스마트폰 가속도 센서 데이터 기반의 동작 인지)

  • Ha, Eu Tteum;Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.19 no.4 / pp.123-132 / 2013
  • As smartphones are equipped with various sensors such as the accelerometer, GPS, gravity sensor, gyroscope, ambient light sensor, proximity sensor, and so on, much research has been done on using these sensors to create valuable applications. Human activity recognition is one such application, motivated by various welfare applications such as support for the elderly, measurement of calorie consumption, analysis of lifestyles, analysis of exercise patterns, and so on. One of the challenges faced when using smartphone sensors for activity recognition is that the number of sensors used should be minimized to save battery power. When the number of sensors used is restricted, it is difficult to realize a highly accurate activity recognizer or classifier, because it is hard to distinguish between subtly different activities relying on only limited information. The difficulty becomes especially severe when the number of different activity classes to be distinguished is very large. In this paper, we show that a fairly accurate classifier can be built that distinguishes ten different activities using data from only a single sensor, i.e., the smartphone accelerometer. Our approach to this ten-class problem is to use the ensemble of nested dichotomies (END) method, which transforms a multi-class problem into multiple two-class problems. END builds a committee of binary classifiers in a nested fashion using a binary tree. At the root of the binary tree, the set of all the classes is split into two subsets of classes by a binary classifier. At a child node of the tree, a subset of classes is again split into two smaller subsets by another binary classifier. Continuing in this way, we obtain a binary tree where each leaf node contains a single class. This binary tree can be viewed as a nested dichotomy that can make multi-class predictions.
Depending on how a set of classes is split into two subsets at each node, the final tree that we obtain can be different. Since some classes can be correlated, a particular tree may perform better than the others. However, we can hardly identify the best tree without deep domain knowledge. The END method copes with this problem by building multiple dichotomy trees randomly during learning, and then combining the predictions made by each tree during classification. The END method is generally known to perform well even when the base learner is unable to model complex decision boundaries. As the base classifier at each node of the dichotomy, we have used another ensemble classifier called the random forest. A random forest is built by repeatedly generating a decision tree, each time with a different random subset of features, using a bootstrap sample. By combining bagging with random feature subset selection, a random forest enjoys the advantage of having more diverse ensemble members than simple bagging. As an overall result, our ensemble of nested dichotomies can be seen as a committee of committees of decision trees that can deal with a multi-class problem with high accuracy. The ten classes of activities that we distinguish in this paper are 'Sitting', 'Standing', 'Walking', 'Running', 'Walking Uphill', 'Walking Downhill', 'Running Uphill', 'Running Downhill', 'Falling', and 'Hobbling'. The features used for classifying these activities include not only the magnitude of the acceleration vector at each time point but also the maximum, the minimum, and the standard deviation of the vector magnitude within a time window of the last 2 seconds, etc. For experiments comparing the performance of END with that of other methods, accelerometer data were collected every 0.1 second for 2 minutes for each activity from 5 volunteers.
Among the 5,900 ($= 5 \times (60 \times 2 - 2) / 0.1$) data points collected for each activity (the data for the first 2 seconds are discarded because they lack a full time window), 4,700 were used for training and the rest for testing. Although 'Walking Uphill' is often confused with some other similar activities, END was found to classify all ten activities with a fairly high accuracy of 98.4%. In comparison, the accuracies achieved by a decision tree, a k-nearest neighbor classifier, and a one-versus-rest support vector machine were 97.6%, 96.5%, and 97.6%, respectively.
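Two parts of the abstract above lend themselves to a small illustration: the 2-second window features with the resulting sample count, and the random nested dichotomy trees. The sketch below is not the authors' implementation. It assumes the window covers the 20 readings before the current one, it splits classes with a simple random cut that is not claimed to match the paper's tree-sampling scheme, and it builds only the tree structure without training the random forest at each split. All function names are hypothetical.

```python
import math
import random
import statistics

SAMPLE_PERIOD_S = 0.1                            # one reading every 0.1 s (from the abstract)
WINDOW_S = 2.0                                   # look-back window of 2 s (from the abstract)
WINDOW_LEN = round(WINDOW_S / SAMPLE_PERIOD_S)   # 20 readings per window

ACTIVITIES = [
    "Sitting", "Standing", "Walking", "Running", "Walking Uphill",
    "Walking Downhill", "Running Uphill", "Running Downhill",
    "Falling", "Hobbling",
]


def window_features(samples):
    """Build one feature row per reading that has a full preceding window.

    samples is a list of (ax, ay, az) accelerometer readings.
    """
    mags = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    rows = []
    for i in range(WINDOW_LEN, len(mags)):
        window = mags[i - WINDOW_LEN:i]   # assumption: window excludes the current reading
        rows.append({
            "magnitude": mags[i],
            "max": max(window),
            "min": min(window),
            "std": statistics.pstdev(window),
        })
    return rows


def build_dichotomy(classes, rng):
    """Recursively split a class list into a random binary tree (structure only)."""
    if len(classes) == 1:
        return classes[0]                         # leaf: a single class
    shuffled = classes[:]
    rng.shuffle(shuffled)
    cut = rng.randint(1, len(shuffled) - 1)       # keep both subsets non-empty
    return (build_dichotomy(shuffled[:cut], rng),
            build_dichotomy(shuffled[cut:], rng))


def leaves(tree):
    """List the classes at the leaves of a dichotomy tree."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]


def count_splits(tree):
    """Count internal nodes, i.e. the binary classifiers one tree would need."""
    if isinstance(tree, tuple):
        return 1 + count_splits(tree[0]) + count_splits(tree[1])
    return 0


# One volunteer records 2 minutes (1,200 readings); the first 2 s only fill the window.
recording = [(0.0, 0.0, 1.0)] * round(120 / SAMPLE_PERIOD_S)
rows = window_features(recording)
total_per_activity = 5 * len(rows)               # 5 volunteers per activity

# An ensemble is several random trees over the same ten classes.
ensemble = [build_dichotomy(ACTIVITIES, random.Random(seed)) for seed in range(5)]
```

Each tree over ten classes needs nine binary classifiers, one per internal node. In the approach described above, every one of those would be a random forest, and the trees' predictions would be combined at classification time.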

Measuring the Public Service Quality Using Process Mining: Focusing on N City's Building Licensing Complaint Service (프로세스 마이닝을 이용한 공공서비스의 품질 측정: N시의 건축 인허가 민원 서비스를 중심으로)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.35-52 / 2019
  • As public services are provided in various forms, including e-government, public demand for public service quality is increasing. Although the quality of public services needs to be measured and improved continuously, traditional surveys are costly, time-consuming, and limited. There is therefore a need for an analytical technique that can measure the quality of public services quickly and accurately at any time, based on the data generated by those services. In this study, we analyzed the quality of public services using process mining techniques on data from civil licensing services in N city. N city's building license complaint service was chosen because it can provide the data necessary for analysis and because the approach can be extended to other institutions through public service quality management. This study conducted process mining on a total of 3,678 building license complaint services in N city over two years from January 2014, and identified process maps and the departments with high frequency and long processing time. According to the analysis results, there were cases where complaints were concentrated in a department, or were relatively few, at certain points in time. In addition, there were reasonable grounds to suspect that an increase in the number of complaints increases the time required to complete them. The time required to complete a complaint varied from the same day to a year and 146 days. The cumulative frequency of the top four departments (the Sewage Treatment Division, the Waterworks Division, the Urban Design Division, and the Green Growth Division) exceeded 50%, and the cumulative frequency of the top nine departments exceeded 70%. The load was concentrated in a limited number of departments and was highly unbalanced among departments. Most complaint services follow a variety of different process patterns.
The results show that the number of 'complement' decisions has the greatest impact on how long a complaint takes. This is interpreted to be because a 'complement' decision requires a physical period during which the complainant supplements and resubmits the documents, which lengthens the time until the whole complaint is completed. To address this, the overall processing time could be drastically reduced if complainants prepared thoroughly before filing, or while preparing the complaint, with reference to the 'complement' decisions made on other complaints. Clarifying and disclosing the cause and solution of 'complement' decisions, one of the important data in the system, helps complainants prepare in advance and gives them confidence that documents prepared according to the disclosed information will pass. This makes complaint processing transparent and sufficiently predictable. Documents prepared according to pre-disclosed information are likely to be processed without problems, which not only shortens the processing period but also improves work efficiency by eliminating the need for renegotiation or repeated work from the processor's point of view. The results of this study can be used to find departments with a high burden of civil complaints at certain points in time and to manage workforce allocation between departments flexibly. In addition, the analysis of which departments participate in consultation, by complaint characteristics, can be used for automation or recommendation when requesting a consultation department. Furthermore, the patterns of the complaint process can be found by applying machine learning techniques to the various data generated during the process. Turning these patterns into an algorithm and applying it to the system could support automated, intelligent civil complaint processing.
This study is expected to inform future improvements in public service quality through process mining analysis of civil services.
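The two measurements reported above, the cumulative share of complaints handled by the busiest departments and the time taken to complete each complaint, can be computed directly from an event log. The sketch below assumes a hypothetical log layout of `(complaint_id, department, start_date, end_date)` and uses made-up records. It does not reproduce the study's data or tooling.

```python
from collections import Counter
from datetime import date

# Hypothetical event log: (complaint_id, department, start_date, end_date).
log = [
    ("C1", "Sewage Treatment Division", date(2014, 1, 2), date(2014, 1, 2)),
    ("C1", "Waterworks Division", date(2014, 1, 2), date(2014, 1, 5)),
    ("C2", "Sewage Treatment Division", date(2014, 2, 1), date(2014, 2, 3)),
    ("C2", "Urban Design Division", date(2014, 2, 3), date(2014, 2, 10)),
    ("C3", "Sewage Treatment Division", date(2014, 3, 1), date(2014, 3, 1)),
    ("C3", "Waterworks Division", date(2014, 3, 1), date(2014, 3, 1)),
]


def cumulative_share(events):
    """Rank departments by event count and return their running share of all events."""
    counts = Counter(dept for _, dept, _, _ in events)
    total = sum(counts.values())
    running = 0
    shares = []
    for dept, n in counts.most_common():
        running += n
        shares.append((dept, running / total))
    return shares


def complaint_durations(events):
    """Return days from the earliest start to the latest end of each complaint."""
    spans = {}
    for cid, _, start, end in events:
        first, last = spans.get(cid, (start, end))
        spans[cid] = (min(first, start), max(last, end))
    return {cid: (last - first).days for cid, (first, last) in spans.items()}


shares = cumulative_share(log)
durations = complaint_durations(log)
```

A same-day complaint has a duration of 0 days, which matches the lower end of the range reported in the abstract.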

Development of Market Growth Pattern Map Based on Growth Model and Self-organizing Map Algorithm: Focusing on ICT products (자기조직화 지도를 활용한 성장모형 기반의 시장 성장패턴 지도 구축: ICT제품을 중심으로)

  • Park, Do-Hyung;Chung, Jaekwon;Chung, Yeo Jin;Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.1-23 / 2014
  • Market forecasting aims to estimate the sales volume of a product or service sold to consumers over a specific selling period. From the perspective of the enterprise, accurate market forecasting assists in determining the timing of new product introduction and product design, and in establishing production plans and marketing strategies, enabling a more efficient decision-making process. Moreover, accurate market forecasting enables governments to organize the national budget efficiently. This study aims to generate market growth curves for ICT (information and communication technology) goods using past time series data; categorize products showing similar growth patterns; understand markets in the industry; and forecast the future outlook of such products. It proposes a useful and meaningful process (or methodology) for identifying market growth patterns with a quantitative growth model and a data mining algorithm. The study employs the following methodology. At the first stage, past time series data are collected for the target products or services of the categorized industry. The data, such as the volume of sales and domestic consumption for a specific product or service, are collected from the relevant government ministry, the National Statistical Office, and other relevant government organizations. Data pre-processing should be performed on collected data that cannot be analyzed as-is because of missing past data or altered code names. At the second stage, an optimal model for market forecasting should be selected. This model can vary with the characteristics of each categorized industry. As this study focuses on the ICT industry, where new technologies appear more frequently and change the market structure, the Logistic, Gompertz, and Bass models are selected. A hybrid model that combines different models can also be considered.
The hybrid model considered in this study estimates the size of the market potential with the Logistic and Gompertz models and then uses those figures in the Bass model. The third stage is to evaluate which model most accurately explains the data. To do this, the parameters are estimated from the collected past time series data to generate each model's predicted values and calculate the root mean squared error (RMSE). The model with the lowest average RMSE across all product types is considered the best model. At the fourth stage, based on the parameter values estimated by the best model, a market growth pattern map is constructed with the self-organizing map algorithm. A self-organizing map is trained with the market pattern parameters of all products or services as input data, and the products or services are organized onto an $N \times N$ map. The number of clusters increases from 2 to M, depending on the characteristics of the nodes on the map. The clusters are divided into zones, and the clusters that provide the most meaningful explanation are selected. Based on the final selection of clusters, the boundaries between the nodes are selected and, ultimately, the market growth pattern map is completed. The last step is to determine the final characteristics of the clusters as well as the market growth curve. The average of the market growth pattern parameters in each cluster is taken as a representative figure. Using this figure, a growth curve is drawn for each cluster, and its characteristics are analyzed. Also, taking into consideration the product types in each cluster, their characteristics can be described qualitatively. We expect that the process and system that this paper suggests can be used as a tool for forecasting demand in the ICT and other industries.
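The model-selection stage described above (generate each model's predictions, compute the RMSE, keep the model with the lowest error) can be sketched as follows. The abstract does not state how the curves are parameterized, so the common textbook forms of the Logistic, Gompertz, and Bass cumulative curves are assumed here. The parameter values are hypothetical and fixed by hand rather than estimated from data.

```python
import math


def logistic(t, m, a, b):
    """Logistic cumulative curve with market potential m (assumed textbook form)."""
    return m / (1.0 + a * math.exp(-b * t))


def gompertz(t, m, a, b):
    """Gompertz cumulative curve with market potential m (assumed textbook form)."""
    return m * math.exp(-a * math.exp(-b * t))


def bass(t, m, p, q):
    """Bass cumulative adoption with innovation p and imitation q (assumed textbook form)."""
    e = math.exp(-(p + q) * t)
    return m * (1.0 - e) / (1.0 + (q / p) * e)


def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


periods = range(20)
# Hypothetical "observed" series, generated from a logistic curve for illustration.
observed = [logistic(t, 100.0, 9.0, 0.5) for t in periods]

# Hypothetical fixed parameters; a real analysis would estimate them from the data.
candidates = {
    "logistic": lambda t: logistic(t, 100.0, 9.0, 0.5),
    "gompertz": lambda t: gompertz(t, 100.0, 3.0, 0.3),
    "bass": lambda t: bass(t, 100.0, 0.03, 0.38),
}
scores = {
    name: rmse(observed, [model(t) for t in periods])
    for name, model in candidates.items()
}
best = min(scores, key=scores.get)
```

In the process described above, the estimated parameters of the best model for each product would then serve as the input vectors for the self-organizing map.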

Comparative analysis of RN-BSN Program in Korea and U. S. A. (간호학사 편입학제도의 교과과정 비교분석)

  • Lee Ok-Ja;Kim Hyun-Sil
    • The Journal of Korean Academic Society of Nursing Education / v.3 / pp.99-116 / 1997
  • In response to the increasing demand for professional degrees in nursing, some universities in Korea offer RN-BSN programs for registered nurses (RNs) who hold a nursing diploma. However, the RN-BSN program in Korea is still in its formative period. The purpose of this survey study is therefore a comparative analysis of RN-BSN curricula in Korea and the U.S.A. The subjects consisted of 18 university departments of nursing and 5 RN-BSN programs in Korea, and 18 university departments of nursing and 12 RN-BSN programs in the U.S.A. To earn the degree of Bachelor of Science in Nursing, students earn a mean of 134 credits in the U.S.A. and 150.3 credits in Korea. The mean number of credits for clinical practice is 30.1 in the U.S.A. and 23.9 in Korea. In the U.S.A., students are assigned to individually planned clinical experiences under the direction of a preceptor. In RN-BSN programs, the total mean credits through lecture and clinical practice required to earn the BSN degree is 35.5 (lecture: 27.7, practice: 7.8) in the U.S.A. and 48.1 (lecture: 42.1, practice: 6.0) in Korea. RN-BSN programs can be taken on a full-time or part-time basis in the U.S.A., but not in Korea. In the U.S.A., particular emphasis is placed on the advanced nursing practicum, which focuses on the role of the professional nurse in providing health care to individuals, families, and groups in community settings. A mean of 27.7 credits is earned through lecture in the U.S.A., compared with 42.1 in Korea. This suggests that RN-BSN programs in Korea are less developed in teaching and appraisal methods than those in the U.S.A. RN-BSN students in the U.S.A. can earn credits through CLEP, NLN achievement tests, portfolio review sessions, and so on, as well as through lectures. The authors therefore offer the following recommendations for developing the RN-BSN curriculum in Korea, based on the comparative analysis of RN-BSN curricula in the U.S.A. and Korea. 1. The curriculum of the RN-BSN program in nursing requires some alterations.
Nursing care today is complex and ever changing. In line with changing public needs, the RN-BSN curriculum should strengthen primary care programs in community settings, geriatric nursing, marketing skills, and computer languages. 2. Various new methods of earning credit should be developed. That is, students should be able to earn credits through the transfer of previous nursing college credits, accredited university examinations, advanced placement examinations, portfolio review sessions, case studies, reports, self-directed learning, and so on. 3. Flexible teaching places should be offered. The RN-BSN curriculum should accommodate each RN student's geographical needs and school/work schedule. The university should therefore seek out a variety of teaching places, such as lecture rooms inside health care agencies and branch schools established in students' residential areas, so that RN students can obtain their degrees conveniently. 4. The RN-BSN program should offer distance education to place-bound RN students in many parts of Korea. That is, RN-BSN courses are delivered from the main office of the university to many areas by Internet, EdNet (satellite telecommunication), and other non-traditional methods. 5. To allow RN students to take nursing courses, program length should vary depending on the student's study/work schedule. That is, various term systems, such as semester, three-term, and quarter systems, and student status, such as full time or part time, should be considered. Students can thereby take advantage of the many other educational and professional opportunities available during the school year.

Clinical Investigation of Childhood Epilepsy (소아간질의 임상적 관찰)

  • Moon, Han-Ku;Park, Yong-Hoon
    • Journal of Yeungnam Medical Science / v.2 no.1 / pp.103-111 / 1985
  • Childhood epilepsy, which has high prevalence and incidence rates, is one of the commonest problems encountered by pediatricians. In contrast with adult epilepsy, childhood epilepsy shows more variable and changing manifestations, because age, growth, and development influence its manifestations and course. Moreover, epileptic children have associated problems such as physical and mental handicaps, psychological disorders, and learning disability. For these reasons, pediatricians who deal with epileptic children experience difficulties in diagnosing and managing them. In order to improve understanding and management of childhood epilepsy, the authors retrospectively reviewed 103 cases of epileptic patients seen at the pediatric department of Yeungnam University Hospital. The patients were classified according to the type of epileptic seizure. Suspected causes of epilepsy, associated conditions, age incidence, and brain CT findings were reviewed. A large proportion of the patients (61.2%) developed their first seizures under the age of 5. The most frequent type of epileptic seizure was generalized tonic-clonic, tonic, or clonic seizure (49.5%), followed by simple partial seizure with secondary generalization (17.5%), simple partial seizure (7.8%), atypical absence (5.8%), and unclassified seizure (5.8%). In 83.5% of patients no specific cause could be found, but in 16.5% of cases a history of neonatal hypoxia (4.9%), meningitis (3.9%), prematurity (1.9%), small for gestational age (1.0%), CO poisoning (1.0%), encephalopathy (1.0%), DPT vaccination (1.0%), cerebrovascular accident (1.0%), or neonatal jaundice (1.0%) was found. Thirty patients had associated conditions such as mental retardation, hyperactivity, delayed motor milestones, or combinations of these.
The major abnormal findings of brain CT, performed in 42 cases, were cortical atrophy, cerebral infarction, hydrocephalus, and brain swelling. This review stresses that a better-designed classification of epilepsy is needed and that, with improved medical care, prevention of epilepsy is possible in some cases. It also stresses that childhood epilepsy requires multidisciplinary therapy and that brain CT is helpful in the evaluation of epilepsy, although its usefulness for therapy is limited.

Retail Product Development and Brand Management Collaboration between Industry and University Student Teams (产业与大学学生团队之间的零售产品开发和品牌管理协作)

  • Carroll, Katherine Emma
    • Journal of Global Scholars of Marketing Science / v.20 no.3 / pp.239-248 / 2010
  • This paper describes a collaborative project between academia and industry which focused on improving the marketing and product development strategies for two private label apparel brands of a large regional department store chain in the southeastern United States. The goal of the project was to revitalize product lines of the two brands by incorporating student ideas for new solutions, thereby giving the students practical experience with a real-life industry situation. There were a number of key players involved in the project. A privately-owned department store chain based in the southeastern United States which was seeking an academic partner had recognized a need to update two existing private label brands. They targeted middle-aged consumers looking for casual, moderately priced merchandise. The company was seeking to change direction with both packaging and presentation, and possibly product design. The branding and product development divisions of the company contacted professors in an academic department of a large southeastern state university. Two of the professors agreed that the task would be a good fit for their classes - one was a junior-level Intermediate Brand Management class; the other was a senior-level Fashion Product Development class. The professors felt that by working collaboratively on the project, students would be exposed to a real world scenario, within the security of an academic learning environment. Collaboration within an interdisciplinary team has the advantage of providing experiences and resources beyond the capabilities of a single student and adds "brainpower" to problem-solving processes (Lowman 2000). This goal of improving the capabilities of students directed the instructors in each class to form interdisciplinary teams between the Branding and Product Development classes. 
In addition, many universities are employing industry partnerships in research and teaching, where collaboration within temporal (semester) and physical (classroom/lab) constraints helps to increase students' knowledge and experience of a real-world situation. At the University of Tennessee, the Center of Industrial Services and UT-Knoxville's College of Engineering worked with a company to develop design improvements in its U.S. operations. In this study, because the project dealt with a private label retail brand, Wickett, Gaskill and Damhorst's (1999) revised Retail Apparel Product Development Model was used by the product development and brand management teams. This framework was chosen because it addresses apparel product development from the concept to the retail stage. Two classes were involved in this project: a junior level Brand Management class and a senior level Fashion Product Development class. Seven teams were formed which included four students from Brand Management and two students from Product Development. The classes were taught the same semester, but not at the same time. At the beginning of the semester, each class was introduced to the industry partner and given the problem. Half the teams were assigned to the men's brand and half to the women's brand. The teams were responsible for devising approaches to the problem, formulating a timeline for their work, staying in touch with industry representatives and making sure that each member of the team contributed in a positive way. The objective for the teams was to plan, develop, and present a product line using merchandising processes (following the Wickett, Gaskill and Damhorst model) and develop new branding strategies for the proposed lines.
The teams performed trend, color, fabrication and target market research; developed sketches for a line; edited the sketches and presented their line plans; wrote specifications; fitted prototypes on fit models; and developed final production samples for presentation to industry. The branding students developed a SWOT analysis, a Brand Measurement report, a mind-map for the brands and a fully integrated Marketing Report which was presented alongside the ideas for the new lines. In future, if the opportunity arises to work in this collaborative way with an existing company that wishes to look at both branding and product development strategies, classes will be scheduled at the same time so that students have more time to meet and discuss timelines and assigned tasks. As it was, student groups had to meet outside of each class time and this proved to be a challenging though not uncommon part of teamwork (Pfaff and Huddleston, 2003). Although the logistics of this exercise were time-consuming to set up and administer, professors felt that the benefits to students were multiple. The most important benefit, according to student feedback from both classes, was the opportunity to work with industry professionals, follow their process, and see the results of their work evaluated by the people who made the decisions at the company level. Faculty members were grateful to have a "real-world" case to work with in the classroom to provide focus. Creative ideas and strategies were traded as plans were made, extending and strengthening the departmental links between the branding and product development areas. By working not only with students coming from a different knowledge base, but also having to keep in contact with the industry partner and follow the framework and timeline of industry practice, student teams were challenged to produce excellent and innovative work under new circumstances.
Working on the product development and branding for "real-life" brands that are struggling gave students an opportunity to see how closely their coursework ties in with the real-world and how creativity, collaboration and flexibility are necessary components of both the design and business aspects of company operations. Industry personnel were impressed by (a) the level and depth of knowledge and execution in the student projects, and (b) the creativity of new ideas for the brands.

Multi-Dimensional Analysis Method of Product Reviews for Market Insight (마켓 인사이트를 위한 상품 리뷰의 다차원 분석 방안)

  • Park, Jeong Hyun;Lee, Seo Ho;Lim, Gyu Jin;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.57-78 / 2020
  • With the development of the Internet, consumers can easily check product information through E-Commerce. Product reviews used in the purchasing process are based on user experience, allowing consumers to act as producers of information as well as to consult it. From the consumer's perspective, this can increase the efficiency of purchasing decisions, and from the seller's point of view, it can help develop products and strengthen competitiveness. However, it takes a lot of time and effort to grasp the overall assessment of the products a consumer wants to compare, and their assessment on the dimensions the consumer considers important, by reading the vast number of product reviews offered by E-Commerce. This is because product reviews are unstructured information, and it is difficult to read the sentiment and assessment dimension of reviews at a glance. For example, consumers who want to purchase a laptop would like to check the assessment of the compared products on each dimension, such as performance, weight, delivery, speed, and design. Therefore, in this paper, we propose a method to automatically generate multi-dimensional product assessment scores from the reviews of the products to be compared. The method consists of two phases: a pre-preparation phase and an individual product scoring phase. In the pre-preparation phase, a dimensioned classification model and a sentiment analysis model are created from reviews of the large category product group. By combining word embedding and association analysis, the dimensioned classification model overcomes a limitation of existing studies, in which word embedding methods for finding the relevance between dimensions and words consider only the distance between words in sentences.
The sentiment analysis model is a CNN model trained on learning data tagged as positive or negative at the phrase level for accurate polarity detection. The individual product scoring phase then applies the pre-prepared models to reviews split into phrases. Multi-dimensional assessment scores are obtained by grouping the phrases judged to describe each assessment dimension and aggregating them by dimension according to the proportion of reviews. In the experiment, approximately 260,000 reviews of the large category product group were collected to build the dimensioned classification model and the sentiment analysis model. In addition, reviews of laptops from companies S and L sold through E-Commerce were collected and used as experimental data. The dimensioned classification model classified individual product reviews, broken down into phrases, into six assessment dimensions, and combined the existing word embedding method with an association analysis indicating the frequency between words and dimensions. Combining word embedding and association analysis increased the accuracy of the model by 13.7%. The sentiment analysis model analyzed assessments more closely when trained on phrases rather than sentences, and its accuracy was confirmed to be 29.4% higher than that of the sentence-based model. Through this study, both sellers and consumers can expect more efficient decision making in purchasing and product development, since they can make multi-dimensional comparisons of products. In addition, text reviews, which are unstructured data, were transformed into objective values such as frequency and morpheme, and analyzed together using word embedding and association analysis to make the multi-dimensional analysis more precise and objective.
This makes it an attractive analysis model that not only enables more effective service deployment in an evolving and fiercely competitive E-Commerce market, but also satisfies both sellers and consumers.
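
The aggregation step of the scoring phase can be illustrated with a minimal sketch. It assumes that each review phrase has already been assigned an assessment dimension and a polarity by the two pre-prepared models, and it takes the score of a dimension to be the share of positive phrases among all phrases assigned to it. The function name `dimension_scores`, the label strings, and the 0-to-1 scale are illustrative assumptions, not details given in the abstract.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def dimension_scores(labeled_phrases: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Aggregate phrase-level labels into one score per assessment dimension.

    Each item is (dimension, polarity), where polarity is "positive" or
    "negative". The score is the proportion of positive phrases within the
    dimension (an assumed scoring rule, chosen for illustration).
    """
    counts = defaultdict(lambda: [0, 0])  # dimension -> [positive count, total count]
    for dimension, polarity in labeled_phrases:
        if polarity not in ("positive", "negative"):
            raise ValueError(f"unknown polarity: {polarity!r}")
        counts[dimension][1] += 1
        if polarity == "positive":
            counts[dimension][0] += 1
    return {dim: pos / total for dim, (pos, total) in counts.items()}


if __name__ == "__main__":
    # Hypothetical phrase labels for one laptop.
    phrases = [
        ("performance", "positive"),
        ("performance", "positive"),
        ("performance", "negative"),
        ("weight", "negative"),
        ("design", "positive"),
    ]
    for dim, score in dimension_scores(phrases).items():
        print(f"{dim}: {score:.2f}")
```

Computing the same scores for two products gives a per-dimension comparison of the kind the abstract describes for the laptops of companies S and L.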

A Study on the Application of Outlier Analysis for Fraud Detection: Focused on Transactions of Auction Exception Agricultural Products (부정 탐지를 위한 이상치 분석 활용방안 연구 : 농수산 상장예외품목 거래를 대상으로)

  • Kim, Dongsung;Kim, Kitae;Kim, Jongwoo;Park, Steve
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.3
    • /
    • pp.93-108
    • /
    • 2014
  • To support business decision making, interest in and efforts toward analyzing and using transaction data from different perspectives are increasing. Such efforts are not limited to customer management or marketing; they also extend to monitoring and detecting fraudulent transactions. Fraudulent transactions are evolving into various patterns by taking advantage of information technology. To keep up with this evolution, there have been many efforts to develop fraud detection methods and advanced application systems that improve the accuracy and ease of fraud detection. As a case of fraud detection, this study aims to provide effective fraud detection methods for auction exception agricultural products in the largest Korean agricultural wholesale market. The auction exception products policy exists to complement auction-based trades in the agricultural wholesale market. That is, most trades in agricultural products are performed by auction; however, specific products are designated as auction exception products when their total volumes are relatively small, the number of wholesalers is small, or wholesalers have difficulty purchasing the products. However, the auction exception products policy raises several problems regarding the fairness and transparency of transactions, which calls for fraud detection. In this study, to generate fraud detection rules, a large set of real agricultural product transaction data from the market for 2008 to 2010 is analyzed, amounting to more than 1 million transactions and 1 billion US dollars in transaction volume. Agricultural transaction data has unique characteristics such as frequent changes in supply volumes and turbulent time-dependent changes in price. Since this was the first attempt to identify fraudulent transactions in this domain, there was no training data set for supervised learning. Fraud detection rules are therefore generated using an outlier detection approach. 
We assume that outlier transactions are more likely to be fraudulent than normal transactions. Outlier transactions are identified by comparison with the daily, weekly, and quarterly average unit prices of product items. The quarterly average unit prices of product items for specific wholesalers are also used to identify outlier transactions. The reliability of the generated fraud detection rules was confirmed by domain experts. To determine whether a transaction is fraudulent, the normal distribution and the normalized Z-value concept are applied. That is, the unit price of a transaction is transformed into a Z-value to calculate its occurrence probability, approximating the distribution of unit prices by a normal distribution. A modified Z-value of the unit price is used rather than the original Z-value. The reason is that, for auction exception agricultural products, Z-values are influenced by the outlier fraudulent transactions themselves because the number of wholesalers is small. The modified Z-values are called Self-Eliminated Z-scores because they are calculated excluding the unit price of the specific transaction being checked for fraud. To show the usefulness of the proposed approach, a prototype fraud transaction detection system was developed using Delphi. The system consists of five main menus and related submenus. The first function of the system is to import transaction databases. The next is to set up fraud detection parameters; by changing these parameters, system users can control the number of potential fraudulent transactions. Execution functions provide the fraud detection results found under the chosen parameters. The potential fraudulent transactions can be viewed on screen or exported as files. The study is an initial attempt to identify fraudulent transactions in auction exception agricultural products. 
Many research topics on this issue remain. First, the scope of the analyzed data was limited by data availability. More data on transactions, wholesalers, and producers need to be included to detect fraudulent transactions more accurately. Next, the scope of fraud transaction detection needs to be extended to fishery products. There are also many possibilities for applying different data mining techniques to fraud detection. For example, a time series approach is a potential technique for this problem. Finally, although outlier transactions are currently detected based on the unit prices of transactions, it is also possible to derive fraud detection rules based on transaction volumes.
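
The idea behind the Self-Eliminated Z-score can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' Delphi implementation: the function names, the use of the sample standard deviation, and the cut-off of 3.0 are all hypothetical choices. For each transaction, the mean and standard deviation are computed from the other unit prices in the comparison group, so that a suspicious price cannot mask itself by inflating the statistics it is measured against.

```python
import statistics
from typing import List, Optional


def self_eliminated_z_scores(unit_prices: List[float]) -> List[Optional[float]]:
    """Z-score of each unit price against the mean and standard deviation of
    all the *other* prices in the group (the price itself is left out).

    Returns None for a price when the remaining prices have zero spread.
    """
    if len(unit_prices) < 3:
        raise ValueError("need at least three unit prices")
    scores: List[Optional[float]] = []
    for i, price in enumerate(unit_prices):
        others = unit_prices[:i] + unit_prices[i + 1:]
        mean = statistics.mean(others)
        std = statistics.stdev(others)  # sample standard deviation (assumed)
        scores.append(None if std == 0 else (price - mean) / std)
    return scores


def flag_outliers(unit_prices: List[float], threshold: float = 3.0) -> List[int]:
    """Indices whose absolute self-eliminated Z-score exceeds the threshold.

    The threshold stands in for the adjustable detection parameter; 3.0 is
    an assumed default, not a value taken from the study.
    """
    scores = self_eliminated_z_scores(unit_prices)
    return [i for i, z in enumerate(scores) if z is not None and abs(z) > threshold]


if __name__ == "__main__":
    # Hypothetical unit prices for one product item on one day.
    prices = [100.0, 102.0, 98.0, 101.0, 99.0, 300.0]
    print(flag_outliers(prices))
```

In this hypothetical group of six prices, an ordinary sample Z-score for the price of 300 is only about 2, because that price inflates both the mean and the standard deviation. Excluding it from its own reference statistics makes it stand out clearly, which is the effect the abstract attributes to groups with a small number of wholesalers.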

A Study on The 'Kao Zheng Pai'(考證派) of The Traditional Medicine of Japan (일본 '고증파(考證派)' 의학에 관한 연구)

  • Park, Hyun-Kuk;Kim, Ki-Wook
    • Journal of Korean Medical classics
    • /
    • v.20 no.4
    • /
    • pp.211-250
    • /
    • 2007
  • 1. The 'Kao Zheng Pai'(考證派) comes from the 'Zhe Zhong Pai' and is a school that was influenced by the Confucianism of the Qing dynasty. In Japan, Inoue Kinga(井上金娥) and Yoshida Koton(吉田篁墩) became central members, and the rise of the methodology of historical research(考證學) influenced the members of the 'Zhe Zhong Pai', and the trend of historical research changed from Confucianism to medicine, making a school of medicine based on the study of texts and proving that the classics were right. 2. Based on the function of 'Nei Qu Li'(內驅力) the 'Kao Zheng Pai', in the spirit of 'use Confucianism as the base', researched letters, meanings and historical origins. Because they were influenced by the methodology of historical research(考證學) of the Qing era, they valued the evidential research of classic texts, and there was even one branch that did only historical research, the 'Ru Xue Kao Zheng Pai'(儒學考證派). Also, the 'Yi Xue Kao Zheng Pai'(醫學考證派) appeared under the influence of Yoshida Kouton and Kariya Ekisai(狩谷掖齋). 3. In the theories and views of the 'Kao Zheng Pai'(考證派), the 'Yi Xue Kao Zheng Pai' did not look at medical scriptures like the "Huang Di Nei Jing"("黃帝內經") and did not do research on 'medical' related areas like acupuncture, the meridians and medicinal herbs. Since they were doctors that used medicine, they were naturally based on 'formulas'(方劑), and since their thoughts were based on the historical ideologies, they valued the "Shang Han Za Bing Lun", which was revered as the 'ancestor of all formulas'(衆方之祖). 4. The lives of the important doctors of the 'Kao Zheng Pai', Meguro Dotaku(目黑道琢), Yamada Seichin(山田正珍), Yamada Kyoko(山田業廣), Mori Ritsi(森立之) and Kitamura Naohara(喜多村直寬), are as follows. 1) Meguro Dotaku(目黑道琢 1739-1798) was born of lowly descent but, using his intelligence and knowledge, became a professor as a Shi Jing Yi(市井醫) and, as a professor for 34 years at Ji Shou Guan, mastered the "Huang Di Nei Jing" after giving over 300 lectures. 
Since his pupil, Isawara Ken taught the Lan Men Wu Zhe(蘭門五哲) and Shibue Chusai, Mori Ritsi(森立之), Okanishi Gentei(岡西玄亭), Kiyokawa Gendoh(淸川玄道) and Yamada Kyoko(山田業廣), Meguro Dotaku is considered the founder of the 'Yi Xue Kao Zheng Pai'. 2) The family of Yamada Seichin(山田正珍 1749-1787) had been medical officials in the Makufu(幕府) and the many books that his ancestors had left were the base of his art. Seichin learned from Shan Ben Bei Shan(山本北山), a 'Zhe Zhong Pai' scholar, and put his efforts into learning, teaching and researching the "Shang Han Lun"("傷寒論"). Living in a time between 'Gu Fang Pai'(古方派) member Nakanishi Goretada(中西惟忠) and 'Kao Zheng Pai' member Taki Motohiro(多紀元簡), he wrote 11 books, 2 of which express his thoughts and research clearly, the "Shang Han Lun Ji Cheng"("傷寒論集成") and "Shang Han Kao"("傷寒考"). His comparison of the 'six meridians'(3 yin, 3 yang) between the "Shang Han Lun" and the "Su Wen Re Lun"("素問 熱論") and his acknowledgement of the need and rationality of the concept of Yin-Yang and Deficient-Replete distinguish him from the other 'Gu Fang Pai'. Also, his argument for the need for the concept doesn't use the theories of later schools but uses the theory of the "Shang Han Lun" itself. He even researched the historical parts, such as terms like 'Shen Nong Chang Bai Cao'(神農嘗百草) and 'Cheng Qi Tang'(承氣湯). 3) The ancestor of Yamada Kyoko(山田業廣) was a court physician; Kyoko learned Confucianism from Kao Zheng Pai's Ashikawa Genan(朝川善庵) and medicine from Isawa Ranken and Taki Motokata(多紀元堅), and the secret to smallpox from Ikeda Keisui(池田京水). He later became a lecturer at the Edo Yi Xue Guan(醫學館) and was invited as the director to the Ji Zhong(濟衆) hospital. He also became the first owner of the Wen Zhi She(溫知社), whose main purpose was the revival of kampo, and launched the monthly magazine Wen Zhi Yi Tan(溫知醫談). He also diagnosed and prescribed for the prince Ming Gong(明宮). 
His works include the "Jing Fang Bian"("經方辨"), "Shang Han Lun Si Ci"("傷寒論釋司"), "Huang Zhao Zhu Jia Zhi Yan Ji Yao"("皇朝諸家治驗集要") and "Shang Han Za Bing Lun Lei Juan"("傷寒雜病論類纂"). Of these, the "Jing Fang Bian"("經方辨") states that the Shi Gao(石膏) used in the "Shang Han Lun" had three meanings-Fa Biao(發表), Qing Re(淸熱), Zi Yin(滋陰)-which were from 'symptoms', and first deduced the effects and then told of the reason. Another book, the "Jiu Zhe Tang Du Shu Ji"("九折堂讀書記"), researched and translated the difficult parts of the "Shang Han Lun", "Jin Qui Yao Lue", "Qian Jin Fang"("千金方"), and "Wai Tai Mi Yao"("外臺秘要"). He usually analyzed the 'symptoms' of diseases but the composition, measurement, processing and application of medicine were all in the spectrum of 'analytic research' and 'researching analysis'. 4) The ancestors of Mori Rits(森立之 1807-1885) were warriors but he became a doctor by the will of his mother, and he learned from Shibue Chosai(澁江抽齋) and Isawaran Ken and later became a pupil of Shou Gu Yi Zhai, a historical research scholar. He then became a lecturer of medical herbs at the Yi Xue Guan, and later participated in the proofreading of "Yi Xin Fang"("醫心方") and with Chosai compiled the "Jing Ji Fang Gu Zhi"("經籍訪古志"). He visited the Chinese scholar Yang Shou Jing(楊守敬) in 1881 and exchanged books and ideas. Of his works, there are the collections(輯複本) of "Shen Nong Ben Cao Jing"("神農本草經") and "You Xiang Yi Hwa"("遊相醫話") and the records, notes, poems, and diaries such as "Zhi Yuan Man Lu"("枳園漫錄") and "Zhi Yuan Sui Bi"("枳園隨筆") that were not published. His thoughts were that in restoring the "Shen Nong Ben Cao Jing", "the herb to the doctor is like the "Shuo Wen Jie Zi"("說文解字") to the scholar", and he tried to restore the ancient herbal text using knowledge of medicine and investigation(考據). Also with Chosai he compiled the "Jing Ji Fang Gu Zhi"("經籍訪古志") using knowledge of ancient text. 
Ritzi left works on pure investigation, paid much attention to social problems, and through 12 years of poverty treated all people and animals in all branches of medicine, so he is called a 'half Confucianist half doctor'(半儒半醫). 5) Kitamurana Ohira(喜多村直寬 1804-1876) learned scriptures and ancient texts from the Confucian scholar Asaka Gonsai, and learned medicine from his father Huai Yuan(槐園). He became a teacher in the Yi Xue Guan in middle age, and to repay his country, he printed 266 volumes of "Yi Fang Lei Ju"("醫方類聚") and 1000 volumes of "Tai Ping Yu Lan"("太平禦覽") and dedicated them to his country for circulation. His works run to about 40 volumes, including "Jin Qui Yao Lue Shu Yi" and "Lao Yi Zhi Yan", but most of them are studies of the "Shang Han Za Bing Lun". In his "Shang Han Lun Shu Yi"("傷寒論疏義") he shows the concept of the six meridians through the Yin-Yang, superficial or internal, cold or hot, deficient or replete state of diseases, but did not match the names with the six meridians of the meridian theory, and this has something in common with the research based on the Confucianism of Song(宋儒). In clinical treatment he was positive toward old and new methods and also the experience of civilians, but was negative toward western medicine. 6) The ancestor of the Taki family Tanbano Yasuyori(丹波康賴 912-955) became a Yi Bo Shi(醫博士) by his medical skills and compiled the "Yi Xin Fang"("醫心方"). His first son Tanbano Shigeaki(丹波重明) inherited the Shi Yao Yuan(施藥院) and the third son Tanbano Masatada(丹波雅忠) inherited the Dian You Tou(典藥頭). Masatada's descendants succeeded him for 25 generations until the family name was changed to Jin Bao(金保) and five generations later it was changed again to Duo Ji(多紀). The research scholar Taki Motohiro was in the third generation after the last name was changed to Taki, and his family kept an important part in the line of medical officers in Japan. 
Taki Motohiro(多紀元簡 1755-1810) was a teacher in the Yi Xue Guan where his father was residing, and became the physician for the general Jia Qi(家齊). He had a short temper and was not good at getting on in the world, and went against the will of the king and was banished from Ao Yi Shi(奧醫師). His most famous works, the "Shang Han Lun Ji Yi" and "Jin Qui Yao Lue Ji Yi", are the product of 20 years spent collecting and discussing the theories of many schools, and are among the most famous books on the "Shang Han Lun" in Japan. "Yi Sheng" is a collection of essays on research. Also there are the "Su Wen Shi"("素問識"), "Ling Shu Shi"("靈樞識"), and the "Guan lu Fang Yao Bu"("觀聚方要補"). Taki Motohiro(多紀元簡)'s position was succeeded by his third son Yuan Yin(元胤 1789-1827), whose works include works of research such as "Nan Jing Shu Jeng"("難經疏證"), "Ti Ya"("體雅"), "Yao Ya"("藥雅"), "Ji Ya"("疾雅"), "Ming Yi Gong An"("名醫公案"), and "Yi Ji Kao"("醫籍考"). The "Yi Ji Kao" is 80 volumes in length and lists about 3000 books on medicine in China before the Qing Dao Guang(道光), and under each title are the origin, number of volumes, state of existence, and, if possible, the preface, Ba Yu(跋語) and biography of the author. The younger sibling of Yuan Yin(元胤 1789-1827), Yuan Jian(元堅 1795-1857), expounded ancient writings at the Yi Xue Guan only after he reached middle age, was chosen for the Ao Yi Shi(奧醫師) and later became a Fa Yan(法眼), Fa Yin(法印) and Yu Chi(樂匙). He left about 15 texts, including "Su Wen Shao Shi"("素問紹識"), "Yi Xin Fang"("醫心方"), published in school, "Za Bing Guang Yao"("雜病廣要"), "Shang Han Guang Yao"("傷寒廣要"), and "Zhen Fu Yao Jue"("該腹要訣"). On the Taki family's founding and working of the Yi Xue Guan, Yasuka Doumei(失數道明) said they were "the people who took the initiative in Edo era kampo medicine" and evaluated their deeds in the fields of 'research of ancient text', 'the founding of Ji Shou Guan and medical education', 'publication business', and 'writing of medical text'. 5. 
The doctors of the 'Kao Zheng Pai' based their operations on the Edo Yi Xue Guan and formed groups with like-minded people, building a 'net' of relationships. For example, the three families of Duo Ji(多紀), Tang Chuan(湯川) and Xi Duo Cun(喜多村) intermarried, adopted from one another, and wrote prefaces and epitaphs for each other. Thus, the Taki family, the state science of the Makufu, the tendency of thinking, one's own interests and glory, one's own knowledge, and the needs of society all played a role in the development of kampo medicine in the 18th and 19th centuries.
