Search | Korea Science

The Effect of Meta-Features of Multiclass Datasets on the Performance of Classification Algorithms (다중 클래스 데이터셋의 메타특징이 판별 알고리즘의 성능에 미치는 영향 연구)

Kim, Jeonghun;Kim, Min Yong;Kwon, Ohbyung
- Journal of Intelligence and Information Systems / v.26 no.1 / pp.23-45 / 2020
Big data is being generated in a wide variety of fields such as medical care, manufacturing, logistics, retail, and SNS, and the characteristics of the resulting datasets are correspondingly diverse. To stay competitive, companies need to improve their decision-making capacity using classification algorithms. However, most of them lack sufficient knowledge of which classification algorithm is appropriate for a specific problem area. In other words, determining which classification algorithm suits the characteristics of a dataset has been a task that requires expertise and effort. This is because the relationship between the characteristics of datasets (called meta-features) and the performance of classification algorithms has not been fully understood. Moreover, there has been little research on meta-features that reflect the characteristics of multi-class data. Therefore, the purpose of this study is to empirically analyze whether meta-features of multi-class datasets have a significant effect on the performance of classification algorithms. In this study, meta-features of multi-class datasets were grouped into two factors (data structure and data complexity), and seven representative meta-features were selected. Among those, we included the Herfindahl-Hirschman Index (HHI), originally a measure of market concentration, to replace the Imbalanced Ratio (IR). We also developed a new index called the Reverse ReLU Silhouette Score and added it to the meta-feature set. From the UCI Machine Learning Repository, six representative datasets (Balance Scale, PageBlocks, Car Evaluation, User Knowledge-Modeling, Wine Quality (red), Contraceptive Method Choice) were selected. Each dataset was classified using the classification algorithms selected in the study (KNN, Logistic Regression, Naïve Bayes, Random Forest, and SVM). For each dataset, we applied 10-fold cross validation.
Oversampling of 10% to 100% was applied to each fold, and the meta-features of the dataset were measured. The meta-features selected are HHI, Number of Classes, Number of Features, Entropy, Reverse ReLU Silhouette Score, Nonlinearity of Linear Classifier, and Hub Score. F1-score was selected as the dependent variable. The results showed that six meta-features, including the Reverse ReLU Silhouette Score and HHI proposed in this study, have a significant effect on classification performance. (1) The HHI meta-feature proposed in this study was significant for classification performance. (2) The number of variables has a significant effect on classification performance and, unlike the number of classes, its effect is positive. (3) The number of classes has a negative effect on classification performance. (4) Entropy has a significant effect on classification performance. (5) The Reverse ReLU Silhouette Score also significantly affects classification performance at a significance level of 0.01. (6) The nonlinearity of linear classifiers has a significant negative effect on classification performance. In addition, the results of the analysis by classification algorithm were consistent. In the regression analysis by classification algorithm, the number of variables had no significant effect for the Naïve Bayes algorithm, unlike the other classification algorithms. This study makes two theoretical contributions: (1) two new meta-features (HHI and Reverse ReLU Silhouette Score) were shown to be significant, and (2) the effects of data characteristics on classification performance were investigated using meta-features. There are also two practical contributions. (1) The results can be utilized in developing a system that recommends classification algorithms according to the characteristics of datasets.
(2) Because data characteristics differ, many data scientists search for the best algorithm for a given situation by repeatedly testing and adjusting algorithm parameters. This process wastes considerable resources in hardware, cost, time, and manpower. This study is expected to be useful for machine learning and data mining researchers, practitioners, and developers of machine learning-based systems. The paper consists of an introduction, related research, research model, experiment, and conclusion and discussion.
https://doi.org/10.13088/jiis.2020.26.1.023
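To make the class-distribution meta-features in the abstract above concrete, the following is a minimal sketch of HHI and entropy computed over class labels. It assumes HHI is the sum of squared class shares (the standard market-concentration definition, with shares as fractions) and that entropy is Shannon entropy in bits; the abstract does not give the paper's exact formulas, and the function names and toy label lists are hypothetical.

```python
from collections import Counter
from math import log2


def class_shares(labels):
    """Return the fraction of samples that falls in each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def hhi(labels):
    """Herfindahl-Hirschman Index: sum of squared class shares.

    Assumption: shares are fractions, so the value ranges from 1/k
    (k perfectly balanced classes) up to 1.0 (a single class).
    """
    return sum(share * share for share in class_shares(labels))


def class_entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    return -sum(share * log2(share) for share in class_shares(labels) if share > 0)


# Hypothetical four-class label sets: one balanced, one heavily skewed.
balanced = ["a", "b", "c", "d"] * 25
skewed = ["a"] * 85 + ["b"] * 5 + ["c"] * 5 + ["d"] * 5

if __name__ == "__main__":
    for name, labels in (("balanced", balanced), ("skewed", skewed)):
        print(name, round(hhi(labels), 4), round(class_entropy(labels), 4))
```

Under these assumptions a more concentrated (imbalanced) class distribution yields a higher HHI and a lower entropy, which is why the two can serve as complementary descriptors of a multi-class dataset.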

Predicting the Performance of Recommender Systems through Social Network Analysis and Artificial Neural Network (사회연결망분석과 인공신경망을 이용한 추천시스템 성능 예측)

Cho, Yoon-Ho;Kim, In-Hwan
- Journal of Intelligence and Information Systems / v.16 no.4 / pp.159-172 / 2010
The recommender system is one of the possible solutions to assist customers in finding the items they would like to purchase. To date, a variety of recommendation techniques have been developed. One of the most successful is Collaborative Filtering (CF), which has been used in a number of different applications such as recommending Web pages, movies, music, articles, and products. CF identifies customers whose tastes are similar to those of a given customer and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems. Broadly, there are memory-based CF algorithms, model-based CF algorithms, and hybrid CF algorithms that combine CF with content-based techniques or other recommender systems. While many researchers have focused their efforts on improving CF performance, the theoretical justification of CF algorithms is lacking. That is, little is known about how CF works. Furthermore, the relative performance of CF algorithms is known to be domain and data dependent. Implementing and launching a CF recommender system is time-consuming and expensive, and a system unsuited to the given domain gives customers poor-quality recommendations that easily annoy them. Therefore, predicting the performance of CF algorithms in advance is of practical importance. In this study, we propose an efficient approach to predicting the performance of CF. Social Network Analysis (SNA) and an Artificial Neural Network (ANN) are applied to develop our prediction model. CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. SNA facilitates an exploration of the topological properties of the network structure that are implicit in the data used for CF recommendations.
An ANN model is developed through an analysis of network topology measures: network density, inclusiveness, clustering coefficient, network centralization, and Krackhardt's efficiency. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Inclusiveness refers to the number of nodes that are included within the various connected parts of the social network. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. Krackhardt's efficiency characterizes how dense the social network is beyond what is barely needed to keep the social group even indirectly connected. We use these social network measures as input variables of the ANN model. As the output variable, we use recommendation accuracy measured by the F1-measure. To evaluate the effectiveness of the ANN model, we used sales transaction data from H department store, one of the well-known department stores in Korea. A total of 396 experimental samples were gathered, of which 40%, 40%, and 20% were used for training, test, and validation, respectively. 5-fold cross validation was also conducted to enhance the reliability of our experiments. The process of measuring the input variables consists of the following three steps: analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used Net Miner 3 and UCINET 6.0 for SNA, and Clementine 11.1 for ANN modeling. In the experiments, the ANN model showed an estimated accuracy of 92.61% and an RMSE of 0.0049. Thus, our prediction model helps decide whether CF is useful for a given application with certain data characteristics.
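Three of the network measures described in the abstract above can be sketched in a few lines. This is an illustration only: the study itself used Net Miner and UCINET, the function names and the toy customer graph are hypothetical, inclusiveness is assumed to be the share of non-isolated nodes, and the clustering coefficient is assumed to be the average local clustering coefficient of an undirected graph.

```python
from itertools import combinations


def build_adjacency(nodes, edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def density(adjacency):
    """Links present as a proportion of the maximum possible number of links."""
    n = len(adjacency)
    links = sum(len(neighbours) for neighbours in adjacency.values()) / 2
    return links / (n * (n - 1) / 2)


def inclusiveness(adjacency):
    """Share of nodes that have at least one link (assumed definition)."""
    connected = sum(1 for neighbours in adjacency.values() if neighbours)
    return connected / len(adjacency)


def average_clustering(adjacency):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbours count as 0."""
    total = 0.0
    for neighbours in adjacency.values():
        k = len(neighbours)
        if k < 2:
            continue
        # Count neighbour pairs that are themselves linked.
        closed = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
        total += closed / (k * (k - 1) / 2)
    return total / len(adjacency)


# Hypothetical customer network: A, B, C form a triangle, D hangs off C, E is isolated.
customers = ["A", "B", "C", "D", "E"]
links = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
graph = build_adjacency(customers, links)

if __name__ == "__main__":
    print(density(graph), inclusiveness(graph), average_clustering(graph))
```

In the approach the abstract describes, values like these would be computed per experimental sample and fed to the ANN as input variables, with the F1-measure of the CF recommendations as the target.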

Hardware Approach to Fuzzy Inference―ASIC and RISC―

Watanabe, Hiroyuki
- Proceedings of the Korean Institute of Intelligent Systems Conference / 1993.06a / pp.975-976 / 1993
This talk presents an overview of the author's research and development activities on fuzzy inference hardware. We have pursued two distinct approaches. The first approach is to use application specific integrated circuit (ASIC) technology: the fuzzy inference method is implemented directly in silicon. The second approach, which is in its preliminary stage, is to use a more conventional microprocessor architecture. Here, we use a quantitative technique used by designers of reduced instruction set computers (RISC) to modify the architecture of a microprocessor. In the ASIC approach, we implemented the most widely used fuzzy inference mechanism directly in silicon. The mechanism is based on a max-min compositional rule of inference and Mamdani's method of fuzzy implication. Two VLSI fuzzy inference chips were designed, fabricated, and fully tested. Both used full-custom CMOS technology. The second and more elaborate chip was designed at the University of North Carolina (UNC) in cooperation with MCNC. Both VLSI chips had multiple datapaths for rule evaluation, and they executed multiple fuzzy if-then rules in parallel. The AT&T chip is the first digital fuzzy inference chip in the world. It ran with a 20 MHz clock and achieved approximately 80,000 Fuzzy Logical Inferences Per Second (FLIPS). It stored and executed 16 fuzzy if-then rules. Since it was designed as a proof-of-concept prototype chip, it had a minimal amount of peripheral logic for system integration. The UNC/MCNC chip consists of 688,131 transistors, of which 476,160 are used for RAM. It ran with a 10 MHz clock. The chip has a 3-stage pipeline and initiates the computation of a new inference every 64 cycles. This chip achieved approximately 160,000 FLIPS. The new architecture has the following important improvements over the AT&T chip: programmable rule set memory (RAM); on-chip fuzzification by a table lookup method; on-chip defuzzification by a centroid method; a reconfigurable architecture for processing two rule formats; and RAM/datapath redundancy for higher yield.
It can store and execute 51 if-then rules of the following format: IF A and B and C and D Then Do E, and Then Do F. With this format, the chip takes four inputs and produces two outputs. By software reconfiguration, it can store and execute 102 if-then rules of the following simpler format using the same datapath: IF A and B Then Do E. With this format, the chip takes two inputs and produces one output. We have built two VME-bus board systems based on this chip for Oak Ridge National Laboratory (ORNL). The board is now installed in a robot at ORNL, where researchers use it for experiments in autonomous robot navigation. The Fuzzy Logic system board places the Fuzzy chip into a VMEbus environment. High-level C language functions hide the operational details of the board from the applications programmer. The programmer treats rule memories and fuzzification function memories as local structures passed as parameters to the C functions. ASIC fuzzy inference hardware is extremely fast, but it is limited in generality. Many aspects of the design are limited or fixed. We have proposed designing a fuzzy information processor as an application specific processor using a quantitative approach. The quantitative approach was developed by RISC designers. In effect, we are interested in evaluating the effectiveness of a specialized RISC processor for fuzzy information processing. As the first step, we measured the possible speed-up of a fuzzy inference program based on if-then rules from the introduction of specialized instructions, i.e., min and max instructions. The minimum and maximum operations are heavily used in fuzzy logic applications as fuzzy intersection and union.
We performed measurements using a MIPS R3000 as the base microprocessor. The initial result is encouraging: we could achieve as much as a 2.5-fold increase in inference speed if the R3000 had min and max instructions. These instructions are also useful for speeding up other fuzzy operations such as bounded product and bounded sum. An embedded processor's main task is to control some device or process. It usually runs a single program or only a few, so modifying an embedded processor to create an embedded processor for fuzzy control is very effective. Table I shows the measured speed of inference by a MIPS R3000 microprocessor, a fictitious MIPS R3000 microprocessor with min and max instructions, and the UNC/MCNC ASIC fuzzy inference chip. The software run on the microprocessors is a simulator of the ASIC chip. The first row is the computation time in seconds of 6000 inferences using 51 rules, where each fuzzy set is represented by an array of 64 elements. The second row is the time required to perform a single inference. The last row is the fuzzy logical inferences per second (FLIPS) measured for each device. There is a large gap in run time between the ASIC and software approaches, even if we resort to a specialized fuzzy microprocessor. As for design time and cost, these two approaches represent two extremes. An ASIC approach is extremely expensive. It is, therefore, an important research topic to design a specialized computing architecture for fuzzy applications that falls between these two extremes in both run time and design time/cost.

TABLE I. INFERENCE TIME BY 51 RULES

Time             MIPS R3000 (regular)   MIPS R3000 (with min/max)   ASIC
6000 inferences  125 s                  49 s                        0.0038 s
1 inference      20.8 ms                8.2 ms                      6.4 µs
FLIPS            48                     122                         156,250
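The inference mechanism named in the abstract above (max-min composition with Mamdani implication, table-lookup fuzzification, centroid defuzzification) can be sketched in software roughly as follows. This is a simplified illustration, not the chip's datapath or the author's simulator: the triangular membership shapes, the two-rule rule base, and all function names are hypothetical. Only the 64-element representation of each fuzzy set comes from the abstract.

```python
N = 64  # each fuzzy set is an array of 64 membership grades


def triangle(center, width):
    """Hypothetical triangular membership function sampled at N points."""
    return [max(0.0, 1.0 - abs(i - center) / width) for i in range(N)]


def infer(rules, inputs):
    """Max-min inference with Mamdani implication.

    rules  -- list of (antecedent_sets, consequent_set)
    inputs -- crisp input indices, one per antecedent
    """
    aggregated = [0.0] * N
    for antecedents, consequent in rules:
        # Fuzzification by table lookup, then fuzzy AND (min) across antecedents.
        strength = min(fuzzy_set[x] for fuzzy_set, x in zip(antecedents, inputs))
        for y in range(N):
            clipped = min(strength, consequent[y])       # Mamdani implication
            aggregated[y] = max(aggregated[y], clipped)  # max aggregation over rules
    return aggregated


def centroid(fuzzy_set):
    """Centroid defuzzification; returns None if no rule fired."""
    area = sum(fuzzy_set)
    if area == 0:
        return None
    return sum(i * grade for i, grade in enumerate(fuzzy_set)) / area


# Hypothetical rule base in the simpler "IF A and B Then Do E" format.
low, high = triangle(16, 16), triangle(48, 16)
rules = [((low, low), low), ((high, high), high)]

if __name__ == "__main__":
    print(centroid(infer(rules, (16, 16))))
    print(centroid(infer(rules, (48, 48))))
```

The inner loops consist almost entirely of `min` and `max` operations, which is consistent with the abstract's observation that dedicated min and max instructions would speed up a software implementation.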

A Study on the Emotional Happiness of Human (인간의 감성적 행복감에 관한 연구)

Jeong, Cheol-Yeong
- Journal of Korea Entertainment Industry Association / v.13 no.6 / pp.211-220 / 2019
It helps us wisely avoid the errors of a priori subjective emotions and orders our emotions so that we can make rational choices. Human emotional happiness and moral sensitivity act directly or indirectly on the rational choices made through thought and reason. Abraham must have been troubled by the divine command to sacrifice his only son, a son who had been healed. Was his reasoning sound at that moment? By rational reasoning, the act of offering his son can be called appropriate, but is it possible for the human heart? Aristotle also called human virtue the virtue that is good for human beings. Because happiness is also an activity of the mind, we need to know something about the mind. This ψυχή (psyche, spirit) is an irrational element that is invisible yet intervenes in rational principles. C. G. Jung also states that all human beings have four dynamic psychological functions that are not visible, and that the mind is driven by these four functional dimensions. This means that the elements of Sensing (S), Intuition (N), Thinking (T), and Feeling (F) are combined. David Hume also emphasized the principle of empathy, asserting that morality cannot be derived from reason. Max Ferdinand Scheler held that before we grasp the visible characteristics of a person we have already captured an overall feeling of that person, that the worth given to this feeling is value, and that the function of emotion, which grasps value and directs it toward the perceived object, always precedes reason. Emmanuel Levinas states that the emotion of love is ahead of reason and that emotions precede human reasoning, and that, because emotions cannot control themselves, we need rational thought and rational, wise action as the reason that provides control and temperance.
As part of human emotional education, the 7th curriculum draws on Bloom's cognitive, perceptive, and behavioral domains and aims to form moral practitioners who are capable of integrated thinking. It focuses on how to act in line with the emotions that lead to virtuous acts and how to cultivate emotions that take the place of those leading to vicious acts. The possibility and direction of cultivating human emotions, emotional happiness, and a sensitivity to happiness can be designed on two principles: strengthening virtue and eliminating ill feeling.
https://doi.org/10.21184/jkeia.2019.8.13.6.211

Application of MicroPACS Using the Open Source (Open Source를 이용한 MicroPACS의 구성과 활용)

You, Yeon-Wook;Kim, Yong-Keun;Kim, Yeong-Seok;Won, Woo-Jae;Kim, Tae-Sung;Kim, Seok-Ki
- The Korean Journal of Nuclear Medicine Technology / v.13 no.1 / pp.51-56 / 2009
Purpose: Most hospitals have recently been introducing PACS, and use of the system continues to expand. Meanwhile, small-scale PACS, called MicroPACS, are already in use through open source programs. The aim of this study is to demonstrate the utility of operating a MicroPACS, in addition to the full PACS already in use, as a substitute for conventional back-up storage media such as CDs and DVDs. This study describes how to set up a MicroPACS with open source programs and assesses its storage capability, stability, compatibility, and the performance of operations such as "retrieve" and "query". Materials and Methods: 1. To start with, we searched for open source software that met the following criteria for establishing a MicroPACS: (1) It must run on the Windows operating system. (2) It must be freeware. (3) It must be compatible with the PET/CT scanner. (4) It must be easy to use. (5) It must not limit storage capacity. (6) It must support DICOM. 2. (1) To evaluate data storage, we compared the time spent backing up data with the open source software and with optical discs (CDs and DVD-RAMs), and we also compared the time needed to retrieve data from the system and from optical discs. (2) To estimate work efficiency, we measured the time spent finding data on CDs, on DVD-RAMs, and in the MicroPACS. Seven technologists participated in this study. 3. To evaluate the stability of the software, we examined whether any data was lost while the system was maintained for a year. As a comparison, we counted how many errors occurred in 500 randomly selected CDs. Results: 1. We chose the Conquest DICOM Server, which uses MySQL as its database management system, from among 11 open source software packages. 2.
(1) Back-up and retrieval times (min) compared as follows: DVD-RAM (5.13, 2.26) vs. Conquest DICOM Server (1.49, 1.19) for the GE DSTE (p<0.001); CD (6.12, 3.61) vs. Conquest (0.82, 2.23) for the GE DLS (p<0.001); CD (5.88, 3.25) vs. Conquest (1.05, 2.06) for SIEMENS. (2) The time (sec) spent finding data was as follows: CD (156 ± 46), DVD-RAM (115 ± 21), and Conquest DICOM Server (13 ± 6). 3. No data was lost (0%) over the year, and 12,741 PET/CT studies were stored in 1.81 TB of memory. In contrast, errors occurred in 14 of the 500 CDs (2.8%). Conclusions: We found that a MicroPACS could be set up with open source software and that its performance was excellent. The system built with open source software proved more efficient and more robust than the back-up process using CDs or DVD-RAMs. We believe that a MicroPACS can be an effective data storage device as long as its operators develop and systematize it.

Effects of climate change on biodiversity and measures for them (생물다양성에 대한 기후변화의 영향과 그 대책)

An, Ji Hong;Lim, Chi Hong;Jung, Song Hie;Kim, A Reum;Lee, Chang Seok
- Journal of Wetlands Research / v.18 no.4 / pp.474-480 / 2016
This study discusses how biodiversity formed and changed over geologic history and how climate change affects biodiversity and humans, and it suggests alternatives for reducing the effects of climate change. Biodiversity is 'the variety of life' and refers collectively to variation at all levels of biological organization. That is, biodiversity encompasses genes, species, and ecosystems and their interactions. It provides the basis for ecosystems and the services on which all people fundamentally depend. Nevertheless, biodiversity today is increasingly threatened, usually as a result of human activity. The diverse organisms on Earth, estimated at 10 to 30 million species, are the result of adaptation and evolution in varied environments over the four billion years since the birth of life. The countless organisms that make up biodiversity each have specific characteristics and are interrelated through diverse relationships. The environment of the Earth on which we live has likewise been created over long periods through the extensive relationships and interactions of those organisms. Humans, as organisms, also live through interrelationships with other organisms and cannot live without them. Even so, in recent years human beings have accelerated the mean extinction rate to about 1,000 times that of the past. We have to conserve biodiversity so that future generations can live abundantly, and we are responsible for the sustainable use of biodiversity. Korea has achieved faster economic growth than any other country in the world. Korea originally held rich biodiversity, as it is a peninsula that stretches far from north to south and is surrounded by sea on three sides. But that biodiversity has increasingly disappeared in the process of rapid economic growth.
Korean people have created a distinctive Korean culture by coexisting with nature through a long history of agriculture, forestry, and fishery. But in recent years, the relationship between Koreans and nature has grown distant with the introduction of western culture and the development of science and technology, and the distinctive natural features born from the harmonious combination of nature and culture are disappearing more and more. The population of Korea is expected to decline, in contrast to the continuously growing world population. At this time, we need to restore the biodiversity damaged during rapid population growth and economic development, in concert with the recovery of natural ecosystems that accompanies population decrease. There have been five mass extinction events since the birth of life on Earth. The modern extinction is very rapid, and human activity is its major cause. In these respects, it is distinguished from past ones. Climate change is real. Biodiversity is very vulnerable to climate change. If organisms do not find a means of survival in the changed environment, such as 'adaptation through evolution' or 'movement to another place where they can exist', they will go extinct. In this respect, if climate change continues, biodiversity will be greatly damaged. Furthermore, climate change would also influence human life and the socio-economic environment through changes in biodiversity. Therefore, we need to grasp the effects of climate change on biodiversity more actively and, further, to prepare alternatives to reduce the damage. The effects of climate change on biodiversity that have emerged include changes in phenology, changes in distribution range including vegetation shifts, disruption of interactions among organisms, reduced reproduction and growth rates due to disrupted food chains, and degradation of coral reefs.
The effects on humans include the spread of infectious diseases, reduced food production, changes in the cultivation range of crops, and changes in fishing grounds and seasons. To solve the climate change problem, first of all, we need to mitigate climate change by reducing emissions of greenhouse gases. But even if we stopped emitting greenhouse gases now, climate change is expected to continue for the time being. In this respect, preparing adaptation strategies for climate change may be more realistic. Continuous monitoring of the effects of climate change on biodiversity, and the establishment of a monitoring system, have to take precedence over everything else. Securing diverse ecological spaces where biodiversity can establish, assisted migration, and the establishment of ecological networks, horizontal from south to north and vertical from lowland to upland, can be recommended as alternatives to aid the adaptation of biodiversity to the changing climate.
https://doi.org/10.17663/JWR.2016.18.4.474
