• Title, Summary, Keyword: C4.5 Algorithm

A Study on Split Variable Selection Using Transformation of Variables in Decision Trees

  • Chung, Sung-S.;Lee, Ki-H.;Lee, Seung-S.
    • Journal of the Korean Data and Information Science Society / v.16 no.2 / pp.195-205 / 2005
  • In decision tree analysis, the C4.5 and CART algorithms have problems of computational complexity and bias in variable selection. The QUEST algorithm addresses these problems by separating variable selection from split-point selection. When input variables are continuous, QUEST uses the ANOVA F-test under the assumptions of normality and homogeneity of variances. In this paper, we investigate how violations of the normality assumption affect the QUEST algorithm and what effect transforming the variables has. In a simulation study, we obtained the empirical power and the empirical bias of variable selection after transforming variables with various types of underlying distributions.
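The variable-selection step described in this abstract can be illustrated with a small sketch. It is a simplification, not the QUEST implementation: it ranks continuous variables by the one-way ANOVA F statistic alone, whereas QUEST compares p-values and has further steps not covered here. The function names `anova_f` and `select_split_variable` and the toy data are hypothetical.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of per-class samples.

    Assumes at least two classes and non-zero within-class variation.
    """
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_split_variable(columns, labels):
    """Pick the variable with the largest F statistic (simplified rule)."""
    scores = {}
    for name, values in columns.items():
        by_class = {}
        for x, y in zip(values, labels):
            by_class.setdefault(y, []).append(x)
        scores[name] = anova_f(list(by_class.values()))
    best = max(scores, key=scores.get)
    return best, scores


# Toy data (made up): x1 separates the two classes, x2 does not.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "x1": [1.0, 1.2, 0.8, 3.0, 3.1, 2.9],
    "x2": [1.0, 3.0, 2.0, 2.0, 1.0, 3.0],
}
best, scores = select_split_variable(columns, labels)
print(best, scores)
```

Because the F-test assumes normality and equal variances, heavily skewed inputs can distort these scores, which is the situation the paper studies through variable transformation.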

Development of Look Ahead Interpolation Algorithm For PC Based CNC System (PC기반CNC시스템을 위한 Look Ahead 보간 알고리즘 개발)

  • Ryu, Sun-Joong
    • Journal of the Semiconductor & Display Technology / v.14 no.4 / pp.30-37 / 2015
  • This research aims to develop a Look Ahead position interpolation algorithm for a small CNC machine controlled by a PC-based controller. The Look Ahead scheme can process a bundle of the CNC's linear interpolation commands simultaneously, which reduces the acceleration and deceleration time within a single linear interpolation command. The algorithm is derived in a simple analytical form that can be implemented in C on a PC-based CNC system. Its performance was verified experimentally with G codes for tail stock machining. With the Look Ahead scheme, the average traverse speed of the CNC machine increased by 27.5% and the total traverse time decreased by 27.2%.

Decision Tree Approach for Factor Analysis of Industrial Accidents (산업재해의 요인분석을 위한 의사결정나무)

  • Leem, Young-Moon;Hwang, Young-Seob
    • Journal of the Korea Safety Management and Science / v.8 no.4 / pp.1-11 / 2006
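  • The decision tree algorithm is a data mining technique that enables classification and prediction for data of interest. It can analyze the characteristics of data types and can be used to find differences among types of industrial accidents. In this study, the C4.5 algorithm was used to identify the characteristics of industrial accident data. The data analyzed are industrial accident records from Gangwon-do covering two years and comprise 19,909 cases. For the purposes of this study, one target variable and eight independent variables were subdivided according to the type of industrial accident. After analysis, the data were classified into 222 tree nodes in total, of which 151 were leaf nodes. In addition, this study provides a gains chart to support the management and reduction of risk for accident victims.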

A Study on Development of A Web-Based Forecasting System of Industrial Accidents (웹 기반의 산업재해 예측시스템 개발에 관한 연구)

  • Leem, Young-Moon;Hwang, Young-Seob;Choi, Yo-Han
    • Proceedings of the Safety Management and Science Conference / pp.269-274 / 2007
  • The ultimate goal of this research is to develop a web-based forecasting system for industrial accidents. As an initial step, this paper provides a comparative analysis of four algorithms: CHAID, CART, C4.5, and QUEST. In addition, this paper presents the logical process for developing a forecasting system. The decision tree algorithm, a typical data mining technique, is used to predict results from objective and quantified data. The sample for this work consists of 10,536 records related to manufacturing industries in Korea over three years (2002-2004).

Automatic Switching of Clustering Methods based on Fuzzy Inference in Bibliographic Big Data Retrieval System

  • Zolkepli, Maslina;Dong, Fangyan;Hirota, Kaoru
    • International Journal of Fuzzy Logic and Intelligent Systems / v.14 no.4 / pp.256-267 / 2014
  • An automatic switch among ensembles of clustering algorithms is proposed as part of a bibliographic big data retrieval system. It uses a fuzzy inference engine as a decision support tool to select the fastest-performing clustering algorithm among fuzzy C-means (FCM) clustering, Newman-Girvan clustering, and the combination of both. It aims to realize the best clustering performance while reducing computational complexity from $O(n^3)$ to $O(n)$. The automatic switch is developed using a fuzzy logic controller written in Java and accepts three inputs from each clustering result: the number of clusters, the number of vertices, and the time taken to complete the clustering process. Experimental results on a PC (Intel Core i5-3210M at 2.50 GHz) demonstrate that the combination of both clustering algorithms is selected as the best-performing algorithm in 20 out of 27 cases, with the highest percentage of 83.99%, completed in 161 seconds. The self-adapted FCM is selected as the best-performing algorithm in 4 cases and Newman-Girvan in 3 cases. The automatic switch is to be incorporated into the bibliographic big data retrieval system, which focuses on visualizing fuzzy relationships using a hybrid approach combining the FCM and Newman-Girvan algorithms, and which is planned for public release over the Internet.

A Stable Multilevel Partitioning Algorithm for VLSI Circuit Designs Using Adaptive Connectivity Threshold (가변적인 연결도 임계치 설정에 의한 대규모 집적회로 설계에서의 안정적인 다단 분할 방법)

  • 임창경;정정화
    • Journal of the Korean Institute of Telematics and Electronics C / v.35C no.10 / pp.69-77 / 1998
  • This paper presents a new efficient and stable multilevel partitioning algorithm for VLSI circuit design. The performance of multilevel partitioning algorithms, which were proposed to enhance the performance of earlier iterative-improvement partitioning algorithms on large-scale circuits, depends on the choice of construction method for the partition hierarchy. Because most previous multilevel partitioning algorithms impose experimental constraints on the process of hierarchy construction, the stability of their performance suffers. This lack of stability causes large variation in partition results across multiple runs. In this paper, we minimize the use of experimental constraints and propose a new method for constructing the partition hierarchy. The proposed method clusters cells according to the connection status of the circuit. After the partition hierarchy is constructed, a partition improvement algorithm, HYIP$^{[11]}$, which uses a hybrid bucket structure, unclusters the hierarchy to obtain the partition results. Experimental results on ACM/SIGDA benchmark circuits show improvements of up to 10-40% in minimum cutsize over previous algorithms$^{[3] [4] [5] [8] [10]}$. Our technique also outperforms ML$^{[10]}$, a representative multilevel partitioning method, by about 5% and 20% for minimum and average cutsize, respectively. In addition, the results of our algorithm with 10 runs are better than those of the ML algorithm with 100 runs.

Temperature Prediction of Underground Working Place Using Artificial Neural Networks (인공신경망을 이용한 심부 갱내온도 예측)

  • Kim, Yun-Kwang;Kim, Jin
    • Tunnel and Underground Space / v.17 no.4 / pp.301-310 / 2007
  • Predicting the temperature in the workings is important both for assessing the feasibility of developing a deep coal bed and for ventilation design. Obtaining the precise thermal conductivity of the rock is demanding because of the variety and complexity of the rock types adjacent to the coal bed. Therefore, to estimate the thermal conductivity corresponding to this geological situation and the complex gallery conditions, a computer program, TemPredict, is developed in this study. It employs an artificial neural network and calculates the climatic conditions in galleries. The network is based on the back-propagation algorithm and is composed of input layers that accept the physical and geological factors of the coal bed and hidden layers that have 5 and 3 neurons, respectively. To verify TemPredict, the calculated result is compared with the measured one at the entrance of -300 ML 9X of the Jang-sung production department, Jang-sung Coal Mine. The difference between the temperature calculated by TemPredict ($25.65^{\circ}C$) and the measured temperature ($25.7^{\circ}C$) is only $0.05^{\circ}C$, which is less than the allowable error of 5%. The result has a very high reliability of more than 95%. A temperature prediction was also made for the main carriage gallery 9X at -425 ML, currently under construction, for the time when it is completed; the result is $28.2^{\circ}C$. In the future, the program is expected to contribute to ventilation design for mines and underground structures.
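The error comparison in this abstract can be checked by hand. As a sketch, assuming the 5% allowable error is taken relative to the measured temperature (the abstract does not state the reference value), the relative error is:

```latex
\[
\frac{\lvert 25.65 - 25.7 \rvert}{25.7}
  = \frac{0.05}{25.7}
  \approx 0.0019
  = 0.19\% \;<\; 5\%
\]
```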

Temperature Control of Greenhouse Using Ventilation Window Adjustments by a Fuzzy Algorithm (퍼지제어에 의한 자연환기온실의 온도제어)

  • 정태상;민영봉;문경규
    • Protected Horticulture and Plant Factory / v.10 no.1 / pp.42-49 / 2001
  • This study was carried out to develop a fuzzy control technique for the ventilation window in order to control the temperature in a greenhouse. To reduce the number of fuzzy variables, the slope of the inside air temperature was taken as one of the fuzzy variables, because the variation in inside air temperature caused by ventilation at a given window aperture is affected by the difference between inside and outside air temperature, the outside wind speed, and the wind direction. Accordingly, the antecedent variables of the fuzzy algorithm were the control error and its slope, which had the same value as the slope of the inside air temperature during the control period, and the consequent variable was the opening rate of the window aperture. The optimum ranges of the fuzzy variables were determined through basic and applied control experiments with a control period of 3 minutes. The ranges of the control error and its slope were set to 3 and 1.5 times the target error in steady state, and the window opening rate was set to 30% of the full size of the window aperture. To evaluate the developed fuzzy algorithm, which used 19 optimized fuzzy production rules, the performance of fuzzy control was compared with that of PID control. The temperature control errors of fuzzy control and PID control were lower than 1.3$^{\circ}C$ and 2.2$^{\circ}C$, respectively. The accumulated operating size of the window, the number of operations, and the number of inverse operations under fuzzy control were 0.4, 0.5, and 0.3 times those under PID control. Therefore, fuzzy control can operate the window more smoothly and reduce the operating energy to 1/2 of that of PID control.

A New Temperature Control System by PWM Control Method for Thermal Massage System (PWM 제어방식에 의한 온열치료기의 새로운 온도제어 시스템)

  • Song, Myoung-Gyu;Lee, Jae-Heung
    • Journal of IKEEE / v.18 no.3 / pp.409-419 / 2014
  • This paper proposes a new temperature control algorithm and system configuration for the pTMS (personal Thermal Massage System). By controlling the pulse width of the PWM (Pulse Width Modulation) signal, the temperature of the heating lamp, which is indispensable to the massage function, can be controlled stably. The technology is also applied to the 'thermal massage', 'thermal acupressure', and 'thermal moxibustion' functions of medical equipment. The temperature can be set between $40^{\circ}C$ and $70^{\circ}C$ in increments of $5^{\circ}C$, controlled in real time in increments of $1^{\circ}C$, and is displayed on the monitor, updated every 2 seconds. When the present temperature is equal to the preset temperature, the PWM signal is minimized, and when the present temperature is higher than the preset temperature, overheating is prevented by interrupting the PWM output signal. When the temperature difference exceeds $4^{\circ}C$, the PWM control is maximized so that the system reaches the target temperature within a short period of time.
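The control rules stated in this abstract can be sketched as a small duty-cycle function. This is an illustration under assumptions, not the authors' implementation: the names `pwm_duty` and `valid_setpoint`, the numeric values of `MIN_DUTY` and `MAX_DUTY`, and the linear ramp used when the lamp is 1 to 4 degrees below the setpoint are all hypothetical, since the abstract does not specify them.

```python
MIN_DUTY = 0.05  # assumed value for the "minimized" PWM signal (not in the abstract)
MAX_DUTY = 1.0   # assumed value for the "maximized" PWM signal (not in the abstract)


def valid_setpoint(temp_c):
    """Setpoints run from 40 to 70 degrees C in 5-degree steps (from the abstract)."""
    return 40 <= temp_c <= 70 and temp_c % 5 == 0


def pwm_duty(present_c, preset_c):
    """Return a PWM duty cycle in [0, 1] for integer temperatures in degrees C."""
    diff = preset_c - present_c
    if diff < 0:
        # Present temperature above the preset: interrupt the PWM output.
        return 0.0
    if diff == 0:
        # At the preset temperature: minimize the PWM signal.
        return MIN_DUTY
    if diff > 4:
        # More than 4 degrees below the preset: maximize the PWM control.
        return MAX_DUTY
    # Assumption: the abstract does not describe this range, so ramp linearly.
    return MIN_DUTY + (MAX_DUTY - MIN_DUTY) * diff / 4.0


for present in (50, 53, 55, 60):
    print(present, pwm_duty(present, 55))
```

Treating "higher than the preset" as a hard cutoff keeps the overheating protection independent of whatever ramp is chosen for the in-between range.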

Classification Techniques for XML Document Using Text Mining (텍스트 마이닝을 이용한 XML 문서 분류 기술)

  • Kim Cheon-Shik;Hong You-Sik
    • Journal of the Korea Society of Computer and Information / v.11 no.2 / pp.15-23 / 2006
  • Millions of documents are already on the Internet, and new documents are being created all the time. Classifying these documents by the most suitable means is therefore an important problem for document management and querying. Most users, however, rely on keyword-based document classification. This method does not classify documents efficiently, and it is weak at categorizing documents by their meaning. Classification by a person can be very accurate and is often required. Therefore, in this paper, we classify documents using a neural network algorithm and the C4.5 algorithm. We used resume data formatted in XML for the document classification experiment. The results showed excellent potential for document categorization. We therefore expect the approach to be applicable to various document classification problems.
