• Title, Summary, Keyword: C4.5 Algorithm

A Decision Tree Approach for Identifying Defective Products in the Manufacturing Process

  • Choi, Sungsu;Battulga, Lkhagvadorj;Nasridinov, Aziz;Yoo, Kwan-Hee
    • International Journal of Contents / v.13 no.2 / pp.57-65 / 2017
  • Recently, due to the significance of Industry 4.0, the manufacturing industry has been developing globally. The manufacturing industry conventionally generates a large volume of data, often related to processes, lines and products. In this paper, we analyze the causes of defective products in the manufacturing process using the decision tree technique, which is a well-known technique in data mining. We use data collected from the domestic manufacturing industry, including Manufacturing Execution System (MES) data, Point of Production (POP) data, equipment data accumulated directly in the equipment, in-process/external air-conditioning sensor data and static electricity data. We propose to implement a model using the C4.5 decision tree algorithm. Specifically, the proposed decision tree model is built from the components of a specific part. We propose to identify the state of the products in which a defect occurred and to compare it with the generated decision tree model to determine the cause of the defect.
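C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by split information. The following is a minimal sketch of that criterion for a categorical attribute; the function names and the toy "line"/"defect" data are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of a categorical attribute: information gain / split information."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: production line vs. inspection result.
line = ["A", "A", "B", "B"]
result = ["ok", "ok", "defect", "defect"]
print(gain_ratio(line, result))
```

An attribute that separates the classes perfectly, as in the toy data, gets the highest gain ratio; an attribute unrelated to the class gets a gain ratio of zero.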

An Efficient Bit Loading Algorithm for OFDM-based Wireless LAN systems and Hardware Architecture Design (OFDM 기반의 무선 LAN 시스템을 위한 효율적인 비트 로딩 알고리즘 및 하드웨어 구조 설계)

  • 강희윤;손병직;정윤호;김근회;김재석
    • Journal of the Institute of Electronics Engineers of Korea SD / v.41 no.5 / pp.153-160 / 2004
  • In this paper, we propose an efficient bit loading algorithm for IEEE 802.11a wireless LAN systems. While a conventional bit loading algorithm uses the SNR value of each subcarrier, it is very difficult to estimate the exact SNR value in wireless LAN systems due to the randomness of AWGN. To solve this problem, our proposed algorithm uses the channel frequency response instead of the SNR of each subcarrier. Simulation results show a performance gain of 3.5 to 8 dB at a PER of $10^{-2}$ with the proposed bit loading algorithm, whereas the conventional one obtains a gain of 0.5 to 5 dB under the same conditions. An increased data rate of 63 Mbps is also confirmed. After logic synthesis using 0.3 $\mu$m CMOS technology, the logic gate count of the processor with the proposed algorithm is reduced by 34% in comparison with the conventional one.

Evaluation of Machine Learning Algorithm Utilization for Lung Cancer Classification Based on Gene Expression Levels

  • Podolsky, Maxim D;Barchuk, Anton A;Kuznetcov, Vladimir I;Gusarova, Natalia F;Gaidukov, Vadim S;Tarakanov, Segrey A
    • Asian Pacific Journal of Cancer Prevention / v.17 no.2 / pp.835-838 / 2016
  • Background: Lung cancer remains one of the most common cancers in the world, both in terms of new cases (about 13% of total per year) and deaths (nearly one cancer death in five), because of the high case fatality. Errors in determining lung cancer type or malignant growth degrade treatment efficacy, because anticancer strategy depends on tumor morphology. Materials and Methods: We attempted to evaluate the effectiveness of machine learning algorithms in the task of lung cancer classification based on gene expression levels. We processed four publicly available data sets. The Dana-Farber Cancer Institute data set contains 203 samples and the task was to classify four cancer types and sound tissue samples. With the University of Michigan data set of 96 samples, the task was to perform a binary classification of adenocarcinoma and non-neoplastic tissues. The University of Toronto data set contains 39 samples and the task was to detect recurrence, while with the Brigham and Women's Hospital data set of 181 samples the task was a binary classification of malignant pleural mesothelioma and adenocarcinoma. We used the k-nearest neighbor algorithm (k=1, k=5, k=10), a naive Bayes classifier assuming both a normal distribution of attributes and a distribution through histograms, a support vector machine and a C4.5 decision tree. The effectiveness of the machine learning algorithms was evaluated with the Matthews correlation coefficient. Results: The support vector machine method showed the best results on the data sets from the Dana-Farber Cancer Institute and Brigham and Women's Hospital. All algorithms with the exception of the C4.5 decision tree showed maximum potential effectiveness on the University of Michigan data set. However, the C4.5 decision tree showed the best results for the University of Toronto data set. Conclusions: Machine learning algorithms can be used for lung cancer morphology classification and similar tasks based on gene expression level evaluation.
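For reference, the Matthews correlation coefficient used as the evaluation measure can be computed from a binary confusion matrix as sketched below. This covers only the two-class case; the function name `mcc` and the convention of returning 0 when the denominator vanishes are assumptions made for illustration, not details taken from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    tp, tn, fp, fn are the counts of true positives, true negatives,
    false positives and false negatives. Returns a value in [-1, 1].
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention (an assumption here): undefined cases map to 0.
        return 0.0
    return (tp * tn - fp * fn) / denom


print(mcc(tp=5, tn=5, fp=0, fn=0))  # perfect agreement
print(mcc(tp=1, tn=1, fp=1, fn=1))  # no better than chance
```

A value of 1 indicates perfect classification, 0 indicates performance no better than chance, and -1 indicates total disagreement, which makes the measure usable on data sets with unbalanced class sizes.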

Pipelined Scheduling of Functional HW/SW Modules for Platform-Based SoC Design

  • Kim, Won-Jong;Chang, June-Young;Cho, Han-Jin
    • ETRI Journal / v.27 no.5 / pp.533-538 / 2005
  • We developed a pipelined scheduling technique of functional hardware and software modules for platform-based system-on-a-chip (SoC) designs. It is based on a modified list scheduling algorithm. We used the pipelined scheduling technique for a performance analysis of an MPEG4 video encoder application. Then, we applied it to architecture exploration to achieve better performance. In our experiments, the modified SoC platform with 6 pipelines for the 32-bit dual-layer architecture shows a 118% improvement in performance compared to the given basic SoC platform with 4 pipelines for the 16-bit single-layer architecture.

Selection of an Optimal Algorithm among Decision Tree Techniques for Feature Analysis of Industrial Accidents in Construction Industries (건설업의 산업재해 특성분석을 위한 의사결정나무 기법의 상용 최적 알고리즘 선정)

  • Leem Young-Moon;Choi Yo-Han
    • Journal of the Korea Safety Management and Science / v.7 no.5 / pp.1-8 / 2005
  • Rapid industrial advancement, diversified types of business and unexpected industrial accidents have caused a great deal of human and material damage to many unspecified persons. Although many previous studies have sought to prevent industrial accidents, they only provide managerial and educational policies derived from frequency analysis and comparative analysis of data from past industrial accidents. The main objective of this study is to find an optimal algorithm for data analysis of industrial accidents, and this paper provides a comparative analysis of four algorithms: CHAID, CART, C4.5, and QUEST. The decision tree algorithm, a typical data mining technique, is used to predict results from objective and quantified data. Enterprise Miner of SAS and AnswerTree of SPSS will be used to evaluate the validity of the results of the four algorithms. The sample for this work was chosen from 19,574 records related to the construction industry over three years (2002 to 2004) in Korea.

OCV Hysteresis Effect-based SOC Estimation in EKF Algorithm for a LiFePO4/C Cell (OCV 히스테리시스 특성을 이용한 확장 칼만 필터 기반 리튬 폴리머 배터리 SOC 추정)

  • Kim, J.H.;Chun, C.Y.;Hur, I.N.;Cho, B.H.;Kim, B.J.
    • Proceedings of the KIPE Conference / pp.301-302 / 2011
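  • This paper introduces an extended Kalman filter (EKF) based state-of-charge (SOC) estimation method that uses the open-circuit voltage (OCV) hysteresis characteristics of a lithium polymer battery ($LiFePO_4/C$). To model the OCV, a key element of the battery equivalent circuit, the OCV hysteresis characteristics of charging and of discharging were each taken into account, and the SOC interval of the OCV-SOC relationship was reduced from 10% to 5%, which improved the performance of the EKF-based SOC estimation algorithm. For the regions where SOC estimation performed poorly when a scaled-down current profile for hybrid electric vehicles was applied, a noise model and data rejection were built into the measurement equation of the EKF. With the proposed method, the SOC estimation error stayed within 5% relative to the ampere-hour counting method.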

Clustering Algorithm for Efficient Energy Management in Sensor Network (센서 네트워크에서의 효율적 에너지 관리를 위한 클러스터링 알고리즘)

  • Seo, Sung-Yun;Jung, Won-Soo;Oh, Young-Hwan
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.10B / pp.845-854 / 2008
  • In this paper, we propose a clustering algorithm for efficient energy management in sensor networks composed of energy-constrained sensor nodes. The proposed algorithm improves energy efficiency by controlling sensing power, and it can be applied in a variety of network environments. The performance evaluation shows that energy efficiency improves by 5% when all sensor nodes are fixed and by 10 to 15% when all sensor nodes are moving. Experiments confirm that the proposed algorithm improves the energy efficiency ratio by 5 to 15% over the existing algorithm. The proposed algorithm yields an upper bound on energy efficiency for ubiquitous computing environments with varied network conditions based on IEEE 802.15.4 ZigBee technology. It can also greatly extend the lifetime of the sensor network even though the lifetime of an individual sensor node is short. We expect that it can broaden the practical use of sensor network technology in rapidly changing network environments.

A Predicate-Sensitive Scheduling Algorithm in Instruction-Level Parallelism Processors (ILP 프로세서를 위한 조건실행 지원 스케쥴링 알고리즘)

  • Yoo, Byung-Kang;Lee, Sang-Jeong
    • The Transactions of the Korea Information Processing Society / v.5 no.1 / pp.202-214 / 1998
  • Exploitation of instruction-level parallelism (ILP) is an effective mechanism for improving the performance of modern superscalar and VLIW processors. Various software techniques can be applied to increase ILP. Among these techniques, predicated execution increases the degree of ILP by removing branch instructions so that instructions from different basic blocks can be converted into a single basic block. In this paper, a global predicate-sensitive scheduling algorithm is proposed to improve the performance of ILP processors that support predicated execution. To examine the performance of the proposed algorithm, a C compiler and a simulator were developed. By simulating various benchmark programs with the compiler and the simulator, the performance of the algorithm was measured and its effectiveness verified. Measurements with 1-, 2- and 4-issue execution confirmed an average performance improvement of 20% or more.

Parameter Identification of Induction Motors using Variable-weighted Cost Function of Genetic Algorithms

  • Megherbi, A.C.;Megherbi, H.;Benmahamed, K.;Aissaoui, A.G.;Tahour, A.
    • Journal of Electrical Engineering and Technology / v.5 no.4 / pp.597-605 / 2010
  • This paper presents a contribution to parameter identification of a nonlinear system using a new strategy to improve the genetic algorithm (GA) method. Since the cost function plays an important role in GA-based parameter identification, we propose to improve the simple version of GA so that the weights of the cost function are not taken as constant values but vary over the course of the parameter identification procedure. This modified version of GA is applied to the induction motor (IM) as an example of a nonlinear system. The GA cost function is the weighted sum of the stator current and rotor speed errors between the plant and the model of the induction motor. Simulation results show that the identification method based on the improved GA is feasible and gives high precision.
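The idea of a cost function whose weights change during identification can be sketched as follows. The abstract does not state the error norm or how the weights vary, so the squared-error sums and the linear schedule in `linear_weights` are purely hypothetical choices for illustration, as are all function and parameter names.

```python
def weighted_cost(current_err, speed_err, w_current, w_speed):
    """Weighted sum of stator current and rotor speed errors.

    current_err and speed_err are sequences of plant-minus-model errors.
    Squared errors are an assumption; the abstract does not give the norm.
    """
    current_term = sum(e * e for e in current_err)
    speed_term = sum(e * e for e in speed_err)
    return w_current * current_term + w_speed * speed_term


def linear_weights(generation, n_generations):
    """Hypothetical schedule: move weight linearly from current to speed error."""
    t = generation / max(n_generations - 1, 1)
    return 1.0 - t, t


# Evaluate one candidate's errors at the first and last GA generation.
current_err, speed_err = [1.0, 2.0], [3.0]
for gen in (0, 10):
    w_i, w_omega = linear_weights(gen, n_generations=11)
    print(gen, weighted_cost(current_err, speed_err, w_i, w_omega))
```

In a GA loop, the weights would be recomputed each generation before the fitness of every candidate parameter set is evaluated.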

Video Content Indexing using Kullback-Leibler Distance

  • Kim, Sang-Hyun
    • International Journal of Contents / v.5 no.4 / pp.51-54 / 2009
  • Huge video databases require an effective method of indexing video content. While manual indexing is the most effective approach to this goal, it is slow and expensive. Automatic indexing is therefore desirable, and various indexing tools for video databases have recently been developed. For efficient video content indexing, the similarity measure is an important factor. This paper presents new similarity measures between frames and proposes a new algorithm that indexes video content using the Kullback-Leibler distance defined between two histograms. Experimental results show that the proposed algorithm using the Kullback-Leibler distance gives remarkably high accuracy ratios compared with several conventional algorithms for indexing video content.
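A Kullback-Leibler distance between two frame histograms can be sketched as below. The normalization step, the `eps` guard against empty bins and the symmetric variant are assumptions made for illustration; the abstract does not specify how the paper defines or smooths the measure.

```python
import math


def kl_divergence(p_hist, q_hist, eps=1e-12):
    """Kullback-Leibler divergence D(P || Q) between two histograms.

    Both histograms are normalized to sum to 1 first. Bins that are empty
    in Q are floored at eps (an assumption) to avoid division by zero.
    """
    p_total, q_total = sum(p_hist), sum(q_hist)
    distance = 0.0
    for p_count, q_count in zip(p_hist, q_hist):
        p = p_count / p_total
        q = max(q_count / q_total, eps)
        if p > 0:  # bins with p = 0 contribute nothing
            distance += p * math.log(p / q)
    return distance


def symmetric_kl(p_hist, q_hist):
    """Symmetrized form, often used when a distance-like measure is needed."""
    return kl_divergence(p_hist, q_hist) + kl_divergence(q_hist, p_hist)


# Hypothetical 4-bin intensity histograms of two video frames.
frame_a = [10, 20, 30, 40]
frame_b = [12, 18, 33, 37]
print(symmetric_kl(frame_a, frame_b))
```

Similar frames give a value close to zero, so a sharp rise between consecutive frames can serve as a cue for a content change.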