• Title/Summary/Keyword: Classification Algorithms


A performance improvement methodology of web document clustering using FDC-TCT (FDC-TCT를 이용한 웹 문서 클러스터링 성능 개선 기법)

  • Ko, Suc-Bum;Youn, Sung-Dae
    • The KIPS Transactions:PartD / v.12D no.4 s.100 / pp.637-646 / 2005
  • Several problems arise when applying classification or clustering algorithms to document classification, where the web search results returned for a user's keyword need post-processing or classification. Two of these problems are severe. The first is the need for an expert's help to categorize the documents. The second is the long processing time that document classification takes. We therefore propose a new web document clustering method that uses the Transitive Closure Tree (TCT) to dramatically reduce the number of document similarity calculations and to speed up processing without losing precision. We also compare the effectiveness of the proposed method with that of existing algorithms and present the experimental results.
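
The abstract does not describe how the Transitive Closure Tree is built, so the following is only a minimal sketch of the general idea it relies on: if documents are grouped by the transitive closure of a "similar enough" relation, a pair that is already linked through other documents needs no similarity calculation. The union-find structure, the cosine measure, the `threshold` value and all function names are assumptions for illustration, not the authors' FDC-TCT method.

```python
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def closure_clusters(docs, threshold=0.5):
    """Cluster docs by the transitive closure of 'similarity >= threshold'.

    Returns (clusters, calls), where calls counts similarity computations.
    """
    parent = list(range(len(docs)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    calls = 0
    for i, j in combinations(range(len(docs)), 2):
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # already linked transitively: skip the similarity call
        calls += 1
        if cosine(docs[i], docs[j]) >= threshold:
            parent[rj] = ri  # merge the two groups

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values()), calls
```

The saving depends on how early similar documents are merged; in the worst case (no two documents similar) every pair is still compared.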

Prefix Cuttings for Packet Classification with Fast Updates

  • Han, Weitao;Yi, Peng;Tian, Le
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.4 / pp.1442-1462 / 2014
  • Packet classification is a key Internet technology that lets routers classify arriving packets into different flows according to predefined rulesets. Previous packet classification algorithms have mainly focused on search speed and memory usage while overlooking update performance. In this paper, we propose PreCuts, which can drastically improve update speed. Based on the characteristics of the IP fields, we implement three heuristics to build a 3-layer decision tree. In the first layer, we group the rules with the same highest byte of the source and destination IP addresses. In the second layer, we cluster the rules that share the same IP prefix length. Finally, we use an information entropy-based bit partition heuristic to choose specific bits of the IP prefix that split the ruleset into subsets. The heuristics of PreCuts do not introduce rule duplication, and incremental updates do not degrade time and space performance. Experiments with ClassBench show that, compared with BRPS and EffiCuts, the proposed algorithm not only improves time and space performance but also greatly increases update speed.
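
The entropy-based bit partition in the third layer can be illustrated with a small sketch. It assumes that all prefixes in a group have the same length (as after the second layer), that each prefix is stored as an integer of `length` bits, and that the "best" bit is the one whose 0/1 split has the highest Shannon entropy. The function names and the single-bit choice are assumptions; the abstract does not give the exact PreCuts procedure.

```python
from math import log2


def bit_entropy(prefixes, length, bit):
    """Shannon entropy of the 0/1 split induced by `bit` (0 = most significant)."""
    ones = sum((p >> (length - 1 - bit)) & 1 for p in prefixes)
    n = len(prefixes)
    entropy = 0.0
    for count in (ones, n - ones):
        if count:
            q = count / n
            entropy -= q * log2(q)
    return entropy


def best_split_bit(prefixes, length):
    """Return the bit index with the most balanced split (highest entropy)."""
    return max(range(length), key=lambda b: bit_entropy(prefixes, length, b))


def partition(prefixes, length, bit):
    """Split prefixes on one bit; every prefix lands in exactly one subset."""
    zeros = [p for p in prefixes if not (p >> (length - 1 - bit)) & 1]
    ones = [p for p in prefixes if (p >> (length - 1 - bit)) & 1]
    return zeros, ones
```

Because every prefix in the group defines the chosen bit, each rule falls into exactly one subset, which is consistent with the abstract's claim that the heuristics introduce no rule duplication.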

Effective Hand Gesture Recognition by Key Frame Selection and 3D Neural Network

  • Hoang, Nguyen Ngoc;Lee, Guee-Sang;Kim, Soo-Hyung;Yang, Hyung-Jeong
    • Smart Media Journal / v.9 no.1 / pp.23-29 / 2020
  • This paper presents an approach to dynamic hand gesture recognition that uses an algorithm based on a 3D Convolutional Neural Network (3D_CNN), later extended to 3D Residual Networks (3D_ResNet), together with neural-network-based key frame selection. Typically, a 3D deep neural network classifies gestures from image frames randomly sampled from video data. In this work, to improve classification performance, we use key frames that represent the overall video as the input to the classification network. The key frames are extracted by SegNet instead of the conventional clustering algorithms for video summarization (VSUMM), which require heavy computation. By using a deep neural network, key frame selection can be performed in a real-time system. Experiments are conducted using 3D convolutional kernels such as 3D_CNN, Inflated 3D_CNN (I3D) and 3D_ResNet for gesture classification. Our algorithm achieved a classification accuracy of up to 97.8% on the Cambridge gesture dataset. The experimental results show that the proposed approach is efficient and outperforms existing methods.

Evaluation on Performance for Classification of Students Leaving Their Majors Using Data Mining Technique (데이터마이닝 기법을 이용한 전공이탈자 분류를 위한 성능평가)

  • Leem, Young-Moon;Ryu, Chang-Hyun
    • Proceedings of the Safety Management and Science Conference / 2006.11a / pp.293-297 / 2006
  • Recently, most universities have been suffering from students leaving their majors, and many are trying to find a suitable countermeasure to reduce the major separation rate. As a similar endeavor, this paper uses the decision tree algorithm, a data mining technique that groups or predicts by dividing a group of interest into several sub-groups. This technique can analyze the characteristics of students who leave their majors. The dataset consists of 5,115 features obtained through data selection from a total of 13,346 data items collected from a university in Kangwon-Do over seven years (2000.3.1 to 2006.6.30). The main objective of this study is to evaluate the performance of algorithms including CHAID, CART and C4.5 for classifying students leaving their majors, using the ROC Chart, Lift Chart and Gains Chart. This study also reports accuracy, sensitivity and specificity values from a classification table. According to the analysis, CART showed the best performance for classifying students leaving their majors.
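
For reference, the three values read from a classification table follow the standard definitions sketched below. The function name and the counts in the usage line are illustrative only and are not results from the paper.

```python
def classification_table_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity and specificity from a 2x2 classification table.

    tp/fn: positives (e.g. students who left) predicted correctly/incorrectly.
    tn/fp: negatives (students who stayed) predicted correctly/incorrectly.
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
    }


# Hypothetical counts, chosen only to show the call.
metrics = classification_table_metrics(tp=40, fn=10, fp=5, tn=45)
```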


Edge-Preserving Algorithm for Block Artifact Reduction and Its Pipelined Architecture

  • Vinh, Truong Quang;Kim, Young-Chul
    • ETRI Journal / v.32 no.3 / pp.380-389 / 2010
  • This paper presents a new edge-protection algorithm and its very large scale integration (VLSI) architecture for block artifact reduction. Unlike previous approaches using block classification, our algorithm utilizes pixel classification to categorize each pixel into one of two classes, namely smooth region and edge region, which are described by the edge-protection maps. Based on these maps, a two-step adaptive filter which includes offset filtering and edge-preserving filtering is used to remove block artifacts. A pipelined VLSI architecture of the proposed deblocking algorithm for HD video processing is also presented in this paper. A memory-reduced architecture for a block buffer is used to optimize memory usage. The architecture of the proposed deblocking filter is verified on FPGA Cyclone II and implemented using the ANAM 0.25 μm CMOS cell library. Our experimental results show that our proposed algorithm effectively reduces block artifacts while preserving the details. The PSNR performance of our algorithm using pixel classification is better than that of previous algorithms using block classification.
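
The PSNR figure used for the comparison is the standard peak signal-to-noise ratio, 10·log10(peak² / MSE). A minimal sketch follows, assuming 8-bit grayscale images stored as lists of rows; the function name and the image representation are assumptions for illustration.

```python
from math import log10


def psnr(reference, processed, peak=255):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    squared_error = 0
    count = 0
    for ref_row, out_row in zip(reference, processed):
        for r, p in zip(ref_row, out_row):
            squared_error += (r - p) ** 2
            count += 1
    if squared_error == 0:
        return float("inf")  # identical images: no noise at all
    mse = squared_error / count
    return 10 * log10(peak ** 2 / mse)
```

A higher PSNR means the deblocked frame is closer to the reference frame.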

Area Classification, Identification and Tracking for Multiple Moving Objects with the Similar Colors (유사한 색상을 지닌 다수의 이동 물체 영역 분류 및 식별과 추적)

  • Lee, Jung Sik;Joo, Yung Hoon
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.3 / pp.477-486 / 2016
  • This paper presents area classification, identification, and tracking for multiple moving objects with similar colors. First, we use a GMM (Gaussian Mixture Model)-based background modeling method to detect the moving objects. Second, we propose using binarization and morphological operations on the image to eliminate shadows and noise when detecting moving objects. Third, we recognize the ROI (region of interest) of each moving object through a labeling method. We then propose an area classification method that removes the background from the detected moving objects, and a novel method for identifying the classified moving areas. We also propose a method for tracking the identified moving objects using a Kalman filter. Finally, we propose an effective tracking method for the case where multiple objects with similar colors are detected, and we demonstrate the feasibility and applicability of the proposed algorithms through experiments.

Comparison of machine learning algorithms for regression and classification of ultimate load-carrying capacity of steel frames

  • Kim, Seung-Eock;Vu, Quang-Viet;Papazafeiropoulos, George;Kong, Zhengyi;Truong, Viet-Hung
    • Steel and Composite Structures / v.37 no.2 / pp.193-209 / 2020
  • In this paper, the efficiency of five Machine Learning (ML) methods, namely Deep Learning (DL), Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), and Gradient Tree Boosting (GTB), is compared for regression and classification of the Ultimate Load Factor (ULF) of nonlinear inelastic steel frames. For this purpose, a two-story, a six-story, and a twenty-story space frame are considered. An advanced nonlinear inelastic analysis is carried out for the steel frames to generate datasets for training the considered ML methods. In each dataset, the input variables are the geometric features of W-sections and the output variable is the ULF of the frame. The five ML methods are compared in terms of the mean-squared-error (MSE) for the regression models and the accuracy for the classification models. Moreover, the ULF distribution curve is calculated for each frame and the strength failure probability is estimated. It is found that the GTB method has the best efficiency in both regression and classification of ULF, regardless of the number of training samples and the space frames considered.

Using GAs to Support Feature Weighting and Instance Selection in CBR for CRM

  • Ahn, Hyun-Chul;Kim, Kyoung-Jae;Han, In-Goo
    • Proceedings of the Korea Inteligent Information System Society Conference / 2005.11a / pp.516-525 / 2005
  • Case-based reasoning (CBR) has been widely used in various areas due to its convenience and strength in complex problem solving. Generally, to obtain successful results from CBR, effective retrieval of useful prior cases for the given problem is essential. However, designing a good matching and retrieval mechanism for CBR systems is still a controversial research issue. Most prior studies have tried to optimize either the weights of the features or the selection of appropriate instances, but these optimizations have so far been performed independently. Simultaneous optimization of these components may lead to better performance than naive models. In particular, there have been few attempts to simultaneously optimize the feature weights and the instance selection for CBR. Here we suggest a simultaneous optimization model of these components using a genetic algorithm (GA). We apply it to a customer classification model that uses demographic characteristics of customers as inputs to predict their buying behavior for a specific product. Experimental results show that simultaneously optimized CBR may improve classification accuracy and outperform various optimized CBR models as well as other classification models, including logistic regression, multiple discriminant analysis, artificial neural networks and support vector machines.


Neural and MTS Algorithms for Feature Selection

  • Su, Chao-Ton;Li, Te-Sheng
    • International Journal of Quality Innovation / v.3 no.2 / pp.113-131 / 2002
  • The relationships among multi-dimensional data (such as medical examination data) with ambiguity and variation are difficult to explore. The traditional approach to building a data classification system requires the formulation of rules by which the input data can be analyzed. Formulating such rules is very difficult with large sets of input data. This paper first describes two classification approaches using a back-propagation (BP) neural network and a Mahalanobis distance (MD) classifier, and then proposes two classification approaches for multi-dimensional feature selection. The first is a feature selection procedure based on the trained BP neural network. The basic idea of this procedure is to compare the multiplication weights between the input and hidden layers and between the hidden and output layers. To simplify the structure, only the multiplication weights with large absolute values are used. The second approach is the Mahalanobis-Taguchi system (MTS) originally suggested by Dr. Taguchi. The MTS performs Taguchi's fractional factorial design based on the Mahalanobis distance as a performance metric. We combine automatic thresholding with MD so that it can deal with a reduced model, which is the focus of this paper. Two case studies are used as examples to compare and discuss the complete and reduced models employing the BP neural network and the MD classifier. The implementation results show that the proposed approaches are effective and powerful for classification.
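
One way to read the first procedure is sketched below, under stated assumptions: "multiplication weights" is taken to mean the product of an input-to-hidden weight and the corresponding hidden-to-output weight, each input feature is scored by the sum of the absolute values of its products, and the highest-scoring features are kept. The aggregation rule, the `keep` parameter and the function names are hypothetical; the abstract does not specify them.

```python
def feature_scores(w_in_hidden, w_hidden_out):
    """Score each input feature by summed |input->hidden * hidden->output| products.

    w_in_hidden[i][j]: weight from input feature i to hidden unit j.
    w_hidden_out[j][k]: weight from hidden unit j to output unit k.
    """
    scores = []
    for row in w_in_hidden:                  # one row per input feature
        total = 0.0
        for j, w_ij in enumerate(row):       # each hidden unit j
            for v_jk in w_hidden_out[j]:     # each output unit k
                total += abs(w_ij * v_jk)
        scores.append(total)
    return scores


def select_features(scores, keep):
    """Return the indices of the `keep` highest-scoring features, in index order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])
```

Features whose products are all small contribute little to any output and are the candidates for removal in the reduced model.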

Evaluation of User Profile Construction Method by Fuzzy Inference

  • Kim, Byeong-Man;Rho, Sun-Ok;Oh, Sang-Yeop;Lee, Hyun-Ah;Kim, Jong-Wan
    • International Journal of Fuzzy Logic and Intelligent Systems / v.8 no.3 / pp.175-184 / 2008
  • To construct user profiles automatically, a method for extracting representative keywords from a set of documents is needed. In our previous work, we suggested such a method and showed its usefulness. Here, we apply it to the classification problem and observe how much it contributes to performance improvement. The method can be used as a linear document classifier with few modifications, so we first evaluate its performance in that case. The method is also applicable to some non-linear classification methods such as GIS (Generalized Instance Set). In the GIS algorithm, generalized instances are built from training documents by a generalization function and then the K-NN algorithm is applied to them; our method can serve as the generalization function. For comparison, two well-known linear classification methods, the Rocchio and Widrow-Hoff algorithms, are also used. Experimental results show that our method is better than the others when only positive documents are considered, but not when negative documents are considered as well.
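
As background for the comparison, a Rocchio-style linear classifier can be sketched as follows: a prototype vector is built from the mean of the positive documents minus a scaled mean of the negative documents, and a new document is scored by its cosine similarity to the prototype. This is the textbook form of the baseline, not the authors' fuzzy-inference method; the `beta` and `gamma` values and the function names are assumptions. Passing no negative documents corresponds to the positive-only setting mentioned in the abstract.

```python
from math import sqrt


def centroid(docs):
    """Mean term-weight vector of a non-empty list of sparse documents."""
    total = {}
    for doc in docs:
        for term, w in doc.items():
            total[term] = total.get(term, 0.0) + w
    return {t: w / len(docs) for t, w in total.items()}


def rocchio_prototype(positive, negative=(), beta=1.0, gamma=0.25):
    """Prototype = beta * mean(positive) - gamma * mean(negative)."""
    proto = {t: beta * w for t, w in centroid(positive).items()}
    if negative:
        for t, w in centroid(negative).items():
            proto[t] = proto.get(t, 0.0) - gamma * w
    return proto


def score(proto, doc):
    """Cosine similarity between the prototype and a document."""
    dot = sum(w * proto.get(t, 0.0) for t, w in doc.items())
    norm = sqrt(sum(w * w for w in proto.values())) * sqrt(
        sum(w * w for w in doc.values())
    )
    return dot / norm if norm else 0.0
```

A document is assigned to the class when its score exceeds a chosen cut-off, which makes the decision boundary linear in the term weights.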