• Title/Summary/Keyword: SVM Classifier

Search Result 374, Processing Time 0.021 seconds

Vehicle Headlight and Taillight Recognition in Nighttime using Low-Exposure Camera and Wavelet-based Random Forest (저노출 카메라와 웨이블릿 기반 랜덤 포레스트를 이용한 야간 자동차 전조등 및 후미등 인식)

  • Heo, Duyoung;Kim, Sang Jun;Kwak, Choong Sub;Nam, Jae-Yeal;Ko, Byoung Chul
    • Journal of Broadcast Engineering
    • /
    • v.22 no.3
    • /
    • pp.282-294
    • /
    • 2017
  • In this paper, we propose a novel intelligent headlight control (IHC) system which is durable to various road lights and camera movement caused by vehicle driving. For detecting candidate light blobs, the region of interest (ROI) is decided as front ROI (FROI) and back ROI (BROI) by considering the camera geometry based on perspective range estimation model. Then, light blobs such as headlights, taillights of vehicles, reflection light as well as the surrounding road lighting are segmented using two different adaptive thresholding. From the number of segmented blobs, taillights are first detected using the redness checking and random forest classifier based on Haar-like feature. For the headlight and taillight classification, we use the random forest instead of popular support vector machine or convolutional neural networks for supporting fast learning and testing in real-life applications. Pairing is performed by using the predefined geometric rules, such as vertical coordinate similarity and association check between blobs. The proposed algorithm was successfully applied to various driving sequences in night-time, and the results show that the performance of the proposed algorithms is better than that of recent related works.

Abnormal Crowd Behavior Detection via H.264 Compression and SVDD in Video Surveillance System (H.264 압축과 SVDD를 이용한 영상 감시 시스템에서의 비정상 집단행동 탐지)

  • Oh, Seung-Geun;Lee, Jong-Uk;Chung, Yongw-Ha;Park, Dai-Hee
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.21 no.6
    • /
    • pp.183-190
    • /
    • 2011
  • In this paper, we propose a prototype system for abnormal sound detection and identification which detects and recognizes the abnormal situations by means of analyzing audio information coming in real time from CCTV cameras under surveillance environment. The proposed system is composed of two layers: The first layer is an one-class support vector machine, i.e., support vector data description (SVDD) that performs rapid detection of abnormal situations and alerts to the manager. The second layer classifies the detected abnormal sound into predefined class such as 'gun', 'scream', 'siren', 'crash', 'bomb' via a sparse representation classifier (SRC) to cope with emergency situations. The proposed system is designed in a hierarchical manner via a mixture of SVDD and SRC, which has desired characteristics as follows: 1) By fast detecting abnormal sound using SVDD trained with only normal sound, it does not perform the unnecessary classification for normal sound. 2) It ensures a reliable system performance via a SRC that has been successfully applied in the field of face recognition. 3) With the intrinsic incremental learning capability of SRC, it can actively adapt itself to the change of a sound database. The experimental results with the qualitative analysis illustrate the efficiency of the proposed method.

The Effect of Meta-Features of Multiclass Datasets on the Performance of Classification Algorithms (다중 클래스 데이터셋의 메타특징이 판별 알고리즘의 성능에 미치는 영향 연구)

  • Kim, Jeonghun;Kim, Min Yong;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.23-45
    • /
    • 2020
  • Big data is creating in a wide variety of fields such as medical care, manufacturing, logistics, sales site, SNS, and the dataset characteristics are also diverse. In order to secure the competitiveness of companies, it is necessary to improve decision-making capacity using a classification algorithm. However, most of them do not have sufficient knowledge on what kind of classification algorithm is appropriate for a specific problem area. In other words, determining which classification algorithm is appropriate depending on the characteristics of the dataset was has been a task that required expertise and effort. This is because the relationship between the characteristics of datasets (called meta-features) and the performance of classification algorithms has not been fully understood. Moreover, there has been little research on meta-features reflecting the characteristics of multi-class. Therefore, the purpose of this study is to empirically analyze whether meta-features of multi-class datasets have a significant effect on the performance of classification algorithms. In this study, meta-features of multi-class datasets were identified into two factors, (the data structure and the data complexity,) and seven representative meta-features were selected. Among those, we included the Herfindahl-Hirschman Index (HHI), originally a market concentration measurement index, in the meta-features to replace IR(Imbalanced Ratio). Also, we developed a new index called Reverse ReLU Silhouette Score into the meta-feature set. Among the UCI Machine Learning Repository data, six representative datasets (Balance Scale, PageBlocks, Car Evaluation, User Knowledge-Modeling, Wine Quality(red), Contraceptive Method Choice) were selected. The class of each dataset was classified by using the classification algorithms (KNN, Logistic Regression, Nave Bayes, Random Forest, and SVM) selected in the study. For each dataset, we applied 10-fold cross validation method. 10% to 100% oversampling method is applied for each fold and meta-features of the dataset is measured. The meta-features selected are HHI, Number of Classes, Number of Features, Entropy, Reverse ReLU Silhouette Score, Nonlinearity of Linear Classifier, Hub Score. F1-score was selected as the dependent variable. As a result, the results of this study showed that the six meta-features including Reverse ReLU Silhouette Score and HHI proposed in this study have a significant effect on the classification performance. (1) The meta-features HHI proposed in this study was significant in the classification performance. (2) The number of variables has a significant effect on the classification performance, unlike the number of classes, but it has a positive effect. (3) The number of classes has a negative effect on the performance of classification. (4) Entropy has a significant effect on the performance of classification. (5) The Reverse ReLU Silhouette Score also significantly affects the classification performance at a significant level of 0.01. (6) The nonlinearity of linear classifiers has a significant negative effect on classification performance. In addition, the results of the analysis by the classification algorithms were also consistent. In the regression analysis by classification algorithm, Naïve Bayes algorithm does not have a significant effect on the number of variables unlike other classification algorithms. This study has two theoretical contributions: (1) two new meta-features (HHI, Reverse ReLU Silhouette score) was proved to be significant. (2) The effects of data characteristics on the performance of classification were investigated using meta-features. The practical contribution points (1) can be utilized in the development of classification algorithm recommendation system according to the characteristics of datasets. (2) Many data scientists are often testing by adjusting the parameters of the algorithm to find the optimal algorithm for the situation because the characteristics of the data are different. In this process, excessive waste of resources occurs due to hardware, cost, time, and manpower. This study is expected to be useful for machine learning, data mining researchers, practitioners, and machine learning-based system developers. The composition of this study consists of introduction, related research, research model, experiment, conclusion and discussion.

Suggestion of Urban Regeneration Type Recommendation System Based on Local Characteristics Using Text Mining (텍스트 마이닝을 활용한 지역 특성 기반 도시재생 유형 추천 시스템 제안)

  • Kim, Ikjun;Lee, Junho;Kim, Hyomin;Kang, Juyoung
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.3
    • /
    • pp.149-169
    • /
    • 2020
  • "The Urban Renewal New Deal project", one of the government's major national projects, is about developing underdeveloped areas by investing 50 trillion won in 100 locations on the first year and 500 over the next four years. This project is drawing keen attention from the media and local governments. However, the project model which fails to reflect the original characteristics of the area as it divides project area into five categories: "Our Neighborhood Restoration, Housing Maintenance Support Type, General Neighborhood Type, Central Urban Type, and Economic Base Type," According to keywords for successful urban regeneration in Korea, "resident participation," "regional specialization," "ministerial cooperation" and "public-private cooperation", when local governments propose urban regeneration projects to the government, they can see that it is most important to accurately understand the characteristics of the city and push ahead with the projects in a way that suits the characteristics of the city with the help of local residents and private companies. In addition, considering the gentrification problem, which is one of the side effects of urban regeneration projects, it is important to select and implement urban regeneration types suitable for the characteristics of the area. In order to supplement the limitations of the 'Urban Regeneration New Deal Project' methodology, this study aims to propose a system that recommends urban regeneration types suitable for urban regeneration sites by utilizing various machine learning algorithms, referring to the urban regeneration types of the '2025 Seoul Metropolitan Government Urban Regeneration Strategy Plan' promoted based on regional characteristics. There are four types of urban regeneration in Seoul: "Low-use Low-Level Development, Abandonment, Deteriorated Housing, and Specialization of Historical and Cultural Resources" (Shon and Park, 2017). In order to identify regional characteristics, approximately 100,000 text data were collected for 22 regions where the project was carried out for a total of four types of urban regeneration. Using the collected data, we drew key keywords for each region according to the type of urban regeneration and conducted topic modeling to explore whether there were differences between types. As a result, it was confirmed that a number of topics related to real estate and economy appeared in old residential areas, and in the case of declining and underdeveloped areas, topics reflecting the characteristics of areas where industrial activities were active in the past appeared. In the case of the historical and cultural resource area, since it is an area that contains traces of the past, many keywords related to the government appeared. Therefore, it was possible to confirm political topics and cultural topics resulting from various events. Finally, in the case of low-use and under-developed areas, many topics on real estate and accessibility are emerging, so accessibility is good. It mainly had the characteristics of a region where development is planned or is likely to be developed. Furthermore, a model was implemented that proposes urban regeneration types tailored to regional characteristics for regions other than Seoul. Machine learning technology was used to implement the model, and training data and test data were randomly extracted at an 8:2 ratio and used. In order to compare the performance between various models, the input variables are set in two ways: Count Vector and TF-IDF Vector, and as Classifier, there are 5 types of SVM (Support Vector Machine), Decision Tree, Random Forest, Logistic Regression, and Gradient Boosting. By applying it, performance comparison for a total of 10 models was conducted. The model with the highest performance was the Gradient Boosting method using TF-IDF Vector input data, and the accuracy was 97%. Therefore, the recommendation system proposed in this study is expected to recommend urban regeneration types based on the regional characteristics of new business sites in the process of carrying out urban regeneration projects."