• Title/Summary/Keyword: SVM Model

A Classification Model for Illegal Debt Collection Using Rule and Machine Learning Based Methods

  • Kim, Tae-Ho;Lim, Jong-In
    • Journal of the Korea Society of Computer and Information / v.26 no.4 / pp.93-103 / 2021
  • Despite financial authorities' efforts to directly manage and supervise collection agents and to issue debt-collection guidelines, illegal and unfair debt collection still exists. To prevent such activities effectively, a method is needed that strengthens monitoring of illegal collection even with limited manpower, using technologies such as machine learning on unstructured data. In this study, we propose a classification model for illegal debt collection that combines machine learning, such as the Support Vector Machine (SVM), with a rule-based technique; the model obtains collection transcripts from loan companies and converts them into text data to identify illegal activities. The study also compares identification accuracy across machine learning algorithms. The results show that combining rule-based detection rules with machine learning yields higher classification accuracy than the model of the previous study, which applied machine learning alone. This study is the first attempt to classify illegal activity by combining rule-based detection rules with machine learning. If further research improves the model's completeness, it will contribute greatly to preventing consumer harm from illegal debt collection.

A study on EPB shield TBM face pressure prediction using machine learning algorithms (머신러닝 기법을 활용한 토압식 쉴드TBM 막장압 예측에 관한 연구)

  • Kwon, Kibeom;Choi, Hangseok;Oh, Ju-Young;Kim, Dongku
    • Journal of Korean Tunnelling and Underground Space Association / v.24 no.2 / pp.217-230 / 2022
  • Adequate control of TBM face pressure is vital for maintaining face stability by preventing face collapse and surface settlement. An EPB shield TBM excavates the ground by applying face pressure with the excavated soil in the pressure chamber. One of the challenges during EPB shield TBM operation is controlling the face pressure, because the excavated soil is difficult to manage. In this study, the face pressure of an EPB shield TBM was predicted using geological and operational data acquired from a domestic TBM tunnel site. Four machine learning algorithms, KNN (K-Nearest Neighbors), SVM (Support Vector Machine), RF (Random Forest), and XGB (eXtreme Gradient Boosting), were applied to predict the face pressure. The model comparison showed that the RF model yielded the lowest RMSE (Root Mean Square Error), 7.35 kPa, so it was selected as the optimal machine learning algorithm. In addition, the feature importance of the RF model was analyzed to properly evaluate the influence of each feature on the face pressure. The water pressure had the highest influence, and at the considered site the geological conditions were generally more important than the operational features.
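
The model comparison in this abstract rests on ranking candidate models by RMSE and keeping the one with the lowest value. A minimal sketch of that selection step follows. The face-pressure values and the per-model predictions are hypothetical numbers chosen for illustration, not data from the paper, and the helper names `rmse` and `select_best` are assumptions.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def select_best(y_true, predictions):
    """Return the name of the model with the lowest RMSE and all scores."""
    scores = {name: rmse(y_true, pred) for name, pred in predictions.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical measured face pressures in kPa (illustrative only).
measured = [150.0, 160.0, 170.0, 180.0]

# Hypothetical predictions from the four model families named in the abstract.
predictions = {
    "KNN": [140.0, 165.0, 160.0, 190.0],
    "SVM": [155.0, 150.0, 175.0, 170.0],
    "RF": [152.0, 158.0, 171.0, 183.0],
    "XGB": [147.0, 164.0, 166.0, 185.0],
}

best_model, rmse_by_model = select_best(measured, predictions)
print(best_model, round(rmse_by_model[best_model], 3))
```

In practice the RMSE would be computed on held-out data rather than on the training set, so that the ranking reflects generalization error.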

Stress Identification and Analysis using Observed Heart Beat Data from Smart HRM Sensor Device

  • Pramanta, SPL Aditya;Kim, Myonghee;Park, Man-Gon
    • Journal of Korea Multimedia Society / v.20 no.8 / pp.1395-1405 / 2017
  • In this paper, we analyze heartbeat data to identify a subject's stress state (binary) using heart rate variability (HRV) features extracted from the subject's heartbeat data, and we apply supervised machine learning techniques to build a mental stress classifier. Four steps precede performance measurement: data acquisition, data processing (HRV analysis), feature selection, and machine learning. The HRV analysis module generates 56 features, several of which are selected (using the authors' own algorithm) after computing the Pearson correlation matrix (p-values). The selected features are compared against the full feature set by model error after training with several machine learning techniques: support vector machine, decision tree, and discriminant analysis. The SVM and decision tree models trained on the selected features give results close to those obtained with all features, differing by only 1%, whereas discriminant analysis differs by about 5%. All machine learning methods used in this work reach a maximum average accuracy of 90%.
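
The abstract does not describe the authors' own selection algorithm, so the following is only a sketch of one common way to use Pearson correlation for feature selection: keep the features whose absolute correlation with the binary stress label meets a threshold. The feature names (`mean_rr`, `sdnn`, `noise`), the sample values, and the 0.5 threshold are all hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence carries no linear information
    return cov / math.sqrt(var_x * var_y)


def select_features(features, labels, threshold=0.5):
    """Keep features whose |r| with the label is at least the threshold."""
    return [
        name
        for name, values in features.items()
        if abs(pearson_r(values, labels)) >= threshold
    ]


# Hypothetical binary stress labels (0 = relaxed, 1 = stressed).
labels = [0, 0, 0, 1, 1, 1]

# Hypothetical HRV feature columns, one value per recording.
features = {
    "mean_rr": [0.90, 0.85, 0.88, 0.70, 0.65, 0.72],
    "sdnn": [50.0, 48.0, 52.0, 30.0, 28.0, 33.0],
    "noise": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
}

selected = select_features(features, labels)
print(selected)
```

A threshold filter of this kind ignores redundancy between features; a fuller approach would also inspect the feature-to-feature entries of the correlation matrix.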

A New Fine-grain SMS Corpus and Its Corresponding Classifier Using Probabilistic Topic Model

  • Ma, Jialin;Zhang, Yongjun;Wang, Zhijian;Chen, Bolun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.2 / pp.604-625 / 2018
  • SMS spam is now widespread in many countries, and the standards for filtering it differ from country to country. However, current technologies and research on SMS spam filtering all focus on dividing SMS messages into two classes, legitimate and illegitimate, which does not match actual conditions and needs. They also face several difficulties: (1) High-quality, large-scale SMS spam corpora are very scarce, and finely categorized SMS spam corpora do not exist at all, which seriously hampers research. (2) The limited length of SMS messages leads to a lack of features. These factors seriously degrade the performance of traditional classifiers (such as SVM, K-NN, and Bayes). In this paper, we present a new finely categorized SMS spam corpus that is unique and, as far as we know, the largest of its kind. In addition, we propose a classifier based on a probabilistic topic model, which can alleviate the feature sparsity problem in SMS spam filtering. We compare the approach with three typical classifiers on the new SMS spam corpus. The experimental results show that the proposed approach is more effective for SMS spam filtering.

A Framework for Human Body Parts Detection in RGB-D Image (RGB-D 이미지에서 인체 영역 검출을 위한 프레임워크)

  • Hong, Sungjin;Kim, Myounggyu
    • Journal of Korea Multimedia Society / v.19 no.12 / pp.1927-1935 / 2016
  • This paper proposes a framework for detecting human body parts in RGB-D images. To detect the hands, feet, and head, which are characterized by long accumulative geodesic distances, we perform three tasks: obtaining the person area, finding candidate areas, and local detection. The person area is obtained by background subtraction and noise removal using the depth image, which is robust to illumination changes. To find candidate areas, a graph model is constructed that allows the accumulative geodesic distance of the candidates to be measured. Instead of the raw depth map, our approach builds the graph model from regions segmented by a quadtree structure, which reduces the search time for candidates. Local detection uses a HOG-based SVM for each part, and the head is detected first. To minimize false detections of hands and feet, the candidates are classified as upper or lower body using the head position and properties of the geodesic distance; the hands and feet are then detected with the local detectors. We evaluate our algorithm on datasets collected with a Kinect v2 sensor, and our approach shows good performance for head, hand, and foot detection.
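
The candidate search in this abstract relies on accumulative geodesic distance measured over a graph of body regions. As a rough sketch of the idea, the code below runs Dijkstra's algorithm from a torso node and reports the nodes that are local maxima of geodesic distance, which is where extremities such as the head, hands, and feet would be expected. The tiny graph, its node names, and its edge weights are hypothetical; the paper builds its graph from quadtree-segmented depth regions, which is not reproduced here.

```python
import heapq


def build_graph(edges):
    """Build an undirected adjacency list from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


def geodesic_distances(graph, source):
    """Shortest-path (geodesic) distance from source to every node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def extremity_candidates(graph, dist):
    """Nodes farther from the source than all of their neighbors."""
    return sorted(
        n for n in graph if all(dist[n] > dist[m] for m, _ in graph[n])
    )


# Hypothetical body-region graph; weights stand in for surface distances.
edges = [
    ("torso", "neck", 1.0), ("neck", "head", 1.0),
    ("torso", "l_shoulder", 1.0), ("l_shoulder", "l_hand", 3.0),
    ("torso", "r_shoulder", 1.0), ("r_shoulder", "r_hand", 3.0),
    ("torso", "hip", 1.0), ("hip", "l_foot", 4.0), ("hip", "r_foot", 4.0),
]
graph = build_graph(edges)
dist = geodesic_distances(graph, "torso")
candidates = extremity_candidates(graph, dist)
print(candidates)
```

Each candidate would then be passed to a part-specific local detector, which the abstract describes as a HOG-based SVM.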

Determinants of the intention to use information services (서비스 가치 관점에서의 e-정보서비스 사용 의도에 관한 연구)

  • Han Jung-hee;Chang Hwal-sik
    • The Journal of Information Systems / v.13 no.1 / pp.97-119 / 2004
  • Recently, many e-business companies have started to charge fees for the use of information content services. However, little is known about how users evaluate and decide to purchase information services. Past technology adoption research has focused primarily on positive utility gains, namely usefulness and ease of use. Purchasing an e-service, however, involves not only positive utilities but also negative ones. This research uses the service value model (SVM) to explain users' intention to purchase a new information service. Based on the Perceived Value Framework, it investigates the impact of service quality and fee charges on users' perceived service value and, in turn, on their intention to adopt the e-service. One of the most important postulates of this research is that both service quality and the fee charge influence users' intention by affecting their perceived service value. This research presents a conceptual model of users' e-service evaluation process. The conceptual framework provides a basis for understanding how perceptions of quality and sacrifice influence value perceptions and purchase intentions. The empirical results suggest that both service quality and the fee charge influence perceived service value. However, they do not directly affect users' intention to purchase the e-service; they affect it through perceived service value. In conclusion, this research provides a foundation for further research on use-intention models of new e-services.

Classification Accuracy by Deviation-based Classification Method with the Number of Training Documents (학습문서의 개수에 따른 편차기반 분류방법의 분류 정확도)

  • Lee, Yong-Bae
    • Journal of Digital Convergence / v.12 no.6 / pp.325-332 / 2014
  • It is generally accepted that classification accuracy is affected by the number of training documents, but few studies show how this number influences automatic text classification. This study evaluates the recently developed deviation-based classification model for genre-based classification and compares it with other classification algorithms as the number of training documents changes. Experimental results show that the deviation-based classification model achieves a superior accuracy of 0.8 when categorizing 7 genres with only 21 training documents, exceeding the accuracy of the Bayesian and SVM classifiers. The deviation-based classification model retains strong feature selection capability even with a small number of training documents because it learns subject information within each genre, whereas the other methods use a different learning process.

Analysis of market share attraction data using LS-SVM (최소제곱 서포트벡터기계를 이용한 시장점유율 자료 분석)

  • Park, Hye-Jung
    • Journal of the Korean Data and Information Science Society / v.20 no.5 / pp.879-886 / 2009
  • The purpose of this article is to present an application of the Least Squares Support Vector Machine (LS-SVM) to analyzing existing brand structure. We estimate the parameters of the Market Share Attraction Model using LS-SVM, a non-parametric technique for function estimation that can perform nonlinear regression by constructing a linear regression function in a high-dimensional feature space. This property makes LS-SVM a good candidate for solving the Market Share Attraction Model. To illustrate the performance of the proposed method, we use car sales data from South Korea's car market.
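
To make the LS-SVM regression idea concrete, here is a minimal sketch of the standard formulation, in which training reduces to solving one linear system for a bias `b` and dual weights `alpha`. The RBF kernel, the values of `GAMMA` and `SIGMA`, and the toy data are assumptions chosen for illustration; the sketch does not reproduce the paper's Market Share Attraction Model or its car sales data.

```python
import math


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))


def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b, alpha]^T = [0, y]^T."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [
            rbf_kernel(X[i], X[j], sigma) + (1.0 / gamma if i == j else 0.0)
            for j in range(n)
        ])
    sol = solve_linear(A, [0.0] + list(y))
    return sol[0], sol[1:]


def lssvm_predict(X, b, alpha, x, sigma):
    """Evaluate f(x) = b + sum_i alpha_i * k(x_i, x)."""
    return b + sum(a * rbf_kernel(xi, x, sigma) for a, xi in zip(alpha, X))


# Hypothetical nonlinear toy data (y = x^2), not from the paper.
X_train = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y_train = [0.0, 1.0, 4.0, 9.0, 16.0]
GAMMA, SIGMA = 100.0, 1.0

b, alpha = lssvm_fit(X_train, y_train, GAMMA, SIGMA)
fitted = [lssvm_predict(X_train, b, alpha, x, SIGMA) for x in X_train]
print([round(v, 2) for v in fitted])
```

The regularization constant `GAMMA` trades off fit against smoothness: at each training point the residual equals `alpha_i / GAMMA`, so larger values pull the fit closer to the data.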

Real-time seismic structural response prediction system based on support vector machine

  • Lin, Kuang Yi;Lin, Tzu Kang;Lin, Yo
    • Earthquakes and Structures / v.18 no.2 / pp.163-170 / 2020
  • Floor acceleration plays a major role in the seismic design of nonstructural components and equipment supported by structures. Large floor acceleration may cause structural damage to or even collapse of buildings. For precision instruments in high-tech factories, even small floor accelerations can cause considerable damage. In this study, six P-wave parameters, namely the peak measurement of acceleration, peak measurement of velocity, peak measurement of displacement, effective predominant period, integral of squared velocity, and cumulative absolute velocity, were estimated from the first 3 s of a vertical ground acceleration time history. Subsequently, a new predictive algorithm was developed that uses these parameters, together with the floor height and fundamental period of the structure, as the inputs of a support vector regression model. Representative earthquakes recorded by the Structure Strong Earthquake Monitoring System of the Central Weather Bureau in Taiwan from 1992 to 2016 were used to construct the support vector regression model for predicting the peak floor acceleration (PFA) of each floor. The results indicated that the accuracy of the predicted PFA, defined as a PFA within a one-level difference from the measured PFA on Taiwan's seismic intensity scale, was 96.96%. The proposed system can be integrated into the existing earthquake early warning system to provide complete protection to life and the economy.
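
As an illustration of how the six P-wave parameters named in this abstract could be derived from the first 3 s of a vertical acceleration record, the sketch below integrates the record with the trapezoidal rule. The abstract gives no formulas, so the definitions used here are assumptions: in particular, the effective predominant period is computed as 2π·sqrt(∫d² dt / ∫v² dt), a definition widely used in earthquake early warning work, and baseline correction and filtering are omitted. The input signal is a synthetic constant acceleration, not seismic data.

```python
import math


def cumulative_trapezoid(values, dt):
    """Running trapezoidal integral of a uniformly sampled signal."""
    out = [0.0]
    for prev, cur in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * (prev + cur) * dt)
    return out


def p_wave_parameters(accel, dt, window_s=3.0):
    """Estimate six P-wave parameters from the first window_s seconds."""
    n = int(round(window_s / dt)) + 1
    a = list(accel[:n])
    if len(a) < 2:
        raise ValueError("need at least two samples")
    v = cumulative_trapezoid(a, dt)   # velocity
    d = cumulative_trapezoid(v, dt)   # displacement
    iv2 = cumulative_trapezoid([x * x for x in v], dt)[-1]
    id2 = cumulative_trapezoid([x * x for x in d], dt)[-1]
    cav = cumulative_trapezoid([abs(x) for x in a], dt)[-1]
    # Assumed definition of the effective predominant period.
    tau_c = 2.0 * math.pi * math.sqrt(id2 / iv2) if iv2 > 0 else 0.0
    return {
        "Pa": max(abs(x) for x in a),   # peak acceleration
        "Pv": max(abs(x) for x in v),   # peak velocity
        "Pd": max(abs(x) for x in d),   # peak displacement
        "tau_c": tau_c,                 # effective predominant period
        "IV2": iv2,                     # integral of squared velocity
        "CAV": cav,                     # cumulative absolute velocity
    }


# Synthetic record: constant unit acceleration sampled at 100 Hz for 3 s.
DT = 0.01
accel = [1.0] * 301
params = p_wave_parameters(accel, DT)
print({k: round(val, 4) for k, val in params.items()})
```

These six values, together with the floor height and the structure's fundamental period, would form the input vector of the support vector regression model described in the abstract.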

A Comparative Study on the Optimal Model for abnormal Detection event of Heart Rate Time Series Data Based on the Correlation between PPG and ECG (PPG와 ECG의 상관 관계에 기반한 심박 시계열 데이터 이상 상황 탐지 최적 모델 비교 연구)

  • Kim, Jin-soo;Lee, Kang-yoon
    • Journal of Internet Computing and Services / v.20 no.6 / pp.137-142 / 2019
  • Various services exist to detect and monitor abnormal events. However, most of them focus on fires and gas leaks, so they cannot prevent or respond to emergencies involving elderly or severely disabled people who live alone. In this study, AI models are designed and compared for detecting abnormal events in the heart rate signal, which is considered the most important of the various biosignals. Specifically, electrocardiogram (ECG) data are collected from PhysioNet's MIT-BIH Arrhythmia Database, an open medical dataset. The collected data are transformed in different ways. We then compare AI models trained on the transformed data and on the original ECG data.