Search | Korea Science

Predicting stock price direction by using data mining methods : Emphasis on comparing single classifiers and ensemble classifiers

Eo, Kyun Sun;Lee, Kun Chang
- Journal of the Korea Society of Computer and Information / v.22 no.11 / pp.111-116 / 2017
This paper proposes a data mining approach to predicting stock price direction. The stock market fluctuates due to many factors, so predicting stock price direction has become an important issue in stock market analysis. However, few studies in the literature apply data mining approaches to predicting stock price direction. To contribute to the literature, this paper compares single classifiers and ensemble classifiers. The single classifiers are logistic regression, decision tree, neural network, and support vector machine. The ensemble classifiers are AdaBoost, random forest, bagging, stacking, and vote. For the experiments, we gathered a dataset from the Korea Stock Exchange (KRX) covering 2008 to 2015. Data mining experiments using WEKA revealed that random forest, one of the ensemble classifiers, shows the best results in terms of metrics such as AUC (area under the ROC curve) and accuracy.
https://doi.org/10.9708/jksci.2017.22.11.111
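
The two metrics named in the abstract above can be illustrated with a minimal sketch. It does not reproduce the paper's WEKA experiments: the labels and scores are made-up toy values, and the function names `accuracy` and `auc` are assumptions chosen for this example. AUC is computed here through its pairwise interpretation, the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = price went up, 0 = price went down.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.6]   # made-up classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

Accuracy depends on the 0.5 decision threshold used to turn scores into labels, whereas AUC depends only on how the scores rank the examples, which is one reason the two are often reported together.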

A Multi-Level Integrator with Programming Based Boosting for Person Authentication Using Different Biometrics

Kundu, Sumana;Sarker, Goutam
- Journal of Information Processing Systems / v.14 no.5 / pp.1114-1135 / 2018
A multiple classification system based on a new boosting technique is proposed for person authentication. It uses different biometric traits: color face, iris and eye, along with fingerprints of the right and left hands, handwriting, palm-print, gait (silhouettes) and wrist-vein. The images of the biometric traits were taken from standard databases such as FEI, UTIRIS, CASIA, IAM and CIE. The system comprises three super-classifiers that individually perform person identification. The individual classifiers belonging to each super-classifier identify different biometric features, and their conclusions are integrated in their respective super-classifiers. The decisions of the individual super-classifiers are then integrated by a mega-super-classifier, which reaches the final conclusion using programming based boosting. The mega-super-classifier system, which uses different super-classifiers in a compact form, is more reliable than a single classifier or even a single super-classifier system. The system has been evaluated with accuracy, precision, recall and F-score metrics through the holdout method and confusion matrices for each of the single classifiers, the super-classifiers and finally the mega-super-classifier. The performance evaluations are appreciable, and the learning and recognition times are fairly reasonable, making the system efficient and effective.
https://doi.org/10.3745/JIPS.02.0094

Movie Review Classification Based on a Multiple Classifier

Tsutsumi, Kimitaka;Shimada, Kazutaka;Endo, Tsutomu
- Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.481-488 / 2007
In this paper, we propose a method to classify movie review documents into positive or negative opinions. There are several approaches to classifying documents. Previous studies, however, used only a single classifier for the classification task. We describe a multiple classifier for the review document classification task. The method consists of three classifiers based on SVMs, ME and score calculation. We apply two voting methods and SVMs to integrate the single classifiers. The integrated methods improved accuracy compared with the three single classifiers. The experimental results show the effectiveness of our method.

Improving Weak Classifiers by Using Discriminant Function in Selecting Threshold Values (판별 함수를 이용한 문턱치 선정에 의한 약분류기 개선)

Shyam, Adhikari;Yoo, Hyeon-Joong;Kim, Hyong-Suk
- The Journal of the Korea Contents Association / v.10 no.12 / pp.84-90 / 2010
In this paper, we propose a quadratic discriminant analysis based approach for improving the discriminating strength of weak classifiers based on simple Haar-like features, as used in the Viola-Jones object detection framework. Viola and Jones built a strong classifier using a boosted ensemble of weak classifiers. However, their weak classifier, which is based on a single threshold (or decision boundary), is sub-optimal and too weak for efficient discrimination between the object class and the background. We present a quadratic discriminant analysis based approach that leads to a hyper-quadric boundary between the object class and the background class, thus realizing weak classifiers based on multiple thresholds. Experiments on car detection, using 1000 positive and 3000 negative images for training and 500 positive and 500 negative images for testing, show that our method yields higher classification performance with fewer classifiers than single-threshold weak classifiers.
https://doi.org/10.5392/JKCA.2010.10.12.084

Appliance identification algorithm using multiple classifier system (다중 분류 시스템을 이용한 가전기기 식별 알고리즘)

Park, Yong-Soon;Chung, Tae-Yun;Park, Sung-Wook
- IEMEK Journal of Embedded Systems and Applications / v.10 no.4 / pp.213-219 / 2015
A real-time energy monitoring system is a demand-response system that is reported to save up to 12% of energy. Such a system is commonly composed of smart plugs, which sense how much electrical power is consumed, and an IHD (In-Home Display device), which displays power consumption patterns. Even though the monitoring system is effective, users must themselves match each smart plug to the appliance it is connected to. To automate this matching, the monitoring system needs an appliance identification algorithm, and some work has been done under the name of NILM (Non-Intrusive Load Monitoring). This paper proposes an algorithm that uses multiple classifiers to improve the accuracy of appliance identification. The algorithm estimates each classifier's performance, that is, how reliable a result is when a classifier produces it, and uses this estimate to choose the final result among the candidate results from the classifiers. With the proposed algorithm, accuracy improves by 4.5% over using the single best classifier and by 2.9% over another multiple-classifier method, the so-called CDM (Committee Decision Mechanism) method.
https://doi.org/10.14372/IEMEK.2015.10.4.213
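
The idea described in the abstract above, choosing among candidate results according to how reliable each classifier is when it produces a given result, might look roughly like the sketch below. This is not the paper's algorithm, whose details the abstract does not give. It assumes that reliability is estimated as per-label precision on a validation set, and the classifier names, appliance labels and helper functions are hypothetical.

```python
from collections import defaultdict


def estimate_reliability(val_true, val_pred):
    """Per predicted label, the fraction of validation predictions that were right."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred in zip(val_true, val_pred):
        totals[pred] += 1
        hits[pred] += int(truth == pred)
    return {label: hits[label] / totals[label] for label in totals}


def choose_final(candidates, reliabilities, default=0.0):
    """Return the candidate label backed by the highest reliability estimate.

    candidates:    {classifier_name: predicted_label}
    reliabilities: {classifier_name: {label: reliability}}
    Labels a classifier never predicted on validation data get `default`.
    """
    def score(item):
        name, label = item
        return reliabilities[name].get(label, default)

    # Sorting first makes tie-breaking deterministic (by classifier name).
    _, label = max(sorted(candidates.items()), key=score)
    return label


# Hypothetical validation data for two classifiers, "A" and "B".
val_true = ["tv", "tv", "fridge", "fridge", "lamp"]
val_pred = {
    "A": ["tv", "tv", "tv", "fridge", "lamp"],
    "B": ["tv", "fridge", "fridge", "fridge", "fridge"],
}
reliabilities = {name: estimate_reliability(val_true, preds)
                 for name, preds in val_pred.items()}

# A says "tv" (right 2 times out of 3), B says "fridge" (right 2 times out of 4).
final = choose_final({"A": "tv", "B": "fridge"}, reliabilities)
print(final)
```

The point of conditioning on the predicted label is that a classifier can be trustworthy for some appliances and unreliable for others, which a single overall accuracy figure would hide.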

A credit scoring model of a capital company's customers using genetic algorithm based integration of multiple classifiers (유전자알고리즘 기반 복수 분류모형 통합에 의한 캐피탈고객의 신용 스코어링 모형)

Kim Kap-Sik
- Journal of the Korea Society of Computer and Information / v.10 no.6 s.38 / pp.279-286 / 2005
The objective of this study is to suggest a credit scoring model for a capital company's customers by integrating multiple classifiers using a genetic algorithm. For this purpose, an integrated model is derived in two phases. In the first phase, three types of classifiers, MLP (Multi-Layered Perceptron), RBF (Radial Basis Function) and linear models, are trained, with three classifiers of each type, so that we have nine classifiers in total. In the second phase, the genetic algorithm is applied twice to integrate the classifiers. That is, after three models are derived from the three groups, a final one is derived from these three. As a result, our suggested model shows higher accuracy than any single model.

Complex Neural Classifiers for Power Quality Data Mining

Vidhya, S.;Kamaraj, V.
- Journal of Electrical Engineering and Technology / v.13 no.4 / pp.1715-1723 / 2018
This work investigates the performance of fully complex-valued radial basis function network (FC-RBF) and complex extreme learning machine (CELM) based neural approaches for classifying power quality disturbances. The S-Transform is used to extract features relating to single and combined power quality disturbances. The performance of the classifiers is compared with that of their real-valued counterparts, namely extreme learning machine (ELM) and support vector machine (SVM), in terms of convergence and classification ability. The results indicate the suitability of complex-valued classifiers for power quality disturbance classification.
https://doi.org/10.5370/JEET.2018.13.4.1715

Optimization of Random Subspace Ensemble for Bankruptcy Prediction (재무부실화 예측을 위한 랜덤 서브스페이스 앙상블 모형의 최적화)

Min, Sung-Hwan
- Journal of Information Technology Services / v.14 no.4 / pp.121-135 / 2015
Ensemble classification uses multiple classifiers instead of a single classifier. Recently, ensemble classifiers have attracted much attention in the data mining community. Ensemble learning techniques have proved very useful for improving prediction accuracy. Bagging, boosting and random subspace are the most popular ensemble methods. In random subspace, each base classifier is trained on a randomly chosen feature subspace of the original feature space. The outputs of the different base classifiers are usually aggregated by a simple majority vote. In this study, we applied the random subspace method to the bankruptcy problem. Moreover, we proposed a method for optimizing the random subspace ensemble. A genetic algorithm was used to optimize the classifier subset of the random subspace ensemble for bankruptcy prediction. We applied the proposed genetic algorithm based random subspace ensemble model to the bankruptcy prediction problem using a real data set and compared it with other models. Experimental results showed that the proposed model outperformed the other models.
https://doi.org/10.9716/KITS.2015.14.4.121
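
The random subspace method summarized in the abstract above (each base classifier trained on a randomly chosen feature subset, outputs aggregated by simple majority vote) can be sketched as follows. This is an illustration only and leaves out the paper's genetic algorithm optimization. The nearest-centroid base learner, the function names and the four-feature toy data are all assumptions made for this example.

```python
import random
from collections import Counter


def train_centroids(X, y, features):
    """Nearest-centroid base learner restricted to the given feature indices."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_centroids(centroids, features, row):
    """Return the label whose centroid is closest within the chosen subspace."""
    def dist(label):
        c = centroids[label]
        return sum((row[f] - c[j]) ** 2 for j, f in enumerate(features))
    return min(sorted(centroids), key=dist)


def fit_random_subspace(X, y, n_estimators=15, subspace_size=2, seed=0):
    """Train each base classifier on a randomly chosen subset of the features."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_estimators):
        features = sorted(rng.sample(range(n_features), subspace_size))
        ensemble.append((features, train_centroids(X, y, features)))
    return ensemble


def predict_majority(ensemble, row):
    """Aggregate the base classifiers' outputs by simple majority vote."""
    votes = Counter(predict_centroids(c, f, row) for f, c in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: four made-up financial ratios per firm.
X = [[0.1, 0.2, 0.1, 0.3], [0.2, 0.1, 0.3, 0.2],
     [0.9, 0.8, 0.7, 0.9], [0.8, 0.9, 0.9, 0.7]]
y = ["healthy", "healthy", "bankrupt", "bankrupt"]

ensemble = fit_random_subspace(X, y)
print(predict_majority(ensemble, [0.15, 0.2, 0.2, 0.2]))
print(predict_majority(ensemble, [0.85, 0.9, 0.8, 0.8]))
```

Using an odd number of base classifiers avoids tied votes in a two-class problem. An optimization step such as the one the abstract describes would then select which members of `ensemble` to keep.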

A Genetic Algorithm-based Classifier Ensemble Optimization for Activity Recognition in Smart Homes

Fatima, Iram;Fahim, Muhammad;Lee, Young-Koo;Lee, Sungyoung
- KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.11 / pp.2853-2873 / 2013
Over the last few years, one of the most common purposes of smart homes has been to provide human-centric services in the domain of u-healthcare by analyzing inhabitants' daily living. Currently, a major challenge in activity recognition is the reliability of each classifier's predictions, which differs according to the characteristics of the smart home. Smart homes vary in terms of performed activities, deployed sensors, environment settings, and inhabitants' characteristics. No single classifier always performs better than all the others in every possible situation. This observation has motivated combining multiple classifiers to take advantage of their complementary performance for high accuracy. Therefore, in this paper, a method for activity recognition is proposed that optimizes the output of multiple classifiers with a Genetic Algorithm (GA). Our proposed method combines the measurement-level output of different classifiers for each activity class to make up the ensemble. To evaluate the proposed method, experiments are performed on three real datasets from the CASAS smart home. The results show that our method systematically outperforms single classifier and traditional multiclass models. The F-measure of recognized activities improves significantly, from 0.82 to 0.90, compared to existing methods.
https://doi.org/10.3837/tiis.2013.11.018

Improving an Ensemble Model Using Instance Selection Method (사례 선택 기법을 활용한 앙상블 모형의 성능 개선)

Min, Sung-Hwan
- Journal of Korean Society of Industrial and Systems Engineering / v.39 no.1 / pp.105-115 / 2016
Ensemble classification involves combining individually trained classifiers to yield more accurate predictions than individual models. Ensemble techniques are very useful for improving the generalization ability of classifiers. The random subspace ensemble technique is a simple but effective method for constructing ensemble classifiers; it involves randomly drawing some of the features for each classifier in the ensemble. The instance selection technique involves selecting critical instances while removing irrelevant and noisy instances from the original dataset. The instance selection and random subspace methods are both well known in the field of data mining and have proven to be very effective in many applications. However, few studies have focused on integrating the instance selection and random subspace methods. Therefore, this study proposed a new hybrid ensemble model that integrates instance selection and random subspace techniques using genetic algorithms (GAs) to improve the performance of a random subspace ensemble model. GAs are used to select optimal (or near optimal) instances, which are used as input data for the random subspace ensemble model. The proposed model was applied to both Kaggle credit data and corporate credit data, and the results were compared with those of other models to investigate performance in terms of classification accuracy, levels of diversity, and average classification rates of base classifiers in the ensemble. The experimental results demonstrated that the proposed model outperformed other models including the single model, the instance selection model, and the original random subspace ensemble model.
https://doi.org/10.11627/jkise.2016.39.1.105

Search Result 66, Processing Time 0.019 seconds
