A Feature Selection Technique based on Distributional Differences

  • Published : 2006.03.01

Abstract

This paper presents a feature selection technique based on distributional differences for efficient machine learning. The initial training data consist of examples with many features and a target value. We classify the examples into positive and negative data according to the target value, divide the range of each feature's values into 10 intervals, and compute the distribution over those intervals separately for the positive and the negative data. We then select the features, and the intervals of those features, whose distributional differences exceed a given threshold; restricting the training data to the selected features and intervals yields reduced training data. Our experiments show that the reduced training data cut the training time of a neural network by about 40%, and that the functions trained on them yield higher profit in simulated stock trading.
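The selection step lends itself to a short illustration. Below is a minimal sketch in Python, assuming numeric features and binary positive/negative labels; the function name `select_intervals` and the `threshold` default are assumptions for illustration, while the 10-interval split follows the abstract.

```python
import numpy as np

def select_intervals(X, y, n_bins=10, threshold=0.1):
    """For each feature, return the intervals (bin indices) whose relative
    frequencies in the positive and negative data differ by more than
    `threshold`; features with no such interval are dropped.
    NOTE: equal-width binning and the threshold value are assumptions."""
    pos, neg = X[y == 1], X[y == 0]
    selected = {}
    for j in range(X.shape[1]):
        lo, hi = X[:, j].min(), X[:, j].max()
        if lo == hi:
            continue  # a constant feature shows no distributional difference
        edges = np.linspace(lo, hi, n_bins + 1)  # 10 equal-width intervals
        pos_counts, _ = np.histogram(pos[:, j], bins=edges)
        neg_counts, _ = np.histogram(neg[:, j], bins=edges)
        # Relative frequency of each interval within the positive/negative data.
        pos_dist = pos_counts / max(len(pos), 1)
        neg_dist = neg_counts / max(len(neg), 1)
        # Keep the intervals whose distributional difference exceeds the threshold.
        bins = np.where(np.abs(pos_dist - neg_dist) > threshold)[0]
        if bins.size > 0:
            selected[j] = bins  # feature j is kept, with these intervals
    return selected
```

Under these assumptions, the reduced training data would retain only the returned features and, for each of them, only the examples whose values fall into one of the selected intervals, consistent with the reduction described in the abstract.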
