Feature Selection Algorithm for Intrusions Detection System using Sequential Forward Search and Random Forest Classifier

Lee, Jinlee;Park, Dooho;Lee, Changhoon;

doi:10.3837/tiis.2017.10.024

KSII Transactions on Internet and Information Systems (TIIS)

Vol. 11, No. 10 / pp. 5132-5148 / 2017 / 1976-7277 (pISSN) / 1976-7277 (eISSN)

Korean Society for Internet Information

Lee, Jinlee (Division of Computer Science and Engineering, Konkuk University) ;
Park, Dooho (Intelligent Service Development Team, XIIlab Co. Ltd) ;
Lee, Changhoon (Division of Computer Science and Engineering, Konkuk University)

Received: 2017.05.08
Reviewed: 2017.08.29
Published: 2017.10.31

https://doi.org/10.3837/tiis.2017.10.024

Abstract

Cyber attacks are evolving in step with recent developments in information security technology. Intrusion detection systems collect various types of data from computers and networks to detect security threats and analyze attack information. Because of the large amount of data examined, the heavy computational load and low detection rates become problematic. Feature selection is expected to improve classification performance and to provide faster, more cost-effective results. Although various feature selection studies have been conducted for intrusion detection systems, feature selection remains difficult to automate because it relies on the knowledge of security experts. This paper proposes a feature selection technique to overcome the performance problems of intrusion detection systems. The first phase of the proposed system constructs a feature subset using sequential forward floating search (SFFS) to reduce the dimensionality of the data. The second phase builds a classification model from the selected feature subset using a random forest classifier (RFC) and evaluates its classification accuracy. Experiments were conducted on the NSL-KDD dataset using the combined method, SFFS-RF, and the results indicated that feature selection is a necessary preprocessing step for improving overall performance in systems that handle large datasets. They also verified that SFFS-RF can be used for data classification. In conclusion, SFFS-RF could be key to improving the performance of classification models in machine learning.
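To make the first phase more concrete, the following is a minimal sketch of sequential forward floating search in plain Python. It is not the authors' implementation: the function `sffs`, the feature names `a` to `d`, and the scoring function `toy_score` with its merit and redundancy values are all hypothetical and chosen only to show how the floating (backward) step can undo an early greedy choice. The abstract does not state which criterion guides the search, so the score is left as a pluggable callback.

```python
from typing import Callable, Dict, FrozenSet, Sequence, Tuple

Subset = FrozenSet[str]


def sffs(features: Sequence[str],
         score: Callable[[Subset], float],
         k: int) -> Tuple[Subset, float]:
    """Sequential forward floating search for a k-feature subset.

    `score` is any subset-quality criterion (higher is better).
    Returns the best subset of size k found and its score.
    """
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of features")

    selected: Subset = frozenset()
    best: Dict[int, Tuple[Subset, float]] = {}  # best subset seen per size

    while len(selected) < k:
        # Forward step: add the single feature that helps the most.
        candidates = [f for f in features if f not in selected]
        added = max(candidates, key=lambda f: score(selected | {f}))
        selected = selected | {added}
        current = score(selected)
        size = len(selected)
        if size not in best or current > best[size][1]:
            best[size] = (selected, current)

        # Floating step: drop features while doing so strictly beats the
        # best subset previously recorded at the smaller size.
        while len(selected) > 2:
            dropped = max(sorted(selected),
                          key=lambda f: score(selected - {f}))
            reduced = selected - {dropped}
            reduced_score = score(reduced)
            if reduced_score <= best[len(reduced)][1]:
                break
            selected = reduced
            best[len(reduced)] = (reduced, reduced_score)

    return best[k]


# Hypothetical toy criterion: per-feature merit minus redundancy penalties.
MERIT = {"a": 60, "b": 50, "c": 50, "d": 10}
REDUNDANT = {frozenset("ab"): 30, frozenset("ac"): 30}


def toy_score(subset: Subset) -> int:
    merit = sum(MERIT[f] for f in subset)
    penalty = sum(p for pair, p in REDUNDANT.items() if pair <= subset)
    return merit - penalty


subset, value = sffs(["a", "b", "c", "d"], toy_score, 3)
print(sorted(subset), value)  # ['b', 'c', 'd'] 110
```

In this toy setting, a purely greedy forward search would keep `a` (the best single feature) and end at `{a, b, c}` with a score of 100, while the floating step discards `a` once `{b, c}` proves better and reaches `{b, c, d}` with 110. In a two-phase setup like the one described above, one plausible choice for `score` would be the validation accuracy of a classifier trained on the candidate subset; that choice is an assumption here, not something the abstract specifies.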
