Acknowledgement
Supported by: National Research Foundation of Korea
References
- Kang, Seung-Shik (2002). Korean Morphology and Information Retrieval. Hongrung Publishing Company.
- Kim, Seong-Hee, & Eom, Jae-Eun (2012). A study on the documents's automatic classification using machine learning. Journal of Information Management, 39(4), 47-66. http://dx.doi.org/10.1633/JIM.2008.39.4.047
- Kim, Yong-Hwan, & Chung, Young-Mee (2012). An experimental study on feature selection using wikipedia for text categorization. Journal of the Korean Society for Information Management, 29(2), 155-171. http://dx.doi.org/10.3743/KOSIM.2012.29.2.155
- Kim, Jong-Min, & Yoo, Chang D. (2014). Linear classifier optimization for feature acquisition cost-sensitive classification. In Proceedings of the IEEK Conference, 37(1), 2021-2024.
- Kim, Pan Jun (2006a). A study on automatic assignment of descriptors using machine learning. Journal of the Korean Society for Information Management, 23(1), 279-299. http://dx.doi.org/10.3743/KOSIM.2006.23.1.279
- Kim, Pan Jun (2006b). A study on the automatic descriptor assignment for scientific journal articles using rocchio algorithm. Journal of the Korean Society for Information Management, 23(3), 69-89. http://dx.doi.org/10.3743/KOSIM.2006.23.3.069
- Kim, Pan Jun (2008). A study on the performance improvement of rocchio classifier with term weighting methods. Journal of the Korean Society for Information Management, 25(1), 211-233. http://dx.doi.org/10.3743/KOSIM.2008.25.1.211
- Kim, Pan Jun (2016). An analytical study on performance factors of automatic classification based on machine learning. Journal of the Korean Society for Information Management, 33(2), 33-59. http://dx.doi.org/10.3743/KOSIM.2016.33.2.033
- Kim, Pan Jun, & Lee, Jae Yun (2007). Utilizing unlabeled documents in automatic classification with inter-document similarities. Journal of the Korean Society for Information Management, 24(1), 251-271. http://dx.doi.org/10.3743/KOSIM.2007.24.1.251
- Kim, Pan Jun, & Lee, Jae Yun (2012). A study on the reclassification of author keywords for automatic assignment of descriptors. Journal of the Korean Society for Information Management, 29(2), 225-246. http://dx.doi.org/10.3743/KOSIM.2012.29.2.225
- Kim, Pan Jun, & Lee, Jae Yun (2014). An experimental study on the performance improvement of automatic classification for the articles of korean journals based on controlled keywords in international database. Journal of the Korean Society for Library and Information Science, 48(3), 491-510. http://dx.doi.org/10.4275/KSLIS.2014.48.3.491
- Song, Sung-Jeon, & Chung, Young-Mee (2012). A study on improving the performance of document classification using the context of terms. Journal of the Korean Society for Information Management, 29(2), 205-224. http://dx.doi.org/10.3743/KOSIM.2012.29.2.205
- Shim, Kyung (2006). Optimization of number of training documents in text categorization. Journal of the Korean Society for Information Management, 23(4), 277-294. http://dx.doi.org/10.3743/KOSIM.2006.23.4.277
- Shim, Kyung, & Chung, Young-Mee (2006). The effect of the quality of pre-assigned subject categories on the text categorization performance. Journal of the Korean Society for Information Management, 23(2), 265-285. http://dx.doi.org/10.3743/KOSIM.2006.23.2.265
- Lee, Yong-Gu (2009). Classification performance analysis of cross-language text categorization using machine translation. Journal of the Korean Society for Library and Information Science, 43(1), 313-332. http://dx.doi.org/10.4275/kslis.2009.43.1.313
- Lee, Yong-Gu (2013). A study on feature selection for kNN classifier using document frequency and collection frequency. Journal of Korean Library and Information Science Society, 44(1), 27-47. http://dx.doi.org/10.16981/kliss.44.1.201303.27
- Lee, Jae Yun (2005a). Improving the performance of a fast text classifier with document-side feature selection. Journal of Information Management, 36(4), 51-69. http://dx.doi.org/10.1633/jim.2005.36.4.051
- Lee, Jae Yun (2005b). An empirical study on improving the performance of text categorization considering the relationships between feature selection criteria and weighting methods. Journal of the Korean Society for Library and Information Science, 39(2), 123-146. http://dx.doi.org/10.4275/kslis.2005.39.2.123
- Chung, Eun-Kyung (2009). A semantic-based feature expansion approach for improving the effectiveness of text categorization by using wordNet. Journal of the Korean Society for Information Management, 26(3), 261-278. http://dx.doi.org/10.3743/KOSIM.2009.26.3.261
- National Research Foundation of Korea (2016). Research Field Classification Scheme. Retrieved from http://www.nrf.re.kr
- Korea Citation Index (2018). Retrieved from https://www.kci.go.kr
- Al-Salemi, B., Aziz, M., Juzaiddin, A., & Noah, S. (2015). Boosting algorithms with topic modeling for multi-label text categorization: A comparative empirical study. Journal of Information Science, 41(5), 732-746. http://dx.doi.org/10.1177/0165551515590079
- Chen, E., Lin, Y., Xiong, H., Luo, Q., & Ma, H. (2011). Exploiting probabilistic topic models to improve text categorization under class imbalance. Information Processing and Management, 47(2), 202-214. https://doi.org/10.1016/j.ipm.2010.07.003
- Chen, Yao-Tsung, & Chen, Meng Chang (2011). Using chi-square statistics to measure similarities for text categorization. Expert Systems with Applications, 38(4), 3085-3090. https://doi.org/10.1016/j.eswa.2010.08.100
- Dalal, M. K., & Zaveri, M. A. (2012). Automatic text classification of sports blog data. In Proceedings of the IEEE International Conference on Computing, Communications and Applications (ComComAp 2012), Hong Kong, 11-13 January 2012, 219-222.
- Dalal, M. K., & Zaveri, M. A. (2013). Automatic classification of unstructured blog text. Journal of Intelligent Learning Systems and Applications, 5(2), 108-114. http://dx.doi.org/10.4236/jilsa.2013.52012
- Eriksson, Tobias (2013). Automatic web page categorization using text classification methods. Master's Degree Project in Computer Science CSC School of Computer Science and Communication.
- Foulds, J., & Frank, E. (2010). A review of multi-instance learning assumptions. The Knowledge Engineering Review, 25(1), 1-25. https://doi.org/10.1017/S026988890999035X
- Hmeidi, I., Al-Ayyoub, M., Abdulla, N. A., Almodawar, A. A., Abooraig, R., & Mahyoub, N. A. (2015). Automatic arabic text categorization: A comprehensive comparative study. Journal of Information Science, 41(1), 114-124. https://doi.org/10.1177/0165551514558172
- Jiang, S., Pang, G., Wu, M., & Kuang, L. (2012). An improved k-nearest-neighbor algorithm for text categorization. Expert Systems with Applications, 39(1), 1503-1509. https://doi.org/10.1016/j.eswa.2011.08.040
- Jindal, Rajni, Malhotra, Ruchika, & Jain, Abha. (2015). Techniques for text classification: Literature review and current trends. Webology, 12(2), 2-28.
- Joorabchi, A., & Mahdi, A. E. (2011). An unsupervised approach to automatic classification of scientific literature utilizing bibliographic metadata. Journal of Information Science, 37(5), 499-514. https://doi.org/10.1177/0165551511417785
- Khan, A., Baharudin, B., & Lee, L. H. (2010). A review of machine learning algorithms for text-documents classification. Journal of Advances in Information Technology, 1(1), 4-20. https://doi.org/10.4304/jait.1.1.4-20
- Kumar, M. A., & Gopal, M. (2010). A comparison study on multiple binary-class SVM methods for unilabel text categorization. Pattern Recognition Letters, 31(11), 1437-1444. https://doi.org/10.1016/j.patrec.2010.02.015
- Li, C. H., & Park, S. C. (2009). An efficient document classification model using an improved back propagation neural network and singular value decomposition. Expert Systems with Applications, 36(2), 3208-3215. https://doi.org/10.1016/j.eswa.2008.01.014
- Liu, Y., Loh, H. T., Yousef-Toumi, K., & Tor, S. B. (2007). Handling of imbalanced data in text classification: category-based term weights. In Kao, A., & Poteet, S. R. eds. Natural Language Processing and Text Mining. Springer, 171-192. https://doi.org/10.1007/978-1-84628-754-1_10
- Miao, Yun-Qian, & Kamel, Mohamed (2011). Pairwise optimized rocchio algorithm for text categorization. Pattern Recognition Letters, 32(2), 375-382. https://doi.org/10.1016/j.patrec.2010.09.018
- Pawar, P. Y., & Gawande, S. H. (2012). Comparative study on different types of approaches to text categorization. International Journal of Machine Learning and Computing, 2(4), 423-426. https://doi.org/10.7763/ijmlc.2012.v2.158
- Pedregosa, F. et al. (2011). Scikit-learn: Machine learning in python. Journal of Machine Learning Research, 12, 2825-2830.
- Read, J. (2010). Scalable Multi-label Classification (Thesis, Doctor of Philosophy (PhD)). University of Waikato, Hamilton, New Zealand. Retrieved from https://hdl.handle.net/10289/4645
- Read, J., Pfahringer, B., Holmes, G., & Frank, E. (2011). Classifier chains for multi-label classification. Machine Learning, 85, 333-359. https://doi.org/10.1007/s10994-011-5256-5
- Schapire, R. E., & Singer, Y. (2000). BoosTexter: A boosting-based system for text categorization. Machine Learning, 39, 135-168. https://doi.org/10.1023/A:1007649029923
- Sebastiani, Fabrizio (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34(1), 1-47. https://doi.org/10.1145/505282.505283
- Shehab, M. A., Badarneh, O., Al-Ayyoub, M., & Jararweh, Y. (2016). A supervised approach for multi-label classification of Arabic news articles. In 7th International Conference on Computer Science and Information Technology (CSIT), Amman, 2016, 1-6. http://dx.doi.org/10.1109/CSIT.2016.7549465
- Tarrago, D. S., Cornelis, C., Bello, R., & Herrera, F. (2014). A multi-instance learning wrapper based on the Rocchio classifier for web index recommendation. Knowledge-Based Systems, 59, 173-181. https://doi.org/10.1016/j.knosys.2014.01.008
- Torii, M., Yin, L., Nguyen, T., Mazumdar, C. T., Liu, H., Hartley, D. M., & Nelson, N. P. (2011). An exploratory study of a text classification framework for Internet-based surveillance of emerging epidemics. International Journal of Medical Informatics, 80(1), 56-66. https://doi.org/10.1016/j.ijmedinf.2010.10.015
- Tsoumakas, G., Katakis, I., & Vlahavas, I. (2010). Mining multi-label data. In Data Mining and Knowledge Discovery Handbook. Berlin: Springer, 667-685.
- Uguz, Harun. (2011). A two-stage feature selection methods for text categorization by using information gain, principal component analysis and genetic algorithm. Knowledge-Based Systems, 24(7), 1024-1032. https://doi.org/10.1016/j.knosys.2011.04.014
- Vasuki, Vidya, & Cohen, Trevor (2010). Reflective random indexing for semi-automatic indexing of the biomedical literature. Journal of Biomedical Informatics, 43(5), 694-700. https://doi.org/10.1016/j.jbi.2010.04.001
- Villena-Roman, J., Collada-Perez, S., Lana-Serrano, S., & Gonzalez-Cristobal, J. C. (2011). Hybrid approach combining machine learning and a rule-based expert system for text categorization. In Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference, 323-328.
- Vogrincic, Sergeja, & Bosnic, Zoran (2011). Ontology-based multi-label classification of economic articles. Computer Science and Information Systems (ComSIS), 8(1), 101-119. https://doi.org/10.2298/csis100420034v
- Wang, Tai-Yue, & Chiang, Huei-Min (2007). Fuzzy support vector machine for multi-class text categorization. Information Processing and Management, 43(4), 914-929. https://doi.org/10.1016/j.ipm.2006.09.011
- Wu, Chih-Hung (2009). Behavior-based spam detection using a hybrid method of rule-based techniques and neural networks. Expert Systems with Applications, 36(1), 4321-4330. https://doi.org/10.1016/j.eswa.2008.03.002
- Yu, B., Xu, Zong-ben, & Li, Cheng-hua (2008). Latent semantic analysis for text categorization using neural network. Knowledge-Based Systems, 21(8), 900-904. https://doi.org/10.1016/j.knosys.2008.03.045