Funding
The research of J. Kim was supported by an Inha University research grant, and the research of S. Bang was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (Ministry of Science and ICT) (No. 2020R1F1A1A01065107).
References
- Akbani R, Kwek S, and Japkowicz N (2004). Applying support vector machines to imbalanced datasets. In Proceedings of the European Conference on Machine Learning, 3201, 39-50.
- Anand A, Pugalenthi G, Fogel GB, and Suganthan PN (2010). An approach for classification of highly imbalanced data using weighting and undersampling, Amino Acids, 39, 1385-1391. https://doi.org/10.1007/s00726-010-0595-2
- Bang S, Han SK, and Kim J (2021). Divide and conquer algorithm based support vector machine for massive data analysis, Journal of the Korean Data & Information Science Society, 32, 463-473. https://doi.org/10.7465/jkdi.2021.32.3.463
- Bang S and Jhun M (2014). Weighted support vector machine using k-means clustering, Communications in Statistics-Simulation and Computation, 43, 2307-2324. https://doi.org/10.1080/03610918.2012.762388
- Bang S and Kim J (2020a). Divide and conquer kernel quantile regression for massive dataset, The Korean Journal of Applied Statistics, 33, 569-578. https://doi.org/10.5351/KJAS.2020.33.5.569
- Bang S and Kim J (2020b). Sampling method using Gaussian mixture clustering for classification analysis of imbalanced data, Journal of the Korean Data Analysis Society, 22, 565-574. https://doi.org/10.37727/jkdas.2020.22.2.565
- Bunkhumpornpat C, Sinapiromsaran K, and Lursinsap C (2009). Safe-level-SMOTE: safe-level-synthetic minority over-sampling technique for handling the class imbalanced problem. In Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, 475-482.
- Bunkhumpornpat C, Sinapiromsaran K, and Lursinsap C (2012). DBSMOTE: density-based synthetic minority over-sampling technique, Applied Intelligence, 36, 664-684. https://doi.org/10.1007/s10489-011-0287-y
- Chawla N, Bowyer K, Hall L, and Kegelmeyer W (2002). SMOTE: synthetic minority over-sampling technique, Journal of Artificial Intelligence Research, 16, 321-357. https://doi.org/10.1613/jair.953
- Chen X, Liu W, and Zhang Y (2019). Quantile regression under memory constraint, Annals of Statistics, 47, 3244-3273.
- Chen X and Xie M (2014). A split-and-conquer approach for analysis of extraordinarily large data, Statistica Sinica, 24, 1655-1684.
- Chen L and Zhou Y (2020). Quantile regression in big data: A divide and conquer based strategy, Computational Statistics and Data Analysis, 144, 1-17.
- Cristianini N and Shawe-Taylor J (2000). An Introduction to Support Vector Machines, Cambridge University Press, Cambridge.
- Cortes C and Vapnik V (1995). Support-vector networks, Machine Learning, 20, 273-297. https://doi.org/10.1007/BF00994018
- Datta S and Das S (2015). Near-Bayesian support vector machines for imbalanced data classification with equal or unequal misclassification costs, Neural Networks, 70, 39-52. https://doi.org/10.1016/j.neunet.2015.06.005
- Dua D and Graff C (2019). UCI Machine Learning Repository, Irvine, CA: University of California, School of Information and Computer Science.
- Fan T, Lin D, and Cheng K (2007). Regression analysis for massive datasets, Data and Knowledge Engineering, 61, 554-562. https://doi.org/10.1016/j.datak.2006.06.017
- Han H, Wang WY, and Mao BH (2005). Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning, Lecture Notes in Computer Science, 3644, 878-887.
- He H, Bai Y, Garcia EA, and Li S (2008). ADASYN: adaptive synthetic sampling approach for imbalanced learning. In Proceedings of the 2008 IEEE International Joint Conference on Neural Networks, 1322-1328.
- Hsieh C and Dhillon I (2014). A divide and conquer solver for kernel support vector machines. In Proceedings of the 31st International Conference on Machine Learning.
- Jeong H, Kang C, and Kim K (2008). The effect of oversampling method for imbalanced data, Journal of the Korean Data Analysis Society, 10, 2089-2098.
- Jiang R, Hu X, Yu K, and Qian W (2018). Composite quantile regression for massive datasets, Statistics, 52, 980-1004. https://doi.org/10.1080/02331888.2018.1500579
- Kang J and Jhun M (2020). Divide-and-conquer random sketched kernel ridge regression for large-scale data, Journal of the Korean Data & Information Science Society, 31, 15-23. https://doi.org/10.7465/jkdi.2020.31.1.15
- Kim E, Jhun M, and Bang S (2016). Hierarchically penalized support vector machine for the classification of imbalanced data with grouped variables, The Korean Journal of Applied Statistics, 29, 961-975. https://doi.org/10.5351/KJAS.2016.29.5.961
- Lian H and Fan Z (2018). Divide-and-conquer for debiased l1-norm support vector machine in ultra-high dimensions, Journal of Machine Learning Research, 18, 1-26.
- Lin Y, Lee Y, and Wahba G (2002). Support vector machines for classification in nonstandard situations, Machine Learning, 46, 191-202. https://doi.org/10.1023/a:1012406528296
- Ling CX and Sheng VS (2008). Cost-sensitive learning and the class imbalance problem, Encyclopedia of Machine Learning, 2011, 231-235.
- Meyer D, Dimitriadou E, Hornik K, Weingessel A, Leisch F, and Chang CC (2019). Package 'e1071', R package.
- Oommen T, Baise LG, and Vogel RM (2011). Sampling bias and class imbalance in maximum-likelihood logistic regression, Mathematical Geosciences, 43, 99-120. https://doi.org/10.1007/s11004-010-9311-8
- Owen AB (2007). Infinitely imbalanced logistic regression, The Journal of Machine Learning Research, 8, 761-773.
- Park J and Bang S (2015). Logistic regression with sampling techniques for the classification of imbalanced data, Journal of The Korean Data Analysis Society, 17, 1877-1888.
- Tang Y, Zhang YQ, Chawla NV, and Krasser S (2009). SVMs modeling for highly imbalanced classification, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39, 281-288. https://doi.org/10.1109/TSMCB.2008.2002909
- Vapnik VN (1998). Statistical Learning Theory, Wiley, New York.
- Veropoulos K, Campbell C, and Cristianini N (1999). Controlling the sensitivity of support vector machines. In Proceedings of the International Joint Conference on AI, 55-60.
- Xu Q, Cai C, Jiang C, Sun F, and Huang X (2020). Block average quantile regression for massive dataset, Statistical Papers, 61, 141-165. https://doi.org/10.1007/s00362-017-0932-6
- Zhang Y, Duchi J, and Wainwright M (2015). Divide and conquer kernel ridge regression: A distributed algorithm with minimax optimal rates, Journal of Machine Learning Research, 16, 3299-3340.
- Zhang YP, Zhang LN, and Wang YC (2010). Cluster-based majority under-sampling approaches for class imbalance learning. In Proceedings of the 2010 2nd IEEE International Conference on Information and Financial Engineering (ICIFE), 400-404.