A review of gene selection methods based on machine learning approaches

  • Lee, Hajoung (Department of Statistics, Sungkyunkwan University) ;
  • Kim, Jaejik (Department of Statistics, Sungkyunkwan University)
  • Received : 2022.08.18
  • Accepted : 2022.08.25
  • Published : 2022.10.31

Abstract

Gene expression data measure the mRNA abundance of each gene, and analyses of gene expression have provided key insights into disease mechanisms and the development of new drugs and therapies. High-throughput technologies such as DNA microarrays and RNA sequencing now allow thousands of gene expression levels to be measured simultaneously, which makes gene expression data high-dimensional. Because of this high dimensionality, learning models for gene expression data are prone to overfitting, and dimension reduction or feature selection is commonly applied as a preprocessing step to address the problem. In particular, gene selection methods applied at this step can remove irrelevant and redundant genes and identify important ones. Many gene selection methods have been developed in the context of machine learning. In this paper, we review recent work on gene selection methods based on machine learning approaches. We also discuss the underlying difficulties of current gene selection methods and directions for future research.
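
As a rough illustration of gene selection as a preprocessing step, the sketch below ranks genes by the absolute value of Welch's t-statistic between two classes and keeps the top k. This is one simple filter-type approach chosen for illustration, not a method proposed in this paper. The function names (`welch_t`, `select_genes`), the choice of score, and the synthetic data are all assumptions.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (0.0 if both are constant)."""
    se2 = variance(a) / len(a) + variance(b) / len(b)
    return (mean(a) - mean(b)) / (se2 ** 0.5) if se2 > 0 else 0.0


def select_genes(X, y, k):
    """Return the sorted indices of the k genes with the largest |t| between classes.

    X: list of samples, each a list of expression values (one value per gene).
    y: list of 0/1 class labels, one per sample (at least two samples per class).
    """
    n_genes = len(X[0])
    scores = []
    for j in range(n_genes):
        g0 = [row[j] for row, label in zip(X, y) if label == 0]
        g1 = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(welch_t(g0, g1)))
    ranked = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical synthetic data: 20 samples, 200 genes, far more genes than samples.
# Only genes 0, 1 and 2 are shifted between the two classes; the rest are noise.
rng = random.Random(0)
y = [0] * 10 + [1] * 10
X = [
    [rng.gauss(5.0 * label if j < 3 else 0.0, 1.0) for j in range(200)]
    for label in y
]

selected = select_genes(X, y, k=3)
print("selected gene indices:", selected)
```

A downstream classifier would then be trained on the selected columns only. To avoid optimistic bias, the selection should be repeated inside each cross-validation fold rather than performed once on the full data set.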

Acknowledgement

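This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Korean government (Ministry of Science and ICT) in 2022 (No. NRF-2022R1F1A1072444).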
