Diagnosis of Alzheimer's Disease using Combined Feature Selection Method

Faisal, Fazal Ur Rehman;Khatri, Uttam;Kwon, Goo-Rak;

doi:10.9717/kmms.2021.24.5.667

Journal of Korea Multimedia Society (한국멀티미디어학회논문지)

Volume 24, Issue 5 / Pages 667-675 / 2021 / 1229-7771 (pISSN) / 2384-0102 (eISSN)

Korea Multimedia Society (한국멀티미디어학회)


Faisal, Fazal Ur Rehman (School of Information and Communication Engineering, Chosun University) ;
Khatri, Uttam (School of Information and Communication Engineering, Chosun University) ;
Kwon, Goo-Rak (School of Information and Communication Engineering, Chosun University)

Received : 2021.05.20
Accepted : 2021.05.28
Published : 2021.05.31

https://doi.org/10.9717/kmms.2021.24.5.667

Abstract

Treatments for the symptoms of Alzheimer's disease (AD) are available, and considerable research is under way on its early diagnosis. In this regard, several classification techniques based on T1-weighted images have been proposed to distinguish among AD, mild cognitive impairment (MCI), and healthy control (HC) subjects. In this paper, we likewise use traditional machine learning (ML) approaches to diagnose AD. We present an improved feature selection method that reduces model complexity, which is a recurring issue when applying ML approaches. In the presented work, a combination of subcortical and cortical features from 308 subjects of the ADNI dataset is used to diagnose AD from structural magnetic resonance imaging (sMRI) scans. Three binary classification experiments were performed: AD vs eMCI, AD vs lMCI, and AD vs HC. The proposed feature selection method combines Principal Component Analysis and Recursive Feature Elimination to reduce the dimensionality and select the best features. Experiments on the dataset demonstrated that SVM is best suited for the AD vs lMCI, AD vs HC, and AD vs eMCI classifications, with accuracies of 95.83%, 97.83%, and 97.87%, respectively.

Keywords

1. INTRODUCTION

Over 50 million people across the world suffer from dementia, and this figure is increasing by about 10 million cases per year. A well-known form of dementia is Alzheimer's disease (AD), which accounts for around 50% of cases [1]. AD is the most common type of neurodegenerative disorder that results in dementia, and it is identified by a loss of memory and a progressive decline of cognitive functions. The diagnosis of AD remains challenging because of the variety of symptoms shown by patients [2].

In this context, the development of computational-intelligence-based diagnostic tools is a very promising goal that can help experts identify AD at an early stage. The advancement of neuroimaging techniques is therefore important for structural and functional brain analysis, which can be used to identify AD-related changes in the brain [3, 4, 5].

In recent years, machine learning (ML) has been gaining interest in the field of digital healthcare because of its ability to integrate data on a large scale [6, 7]. ML algorithms are based on computational and statistical models that can be trained through experience and make predictions on unseen data [8]. ML approaches make it possible to uncover patterns in the data in order to distinguish diagnostic groups and identify pathological scenarios [9, 10, 11].

Many studies have analyzed the potential of ML-based analytical frameworks on MRI data for the characterization and automatic diagnosis of AD [12, 13, 14, 15]. In this paper, we focus on structural MRI (sMRI) to perform AD classification. The intensity and stage of neurodegeneration are estimated using atrophy measures from sMRI scans. Atrophy on sMRI reflects cumulative loss and shrinkage of the neuropil [16]. sMRI provides volumetric estimates of cortical thickness and subcortical volume, which are essential biomarkers for the early detection of AD. Features extracted from sMRI scans have therefore attracted researchers working on AD classification. These studies describe morphometric methods such as region of interest (ROI)/volume of interest (VOI) analysis of grey matter voxels in automatically segmented images [17] and sMRI measurement of the hippocampus and the medial temporal lobe [18].

In this paper, we also use subcortical and cortical features to classify AD. When building a model, however, complexity and training time are common issues. Different techniques have therefore been used in different studies to address model accuracy and overfitting [19, 20, 21]. Principal Component Analysis (PCA) is a widely used technique for this purpose. PCA reduces the dimensions of a dataset, which helps the model train faster and makes the data easier to visualize. In this paper, PCA is used to check whether the features are independent of each other: new independent features are created from the original ones, and the least important ones are dropped. Recursive Feature Elimination (RFE), on the other hand, is a greedy optimization approach whose target is to find the best features in a dataset. It fits the ML algorithm at the core of the model, ranks the features by importance, eliminates the least important features, and then re-fits the model. This process repeats until a defined number of features remains. In our experiments, we combine PCA and RFE: PCA reduces the dimension of the feature set, and RFE then provides the best selected features to the classifiers.

The experiments evaluate the performance of several traditional ML-based approaches, i.e., the SoftMax classifier, Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbor (KNN), and Naïve Bayes (NB), on binary classification among four groups: AD, lMCI, eMCI, and HC. The motivation of this study is to propose an effective feature selection method and to address the overfitting issue in the classification model.

2. METHODOLOGY

2.1 sMRI Dataset

For this study, data were collected from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/). The ADNI project was started in 2003 as a public-private partnership with the aim of examining the progression of mild cognitive impairment and early AD by investigating the combination of serial MRI, PET, other biological markers, and clinical and neuropsychological assessments.

2.2 Subjects

The ADNI dataset contains more than 6000 subjects between the ages of 18 and 96. From these, preprocessed images of 313 subjects that fulfil the ADNI protocol were selected for this research.

The dataset was divided into four groups:

(1) 78 AD subjects: 42 males, 36 females, age ± SD = 76.7 ± 5.7 years, range = 60-90 years.

(2) 79 lMCI subjects: 41 males, 38 females, age ± SD = 72.5 ± 6.7 years, range = 60-90 years.

(3) 78 eMCI subjects: 40 males, 38 females, age ± SD = 74.5 ± 6.9 years, range = 60-90 years.

(4) 78 HC subjects: 35 males, 43 females, age ± SD = 79.6 ± 5.4 years, range = 60-90 years.

Table 1 summarizes the demographic information of the subjects used in this paper. The data were split in a 70/30 ratio for an unbiased evaluation: 70% of the data were used for training and 30% were allocated for testing.
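The 70/30 split can be illustrated with a minimal sketch. The function name `split_70_30`, the fixed seed, the rounding rule, and the subject IDs below are hypothetical and only stand in for the real ADNI identifiers; the paper does not state how ties in the split size were rounded.

```python
import math
import random

def split_70_30(subject_ids, test_ratio=0.3, seed=0):
    """Shuffle the subject IDs and return (train, test) lists."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)           # reproducible random order
    # Rounding the test size up is an assumption of this sketch.
    n_test = math.ceil(test_ratio * len(ids))
    return ids[n_test:], ids[:n_test]

# Hypothetical IDs for one binary task: 78 AD and 78 HC subjects.
subjects = [f"AD_{i:03d}" for i in range(78)] + [f"HC_{i:03d}" for i in range(78)]
train_ids, test_ids = split_70_30(subjects)
print(len(train_ids), len(test_ids))           # prints: 109 47
```

Splitting on subject IDs rather than on individual scans keeps every subject entirely in one partition, which is what an unbiased evaluation requires.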

Table 1. Demographic Report of Subjects.


2.3 Feature Extraction

In total, 873 features from the subcortical and cortical segmentation were used in this study; they were obtained through the fully automated pipeline of the FreeSurfer software package. FreeSurfer runs an automated workflow that performs the preprocessing tasks needed to obtain a final brain parcellation image in the subject's space; however, manual editing of the image is also available at each stage for quality control.

After the preprocessing stage, the extracted data were normalized to zero mean and unit variance using the standard scaler function. The purpose of normalization is to remove anomalies in the data that make the analysis more complicated. For an element x(i, j) of the feature matrix, where X_j denotes the jth feature column, the normalized value is given by:

\(x_{norm}(i,j)=\frac{x(i,j)-mean(X_j)}{std(X_j)}\) (1)
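As a small illustration of Eq. (1), the sketch below standardizes each column of a toy matrix using only the Python standard library. The helper name `zscore_columns` and the toy values are hypothetical, and the use of the population standard deviation is an assumption about how std(X_j) is computed.

```python
from statistics import fmean, pstdev

def zscore_columns(X):
    """Apply Eq. (1) column by column to a list of rows."""
    columns = list(zip(*X))
    means = [fmean(col) for col in columns]
    # Population standard deviation; constant columns are divided by 1
    # to avoid a division by zero (an assumption of this sketch).
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in X]

# Toy matrix: 3 subjects (rows) by 2 features (columns) on different scales.
X = [[2.0, 10.0], [4.0, 30.0], [6.0, 50.0]]
X_norm = zscore_columns(X)
for row in X_norm:
    print([round(v, 3) for v in row])
```

After this step both columns have zero mean and unit variance, so features measured on very different scales contribute comparably to the later stages.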

2.4 Feature Selection

In the analysis process, dimensionality issues are often reported because of the large number of features. PCA provides an efficient method to handle this issue. PCA creates linear combinations of the initial features and maps the dataset from a d-dimensional space to a k-dimensional subspace, where k < d. The k new variables are called principal components (PCs). Each PC is directed towards the maximum variance, excluding the variance already accounted for by its preceding components. Thus, the first component covers a larger variance than the subsequent components. Each PC can be calculated as:

\(PC_i=a_1X_1+a_2X_2+\cdots+a_dX_d\) (2)

where PC_i represents the ith PC, X_j is the jth original feature, and a_j is the numerical coefficient of X_j.
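The projection in Eq. (2) can be sketched with NumPy's singular value decomposition. This is only an illustration on synthetic data: the function name `pca_project`, the choice k = 3, and the random matrix are assumptions, and the features are centred before the coefficients are applied.

```python
import numpy as np

def pca_project(X, k):
    """Return (scores, coefficients, explained_variance_ratio) for k PCs."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # centre each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    coeffs = Vt[:k]                            # row i holds a_1..a_d of PC_i
    scores = Xc @ coeffs.T                     # Eq. (2) for every subject
    ratio = (S ** 2) / np.sum(S ** 2)          # share of variance per PC
    return scores, coeffs, ratio[:k]

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 8))             # synthetic stand-in data
# Make feature 1 strongly correlated with feature 0.
X_demo[:, 1] = 2 * X_demo[:, 0] + rng.normal(scale=0.1, size=100)
scores, coeffs, ratio = pca_project(X_demo, k=3)
print(scores.shape, np.round(ratio, 3))
```

The explained-variance ratios come out in non-increasing order, which mirrors the statement that the first component covers the largest variance.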

PCA is a widely used method for reducing the number of features; however, it only transforms the features into a lower-dimensional space. A feature selection method, on the other hand, lets the model run on a selected subset of the features without changing them. We first reduce the dimension of the feature set through PCA and then select the important features using the RFE feature selection method.

Recursive Feature Elimination is a wrapper-type feature selection algorithm. This method builds models on various subsets of the input features and selects the best subset according to a performance metric. The RFE algorithm is described in Table 2.

Table 2. Process of Recursive feature elimination.

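The fit, rank, eliminate, and re-fit loop described in the text can be sketched as follows. This is not a reproduction of Table 2: the ranking model here is an ordinary least-squares fit chosen for brevity, and the function name `rfe_linear` and the synthetic data are hypothetical.

```python
import numpy as np

def rfe_linear(X, y, n_keep):
    """Drop the feature with the smallest |weight| until n_keep remain."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    remaining = list(range(X.shape[1]))        # indices of surviving features
    while len(remaining) > n_keep:
        # Fit the core model on the surviving features only.
        weights, *_ = np.linalg.lstsq(X[:, remaining], y, rcond=None)
        # Rank by absolute weight and eliminate the least important one.
        worst = int(np.argmin(np.abs(weights)))
        del remaining[worst]
    return remaining

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6))             # six synthetic features
# Only features 0 and 3 carry signal in this toy target.
y_demo = 3 * X_demo[:, 0] - 2 * X_demo[:, 3] + 0.1 * rng.normal(size=200)
selected = rfe_linear(X_demo, y_demo, n_keep=2)
print(selected)
```

Ranking by absolute weight is only meaningful when the features share a common scale, which is one reason normalization precedes feature selection.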

2.5 Classification Method

Once the sMRI features have been extracted from both the cortical and subcortical regions, the combined features undergo the normalization and feature selection stages. The selected features can then be used to discriminate AD from the other groups. This approach is shown in Fig. 1. To perform the classification task, we selected five machine-learning classifiers, namely SVM, Random Forest, Naïve Bayes, KNN, and the SoftMax classifier.


Fig. 1. Block diagram of Diagnosis Process.
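The stages of the block diagram can be chained in a scikit-learn pipeline, as in the hedged sketch below. Everything specific here is an assumption rather than the authors' configuration: the data are synthetic (not ADNI), the numbers of components and selected features are arbitrary, and a linear SVM is used inside RFE because RFE needs a model that exposes feature weights, which an RBF SVM does not.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for one binary task with 156 subjects; not ADNI data.
X, y = make_classification(n_samples=156, n_features=60, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

model = Pipeline([
    ("scale", StandardScaler()),                      # Eq. (1)
    ("pca", PCA(n_components=20)),                    # dimension reduction
    ("rfe", RFE(SVC(kernel="linear"),                 # ranking model (assumed)
                n_features_to_select=10)),
    ("svm", SVC(kernel="rbf")),                       # final RBF-SVM classifier
])
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"test accuracy on synthetic data: {accuracy:.3f}")
```

Wrapping all stages in one pipeline means the scaler, PCA, and RFE are fitted on the training portion only, so no information from the test subjects leaks into feature selection.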

3. EXPERIMENTAL RESULTS

3.1 Performance Evaluation

The binary classification performance can be evaluated more easily using the confusion matrix shown in Table 3.

The system's performance was estimated using the SVM, KNN, NB, RF, and SoftMax classifiers. For each classifier, the correctly predicted outputs lie on the diagonal of the confusion matrix. The elements are further split into true positive (TP), true negative (TN), false positive (FP), and false negative (FN) elements. TP and TN indicate correctly identified subjects, whereas FP and FN represent incorrectly identified subjects.

Table 3. Confusion Matrix.
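The four confusion-matrix elements can be counted directly from the true and predicted labels, as in this minimal sketch. Treating AD as the positive class is an assumption, and the function name `confusion_counts` and the six toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive="AD"):
    """Count TP, TN, FP, and FN for a binary task."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1        # predicted AD, truly AD
            else:
                fp += 1        # predicted AD, truly the other group
        else:
            if truth == positive:
                fn += 1        # missed AD subject
            else:
                tn += 1        # correctly identified non-AD subject
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}

# Toy labels for six test subjects (illustrative only).
y_true = ["AD", "AD", "AD", "HC", "HC", "HC"]
y_pred = ["AD", "AD", "HC", "HC", "HC", "AD"]
counts = confusion_counts(y_true, y_pred)
print(counts)  # prints: {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```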

3.2 Evaluation Parameters

For evaluation, we use accuracy, which measures the ratio of examples that are correctly labeled by the classifier.

\(ACC=\frac{TP+TN}{TP+TN+FP+FN}\) (3)

However, accuracy may be misleading when the class distribution is imbalanced. Therefore, three additional performance metrics are calculated: sensitivity, specificity, and the Youden Index, defined as follows:

\(SEN=\frac{TP}{TP+FN}\) (4)

\(SPEC=\frac{TN}{TN+FP}\) (5)

\(YI=(SEN+SPEC-1)\) (6)

where sensitivity (4) is the accuracy on the positive (disease-present) group, specificity (5) is the accuracy on the negative (disease-absent) group, and the Youden Index (6) indicates the effectiveness of the biomarker.
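Eqs. (3) to (6) can be evaluated from the four confusion-matrix counts as in the sketch below. The counts used in the example are made up for illustration and are not results from this paper; the function name `evaluation_metrics` is likewise hypothetical.

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, specificity, and Youden Index."""
    acc = (tp + tn) / (tp + tn + fp + fn)      # Eq. (3)
    sen = tp / (tp + fn)                       # Eq. (4)
    spec = tn / (tn + fp)                      # Eq. (5)
    yi = sen + spec - 1                        # Eq. (6)
    return {"ACC": acc, "SEN": sen, "SPEC": spec, "YI": yi}

# Hypothetical counts: 50 positive and 50 negative test subjects.
metrics = evaluation_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name} = {value:.2f}")
# prints: ACC = 0.85, SEN = 0.80, SPEC = 0.90, YI = 0.70 (one per line)
```

The Youden Index ranges from -1 to 1; a value of 0 corresponds to a classifier that performs no better than chance, while 1 means both sensitivity and specificity are perfect.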

3.3 Classification Results

The classification performance was measured using the subcortical and cortical features, and the obtained results are shown in Tables 4-6 for AD vs eMCI, AD vs lMCI, and AD vs HC. The classification reports are shown in Figs. 2-4. The programs were executed in a 64-bit Python 3.8.5 environment on an Intel(R) Core(TM) i9-9960X at 3.10 GHz with 126 GB RAM running Ubuntu 20.04. The model can be implemented on any computer that supports the Python 3.8.5 environment.


Fig. 2. AD vs eMCI Performance Comparison in terms of accuracy and Youden Index.


Fig. 4. AD vs HC Performance Comparison in terms of accuracy and Youden Index.

3.3.1 Binary Classification: AD vs eMCI, AD vs lMCI, and AD vs HC

Table 4 summarizes the classification results for AD vs eMCI. In general, all of the specified techniques performed well. However, AD diagnosis was notably better with SVM, which reached an accuracy of 97.87% in this binary classification.

The performance report for the AD vs lMCI classification is given in Table 5; the SVM accuracy of 95.83% is better than that of the other classifiers.

Similarly, in the AD vs HC classification, as shown in Table 6, SVM outperformed the other machine learning classifiers with an accuracy of 97.83%.

Table 4. Classification Results of AD vs eMCI.


Table 5. Classification Results of AD vs lMCI.


Table 6. Classification Results of AD vs HC.


4. DISCUSSION

Various techniques for AD classification have been evaluated. Features derived from anatomical T1-weighted MRI of the subjects were used to assess and compare the classification performance in three binary classification experiments: AD vs eMCI, AD vs lMCI, and AD vs HC. The data were randomly split into two groups with a ratio of 70/30 for training and testing. After that, PCA was applied for dimension reduction, and RFE was used to select the best features for the classification task. Five classifiers were used to evaluate the performance, of which the radial basis function (RBF) based SVM classifier yielded better results than the other classifiers.

Structural and functional measurements have previously been used to classify AD patients. Nozadi and Kadoury [22] proposed a method for group classification using semantically labeled PET image features. Their method yielded an accuracy of 91.2% for AD versus HC classification. Khajehnejad et al. [23] used a manifold-based semi-supervised learning method for AD diagnosis and obtained an accuracy of 93.86%. Islam and Zhang [24] presented an ensemble of deep convolutional neural networks for the early diagnosis of AD. They used DenseNet deep learning models to classify the OASIS dataset and reported an accuracy of about 93.18%. Wolz et al. [25] compared the performance of SVM and LDA classifiers using combined features extracted from hippocampal volume, tensor-based morphometry (TBM), cortical thickness, and manifold-based learning. In their method, the LDA classifier achieved the best accuracy of 89% for AD classification.

In comparison with the above-mentioned work, this study aims to enhance model accuracy by combining feature selection techniques. We investigate combined features of cortical thickness and subcortical volume in the AD, eMCI, lMCI, and HC groups and compare the performance of different machine learning classifiers (SVM, Random Forest, Naïve Bayes, K-nearest neighbor, and Softmax classifier) using an efficient combined feature selection approach for the classification of AD.

5. CONCLUSION

A combination of dimension reduction and feature selection has been proposed in this paper. The combined feature selection method effectively distinguishes AD from eMCI, lMCI (early or late MCI), and a healthy group obtained from the ADNI dataset. In this study, a combination of subcortical and cortical features extracted with an automated toolbox was used. The classification tasks were performed using five classifiers (Softmax, SVM, KNN, Naïve Bayes, and Random Forest). The experimental results show satisfactory performance for the three cases (AD vs eMCI, AD vs lMCI, and AD vs HC) with the proposed technique. A comparison of the performances shows that the RBF-SVM classifier achieved the best performance for all three cases reported in Tables 4-6.

The presented work uses only subcortical and cortical features for classification. In future work, features such as those of the hippocampus and amygdala will be used in AD diagnosis. Additionally, different datasets can be used for the early prediction of AD, and new classification methods can be introduced.

ACKNOWLEDGMENTS

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2019R1A4A1029769, NRF-2019R1F1A1060166). Data collection and sharing was supported by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense, Award no. W81XWH-12-2-0012). The database acquired from ADNI can be found at http://adni.loni.usc.edu/.

References

[1] S.A. Haidar, S. Akihiko, K. Hitoshi, I. Ryuta, I. Manabu, and T. Kenji, "Machine Learning for Diagnosis of AD and Prediction of MCI Progression From Brain MRI Using Brain Anatomical Analysis Using Diffeomorphic Deformation," Frontiers in Neurology, Vol. 11, pp. 1894, 2021.
[2] E. Lella, N. Amoroso, A. Lombardi, T. Maggipinto, S. Tangaro, R. Bellotti et al., "Communicability Disruption in Alzheimer's Disease Connectivity Networks," Journal of Complex Networks, Vol. 7, pp. 83-100, 2019. https://doi.org/10.1093/comnet/cny009
[3] S.A.R.B. Rombouts, F. Barkhof, R. Goekoop, C.J. Stam, and P. Scheltens, "Altered Resting State Networks in Mild Cognitive Impairment and Mild Alzheimer's Disease: an fMRI Study," Human Brain Mapping, Vol. 26, No. 4, pp. 231-239, 2005.
[4] X. Zhao, Y. Liu, X. Wang, B. Liu, Q. Xi, Q. Guo et al., "Disrupted Small-world Brain Networks in Moderate Alzheimer's Disease: a Resting-State fMRI Study," PloS One, Vol. 7, No. 3, pp. e33540, 2012. https://doi.org/10.1371/journal.pone.0033540
[5] E. Lella, N. Amoroso, A. Lombardi, T. Maggipinto, S. Tangaro, and R. Bellotti, "Communicability Disruption in Alzheimer's Disease Connectivity Networks," Journal of Complex Networks, Vol. 7, Issue 1, pp. 83-100, 2019. https://doi.org/10.1093/comnet/cny009
[6] F. Al-Turjman, M.H. Nawaz, and U.D. Ulusar, "Intelligence in the Internet of Medical Things era: A Systematic Review of Current and Future Trends," Computer Communications, Vol. 150, pp. 644-660, 2020. https://doi.org/10.1016/j.comcom.2019.12.030
[7] J.A.M. Sidey-Gibbons and C.J. Sidey-Gibbons, "Machine Learning in Medicine: a Practical Introduction," BMC Medical Research Methodology, Vol. 19, pp. 1-18, 2019. https://doi.org/10.1186/s12874-018-0650-3
[8] M.I. Jordan and T.M. Mitchell, "Machine Learning: Trends, Perspectives, and Prospects," Science, Vol. 349, Issue 6245, pp. 255-260, 2015. https://doi.org/10.1126/science.aaa8415
[9] C. Casalino, G. Castellano, A. Consiglio, M. Liguori, N. Nuzziello, and D. Primiceri, "A Predictive Model for MicroRNA Expressions in Pediatric Multiple Sclerosis Detection," International Conference on Modeling Decisions for Artificial Intelligence, Springer, pp. 177-188, 2019.
[10] M.T. Angelillo, F. Balducci, D. Impedovo, G. Pirlo, and G. Vessio, "Attentional Pattern Classification for Automatic Dementia Detection," IEEE Access, Vol. 7, pp. 57706-57716, 2019. https://doi.org/10.1109/ACCESS.2019.2913685
[11] S. Bhattacharjee, D. Pakash, H.C. Kim, and H.K. Choi, "Multichannel Convolution Neural Network Classification for the Detection of Histological Pattern in Prostate Biopsy Images," Journal of Korea Multimedia Society, Vol. 23, No. 12, pp. 1486-1495, 2020. https://doi.org/10.9717/KMMS.2020.23.12.1486
[12] M. Dyrba, M. Ewers, M. Wegrzyn, I. Kilimann, C. Plant, A. Oswald et al., "Robust Automated Detection of Microstructural White Matter Degeneration in Alzheimer's Disease Using Machine Learning Classification of Multicenter DTI Data," PloS One, Vol. 8, No. 5, pp. e64925, 2013. https://doi.org/10.1371/journal.pone.0064925
[13] E. Lella, N. Amoroso, R. Bellotti, D. Diacono, M. La Rocca, T. Maggipinto et al., "Machine Learning for the Assessment of Alzheimer's Disease through DTI," Applications of Digital Image Processing XL, International Society for Optics and Photonics, Vol. 10396, pp. 1039619, 2017.
[14] C. Lian, M. Liu, J. Zhang, and D. Shen, "Hierarchical Fully Convolutional Network for Joint Atrophy Localization and Alzheimer's Disease Diagnosis Using Structural MRI," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 42, No. 4, pp. 880-893, 2018. https://doi.org/10.1109/tpami.2018.2889096
[15] C.Y. Wee, P.T. Yap, W. Li, K. Denny, J.N. Browndyke, G.G. Potter et al., "Enriched White Matter Connectivity Networks for Accurate Identification of MCI Patients," Neuroimage, Vol. 54, No. 3, pp. 1812-1822, 2011. https://doi.org/10.1016/j.neuroimage.2010.10.026
[16] F. Barkhof, T.M. Polvikoski, E.C.W. Van Straaten, R.N. Kalaria, R. Sulkava, H.J. Aronen et al., "The Significance of Medial Temporal Lobe Atrophy: a Postmortem MRI Study in the Very Old," Neurology, Vol. 69, No. 15, pp. 1521-1527, 2007. https://doi.org/10.1212/01.wnl.0000277459.83543.99
[17] K.I. Diamantaras and S.Y. Kung, "Principal Component Neural Networks: Theory and Applications," John Wiley & Sons, Inc., 1996.
[18] J. Barnes, J.W. Bartlett, L.A. van de Pol, C.T. Loy, R.I. Scahill, C. Frost et al., "A Meta-Analysis of Hippocampal Atrophy Rates in Alzheimer's Disease," Neurobiology of Aging, Vol. 30, No. 11, pp. 1711-1723, 2009. https://doi.org/10.1016/j.neurobiolaging.2008.01.010
[19] Y. Gupta, K.H. Lee, K.Y. Choi, J.J. Lee, B.C. Kim, and G.R. Kwon, "Alzheimer's Disease Diagnosis Based on Cortical and Subcortical Features," Journal of Healthcare Engineering, Vol. 2019, Article ID 2492719, 2019.
[20] R.K. Lama, J. Gwak, J.S. Park, and S.W. Lee, "Diagnosis of Alzheimer's Disease Based on Structural MRI Images Using a Regularized Extreme Learning Machine and PCA Features," Journal of Healthcare Engineering, Vol. 2017, PMID 29065619, 2017.
[21] D. Jha, S. Alam, J.Y. Pyun, K.H. Lee, and G.-R. Kwon, "Alzheimer's Disease Detection Using Extreme Learning Machine, Complex Dual Tree Wavelet Principal Coefficients and Linear Discriminant Analysis," Journal of Medical Imaging and Health Informatics, Vol. 8, No. 5, pp. 881-890, 2018. https://doi.org/10.1166/jmihi.2018.2381
[22] S.H. Nozadi, S. Kadoury, and The Alzheimer's Disease Neuroimaging Initiative, "Classification of Alzheimer's and MCI Patients from Semantically Parcelled PET Images: a Comparison between AV45 and FDG-PET," International Journal of Biomedical Imaging, Vol. 2018, Article ID 12417430, 2018.
[23] M. Khajehnejad, F. Saatlou, and H. Mohammadzade, "Alzheimer's Disease Early Diagnosis Using Manifold-Based Semi-Supervised Learning," Brain Sciences, Vol. 7, No. 12, pp. 1-19, 2017.
[24] J. Islam and Y. Zhang, "An Ensemble of Deep Convolutional Neural Networks for Alzheimer's Disease Detection and Classification," arXiv preprint, arXiv:1712.01675v2, Dec. 2017.
[25] R. Wolz, V. Julkunen, J. Koikkalainen, E. Niskanen, D.P. Zhang, D. Rueckert et al., "Multi-Method Analysis of MRI Images in Early Diagnostics of Alzheimer's Disease," PloS One, Vol. 6, No. 10, e25446, 2011. https://doi.org/10.1371/journal.pone.0025446
