1. Introduction
Medical imaging is a valuable technique for visualizing the internal structure of the human body, aiding in the diagnosis, treatment, and evaluation of diseases. Among the available medical imaging techniques, magnetic resonance imaging (MRI) stands out for its ability to provide detailed image information that assists in determining the nature of lesions. As a result, MRI is widely used in diagnosing brain diseases that involve damaged brain tissue. Accurate segmentation of brain MR images is of immense practical significance in assisting doctors with diagnosis. The objective of brain image segmentation is to categorize the tissues in the image into non-overlapping categories: gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF). However, the low contrast between tissues increases the complexity of brain MR image segmentation. In addition, brain MR images are often degraded by intensity inhomogeneity and noise, which blur tissue boundaries and pose significant challenges to segmentation. Numerous existing image segmentation methods leverage the discontinuity of image pixel intensity or the similarity of neighborhood pixels. Examples of conventional segmentation methods include thresholding algorithms[1], edge-based algorithms[2], region-based algorithms[3], and clustering-based algorithms[4][5][6]. Furthermore, deep learning[7][8][9], as an emerging technology, has shown exceptional performance in image segmentation tasks.
The fuzzy c-means algorithm (FCM) is a widely used clustering-based algorithm known for its simplicity, efficiency, and ease of application. However, the original FCM algorithm has some limitations that hinder its performance in segmenting MR images. These limitations include the use of the non-robust Euclidean distance based dissimilarity function and the lack of consideration for spatial information between pixels.
The original FCM algorithm uses the Euclidean distance as its dissimilarity function, which is effective for spherical data but falls short when dealing with high-dimensional data (ED, [10]). To overcome this limitation, Chen and Zhang proposed a modification that employs a kernel-induced non-Euclidean distance as the dissimilarity function (FCMS12, [11]). This alteration significantly enhances the algorithm's performance. However, interpreting the results becomes challenging due to the use of a high-dimensional space to cluster prototypes. Another approach, introduced by Zhao et al. (MD, [12]), combines statistical methods with clustering methods by utilizing the Mahalanobis distance as the dissimilarity function for FCM (MFCM). This integration yields improved performance and robustness in clustering. In addition, the robust spatial constrained FCM method (RSCFCM, [13]) incorporates the Gaussian distribution as the dissimilarity function. By considering both prior and posterior probabilities, as well as spatial direction, this method achieves superior segmentation results.
All the aforementioned methods assume that the distributions of brain MR images are symmetric. However, in some cases, brain MR images exhibit asymmetric distributions, which poses challenges for symmetric distribution-based methods to achieve satisfactory results (LPL, MSN2, MSN3, [14][15][16]). Fig. 1 displays the histograms of the brain MR image data sets from IBSR (16_3 and 7_8). The histogram of 16_3 exhibits a symmetric form, whereas the histogram of 7_8 displays an asymmetric form. Consequently, methods based on symmetric distributions are unable to accurately model the distributions of image data sets with asymmetric forms. In our previous work, we proposed an improved anisotropic Hierarchical Fuzzy c-means method (AMTHFCM, [17]). This method utilizes a hierarchical multivariate Student-t distribution to construct the dissimilarity function, allowing for the estimation of asymmetric distributions and effectively handling outliers. AMTHFCM has shown the capability to yield more accurate results. However, determining the parameter of the inner class is challenging. To address this challenge, we drew on the work of Azzalini et al. (MSN1, [18]), who proposed a multivariate skew-normal distribution and highlighted the normal distribution as a specific instance of the skew-normal distribution when the data exhibits no skewness. Based on this, in our previous work, we proposed a spatially constrained anisotropic asymmetric finite mixture model (SCAAFMM, [19]) to segment images with asymmetric forms.
Fig. 1. The histograms of the brain MR image data sets of IBSR (16_3 and 7_8).
Distribution-based methods do not consider spatial information, which makes them sensitive to noise. Researchers have proposed several improvements to enhance the accuracy of segmentation in MR images[11][20][21][22][23]. Chen et al. (FCMS12, [11]) incorporated spatial information by considering the mean and median information of each pixel's neighborhood, aiming to mitigate the impact of noise. However, determining both the clustering number and the weighting factor that balances the original image and the processed image remains a challenge. Krinidis et al. (FLICM, [21]) enhanced the FCM algorithm by incorporating a fuzzy information constraint term that considers both the gray level information of neighboring pixels and local information. This modification, known as FLICM, effectively suppresses noise interference while preserving image details. However, FLICM has not been thoroughly derived mathematically, and further investigation may be required in this respect. In addition to considering the local neighborhood information of pixel gray levels, utilizing local membership information can help reduce the influence of noisy pixels on membership. The Kullback-Leibler (KL) divergence is a widely used measure of the discrepancy between the local membership information and the membership itself. Gharieb (LMKLFCM, [22]) proposed an improved FCM algorithm by incorporating local membership KL information into the objective function as the fuzzification and regularization term. This approach, known as LMKLFCM, effectively suppresses noise interference and smooths the image boundaries. However, since LMKLFCM does not utilize enhanced image data, Gharieb (LMDKLFCM, [23]) further improved the anti-noise performance by incorporating a weighted distance term based on the enhanced data. This modification enhances the ability of LMDKLFCM to mitigate the impact of noise.
Due to the random initialization of membership and the presence of noise, clustering results can be unstable and suboptimal. In order to address this issue, researchers have explored algorithms that incorporate prior information to guide and correct membership errors during the clustering process. For instance, Wang et al.[24] utilized the membership of the filtered image as prior knowledge and coupled it with the membership function of the original image based on KL divergence (KLDFCM). This approach improved robustness to outliers and noise. Another approach was introduced by Yang et al.[25], who proposed a linked dimensionality reduction and K-means clustering method (LDRKCM). LDRKCM employed a deep neural network (DNN) to produce reduced-dimensional data and utilized the K-means algorithm. By leveraging deep learning techniques, LDRKCM obtained prior knowledge, which was then combined with the clustering method to enhance performance.
Furthermore, the presence of bias field in MR images is caused by equipment and magnetic field effects during image acquisition. This phenomenon leads to variations in grey intensity within the same tissue. It has been found that intensity inhomogeneity affects brain MR image segmentation more significantly than noise[26]. Intensity inhomogeneity correction, a postprocessing technique, is commonly employed to mitigate or eliminate the bias field effect. Wells[26] proposed an adaptive technique, known as ASeg, for both correction and segmentation of MR images. Building on this work, Pham and Prince[27] introduced a smoothing term in the objective function based on Fuzzy C-means (FCM) to obtain a smooth bias field (AFCM). However, determining the optimal weight for the smoothing term in order to achieve the best segmentation outcome is a challenging task. Leemput[28] and Li[29] modeled the bias field by using a set of orthogonal basis functions to achieve better results. However, these methods did not consider spatial information, making them less robust to noise.
Deep learning methods can extract feature information from the bottom layers to the top layers, leading to more accurate results. However, these methods often require a sufficient amount of training data. In the context of medical image analysis, obtaining pixel-level annotated data is challenging, because annotation requires strong medical expertise to ensure data accuracy and patient privacy must be protected. As a result, deep learning based methods are at risk of overfitting and may struggle to achieve high-precision results.
Based on the analysis presented above, we propose an innovative algorithm that combines the FCM algorithm based on multivariate skew-normal distribution with U-Net for brain MR image segmentation. Our approach addresses several key challenges. Firstly, we utilize the multivariate skew-normal distribution to define the dissimilarity function, providing robustness to asymmetric data. Secondly, we employ U-Net to achieve preliminary segmentation results with limited training data, thus serving as prior information for the target images. Thirdly, we define a regular term based on the KL divergence and prior probabilities, which helps in avoiding the selection of inappropriate initial parameters for FCM. Lastly, we model the bias field by using a set of orthogonal basis functions and integrate it into the improved FCM algorithm for simultaneous segmentation and estimation. Comparative evaluations demonstrate that our proposed method outperforms existing techniques, particularly in terms of segmentation accuracy.
The main contributions of this article are as follows:
1) We use the multivariate skew-normal distribution as the dissimilarity function in FCM. This distribution is more suitable for fitting the distribution of asymmetric data compared to traditional measures. This helps improve the accuracy of the clustering algorithm.
2) We incorporate both prior and posterior membership information by utilizing KL divergence to formulate a regularization term. This term takes into account the membership of the original image as well as the prior membership acquired through preliminary segmentation using U-Net. By leveraging this membership information, we are able to refine our approach and achieve better results.
3) We introduce an innovative approach for simultaneously estimating the bias field and segmenting brain MR images.
4) The experiments on both simulated and real brain MR images demonstrate the robustness and efficacy of the proposed method, particularly in the presence of challenges such as noise and intensity inhomogeneity.
2. Theoretical basis
2.1 Fuzzy c-means (FCM) Method
The FCM is a clustering algorithm that extends the K-means algorithm by incorporating fuzzy set theory. In FCM, each sample is allowed to belong to multiple categories simultaneously, with different membership values assigned to each category. This flexibility enables FCM to handle cases where samples may have uncertain or ambiguous membership. Image segmentation is a common application of FCM. The goal of image segmentation is to assign pixels to distinct categories based on their memberships. FCM achieves this by iteratively calculating and updating the clustering center and membership matrix. The clustering center represents the prototype of each category, while the membership matrix contains the membership degrees of each pixel to all categories.
Given an image \(\begin{align}I=\left\{I_{1}, I_{2}, \ldots, I_{N}\right\}\end{align}\) containing N pixels, where \(\begin{align}I_{i}\end{align}\) denotes the gray value of the ith pixel, the objective is to divide the image into K distinct and meaningful regions. In FCM, each pixel belongs to one or more clusters with membership values ranging from 0 to 1. The energy function \(\begin{align}E_{F C M}\end{align}\) is defined as the sum of the distances between each pixel in the image and the cluster centroids, weighted by their membership values:
\(\begin{align}E_{F C M}=\sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j}^{m} d\left(I_{i}, v_{j}\right)\end{align}\) (1)
Here, \(\begin{align}u_{i j}\end{align}\) represents the membership of pixel i in the jth region. The parameter m controls the fuzziness of the clustering, where a larger value of m leads to a fuzzier partitioning. The function \(\begin{align}d\left(I_{i}, v_{j}\right)\end{align}\) is the Euclidean distance between pixel i and the centroid \(\begin{align}v_{j}\end{align}\) of the jth cluster. The membership values must satisfy \(\begin{align}u_{i j} \geq 0\end{align}\) and \(\begin{align}\sum_{j=1}^{K} u_{i j}=1\end{align}\).
The FCM algorithm is known to be susceptible to noise and bias field interference during the segmentation of brain MR images. Additionally, the use of Euclidean distance alone, which only takes into account the mean values of tissues, without considering their distribution information, makes the algorithm highly sensitive to weak edges.
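As an illustration of the iterative scheme described above, the following minimal sketch alternates the textbook centroid and membership updates that minimize (1) for scalar gray values. The function name `fcm`, the squared Euclidean distance, the random initialization, and the stopping tolerance are illustrative assumptions rather than settings taken from this paper.

```python
import numpy as np

def fcm(intensities, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook FCM on a 1-D array of gray values (illustrative sketch)."""
    x = np.asarray(intensities, dtype=float).ravel()
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row is normalized to sum to 1.
    u = rng.random((x.size, n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        um = u ** m
        # Centroid update: membership-weighted mean of the gray values.
        v = (um * x[:, None]).sum(axis=0) / um.sum(axis=0)
        # Squared Euclidean distance (assumption); epsilon avoids division by zero.
        d = (x[:, None] - v[None, :]) ** 2 + 1e-12
        # Membership update: u_ij proportional to d_ij^(-1/(m-1)).
        inv = d ** (-1.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(u_new - u).max() < tol
        u = u_new
        if converged:
            break
    return u, v
```

A hard segmentation can then be read off with `u.argmax(axis=1)`. Because the distance only compares each pixel with the cluster means, the sketch also exhibits the sensitivity to noise and weak edges discussed above.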
2.2 Local membership KL divergence based FCM (LMKLFCM)
The Fuzzy C-means (FCM) algorithm has a limitation in that it does not take spatial information into consideration. As a result, it is sensitive to noise and outliers, which degrade the clustering performance. To address this issue, Gharieb et al. [22] proposed an improvement to the FCM algorithm by introducing a regularization term based on the Kullback-Leibler (KL) divergence. Consequently, the objective function is redefined as:
\(\begin{align}E_{L M K L F C M}=\sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j} d_{i j}+\beta \sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j} \log \left(\frac{u_{i j}}{\bar{\pi}_{i j}}\right)\end{align}\) (2)
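where the prior information \(\begin{align}{\bar{\pi}_{i j}}\end{align}\) is the mean of the membership values in the local neighborhood of pixel i. It can be computed using the following equation:
\(\begin{align}\bar{\pi}_{i j}=\frac{1}{N_{k}} \sum_{k \in \mathcal{N}_{i}} u_{k j}\end{align}\) (3)
where \(\begin{align}\mathcal{N}_{i}\end{align}\) denotes the neighborhood of pixel i and \(\begin{align}N_{k}\end{align}\) is the number of pixels it contains. Then, the membership \(\begin{align}u_{i j}\end{align}\) in the LMKLFCM algorithm is given by: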
\(\begin{align}u_{i j}=\frac{\bar{\pi}_{i j} \exp \left(-d_{i j} / \beta\right)}{\sum_{j=1}^{K} \bar{\pi}_{i j} \exp \left(-d_{i j} / \beta\right)}\end{align}\) (4)
Here, the coefficient 𝛽 serves as a weighting factor that controls the balance between the regularization term and the data term. By minimizing the KL divergence, it allows for similarity in the membership values within a local neighborhood, effectively smoothing noise. However, due to the isotropic nature of local information, the LMKLFCM algorithm tends to lose fine detail information, resulting in edge blurring.
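The local prior and the membership update of LMKLFCM can be sketched as follows. This is a minimal illustration, assuming a square window that includes the center pixel and edge replication at the image border; neither choice is specified in the text, and the function names are hypothetical.

```python
import numpy as np

def local_mean_membership(u_img, radius=1):
    """Local prior: mean membership over a (2r+1)x(2r+1) window.

    u_img has shape (H, W, K). The window shape, the inclusion of the
    center pixel, and the edge padding are assumptions of this sketch.
    """
    h, w, _ = u_img.shape
    padded = np.pad(u_img, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    acc = np.zeros_like(u_img, dtype=float)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            acc += padded[dy:dy + h, dx:dx + w, :]
    return acc / (2 * radius + 1) ** 2

def lmklfcm_membership(d, pi_bar, beta):
    """Membership update of (4): u_ij proportional to pi_bar_ij * exp(-d_ij / beta)."""
    logits = np.log(pi_bar + 1e-12) - d / beta
    logits -= logits.max(axis=-1, keepdims=True)  # stabilize the exponentials
    w = np.exp(logits)
    return w / w.sum(axis=-1, keepdims=True)
```

When all distances are equal, the update returns the local prior itself, which makes the smoothing effect of the KL term explicit: the same averaging that removes isolated noisy memberships also flattens memberships across true tissue boundaries.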
2.3 Skew-normal Distribution
The distance function 𝑑(Ii, vj) in FCM is initially defined using Euclidean distance, which only takes into account the mean values of each tissue. However, in order to enhance accuracy, Ji et al. proposed a Normal distribution-based distance function in their work[13]. This function considers not only the mean but also the variance of each tissue, ultimately leading to more precise results. Nevertheless, the Normal distribution-based approach assumes that the brain MR images follow symmetric distributions. Unfortunately, in some cases, these images may exhibit asymmetric forms, making it difficult for symmetric distribution-based methods to achieve accurate results[19]. To address this issue, Azzalini introduced the concept of the multivariate skew-normal distribution. In his work[18], he explored the properties of this distribution and its ability to fit asymmetric distributions, allowing for more accurate modeling of brain MR images.
Given a D-dimensional random vector X, we assume that X follows a skew-normal distribution. In this case, it can be written as follows:
\(\begin{align}\boldsymbol{X}=\boldsymbol{\mu}+\boldsymbol{\Sigma}^{\frac{1}{2}} \delta\left|T_{0}\right|+\boldsymbol{\Sigma}^{\frac{1}{2}}\left(I_{n}-\boldsymbol{\delta} \boldsymbol{\delta}^{T}\right)^{\frac{1}{2}} T_{1}\end{align}\) (5)
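Here, \(\begin{align}\boldsymbol{\delta}=\lambda / \sqrt{1+\lambda^{T} \lambda}\end{align}\), where λ is the skewness parameter. Additionally, \(\begin{align}T_{0}\end{align}\) and \(\begin{align}T_{1}\end{align}\) are independent of each other and follow the standard normal distribution \(\begin{align}N(0,1)\end{align}\) and the multivariate normal distribution \(\begin{align}N_{n}\left(0, I_{n}\right)\end{align}\), respectively, with n = D. Thus, (5) can be rewritten as a hierarchical representation with a two-level structure: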
\(\begin{align}\begin{aligned} \begin{array}{l}\boldsymbol{X} \left\lvert\, \tau \sim N\left(\boldsymbol{\mu}+\boldsymbol{\Sigma}^{\frac{1}{2}} \boldsymbol{\delta} \tau, \boldsymbol{\Sigma}^{\frac{1}{2}}\left(I_{n}-\boldsymbol{\delta} \boldsymbol{\delta}^{T}\right) \boldsymbol{\Sigma}^{\frac{1}{2}}\right)\right. \\ \quad \tau \sim H N_{1}(0,1)\end{array}\end{aligned}\end{align}\) (6)
Let \(\begin{align}\boldsymbol{\Gamma}=\boldsymbol{\Sigma}^{\frac{1}{2}}\left(I_{n}-\boldsymbol{\delta} \boldsymbol{\delta}^{T}\right) \boldsymbol{\Sigma}^{\frac{1}{2}}\end{align}\), then the conditional pdf of X given τ can be expressed as:
\(\begin{align}\phi\left(x_{i} \mid \theta_{j}\right)=\frac{1}{\sqrt{(2 \pi)^{D}|\boldsymbol{\Gamma}|}} \exp \left(-\frac{1}{2}\left(x_{i}-\boldsymbol{\mu}-\boldsymbol{\Sigma}^{\frac{1}{2}} \boldsymbol{\delta} \tau\right)^{T} \boldsymbol{\Gamma}^{-1}\left(x_{i}-\boldsymbol{\mu}-\boldsymbol{\Sigma}^{\frac{1}{2}} \boldsymbol{\delta} \tau\right)\right)\end{align}\) (7)
The skew-normal distribution degenerates to the normal distribution when λ= 0. On the other hand, when λ > 0, the skew-normal distribution can capture right-skewed distributions, while when λ < 0, it can capture left-skewed distributions. Hence, the skew-normal distribution offers more robust modeling capabilities.
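The effect of the sign of λ can be checked numerically with the stochastic representation (5). The sketch below is restricted to the univariate case (D = 1), where Σ^{1/2} reduces to a scale σ; the function names and the sample size are illustrative assumptions.

```python
import numpy as np

def sample_skew_normal(mu, sigma, lam, size, seed=0):
    """Draw univariate skew-normal samples via representation (5) with D = 1."""
    rng = np.random.default_rng(seed)
    delta = lam / np.sqrt(1.0 + lam ** 2)
    t0 = np.abs(rng.standard_normal(size))   # |T0|, half-normal
    t1 = rng.standard_normal(size)           # T1, standard normal
    return mu + sigma * delta * t0 + sigma * np.sqrt(1.0 - delta ** 2) * t1

def sample_skewness(x):
    """Third standardized moment of a sample."""
    z = (x - x.mean()) / x.std()
    return float((z ** 3).mean())

right = sample_skew_normal(0.0, 1.0, lam=5.0, size=200_000)
left = sample_skew_normal(0.0, 1.0, lam=-5.0, size=200_000)
sym = sample_skew_normal(0.0, 1.0, lam=0.0, size=200_000)
# Positive, negative, and near-zero skewness, respectively.
print(sample_skewness(right), sample_skewness(left), sample_skewness(sym))
```

With λ = 0 the half-normal term vanishes and the samples are plain Gaussian draws, which matches the degenerate case noted above.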
2.4 U-Net network
Convolutional deep learning methods are widely utilized in various image-related tasks such as image classification and segmentation due to their ability to extract features efficiently from the bottom to the top layers. Among these methods, the U-Net [7] network is a highly regarded convolutional deep learning approach that follows an encoding-decoding architecture, allowing for end-to-end pixel classification and yielding excellent segmentation results. However, this method heavily relies on a substantial amount of sample data. In the context of medical image analysis, data collection requires extensive medical expertise and, due to patient privacy concerns, the availability of pixel-level annotated data is often limited. Consequently, this scarcity of calibrated data makes the model susceptible to overfitting and results in segmentation accuracy that falls short of meeting clinical requirements.
2.5 Bias field estimation
As stated in ASeg[26], the impact of intensity inhomogeneity on brain MR image segmentation is more substantial than that of noise. Therefore, bias field correction plays a crucial role in brain image processing. The observed image with bias field can be represented as follows:
I = (J + n)·B (8)
where I represents the observed image, J represents the true image to be recovered, n represents additive noise, and B represents the bias field. Several researchers have suggested that the bias field of MR images is smooth and changes gradually[26][29]. Based on this assumption, the bias field can be modeled using orthogonal polynomial functions:
\(\begin{align}B(x)=\sum_{l=1}^{L} q_{l} s_{l}(x)\end{align}\) (9)
Here, {𝑠𝑙} represents the orthogonal polynomial basis functions, {𝑞𝑙} represents the coefficients, and 𝐿 represents the number of basis functions.
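A smooth bias field of the form (9) can be evaluated as in the sketch below. It assumes a 2-D image, a tensor-product Legendre basis on pixel coordinates rescaled to [-1, 1], and a maximum degree of 2 per axis, so that L = 9; these are illustrative choices, and the function names are hypothetical. Note that Legendre polynomials are orthogonal on the continuous interval and only approximately so on a discrete pixel grid.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(height, width, degree=2):
    """Return the (H*W, L) matrix whose row i holds s_1(x_i), ..., s_L(x_i)."""
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    # Column (degree+1)*a + b holds P_a(row coordinate) * P_b(column coordinate).
    return legendre.legvander2d(yy.ravel(), xx.ravel(), [degree, degree])

def bias_field(coeffs, basis, shape):
    """Evaluate B(x) = sum_l q_l s_l(x) at every pixel."""
    return (basis @ coeffs).reshape(shape)

basis = legendre_basis(8, 6, degree=2)
q = np.zeros(basis.shape[1])
q[0] = 1.0   # constant term: no inhomogeneity
q[1] = 0.2   # linear drift along the column direction
field = bias_field(q, basis, (8, 6))
```

Because only a handful of low-order coefficients are estimated, the resulting field is smooth and slowly varying by construction.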
3. The proposed model
3.1 Improved FCM based on skew-normal distribution
The existing approaches using the Euclidean distance or distance function based on normal distribution fail to effectively characterize distributions that are asymmetric in nature. To overcome this drawback, we construct a distance function based on the skewed normal distribution. The skew-normal distribution based objective function is defined as:
\(\begin{align}E=\sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j} d_{i j}\end{align}\) (10)
If \(\begin{align}X \in \mathbb{R}^{D}\end{align}\) follows SN(μ, Σ, λ) and belongs to the jth class, the log-likelihood function of the skew-normal distribution is represented as:
\(\begin{align}\begin{array}{l}L\left(\boldsymbol{\theta}_{j} \mid X\right)=\log p\left(X, \tau ; \boldsymbol{\theta}_{j}\right)=\log \left(p\left(X \mid \tau ; \boldsymbol{\theta}_{j}\right) p\left(\tau ; \boldsymbol{\theta}_{j}\right)\right) \\ =\log \left(\frac{\left|\boldsymbol{\Gamma}_{j}\right|^{-\frac{1}{2}}}{(2 \pi)^{\frac{D}{2}}} \exp \left[-\frac{1}{2}\left(x_{i}-\boldsymbol{\mu}_{j}-\boldsymbol{\Delta}_{j} \tau_{i}\right)^{T} \boldsymbol{\Gamma}_{j}^{-1}\left(x_{i}-\boldsymbol{\mu}_{j}-\boldsymbol{\Delta}_{j} \tau_{i}\right)\right] \times \frac{2}{(2 \pi)^{\frac{D}{2}}} \exp \left(-\frac{1}{2} \tau_{i}^{2}\right)\right)\end{array}\end{align}\) (11)
where \(\begin{align}\boldsymbol{\Delta}_{j}=\boldsymbol{\Sigma}_{j}^{\frac{1}{2}} \boldsymbol{\delta}_{j}\end{align}\) and \(\begin{align}\boldsymbol{\Gamma}_{j}=\boldsymbol{\Sigma}_{j}-\boldsymbol{\Delta}_{j} \boldsymbol{\Delta}_{j}^{T}\end{align}\). By letting \(\begin{align}\hat{t}_{1, i}=E\left[\tau_{i} \mid x_{i}, \boldsymbol{\theta}_{j}=\widehat{\boldsymbol{\theta}}_{j}\right]\end{align}\) and \(\begin{align}\hat{t}_{2, i}=E\left[\tau_{i}^{2} \mid x_{i}, \boldsymbol{\theta}_{j}=\widehat{\boldsymbol{\theta}}_{j}\right]\end{align}\), the moments of the truncated normal distribution are:
\(\begin{align}\begin{array}{l}\hat{t}_{1, i}=\hat{\mu}_{\tau_{i}}+W_{\Phi_{1}}\left(\frac{\widehat{\mu}_{\tau_{i}}}{\widehat{M}_{\tau_{i}}}\right) \widehat{M}_{\tau_{i}} \\ \hat{t}_{2, i}=\hat{\mu}_{\tau_{i}}^{2}+\widehat{M}_{\tau_{i}}^{2}+W_{\Phi_{1}}\left(\frac{\widehat{\mu}_{\tau_{i}}}{\widehat{M}_{\tau_{i}}}\right) \widehat{M}_{\tau_{i}} \hat{\mu}_{\tau_{i}}\end{array}\end{align}\) (12)
where \(\begin{align}W_{\Phi_{1}}(u)=\frac{\phi_{1}(u)}{\Phi_{1}(u)}\end{align}\), \(\begin{align}\widehat{M}_{\tau_{i}}^{2}=\frac{1}{1+\widehat{\Delta}_{j}^{T} \widehat{\Gamma}_{j}^{-1} \widehat{\Delta}_{j}}\end{align}\), \(\begin{align}\hat{\mu}_{\tau_{i}}=\frac{\widehat{\Delta}_{j}^{T} \widehat{\Gamma}_{j}^{-1}\left(x_{i}-\mu_{j}\right)}{1+\widehat{\Delta}_{j}^{T} \widehat{\Gamma}_{j}^{-1} \widehat{\Delta}_{j}}\end{align}\), and \(\begin{align}\phi_{1}(u)\end{align}\) and \(\begin{align}\Phi_{1}(u)\end{align}\) are the pdf and cdf of the standard normal distribution. Taking the expectation of (11) with respect to τ conditional on X yields:
\(\begin{align}\begin{aligned} Q(\boldsymbol{\Theta} \mid \widehat{\boldsymbol{\Theta}})=E[L(\boldsymbol{\Theta} \mid X) \mid X, \widehat{\boldsymbol{\Theta}}]= & -\frac{1}{2} \log \left|\boldsymbol{\Gamma}_{j}\right|-\frac{1}{2}\left(x_{i}-\boldsymbol{\mu}_{j}-\boldsymbol{\Delta}_{j} \hat{t}_{1, i}\right)^{T} \boldsymbol{\Gamma}_{j}^{-1}\left(x_{i}-\boldsymbol{\mu}_{j}-\boldsymbol{\Delta}_{j} \hat{t}_{1, i}\right) \\ & -\frac{1}{2}\left(\hat{t}_{2, i}-\hat{t}_{1, i}^{2}\right) \boldsymbol{\Delta}_{j}^{T} \boldsymbol{\Gamma}_{j}^{-1} \boldsymbol{\Delta}_{j}-\frac{1}{2} \hat{t}_{2, i}-D \log (2 \pi)+\log 2 \end{aligned}\end{align}\) (13)
Then, we can define the distance function as:
\(\begin{align}\begin{aligned} d_{i j}=\frac{1}{2} \log \left|\boldsymbol{\Gamma}_{j}\right| & +\frac{1}{2}\left(x_{i}-B_{i} \boldsymbol{\mu}_{j}-\boldsymbol{\Delta}_{j} \hat{t}_{1, i}\right)^{T} \boldsymbol{\Gamma}_{j}^{-1}\left(x_{i}-B_{i} \boldsymbol{\mu}_{j}-\boldsymbol{\Delta}_{j} \hat{t}_{1, i}\right) \\ & +\frac{1}{2}\left(\hat{t}_{2, i}-\hat{t}_{1, i}^{2}\right) \boldsymbol{\Delta}_{j}^{T} \boldsymbol{\Gamma}_{j}^{-1} \boldsymbol{\Delta}_{j}+\frac{1}{2} \hat{t}_{2, i}\end{aligned}\end{align}\) (14)
Here, 𝐵𝑖 is the bias field defined by using (9). Due to the use of skewed normal distribution, this distance function can effectively characterize asymmetric distribution information, and the distance function based on Gaussian distribution is a special case of our model, making it more robust.
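For scalar gray values (D = 1), the truncated-normal moments of (12) and the resulting distance can be sketched as follows. The function names are hypothetical, Δ_j and Γ_j are passed as scalars, and the half-weight on the conditional-variance term follows the expected log-likelihood (13). Using the bias-corrected mean B_i μ_j inside the moments is an assumption of this sketch.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def mills_ratio_w(u):
    """W(u) = phi(u) / Phi(u) for the standard normal distribution."""
    phi = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    cdf = 0.5 * (1.0 + _erf(u / np.sqrt(2.0)))
    return phi / np.maximum(cdf, 1e-300)  # guard against underflow of the cdf

def tau_moments(x, mean, delta_j, gamma_j):
    """First and second conditional moments of tau from (12), scalar case."""
    m2 = 1.0 / (1.0 + delta_j ** 2 / gamma_j)
    mu_tau = (delta_j / gamma_j) * (x - mean) * m2
    m = np.sqrt(m2)
    w = mills_ratio_w(mu_tau / m)
    t1 = mu_tau + w * m
    t2 = mu_tau ** 2 + m2 + w * m * mu_tau
    return t1, t2

def sn_distance(x, mu_j, delta_j, gamma_j, bias=1.0):
    """Skew-normal distance for D = 1; bias is B_i (scalar or per-pixel array)."""
    # Assumption: the bias-corrected mean is also used inside the moments.
    t1, t2 = tau_moments(x, bias * mu_j, delta_j, gamma_j)
    r = x - bias * mu_j - delta_j * t1
    return (0.5 * np.log(gamma_j) + 0.5 * r ** 2 / gamma_j
            + 0.5 * (t2 - t1 ** 2) * delta_j ** 2 / gamma_j + 0.5 * t2)
```

Setting `delta_j = 0` reduces the distance to the Gaussian negative log-likelihood up to an additive constant, which is the special case mentioned above.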
3.2 Regularization term based on the KL divergence and the results of U-Net
The improved FCM based on the skew-normal distribution does not consider any spatial information. To reduce the impact of noise, the LMKLFCM[22] defines a regularization term as prior knowledge to enhance segmentation accuracy. However, the prior information is isotropic, which results in the generation of pseudo-contours and the loss of important details, particularly in the presence of noise.
Fig. 2 illustrates the segmentation results on a simulated brain MR image with a noise level of 7%. Fig. 2(a) depicts the initial image, while Fig. 2(b) provides a detailed view of Fig. 2(a). Fig. 2(c-d) display the ground truth and the segmentation result obtained by applying LMKLFCM. In Fig. 2(f-g), the focus is on the segmentation results of pixels within the red rectangular area shown in Fig. 2(c-d). It is apparent that LMKLFCM leads to erroneous classification for more than half of the pixels in the rectangle. This limitation suggests that the current formulation of the KL divergence term in LMKLFCM may not effectively handle the noise present in the image. Further research and modifications are necessary to ensure that important details and edge information are preserved during the segmentation process, even in the presence of noise.
Fig. 2. The segmentation results on simulated brain MR image. (a) is the initial image. (b)-(e) are the details of the noise image, the ground truth, the segmentation results obtained by LMKLFCM and our method. (f)-(h) are gray values of the ground truth, the segmentation results obtained by LMKLFCM and our method in the red rectangle.
To tackle this issue, we leverage the segmentation results generated by U-Net as prior information, even with limited amounts of labeled training data. Subsequently, we define the regularization term as follows:
\(\begin{align}E_{K L}=\sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j} \log \frac{u_{i j}}{\pi_{i j}}\end{align}\) (15)
where πij is the probability of each pixel obtained by using U-Net.
It is important to mention that when using U-Net to train on small samples, the obtained results may not have a high level of confidence. To address this issue, we introduce a weight function that takes into consideration the high confidence results as prior information. This allows us to enhance the regularization term. The improved regularization term can be described as follows:
\(\begin{align}E_{K L}=\sum_{i=1}^{N} \sum_{j=1}^{K} H\left(\pi_{i j}\right) u_{i j} \log \frac{u_{i j}}{\pi_{i j}}\end{align}\) (16)
Here, 𝐻(⋅) is a confidence function that evaluates the reliability of the outcomes generated by the U-Net model. A higher value of 𝐻 indicates greater credibility in the U-Net results, while a lower value signifies lower credibility. This confidence function helps address overfitting issues when dealing with limited sample sizes. 𝐻(⋅) is defined as follows:
\(\begin{align}H(x)=\frac{1}{1+\exp (-\gamma(x-\varsigma))}\end{align}\) (17)
The parameter 𝜍 is a translation parameter that is set according to the credibility of the U-Net results. When the U-Net training set is sufficiently large, the results can be considered reliable. In such cases, it is advisable to set 𝜍 between 0.6 and 0.7, so that moderately confident predictions still contribute to the prior.
However, when the U-Net training set is limited, there is a higher risk of overfitting and the reliability of the U-Net model decreases. In such situations, it is recommended to set 𝜍 between 0.8 and 0.95, so that only high-confidence predictions receive a substantial weight.
In summary, the choice of the value range for parameter 𝜍 depends on the number of available training sets. Larger training sets allow for a range with smaller values, indicating higher reliability. On the other hand, smaller training sets necessitate a range with larger values, acknowledging the vulnerability to overfitting and lower reliability.
Parameter 𝛾 is a scaling factor that controls the steepness of the 𝐻 function, which in turn affects the useful range of confidence of the U-Net model. By adjusting the value of 𝛾, we can modify the steepness of the function and consequently control the range of confidence in the U-Net results. For this paper, only 10% of the available training data is used. Therefore, the value of 𝜍 is set to 0.9 and the value of 𝛾 is set to 10 to accommodate the limited training data.
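The behaviour of the confidence function (17) under the reported setting (𝛾 = 10, 𝜍 = 0.9) can be inspected with a short sketch; the function name `confidence` is hypothetical.

```python
import numpy as np

def confidence(pi, gamma=10.0, varsigma=0.9):
    """Logistic confidence weight H of (17) applied to U-Net probabilities."""
    pi = np.asarray(pi, dtype=float)
    return 1.0 / (1.0 + np.exp(-gamma * (pi - varsigma)))

# Weights for a low, a threshold-level, and a maximal U-Net probability.
weights = confidence(np.array([0.5, 0.9, 1.0]))
```

With these values, a prediction at the threshold (π = 0.9) receives a weight of exactly 0.5, a fully confident prediction (π = 1.0) receives about 0.73, and an ambiguous prediction (π = 0.5) receives less than 0.02, so low-confidence U-Net outputs barely influence the regularization term.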
The U-Net can accurately extract feature information from available images, surpassing the capabilities of traditional methods. As a result, integrating the results of U-Net as prior information allows for the appropriate assignment of membership to corresponding pixels, preventing the occurrence of abnormal memberships and preserving intricate details for more precise segmentation. Fig. 2(e) and (h) showcase the segmentation results obtained by utilizing prior information acquired from U-Net. It is evident from the figures that the results are superior to those obtained using LMKLFCM.
3.3 The energy function
By leveraging the power of deep learning, traditional algorithms can be enhanced to achieve remarkable improvements in accuracy. Moreover, integrating deep learning methods with traditional algorithms enables the fusion of common and unique features, addressing the performance degradation that arises from the limited ability of deep learning methods to extract common features when training samples are scarce. The total energy function of our method is defined as:
\(\begin{align}E=\sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j} d_{i j}+\beta \sum_{i=1}^{N} \sum_{j=1}^{K} H\left(\pi_{i j}\right) u_{i j} \log \frac{u_{i j}}{\pi_{i j}}\end{align}\) (18)
where 𝛽 is a non-negative constant used to balance the energy term and regularization term.
Combining (14) and (18), the energy function has the following form:
\(\begin{align}\begin{aligned} E= & \sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j} d_{i j}+\beta \sum_{i=1}^{N} \sum_{j=1}^{K} H\left(\pi_{i j}\right) u_{i j} \log \frac{u_{i j}}{\pi_{i j}} \\ = & \sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j}\left\{\frac{1}{2} \log \left|\Gamma_{j}\right|+\frac{1}{2}\left(x_{i}-B_{i} \boldsymbol{\mu}_{j}-\Delta_{j} \hat{t}_{1, i}\right)^{T} \boldsymbol{\Gamma}_{j}^{-1}\left(x_{i}-B_{i} \boldsymbol{\mu}_{j}-\Delta_{j} \hat{t}_{1, i}\right)\right. \\ & \left.+\frac{1}{2}\left[\hat{t}_{2, i}-\hat{t}_{1, i}^{2}\right] \Delta_{j}^{T} \boldsymbol{\Gamma}_{j}^{-1} \Delta_{j}+\frac{1}{2} \hat{t}_{2, i}\right\}+\beta \sum_{i=1}^{N} \sum_{j=1}^{K} \frac{u_{i j}}{1+\exp \left(-\gamma\left(\pi_{i j}-\varsigma\right)\right)} \log \frac{u_{i j}}{\pi_{i j}}\end{aligned}\end{align}\) (19)
3.4 Parameter Learning
By using the Lagrange multiplier method, (18) can be written as:
\(\begin{align}E=\sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j} d_{i j}+\beta \sum_{i=1}^{N} \sum_{j=1}^{K} H\left(\pi_{i j}\right) u_{i j} \log \frac{u_{i j}}{\pi_{i j}}+\alpha\left(\sum_{j=1}^{K} u_{i j}-1\right)\end{align}\) (20)
where 𝛼 is the Lagrange multiplier. Setting the partial derivative of 𝐸 with respect to uij to zero, we have:
\(\begin{align}\left.\left[\frac{\partial E}{\partial u_{i j}}=d_{i j}+\beta H\left(\pi_{i j}\right)\left(\log u_{i j}-\log \pi_{i j}+1\right)+\alpha\right]\right|_{u_{i j}=\widehat{u}_{i j}}=0\end{align}\) (21)
Then, we can obtain:
\(\begin{align}\widehat{u}_{i j}=\frac{\exp \left(-d_{i j} / \beta H\left(\pi_{i j}\right)\right)}{\sum_{j=1}^{K} \exp \left(-d_{i j} / \beta H\left(\pi_{i j}\right)\right)}\end{align}\) (22)
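The membership update can be sketched as below. The code transcribes (22) as printed, so the U-Net prior π_ij enters only through the confidence weight H(π_ij); the function name, the array layout (N pixels by K classes), and the max-subtraction used for numerical stability are assumptions of this sketch.

```python
import numpy as np

def membership_update(d, pi, beta, gamma=10.0, varsigma=0.9):
    """Membership update following (22) as printed.

    d  : (N, K) distances d_ij
    pi : (N, K) U-Net class probabilities pi_ij
    """
    h = 1.0 / (1.0 + np.exp(-gamma * (pi - varsigma)))  # confidence H(pi_ij), (17)
    logits = -d / (beta * h)
    # Subtracting the row maximum leaves the ratio unchanged but avoids overflow.
    logits -= logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)
```

For a pixel whose U-Net probabilities are equal across classes, H is the same for every class and the update reduces to a softmax of the negative distances scaled by 1/(βH).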
Let 𝐵𝑖 = 𝑄𝑇𝑆𝑖, where 𝑄 = [𝑞1, 𝑞2, … , 𝑞𝐿]𝑇 is the coefficients of the basis functions. Here, we use the Leyland polynomials as the basis functions. Fixed 𝑢, 𝝁, 𝜞, 𝜟 and 𝝀, set the partial of 𝐸 with regard to 𝑄 to zero, we obtain:
\(\begin{align}\begin{array}{l}{\left.\left[\frac{\partial E}{\partial Q}=\frac{\partial \sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j} d_{i j}}{\partial Q}\right]\right|_{Q=\hat{Q}}=0} \\ \Rightarrow \sum_{i=1}^{N} S_{i} S_{i}^{T} \sum_{j=1}^{K} u_{i j} \cdot \boldsymbol{\mu}_{j}^{T} \boldsymbol{\Gamma}_{j}^{-1} \boldsymbol{\mu}_{j} \hat{Q}=\sum_{i=1}^{N} S_{i} \sum_{j=1}^{K} u_{i j}\left[\left(x_{i}-\Delta_{j} \hat{t}_{1, i}\right)^{T} \boldsymbol{\Gamma}_{j}^{-1} \boldsymbol{\mu}_{j}\right. \\ \Rightarrow \hat{Q}=A^{-1} W\end{array}\end{align}\) (23)
where \(\begin{align}A=\sum_{i=1}^{N} S_{i} S_{i}^{T} \sum_{j=1}^{K} u_{i j} \cdot \boldsymbol{\mu}_{j}^{T} \boldsymbol{\Gamma}_{j}^{-1} \boldsymbol{\mu}_{j}\end{align}\) and \(\begin{align}W=\sum_{i=1}^{N} S_{i} \sum_{j=1}^{K} u_{i j}\left(x_{i}-\boldsymbol{\Delta}_{j} \hat{t}_{1, i}\right)^{T} \boldsymbol{\Gamma}_{j}^{-1} \boldsymbol{\mu}_{j}\end{align}\). With 𝑢, 𝑄, 𝜞, 𝜟 and 𝝀 fixed, 𝝁𝑗 is computed in the same way:
\(\begin{align}\begin{array}{l}{\left.\left[\frac{\partial E}{\partial \boldsymbol{\mu}_{j}}=\frac{\partial \sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j} d_{i j}}{\partial \boldsymbol{\mu}_{j}}\right]\right|_{\boldsymbol{\mu}_{j}=\widehat{\boldsymbol{\mu}}_{j}}=0} \\ \Rightarrow \sum_{i=1}^{N} u_{i j} \boldsymbol{\Gamma}_{j}^{-1}\left(x_{i}-B_{i} \widehat{\boldsymbol{\mu}}_{j}-\Delta_{j} \hat{t}_{1, i}\right) B_{i}=0 \Rightarrow \widehat{\boldsymbol{\mu}}_{j}=\frac{\sum_{i=1}^{N} u_{i j} B_{i}\left(x_{i}-\Delta_{j} \hat{t}_{1, i}\right)}{\sum_{i=1}^{N} u_{i j} B_{i}^{2}}\end{array}\end{align}\) (24)
Using the same method, we can obtain:
\(\begin{align}\begin{array}{l}{\left.\left[\frac{\partial E}{\partial \Gamma_{j}^{-1}}=\frac{\partial \sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j} d_{i j}}{\partial \Gamma_{j}^{-1}}\right]\right|_{\Gamma_{j}^{-1}=\widehat{\Gamma}_{j}^{-1}}=0} \\ \Rightarrow \sum_{i=1}^{N} u_{i j}\left(\left(x_{i}-B_{i} \widehat{\boldsymbol{\mu}}_{j}-\boldsymbol{\Delta}_{j} \hat{t}_{1, i}\right)\left(x_{i}-B_{i} \widehat{\boldsymbol{\mu}}_{j}-\boldsymbol{\Delta}_{j} \hat{t}_{1, i}\right)^{T}+\left(\hat{t}_{2, i}-\hat{t}_{1, i}^{2}\right) \boldsymbol{\Delta}_{j}^{T} \boldsymbol{\Delta}_{j}\right)=\sum_{i=1}^{N} u_{i j} \widehat{\Gamma}_{j} \\ \Rightarrow \widehat{\Gamma}_{j}=\frac{\sum_{i=1}^{N} u_{i j}\left(\left(x_{i}-B_{i} \widehat{\boldsymbol{\mu}}_{j}-\boldsymbol{\Delta}_{j} \hat{t}_{1, i}\right)\left(x_{i}-B_{i} \widehat{\boldsymbol{\mu}}_{j}-\boldsymbol{\Delta}_{j} \hat{t}_{1, i}\right)^{T}+\left(\hat{t}_{2, i}-\hat{t}_{1, i}^{2}\right) \boldsymbol{\Delta}_{j}^{T} \boldsymbol{\Delta}_{j}\right)}{\sum_{i=1}^{N} u_{i j}}\end{array}\end{align}\) (25)
\(\begin{align}\begin{array}{l}{\left.\left[\frac{\partial E}{\partial \Delta_{j}}=\frac{\partial \sum_{i=1}^{N} \sum_{j=1}^{K} u_{i j} d_{i j}}{\partial \Delta_{j}}\right]\right|_{\Delta_{j}=\widehat{\Delta}_{j}}=0} \\ \Rightarrow \sum_{i=1}^{N} u_{i j} \boldsymbol{\Gamma}_{j}^{-1}\left(\left(x_{i}-B_{i} \widehat{\boldsymbol{\mu}}_{j}-\widehat{\Delta}_{j} \hat{t}_{1, i}\right) \hat{t}_{1, i}-\left(\hat{t}_{2, i}-\hat{t}_{1, i}^{2}\right) \widehat{\Delta}_{j}\right)=0 \\ \Rightarrow \widehat{\Delta}_{j}=\frac{\sum_{i=1}^{N} u_{i j} \hat{t}_{1, i}\left(x_{i}-B_{i} \widehat{\boldsymbol{\mu}}_{j}\right)}{\sum_{i=1}^{N} u_{i j} \hat{t}_{2, i}}\end{array}\end{align}\) (26)
\(\begin{align}\widehat{\Sigma}_{j}=\widehat{\Gamma}_{j}+\widehat{\Delta}_{j}^{T} \widehat{\Delta}_{j}\end{align}\) (27)
\(\begin{align}\hat{\lambda}_{j}=\frac{\widehat{\Sigma}_{j}^{-\frac{1}{2}} \widehat{\Delta}_{j}}{\left(1-\widehat{\Delta}_{j}^{T} \widehat{\Sigma}_{j}^{-1} \widehat{\Delta}_{j}\right)^{\frac{1}{2}}}\end{align}\) (28)
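For grayscale images, where each x_i is a scalar, the updates (24) to (28) reduce to weighted averages. The sketch below illustrates one pass for a single cluster j under that scalar assumption. The function name `update_component` and its argument names are hypothetical. The estimate of Γ_j is taken as the weighted second moment that follows from the stationarity condition with respect to Γ_j^{-1}, and the inverse in the denominator of (28) is read as Σ_j^{-1}; both readings are assumptions of this sketch.

```python
import numpy as np

def update_component(x, B, u_j, t1, t2, delta_old):
    """One pass over (24)-(28) for cluster j with scalar intensities.

    x         : (N,) pixel intensities
    B         : (N,) bias field values B_i
    u_j       : (N,) memberships of cluster j
    t1, t2    : (N,) conditional moments from Step 2
    delta_old : current value of Delta_j
    """
    # (24): weighted least-squares estimate of the cluster mean
    mu = np.sum(u_j * B * (x - delta_old * t1)) / np.sum(u_j * B ** 2)
    # (25): weighted second moment of the residual, using the current Delta_j
    r = x - B * mu - delta_old * t1
    gamma = np.sum(u_j * (r ** 2 + (t2 - t1 ** 2) * delta_old ** 2)) / np.sum(u_j)
    # (26): skewness-related parameter Delta_j
    delta = np.sum(u_j * t1 * (x - B * mu)) / np.sum(u_j * t2)
    # (27): scale parameter
    sigma = gamma + delta ** 2
    # (28): shape parameter (scalar case)
    lam = (delta / np.sqrt(sigma)) / np.sqrt(1.0 - delta ** 2 / sigma)
    return mu, gamma, delta, sigma, lam

# Synthetic check data (made up): no bias (B = 1) and no skewness (t1 = 0).
rng = np.random.default_rng(0)
x = rng.normal(5.0, 2.0, size=1000)
u_j = rng.uniform(0.1, 1.0, size=1000)
mu, gamma, delta, sigma, lam = update_component(
    x, np.ones(1000), u_j, np.zeros(1000), np.ones(1000), 0.0
)
```

In this degenerate setting the updates collapse to the membership-weighted mean and variance, which is a convenient sanity check before applying the updates to real data.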
The proposed algorithm can be summarized as follows:
Step 1: Initialize the parameters uij and 𝜣 = (𝝁𝑗, 𝜮𝑗, 𝝀𝑗, 𝜞𝑗, 𝜹𝑗) using the results obtained from U-Net.
Step 2: Calculate \(\begin{align}\hat {t}_{1,i}\end{align}\), \(\begin{align}\hat {t}_{2,i}\end{align}\) by using (12).
Step 3: Update 𝑄, uij, 𝝁𝑗, 𝜞𝑗, and 𝜟𝑗 by using (23), (22), (24), (25), and (26), respectively.
Step 4: Update 𝜮𝑗 and 𝝀𝑗 by using (27) and (28), respectively.
Step 5: Check if the convergence criterion is met for either the objective function or the parameter values. If the criterion is satisfied, stop the iteration; otherwise, go back to Step 2.
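The stopping rule in Step 5 can be sketched as a generic driver around Steps 2 to 4. The text does not specify the tolerance or an iteration cap, so `tol`, `max_iter`, the function name `iterate_until_converged`, and the toy `step` function below are all placeholders chosen for illustration. Here convergence is tested on the change of the objective value only.

```python
def iterate_until_converged(step, state, tol=1e-5, max_iter=200):
    """Repeat `step` until the objective changes by less than `tol`.

    step  : callable mapping state -> (new_state, objective_value);
            in the algorithm above it would perform Steps 2 to 4 and evaluate E.
    state : initial parameters (e.g. obtained from the U-Net initialization)
    """
    prev = None
    for it in range(1, max_iter + 1):
        state, obj = step(state)
        if prev is not None and abs(prev - obj) < tol:
            return state, obj, it
        prev = obj
    # Iteration cap reached without meeting the tolerance.
    return state, prev, max_iter

# Toy stand-in for Steps 2 to 4: halve a scalar and use its square as objective.
def toy_step(s):
    s = 0.5 * s
    return s, s * s

final_state, final_obj, n_iter = iterate_until_converged(toy_step, 1.0)
```

A criterion on the parameter values could be added in the same way by comparing successive states inside the loop.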
4. Experiment Results
We evaluate the proposed algorithm experimentally and compare it with several existing algorithms: FLICM[21], RSCFCM[13], LMKLFCM[22], AMTHFCM[17], SCAAFMM[19], MICO[29], and U-Net[7]. The experiments are conducted on a collection of simulated and clinical 3T brain MR images. Unless stated otherwise, we use the following parameter settings. The parameters are initialized from the results of U-Net. The degree of the basis functions is set to 4, resulting in a total of 15 basis functions (L=15). We set ς, γ, and β to 0.9, 10, and 50, respectively. In the U-Net part of our algorithm, 10% of each dataset is used for training and the remaining 90% for testing. The loss function is the cross-entropy loss, the batch size is 10, the learning rate is 0.0001, and the number of epochs is 100. The parameters of the other algorithms used in the comparison are set to the values specified in their respective papers.
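The count L=15 is what one obtains from all products of two one-dimensional polynomials whose total degree is at most 4, since (4+1)(4+2)/2 = 15. The sketch below builds such a basis on a pixel grid. It assumes a two-dimensional slice, Legendre polynomials on [-1, 1], and a 64×64 grid chosen only for illustration; the function name `legendre_basis_2d` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis_2d(height, width, degree=4):
    """Evaluate all products P_a(x) * P_b(y) with a + b <= degree on a grid."""
    x = np.linspace(-1.0, 1.0, width)
    y = np.linspace(-1.0, 1.0, height)
    basis = []
    for a in range(degree + 1):
        for b in range(degree + 1 - a):
            # A coefficient vector [0, ..., 0, 1] selects the single polynomial P_a.
            pa = legendre.legval(x, [0.0] * a + [1.0])
            pb = legendre.legval(y, [0.0] * b + [1.0])
            basis.append(np.outer(pb, pa))
    return np.stack(basis)  # shape (L, height, width)

S = legendre_basis_2d(64, 64, degree=4)
print(S.shape[0])  # number of basis functions L
```

Under these assumptions, the bias field at each pixel is a linear combination of the stacked basis images weighted by the coefficient vector.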
In our experiments, we use brain MR images from two kinds of sources: simulated images from BrainWeb and real images from the Internet Brain Segmentation Repository (IBSR) and MRBrainS13. BrainWeb provides complete 3-dimensional simulated brain data sets, from which we selected T1-weighted images with a slice thickness of 1 mm. These simulated images allow us to assess the proposed method in a controlled environment with known ground truth. IBSR offers a collection of 18 subjects along with their corresponding segmentation results, providing a more diverse and realistic dataset for evaluating the method in real-world scenarios. MRBrainS13 provides fifteen data sets accompanied by manual segmentation results, which allows us to further validate the method against manual annotations by experts. The sizes of the image volumes in BrainWeb, IBSR, and MRBrainS13 are 181×217×181, 256×128×256, and 240×240×48, respectively. Combining simulated and real brain MR images lets us evaluate the proposed method across a range of scenarios and assess its potential for practical applications.
To assess the accuracy of the segmentation results, we utilize the Jaccard similarity coefficient, denoted as Js. The Jaccard similarity coefficient quantifies the similarity between the predicted segmentation and the ground truth segmentation. It is defined as:
\(\begin{align}J s\left(S_{1}, S_{2}\right)=\frac{\left|S_{1} \cap S_{2}\right|}{\left|S_{1} \cup S_{2}\right|}\end{align}\) (1)
where S1 and S2 denote the segmentation result and the ground truth, respectively. In practice, for each tissue class, the segmentation result and the ground truth are each converted into a 0/1 matrix in which the pixels assigned to that class are 1 and all other pixels are 0. Adding the two matrices gives a new matrix: the number of its elements greater than 1 is |S1 ∩ S2|, the number of its nonzero elements is |S1 ∪ S2|, and their ratio is Js. The Js value ranges from 0 to 1, where 1 indicates a perfect match between the predicted segmentation and the ground truth and 0 indicates no overlap between the two. Higher Js values indicate more accurate segmentation.
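As an illustration of the definition in (1), the following sketch computes Js for one tissue class from two label maps, which is equivalent to adding the two 0/1 masks and counting the entries equal to 2 and the nonzero entries. The function name `jaccard`, the label coding, and the toy arrays are hypothetical; returning 1.0 when a class is absent from both maps is a convention assumed here.

```python
import numpy as np

def jaccard(seg, gt, label):
    """Jaccard similarity between the masks of `label` in `seg` and `gt`."""
    a = (seg == label)
    b = (gt == label)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # class absent from both maps (assumed convention)
    return float(np.logical_and(a, b).sum() / union)

# Toy label maps: 1 = CSF, 2 = GM, 3 = WM (coding chosen for illustration).
seg = np.array([[1, 1, 2],
                [2, 3, 3]])
gt = np.array([[1, 2, 2],
               [2, 3, 1]])
scores = {label: jaccard(seg, gt, label) for label in (1, 2, 3)}
```

Averaging the per-class scores over all test images would give summary values of the kind reported in the tables below.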
4.1 Performance of robustness to noise on simulated brain MRI data
To evaluate robustness to noise, we compared our method with six other algorithms on simulated brain datasets with noise levels of 3%, 5%, and 7% (referred to as N3F0, N5F0, and N7F0, respectively). Fig. 3 presents the segmentation results of the 150th simulated brain MR image with noise levels of 3% (first row), 5% (third row), and 7% (fifth row). The first column shows the initial images used for segmentation, while the second column displays the corresponding ground truths. From the third to the last column, the segmentation results of FLICM, RSCFCM, LMKLFCM, AMTHFCM, SCAAFMM, U-Net, and our proposed method are displayed. Additionally, the even rows show local zoomed-in images for a clearer view.
Fig. 3. The segmentation results on simulated brain MR images with different noise levels. Each column, from the first to the last, represents the corresponding initial images, ground truths, and the segmentation results produced by FLICM, RSCFCM, LMKLFCM, AMTHFCM, SCAAFMM, U-Net, and our method, respectively. The even rows show local zoomed-in images for a clearer view.
From Fig. 3, it can be observed that the FLICM method is highly affected by noise due to its reliance on spatial Euclidean distance as the local spatial information. This approach ignores the influence of pixel intensity, leading to poor performance in noisy environments. The RSCFCM method utilizes isotropic spatial information, which makes it challenging to preserve fine details on weak edges. The LMKLFCM method also performs poorly in regions with slim structures since it tends to lose details when using the mean membership information. In comparison, the AMTHFCM and SCAAFMM methods are more robust to noise than the RSCFCM method as they incorporate anisotropic spatial information. However, even these methods suffer from some loss of details. In contrast, the U-Net model, which incorporates skip connections, is able to recover lost details caused by down sampling, including boundary information. Additionally, the U-Net model demonstrates robustness to noise, making it a favorable choice in noisy environments. With the increasing noise level, the segmentation results of FLICM, RSCFCM, AMTHFCM, and MICO are not satisfactory. The SCAAFMM method produces overly smooth boundaries, making it difficult to preserve details. In contrast, our proposed method combines U-Net with the FCM model and utilizes the skew-normal distribution as the dissimilarity function, which helps preserve more details.
To assess the accuracy of the segmentation results, we employ Js values as a performance metric. Table 1 displays the average Js values acquired from 200 MR images, reflecting the accuracy across various noise levels. It is evident that our method consistently achieves the highest Js values for each noise level, exhibiting relatively low standard deviations. This highlights the robustness and stability of our approach compared to other methods.
Table 1. The average Js values of segmentation results for simulated brain MR images with different noise levels.
4.2 Performance of robustness to bias field on simulated brain MRI data
To evaluate the effectiveness of our method in handling the bias field, we conducted a comparative analysis at various intensity inhomogeneity levels. Since FLICM, LMKLFCM, and SCAAFMM do not account for bias field interference, we compared our proposed method with RSCFCM, AMTHFCM, MICO, and U-Net. Fig. 4 shows the segmentation results of the five methods under different intensity inhomogeneity conditions. The first, third, and fifth rows present the segmentation results for images with a noise level of 3% and intensity inhomogeneity levels of 40% (N3F40), 80% (N3F80), and 100% (N3F100), respectively. All five methods mitigate the impact of intensity inhomogeneity to some extent. However, RSCFCM and AMTHFCM sacrifice details in thin structures, while MICO is susceptible to noise interference because it disregards spatial information.
Fig. 4. The segmentation results on simulated brain MR images with different intensity inhomogeneity levels. Each column, from the first to the last, represents the corresponding initial images, ground truths, and the segmentation results produced by RSCFCM, AMTHFCM, MICO, U-Net and our method, respectively. The even rows show local zoomed-in images for a clearer view.
To further quantify the results, Table 2 presents the average Js values of the segmentation results obtained using the five methods. It is observed that our method achieves the highest average Js values and the lowest standard deviations. This indicates that our method effectively preserves intricate details and demonstrates greater robustness in comparison to the other methods.
Table 2. The average Js values of segmentation results for simulated brain MR images with different intensity inhomogeneity levels.
4.3 Performance on real brain MRI data
In this section, our proposed algorithm is evaluated by comparing it with seven existing algorithms using real brain MR images acquired from the IBSR and MRBrainS13 datasets. These images possess unknown noise and bias fields, and their intensity distributions are asymmetric. Fig. 5 shows the results for two brain MR images. The first image is obtained from the IBSR dataset, while the second image is obtained from the MRBrainS13 dataset. We present the initial images, ground truths, and segmentation results of FLICM, RSCFCM, LMKLFCM, AMTHFCM, SCAAFMM, MICO, U-Net, and our algorithm in the first to final columns. The grayscale histograms of the original images and the details of the preceding rows are displayed in the even rows.
Fig. 5. Segmentation results on real brain MR images. The first image and the second image are sourced from IBSR and MRBrainS13, respectively. The first to last columns are the initial images, the ground truths, the segmentation results of FLICM, RSCFCM, LMKLFCM, AMTHFCM, SCAAFMM, MICO, U-Net and our method, respectively.
As shown in Fig. 5, the first brain MRI image exhibits a significant lack of contrast, with a skewed histogram. This leads to unsatisfactory segmentation results for several algorithms, including FLICM, RSCFCM, LMKLFCM, AMTHFCM, and MICO. In particular, FLICM and MICO misclassify a considerable number of pixels belonging to GM as WM within the red rectangle region, due to the interference of low contrast. Although RSCFCM, AMTHFCM, and SCAAFMM achieve slightly better results by incorporating spatial information, misclassification of pixels around the boundary is still noticeable due to the presence of weak boundaries. Furthermore, LMKLFCM misclassifies CSF in slim structure regions into the GM cluster, where most neighboring pixels reside. In comparison to these algorithms, U-Net produces the best segmentation result but struggles to maintain fine details.
To quantify the accuracy of the segmentation results, we compute the Js values on 200 real brain images. As shown in Table 3, our method achieves the highest average Js values, demonstrating its superior robustness compared to other methods.
Table 3. The average Js values of the segmentation results on real brain MR images.
4.4 Discussion
In this study, our objective was to train the U-Net model using only 10% of the labeled data. It is crucial to acknowledge that the size of the training set plays a significant role in determining the accuracy of the final model. To demonstrate the superiority of our approach, we conducted evaluations using various proportions of the training data. The results, presented in Table 4, unequivocally illustrate that our method outperforms U-Net when trained with 10%, 20%, and 40% of the available data. As the proportion of training data increases to 80%, our method achieves comparable performance to U-Net. Given the limited availability of medical image data, our algorithm exhibits superior results in brain MR image segmentation compared to U-Net.
Table 4. The average Js values of the segmentation results with different proportions of training data.
Although our method demonstrates high accuracy in brain image segmentation, it does have certain limitations in the selection of parameters ς and γ . To determine the optimal parameters, we conducted tests on 200 simulated brain images with a noise level of 3% and an intensity inhomogeneity level of 80%. Fig. 6 depicts the average misclassification ratio (MCR) values obtained. The results indicate that our method achieves more accurate segmentation when ς and γ are set to 0.9 and 10, respectively.
Fig. 6. The MCR values of our proposed method with different ς and γ
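A parameter study of this kind can be sketched as a grid sweep over ς and γ that records the mean MCR for each pair. The text does not define MCR, so the sketch assumes the common definition, namely the fraction of pixels whose label differs from the ground truth. The names `misclassification_ratio` and `sweep`, the `segment` callable, and the toy data are hypothetical stand-ins for the actual segmentation pipeline.

```python
import itertools
import numpy as np

def misclassification_ratio(seg, gt):
    """Fraction of pixels labeled differently from the ground truth (assumed MCR)."""
    return float(np.mean(seg != gt))

def sweep(segment, images, truths, varsigmas, gammas):
    """Return {(varsigma, gamma): mean MCR} for a user-supplied `segment` callable.

    segment : callable (image, varsigma, gamma) -> label map
    """
    scores = {}
    for vs, g in itertools.product(varsigmas, gammas):
        mcrs = [misclassification_ratio(segment(img, vs, g), gt)
                for img, gt in zip(images, truths)]
        scores[(vs, g)] = float(np.mean(mcrs))
    return scores

# Dummy segmenter used only to exercise the sweep: threshold at varsigma.
def dummy_segment(img, vs, g):
    return (img > vs).astype(int)

images = [np.array([0.1, 0.5, 0.95])]
truths = [np.array([0, 0, 1])]
scores = sweep(dummy_segment, images, truths, varsigmas=[0.3, 0.9], gammas=[10])
best = min(scores, key=scores.get)
```

Replacing `dummy_segment` with the full segmentation routine and the toy arrays with the test images would produce a table of mean MCR values over the parameter grid.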
5. Conclusion
In this paper, we introduce a novel fuzzy c-means (FCM) algorithm that addresses the challenges associated with brain MR image segmentation. The traditional FCM model has several limitations, including sensitivity to noise, difficulty with weak edges, and limited detail preservation. To overcome these limitations, we propose an algorithm that incorporates prior information from the U-Net model through a KL divergence regularization term. We also adopt a dissimilarity function based on the skew-normal distribution, which improves the robustness of the algorithm to asymmetric data, a common issue in various applications, and provides a more accurate representation of the data. Moreover, our method addresses another common problem encountered during small-sample training, namely insufficient feature extraction by U-Net, which often reduces performance. By improving the feature extraction process, our method maintains a satisfactory level of performance even with limited training samples. Additionally, the prior information is used to initialize the parameters of the enhanced FCM, which reduces the impact of initialization and results in better convergence and improved performance. To evaluate the efficacy of our method, we conducted experiments on both simulated and real brain MR images. The results demonstrate that our algorithm outperforms the compared methods in terms of robustness to asymmetric intensity distributions, noise, and intensity inhomogeneity. In future work, we will focus on the outliers of the intensity distributions of brain MR images.
Acknowledgement
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper. This work was supported by the Open Project of Center for Applied Mathematics of Jiangsu Province (Nanjing University of Information Science and Technology).
References
- L. Li, L. Sun, Y. Xue, S. Li, X. Huang, R. F. Mansour, "Fuzzy Multilevel Image Thresholding Based on Improved Coyote Optimization Algorithm," IEEE Access, vol.9, pp.33595-33607, 2021. https://doi.org/10.1109/ACCESS.2021.3060749
- C. Liu, W. Liu, W. Xing, "An improved edge-based level set method combining local regional fitting information for noisy image segmentation," Signal Processing, vol.130, pp.12-21, 2017. https://doi.org/10.1016/j.sigpro.2016.06.013
- X. Jiang, H. Yu, S. Lv, "An Image Segmentation Algorithm Based on a Local Region Conditional Random Field Model," International Journal of Communications, Network and System Sciences, vol.13, no.9, pp.139-159, 2020. https://doi.org/10.4236/ijcns.2020.139009
- D. Stosic, D. Stosic, T. B. Ludermir, T. I. Ren, "Natural image segmentation with non-extensive mixture models," Journal of Visual Communication and Image Representation, vol.63, 2019.
- S. Tongbram, B. A. Shimray, L. S. Singh, "Segmentation of image based on k-means and modified subtractive clustering," Indonesian Journal of Electrical Engineering and Computer Science, vol.22, no.3, pp.1396-1403, 2021. https://doi.org/10.11591/ijeecs.v22.i3.pp1396-1403
- X. Zhang, H. Wang, Y. Zhang, X. Gao, G. Wang, C. Zhang, "Improved fuzzy clustering for image segmentation based on a low-rank prior," Computational Visual Media, vol.7, pp.513-528, 2021. https://doi.org/10.1007/s41095-021-0239-3
- O. Ronneberger, P. Fischer, T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," in Proc. of Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015: 18th International Conference, Proceedings, Part III, pp.234-241, 2015.
- Z. Zhou, M. M. R. Siddiquee, N. Tajbakhsh, J. Liang, "UNet++: Redesigning Skip Connections to Exploit Multi-scale Features in Image Segmentation," IEEE Transactions on Medical Imaging, vol.39, no.6, pp.1856-1867, 2020. https://doi.org/10.1109/TMI.2019.2959609
- N. Ibtehaz, M. S. Rahman, "MultiResUNet: Rethinking the U-Net architecture for multimodal biomedical image segmentation," Neural Networks, vol.121, pp.74-87, 2020. https://doi.org/10.1016/j.neunet.2019.08.025
- H. Zhang, Q. M. J. Wu, T. M. Nguyen, "A Robust Fuzzy Algorithm Based on Student's t-Distribution and Mean Template for Image Segmentation Application," IEEE Signal Processing Letters, vol.20, no.2, pp.117-120, 2013. https://doi.org/10.1109/LSP.2012.2230626
- S. Chen, D. Zhang, "Robust image segmentation using FCM with spatial constraints based on new kernel-induced distance measure," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol.34, no.4, pp.1907-1916, 2004. https://doi.org/10.1109/TSMCB.2004.831165
- X. Zhao, Y. Li, Q. Zhao, "Mahalanobis distance based on fuzzy clustering algorithm for image segmentation," Digital Signal Processing, vol.43, pp.8-16, 2015. https://doi.org/10.1016/j.dsp.2015.04.009
- Z. Ji, J. Liu, G. Cao, Q. Sun, Q. Chen, "Robust spatially constrained fuzzy c-means algorithm for brain MR image segmentation," Pattern Recognition, vol.47, no.7, pp.2454-2466, 2014. https://doi.org/10.1016/j.patcog.2014.01.017
- H. Sun, X. Yang, H. Gao, "A spatially constrained shifted asymmetric Laplace mixture model for the grayscale image segmentation," Neurocomputing, vol.331, pp.50-57, 2019. https://doi.org/10.1016/j.neucom.2018.10.039
- A. Azzalini, A. Capitanio, "Statistical Applications of the Multivariate Skew Normal Distribution," Journal of the Royal Statistical Society Series B: Statistical Methodology, vol.61, no.3, pp.579-602, 1999. https://doi.org/10.1111/1467-9868.00194
- A. Azzalini, M. Chiogna, "Some results on the stress-strength model for skew-normal variates," Metron - International Journal of Statistics, vol.LXII, no.3, pp.315-326, 2004.
- Y. Chen, H. Zhang, Y. Zheng, B. Jeon, Q. M. J. Wu, "An improved anisotropic hierarchical fuzzy c-means method based on multivariate student t-distribution for brain MRI segmentation," Pattern Recognition, vol.60, pp.778-792, 2016. https://doi.org/10.1016/j.patcog.2016.06.020
- A. Azzalini, A. D. Valle, "The multivariate skew-normal distribution," Biometrika, vol.83, no.4, pp.715-726, 1996. https://doi.org/10.1093/biomet/83.4.715
- Y. Chen, N. Cheng, M. Cai, C. Cao, J. Yang, Z. Zhang, "A spatially constrained asymmetric Gaussian mixture model for image segmentation," Information Sciences, vol.575, pp.41-65, 2021. https://doi.org/10.1016/j.ins.2021.06.034
- M.N. Ahmed, S.M. Yamany, N. Mohamed, A.A. Farag, T. Moriarty, "A modified fuzzy c-means algorithm for bias field estimation and segmentation of MRI data," IEEE Transactions on Medical Imaging, vol.21, no.3, pp.193-199, 2002. https://doi.org/10.1109/42.996338
- S. Krinidis, V. Chatzis, "A Robust Fuzzy Local Information C-Means Clustering Algorithm," IEEE Transactions on Image Processing, vol.19, no.5, pp.1328-1337, 2010. https://doi.org/10.1109/TIP.2010.2040763
- R. R. Gharieb, G. Gendy, "Fuzzy C-means with a local membership kl distance for medical image segmentation," in Proc. of 2014 Cairo International Biomedical Engineering Conference (CIBEC), pp.47-50, 2014.
- R. R. Gharieb, G. Gendy, A. Abdelfattah, H. Selim, "Adaptive local data and membership based KL divergence incorporating C-means algorithm for fuzzy image segmentation," Applied Soft Computing, vol.59, pp.143-152, 2017. https://doi.org/10.1016/j.asoc.2017.05.055
- C. Wang, W. Pedrycz, Z. Li, M. Zhou, "Kullback-Leibler Divergence-Based Fuzzy C-Means Clustering Incorporating Morphological Reconstruction and Wavelet Frames for Image Segmentation," IEEE Transactions on Cybernetics, vol.52, no.8, pp.7612-7623, 2022. https://doi.org/10.1109/TCYB.2021.3099503
- B. Yang, X. Fu, N. D. Sidiropoulos, M. Hong, "Towards K-means-friendly spaces: simultaneous deep learning and clustering," in Proc. of ICML'17: Proceedings of the 34th International Conference on Machine Learning, vol.70, pp.3861-3870, 2017.
- W.M. Wells, W.E.L. Grimson, R. Kikinis, F.A. Jolesz, "Adaptive segmentation of MRI data," IEEE Transactions on Medical Imaging, vol.15, no.4, pp.429-442, 1996. https://doi.org/10.1109/42.511747
- D.L. Pham, J.L. Prince, "Adaptive fuzzy segmentation of magnetic resonance images," IEEE Transactions on Medical Imaging, vol.18, no.9, pp.737-752, 1999. https://doi.org/10.1109/42.802752
- K. Van Leemput, F. Maes, D. Vandermeulen, P. Suetens, "Automated model-based bias field correction of MR images of the brain," IEEE Transactions on Medical Imaging, vol.18, no.10, pp.885-896, 1999. https://doi.org/10.1109/42.811268
- C. Li, J. C. Gore, C. Davatzikos, "Multiplicative intrinsic component optimization (MICO) for MRI bias field estimation and tissue segmentation," Magnetic Resonance Imaging, vol.32, no.7, pp.913-923, 2014. https://doi.org/10.1016/j.mri.2014.03.010