Integrated Search | Korea Science

The extension of the largest generalized-eigenvalue based distance metric D_ij(γ₁) in arbitrary feature spaces to classify composite data points

Daoud, Mosaab
- Genomics & Informatics / Vol. 17, No. 4 / pp. 39.1-39.20 / 2019
Analyzing patterns in data points embedded in linear and non-linear feature spaces is a common research problem across areas such as data mining, machine learning, pattern recognition, and multivariate analysis. In this paper, the data points are heterogeneous sets of biosequences (composite data points). A composite data point is a set of ordinary data points (e.g., a set of feature vectors). We theoretically extend the derivation of the largest generalized-eigenvalue based distance metric D_ij(γ₁) to arbitrary linear and non-linear feature spaces. We prove that D_ij(γ₁) is a metric under any linear or non-linear feature transformation function. We show the sufficiency and efficiency of the decision rule $\bar{\delta}_{\Xi i}$ (i.e., the mean of D_ij(γ₁)) in classifying heterogeneous sets of biosequences, compared with the decision rules $\min_{\Xi i}$ and $\operatorname{median}_{\Xi i}$. We analyze the impact of linear and non-linear transformation functions on classifying and clustering collections of heterogeneous sets of biosequences. We show empirically how the length of a sequence in a simulated heterogeneous sequence-set affects classification and clustering results in linear and non-linear feature spaces. We propose a new concept: the limiting dispersion map of the existing clusters in heterogeneous sets of biosequences embedded in linear and non-linear feature spaces, which is based on the limiting distribution of nucleotide compositions estimated from real data sets. Finally, empirical conclusions and scientific evidence are drawn from the experiments to support the theoretical results stated in this paper.
https://doi.org/10.5808/GI.2019.17.4.e39
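
The abstract compares three decision rules that aggregate distances between a query composite data point and the members of each class: the mean, the minimum, and the median. The following is a minimal sketch of how such rules could be applied, not the paper's implementation. It assumes the D_ij(γ₁) distances have already been computed by some other means, and it assumes the query is assigned to the class with the smallest aggregated distance. The function name `classify`, the `RULES` table, and the example numbers are all hypothetical.

```python
from statistics import mean, median

# Hypothetical aggregation rules: each maps the list of distances between a
# query composite data point and the members of one class to a single score.
RULES = {"mean": mean, "min": min, "median": median}


def classify(distances_by_class, rule="mean"):
    """Return the class label whose aggregated distance to the query is smallest.

    distances_by_class maps a class label to a list of precomputed distances
    (assumed here to be D_ij(gamma_1) values; computing them is outside
    the scope of this sketch).
    """
    aggregate = RULES[rule]
    scores = {label: aggregate(d) for label, d in distances_by_class.items()}
    # Assumption: the nearest class under the chosen rule wins.
    return min(scores, key=scores.get)


# Made-up distances for illustration only (not data from the paper).
example = {
    "A": [0.1, 2.0, 2.1],
    "B": [0.8, 0.9, 1.0],
}

for rule in RULES:
    print(rule, classify(example, rule))
```

In this made-up example the rules can disagree: a single very close member of class A is enough to decide the minimum rule, while the mean and median rules depend on the distances to the class as a whole.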
