Search | Korea Science

Document Clustering Using Semantic Features and Fuzzy Relations

Kim, Chul-Won;Park, Sun
- Journal of information and communication convergence engineering / v.11 no.3 / pp.179-184 / 2013
Traditional clustering methods are usually based on the bag-of-words (BOW) model. A disadvantage of the BOW model is that it ignores the semantic relationship among terms in the data set. To resolve this problem, ontology or matrix factorization approaches are usually used. However, a major problem of the ontology approach is that it is usually difficult to find a comprehensive ontology that can cover all the concepts mentioned in a collection. This paper proposes a new document clustering method using semantic features and fuzzy relations for solving the problems of ontology and matrix factorization approaches. The proposed method can improve the quality of document clustering because the clustered documents use fuzzy relation values between semantic features and terms to distinguish clearly among dissimilar documents in clusters. The selected cluster label terms can represent the inherent structure of a document set better by using semantic features based on non-negative matrix factorization, which is used in document clustering. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.
https://doi.org/10.6109/jicce.2013.11.3.179

Online Monaural Ambient Sound Extraction based on Nonnegative Matrix Factorization Method for Audio Contents (오디오 컨텐츠를 위한 비음수 행렬 분해 기법 기반의 실시간 단일채널 배경 잡음 추출 기법)

Lee, Seokjin
- Journal of Broadcast Engineering / v.19 no.6 / pp.819-825 / 2014
This paper describes a monaural ambient component extraction algorithm based on nonnegative matrix factorization (NMF). The algorithm is developed for audio upmixing systems; recent research has shown that applying the extracted ambient signal to a multichannel audio upmixing system can enhance listener envelopment. However, the conventional method stores the entire audio signal and processes it all at once, so it cannot be applied to streaming systems or digital signal processor (DSP) systems. To solve this problem, an ambient component extraction algorithm based on online nonnegative matrix factorization is developed and evaluated. Analysis of the processed signals using spectral flatness measures showed that the developed system extracts an ambient signal similar to that of the conventional batch-processing system.
https://doi.org/10.5909/JBE.2014.19.6.819
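
The spectral flatness measure mentioned in the abstract above is commonly defined as the ratio of the geometric mean to the arithmetic mean of a power spectrum. The following is a minimal Python sketch of that standard definition, not the paper's evaluation code; the function name `spectral_flatness` and the example spectra are illustrative assumptions.

```python
import math

def spectral_flatness(power_spectrum):
    """Geometric mean divided by arithmetic mean of strictly positive power values."""
    n = len(power_spectrum)
    # Compute the geometric mean in the log domain to avoid overflow/underflow.
    log_mean = sum(math.log(p) for p in power_spectrum) / n
    arithmetic_mean = sum(power_spectrum) / n
    return math.exp(log_mean) / arithmetic_mean

# Hypothetical spectra: a perfectly flat one and one dominated by a single bin.
flat = spectral_flatness([1.0, 1.0, 1.0, 1.0])
peaky = spectral_flatness([100.0, 0.01, 0.01, 0.01])
print(flat, peaky)
```

A value near 1 indicates a noise-like (flat) spectrum, while a value near 0 indicates a spectrum dominated by a few strong components.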

On dense column splitting in interior point methods of linear programming (내부점 선형계획법의 밀집열 분할에 대하여)

설동렬;박순달;정호원
- Korean Management Science Review / v.14 no.2 / pp.69-79 / 1997
The computational speed of the interior point method for linear programming depends on the speed of Cholesky factorization. If the coefficient matrix $A$ has dense columns, then the matrix $A \Theta A^{T}$ becomes dense, which slows Cholesky factorization. We study an efficient implementation of dense column splitting, one of the techniques for handling dense columns, and analyze the relation between dense column splitting and the ordering methods used to improve the sparsity of the Cholesky factor.
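
The fill-in effect described in the abstract above can be illustrated with a small Python sketch. It assumes $\Theta$ is a positive diagonal scaling matrix, as is usual in interior point methods, and uses a toy matrix; the helper `normal_matrix_nnz` and the example data are hypothetical. The sketch only shows why one dense column makes $A \Theta A^{T}$ dense; it does not implement the splitting technique itself.

```python
def normal_matrix_nnz(A, theta):
    """Count nonzero entries of A * diag(theta) * A^T for a list-of-lists matrix A."""
    m, n = len(A), len(A[0])
    nnz = 0
    for i in range(m):
        for k in range(m):
            entry = sum(A[i][j] * theta[j] * A[k][j] for j in range(n))
            if entry != 0:
                nnz += 1
    return nnz

m = 6
# Sparse case: each column touches exactly one row (an identity matrix).
sparse_A = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
# Same matrix with one additional column that is nonzero in every row.
dense_A = [row + [1.0] for row in sparse_A]

nnz_sparse = normal_matrix_nnz(sparse_A, [1.0] * m)
nnz_dense = normal_matrix_nnz(dense_A, [1.0] * (m + 1))
print(nnz_sparse, nnz_dense)  # 6 nonzeros versus a full 6 x 6 = 36
```

A single dense column contributes a rank-one term that touches every pair of rows, so the product loses all of its sparsity.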

Document Clustering Method using Coherence of Cluster and Non-negative Matrix Factorization (비음수 행렬 분해와 군집의 응집도를 이용한 문서군집)

Kim, Chul-Won;Park, Sun
- Journal of the Korea Institute of Information and Communication Engineering / v.13 no.12 / pp.2603-2608 / 2009
Document clustering is an important method for document analysis and is used in many information retrieval applications. This paper proposes a new document clustering model that combines NMF (non-negative matrix factorization)-based clustering with refinement of the documents in each cluster using cluster coherence. The proposed method can improve the quality of document clustering because documents are re-assigned to clusters using cluster coherence based on the similarity between documents, and because the semantic feature matrix and the semantic variable matrix used in clustering better represent the inherent structure of the document set. The experimental results demonstrate that applying the proposed method achieves better performance than other document clustering methods.
https://doi.org/10.6109/JKIICE.2009.13.12.2603

Topic-based Multi-document Summarization Using Non-negative Matrix Factorization and K-means (비음수 행렬 분해와 K-means를 이용한 주제기반의 다중문서요약)

Park, Sun;Lee, Ju-Hong
- Journal of KIISE:Software and Applications / v.35 no.4 / pp.255-264 / 2008
This paper proposes a novel method using K-means and non-negative matrix factorization (NMF) for topic-based multi-document summarization. NMF decomposes a weighted term-by-sentence matrix into two sparse non-negative matrices: a semantic feature matrix and a semantic variable matrix. The obtained semantic features are intuitively comprehensible. Weighted similarity between the topic and the semantic features can prevent meaningless sentences that are similar to the topic from being selected. K-means clustering removes noise from the sentences so that biased semantics of the documents are not reflected in the summaries. In addition, the coherence of document summaries can be enhanced by arranging the selected sentences in the order of their ranks. The experimental results show that the proposed method achieves better performance than other methods.
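
The decomposition of a term-by-sentence matrix described in the abstract above can be sketched in plain Python. This is a minimal illustration using the well-known Lee-Seung multiplicative updates for the squared-error objective, which is an assumption here; the paper's weighting scheme, topic similarity, and K-means step are not reproduced. The matrix `A`, the rank `r = 2`, and all function names are hypothetical.

```python
import random

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def frob_error(A, W, H):
    """Squared Frobenius norm of A - W H."""
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb))

def nmf(A, r, iters=200, seed=0, eps=1e-9):
    """Factor A (terms x sentences) into W (terms x r) and H (r x sentences)."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(r)]
    for _ in range(iters):
        # Update H elementwise: H <- H * (W^T A) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n)] for k in range(r)]
        # Update W elementwise: W <- W * (A H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(r)] for i in range(m)]
    return W, H

# Toy weighted term-by-sentence matrix: 4 terms (rows) x 4 sentences (columns).
A = [[2.0, 1.0, 0.0, 0.0],
     [1.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 1.0],
     [0.0, 0.0, 1.0, 2.0]]
W, H = nmf(A, r=2)  # W: semantic feature matrix, H: semantic variable matrix
print(frob_error(A, W, H))
```

Each column of `W` can be read as a semantic feature over terms, and each column of `H` gives the weights of those features for one sentence.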

A Study on Collaborative Filtering Recommender system based on Item Knowledge (아이템 정보 기반 협업 필터링 추천 시스템 연구)

Yang, Yeong-Wook;Yun, You-Dong;Lim, Heui-Seok
- Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.439-441 / 2017
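Matrix factorization is one of the successful techniques for recommending items based on users' item preferences. The technique aims to fill in the user-item preference matrix. To achieve this, it decomposes the user-item preference matrix into a user matrix (user latent factors) and an item matrix (item latent factors), infers each of these matrices, and from them infers the completed user-item preference matrix. However, the performance of matrix factorization is limited when the number of items is large and users' preference data for the items is sparse. In addition, when a new item is added, there is no user preference information for it, so the new item is not recommended. To address these problems, this paper proposes an item-knowledge-based recommender system that models additional item information, namely the similarity between items and the similarity of the items' scenario information, and adds it to conventional matrix factorization.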
https://doi.org/10.3745/PKIPS.y2017m04a.439
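
The conventional matrix factorization step described in the abstract above, decomposing the user-item preference matrix into user and item latent factors and using their product to fill in missing preferences, can be sketched as follows. This is a plain stochastic-gradient baseline under assumed hyperparameters, not the item-knowledge extension the paper proposes; the ratings, the function names, and the values of `k`, `lr`, `reg`, and `epochs` are all hypothetical.

```python
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=1000, seed=0):
    """Learn user factors P and item factors Q from (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5

# Observed entries of a 4-user x 4-item preference matrix; the rest are unknown.
ratings = [(0, 0, 5), (0, 1, 3), (0, 3, 1), (1, 0, 4), (1, 3, 1),
           (2, 0, 1), (2, 1, 1), (2, 3, 5), (3, 0, 1), (3, 2, 5), (3, 3, 4)]

P0, Q0 = train_mf(ratings, 4, 4, epochs=0)   # untrained factors, for comparison
P, Q = train_mf(ratings, 4, 4)
print(rmse(ratings, P0, Q0), rmse(ratings, P, Q))
print(predict(P, Q, 1, 1))  # estimate for an unobserved user-item pair
```

Because an item with no observed ratings never appears in the training loop, its factors stay at their random initial values, which is the new-item problem the abstract points out.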

Recovery of Lost Speech Segments Using Incremental Subspace Learning

Huang, Jianjun;Zhang, Xiongwei;Zhang, Yafei
- ETRI Journal / v.34 no.4 / pp.645-648 / 2012
An incremental subspace learning scheme to recover lost speech segments online is presented. Our contributions in this work are twofold. First, the recovery problem is transformed into an interpolation problem of the time-varying gains via nonnegative matrix factorization. Second, incremental nonnegative matrix factorization is employed to allow online processing and to track the evolution of speech statistics. The effectiveness of the proposed scheme is confirmed by the experimental results.
https://doi.org/10.4218/etrij.12.0211.0408

Clustering gene expression data using Non-negative matrix factorization (Non-negative matrix factorization을 이용한 마이크로어레이 데이터의 클러스터링)

Lee, Min-Young;Cho, Ji-Hoon;Lee, In-Beum
- Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.117-123 / 2004
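Since the development of microarray technology, the problem of finding clusters of related genes has been studied in depth. One of the key challenges in this problem is determining a biologically plausible number of clusters. This paper presents a criterion for determining the optimal number of clusters and proposes a method that uses non-negative matrix factorization (NMF) to find the patterns of the cluster centroids. Each pattern discovered by NMF can be interpreted as a specific part of a biological process. NMF constrains the entries of the factor matrices to be non-negative, and because this constraint allows only additive combinations, it can find such partial patterns. The usefulness of NMF has already been demonstrated in image analysis and text analysis. With the proposed method, genes whose expression patterns are similar to the above patterns could be grouped together. The proposed method was applied to human fibroblast data and yeast cell cycle data to demonstrate its performance.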

Vehicle Face Recognition Algorithm Based on Weighted Nonnegative Matrix Factorization with Double Regularization Terms

Shi, Chunhe;Wu, Chengdong
- KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.5 / pp.2171-2185 / 2020
To judge whether vehicles in different images captured by surveillance cameras are the same vehicle, we propose a novel vehicle face recognition algorithm based on improved Nonnegative Matrix Factorization (NMF). In contrast to traditional vehicle recognition, a vehicle face image generally contains fewer effective features than a whole-vehicle image, which makes recognition more difficult. The innovations include the following two aspects: 1) we propose the idea that the vehicle type can be determined from a few key regions of the vehicle face, such as the logo and grille; 2) by adding weight, sparseness, and classification property constraints to the NMF model, we obtain effective feature bases that represent the key regions of the vehicle face image. Experimental results show that the proposed algorithm not only achieves a high correct recognition rate but is also strongly robust to non-cooperative factors such as illumination variation.
https://doi.org/10.3837/tiis.2020.05.017

Enhancing Text Document Clustering Using Non-negative Matrix Factorization and WordNet

Kim, Chul-Won;Park, Sun
- Journal of information and communication convergence engineering / v.11 no.4 / pp.241-246 / 2013
A classic document clustering technique may incorrectly classify documents into different clusters when documents that should belong to the same cluster do not have any shared terms. Recently, to overcome this problem, internal and external knowledge-based approaches have been used for text document clustering. However, the clustering results of these approaches are influenced by the inherent structure and the topical composition of the documents. Further, the organization of knowledge into an ontology is expensive. In this paper, we propose a new enhanced text document clustering method using non-negative matrix factorization (NMF) and WordNet. The semantic terms extracted as cluster labels by NMF can represent the inherent structure of a document cluster well. The proposed method can also improve the quality of document clustering that uses cluster labels and term weights based on term mutual information of WordNet. The experimental results demonstrate that the proposed method achieves better performance than the other text clustering methods.
https://doi.org/10.6109/jicce.2013.11.4.241

