• Title/Summary/Keyword: Latent semantic analysis

Search Result 64, Processing Time 0.021 seconds

Estimating People's Position Using Matrix Decomposition

  • Dao, Thi-Nga;Yoon, Seokhoon
    • International journal of advanced smart convergence
    • /
    • v.8 no.2
    • /
    • pp.39-46
    • /
    • 2019
  • Human mobility estimation plays a key factor in a lot of promising applications including location-based recommendation systems, urban planning, and disease outbreak control. We study the human mobility estimation problem in the case where recent locations of a person-of-interest are unknown. Since matrix decomposition is used to perform latent semantic analysis of multi-dimensional data, we propose a human location estimation algorithm based on matrix factorization to reconstruct the human movement patterns through the use of information of persons with correlated movements. Specifically, the optimization problem which minimizes the difference between the reconstructed and actual movement data is first formulated. Then, the gradient descent algorithm is applied to adjust parameters which contribute to reconstructed mobility data. The experiment results show that the proposed framework can be used for the prediction of human location and achieves higher predictive accuracy than a baseline model.

Information Retrieval based on Probabilistic Latent Semantic Analysis within P2P Environments (P2P 환경에서 확률적 잠재 의미 분석에 기반한 정보 검색)

  • Gu, Tae-Wan;Kim, Yu-Seop;Lee, Kwang-Mo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2004.05a
    • /
    • pp.515-518
    • /
    • 2004
  • 전통적인 Peer-to-Peer 모델에서 정보검색 문제를 해결하기 위한 방법으로는 질의 및 키워드를 각 Peer에 전송하여 해당 질의 및 키워드와 문서들을 비교하는 방법이 대부분이었다. 본 논문에서는 이러한 방법을 확장하여 문서에 대한 의미론적 분석을 통해 검색의 정확성을 향상시키고자 한다. 이를 위해 본 논문에서는 확률적 의미분석 기법을 이용하여 각 Peer에 존재하는 정보에 대한 색인을 작성 한 후, 이것을 Peer-to-Peer 환경에 적용하기 위한 분산 색인 분배 알고리즘을 제안한다.

  • PDF

KOREAN TOPIC MODELING USING MATRIX DECOMPOSITION

  • June-Ho Lee;Hyun-Min Kim
    • East Asian mathematical journal
    • /
    • v.40 no.3
    • /
    • pp.307-318
    • /
    • 2024
  • This paper explores the application of matrix factorization, specifically CUR decomposition, in the clustering of Korean language documents by topic. It addresses the unique challenges of Natural Language Processing (NLP) in dealing with the Korean language's distinctive features, such as agglutinative words and morphological ambiguity. The study compares the effectiveness of Latent Semantic Analysis (LSA) using CUR decomposition with the classical Singular Value Decomposition (SVD) method in the context of Korean text. Experiments are conducted using Korean Wikipedia documents and newspaper data, providing insight into the accuracy and efficiency of these techniques. The findings demonstrate the potential of CUR decomposition to improve the accuracy of document clustering in Korean, offering a valuable approach to text mining and information retrieval in agglutinative languages.

Research trends in the Korean Journal of Women Health Nursing from 2011 to 2021: a quantitative content analysis

  • Ju-Hee Nho;Sookkyoung Park
    • Women's Health Nursing
    • /
    • v.29 no.2
    • /
    • pp.128-136
    • /
    • 2023
  • Purpose: Topic modeling is a text mining technique that extracts concepts from textual data and uncovers semantic structures and potential knowledge frameworks within context. This study aimed to identify major keywords and network structures for each major topic to discern research trends in women's health nursing published in the Korean Journal of Women Health Nursing (KJWHN) using text network analysis and topic modeling. Methods: The study targeted papers with English abstracts among 373 articles published in KJWHN from January 2011 to December 2021. Text network analysis and topic modeling were employed, and the analysis consisted of five steps: (1) data collection, (2) word extraction and refinement, (3) extraction of keywords and creation of networks, (4) network centrality analysis and key topic selection, and (5) topic modeling. Results: Six major keywords, each corresponding to a topic, were extracted through topic modeling analysis: "gynecologic neoplasms," "menopausal health," "health behavior," "infertility," "women's health in transition," and "nursing education for women." Conclusion: The latent topics from the target studies primarily focused on the health of women across all age groups. Research related to women's health is evolving with changing times and warrants further progress in the future. Future research on women's health nursing should explore various topics that reflect changes in social trends, and research methods should be diversified accordingly.

Research trend analysis of Korean new graduate nurses using topic modeling (토픽모델링을 활용한 신규간호사 관련 국내 연구동향 분석)

  • Park, Seungmi;Lee, Jung Lim
    • The Journal of Korean Academic Society of Nursing Education
    • /
    • v.27 no.3
    • /
    • pp.240-250
    • /
    • 2021
  • Purpose: The aim of this study is to analyze the research trends of articles on just graduated Korean nurses during the past 10 years for exploring strategies for clinical adaptation. Methods: The topics of new graduate nurses were extracted from 110 articles that have been published in Korean journals between January 2010 and July 2020. Abstracts were retrieved from 4 databases (DBpia, RISS, KISS and Google scholar). Keywords were extracted from the abstracts and cleaned using semantic morphemes. Network analysis and topic modeling were performed using the NetMiner program. Results: The core keywords included 'education', 'training', 'program', 'skill', 'care', 'performance', and 'satisfaction'. In recent articles on new graduate nurses, three major topics were extracted by Latent Dirichlet Allocation (LDA) techniques: 'turnover', 'adaptation', 'education'. Conclusion: Previous articles focused on exploring the factors related to the adaptation and turnover intentions of new graduate nurses. It is necessary to conduct further research focused on various interventions at the individual, task, and organizational levels to improve the retention of new graduate nurses.

Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model

  • Jeong, Young-Seob;Jin, Sou-Young;Choi, Ho-Jin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.7 no.1
    • /
    • pp.81-98
    • /
    • 2013
  • Since Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) were introduced, many revised or extended topic models have appeared. Due to the intractable likelihood of these models, training any topic model requires to use some approximation algorithm such as variational approximation, Laplace approximation, or Markov chain Monte Carlo (MCMC). Although these approximation algorithms perform well, training a topic model is still computationally expensive given the large amount of data it requires. In this paper, we propose a new method, called non-simultaneous sampling deactivation, for efficient approximation of parameters in a topic model. While each random variable is normally sampled or obtained by a single predefined burn-in period in the traditional approximation algorithms, our new method is based on the observation that the random variable nodes in one topic model have all different periods of convergence. During the iterative approximation process, the proposed method allows each random variable node to be terminated or deactivated when it is converged. Therefore, compared to the traditional approximation ways in which usually every node is deactivated concurrently, the proposed method achieves the inference efficiency in terms of time and memory. We do not propose a new approximation algorithm, but a new process applicable to the existing approximation algorithms. Through experiments, we show the time and memory efficiency of the method, and discuss about the tradeoff between the efficiency of the approximation process and the parameter consistency.

Salient Object Detection Based on Regional Contrast and Relative Spatial Compactness

  • Xu, Dan;Tang, Zhenmin;Xu, Wei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.7 no.11
    • /
    • pp.2737-2753
    • /
    • 2013
  • In this study, we propose a novel salient object detection strategy based on regional contrast and relative spatial compactness. Our algorithm consists of four basic steps. First, we learn color names offline using the probabilistic latent semantic analysis (PLSA) model to find the mapping between basic color names and pixel values. The color names can be used for image segmentation and region description. Second, image pixels are assigned to special color names according to their values, forming different color clusters. The saliency measure for every cluster is evaluated by its spatial compactness relative to other clusters rather than by the intra variance of the cluster alone. Third, every cluster is divided into local regions that are described with color name descriptors. The regional contrast is evaluated by computing the color distance between different regions in the entire image. Last, the final saliency map is constructed by incorporating the color cluster's spatial compactness measure and the corresponding regional contrast. Experiments show that our algorithm outperforms several existing salient object detection methods with higher precision and better recall rates when evaluated using public datasets.

Semantic Analysis of Indian Original Stupa - A Comparative Study on the Transmission and Style of the Buddhist Pagoda I - (인도시원불탑(印度始原佛塔)의 의미론적(意味論的) 해석(解析) - 불탑건축의 전래와 양식에 관한 비교론적 고찰 I -)

  • Cheon, Deuk-Youm
    • Journal of architectural history
    • /
    • v.2 no.2 s.4
    • /
    • pp.89-106
    • /
    • 1993
  • Wherever Buddhism has flourished, there were stupas in the form of monuments which have their origin in the tumulm of prehistoric times. After the death of Buddha, his body was cremated following the Indian funeral custom. His ashes, which long reserved for the remains of nobles and holymen, were enshrined under such artificial hills of earth and brick. The Stupa was in origin a simple burial-mound. The form of the burial-mound was a symbolical or magic reconstruction of the imagined shape of the sky, like a dome covering the earth. The domical form of the earliest tumuli may have been concious replicas of the shape of the Vedic hut. There are relationships which may have originally existed between the stupa and West Asiatic monuments. Buddhist Stupa originally cosisted of an almost hemispherical tumulus(anda) and an altar-like structure (harmika) on its top, surmounted by one or several superimposed honorific umbrellas (hti, catta). This hemispherical form is associated with centralisation, lunar worship, mother earth, and Siva. Anda means a symbol of latent creative power, the harmika symbolizes the sanctuary enthroned aboved the world. The honorific umbrella, as an abstract imitation of the shade-giving tree is one of the chief solar symbols and that of enlightenment.

  • PDF

Intelligne information retrieval using latent semantic analysis on the internet (인터넷에서 잠재적 의미 분석을 이용한 지능적 정보 검색)

  • 임재현;김영찬
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.22 no.8
    • /
    • pp.1782-1789
    • /
    • 1997
  • Most systems that retrieve distributed information on the Internet have difficulties in retrieving relevant information for they are not able to reflect exact semantics on retrieval queries that usersrequest. In this paepr, we propose an automatic query expansion based on ter distribution which reflects semantics of retrieval term to emhance the performance of information retrieval. We computed weight, indicating its overal imoritance in the collection documents and user's query and we use LSI's SVD technique to measure the term distribution which appears similar to query. And also, we measure the similarity to compared numerical value with query terms. Also we researched the method to reduce additional terms automatically and evaluated the performance of the proposed method.

  • PDF

Agglomerative Hierarchical Clustering Using Latent Semantic Analysis in Information Retrieval (정보 검색에서의 잠재 의미 분석 방법을 이용한 응집 계층 군집화 기법 연구)

  • Khiati, Abdel-Ilah Zakaria;Kang, Daehyun;Park, Hansaem;Kwon, Kyunglag;Chung, In-Jeong
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2014.04a
    • /
    • pp.952-955
    • /
    • 2014
  • 본 논문에서는 정보 검색 분야에서 잘 알려진 잠재 의미 분석 방법과 계층적 군집화 방법의 단점을 상호 보완하여 보다 효율적인 정보 검색을 위한 혼합형 군집화 방법을 제안한다. 먼저, 잠재 의미 분석 방법은 벡터 연산을 통하여 자동적으로 문서 내에 있는 잠재적인 의미를 찾는 정보 검색분야에서 많이 사용되는 고전적인 방법이다. 그러나 이 방법은 언어의 유의성이나 다의성으로 인하여 발생되는 백-오브-워드(bag-of-word) 문제를 가지고 있다. 두 번째 방법인 문서 군집화를 위하여 범용적으로 사용되고 있는 계층적 군집화 방법이다. 이 방법은 이를 통하여 분석된 군집의 질적 측면에서 볼 때, 여전히 단층적 군집들이 많이 형성되어 세부적인 분석을 통한 추가적인 군집화가 필요함을 알 수 있다. 따라서, 본 논문에서는 앞서 언급한 문제점을 해결하기 위하여 혼합적인 방법으로 잠재 의미 분석 방법을 이용한 응집 계층 군집화 방법을 제안한다. 제안한 방법을 이용하여 잘 알려진 두 개의 데이터에 적용하고 기존의 방법과 그 결과를 비교함으로써 군집의 질적 측면에서의 우수함을 보인다.