• Title/Summary/Keyword: Latent semantic analysis


Agglomerative Hierarchical Clustering Using Latent Semantic Analysis in Information Retrieval (정보 검색에서의 잠재 의미 분석 방법을 이용한 응집 계층 군집화 기법 연구)

  • Khiati, Abdel-Ilah Zakaria;Kang, Daehyun;Park, Hansaem;Kwon, Kyunglag;Chung, In-Jeong
    • Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.952-955 / 2014
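  • In this paper, we propose a hybrid clustering method for more efficient information retrieval that lets latent semantic analysis and hierarchical clustering, both well known in the information retrieval field, compensate for each other's shortcomings. Latent semantic analysis is a classical method, widely used in information retrieval, that automatically finds the latent meaning in documents through vector operations. However, it suffers from the bag-of-words problem caused by synonymy and polysemy in language. The second method, hierarchical clustering, is in general use for document clustering. Judged by the quality of the resulting clusters, it still produces many flat, single-level clusters, which shows that additional clustering based on finer-grained analysis is needed. To address these problems, we propose a hybrid method: agglomerative hierarchical clustering using latent semantic analysis. We apply the proposed method to two well-known datasets and compare the results with those of existing methods, showing that it yields clusters of better quality.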
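
The general idea of clustering documents in an LSA space can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy documents, the function names, the choice of two latent dimensions, average linkage, and cosine distance are all assumptions, and the truncated SVD is approximated with power iteration on the document Gram matrix so that only the standard library is needed.

```python
import math
import random


def top_eigenpairs(sym, k, iters=200, seed=0):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    rng = random.Random(seed)
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(n)]
        for _ in range(iters):
            w = [sum(a * b for a, b in zip(row, v)) for row in work]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(x * sum(a * b for a, b in zip(row, v)) for x, row in zip(v, work))
        pairs.append((lam, v))
        # Deflate: subtract the component just found before the next pass.
        for i in range(n):
            for j in range(n):
                work[i][j] -= lam * v[i] * v[j]
    return pairs


def lsa_embed(docs, k):
    """Project documents onto the top-k latent dimensions of the count matrix."""
    vocab = sorted({w for d in docs for w in d.split()})
    counts = [[d.split().count(w) for w in vocab] for d in docs]
    # Eigenvectors of the document Gram matrix are the document-side singular vectors.
    gram = [[sum(a * b for a, b in zip(r, s)) for s in counts] for r in counts]
    pairs = top_eigenpairs(gram, k)
    return [[math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
            for i in range(len(docs))]


def cosine_distance(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 1.0
    return 1.0 - sum(a * b for a, b in zip(u, v)) / (nu * nv)


def agglomerate(points, n_clusters):
    """Average-linkage agglomerative clustering down to n_clusters groups."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        total = sum(cosine_distance(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        candidates = [(i, j) for i in range(len(clusters))
                      for j in range(i + 1, len(clusters))]
        i, j = min(candidates, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy corpus: three vehicle documents, three finance documents.
DOCS = [
    "car engine wheel",
    "automobile engine wheel",
    "car automobile road",
    "bank loan money",
    "loan money credit",
    "bank credit finance",
]
clusters = agglomerate(lsa_embed(DOCS, k=2), n_clusters=2)
print(clusters)
```

In this toy corpus the first and third documents share only one word, yet in the two-dimensional latent space all three vehicle documents point in the same direction, so the agglomerative step groups them before it ever considers a finance document.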

Document Summarization Using Mutual Recommendation with LSA and Sense Analysis (LSA를 이용한 문장 상호 추천과 문장 성향 분석을 통한 문서 요약)

  • Lee, Dong-Wook;Baek, Seo-Hyeon;Park, Min-Ji;Park, Jin-Hee;Jung, Hye-Wuk;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.5 / pp.656-662 / 2012
  • In this paper, we describe a new summarization method that combines a graph-based analysis with a sense-based analysis. In the graph-based analysis, we convert the sentences of a document into word vectors and calculate the similarity between sentences using LSA. We use this sentence similarity, together with the rarity scores of the words in each sentence, to define the edge weights of the graph. In the sense-based analysis, to determine whether the sense of a word is subjective or objective, we built a database extended from a gold standard using WordNet. We calculate the subjectivity of each sentence from the senses of its words and select the more subjective sentences. Lastly, we combine the results of the two methods. We evaluate the proposed method using classification games, which are commonly used to measure the performance of summarization methods, and compare it with MS-Word auto-summarization to verify its effectiveness.

Analysis of the Knowledge Structure of Research related to Reality Shock Experienced by New Graduate Nurses using Text Network Analysis (텍스트네트워크분석을 활용한 신규간호사가 경험하는 현실충격 관련 연구의 지식구조 분석)

  • Heejang Yun
    • The Journal of the Convergence on Culture Technology / v.9 no.1 / pp.463-469 / 2023
  • The aim of this study is to provide basic data that can help new graduate nurses adapt successfully to clinical practice and reduce their turnover, by analyzing research on the reality shock they experience using text network analysis. Topics related to reality shock experienced by new graduate nurses were extracted from 115 papers published in domestic and foreign journals from January 2002 to December 2021. Articles were retrieved from six databases (Korean: DBpia, KISS, RISS; international: Web of Science, Springer, Scopus). Keywords were extracted from the abstracts and organized using semantic morphemes. Network analysis and topic modeling for analyzing the knowledge structure of the subject were performed using the NetMiner 4.5.0 program. The core keywords included 'new graduate nurses', 'reality shock', 'transition', 'student nurse', 'experience', 'practice', 'work environment', 'role', 'care' and 'education'. In recent articles on reality shock experienced by new graduate nurses, three major topics were extracted using LDA (Latent Dirichlet Allocation): 'turnover', 'work environment', and 'experience of transition'. Based on these findings, we suggest the need for interventional research that can effectively reduce the reality shock experienced by new graduate nurses and support their clinical adaptation.

Simulation Study on E-commerce Recommender System by Use of LSI Method (LSI 기법을 이용한 전자상거래 추천자 시스템의 시뮬레이션 분석)

  • Kwon, Chi-Myung
    • Journal of the Korea Society for Simulation / v.15 no.3 / pp.23-30 / 2006
  • A recommender system for an e-commerce site receives information from customers about which products they are interested in and recommends products that are likely to fit their needs. In this paper, we investigate several methods for analyzing large-scale product purchase data in order to produce useful recommendations for customers. We apply the traditional data mining techniques of cluster analysis and collaborative filtering (CF), as well as CF with the product dimensionality reduced by latent semantic indexing (LSI). If the reduced product dimensionality obtained from LSI shows a latent purchasing trend similar to that of the original customer-product purchase data, we expect that the lower computational effort needed to find the nearest neighbors of a target customer may improve the efficiency of recommendation. In simulation experiments on synthetic customer-product purchase data, the CF-based method with reduced product dimensionality performs better than the traditional CF methods with respect to recall, precision, and the F1 measure. In general, recommendation quality increases as the size of the neighborhood increases. However, our simulation results show that, after a certain point, the improvement diminishes. We also find that, as the number of recommended products increases, precision becomes worse, while the gain in recall is relatively small after a certain point. We consider that this information may be useful in applying recommender systems.
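
The evaluation measures named in this abstract can be illustrated with a short sketch. The ranked list, the purchased set, and the function name below are hypothetical and are not taken from the paper; the sketch only shows how precision, recall, and F1 are usually computed for a top-N recommendation list and how they move as N grows.

```python
def precision_recall_f1(recommended, purchased):
    """Precision, recall, and F1 of a recommendation list against actual purchases."""
    hits = len(set(recommended) & set(purchased))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(purchased) if purchased else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical ranked recommendations for one target customer (best first).
RANKED = ["p1", "p7", "p3", "p9", "p4", "p8"]
# Hypothetical products the customer actually bought.
PURCHASED = {"p1", "p3", "p5"}

scores = {n: precision_recall_f1(RANKED[:n], PURCHASED) for n in (2, 4, 6)}
for n, (p, r, f) in scores.items():
    print(f"top-{n}: precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

With this toy data, going from four to six recommendations lowers precision while recall stays flat, which is the kind of trade-off the abstract reports for longer recommendation lists.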


Design of Sidewalk Landscape Considering Human Sensibility (인간의 감성을 고려한 보도경관 설계모형에 관한 연구)

  • Lee, Byeong-Ju;Park, Sang-Myeong;Nam, Gung-Mun
    • Journal of Korean Society of Transportation / v.24 no.6 s.92 / pp.119-127 / 2006
  • Recently, with the rapid growth of cities and improved traffic awareness, there has been demand for a better sidewalk environment that considers psychological as well as physical factors. A better sidewalk environment is also needed because pedestrians avoid sidewalk spaces built to minimum physical design standards. It is therefore important, above all, to understand what makes pedestrians feel comfort and amenity on a sidewalk. In this study, we carried out a cognition experiment on the sidewalk environment that considers human psychology, using Sensibility Ergonomics and a survey based on the SD (Semantic Differential) scale. We then built a recognition evaluation model of the sidewalk landscape and a sensibility recognition model of sidewalk design factors, using a LISREL model that analyzes how sensibility adjectives rated on the SD scale are perceived. The results show that considering Sensibility Ergonomics makes it possible to design sidewalk environments with comfort and amenity, and that a harmonious green environment, such as roadside trees, matters most.

Comparison of Topic Modeling Methods for Analyzing Research Trends of Archives Management in Korea: focused on LDA and HDP (국내 기록관리학 연구동향 분석을 위한 토픽모델링 기법 비교 - LDA와 HDP를 중심으로 -)

  • Park, JunHyeong;Oh, Hyo-Jung
    • Journal of Korean Library and Information Science Society / v.48 no.4 / pp.235-258 / 2017
  • The purpose of this study is to analyze research trends in archives management in Korea by comparing LDA (Latent Dirichlet Allocation) topic modeling, the best-known method in text mining, with HDP (Hierarchical Dirichlet Process) topic modeling, which was developed from LDA. First, we collected 1,027 articles related to archives management published from 1997 to 2016 in two archives management journals and four library and information science journals in Korea, and performed several preprocessing steps. We then conducted LDA and HDP topic modeling. For a more in-depth comparison, we used LDAvis as a topic modeling visualization tool. The results show that LDA topic modeling was influenced by keywords that are frequent across all topics, whereas HDP topic modeling produced specific keywords that make the characteristics of each topic easy to identify.

Design of Character-based Conversational Instruction-Learning System Design for Science Education of Elementary School (초등 과학수업을 위한 캐릭터 기반의 대화형 교수-학습 시스템 설계)

  • Jeong Sang-Mok;Song Ki-Sang
    • Journal of the Korea Society of Computer and Information / v.10 no.5 s.37 / pp.343-352 / 2005
  • Existing CAI and web-based science learning systems for elementary schools have some disadvantages. For instance, they consist of uniform courses designed by an instructor without considering the learner's characteristics, and the opinions or questions that a learner raises during learning cannot be delivered to the system. This structure diminishes the learner's willingness and motivation and adversely affects learning efficiency. An Instruction-Learning System is therefore needed that provides a learning environment suited to the learner's individual characteristics and motivates active participation in learning. This study designs a character-based conversational Instruction-Learning System. The system may induce the learner's active participation through communication between instructor and learner, and may furnish various learning materials that motivate learners and sustain their interest in learning.


Color Recommendation for Text Based on Colors Associated with Words

  • Liba, Saki;Nakamura, Tetsuaki;Sakamoto, Maki
    • Journal of Korea Society of Industrial Information Systems / v.17 no.1 / pp.21-29 / 2012
  • In this paper, we propose a new method for selecting colors that represent the meaning of text content, based on the cognitive relation between words and colors. Our method builds on a previous study that revealed the existence of words crucial for estimating the colors associated with the meaning of a text. Using the associative probability of each color with a given word and the strength of the word's color association, we estimate the probability of the colors associated with a given text. The goal of this study is to propose a system that recommends cognitively plausible colors for the meaning of the input text. To build a versatile and efficient database for our system, we conducted two psychological experiments using news site articles. In experiment 1, we collected 498 words that participants chose as having a strong association with color. In experiment 2, we investigated which color was associated with each word. In addition to those data, we used estimated values of the strength of color association and of the associated colors for the words in a very large newspaper corpus (approximately 130,000 words), based on the similarity between words obtained by Latent Semantic Analysis (LSA). Our method therefore allows us to select colors for a large variety of words or sentences. Finally, by comparing the colors answered by participants with the colors estimated by our method, we verified that our system succeeded in proposing colors associated with the meaning of the input text. Our system is expected to be of use in various situations, such as data visualization, information retrieval, and art or web page design.

A Feasibility Study on Adopting Individual Information Cognitive Processing as Criteria of Categorization on Apple iTunes Store

  • Zhang, Chao;Wan, Lili
    • The Journal of Information Systems / v.27 no.2 / pp.1-28 / 2018
  • Purpose: More than 7.6 million mobile apps have been approved on the Apple iTunes Store and Google Play. To manage these apps, Apple Inc. established twenty-four primary categories, and Google Play had thirty-three. However, both categorizations show a growing number of problems in managing and classifying so many apps, such as miscategorized apps, cross-attribution problems, and the lack of a categorization keyword index. This study introduces individual information cognitive processing as a classification criterion for updating the current categorization on the Apple iTunes Store, and observes the effectiveness of the new criterion in a classification process on the Apple iTunes Store. Design/Methodology/Approach: A research approach with four stages was carried out, and a series of mixed methods was developed to assess the feasibility of adopting individual information cognitive processing as a categorization criterion. Keyword lists were extracted using machine-learning techniques with Term Frequency-Inverse Document Frequency (TF-IDF) and Singular Value Decomposition (SVD). Using prior research results on the categorization of car apps, we developed individual information cognitive processing. A further keyword extraction process was then applied to the extracted keyword lists. Findings: Using TF-IDF and SVD, keyword lists were extracted from more than five thousand apps. Furthermore, we developed individual information cognitive processing that included a categorization teaching process and a learning process. The top three keywords for each category were extracted. A comparison of the extracted results with prior studies showed significant inter-rater reliability between the two methods, which indicates that individual information cognitive processing is reliable as a categorization criterion on the Apple iTunes Store. Suggestions for updating the Apple iTunes Store are discussed in this paper, and the results may help app store hosts improve the current categorizations on app stores and increase the efficiency of app discovery and location for both app developers and users.
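
The TF-IDF weighting step mentioned in this abstract can be sketched as follows. The app descriptions, the function name, and the unsmoothed `log(N / df)` form of the inverse document frequency are assumptions made for illustration; the sketch covers only the TF-IDF step and leaves out the SVD step and everything else in the authors' pipeline.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_n=1):
    """Return the top_n highest-scoring TF-IDF terms for each document."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    result = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {t: c * math.log(n_docs / df[t]) for t, c in tf.items()}
        # Sort by descending score, then alphabetically to break ties.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result.append(ranked[:top_n])
    return result


# Hypothetical, heavily simplified app descriptions.
APPS = [
    "map map map route traffic",
    "music music music player traffic",
    "route player weather",
]
keywords = tfidf_keywords(APPS, top_n=1)
print(keywords)
```

Terms that occur in several descriptions, such as "traffic" or "route", receive a low weight, so the top keyword for each app is the term that is both frequent in that description and rare elsewhere.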

Analysis on Topics in Soundscape Research based on Topic Modeling (토픽 모델링을 이용한 사운드스케이프 연구 주제어 분석)

  • Choe, Sou-Hwan
    • The Journal of the Korea Contents Association / v.19 no.7 / pp.427-435 / 2019
  • Soundscapes provide important resources for understanding the social and cultural aspects of our society; however, research frameworks for recording, conserving, categorizing, and analyzing soundscapes are still in their infancy. Topic modeling is an automatic approach to discovering hidden themes dispersed across unstructured documents, and it is robust enough to find latent topics, such as research trends, behind a collection of documents. The purpose of this paper is to discover topics in current soundscape research using topic modeling and, furthermore, to discuss the possibilities of designing a metadata system for sound archives and improving the Soundscape Ontology, which is currently under development.