• Title, Summary, Keyword: keyword extraction (핵심어 추출)


Image Retrieval for Electronic illustrated Fish Book (전자어류도감을 위한 영상검색)

  • Ahn, Soo-Hong; Oh, Jeong-Su
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.4C / pp.226-231 / 2011
  • To improve on the conventional illustrated fish book, this paper introduces the concept of an electronic illustrated fish book that applies IT techniques to the conventional one, and proposes an image retrieval method for it. Image retrieval is the core technology of the electronic illustrated fish book and is what gives it a decisive advantage over the conventional book. Because fish of the same species can differ in shape, color, and texture, and even the same fish can present different features depending on its pose or the conditions under which the picture was taken, conventional image retrieval based on simple shape, color, and texture features is not suitable for the electronic illustrated fish book. The proposed image retrieval adopts detailed shape features extracted from the head, body, and tail of a fish, and different weights are assigned to the features according to their invariability. Simulation results show that the proposed algorithm is far superior to the conventional algorithm.
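A rough sketch of the weighted per-part matching idea described in the abstract follows; the shape descriptors, weights, and species names are illustrative assumptions, not the paper's actual features or algorithm.

```python
# Sketch: rank candidate fish by a weighted sum of per-part feature distances,
# giving larger weights to the parts assumed to be more pose-invariant.
import numpy as np

def weighted_distance(query, candidate, weights):
    """Weighted sum of per-part Euclidean distances."""
    return sum(
        w * np.linalg.norm(np.asarray(query[part]) - np.asarray(candidate[part]))
        for part, w in weights.items()
    )

# Hypothetical shape descriptors per body part
query = {"head": [0.31, 0.12], "body": [0.55, 0.40], "tail": [0.22, 0.08]}
candidates = {
    "mackerel": {"head": [0.30, 0.11], "body": [0.57, 0.42], "tail": [0.35, 0.20]},
    "flounder": {"head": [0.10, 0.45], "body": [0.80, 0.10], "tail": [0.05, 0.30]},
}

# Hypothetical weights favoring the more invariant parts
weights = {"head": 0.5, "body": 0.3, "tail": 0.2}

ranked = sorted(candidates, key=lambda name: weighted_distance(query, candidates[name], weights))
print("retrieval order:", ranked)
```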

The Development of Automatic Ontology Generation System Using Extended Search Keywords (검색 키워드 확장을 이용한 온톨로지 자동 생성 시스템 개발)

  • Shim, Joon; Lee, Hong-Chul
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.6 / pp.1220-1228 / 2009
  • Ontologies, the core of the Semantic Web, are usually limited to specific domains or created by defining meanings and relationships heuristically. However, creating an ontology is not only very difficult but also very time-consuming. In contrast to ontologies used in specific fields, an ontology for the Web must cover an unlimited scope of knowledge and information, so it is hard to express that information in the same way ontologies are built for specific fields. Therefore, the automatic generation of ontologies plays a very important role in the Semantic Web. In this paper, to build ontologies automatically, we suggest methods for creating and updating ontologies by expanding keywords related to the index terms that are extracted, through morpheme analysis, from the search keywords users enter into search engines.
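The keyword-expansion step described above can be illustrated roughly as follows; the tokenizer stands in for the paper's morpheme analysis and the related-term table stands in for its expansion sources, so everything here is an assumption rather than the authors' implementation.

```python
# Sketch: extract index terms from a user's search query, expand them with
# related terms, and record the links in a simple concept graph.
from collections import defaultdict

# Hypothetical related-term table (a stand-in for the paper's expansion sources)
RELATED = {
    "ontology": ["semantic web", "rdf"],
    "keyword": ["index term", "search term"],
}

def index_terms(search_query: str) -> list[str]:
    """Stand-in for morpheme analysis: lowercase whitespace tokenization."""
    return search_query.lower().split()

def update_ontology(ontology: dict[str, set[str]], search_query: str) -> None:
    """Link each index term to its expanded related terms."""
    for term in index_terms(search_query):
        for related in RELATED.get(term, []):
            ontology[term].add(related)

ontology: defaultdict[str, set[str]] = defaultdict(set)
update_ontology(ontology, "ontology keyword extraction")
print(dict(ontology))
```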

Resident Involvement Analysis of New Town Landscape Architecture Construction - Focused on the Gyeonggi GwangGyo District - (택지개발지구 조경공사의 주민관여 분석 - 경기도 광교지구를 중심으로 -)

  • Oh, Jeong-Hak
    • Journal of the Korean Institute of Landscape Architecture / v.44 no.6 / pp.51-59 / 2016
  • The purpose of this study is to improve interaction with the parties carrying out construction by analyzing the content of users' involvement in landscape architecture works. For this purpose, the study selected the public landscape project of the Gwanggyo Residential Land Development District in Suwon, Gyeonggi Province, and used the opinions of tenants collected over the four years before and after completion as research data. Both qualitative and quantitative analyses were conducted on 412 complaints received by the project implementation office and the local government. As a result, first, the main purpose of submitting opinions was 'demanding and expressing complaints', and 'parks' and 'rivers' were mentioned most often. In terms of content, 'quality' was pointed out most frequently, and works such as tree planting, ecological river construction, and pavement construction were also mentioned often. Second, among the key words extracted through content analysis, the most common items were followed by 'additional foodstuff' and 'moving the toilet and management building'. Complaints about dead trees remained conspicuous while residents waited for them to be dealt with at the time of transplanting. Third, the validity of the complaints was evaluated on a five-point scale; some of the opinions raised were unreasonable, but overall, complaints with a certain objectivity were more numerous.

A Study on Ontology and Topic Modeling-based Multi-dimensional Knowledge Map Services (온톨로지와 토픽모델링 기반 다차원 연계 지식맵 서비스 연구)

  • Jeong, Hanjo
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.79-92 / 2015
  • Knowledge maps are widely used to represent knowledge in many domains. This paper presents a method for integrating national R&D data and helping users navigate the integrated data through a knowledge map service. The knowledge map service is built using a lightweight ontology and a topic modeling method. The national R&D data are integrated around the research project, i.e., the other R&D data such as research papers, patents, and reports are connected to the research project as its outputs. The lightweight ontology is used to represent simple relationships between the integrated data, such as project-output, document-author, and document-topic relationships. The knowledge map enables us to infer further relationships such as co-author and co-topic relationships. To extract the relationships between the integrated data, a Relational Data-to-Triples transformer is implemented. Also, a topic modeling approach is introduced to extract the document-topic relationships. A triple store is used to manage and process the ontology data while preserving the network characteristics of the knowledge map service. Knowledge maps can be divided into two types: one is used in knowledge management to store, manage, and process an organization's data as knowledge; the other is used for analyzing and representing knowledge extracted from science & technology documents. This research focuses on the latter. In this research, a knowledge map service is introduced for integrating the national R&D data obtained from the National Digital Science Library (NDSL) and the National Science & Technology Information Service (NTIS), the two major repositories and services of national R&D data in Korea. A lightweight ontology is used to design and build the knowledge map. Using a lightweight ontology enables us to represent and process knowledge as a simple network, which fits the knowledge navigation and visualization characteristics of the knowledge map. The lightweight ontology represents the entities and their relationships in the knowledge maps, and an ontology repository is created to store and process the ontology. In the ontologies, researchers are implicitly connected by the national R&D data through author and performer relationships. A knowledge map for displaying researchers' networks is created, where the researchers' network is derived from the co-authoring relationships of the national R&D documents and the co-participation relationships of the national R&D projects. To sum up, a knowledge map service system based on topic modeling and ontology is introduced for processing knowledge about national R&D data such as research projects, papers, patents, project reports, and Global Trends Briefing (GTB) data. The goals of the system are 1) to integrate the national R&D data obtained from NDSL and NTIS, 2) to provide semantic and topic-based information search on the integrated data, and 3) to provide knowledge map services based on semantic analysis and knowledge processing. The S&T information such as research papers, research reports, patents, and GTB data is updated daily from NDSL, and the R&D project information, including participants and outputs, is updated from NTIS. The S&T information and the national R&D information are obtained and merged into the integrated database. The knowledge base is constructed by transforming the relational data into triples that reference the R&D ontology. In addition, a topic modeling method is employed to extract the relationships between the S&T documents and the topic keywords that represent them; this approach derives the relationships and keywords from the semantics of the documents rather than from simple keyword matching. Lastly, we present an experiment on constructing the integrated knowledge base using the lightweight ontology and topic modeling, and introduce the knowledge map services created on top of the knowledge base.
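A minimal sketch of the Relational Data-to-Triples idea described above, using the Python rdflib library; the namespace, schema, and sample rows are illustrative assumptions rather than the paper's actual R&D ontology.

```python
# Sketch: turn relational rows into RDF triples that express the
# project-output and document-author relationships, then serialize them.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

ND = Namespace("http://example.org/nrd/")  # hypothetical namespace

# Rows as they might come from a relational store:
# (project_id, output_id, output_type, author)
rows = [
    ("P001", "D100", "Paper",  "Kim"),
    ("P001", "D101", "Patent", "Lee"),
    ("P002", "D102", "Report", "Kim"),
]

g = Graph()
for project_id, output_id, output_type, author in rows:
    project = ND[project_id]
    output = ND[output_id]
    g.add((project, RDF.type, ND.Project))
    g.add((output, RDF.type, ND[output_type]))
    g.add((project, ND.hasOutput, output))          # project-output relationship
    g.add((output, ND.hasAuthor, Literal(author)))  # document-author relationship

print(g.serialize(format="turtle"))
```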

Analysis of Journal of Dental Hygiene Science Research Trends Using Keyword Network Analysis (키워드 네트워크 분석을 활용한 치위생과학회지 연구동향 분석)

  • Kang, Yong-Ju; Yoon, Sun-Joo; Moon, Kyung-Hui
    • Journal of dental hygiene science / v.18 no.6 / pp.380-388 / 2018
  • This research team extracted keywords from 953 papers published in the Journal of Dental Hygiene Science from 2001 to 2018 and performed keyword frequency and centrality analyses using the keyword network analysis method. Data were analyzed using Excel 2016 and NetMiner version 4.4.1. By analyzing the relationships between keywords over the whole period and by time frame, we arrived at the following conclusions. Over the 17 years considered in this study, the most frequently used words in the journal's papers were "Health," "Oral," "Hygiene," and "Hygienist." The words that connect the journal's major terms and form its center, identified by the top degree-centrality values, were "Health," "Dental," "Oral," "Hygiene," and "Hygienist." The top betweenness-centrality words were "Dental," "Health," "Oral," "Hygiene," and "Student." Analysis of degree centrality per period revealed "Health" (0.227), "Dental" (0.136), and "Hygiene" (0.136) for period 1; "Health" (0.242), "Dental" (0.177), and "Hygiene" (0.113) for period 2; "Health" (0.200), "Dental" (0.176), and "Oral" (0.082) for period 3; and "Dental" (0.235), "Health" (0.206), and "Oral" (0.147) for period 4. Analysis of betweenness centrality per period revealed "Oral" (0.281) and "Health" (0.199) for period 1; "Dental" (0.205) and "Health" (0.169) for period 2, with the weight then dispersing to "Hygiene" (0.112), "Hygienist" (0.054), and "Oral" (0.053); "Health" (0.258) and "Dental" (0.246) for period 3; and "Oral" (0.364), "Health" (0.353), and "Dental" (0.333) for period 4. Based on these results, we hope that further studies with diverse subjects will be conducted in the future.
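A minimal sketch of the degree- and betweenness-centrality computation the study performs on its keyword network, here with networkx and a hypothetical co-occurrence edge list (the study itself used NetMiner).

```python
# Sketch: build a keyword co-occurrence graph and compute the two centrality
# measures reported in the abstract.
import networkx as nx

# Keyword pairs that co-occur in the same paper (hypothetical sample)
cooccurrences = [
    ("Health", "Oral"), ("Health", "Dental"), ("Dental", "Hygiene"),
    ("Hygiene", "Hygienist"), ("Oral", "Hygiene"), ("Dental", "Student"),
]

G = nx.Graph()
G.add_edges_from(cooccurrences)

degree = nx.degree_centrality(G)             # normalized degree centrality
betweenness = nx.betweenness_centrality(G)   # normalized betweenness centrality

for word in sorted(degree, key=degree.get, reverse=True):
    print(f"{word}: degree={degree[word]:.3f}, betweenness={betweenness[word]:.3f}")
```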

XSLT Stylesheet Design for Building Web Presentation Layer (웹 프리젠테이션 레이어 생성을 위한 XSLT 스타일쉬트 설계)

  • 채정화; 유철중; 장옥배
    • Journal of KIISE: Software and Applications / v.31 no.3 / pp.255-266 / 2004
  • In Web-based information systems, separating the business process logic from the data and presentation logic brings a wide range of advantages. However, this separation is not easily achieved; even the data logic may not be separated from the presentation layer. It therefore requires defining a model for business processes and then mapping that model onto the user's dynamic interface using a logic-separation strategy. This paper presents a stylesheet method that recognizes the process by extending XSLT (Extensible Stylesheet Language Transformations) in order to achieve this logic separation. To do so, it provides a specification of the business process and a scheme that extracts business model factors and their interactions using a Petri-net notation, so that the business model can be viewed from a process point of view. This is an attempt to separate user interaction from the business process, that is, the dynamic components of interactive Web documents from the process structure of Web applications. Our architecture consists mainly of an XSLT controller extended by a process control component. The XSLT controller is responsible for receiving user requests and searching for the template rule relevant to each request. Separation of concerns facilitates the development of service-oriented Web sites by making them modular. As a result, service-oriented Web sites become easy to develop, and each module can be changed without affecting the others. Web applications can thus be developed and maintained in an independent manner.
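A minimal sketch of applying an XSLT template rule to XML data to produce a presentation-layer fragment, using Python's lxml; the sample document and stylesheet are illustrative assumptions and do not reproduce the paper's extended XSLT controller.

```python
# Sketch: compile a tiny stylesheet and apply its template rules to XML data,
# keeping the data separate from the generated presentation markup.
from lxml import etree

xml_doc = etree.XML(
    '<order><item name="book" qty="2"/><item name="pen" qty="5"/></order>'
)

xslt_doc = etree.XML("""
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/order">
    <ul>
      <xsl:for-each select="item">
        <li><xsl:value-of select="@name"/> x <xsl:value-of select="@qty"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
""")

transform = etree.XSLT(xslt_doc)    # compile the stylesheet
html_fragment = transform(xml_doc)  # apply the template rules to the data
print(str(html_fragment))
```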

A Study on the Intellectual Structure of Metadata Research by Using Co-word Analysis (동시출현단어 분석에 기반한 메타데이터 분야의 지적구조에 관한 연구)

  • Choi, Ye-Jin; Chung, Yeon-Kyoung
    • Journal of the Korean Society for information Management / v.33 no.3 / pp.63-83 / 2016
  • As the use of information resources produced in various media and forms has increased, metadata has become increasingly important as an information organization tool for describing those resources. The purposes of this study are to analyze and demonstrate the intellectual structure of the metadata field through co-word analysis. The data set was collected from journals registered in the Web of Science Core Collection citation database for the period from January 1, 1998 to July 8, 2016. Bibliographic data for 727 journal articles was collected using a Topic search with the query word 'metadata'. From these 727 articles, 410 articles with author keywords were selected, and after data preprocessing, 1,137 author keywords were extracted. Finally, a total of 37 keywords that appeared more than six times were selected for analysis. To demonstrate the intellectual structure of the metadata field, network analysis was conducted. As a result, 2 domains and 9 clusters were derived, the intellectual relations among keywords in the metadata field were visualized, and keywords with high global and local centrality were identified. Six clusters from the cluster analysis were shown on a multidimensional scaling map, and the knowledge structure was proposed based on the correlations among the keywords. The results of this study are expected to help readers understand the intellectual structure of the metadata field through visualization and to guide new approaches in metadata-related studies.
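A rough sketch of the co-word analysis pipeline described above: count author-keyword co-occurrences across articles and lay the terms out with multidimensional scaling. The keyword lists and the dissimilarity transform are illustrative assumptions, not the study's data or settings.

```python
# Sketch: build a symmetric co-occurrence matrix from author keywords and
# project the terms onto a 2-D MDS map (terms that co-occur more sit closer).
from collections import Counter
from itertools import combinations

import numpy as np
from sklearn.manifold import MDS

# Author keywords per article (hypothetical sample)
articles = [
    ["metadata", "dublin core", "interoperability"],
    ["metadata", "digital library", "dublin core"],
    ["linked data", "metadata", "interoperability"],
]

# Count keyword pairs that appear together in the same article
pair_counts = Counter()
for keywords in articles:
    for a, b in combinations(sorted(set(keywords)), 2):
        pair_counts[(a, b)] += 1

terms = sorted({k for keywords in articles for k in keywords})
index = {t: i for i, t in enumerate(terms)}

# Symmetric co-occurrence matrix
cooc = np.zeros((len(terms), len(terms)))
for (a, b), n in pair_counts.items():
    cooc[index[a], index[b]] = cooc[index[b], index[a]] = n

# Simple dissimilarity: higher co-occurrence -> smaller distance on the map
dissimilarity = cooc.max() - cooc
np.fill_diagonal(dissimilarity, 0)
coords = MDS(n_components=2, dissimilarity="precomputed", random_state=0).fit_transform(dissimilarity)
for term, (x, y) in zip(terms, coords):
    print(f"{term}: ({x:.2f}, {y:.2f})")
```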

Analysis of Research Trends in SIAM Journal on Applied Mathematics Using Topic Modeling (토픽모델링을 활용한 SIAM Journal on Applied Mathematics의 연구 동향 분석)

  • Kim, Sung-Yeun
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.7 / pp.607-615 / 2020
  • The purpose of this study was to analyze the research status and trends in industrial mathematics using text mining techniques, with a sample of 4,910 papers published in the SIAM Journal on Applied Mathematics from 1970 to 2019. The R program was used to collect titles, abstracts, and keywords from the papers and to apply topic modeling based on the LDA algorithm. Based on coherence scores for the collected papers, 20 topics were determined to be optimal, estimated with Gibbs sampling. The main results were as follows. First, studies on industrial mathematics were conducted in a variety of mathematical fields, including computational mathematics, geometry, mathematical modeling, topology, discrete mathematics, and probability and statistics, with a focus on analysis and algebra. Second, five hot topics (mathematical biology, nonlinear partial differential equations, discrete mathematics, statistics, topology) and one cold topic (probability theory) were found based on time-series regression analysis. Third, among the fields not reflected in the 2015 revised mathematics curriculum, numeral systems, matrices, vectors in space, and complex numbers were identified as content to be covered in the high school mathematics curriculum. Finally, this study suggested strategies to promote industrial mathematics in Korea, described the study's limitations, and proposed directions for future research.
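A minimal sketch of coherence-guided LDA topic modeling; the study used the R program with Gibbs sampling, whereas this illustration uses Python's gensim (variational inference) and a toy corpus purely to show how a topic count can be selected by coherence score.

```python
# Sketch: fit LDA models for several topic counts and keep the one with the
# best coherence score.
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel, LdaModel

docs = [
    ["nonlinear", "partial", "differential", "equation"],
    ["graph", "coloring", "discrete", "mathematics"],
    ["stochastic", "process", "probability", "statistics"],
    ["manifold", "topology", "homology"],
]

dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

best = None
for k in (2, 3, 4):
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=k, random_state=0, passes=10)
    # u_mass coherence only needs the corpus; higher (closer to zero) is better
    score = CoherenceModel(model=lda, corpus=corpus, dictionary=dictionary, coherence="u_mass").get_coherence()
    if best is None or score > best[0]:
        best = (score, k, lda)

print(f"best num_topics={best[1]} (coherence={best[0]:.3f})")
for topic_id, words in best[2].show_topics(num_topics=best[1], num_words=4):
    print(topic_id, words)
```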

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim; Kim, Ji Hui; Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.1-21 / 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been used for practical applications in various industries. In the field of business intelligence, they have been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for market information at the level of specific products. However, such information has generally been provided at the industry level or for broad categories based on classification standards, making it difficult to obtain specific and suitable information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than previously offered. We applied the Word2Vec algorithm, a neural network-based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows. First, the data related to product information is collected, refined, and restructured into a form suitable for applying the Word2Vec model. Next, the preprocessed data is embedded into a vector space by Word2Vec, and product groups are then derived by extracting similar product names based on cosine similarity. Finally, the sales data on the extracted products is summed to estimate the market size of each product group. As experimental data, product-name text from Statistics Korea's microdata (345,103 cases) was mapped into a multidimensional vector space by Word2Vec training. We optimized the training parameters and then used a vector dimension of 300 and a window size of 15 for further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. Product names similar to the KSIC index words were extracted based on cosine similarity, and the market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items; the Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied to market size estimation for the first time, overcoming the limitations of traditional methods that rely on sampling or on multiple assumptions. In addition, the level of the market category can be adjusted easily and efficiently according to the purpose of the information by changing the cosine similarity threshold. Furthermore, the approach has high potential for practical application, since it can address unmet needs for detailed market size information in the public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis reports published by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantics-based word embedding module could be improved by imposing a proper ordering on the preprocessed dataset or by combining Word2Vec with another algorithm such as Jaccard similarity. Also, other types of unsupervised machine learning algorithms could be used for product group clustering. Our group is currently working on subsequent studies, which we expect will further improve the performance of the basic model conceptually proposed in this study.
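A minimal sketch of the bottom-up estimation idea: embed product names with Word2Vec, group names by cosine similarity to an index term, and sum the associated sales. The product names, sales figures, seed term, and 0.7 similarity threshold are illustrative assumptions, not the study's data or tuned parameters (the paper used a vector dimension of 300 and a window size of 15 on 345,103 records).

```python
# Sketch: train Word2Vec on tokenized product names, then sum the sales of
# records whose names are similar enough to a chosen index term.
from gensim.models import Word2Vec

# Tokenized product names per company record (hypothetical sample)
product_names = [
    ["stainless", "water", "bottle"],
    ["insulated", "water", "bottle"],
    ["plastic", "water", "bottle"],
    ["laptop", "cooling", "pad"],
]
sales = [1.2, 0.8, 0.5, 2.0]  # hypothetical sales per record

model = Word2Vec(sentences=product_names, vector_size=50, window=3, min_count=1, seed=0)

seed_term = "bottle"  # stands in for a KSIC index word
threshold = 0.7       # hypothetical cosine similarity cut-off

# Sum the sales of records containing the seed term or a sufficiently similar term
market_size = 0.0
for tokens, amount in zip(product_names, sales):
    if any(t == seed_term or model.wv.similarity(t, seed_term) >= threshold for t in tokens):
        market_size += amount

print(f"estimated market size for '{seed_term}' group: {market_size:.1f}")
```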