• Title/Summary/Keyword: text classification (텍스트분류)

Search Result 684, Processing Time 0.02 seconds

A Study on the Development of Intelligent Contents and Interactive Storytelling System (지능형콘텐츠 개발과 인터렉티브 스토리텔링 시스템 연구)

  • Lee, Eun Ryoung;Kim, Kio Chung
    • Journal of Digital Convergence / v.11 no.1 / pp.423-430 / 2013
  • The development of information technology introduced digital content and social network services (SNS), enabling virtual transactions and communication between users in which "experience knowledge" has advanced beyond "objective knowledge." This paper analyzes an interactive storytelling system that creates different types of stories in narrative genres such as family history and personal history. Through analysis of narrative interviews, direct observation, documentation, and visual records, content on CEO stories, corporate stories, family stories, and especially family history is categorized into a sampleDB and an informationDB. The accumulated content allows users to increase its value and usage by restructuring family-history content through the interactive storytelling system. This research developed a writing-tool data model that uses digital content such as text, images, and pictures to encourage open communication between the first and third generations in Korea. It also examined a connected system for an interactive storytelling creation device that draws on a database of family stories in various genres.

Implementation of TTS Engine for Natural Voice (자연음 TTS(Text-To-Speech) 엔진 구현)

  • Cho Jung-Ho;Kim Tae-Eun;Lim Jae-Hwan
    • Journal of Digital Contents Society / v.4 no.2 / pp.233-242 / 2003
  • A TTS (Text-To-Speech) system is a computer-based system that should be able to read any text aloud. Producing a natural voice requires general knowledge of the language and a great deal of time and effort. Furthermore, the sound pattern of English is variable, and handling it involves phonemic and morphological analysis, so it is very difficult to keep the patterns consistent. To handle these problems, we present a system based on phonemic analysis of vowels and consonants. By analyzing phonological variations frequently found in spoken English, we derived the phonemic contexts that trigger the multilevel application of the corresponding phonological processes, which consist of phonemic and allophonic rules. As a result, we obtained rule data organized by phoneme and an engine that is economical in its use of the system. The proposed system can be used not only in communication systems but also in office automation and similar applications.

A Study on Science Technology Trend and Prediction Using Topic Modeling (토픽모델링을 활용한 과학기술동향 및 예측에 관한 연구)

  • Park, Ju Seop;Hong, Soon-Goo;Kim, Jong-Weon
    • Journal of Korea Society of Industrial Information Systems / v.22 no.4 / pp.19-28 / 2017
  • Companies and governments have mainly used the Delphi technique to understand research or technology trends. Because this technique consumes a large amount of time and money, this study attempted to understand and predict science and technology trends using the topic modeling technique Latent Dirichlet Allocation (LDA). To this end, 20 specific artificial intelligence (AI) technologies were extracted from the abstracts of US patent documents on AI. Among the extracted technologies, core technologies were identified and then divided into hot and cold technologies through a trend analysis of their annual proportions. Text/word searching, computer management, programming syntax, network administration, multimedia, and wireless network technology were identified as hot technologies. These are key technologies that have been actively studied in the field of AI in recent years. The methodology suggested in this study may be used to analyze trends, derive policies, or predict technical demand in various fields such as social issues, regional innovation, and management.

Development of Management System for Feature Change Information using Bid Information (입찰정보를 이용한 지형지물변화정보 관리시스템 개발)

  • Heo, Min;Lee, Yong-Wook;Bae, Kyoung-Ho;Ryu, Keun-Hong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.27 no.2 / pp.195-202 / 2009
  • As the generation and application of spatial information gradually expands beyond traditional surveying fields to CNS and ITS, the accuracy and currency of the data have become important. However, digital maps are updated with a tile-based system, so it is hard to obtain the latest data and to satisfy user requirements. In this study, a management system is developed to manage feature changes efficiently using bid information from NaraJangter, a service that provides bid information. Construction works that may change features are classified from the bid information and stored in a DB, which is then used as feature change forecast information. In addition, bid information in text form is converted into positioning information linked to spatial information data. If implemented successfully, this system will help reduce the cost of updating digital maps and help keep spatial information up to date.

Collection and Extraction Algorithm of Field-Associated Terms (분야연상어의 수집과 추출 알고리즘)

  • Lee, Sang-Kon;Lee, Wan-Kwon
    • The KIPS Transactions:PartB / v.10B no.3 / pp.347-358 / 2003
  • A field-associated term is a single or compound word whose terms occur in any document and which makes it possible to recognize the field of a text using common human knowledge. For example, a person recognizes the corresponding field of a document upon encountering a word such as 'pitcher' or 'election'. We propose an efficient method for constructing field-associated terms (FTs) for specialized fields in order to determine the field of a text. The document classification scheme can be fixed from a well-classified document database or corpus. Focusing on a given field, we discuss the levels and stability ranks of field-associated terms. To construct a balanced FT collection, we first construct single FTs. From these collections, FT levels and stability ranks can be constructed automatically. We propose a new algorithm for extracting FTs for document classification that uses each FT's concentration rate and occurrence frequency.

Image Analysis and Management Strategy for The National Science Museum Utilizing SNS Big Data Analysis (SNS 빅데이터 분석을 활용한 국립과학관에 대한 이미지 분석과 경영전략 제안)

  • Shin, Seongyeon
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.1 / pp.81-89 / 2020
  • The purpose of this study is to investigate science consumers' perceptions of the National Science Museum and to suggest effective management strategies for the museum. Research questions were established and analyses were conducted to achieve the research goals. Data were collected and analyzed through a new approach to image analysis that combines qualitative and quantitative methods. First, the image of the concept of science was derived from science consumers (adults, undergraduate and graduate students) through a qualitative research method (group interviews), and then text analysis was conducted. Second, quantitative research was conducted through LDA (Latent Dirichlet Allocation)-based topic modeling of 63,987 words extracted from 12,920 titles of blog postings on one of the most heavily trafficked portal sites in Korea. The results indicate that the perception of science differs according to the characteristics of the respondents. Further, topic modeling extracted 20 topics from the blog posting titles, and the topics were condensed into seven factors. Detailed discussions and managerial implications are provided in the conclusion section.

Building Database using Character Recognition Technology (문자 인식 기술을 이용한 데이터베이스 구축)

  • Han, Seon-Hwa;Lee, Chung-Sik;Lee, Jun-Ho;Kim, Jin-Hyeong
    • The Transactions of the Korea Information Processing Society / v.6 no.7 / pp.1713-1723 / 1999
  • Optical character recognition (OCR) may be the most plausible method for building a database from printed matter. This paper describes the points to consider when selecting an OCR system for building a database. Based on these considerations, we evaluated four commercial OCR systems and chose the one with the best recognition rate to build an OCR-text database. The subject text, the KT-test collection, is a set of abstracts from proceedings with varying print quality, fonts, and formats. The KT-test collection is also provided with a typed-text database. The recognition rate was calculated by comparing the recognition result with the typed text. No preprocessing such as learning or slant correction was applied to the recognition process, in order to simulate a practical environment. The result shows a character recognition rate of 90.5% over 970 abstracts. This recognition rate is still insufficient for practical use. The errors in OCR texts differ from those in manually typed texts. In this paper, we classify the errors in OCR texts for further research.
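
The abstract above says the recognition rate was obtained by comparing OCR output with typed text, but it does not give the formula. The following is a minimal sketch under one common assumption: the rate is one minus the character-level edit distance divided by the length of the typed reference. The function names and sample strings are hypothetical and are not taken from the paper.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def recognition_rate(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - edit_distance / len(reference), floored at 0."""
    if not reference:
        raise ValueError("reference text must not be empty")
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical typed text and OCR output with one misread character.
typed_text = "abcdefghij"
ocr_text = "abcdefghiX"
rate = recognition_rate(typed_text, ocr_text)
```

Other definitions, such as counting only aligned matching characters, would give slightly different figures, so a number produced this way is not directly comparable to the 90.5% reported in the abstract.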

Rating Prediction by Evaluation Item through Sentiment Analysis of Restaurant Review

  • So, Jin-Soo;Shin, Pan-Seop
    • Journal of the Korea Society of Computer and Information / v.25 no.6 / pp.81-89 / 2020
  • Online reviews commonly encountered on SNS contain a complex range of assessment information that affects consumer preferences, yet this information is generally presented only as simple numbers or star ratings. With such reviews, it is not easy for consumers to obtain the specific information they want and use it to make a purchase decision. Therefore, in this study, we propose a prediction methodology that provides ratings broken down by evaluation item by performing sentiment analysis on restaurant reviews written in Korean. To this end, we select 'food', 'price', 'service', and 'atmosphere' as the main evaluation items for restaurants and build a new sentiment dictionary for each item. The methodology also classifies review sentences by evaluation item, predicts granular ratings through sentiment analysis, and provides additional information that consumers can use to make decisions. Finally, using MAE and RMSE as evaluation metrics, we show that the rating prediction accuracy of the proposed methodology improves on that of previous studies, and we present a use case of the proposed methodology.
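
For readers unfamiliar with the two metrics named in the abstract above, the following sketch computes MAE (mean absolute error) and RMSE (root mean squared error) using their standard definitions. The rating values are made up for illustration and are not results from the paper.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mean_sq)


# Hypothetical star ratings for one evaluation item (for example 'food').
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.0, 4.0, 3.0]

print(mae(actual_ratings, predicted_ratings))   # 0.625
print(rmse(actual_ratings, predicted_ratings))  # 0.75
```

RMSE penalizes large individual errors more heavily than MAE, which is one reason the two are often reported together.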

A Study on Domestic Research Trends (2001-2020) of Forest Ecology Using Text Mining (텍스트마이닝을 활용한 국내 산림생태 분야 연구동향(2001-2020) 분석)

  • Lee, Jinkyu;Lee, Chang-Bae
    • Journal of Korean Society of Forest Science / v.110 no.3 / pp.308-321 / 2021
  • The purpose of this study was to analyze domestic research trends over the past 20 years and the future direction of forest ecology using text mining. A total of 1,015 academic papers and keyword data related to forest ecology were collected by the "Research and Information Service Section" and analyzed using big data analysis programs such as Textom and UCINET. From the results of word frequency and N-gram analyses, we found that domestic studies on forest ecology have increased rapidly since 2011. The most common research topic over the past 20 years was "species diversity," and "climate change" has been a major topic since 2011. Based on CONCOR analysis, study subjects were grouped into eight categories: "species diversity," "environmental policy," "climate change," "management," "plant taxonomy," "habitat suitability index," "vascular plants," and "recreation and welfare." Consequently, species diversity and climate change will remain important topics in the future, and diversifying and expanding domestic research topics in line with global research trends is necessary.

Active Senior Contents Trend Analysis using LDA Topic Modeling (LDA 토픽 모델링을 이용한 액티브 시니어 콘텐츠 트렌드 분석)

  • Lee, Dongwoo;Kim, Yoosin;Shin, Eunjung
    • Journal of Internet Computing and Services / v.22 no.5 / pp.35-45 / 2021
  • The purpose of this study is to understand the characteristics and trends of active seniors. As the baby boom generation reaches old age, its members are more active than seniors traditionally have been. These seniors are called active seniors and form a new consumer group. Many countries and companies are interested in providing relevant policies and services, but there is a lack of research on active senior trends. This study collected 8,740 social media posts related to active seniors from January 1st, 2018 to June 31st, 2021, and conducted keyword frequency analysis, TF-IDF analysis, and LDA topic modeling. Through LDA topic modeling, topics were classified into 10 categories: lifestyle, benefits, shopping, government business, government education, health, society and economy, care industry, silver housing, and leisure. The results of this study can be used as fundamental data to help understand the academic and industrial aspects of active seniors.
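
The abstract above names TF-IDF analysis without stating which weighting variant was used. As a rough illustration only, the sketch below uses one textbook form, term frequency divided by document length times log(N / document frequency). The `tf_idf` function and the tokenized posts are hypothetical and do not come from the study.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per tokenized document.

    Assumed weighting: (count / document length) * log(N / document frequency).
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical tokenized posts, not data from the study.
posts = [
    ["senior", "health", "leisure"],
    ["senior", "shopping"],
    ["senior", "health"],
]
scores = tf_idf(posts)
```

In this toy input, "senior" appears in every post and so scores zero everywhere, while "shopping" appears in only one post and receives the highest weight there. Library implementations often add smoothing terms, so their scores will differ from this unsmoothed form.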