• Title/Summary/Keyword: Text data

Design of Narrative Text Visualization Through Character-net (캐릭터 넷을 통한 내러티브 텍스트 시각화 디자인 연구)

  • Jeon, Hea-Jeong;Park, Seung-Bo;Lee, O-Joun;You, Eun-Soon
    • The Journal of the Korea Contents Association / v.15 no.2 / pp.86-100 / 2015
  • Advances driven by the Internet and the Smart Revolution have increased the amount and diversified the types of data generated by users. Big Data, the concept of assessing enormous amounts of data and deriving new value from them, is now at the center of attention. In particular, searching the content stored in Big Data requires efforts to analyze the narratives within video clips and to study how to visualize them. As part of these research efforts, this paper analyzes the dialogues exchanged among characters and presents an interface named "Character-net" developed for modelling narratives. Character-net can automatically extract characters by analyzing narrative videos and model the relationships between them. This points to a possible tool for visualizing a narrative with an approach different from those used in existing studies. However, drawbacks have been observed: its applications are limited, and it is difficult to grasp a narrative's features at a glance. The paper assumes that Character-net could be improved by introducing information design. Against this backdrop, the paper first briefly explains visualization design in the field of data information design and investigates research cases focused on visualizing the narratives present in videos. It then introduces the key ideas of Character-net and its technical differences from existing studies, followed by suggested methods for improving it with design-side solutions.
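
    The idea of modelling character relationships from dialogue can be sketched as a weighted directed graph. This is a minimal illustration only, not the Character-net implementation: the abstract does not describe the data model, so the (speaker, listener) pair format, the function name `build_character_net`, and the character names are all assumptions.

```python
from collections import Counter

def build_character_net(dialogues):
    """Count directed dialogue edges between characters.

    `dialogues` is an iterable of (speaker, listener) pairs; this input
    format is an assumption made for the sketch.
    """
    edges = Counter()
    for speaker, listener in dialogues:
        edges[(speaker, listener)] += 1
    return edges

# Hypothetical dialogue log (invented names, not data from the paper).
dialogues = [
    ("Anna", "Ben"),
    ("Ben", "Anna"),
    ("Anna", "Ben"),
    ("Anna", "Cara"),
]
net = build_character_net(dialogues)

# Lines spoken per character, a simple proxy for prominence.
spoken = Counter()
for (speaker, _listener), weight in net.items():
    spoken[speaker] += weight
```

    Edge weights here are raw dialogue counts; a visualization layer could map them to line thickness or node size.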

TF-IDF Based Association Rule Analysis System for Medical Data (의료 정보 추출을 위한 TF-IDF 기반의 연관규칙 분석 시스템)

  • Park, Hosik;Lee, Minsu;Hwang, Sungjin;Oh, Sangyoon
    • KIPS Transactions on Software and Data Engineering / v.5 no.3 / pp.145-154 / 2016
  • Recent interest in u-Health and advances in IT have increased the need to utilize medical information data. Among previous studies that apply data mining algorithms to medical information data, some use association rule analysis. These studies aim to discover associations between symptoms and specific diseases; however, most do not consider infrequent terms, which can carry important information for disease diagnosis. In this paper, we propose a new association rule mining system that weights each term by its TF-IDF score so that infrequent but important items are taken into account. In addition, the proposed system can predict candidate diagnoses from medical text records using term similarity analysis based on a medical ontology.
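
    To illustrate why TF-IDF favors infrequent terms, the sketch below computes a common textbook variant (relative term frequency times log inverse document frequency). The abstract does not state which variant the authors use, and the function name `tf_idf` and the toy symptom records are assumptions made for illustration.

```python
import math
from collections import Counter

def tf_idf(records):
    """Return one {term: weight} dict per record (a non-empty token list)."""
    n = len(records)
    # Document frequency: number of records containing each term.
    df = Counter(term for rec in records for term in set(rec))
    weights = []
    for rec in records:
        tf = Counter(rec)
        weights.append({
            term: (count / len(rec)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights

# Toy records (invented for illustration, not from the paper).
records = [
    ["fever", "cough", "fatigue"],
    ["fever", "cough", "headache"],
    ["fever", "rash", "fatigue"],
]
weights = tf_idf(records)
```

    In this toy data, "fever" appears in every record and receives weight zero, while "rash", which appears once, receives the largest weight in its record. A frequency-only support threshold would instead rank "fever" highest.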

Implementation of Character and Object Metadata Generation System for Media Archive Construction (미디어 아카이브 구축을 위한 등장인물, 사물 메타데이터 생성 시스템 구현)

  • Cho, Sungman;Lee, Seungju;Lee, Jaehyeon;Park, Gooman
    • Journal of Broadcast Engineering / v.24 no.6 / pp.1076-1084 / 2019
  • In this paper, we introduce a system that extracts metadata by recognizing characters and objects in media using deep learning. In broadcasting, multimedia content such as video, audio, images, and text has long been converted to digital form, but a vast amount of unconverted resources remains. Building media archives requires a great deal of manual work, which is time-consuming and costly. A deep learning-based metadata generation system can therefore save time and cost in constructing media archives. The whole system consists of four elements: a training data generation module, an object recognition module, a character recognition module, and an API server. The deep learning network module and the face recognition module recognize characters and objects in the media and describe them as metadata. The training data generation module was designed separately to facilitate building training data for the neural networks, and the face recognition and object recognition functions were exposed through an API server. We trained the two neural networks on data for 1500 persons and 80 kinds of objects and confirmed an accuracy of 98% on the character test data and 42% on the object data.

Big Data Analysis of Social Media on Gangwon-do Tourism (강원도 관광에 대한 소셜 미디어 빅데이터 분석)

  • JIN, TIANCHENG;Jeong, Eun-Hee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.14 no.3 / pp.193-200 / 2021
  • Recently, posts and opinions about tourist attractions have been actively shared on social media. Such social big data provide meaningful information for identifying the objective images of tourist destinations as perceived by consumers, so analyzing them enables an in-depth understanding of a destination's image. This study analyzes destination images of Gangwon-do using big data from social media. It seeks to understand these images through semantic network analysis and then offers suggestions on how to enhance the image and secure differentiated competitiveness as a tourist destination. According to the frequency analysis, Sokcho, Gangneung, and Yangyang were the most frequently mentioned destinations in Gangwon-do, in that order, and the purposes of travel were restaurant tours, gourmet food, family trips, vacations, and experiences. In particular, tourists were found to prefer day trips, weekends, and experiences. Four suggestions were made based on the results. First, it is necessary to develop various types of hotels, accommodation facilities, and experience-oriented tour packages. Second, it is necessary to develop day-trip travel packages that take advantage of proximity to the Seoul metropolitan area. Third, it is necessary to promote traditional restaurants and local food. Finally, it is necessary to develop tour packages suitable for healing and family travel. Through this research, the destination image of Gangwon-do was identified and a tourism marketing strategy was presented to improve competitiveness. The study also provides a theoretical basis for using the big data of tourism consumers in the tourism business.

A Study on Verification of Back TranScription(BTS)-based Data Construction (Back TranScription(BTS)기반 데이터 구축 검증 연구)

  • Park, Chanjun;Seo, Jaehyung;Lee, Seolhwa;Moon, Hyeonseok;Eo, Sugyeong;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.12 no.11 / pp.109-117 / 2021
  • Recently, speech-based interfaces have been increasingly used as a means of human-computer interaction (HCI). Accordingly, interest in post-processors that correct errors in speech recognition results is also increasing. However, building the data needed for a sequence-to-sequence (S2S) speech recognition post-processor requires a great deal of human labor. To alleviate this limitation of the existing construction methodology, a new data construction method called Back TranScription (BTS) was proposed. BTS combines text-to-speech (TTS) and speech-to-text (STT) technology to create a pseudo-parallel corpus. This methodology eliminates the role of a phonetic transcriptor and can automatically generate vast amounts of training data, reducing cost. Extending the existing BTS research, this paper verifies through experiments that data should be constructed with text style and domain in mind rather than without any criteria.

A Study on Image Recognition of local Currency Consumers Using Big Data (빅데이터를 활용한 지역화폐 소비자 이미지 인식에 관한 연구)

  • Kim, Myung-hee;Ryu, Ki-hwan
    • The Journal of the Convergence on Culture Technology / v.8 no.4 / pp.11-17 / 2022
  • Currently, the income and funds of local economies are flowing out to the metropolitan area, and talented people, the driving force of regional development, are also gathering there, so local economies face a serious crisis. Local currency is issued by local governments and serves auxiliary and complementary functions; it can be used only within the issuing area. As local governments have turned to local currency to revitalize the local economy, studies on its issuance and use are continuously being conducted. In this study, the consumer image of local currency issued by local governments was identified through big data analysis of sources such as portals and SNS. The purpose is to present, based on the research results, implications for the issuance and operation of local currency. The results of this study are as follows. First, the policy-driven issuance of local currency induces local consumption and thereby increases the economic income of the region. Second, local governments are working to revitalize the economy and establish a virtuous cycle in the local economy by issuing and distributing local currency. Third, the introduction of blockchain technology points to the stable operation of local currency. In terms of academic significance, the big data analysis made it possible to grasp how local currency and its effects have changed, as well as the policy direction of local currency.

A Study on Research Trends in Metaverse Platform Using Big Data Analysis (빅데이터 분석을 활용한 메타버스 플랫폼 연구 동향 분석)

  • Hong, Jin-Wook;Han, Jung-Wan
    • Journal of Digital Convergence / v.20 no.5 / pp.627-635 / 2022
  • As the non-face-to-face situation caused by COVID-19 continues, the underlying technologies of the 4th industrial revolution, such as IoT, AR, VR, and big data, are affecting the metaverse platform as a whole. Such changes in the external environment, including society and culture, can affect the development of academic research, so it is important to systematically organize existing achievements in preparation for change. Data containing the keyword 'metaverse platform' were collected from the Korea Educational Research Information Service (RISS) and analyzed with text mining, a big data analysis technique. The collected data were analyzed for word cloud frequency, connection strength between keywords, and semantic networks to examine trends in metaverse platform research. In the word cloud analysis, keywords appeared in the order of 'use', 'digital', 'technology', and 'education'. In the analysis of connection strength (N-gram) between keywords, 'Edue→Tech' showed the highest connection strength, and a total of three word chain clusters were derived. Detailed research areas were classified into five areas, including 'digital technology'. Considering the analysis results comprehensively, it seems necessary to discover and discuss more active research topics from the long-term perspective of developing a metaverse platform.
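
    The connection strength between keywords can be approximated by counting adjacent keyword pairs (bigrams, the N = 2 case of an N-gram). The sketch below is one plausible reading, not the authors' tool chain; the function name `bigram_counts` and the toy keyword lists are assumptions.

```python
from collections import Counter

def bigram_counts(docs):
    """Count ordered adjacent keyword pairs across tokenized documents."""
    counts = Counter()
    for tokens in docs:
        # zip pairs each token with its successor: (t0, t1), (t1, t2), ...
        counts.update(zip(tokens, tokens[1:]))
    return counts

# Toy keyword lists (invented for illustration, not the RISS data).
docs = [
    ["metaverse", "platform", "education"],
    ["metaverse", "platform", "technology"],
    ["digital", "technology", "education"],
]
counts = bigram_counts(docs)
top_pair, top_count = counts.most_common(1)[0]
```

    The pairs are directed, matching the arrow notation used for connection strength in the abstract; sorting each pair before counting would give undirected co-occurrence instead.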

A Case Study of Untact Lecture on Albert Camus' La Peste using Big Data (빅데이터를 활용한 『페스트』(알베르 카뮈) 비대면 문학 강의 운영 사례 연구)

  • MIN, Jinyoung
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.59-65 / 2021
  • This is a case study on using Albert Camus' La Peste, which has gained popularity in the post-COVID era, together with big data analysis tools in major and elective classes. First, we asked students majoring in French to use big data analysis to compare vocabulary use and the number of appearances of characters across about 400 pages of the original text. As a result, we were able to confirm a relationship between Camus' Absurdism and the vocabulary used in La Peste, and to note the high frequency of resistant characters. Students in elective classes were asked to read a Korean translation and determine the frequency of vocabulary and of characters' appearances. Students related strongly to La Peste because of the parallels between COVID and the plague in the novel. We also recorded high levels of class satisfaction regarding the use of big data analysis tools. The students responded positively both to the choice of La Peste as the work of literature and to the use of big data, a main tool of the Fourth Industrial Revolution. We were able to identify good results even in a non-contact environment, as long as the literature lectures do not rely on traditional methods but instead reflect current situations.

Water leakage accident analysis of water supply networks using big data analysis technique (R기반 빅데이터 분석기법을 활용한 상수도시스템 누수사고 분석)

  • Hong, Sung-Jin;Yoo, Do-Guen
    • Journal of Korea Water Resources Association / v.55 no.spc1 / pp.1261-1270 / 2022
  • The purpose of this study is to collect and analyze information on water leaks, which is otherwise not easily accessible, using news search results that anyone can easily access. We applied a web crawling technique to extract big data news on water leakage accidents in the water supply system and presented a step-by-step algorithm for obtaining accurate leak accident news. In addition, a data analysis technique suited to water leakage accident information was developed so that additional information, such as the date and time of occurrence, cause, location, damaged facilities, and damage effect, could be extracted. The primary goal of the proposed big data-based leak analysis is to extract meaningful value through comparison with existing waterworks statistics. The proposed method can also be used to respond effectively to consumers or to determine the service level of water supply networks. In other words, the analysis results suggest the need to inform the public more about accidents, and they can be used in preparing a dissemination and response system that reacts quickly when an accident occurs.

Study on Research Trends (2001~2020) of the Baekdudaegan Mountains with Big Data Analyses of Academic Journals (학술논문 빅데이터 분석을 활용한 백두대간에 관한 연구동향(2001~2020) 분석)

  • Lee, Jinkyu;Sim, Hyung Seok;Lee, Chang-Bae
    • Journal of Korean Society of Forest Science / v.111 no.1 / pp.36-49 / 2022
  • The purpose of this study was to analyze domestic research trends related to the Baekdudaegan Mountains over the last two decades. In total, 551 academic papers and keyword data related to the Baekdudaegan Mountains were collected using the "Research and Information Service Section" and analyzed using "big data" analysis programs, such as Textom and UCINET. Papers related to the Baekdudaegan Mountains were published in 177 academic journals, and 229 papers (41.6% of all published papers) were published between 2011 and 2015. According to word frequency data (N-gram analyses), the major research topic over the past 20 years was "species diversity." According to the CONCOR analysis, the main research could be divided into 15 areas, the most important of which was "species diversity," followed by "vegetation restoration and management" and "culture." Ecological research comprised 12 groups with a frequency of 78.8%; humanities and social research comprised 2 groups with a frequency of 15.6%. Overall, our study of research areas and quantitative data analyses provides valuable information that could help inform policy formulation.