• Title/Summary/Keyword: Text data

A Study on the Implementation of Korean History Contents Service based on Linked Open Data (LOD 기반 한국사 콘텐츠 서비스 구축에 관한 연구)

  • Yoon, So Young
    • Journal of the Korean Society for Information Management / v.30 no.3 / pp.297-315 / 2013
  • People who want easy access to Korean history have become interested in Korean history databases that provide accurate and reliable historical information. User demand for information sharing and reusability, made possible by the semantic web, has also increased and has taken the shape of linked data. Several organizations, portal sites, and individuals have tried to move away from the existing mainstream of expert-oriented text databases and build public databases whose contents users can readily understand and use. A problem with these databases is that they have not considered the sharing and use of information as a whole. This study proposes an LOD-based system for implementing Korean history contents, which provides a rich information environment through multi-dimensional connections among web data. In doing so, it attempts a historical information circulation service based on information sharing and linking.

Analysis of Big Data Visualization Technology Based on Patent Analysis (특허분석을 통한 빅 데이터의 시각화 기술 분석)

  • Rho, Seungmin;Choi, YongSoo
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.7 / pp.149-154 / 2014
  • Advances in modern data computing have greatly improved graphics capabilities and opened many new possibilities for data display. Visualization has proven effective not only for presenting essential information in vast amounts of data but also for driving complex analyses. Big-data analytics and discovery present new research opportunities to the computer graphics and visualization community. In this paper, we discuss a patent analysis of big data visualization technology development in major countries. Specifically, we analyzed 160 patent applications and registered patents in four countries in November 2012. According to the analysis, text clustering analysis and 2D visualization are important and need urgent development. In particular, given the increasing domestic use of smart devices and social networks, the development of three-dimensional visualization for big data appears very urgent.

Investigation of Research Trends in the D(Data)·N(Network)·A(A.I) Field Using the Dynamic Topic Model (다이나믹 토픽 모델을 활용한 D(Data)·N(Network)·A(A.I) 중심의 연구동향 분석)

  • Wo, Chang Woo;Lee, Jong Yun
    • Journal of the Korea Convergence Society / v.11 no.9 / pp.21-29 / 2020
  • Topic modeling, a methodology for deriving keywords from literature, has become an active research area with the explosion of data brought by the transition to a digital society. The objective of this research is to investigate research trends in the D.N.A. (Data, Network, Artificial Intelligence) field using the DTM (Dynamic Topic Model). The DTM was applied to 1,519 research projects with SW·A.I technology classifications among ICT (Information and Communication Technology) projects over six years (2015-2020). As a result, the D.N.A. technology keywords (Big data, Cloud, Artificial Intelligence) and the extended keywords (Unstructured, Edge Computing, Learning, Recognition) appeared every year, from which it can be inferred that these technologies are also being researched broadly in other projects. Finally, the results of this paper are expected to be useful for future policy and R&D planning and for corporate technology and marketing strategy.

XSLT document editing for XML document conversion (XML 문서 변환을 위한 XSLT 문서편집 시스템)

  • 송종철;최일선;정회경
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.4 / pp.798-803 / 2004
  • XML (Extensible Markup Language) from the W3C (World Wide Web Consortium), a core standard technology for data exchange on the Internet today, is a platform-independent data format. In particular, it allows fast processing because it integrates the different data formats exchanged between the applications and systems that enterprises built in the past. However, because XML documents contain only logical structure information, the W3C recommends XSLT (Extensible Stylesheet Language Transformations), a document transformation standard, for describing presentation information for XML documents. XSLT is designed for XML, which was developed for data exchange on the Internet, and it has been proposed as a way to process XML documents and convert them to other data formats for presentation to users. This paper designs and implements an XSLT document editing system that applies XSLT to XML to transform it into another data format such as HTML. The system can edit, in a WYSIWYG environment, XSLT documents that describe the presentation information of XML documents.

Analysis of drama viewership related words through unstructured data collection (비정형데이터 수집을 통한 드라마 시청률 연관어 분석)

  • Kang, Sun-Kyoung;Lee, Hyun-Chang;Shin, Seong-Yoon
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.8 / pp.1567-1574 / 2017
  • In this paper, we analyzed structured and unstructured data in order to analyze drama ratings. The structured data comprised 19 items collected from four areas for each broadcasting company: drama information, person information, broadcasting information, and audience rating information. The unstructured data were collected, using a crawling technique, from the bulletin boards, pre-broadcast blogs, and post-broadcast blogs operated by each broadcasting company. A comparison of the structured data across the four areas showed similar results for each broadcaster. From the unstructured data collected from the bulletin boards and blogs of each broadcasting company, we derived seven related words by analyzing the correlation of occurrence frequencies. The derived related words were checked through reliability analysis.

The Analysis of Chosun Dynasty Poetry Using 3D Data Visualization (3D 시각화를 이용한 조선시대 시문 분석)

  • Min, Kyoung-Ju;Lee, Byoung-Chan
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.7 / pp.861-868 / 2021
  • With the development of big-data visualization technology, tasks such as intuitively analyzing large amounts of data, detecting errors, and deriving meaning are actively advancing. In this paper, we describe the design and implementation of a 3D analysis system that collects the Chinese-character writings provided by the Korean Classical Database of the Korean Classics Translation Institute, stores and processes the data, and visualizes the writing information as a 3D network diagram. The system addresses the problems that arise when a large amount of data is expressed in 2D, and it supports intuitive analysis, error detection, extraction of meaningful information such as characteristics, similarities, and differences, and user convenience. In doing so, we improved on the problems found in previous studies that analyzed Chosun dynasty poetry in Chinese characters using 2D visualization.

Research trends in statistics for domestic and international journal using paper abstract data (초록데이터를 활용한 국내외 통계학 분야 연구동향)

  • Yang, Jong-Hoon;Kwak, Il-Youp
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.267-278 / 2021
  • Over time, the amount of data has been increasing in government and business, at home and abroad, and research on big data is accordingly increasing in academia. Statistics is one of the major disciplines of big data research, and it is of interest to understand research trends in statistics by applying big data methods to the growing number of statistics papers. In this study, we used the abstracts of statistics papers published in Korea and abroad to analyze what research is being conducted. Domestic and international research trends were analyzed through the frequency of the papers' keywords, and the relationships between keywords were visualized using word embedding. In addition to the keywords selected by the authors, important words in statistical papers, selected with TextRank, were also visualized. Lastly, 10 topics were investigated by applying LDA to the abstract data. Through the analysis of each topic, we investigated which research topics are frequently studied and which words are important.
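    The keyword-frequency step mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `keyword_frequencies` and the sample keyword lists are hypothetical, and the sketch assumes each paper's keywords are available as a list of strings and that each paper should count at most once per keyword.

```python
from collections import Counter


def keyword_frequencies(papers):
    """Count how many papers list each keyword (case-insensitive)."""
    counts = Counter()
    for keywords in papers:
        # Normalize case and de-duplicate within a paper,
        # so a keyword repeated in one paper is counted once.
        counts.update({kw.strip().lower() for kw in keywords if kw.strip()})
    return counts


# Hypothetical keyword lists, one per paper (not data from the study).
sample_papers = [
    ["Bayesian inference", "MCMC", "big data"],
    ["Big data", "deep learning"],
    ["MCMC", "bayesian inference", "Bayesian Inference"],
]

freq = keyword_frequencies(sample_papers)
# Rank by descending frequency, breaking ties alphabetically.
top = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
```

    Running the same count separately on domestic and international papers would give two rankings that can be compared side by side; the word embedding, TextRank, and LDA steps are outside the scope of this sketch.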

Empirical Study on Analyzing Training Data for CNN-based Product Classification Deep Learning Model (CNN기반 상품분류 딥러닝모델을 위한 학습데이터 영향 실증 분석)

  • Lee, Nakyong;Kim, Jooyeon;Shim, Junho
    • The Journal of Society for e-Business Studies / v.26 no.1 / pp.107-126 / 2021
  • In e-commerce, rapid and accurate automatic product classification according to product information is important. Recent developments in deep learning technology have been actively applied to automatic product classification. In order to develop a deep learning model with good performance, the quality of the training data and data preprocessing suited to the model are crucial. In this study, for the case where categories are inferred from text product data using a deep learning model, the effects of both data preprocessing and training data selection are extensively compared and analyzed. We employ our CNN model as an example of a deep learning model. In the experimental analysis, we use real e-commerce data to support the verification of the study results. The empirical analysis and results shown in this study may be meaningful as a reference for improving performance when developing a deep learning product classification model.

A study on the systematic operation of the innovative patent strategy framework and the application plan of patent big data to secure competitive advantage (혁신특허전략 프레임워크의 체계적 운영 및 경쟁우위확보를 위한 특허빅테이터 활용방안에 관한 연구)

  • Kim, Hyun Ah;Cha, Wan Kyu
    • The Journal of the Convergence on Culture Technology / v.7 no.2 / pp.351-357 / 2021
  • As interest in the use of big data rises amid the technological paradigm shift of the Fourth Industrial Revolution, interest in the use of patent big data is also increasing, especially as the proportion of companies' intangible assets grows. In addition to quantitative information, patent data contains various kinds of information, such as unstructured text (title, abstract, claims), citations and citation relations, drawings, and technology classifications, so how it is used is judged to be important. Therefore, in order to systematically operate the innovative patent strategy framework and to secure a competitive advantage by strengthening the company's fundamental technological competitiveness, this study proposes a method of using patent big data centered on the case of Company A, verifies its validity, and suggests some implications. Through this, we intend to raise awareness of the use of patent big data and to suggest ways to use it in connection with a company's company-wide, business, and functional strategies.

The Method for Real-time Complex Event Detection of Unstructured Big data (비정형 빅데이터의 실시간 복합 이벤트 탐지를 위한 기법)

  • Lee, Jun Heui;Baek, Sung Ha;Lee, Soon Jo;Bae, Hae Young
    • Spatial Information Research / v.20 no.5 / pp.99-109 / 2012
  • Recently, with the growth of social media and the spread of smartphones, the amount of data has increased considerably through heavy use of SNS (Social Network Services). Accordingly, the concept of big data has emerged, and many researchers are seeking ways to make the best use of it. To maximize the creative value of the big data held by many companies, it must be combined with existing data. Because the physical and logical storage structures of the data sources differ greatly, a system that can integrate and manage them is needed. MapReduce was developed to process big data and has the advantage of fast, distributed processing. However, it is difficult to build and store a system for every keyword, and the storage and search steps make real-time processing difficult to some extent. Processing complex events without a structure for handling heterogeneous data also incurs extra cost. An existing complex event processing system can be used to solve this problem: it receives data from different sources and combines them, which enables complex event processing that is especially useful for real-time processing of stream data. Nevertheless, unstructured data based on the text of SNS posts and Internet articles is managed as text, so strings must be compared every time a query is processed, which results in poor performance. Therefore, we aim to manage unstructured data and process queries quickly in a complex event processing system. We extend the data combination function to give strings a logical schema, which is done by converting string keywords into integers through filtering with a keyword set. In addition, by using the complex event processing system and processing stream data in memory in real time, we try to reduce the time spent reading data back for query processing after it has been stored on disk.
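    The idea of converting string keywords into integers by filtering against a keyword set can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the system described in the paper: the function names (`build_keyword_ids`, `encode_event`, `matches`), the whitespace tokenization, and the sample stream are all hypothetical.

```python
def build_keyword_ids(keyword_set):
    """Assign a stable integer ID to each keyword of interest."""
    return {kw: i for i, kw in enumerate(sorted(keyword_set))}


def encode_event(text, keyword_ids):
    """Filter a text event against the keyword set and
    return the set of integer IDs of the keywords it contains."""
    tokens = text.lower().split()  # assumption: simple whitespace tokenization
    return {keyword_ids[t] for t in tokens if t in keyword_ids}


def matches(encoded_event, query_ids):
    """A query matches when all of its keyword IDs occur in the event.
    Only integers are compared here, never the original strings."""
    return query_ids <= encoded_event


# Hypothetical keyword set and text stream (not data from the paper).
keyword_ids = build_keyword_ids({"earthquake", "fire", "seoul"})
stream = [
    "Fire reported near Seoul station",
    "Heavy traffic in Busan",
    "Earthquake drill in Seoul today",
]

# Encode each event once on arrival, keeping the result in memory.
encoded_stream = [encode_event(text, keyword_ids) for text in stream]

# A complex-event query: "fire" and "seoul" in the same event.
query = {keyword_ids["fire"], keyword_ids["seoul"]}
hits = [i for i, ev in enumerate(encoded_stream) if matches(ev, query)]
```

    In this sketch the string comparison happens once per event at encoding time, so repeated queries over the same in-memory stream only test membership of small integers.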