• Title/Summary/Keyword: Text data

Research of Proprioceptive-Vestibular Sensory Integration on Using Big Data Analysis

  • Hye-Sun Lee
    • International Journal of Advanced Culture Technology
    • /
    • v.12 no.2
    • /
    • pp.448-454
    • /
    • 2024
  • This study draws academic implications from trends in domestic research on sensory integration intervention therapy based on the vestibular-proprioceptive system. For the analysis, 53 papers were collected and examined using text mining in R and social network analysis. The study is significant in that it applies big data methods to identify research trends in vestibular-proprioceptive sensory integration intervention, presents the results through visualization, and thereby provides basic rehabilitation data for such intervention.

KOREAN TOPIC MODELING USING MATRIX DECOMPOSITION

  • June-Ho Lee;Hyun-Min Kim
    • East Asian mathematical journal
    • /
    • v.40 no.3
    • /
    • pp.307-318
    • /
    • 2024
  • This paper explores the application of matrix factorization, specifically CUR decomposition, in the clustering of Korean language documents by topic. It addresses the unique challenges of Natural Language Processing (NLP) in dealing with the Korean language's distinctive features, such as agglutinative words and morphological ambiguity. The study compares the effectiveness of Latent Semantic Analysis (LSA) using CUR decomposition with the classical Singular Value Decomposition (SVD) method in the context of Korean text. Experiments are conducted using Korean Wikipedia documents and newspaper data, providing insight into the accuracy and efficiency of these techniques. The findings demonstrate the potential of CUR decomposition to improve the accuracy of document clustering in Korean, offering a valuable approach to text mining and information retrieval in agglutinative languages.
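
As a rough illustration of the classical SVD baseline mentioned in this abstract (not the authors' code, and not CUR decomposition itself), the sketch below builds a small term-document count matrix and projects documents into a rank-k latent space with NumPy. The toy documents, the whitespace tokenizer, the helper names, and the choice k = 2 are all assumptions made for illustration; real Korean text would first need morphological analysis.

```python
import numpy as np

# Toy, pre-tokenized documents (hypothetical; real Korean text needs a morphological analyzer).
docs = [
    "matrix decomposition svd matrix",
    "svd decomposition rank matrix",
    "soccer league goal match",
    "goal match soccer player",
]

def term_document_matrix(texts):
    """Return (sorted vocabulary, term-by-document count matrix)."""
    vocab = sorted({w for t in texts for w in t.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(texts)))
    for j, t in enumerate(texts):
        for w in t.split():
            A[index[w], j] += 1.0
    return vocab, A

def lsa_document_vectors(A, k):
    """Truncated SVD: keep the k largest singular triplets, one row per document."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (np.diag(s[:k]) @ Vt[:k, :]).T

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

vocab, A = term_document_matrix(docs)
doc_vecs = lsa_document_vectors(A, k=2)
same_topic = cosine(doc_vecs[0], doc_vecs[1])   # documents sharing vocabulary
cross_topic = cosine(doc_vecs[0], doc_vecs[2])  # documents with disjoint vocabulary
```

In this toy setting the two topics use disjoint vocabularies, so documents from the same topic end up close in the latent space and documents from different topics end up nearly orthogonal. A CUR-based variant would replace the SVD factors with selected actual columns and rows of the matrix.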

A Study on the Value Evaluation of the Unstructured Data within Enterprise (기업내 비정형 데이터의 가치 평가 모델에 관한 연구)

  • Jang, Man-Chul;Kim, Jeong-Su;Kim, Jong-Hee;Kim, Jong-Bae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2014.05a
    • /
    • pp.367-369
    • /
    • 2014
  • Digital data consist mostly of unstructured data such as text files, office files, image files, video files, and drawing files. The quantity of digital data generated and used within enterprises is increasing sharply. These digital data are becoming significant as digital assets, but their value is not properly evaluated. Accordingly, this study presents a model for evaluating the value of unstructured data as digital assets within an enterprise, together with a differentiated management plan for unstructured data as assets.

Personal Happiness Analysis using Big Data Based Text Design Monitoring System Architecture Design (빅데이터 기반의 텍스트를 활용한 개인 행복도 분석 모니터링 시스템 아키텍쳐 설계)

  • Sim, Jong-seong;Kim, Hee-chul
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2019.05a
    • /
    • pp.504-506
    • /
    • 2019
  • Text and diary data are uploaded to many SNSs around the world, but their use does not go beyond sharing and recording. In general, social big data is used to identify tastes and interests. However, there is a need for a system that analyzes and displays users' own status and information. Therefore, this paper deals with the design of a happiness diary system that can record a user's SNS and diary data, store them in a big data system, and express the user's happiness from those data using emotional analysis.

Development of an Integrated DataBase System of Marine Geological and Geophysical Data Around the Korean Peninsula (한반도 해역 해양지질 및 지구물리 자료 통합 DB시스템 개발)

  • KIM, Sung-Dae;BAEK, Sang-Ho;CHOI, Sang-Hwa;PARK, Hyuk-Min
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.19 no.2
    • /
    • pp.47-62
    • /
    • 2016
  • An integrated database (DB) system was developed to manage the marine geological and geophysical data acquired around the Korean peninsula from 2009 to 2013. Geological data such as size analysis data, columnar section images, X-ray images, heavy metal data, and organic carbon data of sediment samples were collected in the form of text files, Excel files, PDF files, and image files. Geophysical data such as seismic data, magnetic data, and gravity data were gathered in the form of SEG-Y binary files, image files, and text files. We collected scientific data from research projects funded by the Ministry of Oceans and Fisheries, data produced by domestic marine organizations, and public data provided by foreign organizations. All the collected data were validated manually and stored in the archive DB according to data processing procedures. A geographic information system (GIS) was developed to manage the spatial information and provide data effectively through a map interface. GIS software was used to import the position data from text files, manipulate spatial data, and produce shape files. A GIS DB was set up using the Oracle database system and the ArcGIS spatial data engine. A client/server GIS application was developed to support data search, data provision, and visualization of scientific data. It provided complex search functions and on-the-fly visualization using ChartFX and specially developed programs. The system is currently being maintained, and newly collected data are added to the DB system every year.

Analysis of the Unstructured Traffic Report from Traffic Broadcasting Network by Adapting the Text Mining Methodology (텍스트 마이닝을 적용한 한국교통방송제보 비정형데이터의 분석)

  • Roh, You Jin;Bae, Sang Hoon
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.17 no.3
    • /
    • pp.87-97
    • /
    • 2018
  • The traffic accident reports generated by the Traffic Broadcasting Network (TBN) are unstructured data. They nevertheless have value as a kind of real-time traffic information, since they are produced from the viewpoint of the drivers and/or pedestrians who were on the road at the time and place of the accident, not of the offender or the victim involved in it. However, these reports, which constitute big data, have not commonly been applied to traffic accident analysis or traffic-related research. By adopting a text mining technique, this study provides a clue for using them to assess the impacts of traffic accidents. Seven years of traffic reports were covered by the analysis. By analyzing the reports, it was possible to identify road names, accident spot names, and times, and to identify the factors that have the greatest influence on other drivers due to traffic accidents. The authors plan to combine unstructured accident data with traffic reports in further study.

Material as a Key Element of Fashion Trend in 2010~2019 - Text Mining Analysis - (패션 트렌트(2010~2019)의 주요 요소로서 소재 - 텍스트마이닝을 통한 분석 -)

  • Jang, Namkyung;Kim, Min-Jeong
    • Fashion & Textile Research Journal
    • /
    • v.22 no.5
    • /
    • pp.551-560
    • /
    • 2020
  • Due to the nature of fashion design, which responds quickly and sensitively to change, accurate forecasting of upcoming fashion trends is an important factor in the performance of fashion product planning. This study analyzed the major phenomena of fashion trends by introducing text mining and a big data analysis method. The research questions were as follows. What is the key term of the 2010SS~2019FW fashion trend? What are the terms that are highly relevant to the key trend term by year? Which terms relevant to the key trend term have shown high frequency in news articles during the same period? Data were collected from the 2010SS~2019FW Pre-Trend data of the leading trend information company in Korea and from 45,038 articles retrieved by searching "fashion+material" in the News Big Data System. Frequency, correlation coefficient, coefficient of variation, and mapping analyses were performed using R-3.5.1. Results showed that the fashion trend information was reflected in the consumer market. The term with the highest frequency in the 2010SS~2019FW fashion trend information was material. In trend information, the terms most relevant to material were comfort, compact, look, casual, blend, functional, cotton, processing, metal and functional by year. In the news articles, functional, comfort, sports, leather, casual, eco-friendly, classic, padding, culture, and high-quality showed high frequency. Functional was the only fashion material term derived every year for 10 years. This study helps expand the scope and methods of fashion design research and improves the information analysis and forecasting capabilities of the fashion industry.
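
The statistics named in this abstract (frequency, coefficient of variation, correlation coefficient) can be sketched in a few lines. The study itself used R-3.5.1; the Python version below is only an illustration, and the yearly counts, term choices, and function names are made up for the example rather than taken from the paper.

```python
from statistics import mean, pstdev

# Hypothetical yearly term counts (made-up numbers, not data from the study).
yearly_counts = {
    "functional": [12, 11, 13, 12, 12],
    "leather": [2, 15, 1, 20, 4],
}

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# A low coefficient of variation marks a term that appears steadily every year.
cv = {term: coefficient_of_variation(c) for term, c in yearly_counts.items()}
r = pearson(yearly_counts["functional"], yearly_counts["leather"])
```

Under these made-up counts, a steadily recurring term has a much smaller coefficient of variation than a term whose frequency swings from year to year, which is one way a consistently present term could be distinguished from a seasonal one.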

Research and Development of Document Recognition System for Utilizing Image Data (이미지데이터 활용을 위한 문서인식시스템 연구 및 개발)

  • Kwag, Hee-Kue
    • The KIPS Transactions:PartB
    • /
    • v.17B no.2
    • /
    • pp.125-138
    • /
    • 2010
  • The purpose of this research is to enhance the document recognition system, which is essential for developing a full-text retrieval system for the document image data stored in the digital library of a public institution. To achieve this purpose, the main tasks of this research are: 1) analyzing the document image data and then developing image preprocessing and document structure analysis technologies for it, and 2) building a specialized knowledge base consisting of document layout and properties, a character model, and a word dictionary. In addition, by developing a management tool for this knowledge base, the document recognition system becomes able to handle various types of document image data. So far, we have developed a prototype document recognition system that combines the specialized knowledge base with the document structure analysis library and is adapted for the document image data housed in the National Archives of Korea. With the results of this research, we plan to build a test bed and evaluate the performance of the document recognition system in order to maximize the utilization of the full-text retrieval system.

Fire Accident Analysis of Hazardous Materials Using Data Analytics (Data Analytics를 활용한 위험물 화재사고 분석)

  • Shin, Eun-Ji;Koh, Moon-Soo;Shin, Dongil
    • Journal of the Korean Institute of Gas
    • /
    • v.24 no.5
    • /
    • pp.47-55
    • /
    • 2020
  • Hazardous materials accidents are not limited to leakage of the material; if the early response is not appropriate, they can lead to a fire or an explosion, which increases the scale of the damage. However, even as the 4th industrial revolution and the rise of the big data era are being discussed, systematic analysis of hazardous materials accidents based on new techniques has not been attempted, and only simple statistics are being collected. In this study, we perform a systematic analysis, using machine learning, of the fire accident data accumulated by the National Fire Service over the past 11 years (2008 ~ 2018). The analysis results are visualized and presented through text mining analysis, and the possibility of developing a damage-scale prediction model is explored by applying regression analysis to the main factors present in the hazardous materials fire accident data.

An Analysis of IT Proposal Evaluation Results using Big Data-based Opinion Mining (빅데이터 분석 기반의 오피니언 마이닝을 이용한 정보화 사업 평가 분석)

  • Kim, Hong Sam;Kim, Chong Su
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.41 no.1
    • /
    • pp.1-10
    • /
    • 2018
  • Current evaluation practices for IT projects suffer from several problems, including the difficulty of explaining evaluation results and an improperly scaled scoring system. This study aims to develop an opinion mining methodology to extract key factors for causal relationship analysis and to assess the feasibility of quantifying evaluation scores from text comments using opinion mining based on big data analysis. The research was performed on the domain of publicly procured IT proposal evaluations, which are managed by the National Procurement Service. Around 10,000 sets of comments and evaluation scores were gathered, most of which were in the form of digital data but some in paper documents. Thus, a more refined form of text was prepared using various tools. From this text, keywords for factors and polarity indicators were extracted, and domain experts selected some of them as the key factors and indicators. Those keywords were also grouped into dimensions. Causal relationships between keyword or dimension factors and evaluation scores were analyzed based on two research models, a keyword-based model and a dimension-based model, using correlation analysis and regression analysis. The results show that keyword factors such as planning, strategy, technology and PM mostly affect the evaluation result and that keywords are more appropriate forms of factors for causal relationship analysis than dimensions. It can also be asserted from the analysis that evaluation scores can be composed or calculated from unstructured text comments using opinion mining, provided that a comprehensive polarity dictionary for the Korean language is available. This study may contribute to the area of big data-based evaluation methodology and opinion mining for IT proposal evaluation, leading to a more reliable and effective IT proposal evaluation method.
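
The idea of deriving a score from text comments with a polarity dictionary can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's method: the lexicon, its weights, the English example comments, and the function name are all hypothetical, and the study's own dictionary targets Korean.

```python
# Hypothetical polarity dictionary (illustrative English words, not the study's lexicon).
POLARITY = {"clear": 1, "feasible": 1, "excellent": 2, "vague": -1, "insufficient": -2}

def polarity_score(comment, lexicon=POLARITY):
    """Sum the polarity weights of known words in a whitespace-tokenized comment."""
    words = comment.lower().replace(",", " ").replace(".", " ").split()
    return sum(lexicon.get(w, 0) for w in words)

# Made-up evaluator comments for demonstration only.
comments = [
    "Excellent strategy, clear planning.",
    "Technology plan is vague and staffing is insufficient.",
    "Feasible schedule.",
]
scores = [polarity_score(c) for c in comments]
```

Scores computed this way could then be compared with the evaluators' numeric scores through correlation or regression, which is the kind of analysis the abstract describes.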