• Title/Summary/Keyword: analyzing unstructured data

Analysis of related words for each private security service through collection of unstructured data

  • Park, Su-Hyeon;Cho, Cheol-Kyu
    • Journal of the Korea Society of Computer and Information / v.25 no.6 / pp.219-224 / 2020
  • This study aims to provide a theoretical basis for the private security industry by analyzing the perception and trends of private security in press materials, classified by period and by duty, using 'Big Kinds', a website for analyzing news big data. As the research method, scattered unstructured data were converted into structured data so that they could be analyzed, and keyword trends and related words for each private security duty were analyzed for the growth period of private security. The results show that private security was exposed heavily in the media through various crimes, accidents, and incidents, and through issues related to permanent employment. It also tended to be perceived as simple guard work rather than as a recognized field of private security. Judging from the high correlation between private security and the police, it was recognized not only as a role that assists the police force but also as a common agent in charge of public peace. The perception of private security should therefore be judged objectively, and this should serve as a foundation for recognizing private security as a main agent responsible for national safety and the maintenance of social order.
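
The abstract above does not say how related words were computed. As one common possibility, they can be approximated by counting which words co-occur with a target keyword in the same article. The sketch below is a minimal illustration under that assumption; the function name `related_words` and the sample headlines are hypothetical and are not taken from the study or from Big Kinds.

```python
from collections import Counter

def related_words(documents, target, top_n=5):
    """Return the words that most often share a document with `target`."""
    counts = Counter()
    for tokens in documents:
        unique = set(tokens)  # count each word at most once per document
        if target in unique:
            counts.update(unique - {target})
    return counts.most_common(top_n)

# Hypothetical, already-tokenized headlines (not data from the study).
docs = [
    ["security", "police", "crime"],
    ["security", "guard", "apartment"],
    ["security", "police", "event"],
    ["festival", "police"],
]

print(related_words(docs, "security", top_n=1))
```

Counting per document rather than per occurrence keeps one long article from dominating the ranking.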

An Implementation of Integrated Database for Electronic Medical Record System in East-West Medical Collaboration (한·양방 협진 전자의무기록 시스템 구축을 위한 통합 데이터베이스 구축)

  • Ahn, Yo-Chan;Oh, Sang-Bong
    • Journal of Information Technology Applications and Management / v.12 no.2 / pp.129-143 / 2005
  • In recent years, two major streams have emerged in medical information systems: 1) system integration among OCS (Order Communication System), EMR (Electronic Medical Record), PACS (Picture Archiving and Communication System), and ERP (Enterprise Resource Planning), and 2) system integration through medical collaboration between East and West medical service providers. East-West medical collaboration is one of the characteristics that differentiate the Korean medical industry from the western medical industry. East and West medical treatment differ in many respects. Although they have developed from different medical philosophies and standards, we assume that better medical care can be provided by integrating their medical procedures effectively. This paper suggests two possible approaches to integrating East and West medical information systems: a loosely coupled model and a tightly coupled model. EMR improves the quality of the medical record, which reflects the quality of clinical practice, and provides a more efficient and convenient way to input, retrieve, store, communicate, and manage medical data. We abstracted standard medical procedures from the two medical procedures performed in Daejeon Oriental Hospital and Hehwa Clinic at Daejeon University, and abstracted a database schema by analyzing the characteristics of the information needed in East-West medical collaboration. Our EMR is composed of two types of data, structured and unstructured, which are formalized in SOAP (Subjective, Objective, Assessment, Plan) format. The integrated system has been implemented and has operated successfully for six months.

Development of integrated management solution through log analysis based on Big Data (빅데이터기반의 로그분석을 통한 통합 관리 솔루션 개발)

  • Kang, Sun-Kyoung;Lee, Hyun-Chang;Shin, Seong-Yoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.541-542 / 2017
  • In this paper, we intend to develop an integrated management solution that is easy to operate and integrates complex and varied cloud environments. Its advantage is that users and administrators can conveniently solve problems, because it collects and analyzes fixed log data and unstructured log data on a big data basis and provides integrated monitoring in real time. Hypervisor log pattern analysis technology is expected to allow existing complex and varied cloud environments to be managed more efficiently.

Pilot Experiment for Named Entity Recognition of Construction-related Organizations from Unstructured Text Data

  • Baek, Seungwon;Han, Seung H.;Jung, Wooyong;Kim, Yuri
    • International conference on construction engineering and project management / 2022.06a / pp.847-854 / 2022
  • The aim of this study is to develop a Named Entity Recognition (NER) model that automatically identifies construction-related organizations in news articles. News articles were collected with a web crawling technique, and construction-related organizations were labeled in a total of 1,000 news articles. The Bidirectional Encoder Representations from Transformers (BERT) model was used to recognize clients, constructors, consultants, engineers, and others. In this pilot experiment, the best average F1 score for NER was 0.692. The result of this study is expected to contribute to the establishment of international business strategies by collecting timely information and analyzing it automatically.
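
For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes an entity-level F1 over exact (start, end, label) matches. It is an illustration only: the abstract does not state how the average F1 was computed, and the function name `entity_f1`, the span positions, and the label strings are assumptions.

```python
def entity_f1(gold, predicted):
    """Entity-level F1: a prediction counts only if span and label both match."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical annotations as (start_token, end_token, label) tuples.
gold = [(0, 2, "CLIENT"), (5, 7, "CONSTRUCTOR"), (9, 10, "CONSULTANT")]
pred = [(0, 2, "CLIENT"), (5, 7, "ENGINEER")]

# One of two predictions is correct (precision 1/2),
# and one of three gold entities is found (recall 1/3).
print(round(entity_f1(gold, pred), 3))
```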

Analyzing Contextual Polarity of Unstructured Data for Measuring Subjective Well-Being (주관적 웰빙 상태 측정을 위한 비정형 데이터의 상황기반 긍부정성 분석 방법)

  • Choi, Sukjae;Song, Yeongeun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.83-105 / 2016
  • Measuring an individual's subjective wellbeing in an accurate, unobtrusive, and cost-effective manner is a core success factor of the wellbeing support system, which is a type of medical IT service. However, although very accurate, measurements with self-report questionnaires and wearable sensors are costly and obtrusive when the wellbeing support system must run in real time. Recently, inferring the state of subjective wellbeing from unstructured data with conventional sentiment analysis has been proposed as an alternative that resolves the drawbacks of self-report questionnaires and wearable sensors. However, this approach does not consider contextual polarity, which results in lower measurement accuracy. Moreover, there is no sentiment wordnet or ontology for the subjective wellbeing area. Hence, this paper proposes a method to extract keywords and their contextual polarity representing the subjective wellbeing state from unstructured text on websites, in order to improve the reasoning accuracy of the sentiment analysis. The proposed method is as follows. First, a set of general sentiment words is proposed. SentiWordNet was adopted; this is the most widely used dictionary and contains about 100,000 words such as nouns, verbs, adjectives, and adverbs with polarities from -1.0 (extremely negative) to 1.0 (extremely positive). Second, corpora on subjective wellbeing (SWB corpora) were obtained by crawling online text. A survey was conducted to prepare a learning dataset that includes an individual's opinion and the self-reported level of wellness, such as stress and depression. The participants were asked to respond with their feelings about online news on two topics. Next, three data sources were extracted from the SWB corpora: demographic information, psychographic information, and the structural characteristics of the text (e.g., the number of words used in the text, simple statistics on the special characters used). These were considered to adjust the level of a specific SWB. Finally, a set of reasoning rules was generated for each wellbeing factor to estimate the SWB of an individual based on the text written by the individual. The experimental results suggested that using contextual polarity for each SWB factor (e.g., stress, depression) significantly improved the estimation accuracy compared to conventional sentiment analysis methods incorporating SentiWordNet. Even though literature is available on Korean sentiment analysis, such studies used only a limited set of sentiment words. Due to the small number of words, many sentences are overlooked and ignored when estimating the level of sentiment. However, the proposed method can identify multiple sentiment-neutral words as sentiment words in the context of a specific SWB factor. The results also suggest that a specific type of senti-word dictionary containing contextual polarity needs to be constructed along with a dictionary based on common sense such as SenticNet. These efforts will enrich and enlarge the application area of sentic computing. The study is helpful to practitioners and managers of wellness services in that a couple of characteristics of unstructured text have been identified for improving SWB. Consistent with the literature, the results showed that gender and age affect the SWB state when the individual is exposed to an identical cue from the online text. In addition, the length of the textual response and the usage pattern of special characters were found to indicate the individual's SWB. These imply that better SWB measurement should involve collecting the textual structure and the individual's demographic conditions. In the future, the proposed method should be improved by automated identification of the contextual polarity in order to enlarge the vocabulary in a cost-effective manner.
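
The core idea of contextual polarity, that a word neutral in a general lexicon can carry sentiment within a specific SWB factor, can be sketched as a lookup that prefers a context-specific lexicon over a general one. This is a simplified illustration and not the paper's reasoning rules: the function `score_text`, both small lexicons, and their polarity values are hypothetical. Only the polarity range of -1.0 to 1.0 follows the abstract.

```python
def score_text(tokens, general, contextual=None):
    """Average polarity of scored tokens, preferring context-specific values."""
    contextual = contextual or {}
    polarities = [contextual.get(t, general.get(t, 0.0)) for t in tokens]
    scored = [p for p in polarities if p != 0.0]  # ignore neutral words
    return sum(scored) / len(scored) if scored else 0.0

# Hypothetical lexicons with polarities in [-1.0, 1.0].
general_lexicon = {"good": 0.6, "terrible": -0.8}
stress_lexicon = {"deadline": -0.5, "overtime": -0.4}  # neutral in general use

tokens = ["deadline", "overtime", "good"]

# General lexicon only: "deadline" and "overtime" are ignored as neutral.
print(score_text(tokens, general_lexicon))
# With the stress-specific lexicon, the same text scores negative overall.
print(score_text(tokens, general_lexicon, stress_lexicon))
```

The first call sees only one sentiment word, which mirrors the abstract's point that small general lexicons cause many sentences to be overlooked.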

Feature Analyze and Research of National Convergence R&D: With Focus on the Text Mining (국가 융합 R&D 특성 분석에 관한 연구: 텍스트분석을 중심으로)

  • Yoo, KiCheol;Lee, TaeHee;Choi, SangHyun;Lee, JungHwan
    • Journal of Information Technology Applications and Management / v.27 no.1 / pp.59-73 / 2020
  • Interest in convergence is growing, and national R&D programs provide various policy and institutional support to promote convergence research. However, the characteristics of convergence research are not clearly specified at either the academic or the government level. This research collects, refines, analyzes, models, verifies, and visualizes national R&D data from the National Science and Technology Information Service (NTIS). It derives the characteristics of convergence research through text mining, focusing on the unstructured information in national R&D project data. The study confirmed a difference in perception between the definition of convergence research and practice at research sites. To address this, the study suggests that the process of establishing convergence policy should consider convergence among research subjects, collaboration among research topics that reflects researchers' varied backgrounds and characteristics, and information-based analysis of the characteristics of convergence research.

A Study on the Measures for the Development of Electronic Security in the 4th Industrial Revolution Era (4차 산업혁명 시대 Electronic Security 발전 방안에 관한 연구)

  • Kim, Min Su
    • Convergence Security Journal / v.20 no.3 / pp.109-114 / 2020
  • In the current era of the 4th industrial revolution (4IR), a convergent infrastructure has been established by actively utilizing data on top of the digital technological innovation of the 3rd industrial revolution. Technological innovation in the knowledge-information society therefore requires innovative efforts to create new business models in various areas. Accordingly, this study presents an Electronic Security Framework by proposing a Cyber-Physical Security System (CPSS) that enables more accurate prediction and more efficient utilization, based on structured data obtained by collecting, analyzing, and processing enormous amounts of unstructured data, a core technology that distinguishes the 4IR from the 3rd industrial revolution.

Analytical Research to Identify Issues Using Online Media Related to Festivals (축제 관련 온라인 매체를 활용한 이슈 도출 분석연구)

  • Lee, Jeongwon;Lee, Choong Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.493-495 / 2021
  • Local festivals, an intangible tourism resource, contribute to the local tourism industry by developing specialized products and tourism products that develop the region. With interest in regional festivals very high, much attention is paid to data analysis of what issues arise and what improvements should be made after a festival. This study targets festivals in the Danyang-gun area, which many people visit every year. It was conducted to visually identify positive and negative issues by collecting and analyzing unstructured data from online media, which avoids the difficulty of collecting commercial data.

A Study on Effective Sentiment Analysis through News Classification in Bankruptcy Prediction Model (부도예측 모형에서 뉴스 분류를 통한 효과적인 감성분석에 관한 연구)

  • Kim, Chansong;Shin, Minsoo
    • Journal of Information Technology Services / v.18 no.1 / pp.187-200 / 2019
  • Bankruptcy prediction models have consistently attracted interest in various fields. Recently, as technology for handling unstructured data has developed, research applying text mining to business prediction models has become active, and studies using this method are also increasing in bankruptcy prediction. In particular, researchers are actively trying to improve bankruptcy prediction by analyzing news data on the external environment of the corporation. However, few studies have examined which news, among the mass of news produced in real time, is effective for bankruptcy prediction. The purpose of this study was to identify the news with high impact on bankruptcy prediction. We therefore classified news by type and collection period and analyzed its impact on bankruptcy prediction based on sentiment analysis. As a result, the artificial neural network was the most effective of the algorithms used, and commentary-type news was the most effective in bankruptcy prediction. Column and straight-type news were also significant, but photo-type news was not. By collection period, news from the 4 months before the bankruptcy was the most effective in bankruptcy prediction. In this study, we propose a news classification method for sentiment analysis that is effective for bankruptcy prediction models.

Text-mining based Cause Analysis of Accidents at Workplaces in Korea (텍스트 마이닝 기법을 활용한 우리나라 산업재해의 원인분석)

  • Choi, Gi Heung
    • Journal of the Korean Society of Safety / v.37 no.3 / pp.9-15 / 2022
  • The analysis of the causes of accidents in workplaces where machines and tools are used is essential to improve the effectiveness and efficiency of safety prevention policies in places of employment in Korea. The causes of workplace accidents are not fully understood, mainly due to difficulties in analyzing the available descriptive information. This study focuses on automated accident cause analysis in workplaces based on the accident abstracts found in industrial accident reports written in an unstructured descriptive format. The method proposed in this paper is based on text data mining and uses the keyword search function of Excel software to automate the analysis. The analysis results indicate that the primary reason for the frequency of accidents is related to technical aspects at a stage in which dangerous situations occur in the workplace. Accidents due to managerial causes are typically observed when danger exists in the workplace; however, managerial actions play a more important role in reducing accident severity. A small company tends to use unsafe machines and devices, leading to further accidents due to technical causes, whereas managerial causes are more conspicuous as the company grows. To preclude the occurrence of accidents due to inadequate knowledge, the implementation of safety management and the provision of safety education to elderly workers at the early stage of their employment are particularly important for small companies with fewer than 100 workers.
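
The study above automates its analysis with the keyword search function of Excel. The same idea can be sketched in Python as a keyword-to-category lookup over accident abstracts. This is a rough analogue and not the author's procedure: the three category names echo the cause types mentioned in the abstract, while the keyword lists, the sample reports, and the function `classify_causes` are hypothetical.

```python
from collections import Counter

# Hypothetical keyword lists; the study's actual search terms are not given.
CAUSE_KEYWORDS = {
    "technical": ["guard", "defect", "malfunction"],
    "managerial": ["supervision", "procedure", "inspection"],
    "knowledge": ["training", "untrained", "unfamiliar"],
}

def classify_causes(abstract):
    """Return the sorted cause categories whose keywords appear in the text."""
    text = abstract.lower()
    return sorted(
        category
        for category, keywords in CAUSE_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )

# Hypothetical accident abstracts (not data from the study).
reports = [
    "Worker caught in press after the safety guard was removed; no supervision on site.",
    "New employee had no training on the lathe.",
]

# Tally how often each cause category appears across all reports.
tally = Counter(cause for report in reports for cause in classify_causes(report))
print(tally)
```

Substring matching is deliberately crude here. It mirrors a plain keyword search and will also match longer words that contain a keyword.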