• Title/Summary/Keyword: text data analysis (텍스트 데이터 분석)

Trend of Research and Industry-Related Analysis in Data Quality Using Time Series Network Analysis (시계열 네트워크분석을 통한 데이터품질 연구경향 및 산업연관 분석)

  • Jang, Kyoung-Ae;Lee, Kwang-Suk;Kim, Woo-Je
    • KIPS Transactions on Software and Data Engineering / v.5 no.6 / pp.295-306 / 2016
  • The purpose of this paper is both to analyze research trends and to predict industrial flows using metadata from previous studies on data quality. Many attempts have been made recently to analyze research trends in various fields. However, analyses of previous studies on data quality have produced poor results because of the vast scope and volume of the data. Therefore, in this paper, we applied text mining and social network analysis in a time series network analysis of data quality literature collected from the Web of Science index database, covering papers published in international data quality journals over 10 years. The analysis results are as follows. Research decreased in Mathematical & Computational Biology, Chemistry, Health Care Sciences & Services, Biochemistry & Molecular Biology, and Medical Information Science. In contrast, it increased in Environmental Sciences, Water Resources, Geology, and Instruments & Instrumentation. In addition, the social network analysis results show that the subjects with high centrality are analysis, algorithm, and network, and that image, model, sensor, and optimization are increasing subjects in the data quality field. Furthermore, the industrial connection analysis on data quality shows a high correlation between technique, industry, health, infrastructure, and customer service. It also predicts that the Environmental Sciences, Biotechnology, and Health industries will continue to develop. This paper will be useful not only for people in the data quality industry but also for researchers who analyze research patterns and investigate industry connections in data quality.
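
The time series network analysis summarized in this abstract can be illustrated with a minimal sketch: build one keyword co-occurrence graph per period and compare normalized degree centrality across periods. The paper records, years, and the function name `degree_centrality_by_period` are hypothetical, and the abstract does not state which centrality measure or tooling the authors used.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: (publication year, keywords of one paper).
papers = [
    (2006, ["analysis", "algorithm", "network"]),
    (2006, ["analysis", "model"]),
    (2015, ["analysis", "sensor", "image"]),
    (2015, ["sensor", "image", "optimization"]),
    (2015, ["analysis", "sensor"]),
]

def degree_centrality_by_period(records):
    """Build one keyword co-occurrence graph per year and return the
    normalized degree centrality of every keyword in each graph."""
    neighbors = defaultdict(lambda: defaultdict(set))
    for year, keywords in records:
        # Two keywords are linked when they appear in the same paper.
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbors[year][a].add(b)
            neighbors[year][b].add(a)
    result = {}
    for year, graph in neighbors.items():
        n = len(graph)  # every node has at least one edge, so n >= 2
        result[year] = {k: len(v) / (n - 1) for k, v in graph.items()}
    return result

centrality = degree_centrality_by_period(papers)
for year in sorted(centrality):
    ranked = sorted(centrality[year].items(), key=lambda kv: (-kv[1], kv[0]))
    print(year, ranked)
```

Comparing the ranked lists between periods shows which keywords gain or lose centrality over time, which is the kind of shift the abstract reports for subjects such as sensor and image.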

The Fourth Industrial Revolution Core Technology Association Analysis Using Text Mining (텍스트 마이닝을 활용한 4차 산업혁명 핵심기술 연관분석)

  • Ryu, Jae-Han;You, Yen-Yoo
    • Journal of Digital Convergence / v.16 no.8 / pp.129-136 / 2018
  • This study analyzed the technology application fields and technology transfer types related to the Fourth Industrial Revolution using frequency, visualization, and association analysis from text mining of big data. The analysis used the transfer technology database of technologies registered with the NTB of KIAT over the last three years (2015 - 2017). The results are as follows. First, many of the transfer technologies regarded as core technologies of the Fourth Industrial Revolution concern robots, 3D, autonomous driving, and wearables. Second, registrations of transfer technologies such as IoT, Cloud, and VR are increasing year by year. Third, the association analysis of technology transfer types shows that IoT and VR prefer technology trading and licensing, autonomous driving prefers technology trading, wearables prefer licensing, and robots prefer technology cooperation, licensing, and technology trading.
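
As a rough illustration of the association analysis mentioned in this abstract, the sketch below computes support, confidence, and lift for a rule of the form "technology keyword → transfer type". The records and the function name `rule_metrics` are hypothetical; the abstract does not say which association measures or software the authors applied.

```python
# Hypothetical records: (technology keyword, transfer type) per registered technology.
records = [
    ("IoT", "licensing"), ("IoT", "trading"), ("IoT", "licensing"),
    ("robot", "cooperation"), ("robot", "licensing"),
    ("wearable", "licensing"), ("VR", "trading"), ("VR", "licensing"),
]

def rule_metrics(rows, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(rows)
    n_a = sum(1 for tech, _ in rows if tech == antecedent)
    n_c = sum(1 for _, kind in rows if kind == consequent)
    n_ac = sum(1 for tech, kind in rows if tech == antecedent and kind == consequent)
    support = n_ac / n                               # P(A and C)
    confidence = n_ac / n_a if n_a else 0.0          # P(C | A)
    lift = confidence / (n_c / n) if n_c else 0.0    # P(C | A) / P(C)
    return support, confidence, lift

support, confidence, lift = rule_metrics(records, "IoT", "licensing")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the technology is registered with that transfer type more often than the overall base rate would predict.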

Classification Modeling for Predicting Medical Subjects using Patients' Subjective Symptom Text (환자의 주관적 증상 텍스트에 대한 진료과목 분류 모델 구축)

  • Lee, Seohee;Kang, Juyoung
    • The Journal of Bigdata / v.6 no.1 / pp.51-62 / 2021
  • In the field of medical artificial intelligence, there has been much research on disease prediction and classification algorithms that help doctors make judgments, but relatively little interest in artificial intelligence that helps medical consumers acquire and judge information. The fact that more than 150,000 questions about which hospital to visit were asked on the NAVER portal over the past year testifies to the need to provide medical information suited to medical consumers. Therefore, in this study, we set out to build a model that classifies symptom text written directly by patients, collected from the NAVER portal, into 8 medical subjects, to help consumers choose the appropriate medical subject for their symptoms. To ensure the validity of the data describing patients' subjective symptoms, we measured the similarity between objective symptom text (typical symptoms by medical subject organized by the Seoul Emergency Medical Information Center) and subjective symptom text (NAVER data). The similarity measurements showed that two texts describing symptoms of the same medical subject had relatively higher similarity than symptom texts from different medical subjects. Following this procedure, the classification model was built with a ridge regression model on the validated subjective symptom text, resulting in an accuracy of 0.73.
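
The similarity check between objective and subjective symptom text can be sketched as follows, assuming a bag-of-words cosine similarity. The abstract does not name the similarity measure, so cosine similarity is an assumption here, and the English example texts, subject names, and whitespace tokenization are hypothetical (Korean text would need a morphological tokenizer instead).

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between whitespace-tokenized bag-of-words vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical "objective" reference symptoms per medical subject.
reference_symptoms = {
    "dermatology": "itchy rash red skin",
    "orthopedics": "knee pain swelling joint",
}
# Hypothetical "subjective" text written by a patient.
patient_text = "my skin is red and itchy"

scores = {subject: cosine_similarity(patient_text, text)
          for subject, text in reference_symptoms.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

If subjective texts score consistently higher against the reference text of their own subject than against other subjects, that supports using them as training data for a classifier.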

A Study on the Music Therapy Management Model Based on Text Mining (텍스트 마이닝 기반의 음악치료 관리 모델에 관한 연구)

  • Park, Seong-Hyun;Kim, Jae-Woong;Kim, Dong-Hyun;Cho, Han-Jin
    • Journal of the Korea Convergence Society / v.10 no.8 / pp.15-20 / 2019
  • Music therapy has shown many benefits in the treatment of disabled children and of the mind. However, no specific treatment system has yet been established for music therapy. For a music therapist to provide accurate treatment, various music therapy cases and treatment history data must be analyzed. Although the most appropriate treatment should be given to the client or patient, in practice several factors create a number of difficulties. In this paper, we propose a music therapy knowledge management model that combines existing therapy data with text mining technology. With the proposed model, similar cases can be searched, and accurate and effective treatment can be provided to the patient or client based on specific and reliable data related to the patient. This is expected to maximize the original purpose and effect of music therapy and to be useful for treating more patients.

A study on the Elements of Interest for VR Game Users Using Text Mining and Text Network Analysis - Focused on STEAM User Review Data - (텍스트마이닝과 네트워크 분석을 적용한 VR 게임 사용자의 관심 요소 연구 - STEAM 사용자 리뷰 데이터를 중심으로 -)

  • Wui, Min-Young;Na, Ji Young;Park, Young Il
    • Journal of Korea Game Society / v.18 no.6 / pp.69-82 / 2018
  • The need for high-quality VR content has risen steadily in recent years. Therefore, this study investigated the interest factors of users of VR games, which receive the most attention among VR content. We used STEAM review data and applied text mining and network analysis. As a result, we identified 4 word clusters related to VR game users, named 'presence', 'first person view game', 'auditory factor', and 'interaction'. This study is meaningful in two ways. First, user-related research can be very helpful for developing high-quality VR games. Second, it confirms that review data from VR game users can be structured, analyzed, and used.

Analysis of Transportation Big Data in Busan on Media (미디어에 나타난 부산 교통 관련 빅데이터의 분석)

  • Ban, ChaeHoon;Kim, YongSu;Lee, YeChan;Jung, YoonSeung;Jeong, DongMin;Cho, HaeChan
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2016.05a / pp.378-381 / 2016
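  • With the spread of information technology and the digital economy, large volumes of data are being produced in the information age, and big data is gaining importance and being applied in various fields. R, a big data analysis tool, is a language and environment that enables statistics-based information analysis. In this paper, we use R to analyze big data related to Busan transportation that appears in the media. We collect Busan transportation data from various media and perform a frequency survey to see which terms are distributed in the text.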
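
The frequency survey described in this abstract can be sketched in a few lines. The authors worked in R; the Python version below is only an illustration, and the headlines, stopword list, and function name `term_frequencies` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical media headlines about Busan transportation.
articles = [
    "Busan subway line extension approved",
    "Traffic congestion in Busan worsens near the bridge",
    "New bus routes ease Busan traffic",
]

# Small illustrative stopword list; a real study would use a fuller one.
STOPWORDS = {"in", "the", "near", "new"}

def term_frequencies(docs, top_n=5):
    """Count lowercase alphabetic tokens across all documents, skipping stopwords."""
    counts = Counter()
    for doc in docs:
        tokens = re.findall(r"[a-z]+", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)

top_terms = term_frequencies(articles, top_n=2)
print(top_terms)
```

For Korean source text, the regular-expression tokenizer would need to be replaced with a morphological analyzer before counting.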

A Pilot Study on Applying Text Mining Tools to Analyzing Steel Industry Trends : A Case Study of the Steel Industry for the Company "P" (철강산업 트렌드 분석을 위한 텍스트 마이닝 도입 연구 : P사(社) 사례를 중심으로)

  • Min, Ki Young;Kim, Hoon Tae;Ji, Yong Gu
    • The Journal of Society for e-Business Studies / v.19 no.3 / pp.51-64 / 2014
  • As uncertainties increase ever faster, the ability to predict the future becomes more and more important for business survival. Text mining tools are one of the main candidates for predicting the future alongside traditional quantitative analyses, but such efforts are still in their infancy. This paper introduces one such effort using the case of company "P" in the steel industry. Even with only a four-month pilot study, we found strong possibilities, though not yet robustly tested, for predicting future industrial trends using text mining tools. For these text mining case studies, we categorized steel industry trend keywords into ten categories to study a different subject for each category. Once we found a meaningful change in a trend, we investigated in more detail what happened and how. To be more robust, we first need to define the purpose of text mining analyses more clearly. Then we need to categorize industry trend keywords more systematically using systems thinking models. With these improvements, we are quite sure that applying text mining tools to industry trend analysis will contribute to predicting future industry trends as well as to identifying trends that would otherwise go unseen.

A Study on Trends in Port Safety Issues at Busan Port Using Topic Modeling (토픽모델링을 활용한 부산항 항만안전성 이슈 동향에 관한 연구)

  • 이정민;하도연;김율성
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 2023.11a / pp.66-67 / 2023
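  • Modern society now faces a variety of unpredictable risks, which increases the risk burden on the port logistics industry with its high global dependence. To identify the factors that affect the safety of the port industry, this study examines, as a time series, the issues that have affected domestic port safety from the past to the present. To this end, we applied LDA topic modeling to the text of news articles on port safety at Busan Port, Korea's representative port, to examine trends in the major port safety issues.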

Development of SVM-based Construction Project Document Classification Model to Derive Construction Risk (건설 리스크 도출을 위한 SVM 기반의 건설프로젝트 문서 분류 모델 개발)

  • Kang, Donguk;Cho, Mingeon;Cha, Gichun;Park, Seunghee
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.6 / pp.841-849 / 2023
  • Construction projects carry risks from various factors such as construction delays and construction accidents. Given these risks, the construction period of a project is mainly calculated by subjective judgment that relies on the supervisor's experience. In addition, unreasonably shortening construction to meet schedules delayed by construction delays and construction disasters causes negative consequences such as poor construction, and delayed schedules cause economic losses through the absence of infrastructure. Data-based scientific approaches and statistical analysis are needed to address the risks of such construction projects. Data collected in actual construction projects is stored as unstructured text, so data pre-processing for data-based risk analysis requires a great deal of manpower and cost; basic data produced by a data classification model using text mining is therefore required. Accordingly, in this study, we collected construction project documents and used text mining to develop a document classification model based on SVM (Support Vector Machine) that generates data for risk management. Through quantitative analysis in future research, the results are expected to enable risk management by serving as efficient and objective basic data for construction project process management.
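
To make the idea of an SVM-based document classifier concrete, here is a minimal sketch of a linear soft-margin SVM trained by sub-gradient descent on the hinge loss over bag-of-words features. It is not the authors' model: the documents, labels, hyperparameters, and the names `featurize`, `train_linear_svm`, and `predict` are all hypothetical, and a real project would more likely use an established library.

```python
from collections import Counter

def featurize(text):
    """Bag-of-words counts from whitespace tokens."""
    return Counter(text.lower().split())

def train_linear_svm(samples, epochs=50, lr=0.1, lam=0.01):
    """Train a linear SVM; samples is a list of (text, label), label +1 or -1."""
    weights, bias = {}, 0.0
    data = [(featurize(text), label) for text, label in samples]
    for _ in range(epochs):
        for x, y in data:
            score = sum(weights.get(tok, 0.0) * c for tok, c in x.items()) + bias
            # L2 regularization: shrink every weight slightly at each step.
            for tok in weights:
                weights[tok] *= 1.0 - lr * lam
            # Hinge loss: update only when the sample lies inside the margin.
            if y * score < 1.0:
                for tok, c in x.items():
                    weights[tok] = weights.get(tok, 0.0) + lr * y * c
                bias += lr * y
    return weights, bias

def predict(model, text):
    """Return +1 (risk-related) or -1 (other) for a document."""
    weights, bias = model
    x = featurize(text)
    score = sum(weights.get(tok, 0.0) * c for tok, c in x.items()) + bias
    return 1 if score >= 0 else -1

# Hypothetical labeled documents: +1 = risk-related, -1 = other.
training = [
    ("schedule delay after crane accident", 1),
    ("weekly meeting agenda minutes", -1),
    ("accident caused delay rework", 1),
    ("budget meeting agenda approved", -1),
]
model = train_linear_svm(training)
print(predict(model, "crane accident delay"), predict(model, "meeting agenda"))
```

Documents classified as risk-related could then be passed on as the basic data for risk management that the abstract describes.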

Bio-Sensing Convergence Big Data Computing Architecture (바이오센싱 융합 빅데이터 컴퓨팅 아키텍처)

  • Ko, Myung-Sook;Lee, Tae-Gyu
    • KIPS Transactions on Software and Data Engineering / v.7 no.2 / pp.43-50 / 2018
  • Biometric information computing, built on bio-information systems that combine bio-signal sensors and bio-information processing, is greatly influencing both computing systems and big data systems. Unlike conventional data formats such as text, images, and videos, biometric information forms a complex data format: text-based values give meaning to a bio-signal, important event moments are stored in an image format, and a video format is constructed for data prediction and analysis through time series analysis. Such a complex data structure may be requested separately in text, image, or video format depending on the characteristics of the data required by individual biometric information application services, or the complex data formats may be requested simultaneously depending on the situation. Since previous bio-information processing computing systems depend on conventional computing components, computing structures, and data processing methods, they have many inefficiencies in terms of data processing performance, transmission capability, storage efficiency, and system safety. In this study, we propose an improved biosensing converged big data computing architecture to build a platform that effectively supports biometric information processing computing. The proposed architecture effectively supports data storage and transmission efficiency, computing performance, and system stability. It can also lay the foundation for system implementation and service optimization for future biometric information computing.