• Title/Summary/Keyword: Text data

Practical Text Mining for Trend Analysis: Ontology to visualization in Aerospace Technology

  • Kim, Yoosin;Ju, Yeonjin;Hong, SeongGwan;Jeong, Seung Ryul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.8 / pp.4133-4145 / 2017
  • Advances in science and technology lead to a better life but also demand greater investment. The government has therefore funded promising future technologies, and for several decades substantial public resources have gone into science and technology R&D projects. However, the performance of these public investments remains unclear in many respects, so the planning and evaluation of new investments should rest on data-driven decisions backed by fact-based evidence. To this end, the government has sought evidence on trends and issues in science and technology and has accumulated a large database of research papers, patents, project reports, and R&D information. The database now supports activities such as policy planning, budget allocation, and investment evaluation, but the quality of the information falls short of expectations because text mining is limited in its ability to extract information from unstructured data such as reports and papers. To address this problem, this study proposes a practical text mining methodology for science and technology trend analysis, takes aerospace technology as a case, and applies text mining methods such as ontology development, topic analysis, network analysis, and their visualization.

A Text Categorization Method Improved by Removing Noisy Training Documents (오류 학습 문서 제거를 통한 문서 범주화 기법의 성능 향상)

  • Han, Hyoung-Dong;Ko, Young-Joong;Seo, Jung-Yun
    • Journal of KIISE:Software and Applications / v.32 no.9 / pp.912-919 / 2005
  • When binary classification is applied to multi-class text categorization, the One-Against-All method is generally used. However, this method has a problem: the documents in the negative set are not labeled by humans, so the training data can include many noisy documents. In this paper, we propose applying the Sliding Window technique and the EM algorithm to binary text classification to solve this problem. We improve binary text classification by extracting noisy documents from the training data with the Sliding Window technique and re-assigning the categories of these documents with the EM algorithm.
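
The One-Against-All setup described in this abstract can be sketched as follows. This is a minimal illustration only: the function name `one_against_all` and the sample documents are hypothetical, and the paper's Sliding Window and EM steps are not reproduced here.

```python
def one_against_all(labeled_docs):
    """Build one binary training set per category from (doc_id, category) pairs."""
    categories = sorted({category for _, category in labeled_docs})
    tasks = {}
    for target in categories:
        positives = [doc for doc, category in labeled_docs if category == target]
        # Negatives are simply "everything else". Nobody checked them against
        # `target`, which is where noisy training documents can come from.
        negatives = [doc for doc, category in labeled_docs if category != target]
        tasks[target] = {"positive": positives, "negative": negatives}
    return tasks


# Hypothetical toy corpus (not from the paper).
docs = [("d1", "sports"), ("d2", "politics"), ("d3", "economy"), ("d4", "sports")]
tasks = one_against_all(docs)
print(tasks["sports"])
```

Each category gets its own binary task, so a document that a human would also have accepted as "sports" but that was filed under another category ends up as a negative example for the "sports" classifier.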

Hazard Analysis for Usability Evaluation of Central Monitoring System through Text Network Analysis (텍스트 네트워크 분석을 통한 환자중앙감시시스템의 사용적합성 평가를 위한 위해요인 분석)

  • Ji-Yong Chung;Wonseuk Jang
    • Journal of Biomedical Engineering Research / v.45 no.4 / pp.187-194 / 2024
  • In this study, text network analysis was performed using PMS (Post-Marketing Surveillance) data collected from the FDA's MAUDE (Manufacturer and User Facility Device Experience) database to evaluate the usability of the central monitoring system. Based on the data reported from January 1, 2021 to June 30, 2023, keywords related to the central monitoring system were extracted and visualized as a text network. By analyzing the eigenvector centrality of the text network, we identified hazards and types of hazardous situations related to the usability of the central monitoring system. Eigenvector centrality was chosen because it is relatively more accurate than other centralities. In addition, we derived an appropriate use scenario to evaluate the usability of the central monitoring system. Because the data are derived from actual adverse event cases, the research results provide more realistic and valuable insights, and they are expected to contribute to improving safety and reliability by identifying user requirements for improved usability and reducing use errors in the future.
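
As a rough illustration of eigenvector centrality on a keyword network, the sketch below builds a co-occurrence graph and runs power iteration. The keywords, the report lists, and both function names are hypothetical; the abstract does not say how the authors built their network or which software they used.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(reports):
    """Undirected graph; edge weight = number of reports in which two keywords co-occur."""
    graph = defaultdict(lambda: defaultdict(float))
    for keywords in reports:
        for a, b in combinations(sorted(set(keywords)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def eigenvector_centrality(graph, iterations=200, tol=1e-9):
    """Power iteration on (A + I); the shift leaves eigenvectors unchanged and aids convergence."""
    nodes = sorted(graph)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {n: score[n] + sum(w * score[m] for m, w in graph[n].items()) for n in nodes}
        norm = sum(v * v for v in new.values()) ** 0.5
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


# Hypothetical keyword lists, one per adverse-event report (not real MAUDE data).
reports = [
    ["alarm", "display", "delay"],
    ["alarm", "battery"],
    ["alarm", "display"],
    ["display", "delay"],
]
scores = eigenvector_centrality(cooccurrence_graph(reports))
for keyword in sorted(scores, key=scores.get, reverse=True):
    print(keyword, round(scores[keyword], 3))
```

A keyword scores highly when it co-occurs with other high-scoring keywords, not merely when it has many links, which is the property that separates this measure from plain degree counts.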

Text-Mining of Online Discourse to Characterize the Nature of Pain in Low Back Pain

  • Ryu, Young Uk
    • Journal of the Korean Society of Physical Medicine / v.14 no.3 / pp.55-62 / 2019
  • PURPOSE: Text-mining has been shown to be useful for understanding the clinical characteristics and patients' concerns regarding a specific disease. Low back pain (LBP) is the most common disease in modern society and has a wide variety of causes and symptoms. However, this variety makes it difficult to understand the clinical characteristics, needs, and demands of patients with LBP. This study examined online texts on LBP to determine whether text-mining can help better understand the general characteristics of LBP and its specific elements. METHODS: Online data from www.spine-health.com were used for text-mining. Keyword frequency analysis was performed first on the complete text of postings (full-text analysis). Only the sentences containing the highest frequency word, pain, were selected. Next, texts including the sentences were used to re-analyze the keyword frequency (pain-text analysis). RESULTS: Keyword frequency analysis showed that pain is of utmost concern. Full-text analysis was dominated by structural, pathological, and therapeutic words, whereas pain-text analysis was related mainly to the location and quality of the pain. CONCLUSION: The present study indicated that text-mining for a specific element (keyword) of a particular disease could enhance the understanding of that specific aspect of the disease. This suggests that the text source must be considered when interpreting the results. Clinically, the present results suggest that clinicians should pay more attention to the pain a patient is experiencing and provide information based on medical knowledge.
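
The two-pass procedure in METHODS can be sketched as below. This is an assumption-laden illustration: the postings, the stopword list, and the function names are hypothetical, and the second pass here counts only the selected sentences, which is one possible reading of "texts including the sentences".

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not the study's list).
STOPWORDS = {"my", "the", "a", "i", "in", "is", "after", "have", "now", "of", "and", "to"}


def tokenize(text):
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keyword_frequency(texts):
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def two_pass_analysis(postings):
    """Full-text frequencies, then frequencies within sentences containing the top word."""
    full = keyword_frequency(postings)
    top_word = full.most_common(1)[0][0]
    selected = [s for p in postings for s in split_sentences(p) if top_word in tokenize(s)]
    focused = keyword_frequency(selected)
    del focused[top_word]  # the anchor word itself carries no extra information
    return top_word, full, focused


# Hypothetical postings (not data from www.spine-health.com).
postings = [
    "My lower back pain started after surgery. The pain is sharp in my left leg.",
    "The MRI showed a herniated disc. I have burning pain in my lower back.",
    "Surgery helped my disc. The pain in my leg is dull now.",
]
top_word, full, focused = two_pass_analysis(postings)
print(top_word, full.most_common(3), focused.most_common(3))
```

In this toy data the full-text counts include structural words such as "disc", while the second pass keeps only words that appear alongside the top word, such as location ("leg", "back") and quality ("sharp", "dull").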

Keyword Data Analysis Using Bayesian Conjugate Prior Distribution (베이지안 공액 사전분포를 이용한 키워드 데이터 분석)

  • Jun, Sunghae
    • The Journal of the Korea Contents Association / v.20 no.6 / pp.1-8 / 2020
  • The use of text data in big data analytics has increased, and much research on methods for text data analysis has followed. In this paper, we study Bayesian learning based on conjugate priors for analyzing keyword data extracted from text big data. Bayesian statistics provides a learning process for updating parameters when new data are added to existing data. This process is efficient in a big data environment, because large amounts of data are created and added over time on a big data platform. To show the performance and applicability of the proposed method, we carry out a case study analyzing keyword data from real patent documents.
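
The updating idea can be illustrated with a Gamma-Poisson pair, a standard conjugate model for count data. Note that the abstract does not state which conjugate family the paper uses, so the model choice, the function name `update_gamma`, and the counts below are all assumptions made for illustration.

```python
def update_gamma(shape, rate, counts):
    """Posterior of a Gamma(shape, rate) prior on a Poisson rate after observing `counts`.

    Conjugacy makes the update closed-form: add the total count to the shape
    and the number of observations to the rate.
    """
    return shape + sum(counts), rate + len(counts)


# Hypothetical per-document occurrence counts of one keyword, arriving in two batches.
batch_1 = [3, 0, 2, 4]
batch_2 = [1, 5]

prior = (1.0, 1.0)  # weakly informative prior (an assumption)
after_first = update_gamma(*prior, batch_1)
after_second = update_gamma(*after_first, batch_2)

# Updating batch by batch gives the same posterior as using all data at once.
all_at_once = update_gamma(*prior, batch_1 + batch_2)
posterior_mean = after_second[0] / after_second[1]
print(after_second, all_at_once, posterior_mean)
```

Because the posterior after one batch serves as the prior for the next, earlier documents never need to be reprocessed when new ones arrive, which is the efficiency argument made in the abstract.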

Automated Classification of PubMed Texts for Disambiguated Annotation Using Text and Data Mining

  • Choi, Yun-Jeong;Park, Seung-Soo
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.101-106 / 2005
  • As genetic knowledge grows ever faster, automated analysis and systematization into high-throughput databases have become a hot issue. One essential task is to recognize and identify genomic entities and discover their relations. However, the ambiguity of named entities is a serious problem because of their multiplicity of meanings and types. Many effective techniques have been proposed to analyze documents, yet their accuracy is high only when the data fit the model well. The purpose of this paper is to design and implement a document classification system for identifying entity problems using a combination of text and data mining, supplemented by rich data mining algorithms to enhance its performance. We propose the RTPost system, which differs in style from traditional methods and takes a fault-tolerant system approach and a data mining strategy. Its feedback cycle can enhance the accuracy of text mining. To verify feasibility, we tested our system by classifying RB-related documents among PubMed abstracts.

Wearable Computing System for the Blind (시각 장애우를 위한 Wearable Computing System)

  • Kim, Hyung-Ho;Choi, Sun-Hee;Jo, Tea-Jong;Kim, Soon-Ju;Jang, Jea-In
    • Proceedings of the KIEE Conference / 2006.04a / pp.261-263 / 2006
  • Nowadays, technologies such as RFID and sensor networks make our lives more and more comfortable. In this paper, we propose a wearable computing system for blind and deaf persons, who can easily be overlooked by such technology. The system consists of an embedded board for processing data, ultrasonic sensors for measuring distance, and motors that vibrate to signal a deaf person to look at the screen. It provides environmental information as text and voice. For example, the distance from an obstacle to the person is calculated by a data compounding module from the sensed ultrasonic reflection time. The main processing module converts this data to text or voice and delivers it to the user. Furthermore, we will extend the system with a voice recognition module and a text-to-voice converter module to help blind and deaf persons communicate with each other.
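
The distance calculation from ultrasonic reflection time can be sketched as follows. This is a generic time-of-flight calculation, not the paper's module: the speed-of-sound constant, the warning threshold, and both function names are assumptions.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value in dry air at 20 °C (assumption)


def echo_time_to_distance_m(echo_time_s, speed=SPEED_OF_SOUND_M_PER_S):
    """Convert a round-trip echo time to a one-way distance in metres."""
    if echo_time_s < 0:
        raise ValueError("echo time must be non-negative")
    # The pulse travels to the obstacle and back, so halve the path length.
    return speed * echo_time_s / 2.0


def describe_obstacle(echo_time_s, warn_within_m=1.0):
    """Turn a measurement into a short text message (hypothetical wording)."""
    distance = echo_time_to_distance_m(echo_time_s)
    prefix = "Warning: obstacle" if distance <= warn_within_m else "Obstacle"
    return f"{prefix} {distance:.2f} m ahead"


print(describe_obstacle(0.004))
print(describe_obstacle(0.02))
```

The resulting string could then be shown on a display or passed to a text-to-speech component, matching the text and voice outputs the abstract describes.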

Study of Mental Disorder Schizophrenia, based on Big Data

  • Hye-Sun Lee
    • International Journal of Advanced Culture Technology / v.11 no.4 / pp.279-285 / 2023
  • This study provides academic implications by examining trends in domestic research on schizophrenia and psychosocial therapy. For the analysis, text mining with the R program and social network analysis were applied to 65 collected papers. The results are as follows. First, the collected data were visualized through keyword analysis using the word cloud method. Second, keyword frequency analysis showed that keywords such as intervention, schizophrenia, research, patients, program, effect, society, mind, ability, and function had the highest frequencies. Third, the LDA (latent Dirichlet allocation) topic modeling results were classified into 3 keywords: patient, subjects, intervention of psychosocial, efficacy of interventions. Fourth, the social network analysis yielded connectivity, closeness centrality, and betweenness centrality. In conclusion, this study is significant in that it provides basic rehabilitation data for schizophrenia and psychosocial therapy by identifying research trends through text mining and social network analysis, a big data approach, and presenting the results visually.

Big Data Analytics of Construction Safety Incidents Using Text Mining (텍스트 마이닝을 활용한 건설안전사고 빅데이터 분석)

  • Jeong Uk Seo;Chie Hoon Song
    • Journal of the Korean Society of Industry Convergence / v.27 no.3 / pp.581-590 / 2024
  • This study aims to extract key topics through text mining of incident records (incident history, post-incident measures, preventive measures) from construction safety accident case data available on the public data portal. It also seeks to provide fundamental insights contributing to the establishment of manuals for disaster prevention by identifying correlations between these topics. After pre-processing the input data, we used the LDA-based topic modeling technique to derive the main topics. Consequently, we obtained five topics related to incident history, and four topics each related to post-incident measures and preventive measures. Although no dominant patterns emerged from the topic pattern analysis, the study holds significance as it provides quantitative information on the follow-up actions related to the incident history, thereby suggesting practical implications for the establishment of a preventive decision-making system through the linkage between accident history and subsequent measures for recurrence prevention.

A Study on Using Social Big Data for Expanding Analytical Knowledge - Domestic Big Data Supply-Demand Expectation - (분석지의 확장을 위한 소셜 빅데이터 활용연구 - 국내 '빅데이터' 수요공급 예측 -)

  • Kim, Jung-Sun;Kwon, Eun-Ju;Song, Tae-Min
    • Knowledge Management Research / v.15 no.3 / pp.169-188 / 2014
  • Big data appears set to change the knowledge management systems and methods of enterprises to a large extent. Moreover, how unstructured data such as images, video, sensor data, and text are used may determine whether an enterprise or government expands its knowledge management. In this light, this paper builds a prediction model of demand and supply for the Korean big data market using a data mining decision tree on text big data generated on the web and SNS over 3 years, with the aim of expanding the forms of knowledge management. The results indicate that the Korean big data market is a government-led market focused on hardware and storage. Demanders of big data place importance on attribute factors including interest, quickness, and economics, whereas suppliers place importance on innovation and growth. The results show that the factors affecting acceptance of big data technology differ between suppliers and demanders. This article may provide a basic method for studies on expanding an enterprise's forms of analysis and connecting them with its management activities.
