• Title/Summary/Keyword: Text data


A Study on Research Trends of Age-Friendly Using Text Network Analysis : Focusing on 「The Korean Journal of Health Service Management」 (2007-2018) (텍스트 네트워크 분석을 활용한 고령친화 분야의 연구동향 분석 : 「보건의료산업학회지」 게재논문(2007~2018)을 중심으로)

  • Ko, Min-Seok
    • The Korean Journal of Health Service Management
    • /
    • v.13 no.4
    • /
    • pp.19-31
    • /
    • 2019
  • Objectives: The purpose of this study was to analyze research trends in age-friendly research and suggest directions for future research. Methods: For this study, 112 articles related to age-friendly research were selected from the 605 articles published in The Korean Journal of Health Service Management (2007-2018). Content analysis and text network analysis were conducted using SPSS 23.0 and NetMiner 4. Results: First, articles with two authors (30.4%) and four keywords (45.5%) were the most common. Most of the studies used quantitative methods (93.8%), and primary data (61.9%) and SPSS (77.7%) were the most frequently used for analysis. Second, seven keywords appeared in the top 10 of all the centrality measures: Elderly, Geriatric Hospital, Depression, Care Workers, Long-Term Care Facilities, Experience, and Attitude. Conclusions: This study shows the need for greater diversity of research topics, subjects, methods, and analytical tools in future age-friendly studies. In addition, it suggests activating convergence research in this field linked to various industries and services.
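
The centrality comparison described above can be reproduced in outline with any network toolkit; the following is a minimal sketch (not the authors' NetMiner 4 workflow) that builds a keyword co-occurrence network from hypothetical keyword lists and ranks nodes by degree, betweenness, and closeness centrality.

```python
# Minimal sketch of keyword co-occurrence network analysis (illustrative data).
from itertools import combinations
import networkx as nx

# Hypothetical author-keyword lists, one list per article.
articles = [
    ["Elderly", "Depression", "Long-Term Care Facilities"],
    ["Elderly", "Geriatric Hospital", "Care Workers"],
    ["Care Workers", "Attitude", "Experience"],
    ["Elderly", "Depression", "Attitude"],
]

G = nx.Graph()
for keywords in articles:
    for a, b in combinations(keywords, 2):          # co-occurrence within one article
        w = G.get_edge_data(a, b, {"weight": 0})["weight"]
        G.add_edge(a, b, weight=w + 1)

# Rank keywords by three centrality measures.
for name, cent in [("degree", nx.degree_centrality(G)),
                   ("betweenness", nx.betweenness_centrality(G)),
                   ("closeness", nx.closeness_centrality(G))]:
    top = sorted(cent.items(), key=lambda kv: kv[1], reverse=True)[:3]
    print(name, top)
```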

Development of Radar Data Use Program (레이더자료 활용 프로그램 개발)

  • Han, Myoung Sun;Lee, Dong-Ryul
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2016.05a
    • /
    • pp.233-233
    • /
    • 2016
  • The Korea Institute of Civil Engineering and Building Technology (KICT) currently operates an X-band dual-polarization radar, which produces observation data in NetCDF format. Because these NetCDF radar files store the observations in polar coordinates, the data must be converted to grid coordinates before they can be analyzed or used in other systems, and the various text-based conversion and extraction steps require considerable preprocessing, making the data difficult for general users to handle. To make these tasks easier, a Windows-based program (KICTRadar4WIN) was developed in Java. KICTRadar4WIN provides four groups of functions: radar data quality control, radar data management, radar data extraction, and radar data display. • Radar data quality control: apply QC criteria to the raw data to generate quality-controlled radar data. • Radar data management: CAPPI generation (create CAPPI data from the observed PPI and RHI data), QPE generation (create QPE data from the CAPPI data), and QPE correction (estimate the G/R ratio from gauge rainfall to produce corrected QPE data). • Radar data extraction: grid extraction (convert PPI, CAPPI, and QPE data to text files), point extraction (save, as a text file, the mean value over a selected range centered on the input point coordinates), and areal extraction (save the mean value over an input area as a text file). • Radar data display: image output (create image files of the PPI, CAPPI, and QPE variables) and KMZ output (create KMZ files from PPI, CAPPI, and QPE data).
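
KICTRadar4WIN itself is a Java/Windows tool; purely as an illustration of the polar-to-grid conversion and text extraction steps it automates, here is a minimal Python sketch that assumes a hypothetical input file and variable names (azimuth, range, reflectivity).

```python
# Illustrative polar-to-Cartesian gridding of a radar PPI sweep (hypothetical file).
import numpy as np
from netCDF4 import Dataset

with Dataset("radar_ppi.nc") as nc:                      # hypothetical file name
    azimuth = np.deg2rad(nc.variables["azimuth"][:])     # (n_rays,)
    rng = nc.variables["range"][:]                        # (n_gates,) in metres
    refl = nc.variables["reflectivity"][:]                # (n_rays, n_gates)

# Polar -> Cartesian coordinates of every gate (radar at the origin).
x = rng[np.newaxis, :] * np.sin(azimuth[:, np.newaxis])
y = rng[np.newaxis, :] * np.cos(azimuth[:, np.newaxis])

# Resample onto a regular 1 km grid by simple nearest-neighbour binning.
res = 1000.0
xi = np.round(x / res).astype(int)
yi = np.round(y / res).astype(int)
ncols = xi.max() - xi.min() + 1
nrows = yi.max() - yi.min() + 1
grid = np.full((nrows, ncols), np.nan)
grid[yi - yi.min(), xi - xi.min()] = refl

# Save the gridded field as plain text, analogous to the program's extraction step.
np.savetxt("ppi_grid.txt", grid, fmt="%.2f")
```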

Cross-Domain Text Sentiment Classification Method Based on the CNN-BiLSTM-TE Model

  • Zeng, Yuyang;Zhang, Ruirui;Yang, Liang;Song, Sujuan
    • Journal of Information Processing Systems
    • /
    • v.17 no.4
    • /
    • pp.818-833
    • /
    • 2021
  • To address the problems of low precision, insufficient feature extraction, and poor contextual modeling in existing text sentiment analysis methods, a hybrid CNN-BiLSTM-TE (convolutional neural network, bidirectional long short-term memory, and topic extraction) model was proposed. First, Chinese text data were converted into vectors through transfer learning with Word2Vec. Second, local features were extracted by the CNN. Then, contextual information was extracted by the BiLSTM network and the sentiment polarity was obtained using softmax. Finally, topics were extracted using term frequency-inverse document frequency and K-means. Compared with the CNN, BiLSTM, and gated recurrent unit (GRU) models, the CNN-BiLSTM-TE model's F1-score was higher by 0.0147, 0.0060, and 0.0052, respectively. Compared with the CNN-LSTM, LSTM-CNN, and BiLSTM-CNN models, its F1-score was higher by 0.0071, 0.0038, and 0.0049, respectively. Experimental results showed that the CNN-BiLSTM-TE model can effectively improve these indicators in application. Lastly, scalability was verified on a takeaway dataset, which has great practical value.
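
The following is a minimal Keras sketch of the CNN-BiLSTM portion of the pipeline described above; the layer sizes are illustrative rather than the paper's settings, the pretrained Word2Vec embedding weights are not loaded, and the TF-IDF/K-means topic-extraction stage is omitted.

```python
# Illustrative CNN-BiLSTM sentiment classifier (dimensions are placeholders).
import tensorflow as tf
from tensorflow.keras import layers, models

vocab_size, embed_dim, max_len, n_classes = 20000, 300, 100, 2

model = models.Sequential([
    layers.Input(shape=(max_len,)),
    # In the paper the embedding weights come from a pretrained Word2Vec model.
    layers.Embedding(vocab_size, embed_dim),
    layers.Conv1D(128, kernel_size=3, activation="relu"),  # local n-gram features
    layers.MaxPooling1D(pool_size=2),
    layers.Bidirectional(layers.LSTM(64)),                  # contextual features
    layers.Dense(n_classes, activation="softmax"),          # sentiment class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```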

Postural Control Strategies on Smart Phone use during Gait in Over 50-year-old Adults (50세 이상 성인의 보행 시 스마트폰 사용에 따른 자세 조절 전략)

  • Yu, Yeon Joo;Lee, Ki Kwang;Lee, Jung Ho;Kim, Suk Bum
    • Korean Journal of Applied Biomechanics
    • /
    • v.29 no.2
    • /
    • pp.71-77
    • /
    • 2019
  • Objective: The aim of this study was to investigate postural control strategies during smart phone use while walking in adults over 50 years of age. Method: 8 older subjects (age: 55.5±3.29 yrs, height: 159.75±4.20 cm, weight: 62.87±8.44 kg) and 10 young subjects (age: 23.8±3.19 yrs, height: 158.8±5.97 cm, weight: 53.6±5.6 kg) participated in the study. They walked at a comfortable pace along a gaitway of about 8 m while: 1) reading text on a smart phone, 2) typing text on a smart phone, or 3) walking without a phone. Gait parameters and kinematic data were evaluated using a three-dimensional movement analysis system. Results: When participants read or wrote text messages, they walked with slower speed, shorter stride length and step width, a greater flexion range of motion of the head, and more flexion of the thorax than during normal walking. Conclusion: Texting or reading messages on a smart phone while walking may pose an additional risk to pedestrians' safety.

OryzaGP: rice gene and protein dataset for named-entity recognition

  • Larmande, Pierre;Do, Huy;Wang, Yue
    • Genomics & Informatics
    • /
    • v.17 no.2
    • /
    • pp.17.1-17.3
    • /
    • 2019
  • Text mining has become an important research method in biology, with its original purpose being to extract biological entities, such as genes, proteins, and phenotypic traits, to extend knowledge from scientific papers. However, few thorough studies on text mining and application development have been performed for plant molecular biology data, especially for rice, resulting in a lack of datasets available for solving named-entity recognition tasks for this species. Since benchmarks for rice are rare, we faced various difficulties in exploiting advanced machine learning methods for accurate analysis of the rice literature. To evaluate several approaches to automatically extracting information about gene/protein entities, we built a new dataset for rice as a benchmark. This dataset is composed of a set of titles and abstracts, extracted from scientific papers focusing on the rice species, and was downloaded from PubMed. During the 5th Biomedical Linked Annotation Hackathon, a portion of the dataset was uploaded to PubAnnotation for sharing. Our ultimate goal is to offer a shared task on rice gene/protein name recognition through the BioNLP Open Shared Tasks framework using this dataset, to facilitate an open comparison and evaluation of different approaches to the task.
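
As an illustration of how such titles and abstracts can be pulled from PubMed (not the actual OryzaGP build pipeline), here is a minimal Biopython sketch; the query string and e-mail address are placeholders.

```python
# Illustrative PubMed retrieval of rice-related titles/abstracts via Entrez.
from Bio import Entrez

Entrez.email = "you@example.org"          # required by NCBI; placeholder address

# Search for a small batch of rice papers (placeholder query).
search = Entrez.read(Entrez.esearch(db="pubmed",
                                    term="Oryza sativa[Title/Abstract]",
                                    retmax=20))
ids = search["IdList"]

# Fetch plain-text titles and abstracts, the raw material for annotation.
handle = Entrez.efetch(db="pubmed", id=",".join(ids),
                       rettype="abstract", retmode="text")
print(handle.read())
handle.close()
```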

Improving the CONTES method for normalizing biomedical text entities with concepts from an ontology with (almost) no training data at BLAH5

  • Ferre, Arnaud;Ba, Mouhamadou;Bossy, Robert
    • Genomics & Informatics
    • /
    • v.17 no.2
    • /
    • pp.20.1-20.5
    • /
    • 2019
  • Entity normalization, or entity linking in the general domain, is an information extraction task that aims to annotate/bind multiple words/expressions in raw text with semantic references, such as concepts of an ontology. An ontology consists minimally of a formally organized vocabulary or hierarchy of terms, which captures knowledge of a domain. Presently, machine-learning methods, often coupled with distributional representations, achieve good performance. However, they require large training datasets, which are not always available, especially for tasks in specialized domains. CONTES (CONcept-TErm System) is a supervised method that addresses entity normalization with ontology concepts using small training datasets. CONTES has some limitations: it does not scale well to very large ontologies, it tends to overgeneralize predictions, and it lacks valid representations for out-of-vocabulary words. Here, we assess different methods for reducing the dimensionality of the ontology representation. We also propose to calibrate parameters to make the predictions more accurate, and to address the problem of out-of-vocabulary words with a dedicated method.
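
As a rough illustration of the CONTES-style idea of projecting term embeddings into a dimension-reduced concept space (not the authors' implementation), the following sketch uses random placeholder data, TruncatedSVD for the ontology-side reduction, and ridge regression for the projection.

```python
# Illustrative term-to-concept linking via a learned linear projection.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import Ridge
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
n_concepts, n_train, d_word = 500, 200, 100

# Placeholder concept vectors (e.g. an ancestry-based encoding of the ontology),
# reduced in dimension to keep the target space small.
concept_vectors = rng.normal(size=(n_concepts, n_concepts))
svd = TruncatedSVD(n_components=50, random_state=0)
concept_low = svd.fit_transform(concept_vectors)

# Placeholder word embeddings of training terms and their gold concept ids.
term_embeddings = rng.normal(size=(n_train, d_word))
train_labels = rng.integers(0, n_concepts, size=n_train)

# Learn a linear map from embedding space to the reduced concept space.
proj = Ridge(alpha=1.0).fit(term_embeddings, concept_low[train_labels])

# Normalize an unseen mention: project it and pick the nearest concept.
new_term = rng.normal(size=(1, d_word))
scores = cosine_similarity(proj.predict(new_term), concept_low)
print("predicted concept id:", scores.argmax())
```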

Building a Hierarchy of Product Categories through Text Analysis of Product Description (텍스트 분석을 통한 제품 분류 체계 수립방안: 관광분야 App을 중심으로)

  • Lim, Hyuna;Choi, Jaewon;Lee, Hong Joo
    • Knowledge Management Research
    • /
    • v.20 no.3
    • /
    • pp.139-154
    • /
    • 2019
  • With the increasing use of smartphones, many apps are being released in various fields. In order to analyze the current status and trends of apps in a specific field, it is necessary to establish a classification scheme. Various schemes considering users' behavior and the characteristics of apps have been proposed, but because new apps are released continuously, a fixed classification scheme must be updated over time. Although many aspects need to be considered when establishing a classification scheme, a scheme built around the characteristics of the apps makes it possible to grasp trends in the app market. This research proposes a method of establishing an app classification scheme from the descriptions written by app developers. For this purpose, we collected descriptions of apps in the tourism field and identified major categories through topic modeling. Using only the apps corresponding to each topic, we constructed a network of the words contained in the descriptions and identified subcategories based on these word networks. Six topics were selected, and the Clauset-Newman-Moore algorithm was applied to each topic to identify subcategories. Four or five subcategories were identified for each topic.
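
A minimal sketch of the two-stage procedure described above, using invented toy descriptions: LDA topic modeling on app descriptions, then Clauset-Newman-Moore community detection (networkx's greedy_modularity_communities) on a word co-occurrence network.

```python
# Illustrative topic modelling + community detection for subcategory discovery.
from itertools import combinations
from gensim import corpora, models
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

docs = [["hotel", "booking", "discount"],
        ["flight", "booking", "ticket"],
        ["city", "tour", "guide"],
        ["tour", "guide", "map"]]

# Stage 1: identify major categories with topic modelling.
dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, random_state=0)
print(lda.print_topics())

# Stage 2: word co-occurrence network and subcategory detection
# (in the study this is done separately for the documents of each topic).
G = nx.Graph()
for doc in docs:
    G.add_edges_from(combinations(doc, 2))
subcategories = greedy_modularity_communities(G)
print([sorted(c) for c in subcategories])
```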

Estimating Media Environments of Fashion Contents through Semantic Network Analysis from Social Network Service of Global SPA Brands (패션콘텐츠 미디어 환경 예측을 위한 해외 SPA 브랜드의 SNS 언어 네트워크 분석)

  • Jun, Yuhsun
    • Journal of the Korean Society of Clothing and Textiles
    • /
    • v.43 no.3
    • /
    • pp.427-439
    • /
    • 2019
  • This study investigated the semantic network formed by the fashion images and SNS text used by global SPA brands over the last seven years, in view of the quantity and quality of data generated by fast-changing fashion trends and a fashion content-based media environment. The research method analyzed frequency, density, and recurring keywords, visualized the networks using the UCINET 6.347 program, and classified the overall text related to fashion images on the social networks used by global SPA brands. The conclusions of the study are as follows. A common aspect of the global SPA brands is that, judging from the text extracted from SNS, exposure through product images is considered important for sales. The brands differ in the following ways. First, ZARA consistently exposes marketing featuring a variety of professions and nationalities on SNS. Second, UNIQLO exposes its collaboration promotions on SNS while steadily exposing basic items. Third, in the case of H&M, some results distinct from the other brands were found in the connectivity of each cluster category, which showed remarkably independent results.
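
The study itself used UCINET 6.347; purely as an illustration of the basic measures mentioned above (word frequency and network density), here is a small sketch over invented placeholder captions.

```python
# Illustrative frequency and network-density measures for SNS text.
from collections import Counter
from itertools import combinations
import networkx as nx

captions = ["new denim collection in store",
            "summer collection with new colors",
            "denim and basic items in store"]
tokens = [c.split() for c in captions]

# Word frequency across all captions.
freq = Counter(w for words in tokens for w in words)
print(freq.most_common(5))

# Co-occurrence network (one clique per caption) and its density.
G = nx.Graph()
for words in tokens:
    G.add_edges_from(combinations(set(words), 2))
print("network density:", nx.density(G))
```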

A weighted method for evaluating software quality (가중치를 적용한 소프트웨어 품질 평가 방법)

  • Jung, Hye Jung
    • Journal of Digital Convergence
    • /
    • v.19 no.8
    • /
    • pp.249-255
    • /
    • 2021
  • This study proposed a method for determining weights for the eight quality characteristics suggested by the international standard, namely functionality, reliability, usability, maintainability, portability, efficiency, security, and interoperability, focusing on software test reports. Currently, software quality evaluation applies the same weight to the eight quality characteristics and takes the arithmetic mean of the test results. In the proposed method, weights for the eight quality characteristics were derived from text analysis, and these weights were applied to the test reports of two products. It was confirmed that the weighted average over the quality characteristics was more effective than the unweighted one.
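
A minimal numeric sketch of the weighting idea described above, with invented scores and weights: the plain arithmetic mean over the eight quality characteristics is replaced by a weighted mean whose weights would come from text analysis of the test report.

```python
# Illustrative weighted vs. unweighted quality score (all numbers are placeholders).
scores = {"functionality": 92, "reliability": 88, "usability": 85,
          "maintainability": 80, "portability": 78, "efficiency": 90,
          "security": 95, "interoperability": 83}

# Hypothetical text-analysis-derived weights, normalised to sum to 1.
raw = {"functionality": 0.24, "reliability": 0.18, "usability": 0.12,
       "maintainability": 0.10, "portability": 0.06, "efficiency": 0.10,
       "security": 0.14, "interoperability": 0.06}
total = sum(raw.values())
weights = {k: v / total for k, v in raw.items()}

unweighted = sum(scores.values()) / len(scores)          # current practice
weighted = sum(scores[k] * weights[k] for k in scores)    # proposed weighting
print(f"unweighted mean: {unweighted:.2f}, weighted mean: {weighted:.2f}")
```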

A Study on 『HaeHokByeonUi』 by Lee, ByungHa (이병하(李炳夏)의 『해혹변의(解惑辨疑)』 연구)

  • Park, Hun-pyeong
    • Journal of Korean Medical classics
    • /
    • v.34 no.1
    • /
    • pp.1-25
    • /
    • 2021
  • Objectives : The purpose of this paper is to analyze the text of the 『HaeHokByeonUi(解惑辨疑)』 in detail and to collect information on its author, Lee, ByungHa. Methods : The family and life of Lee, ByungHa were reconstructed through genealogy and historical data published by the government. The contents and frequency of the title items were analyzed. Results : 1. The period of writing is estimated to be between 1827 and 1831. 2. At that time, there were one JeonUigam(典醫監)-bujigjang(副直長) and four medical officers who belonged to the Chijongcheong(治腫廳). 3. There was a total of 2,434 title items, of which 472 were duplicates. 4. The proportion of general vocabulary is higher than that of other vocabulary. 5. The overlapping title items are presumed to be important basic concepts within the medical texts of that time. Conclusions : 『HaeHokByeonUi(解惑辨疑)』 was likely an introductory text for those preparing for the National Medical Examination of the 19th century. It provides useful basic medical vocabulary to learners of Korean Medicine even today.