• Title/Abstract/Keywords: Text data

Characterization of Five Shu Acupoint Pattern in Saam Acupuncture Using Text Mining (텍스트마이닝을 통한 사암침법 오수혈 사용 패턴 분석)

  • Park, In-Soo;Jung, Won-Mo;Lee, Ye-Seul;Hahm, Dae-Hyun;Park, Hi-Joon;Chae, Younbyoung
    • Korean Journal of Acupuncture / v.32 no.2 / pp.66-74 / 2015
  • Background: Saam acupuncture was composed by applying the elemental concepts of the Five Phase theory, namely the relationships between cycles such as Saeng (Sheng, 'nourishing' or 'creating') and Geuk (Ke, 'suppressing' or 'controlling'), to the Five Phase points and 12 channels in order to compensate for the imbalance in each of the 12 main energy traits. Objective: The present study aims to identify the characteristics of Five Phase point patterns in Saam acupuncture. Methods: We analysed the five-element characteristics of the Five Phase points in Korean medical texts from the mid-Chosun Dynasty, such as Saamdoinchimguyogyeol, Dongeuibogam and Chimgugyeongheombang. Using non-negative matrix factorization (NNMF), we extracted the feature matrix of the five elements of the Five Phase points in each classic medical text. Results: In Saam acupuncture, two characteristics were most prominent: (1) the "Self" component of the Five elements, and (2) the "Mother" and "Grandmother" components of the Five elements. Conclusions: Saam acupuncture used combinations of Five-Shu acupoints based on ZangFu pattern identification. Our findings suggest that grasping the characteristics of Five Phase point combinations can improve the understanding of how relevant acupoints are selected based on ZangFu pattern identification.
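
The NNMF step mentioned in the Methods can be illustrated with a minimal sketch. It assumes a small non-negative count matrix whose rows are prescriptions and whose columns are the five elements; the matrix values, the rank of 2, and the function names are hypothetical and do not come from the paper. The update rule is the standard multiplicative update for the Frobenius objective, used here only as a stand-in for whatever implementation the authors used.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, rank, iters=300, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (m x n) into w (m x rank) and h (rank x n)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
    return w, h

def frobenius_error(v, w, h):
    approx = matmul(w, h)
    return sum((v[i][j] - approx[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5

# Hypothetical counts: rows are prescriptions, columns are
# Wood, Fire, Earth, Metal, Water.
V = [[3.0, 0.0, 1.0, 0.0, 0.0],
     [2.0, 0.0, 1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0, 0.0, 3.0],
     [0.0, 1.0, 0.0, 0.0, 2.0]]
W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

Each row of `H` can be read as one extracted feature over the five elements, and each row of `W` as how strongly a prescription expresses each feature.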

Analysis of the Perception of Autonomous Vehicles Using Text Mining Technique (텍스트 마이닝 기법을 활용한 자율주행자동차 인식분석연구)

  • Im, I-Jeong;Song, Jae-In;Lee, Ja-Young;Hwang, Kee-Yeon
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.16 no.6 / pp.231-243 / 2017
  • The purpose of this study is to improve the social acceptance of autonomous vehicles (AVs) by analyzing citizens' perceptions using a sentiment analysis technique, which is a type of text mining. The data consist of three years of accumulated internet articles and comments on AVs from 164 newspapers and Naver. According to the results, positive perceptions of AVs exist, although negative ones are more frequent. Most people also take a neutral position on AVs because of unfamiliarity and a lack of experience with them. These problems need to be addressed before the commercialization of AVs through continuous analysis of perception and social acceptance.

Automatic Generation of Concatenate Morphemes for Korean LVCSR (대어휘 연속음성 인식을 위한 결합형태소 자동생성)

  • 박영희;정민화
    • The Journal of the Acoustical Society of Korea / v.21 no.4 / pp.407-414 / 2002
  • In this paper, we present a method that automatically generates concatenate-morpheme-based language models to improve the performance of Korean large vocabulary continuous speech recognition (LVCSR). The focus is on reducing recognition errors for monosyllabic morphemes, which make up 54% of the training text corpus and are misrecognized more frequently. The knowledge-based method using part-of-speech (POS) patterns has disadvantages, such as the difficulty of writing rules and the production of many low-frequency concatenate morphemes. The proposed method automatically selects morpheme pairs from the training text data based on measures such as frequency, mutual information, and unigram log likelihood. The experiment was performed using a 7M-morpheme text corpus and a 20K-morpheme lexicon. The frequency measure, with a constraint on the number of morphemes used for concatenation, produced the best result, reducing monosyllables from 54% to 30%, bigram perplexity from 117.9 to 97.3, and MER from 21.3% to 17.6%.
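
As a rough illustration of selecting morpheme pairs by frequency and mutual information, the sketch below scores adjacent pairs in a toy corpus. The romanized morphemes, the `min_count` threshold, and the use of pointwise mutual information are assumptions made for the example; the paper's exact measures and constraints are not reproduced here.

```python
import math
from collections import Counter

def score_pairs(sentences, min_count=2):
    """Score adjacent morpheme pairs by frequency and pointwise mutual information."""
    unigrams, bigrams = Counter(), Counter()
    for morphemes in sentences:
        unigrams.update(morphemes)
        bigrams.update(zip(morphemes, morphemes[1:]))
    n_uni, n_bi = sum(unigrams.values()), sum(bigrams.values())
    scored = []
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue  # drop low-frequency candidates
        p_pair = count / n_bi
        p_a, p_b = unigrams[a] / n_uni, unigrams[b] / n_uni
        scored.append(((a, b), count, math.log2(p_pair / (p_a * p_b))))
    # Most frequent first; ties broken by higher PMI.
    scored.sort(key=lambda item: (-item[1], -item[2]))
    return scored

# Hypothetical morpheme-segmented sentences (romanized for readability).
corpus = [["hak", "gyo", "e", "ga", "da"],
          ["hak", "gyo", "e", "seo"],
          ["jip", "e", "ga", "da"]]
for pair, count, pmi in score_pairs(corpus):
    print(pair, count, round(pmi, 2))
```

Pairs near the top of the list would be the candidates for merging into a single concatenated unit in the lexicon.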

A Classification and Selection Method of Emotion Based on Classifying Emotion Terms by Users (사용자의 정서 단어 분류에 기반한 정서 분류와 선택 방법)

  • Rhee, Shin-Young;Ham, Jun-Seok;Ko, Il-Ju
    • Science of Emotion and Sensibility / v.15 no.1 / pp.97-104 / 2012
  • Recently, large amounts of text data have been produced by users, and opinion mining, which analyzes users' information and opinions, has become a hot issue. Within opinion mining, sentiment analysis in particular studies emotions such as positive, negative, happiness, and sadness by analysing personal opinions or emotions about commercial products, social issues, and politicians' opinions. For sentiment analysis, previous studies used a mapping method that sets up a distribution of emotions over two dimensions, valence and arousal. However, previous studies set up the distribution of emotions arbitrarily. To solve this problem, we composed a distribution of 12 emotions through a survey using a list of Korean emotion words. In addition, because certain emotional states in the two dimensions overlap multiple emotions, we proposed a selection method based on the roulette wheel method using a selection probability. The proposed method classifies a text into an emotion by extracting emotion terms from the text.
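
Roulette wheel selection, as named in the abstract, picks one item with probability proportional to its weight. The sketch below assumes that overlapping emotions at a point in the valence-arousal plane each carry a selection probability; the emotion labels and weights are hypothetical.

```python
import random

def roulette_select(probabilities, rng=random):
    """Pick one label with probability proportional to its weight."""
    total = sum(probabilities.values())
    if not probabilities or total <= 0:
        raise ValueError("need at least one label with positive weight")
    pick = rng.random() * total
    cumulative = 0.0
    for label, weight in probabilities.items():
        cumulative += weight
        if pick < cumulative:
            return label
    return label  # guard against floating-point round-off

# Hypothetical selection probabilities for emotions that overlap at one point.
overlapping = {"joy": 0.6, "excitement": 0.3, "surprise": 0.1}
print(roulette_select(overlapping, random.Random(7)))
```

Passing a seeded `random.Random` instance keeps a run reproducible, which is convenient when comparing classification results.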

Optimal Implementation of Format Preserving Encryption Algorithm FEA in Various Environments (다양한 환경에서의 형태보존 암호 FEA에 대한 최적 구현)

  • Park, Cheolhee;Jeong, Sooyong;Hong, Dowon;Seo, Changho
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.1 / pp.41-51 / 2018
  • Format-preserving encryption (FPE) performs encryption while preserving the size and format of the plaintext. It can therefore minimize structural changes to a database before and after encryption. For example, when encrypting data such as credit card numbers or social security numbers, the existing database structure can be maintained because FPE outputs ciphertext in the same form as the plaintext. Currently, the National Institute of Standards and Technology (NIST) recommends FF1 and FF3 as standards for FPE. Recently, in Korea, FEA, a very efficient FPE algorithm, has been adopted as the standard for FPE. In this paper, we analyze FEA and measure its performance after optimizing it in various environments.
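
To make the format-preserving property concrete, here is a toy sketch of a Feistel network over decimal digit strings. It is not FEA, FF1, or FF3, it is not secure, and it is meant only to show that the ciphertext keeps the length and digit-only format of the plaintext. The round count, the HMAC-SHA256 round function, and all names are assumptions made for the illustration.

```python
import hashlib
import hmac

ROUNDS = 8  # even, so the left/right lengths line up again after the last round

def _round_value(key, round_index, half, modulus):
    """Toy round function: HMAC-SHA256 of the other half, reduced to the digit range."""
    message = bytes([round_index]) + half.encode("ascii")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus

def _check(digits):
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a string of at least two decimal digits")

def encrypt(key, digits):
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    left, right = digits[:u], digits[u:]
    for i in range(ROUNDS):
        width = u if i % 2 == 0 else v
        mixed = (int(left) + _round_value(key, i, right, 10 ** width)) % 10 ** width
        left, right = right, str(mixed).zfill(width)
    return left + right

def decrypt(key, digits):
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    left, right = digits[:u], digits[u:]
    for i in reversed(range(ROUNDS)):
        width = u if i % 2 == 0 else v
        original = (int(right) - _round_value(key, i, left, 10 ** width)) % 10 ** width
        left, right = str(original).zfill(width), left
    return left + right

key = b"demo key, not for real use"
card = "1234567890123456"  # hypothetical 16-digit number
token = encrypt(key, card)
print(token, decrypt(key, token) == card)
```

Because the output is again a 16-digit string, a database column typed for card numbers could store it without a schema change, which is the point the abstract makes.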

Developing of Text Plagiarism Detection Model using Korean Corpus Data (한글 말뭉치를 이용한 한글 표절 탐색 모델 개발)

  • Ryu, Chang-Keon;Kim, Hyong-Jun;Cho, Hwan-Gue
    • Journal of KIISE:Computing Practices and Letters / v.14 no.2 / pp.231-235 / 2008
  • Recently we have witnessed several plagiarism scandals involving academic papers and novels, and plagiarism of documents is becoming more frequent. Although plagiarism in English has been studied for a long time, systematic and complete studies of plagiarism in Korean documents are hard to find. Since the linguistic features of Korean are quite different from those of English, English-based methods cannot be applied directly to Korean documents. In this paper, we propose a new plagiarism detection method for Korean and thoroughly test our algorithm with one benchmark Korean text corpus. The proposed method is based on "k-mer" and "local alignment", which locate the plagiarized regions of document pairs quickly and accurately. Using a Korean corpus containing more than 10 million words, we establish a probability model for the local alignment score (random similarity by chance). The experiment showed that our system was quite successful at detecting plagiarized documents.
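
The two ingredients named in the abstract can be sketched as follows: shared k-mers act as a cheap filter, and a Smith-Waterman style local alignment locates the best-matching region. The token sequences, the value of k, and the scoring parameters are assumptions for the example; English words are used only for readability, and the paper's actual tokenization of Korean text is not reproduced.

```python
def kmers(tokens, k):
    """Return the set of all runs of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def local_alignment(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score and the end position of the best region."""
    best, best_end = 0, (0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best:
                best, best_end = curr[j], (i, j)
        prev = curr
    return best, best_end

# Hypothetical document pair, already split into tokens.
doc_a = "the quick brown fox jumps".split()
doc_b = "a quick brown fox sleeps".split()

shared = kmers(doc_a, 3) & kmers(doc_b, 3)
if shared:  # only align pairs that pass the k-mer filter
    print(local_alignment(doc_a, doc_b))
```

A score well above what random document pairs produce would flag the region as suspicious, which is the role of the probability model described in the abstract.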

Customer Satisfaction Analysis for Global Cosmetic Brands: Text-mining Based Online Review Analysis (글로벌 화장품 브랜드의 소비자 만족도 분석: 텍스트마이닝 기반의 사용자 후기 분석을 중심으로)

  • Park, Jaehun;Kim, Ye-Rim;Kang, Su-Bin
    • Journal of Korean Society for Quality Management / v.49 no.4 / pp.595-607 / 2021
  • Purpose: This study introduces a systematic framework for evaluating the service satisfaction of cosmetic brands through online review analysis using a text-mining technique. Methods: The framework assumes that service satisfaction is evaluated by positive comments in online reviews. That is, the service satisfaction of a cosmetic brand is rated higher as more positive opinions appear in its online reviews. This study takes two approaches. First, it collects online review comments for the top 50 global cosmetic brands and evaluates customer service satisfaction for each cosmetic brand by applying sentiment analysis and Latent Dirichlet Allocation. Second, it analyzes the determinants that induce or influence service satisfaction and suggests guidelines for cosmetic brands with low satisfaction to improve their service satisfaction. Results: For the satisfaction evaluation, online review data were extracted for the top 50 global cosmetic brands based on 2018 sales announced by Brand Finance in the UK. The satisfaction analysis found that overall there were more positive opinions than negative opinions, and the averages for polarity, subjectivity, positive ratio, and negative ratio were 0.50, 0.76, 0.57, and 0.19, respectively. Polarity, subjectivity and positive ratio showed the opposite pattern to negative ratio, and although there were slight differences in fluctuation range and ranking among them, the patterns were almost the same. Conclusion: The usefulness of the proposed framework was verified through a case study. Although some studies have suggested methods for analyzing online reviews, they did not deal with satisfaction evaluation among competitors or with cause analysis. This study differs from previous studies in that it evaluates service satisfaction from a relative point of view among cosmetic brands and analyzes determinants.
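
The per-brand indicators reported in the Results can be illustrated with a small aggregation sketch. It assumes each review already has a polarity score in [-1, 1], and it defines positive ratio and negative ratio as the share of reviews with polarity above or below zero; these definitions, the brand names, and the scores are assumptions, since the abstract does not state how the ratios were computed.

```python
from statistics import mean

def summarize(reviews_by_brand):
    """Average polarity plus positive and negative review shares for each brand."""
    summary = {}
    for brand, polarities in reviews_by_brand.items():
        n = len(polarities)
        summary[brand] = {
            "polarity": mean(polarities),
            "positive_ratio": sum(p > 0 for p in polarities) / n,
            "negative_ratio": sum(p < 0 for p in polarities) / n,
        }
    return summary

def rank_by_polarity(summary):
    """Brand names ordered from highest to lowest average polarity."""
    return sorted(summary, key=lambda brand: summary[brand]["polarity"], reverse=True)

# Hypothetical per-review polarity scores.
reviews = {"BrandA": [0.8, 0.5, -0.2, 0.0], "BrandB": [-0.6, 0.1]}
stats = summarize(reviews)
print(rank_by_polarity(stats))
```

Ranking brands on the same indicators is what allows the relative comparison among competitors that the Conclusion emphasizes.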

Trends in the Study of Nursing Professionals in Korea: A Convergence Study of Text Network Analysis and Topic Modeling (국내 간호전문직관 연구 주제 동향: 텍스트네트워크분석과 토픽모델링의 융합)

  • Park, Chan-Sook
    • Journal of the Korea Convergence Society / v.12 no.9 / pp.295-305 / 2021
  • The purpose of this study is to explore trends in nursing professional research topics published domestically through quantitative content analysis. The research procedure consisted of collecting academic papers, refining and extracting words, and analyzing the data. A text network was built by collecting 351 papers and extracting words from their abstracts, and network analysis and topic modeling were performed. The core topics were nurses, nursing professionalism, nursing students, nursing care, professional self-concept, health care professionals, satisfaction, clinical competence, and self-efficacy. Through topic modeling, four topic groups were identified: nurses' professionalism, nursing students' professionalism, nursing professional identity, and nursing competency. Over time, the core topics remained unchanged, but topics such as role conflict and ethical values emerged in the 1990s, self-leadership and socialization in the 2000s, and clinical practice stress and support systems in the 2010s. In conclusion, multidimensional interventional research is needed to improve the nursing professionalism of clinical nurses and nursing students.
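
A text network of the kind described can be sketched as a word co-occurrence graph, with degree centrality as one simple way to surface core topics. The keyword lists below are hypothetical stand-ins for words extracted from abstracts, and degree centrality is an assumed choice; the abstract does not say which centrality measures were used.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_network(documents):
    """Count how many documents each unordered word pair appears in together."""
    edges = Counter()
    for words in documents:
        for a, b in combinations(sorted(set(words)), 2):
            edges[(a, b)] += 1
    return edges

def degree_centrality(edges):
    """Share of the other nodes that each node is directly connected to."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Hypothetical keyword lists, one per abstract.
abstracts = [["nurse", "professionalism", "satisfaction"],
             ["nurse", "student", "professionalism"],
             ["nurse", "competence"]]
network = cooccurrence_network(abstracts)
centrality = degree_centrality(network)
print(sorted(centrality.items(), key=lambda item: -item[1]))
```

Words with the highest centrality are the ones that co-occur with the widest range of other terms, which matches the intuition behind a "core topic".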

Group-based speaker embeddings for text-independent speaker verification (문장 독립 화자 검증을 위한 그룹기반 화자 임베딩)

  • Jung, Youngmoon;Eom, Youngsik;Lee, Yeonghyeon;Kim, Hoirin
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.496-502 / 2021
  • Recently, the deep speaker embedding approach has been widely used in text-independent speaker verification and shows better performance than the traditional i-vector approach. In this work, to improve the deep speaker embedding approach, we propose a novel method called group-based speaker embedding, which incorporates group information. We cluster all speakers in the training data into a predefined number of groups in an unsupervised manner, so that a fixed-length group embedding represents the corresponding group. A Group Decision Network (GDN) produces group weights, and an aggregated group embedding is generated as the weighted sum of the group embeddings using these weights. Finally, we generate a group-based embedding by adding the aggregated group embedding to the deep speaker embedding. In this way, a speaker embedding can reduce the search space of the speaker identity by incorporating group information, and can thereby flexibly represent a significant number of speakers. We conducted experiments using the VoxCeleb1 database to show that our proposed approach improves on previous approaches.
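
The final combination step described in the abstract (a weighted sum of group embeddings added to the speaker embedding) can be sketched in a few lines. The softmax over raw scores is only a stand-in for the Group Decision Network, and the vectors, scores, and function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def group_based_embedding(speaker_embedding, group_embeddings, group_scores):
    """Add the weighted sum of group embeddings to the speaker embedding."""
    weights = softmax(group_scores)  # assumption: stands in for the GDN output
    dims = range(len(speaker_embedding))
    aggregated = [sum(w * g[d] for w, g in zip(weights, group_embeddings))
                  for d in dims]
    return [s + a for s, a in zip(speaker_embedding, aggregated)]

# Hypothetical 2-dimensional embeddings and two groups.
speaker = [1.0, 0.0]
groups = [[1.0, 0.0], [0.0, 1.0]]
print(group_based_embedding(speaker, groups, group_scores=[0.0, 0.0]))
```

In a real system the embeddings would have hundreds of dimensions and the weights would come from a trained network, but the arithmetic of the combination is the same.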

Quantitative Analysis of Research Trends in Korean E-Government Using Text Mining and Network Analysis Methods (국내 전자정부 연구동향에 대한 정량적 분석: 텍스트 마이닝과 네트워크 분석 기법을 중심으로)

  • Lee, Soo-In;Shin, Shin-Ae;Kang, Dong-Seok;Kim, Sang-Hyun
    • Informatization Policy / v.25 no.4 / pp.84-107 / 2018
  • Existing research on e-government trends in Korea has the weakness that it depends only on qualitative research methods. Therefore, this study conducted a quantitative analysis as of September 2018 based on data from 1996 to 2017. A total of seven research topics were derived through text mining, of which the network centrality of the framework and public policy effect topics was identified as highly significant. The results of this study provide academic and policy implications for the development of e-government, including that using a quantitative analysis method instead of a qualitative one contributes to ensuring relative objectivity and diversity of learning.