• Title/Summary/Keyword: natural language generation

Performance Improvement of a Korean Prosodic Phrase Boundary Prediction Model using Efficient Feature Selection (효율적인 기계학습 자질 선별을 통한 한국어 운율구 경계 예측 모델의 성능 향상)

  • Kim, Min-Ho;Kwon, Hyuk-Chul
    • Journal of KIISE:Software and Applications / v.37 no.11 / pp.837-844 / 2010
  • Prediction of the prosodic phrase boundary is one of the most important natural language processing tasks. We propose a statistical approach to predicting Korean prosodic phrase boundaries that incorporates efficient learning features. These new features reflect the factors that affect generation of the prosodic phrase boundary better than existing learning features do. Notably, features extracted according to the hand-crafted prosodic phrase boundary prediction rule yield higher accuracy. We developed a statistical model for Korean prosodic phrase boundaries based on the proposed features. The model achieved 86.63% accuracy for three levels (major break, minor break, no break) and 81.14% accuracy for six levels (major break with falling tone/rising tone, minor break with falling tone/rising tone/middle tone, no break).

A Study on the Traditional Sash of ‘She’ Ethnic Group in China (중국 소수민족 이족의 채대)

  • 김성희
    • Journal of the Korean Society of Costume / v.39 / pp.59-77 / 1998
  • This paper focuses on the traditional sash-weaving handicraft of the ‘She’ ethnic group, who live in the Fujian, Zhejiang, Jiangxi, and Guangdong provinces of China. The research is mainly based on fieldwork, and it analyzes and interprets the traditional sash in a systematic and reasoned way. The findings are summarized as follows: 1. Technologically, the weaving structure of the traditional sash is warp rod backed weaving. The tool used is primitive, but the weaving process involves a scientific method. 2. From a socio-cultural point of view, the sash has been a symbol of a woman's love for her lover. Every woman of this group was trained in sash weaving from childhood. 3. Ethnologically, the sash has a long history and has been exchanged with other ethnic groups such as the Miao and Han, as well as with Okinawa in Japan. The patterns in the sash look almost like characters, but they are not Chinese characters; they are an independent code of the ‘She’ group, inherited from their ancestors and to be transmitted to their posterity. This independent code is a traditional message to later generations, conveying their natural circumstances, human relationships, ethnic myths, spirit, and so on. 4. I consider the pattern in the sash to be a communicative code; compared with language, it is more repetitive and less explicit, as a closed code. In recent years China has developed rapidly, especially in the economic field. Under these circumstances, the traditional weaving culture of ethnic groups is facing a crisis of disappearance, which would be a great loss for the country as well as for humanity. For this reason, I emphasize that collaborative research into the material culture of Chinese ethnic groups is urgently needed.

A study on automation of modal analysis of a spindle system of machine tools using ANSYS (ANSYS를 활용한 공작기계 주축 시스템의 진동 모드 해석 자동화에 관한 연구)

  • Lee, Bong-Gu;Choi, Jin-Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.4 / pp.2338-2343 / 2015
  • In this study, an analytical model was developed and implemented in a tool that automates finite element analysis (FEA) of a spindle system for natural frequencies and modes in the general-purpose FEA software ANSYS. The tool was implemented in Excel VBA, which allowed graphical user interfaces (GUIs) to be developed for user interaction and an Excel spreadsheet to be used for data arrangement. Code was written in the ANSYS language to generate the geometric model of the spindle system, then to construct the analytical model from the information in the GUIs, and finally to run the FEA computation. Automating model generation and analysis can help identify a near-optimal design of the spindle system with minimal time and effort.

A Study on Smartwatch review data of SNS and sentiment analytical using opinion mining (스마트워치 SNS 리뷰 데이터와 오피니언 마이닝을 통한 감성 분석 처리에 대한 연구)

  • Shin, Donghyun;Choi, YongLak
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.1047-1050 / 2015
  • Wearable devices, along with the IoT (Internet of Things), are considered the core of the coming generation's convergence technology. Companies are competing intensely with one another to gain an early lead in the smartwatch market. Consumers who use smartwatches express their preferences by sharing their opinions through SNS (social networking services). In this study, we build an emotion dictionary consisting of attributes and emotional words related to smartwatches. Based on the emotion dictionary, SNS data are categorized by attribute through an opinion data model. The overall polarity and attribute polarity of the collected data are then determined through natural language parsing, followed by an analysis of smartwatch reviews. This study will help determine which smartwatch attributes should be improved in order to raise consumer interest in individual smartwatches.
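
The dictionary-based polarity step described in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the attribute terms, the polarity words, and the `analyze` function are hypothetical, and simple clause splitting on punctuation stands in for the natural language parsing that the study itself uses.

```python
import re
from collections import defaultdict

# Hypothetical emotion dictionary: attribute keywords and polarity words.
# These entries are illustrative only and are not taken from the study.
ATTRIBUTE_TERMS = {
    "battery": {"battery", "charging"},
    "design": {"design", "strap"},
}
POLARITY_WORDS = {"great": 1, "comfortable": 1, "poor": -1, "slow": -1}


def analyze(review: str):
    """Return (overall polarity, polarity per attribute) for one review."""
    attribute_scores = defaultdict(int)
    overall = 0
    # Assumption: punctuation-delimited clauses approximate parsed opinion units.
    for clause in re.split(r"[.,;!?]", review.lower()):
        tokens = re.findall(r"[a-z']+", clause)
        score = sum(POLARITY_WORDS.get(token, 0) for token in tokens)
        overall += score
        # Credit the clause score to every attribute mentioned in the clause.
        for attribute, terms in ATTRIBUTE_TERMS.items():
            if terms & set(tokens):
                attribute_scores[attribute] += score
    return overall, dict(attribute_scores)


overall, by_attribute = analyze(
    "The design is great, but the battery is poor and charging is slow."
)
print(overall, by_attribute)
```

Separating the overall score from the per-attribute scores mirrors the abstract's distinction between overall polarity and attribute polarity; a real system would need a much larger dictionary and proper handling of negation.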

Classification and analysis of error types for deep learning-based Korean spelling correction (딥러닝 기반 한국어 맞춤법 교정을 위한 오류 유형 분류 및 분석)

  • Koo, Seonmin;Park, Chanjun;So, Aram;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.12 no.12 / pp.65-74 / 2021
  • Recently, studies on Korean spelling correction based on machine translation and automatic noise generation have been actively conducted. These methods generate noise and use the noised data as both training and test sets. A limitation is that performance is difficult to measure accurately, because the test set is unlikely to contain noise other than the kind used for training. In addition, there is no practical standard for error types, so each study uses different error types, which makes qualitative analysis difficult. This paper proposes a new error type classification for deep learning-based Korean spelling correction research and performs an error analysis on existing commercial Korean spelling correctors (Systems A, B, and C). The analysis found that the three systems performed poorly on the error types presented in this paper other than spacing, and hardly recognized errors in word order or tense.

Evaluation of Sentimental Texts Automatically Generated by a Generative Adversarial Network (생성적 적대 네트워크로 자동 생성한 감성 텍스트의 성능 평가)

  • Park, Cheon-Young;Choi, Yong-Seok;Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering / v.8 no.6 / pp.257-264 / 2019
  • Recently, deep neural network-based approaches have performed well in various areas of natural language processing. Building a deep neural network model requires a huge amount of training data, but collecting a large training set is costly and time-consuming. Data augmentation is one solution to this problem. Augmenting text data is more difficult than augmenting image data because text consists of tokens with discrete values. Generative adversarial networks (GANs) are widely used for image generation. In this work, we generate sentimental texts using one such GAN, the CS-GAN model, which has a classifier as well as a discriminator. We evaluate the usefulness of the generated sentimental texts with various measures. The CS-GAN model not only generates more diverse texts but also improves the performance of its classifier.

Text summarization of dialogue based on BERT

  • Nam, Wongyung;Lee, Jisoo;Jang, Beakcheol
    • Journal of the Korea Society of Computer and Information / v.27 no.8 / pp.41-47 / 2022
  • In this paper, we propose a method for summarizing colloquial text that is not clearly organized. We used the SAMSum dataset of colloquial conversations and applied the BERTSumExtAbs model proposed in earlier work on automatic summarization. More than 70% of the SAMSum dataset consists of conversations between two people, and the remaining 30% consists of conversations among three or more people. Applying the automatic text summarization model to the colloquial data yielded a ROUGE-1 (R-1) score of 42.43 or higher. In addition, fine-tuning the previously proposed BERTSum text summarization model yielded a high score of 45.81. This study demonstrates the performance of generative summarization of colloquial text, and we hope it will serve as a basis for computers to understand human natural language as it is and to solve various tasks.
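
For readers unfamiliar with the ROUGE-1 metric cited in this abstract, the following Python sketch shows how a unigram-overlap score can be computed. It is a simplified illustration, not the evaluation code used in the paper: the `rouge1` function and the example sentences are hypothetical, tokenization is plain whitespace splitting, and the abstract does not state whether its figures are recall or F1. Reported scores such as 42.43 are conventionally these fractions multiplied by 100.

```python
from collections import Counter


def rouge1(reference: str, candidate: str) -> dict:
    """Unigram-overlap ROUGE-1 with whitespace tokenization (a simplification)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(cand_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical reference summary and model output, for illustration only.
scores = rouge1(
    "amanda baked cookies and will bring jerry some tomorrow",
    "amanda will bring jerry cookies",
)
print({name: round(value * 100, 2) for name, value in scores.items()})
```

Here all five candidate words appear in the nine-word reference, so precision is 1.0 and recall is 5/9. Published evaluations usually add stemming and a fixed tokenizer, so scores from this sketch are not directly comparable with reported numbers.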

Alzheimer's Diagnosis and Generation-Based Chatbot Using Hierarchical Attention and Transformer (계층적 어탠션 구조와 트랜스포머를 활용한 알츠하이머 진단과 생성 기반 챗봇)

  • Park, Jun Yeong;Choi, Chang Hwan;Shin, Su Jong;Lee, Jung Jae;Choi, Sang-il
    • Proceedings of the Korean Society of Computer Information Conference / 2022.07a / pp.333-335 / 2022
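  • In this paper, we propose a natural language processing architecture that handles, with a single model, a task that previously required two models. The single model analyzes the language patterns and conversational context of Alzheimer's patients and produces two outputs: patient classification and the chatbot's response. If a chatbot can capture a patient's linguistic characteristics in everyday life, physicians can plan more precise diagnosis and treatment for early diagnosis. The proposed model is used to develop a chatbot that replaces the questionnaire method, which previously required an expert. The model performs two natural language processing tasks. The first is 'natural language classification', which expresses as a probability whether the patient has the disease; the second is 'generating the next response' of the chatbot to the patient's answer. In the first half, a self-attention network extracts a context vector that characterizes the patient's utterances. This context vector is fed to the encoder together with the question from the chatbot (expert, interviewer) to obtain a matrix that captures the interaction between the questioner and the patient. The vectorized matrix becomes the probability value for patient classification. The matrix is then fed to the decoder together with the chatbot's (interviewer's) next response to generate the next utterance. When this architecture was trained on the Cookie Theft description corpus from DementiaBank, the loss values of the encoder and decoder decreased significantly and converged. This shows that capturing the language patterns in the utterances of Alzheimer's patients can contribute to early diagnosis and longitudinal studies of the disease.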

Voice Recognition Chatbot System for an Aging Society: Technology Development and Customized UI/UX Design (고령화 사회를 위한 음성 인식 챗봇 시스템 : 기술 개발과 맞춤형 UI/UX 설계)

  • Yun-Ji Jeong;Min-Seong Yu;Joo-Young Oh;Hyeon-Seok Hwang;Won-Whoi Hun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.4 / pp.9-14 / 2024
  • This study developed a voice recognition chatbot system to address depression and loneliness among the elderly in an aging society. The system utilizes the Whisper model, GPT 2.5, and XTTS2 to provide high-performance voice recognition, natural language processing, and text-to-speech conversion. Users can express their emotions and states and receive appropriate responses, with voice recognition functionality using familiar voices for comfort and reassurance. The UX/UI design considers the cognitive responses, visual impairments, and physical limitations of the smart senior generation, using high contrast colors and readable fonts for enhanced usability. This research is expected to improve the quality of life for the elderly through voice-based interfaces.

KOMUChat: Korean Online Community Dialogue Dataset for AI Learning (KOMUChat : 인공지능 학습을 위한 온라인 커뮤니티 대화 데이터셋 연구)

  • YongSang Yoo;MinHwa Jung;SeungMin Lee;Min Song
    • Journal of Intelligence and Information Systems / v.29 no.2 / pp.219-240 / 2023
  • Conversational AI that users can interact with satisfactorily is a long-standing research topic. To develop conversational AI, it is necessary to build training data that reflects real conversations between people, but current Korean datasets either are not in question-answer format or use honorifics, making it difficult for users to feel closeness. In this paper, we propose a conversation dataset (KOMUChat) consisting of 30,767 question-answer sentence pairs collected from online communities. The question-answer pairs were collected from post titles and first comments on love and relationship counseling boards used by men and women. In addition, we removed abusive records through automatic and manual cleansing to build a high-quality dataset. To verify the validity of KOMUChat, we compared and analyzed the results of a generative language model trained on KOMUChat and on a benchmark dataset. The results showed that our dataset outperformed the benchmark dataset in terms of answer appropriateness, user satisfaction, and fulfillment of conversational AI goals. The dataset is the largest open-source single-turn text dataset presented so far, and it is significant in providing a friendlier Korean dataset that reflects the text styles of online communities.