• Title/Summary/Keyword: Sentence Generation

Design and Implementation of a Mobile Learning System for Improving Reading Ability of Hearing-impaired Persons (청각장애인의 읽기 능력 향상을 위한 2Bi 접근 모형을 활용한 모바일 학습 시스템의 설계 및 구현)

  • Jung, Mi-A;Jun, Woo-Chun
    • Journal of The Korean Association of Information Education, v.14 no.1, pp.1-12, 2010
  • Reading ability is known to be the most important means of communication for hearing-impaired students. Meanwhile, with the recent development of wireless communication technologies, mobile devices are being used in various fields of education. The purpose of this study is to design and implement a mobile system that improves the reading ability of hearing-impaired students. To this end, the "Question Generation Strategy", known as one of the effective methods for improving reading ability, is adopted to create the study contents. The 2Bi (Bilingual-Bicultural) Approach Model, an attractive model for improving the reading ability of hearing-impaired students, is also used. The proposed mobile system has the following characteristics. First, the system lets students learn written language usage through the repetition of, and differences between, two organically related curricula for hearing-impaired students. Second, the study contents are designed to increase sentence comprehension through an activity in which students read articles, write questions, and answer them by themselves. Third, the system is designed and implemented so that students can choose study contents individually, anytime and anywhere, according to their study levels.

Automatic Training Corpus Generation Method of Named Entity Recognition Using Knowledge-Bases (개체명 인식 코퍼스 생성을 위한 지식베이스 활용 기법)

  • Park, Youngmin;Kim, Yejin;Kang, Sangwoo;Seo, Jungyun
    • Korean Journal of Cognitive Science, v.27 no.1, pp.27-41, 2016
  • Named entity recognition classifies elements in text into predefined categories and is used in various applications that receive natural language input. In this paper, we propose a method that automatically generates a named entity training corpus using knowledge bases. We apply two different methods to generate the corpus, depending on the knowledge base. One method attaches named entity labels to text data using Wikipedia. The other crawls data from the web and labels named entities in the web text using Freebase. We conduct two experiments to evaluate the corpus quality and the proposed method for automatically generating a named entity recognition corpus. We randomly extract sentences from the two corpora, called the Wikipedia corpus and the Web corpus, and label them to validate both automatically labeled corpora. We also report the performance of a named entity recognizer trained on the corpus generated by the proposed method. The results show that the proposed method adapts well to new corpora that reflect diverse sentence structures and the newest entities.
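
The knowledge-base labeling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny `KNOWLEDGE_BASE` dictionary, the entity types, and the `label_sentence` helper are all hypothetical stand-ins for entries that a real system would draw from Wikipedia or Freebase, and plain string matching is assumed in place of whatever matching the paper uses.

```python
import re

# Hypothetical mini knowledge base: surface form -> entity type.
# A real system would populate this from Wikipedia or Freebase entries.
KNOWLEDGE_BASE = {
    "Acme Labs": "ORGANIZATION",
    "Han River": "LOCATION",
    "Seoul": "LOCATION",
}


def label_sentence(sentence, kb):
    """Return sorted (start, end, surface, type) spans for knowledge-base matches."""
    spans = []
    # Try longer surface forms first so multi-word names win over substrings.
    for surface in sorted(kb, key=len, reverse=True):
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, sentence):
            start, end = match.span()
            # Skip matches that overlap a span that is already labeled.
            if any(start < e and s < end for s, e, _, _ in spans):
                continue
            spans.append((start, end, surface, kb[surface]))
    return sorted(spans)


sentence = "Acme Labs opened an office near the Han River in Seoul."
labels = label_sentence(sentence, KNOWLEDGE_BASE)
for start, end, surface, entity_type in labels:
    print(f"{surface} [{start}:{end}] -> {entity_type}")
```

Sentences labeled this way could then be written out as training examples; noisy or ambiguous matches are the main risk of such automatic labeling, which is presumably why the abstract reports a separate validation of corpus quality.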

A Study on Regression Class Generation of MLLR Adaptation Using State Level Sharing (상태레벨 공유를 이용한 MLLR 적응화의 회귀클래스 생성에 관한 연구)

  • 오세진;성우창;김광동;노덕규;송민규;정현열
    • The Journal of the Acoustical Society of Korea, v.22 no.8, pp.727-739, 2003
  • In this paper, we propose a method for generating regression classes for adaptation in the HM-Net (Hidden Markov Network) system. The MLLR (Maximum Likelihood Linear Regression) adaptation approach is applied to the HM-Net speech recognition system in order to express speaker characteristics effectively and to use HM-Net in various tasks. For state-level sharing, we adopt the contextual-domain state splitting of the PDT-SSS (Phonetic Decision Tree-based Successive State Splitting) algorithm, which performs clustering in both the contextual and time domains. In each contextual-domain state, the desired phoneme classes are determined by splitting the context information (classes) that includes the target speaker's speech data. The number of adaptation parameters, such as means and variances, is controlled automatically by the contextual-domain state splitting of PDT-SSS, depending on the context information and the amount of adaptation utterances from a new speaker. Experiments are performed on the KLE (Center for Korean Language Engineering) 452 data and the YNU (Yeungnam Univ.) 200 data to verify the effectiveness of the proposed method. The experimental results show that the accuracies of phone, word, and sentence recognition increased by 34∼37%, 9%, and 20%, respectively. When compared across different lengths of adaptation utterances, performance also improved significantly even with short adaptation utterances. We therefore argue that the proposed regression class method is well suited to HM-Net speech recognition systems that employ MLLR speaker adaptation.
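
For orientation, the standard MLLR mean transform (the general textbook form, not a formula taken from this paper) can be written as follows. The symbols are assumptions introduced here: $\mu_m$ is the mean of Gaussian $m$, $r(m)$ is the regression class that Gaussian $m$ belongs to, and $A$, $b$, $W$ are the transform parameters shared by all Gaussians in that class.

```latex
\hat{\mu}_m = A_{r(m)}\,\mu_m + b_{r(m)} = W_{r(m)}\,\xi_m,
\qquad
\xi_m = \begin{bmatrix} 1 \\ \mu_m \end{bmatrix},
\quad
W_{r(m)} = \begin{bmatrix} b_{r(m)} & A_{r(m)} \end{bmatrix}
```

Because every Gaussian in a regression class shares one transform, the number of classes determines how many parameters must be estimated from the adaptation data, which is the quantity the abstract says is controlled by state splitting.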

The Effect of Encoding strategy and Transfer Appropriate Processing on Prospective Memory Performance (부호화 전략 유형과 동시과제 처리 적절성이 미래계획기억 수행에 미치는 효과)

  • Park, Youngshin
    • Korean Journal of Cognitive Science, v.27 no.1, pp.101-127, 2016
  • The present study examined the effect of meta-cognitive strategy and transfer appropriate processing (TAP) on prospective memory (PM) performance. In two experiments, the encoding strategy for PM target words was manipulated by instructions. Participants assigned to the meta-cognitive strategy condition were asked to rate task difficulty (EOL) and to predict their own performance (JOL), while participants in the cognitive strategy condition were asked to remember target words through pleasantness ratings and sentence generation. In Experiment 1 and Experiment 2, all participants in both conditions performed both a TAP ongoing task and a TIP ongoing task. The results revealed a benefit of meta-cognition and transfer appropriate processing on PM performance. Furthermore, the benefit of TAP was diminished in the cognitive strategy condition. There were no costs on the judgment tasks across conditions. The findings suggest that meta-cognition allows PM targets and intentions to be sustained regardless of cognitive resources.

Link between Periodontal Disease and Cancer: A Recent Research Trend (염증-치주 질환과 암에 관한 최근 연구 동향)

  • Lee, Shin Hwa;Choi, Yung Hyun
    • Journal of Life Science, v.23 no.4, pp.602-608, 2013
  • The multifaceted role of chronic inflammation in multistep carcinogenesis has been extensively investigated and well documented. Periodontal diseases are associated with multifactorial agents, including bacterial endotoxins and the generation of an inflammatory response, indicating that poor oral health is associated with a variety of systemic diseases. The association between poor oral health, chronic inflammation, smoking, and increased alcohol consumption as risk factors for tumorigenesis is well established. More recently, associations between oral health and tooth loss and gastric, lung, and pancreatic cancers have been explored, with some studies pointing to smoking and oral health as a common link with an increased risk for malignant disease. In addition, epidemiological studies consistently indicate increased risks of various cancers with periodontal disease or poor oral condition caused by oral bacteria, which may activate alcohol- and smoking-related carcinogens locally or act through chronic inflammation. Appropriate oral care is vital in preventing cancer, as well as many other diseases. Thus, research on the correlation between oral care and periodontal inflammation and cancer is required. This review highlights the association between oral health and the risk of certain malignancies, such as periodontal disease-associated chemoprevention of inflammation.

A Usability Testing on the Tablet PC-based Korean High-tech AAC Software (태블릿 PC 기반 한국형 하이테크 AAC 소프트웨어의 사용성 평가)

  • Lee, Heeyeon;Hong, Ki-Hyung
    • Journal of the HCI Society of Korea, v.7 no.2, pp.35-42, 2012
  • The purpose of this study was to evaluate the usability of tablet PC-based Korean high-tech AAC (Augmentative and Alternative Communication) software. In order to develop AAC software that is appropriate to Korean cultural and linguistic contexts and to the communication needs of its users, we used scenario-based user testing to examine the necessity and ease of use of the communication functions required in native Korean communication, such as polite expressions, tense expressions, negative expressions, subject-verb auto-matching, and automatic sentence generation. We also investigated users' needs, preferences, and satisfaction regarding the tablet PC-based Korean high-tech AAC using semi-structured and open-ended questionnaires. The participants were 9 special education teachers, 6 speech therapists, and 6 parents whose children had communication disabilities. The usability testing produced generally positive responses, with overall scores above 4 out of 5 except for tense and negative expressions. The necessity and ease of use of the tense and negative expressions were rated relatively low, which might be related to an interface that is inconsistent with that of the polite expressions. In terms of the user interface (UI), users asked for clear visual feedback in symbol selection and display, a consistent interface for all functions, more natural subject-verb auto-matching, and spacing in the text within symbols. The results of the usability testing and the users' feedback might serve as a guideline for complementing and improving the functions and UI of the existing AAC software.

Translation of Korean Object Case Markers to Mongolian's Suffixes (한국어 목적격조사의 몽골어 격 어미 번역)

  • Setgelkhuu, Khulan;Shin, Joon Choul;Ock, Cheol Young
    • KIPS Transactions on Software and Data Engineering, v.8 no.2, pp.79-88, 2019
  • Machine translation (MT) systems, especially Korean-Mongolian MT systems, have recently attracted much attention because of their necessity in the age of globalization. Korean and Mongolian share the same SOV sentence structure, and arbitrarily changing the word order does not change the meaning of a sentence because of postpositional particles. These particles are attached after words to indicate their grammatical relationship to the clause or to make their meaning more specific. Hence, the particles play an important role in translation between Korean and Mongolian. However, one Korean particle can be translated into several Mongolian particles, which is a major issue for Korean-Mongolian MT systems. In this paper, to address this issue, we propose a method that combines UTagger with a Korean-Mongolian particle table. UTagger is a system that can analyze morphology, tag parts of speech, and disambiguate homographs in Korean text. The Korean-Mongolian particle table was constructed manually to match Korean particles with their Mongolian counterparts. An experiment on a test set extracted from the National Institute of Korean Language's Korean-Mongolian Learner's Dictionary shows that our method achieved an accuracy of 88.38%, improving on the result of using UTagger alone by 41.48%.

Analyzing Korean Math Word Problem Data Classification Difficulty Level Using the KoEPT Model (KoEPT 기반 한국어 수학 문장제 문제 데이터 분류 난도 분석)

  • Rhim, Sangkyu;Ki, Kyung Seo;Kim, Bugeun;Gweon, Gahgene
    • KIPS Transactions on Software and Data Engineering, v.11 no.8, pp.315-324, 2022
  • In this paper, we propose KoEPT, a Transformer-based generative model for automatically solving math word problems. A math word problem is a problem written in human language that describes an everyday situation in mathematical form. Solving math word problems requires an artificial intelligence model to understand the logic implied within the problem, so the task is being studied around the world as a way to improve the language understanding ability of artificial intelligence. For the Korean language, studies so far have mainly attempted to solve problems by classifying them into templates, but these techniques are difficult to apply to datasets with high classification difficulty. To address this limitation, this paper uses the KoEPT model, which employs 'expression' tokens and pointer networks. To measure the performance of the model, we first measured the classification difficulty scores of IL, CC, and ALG514, which are existing Korean math word problem datasets, and then evaluated KoEPT using 5-fold cross-validation. On the Korean datasets used for evaluation, KoEPT achieved state-of-the-art (SOTA) performance: 99.1% on CC, which is comparable to the existing SOTA performance, and 89.3% and 80.5% on IL and ALG514, respectively. In addition, the evaluation showed that KoEPT performed relatively better on datasets with high classification difficulty. Through an ablation study, we found that the use of 'expression' tokens and pointer networks contributed to KoEPT being less affected by classification difficulty while still obtaining good performance.
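
The 5-fold cross-validation protocol mentioned in this abstract can be sketched in a few lines. This is a generic illustration under stated assumptions, not the authors' evaluation code: the helper names `k_fold_splits` and `cross_validate`, the contiguous (unshuffled) fold assignment, and the dummy `evaluate` callback are all hypothetical.

```python
def k_fold_splits(n_items, k=5):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Distribute the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


def cross_validate(items, evaluate, k=5):
    """Average a score over k folds; `evaluate(train, test)` is caller-supplied."""
    scores = []
    for train_idx, test_idx in k_fold_splits(len(items), k):
        train = [items[i] for i in train_idx]
        test = [items[i] for i in test_idx]
        scores.append(evaluate(train, test))
    return sum(scores) / len(scores)


# Toy usage: 12 placeholder problems and a dummy scorer that ignores the model.
problems = [f"problem-{i}" for i in range(12)]
fold_test_sizes = [len(test) for _, test in k_fold_splits(len(problems))]
mean_score = cross_validate(problems, lambda train, test: len(test) / 12)
print(fold_test_sizes, mean_score)
```

In practice the data would usually be shuffled before splitting, and `evaluate` would train a model on `train` and return its accuracy on `test`.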

KOMUChat: Korean Online Community Dialogue Dataset for AI Learning (KOMUChat : 인공지능 학습을 위한 온라인 커뮤니티 대화 데이터셋 연구)

  • YongSang Yoo;MinHwa Jung;SeungMin Lee;Min Song
    • Journal of Intelligence and Information Systems, v.29 no.2, pp.219-240, 2023
  • Conversational AI that users can interact with satisfactorily is a long-standing research topic. Developing conversational AI requires training data that reflects real conversations between people, but current Korean datasets either are not in question-answer format or use honorifics, which makes it difficult for users to feel a sense of closeness. In this paper, we propose a conversation dataset (KOMUChat) consisting of 30,767 question-answer sentence pairs collected from online communities. The question-answer pairs were collected from the post titles and first comments of love and relationship counseling boards used by men and women. In addition, we removed abusive records through automatic and manual cleansing to build a high-quality dataset. To verify the validity of KOMUChat, we compared and analyzed the results of generative language models trained on KOMUChat and on a benchmark dataset. The results showed that our dataset outperformed the benchmark dataset in terms of answer appropriateness, user satisfaction, and fulfillment of conversational AI goals. The dataset is the largest open-source single-turn text dataset presented so far, and it is significant in that it provides a friendlier Korean dataset by reflecting the text styles of online communities.

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems, v.25 no.2, pp.25-38, 2019
  • Selecting high-quality information that meets the interests and needs of users from the overflowing mass of content is becoming more important over time. Amid this flood of information, efforts are being made to better reflect the user's intention in search results, rather than treating an information request as a simple string. Large IT companies such as Google and Microsoft also focus on developing knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. Finance in particular is one of the fields where text data analysis is expected to be useful and to have potential, because new information is constantly generated there and the earlier the information is obtained, the more valuable it is. Automatic knowledge extraction can be effective in areas such as the financial sector, where the flow of information is vast and new information continues to emerge. However, automatic knowledge extraction faces several practical difficulties. First, it is difficult to build corpora from different fields with the same algorithm and to extract good-quality triples. Second, producing labeled text data by hand becomes more difficult as the extent and scope of knowledge increase and patterns are constantly updated. Third, performance evaluation is difficult because of the characteristics of unsupervised learning. Finally, defining the problem for automatic knowledge extraction is not easy because the concept of knowledge is ambiguous. To overcome the limits described above and improve the semantic performance of stock-related information search, this study attempts to extract knowledge entities using a neural tensor network and to evaluate the resulting performance. Unlike other studies, the purpose of this study is to extract knowledge entities related to individual stock items.
Various but relatively simple data processing methods are applied in the presented model to solve the problems of previous research and to enhance the effectiveness of the model. Through these processes, this study makes the following three contributions. First, it presents a practical and simple automatic knowledge extraction method. Second, it shows that performance evaluation is possible through a simple problem definition. Finally, the expressiveness of the knowledge is increased by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and an objective performance evaluation method are also presented. In the empirical study conducted to confirm the usefulness of the presented model, we use experts' reports on 30 individual stocks, which are the top 30 items by publication frequency from May 30, 2017 to May 21, 2018. The total number of reports is 5,600; 3,074 reports, about 55% of the total, are designated as the training set, and the remaining 45% are designated as the testing set. Before constructing the model, all reports in the training set are classified by stock, and their entities are extracted using the KKMA named entity recognition tool. For each stock, the top 100 entities by appearance frequency are selected and vectorized using one-hot encoding. After that, a neural tensor network is used to train as many score functions as there are stocks. Thus, when a new entity from the testing set appears, we can calculate its score with every score function, and the stock whose function yields the highest score is predicted as the item related to the entity. To evaluate the presented model, we confirm its predictive power and determine whether the score functions are well constructed by calculating the hit ratio over all reports in the testing set.
In the empirical study, the presented model shows a hit ratio of 69.3% on the testing set, which consists of 2,526 reports. This hit ratio is meaningfully high despite some constraints on conducting the research. Looking at the prediction performance of the model for each stock, only 3 stocks, LG ELECTRONICS, KiaMtr, and Mando, show performance far below the average. This result may be due to interference from other similar items and to the generation of new knowledge. In this paper, we propose a methodology for finding the key entities, or combinations of entities, that are needed to search for related information in accordance with the user's investment intention. Graph data is generated using only the named entity recognition tool and is applied to the neural tensor network without a training corpus or word vectors for the field. The empirical test confirms the effectiveness of the presented model as described above. However, some limitations and points for improvement remain. Most notably, the fact that model performance is especially poor for only some stocks shows the need for further research. Finally, through the empirical study, we confirmed that the learning method presented in this study can be used to semantically match new text information with the related stocks.
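
The prediction and evaluation steps described in this abstract (one-hot encoding, one score function per stock, choosing the highest-scoring stock, and computing a hit ratio) can be sketched as below. This is only an illustration under loud assumptions: the score functions here are fixed dot products standing in for trained neural tensor network scorers, a single shared vocabulary is assumed, and the stock names, entities, and weights are invented toy values.

```python
def one_hot(entity, vocabulary):
    """Encode an entity as a one-hot list over a fixed vocabulary."""
    return [1 if entity == term else 0 for term in vocabulary]


def make_score_function(weights):
    """Stand-in for a trained per-stock scorer (NOT a neural tensor network)."""
    def score(vector):
        return sum(w * x for w, x in zip(weights, vector))
    return score


def predict_stock(entity, vocabulary, score_functions):
    """Return the stock whose score function rates the entity highest."""
    vector = one_hot(entity, vocabulary)
    return max(score_functions, key=lambda stock: score_functions[stock](vector))


def hit_ratio(examples, vocabulary, score_functions):
    """Fraction of (entity, true_stock) pairs whose prediction is correct."""
    hits = sum(
        1
        for entity, true_stock in examples
        if predict_stock(entity, vocabulary, score_functions) == true_stock
    )
    return hits / len(examples)


# Hypothetical toy data: three entities and three made-up stocks.
vocabulary = ["battery", "display", "brake"]
score_functions = {
    "StockA": make_score_function([0.9, 0.2, 0.1]),
    "StockB": make_score_function([0.1, 0.8, 0.3]),
    "StockC": make_score_function([0.2, 0.1, 0.7]),
}
examples = [
    ("battery", "StockA"),
    ("display", "StockB"),
    ("brake", "StockC"),
    ("display", "StockA"),  # deliberately mislabeled relative to the scorers
]
ratio = hit_ratio(examples, vocabulary, score_functions)
print(f"hit ratio: {ratio:.2f}")
```

The same arithmetic underlies the split reported in the abstract: 3,074 of 5,600 reports is about 55%, leaving 2,526 reports for testing.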