• Title/Summary/Keyword: Language Models


A Study on Word Sense Disambiguation Using Bidirectional Recurrent Neural Network for Korean Language

  • Min, Jihong;Jeon, Joon-Woo;Song, Kwang-Ho;Kim, Yoo-Sung
    • Journal of the Korea Society of Computer and Information / v.22 no.4 / pp.41-49 / 2017
  • Word sense disambiguation (WSD), which determines the exact meaning of a homonym that can carry different meanings in a single form, is very important for understanding the semantic meaning of a text document. Many recent studies on WSD have widely used neural network language models (NNLMs), in which a neural network represents a document as vectors and analyzes its semantics. Among previous WSD studies using NNLMs, the RNN (Recurrent Neural Network) model performs better than other models because it can reflect the order of word occurrences in addition to word-appearance information in a document. However, since the RNN model uses only the forward order of word occurrences, it cannot reflect the characteristic of natural language that later words can affect the meanings of preceding words. In this paper, we propose a WSD scheme using a bidirectional RNN that can reflect not only the forward but also the backward order of word occurrences in a document. In the experiments, the accuracy of the proposed model is higher than that of the previous RNN-based method. Hence, it is confirmed that bidirectional word-order information is useful for WSD in Korean.
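
The abstract above contrasts a forward-only RNN with a bidirectional RNN for disambiguating a homonym in context. A minimal PyTorch sketch of the bidirectional idea follows; the layer sizes, vocabulary, and the way the ambiguous word's hidden state is read out are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class BiRNNWSD(nn.Module):
    """Toy bidirectional-RNN sense classifier: the hidden state at the ambiguous
    word's position combines left-to-right and right-to-left context."""
    def __init__(self, vocab_size, emb_dim=100, hid_dim=128, num_senses=4):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hid_dim, num_senses)  # forward + backward states

    def forward(self, token_ids, target_pos):
        h, _ = self.rnn(self.emb(token_ids))           # (batch, seq_len, 2*hid_dim)
        target_h = h[torch.arange(h.size(0)), target_pos]
        return self.out(target_h)                      # sense logits for the homonym

# Hypothetical usage: a 7-token sentence whose third token is the ambiguous word.
model = BiRNNWSD(vocab_size=5000)
logits = model(torch.randint(0, 5000, (1, 7)), torch.tensor([2]))
print(logits.shape)  # torch.Size([1, 4])
```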

Comparative Study on English Proficiency of Children of ESL(English as a Second Language) & EFL(English as Foreign Language) Learning Programs (ESL과 EFL학습프로그램에 의한 아동 영어능력 비교연구)

  • Yoon, Eu-Gene;Chong, Young-Sook
    • Korean Journal of Human Ecology / v.14 no.6 / pp.961-972 / 2005
  • The purpose of this study is to investigate, through an experimental method, the improvement in English proficiency of children in ESL- and EFL-style classrooms. The results are as follows: first, the scores for listening and speaking and for recognition of the alphabet were higher in the ESL program than in the EFL program. This suggests that learning in an ESL-style classroom improves English skills better than the EFL-style classroom that is common in Korea. Second, there was no difference between the two gender groups in English listening and speaking skills or in recognition of the English alphabet in either classroom style. These results suggest that the target language should be used in English classrooms by teachers and students, and that materials, books, and equipment should be in English. Teachers are expected to play decisive roles as demonstrators of speech, models and correctors of pronunciation, and providers of materials such as TVs, VCRs, CD players, and cassette recorders.


Analysis of the Korean Tokenizing Library Module (한글 토크나이징 라이브러리 모듈 분석)

  • Lee, Jae-kyung;Seo, Jin-beom;Cho, Young-bok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.78-80 / 2021
  • Research on natural language processing (NLP) is currently evolving rapidly. Natural language processing is a technology that allows computers to analyze the meaning of language used in everyday life, and it is applied in fields such as speech recognition, spell checking, and text classification. The most commonly used NLP library, NLTK, is English-oriented, which is a disadvantage for Korean language processing. Therefore, after introducing KoNLPy and soynlp, two Korean tokenizing libraries, we analyze their morphological analysis and processing techniques, compare KoNLPy with soynlp, which complements KoNLPy's shortcomings, and discuss their use in natural language processing models.

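The abstract above compares KoNLPy's dictionary-based morphological analyzers with soynlp's unsupervised, corpus-trained tokenizer. The sketch below shows one plausible way to use the two libraries side by side; the sample sentences, the toy corpus, and the choice of the Okt analyzer are illustrative assumptions, not the paper's experimental setup.

```python
from konlpy.tag import Okt                 # dictionary-based morphological analyzer
from soynlp.word import WordExtractor      # learns word scores from a raw corpus
from soynlp.tokenizer import LTokenizer

sentence = "자연어 처리는 재미있다"

# KoNLPy: analyze morphemes and part-of-speech tags with a built-in dictionary.
okt = Okt()
print(okt.morphs(sentence))   # morpheme segmentation
print(okt.pos(sentence))      # (morpheme, POS-tag) pairs

# soynlp: first learn cohesion statistics from an unlabeled corpus, then tokenize
# with the learned scores; no dictionary is required. min_frequency is lowered
# only because this toy corpus is tiny.
corpus = ["자연어 처리는 재미있다", "자연어 처리를 공부한다", "처리는 어렵다"]
extractor = WordExtractor(min_frequency=1)
extractor.train(corpus)
scores = {word: s.cohesion_forward for word, s in extractor.extract().items()}
tokenizer = LTokenizer(scores=scores)
print(tokenizer.tokenize(sentence))
```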

Korean Text to Gloss: Self-Supervised Learning approach

  • Thanh-Vu Dang;Gwang-hyun Yu;Ji-yong Kim;Young-hwan Park;Chil-woo Lee;Jin-Young Kim
    • Smart Media Journal / v.12 no.1 / pp.32-46 / 2023
  • Natural Language Processing (NLP) has grown tremendously in recent years. Bilingual and multilingual translation models have been deployed widely in machine translation and have gained vast attention from the research community. In contrast, few studies have focused on translating between spoken and sign languages, especially for non-English languages. Prior work on Sign Language Translation (SLT) has shown that a mid-level sign gloss representation enhances translation performance. Therefore, this study presents a new large-scale Korean sign language dataset, the Museum-Commentary Korean Sign Gloss (MCKSG) dataset, comprising 3,828 pairs of Korean sentences and their corresponding sign glosses used in museum-commentary contexts. In addition, we propose a translation framework based on self-supervised learning, in which the pretext task is text-to-text translation from a Korean sentence to its back-translated versions; the pre-trained network is then fine-tuned on the MCKSG dataset. Using self-supervised learning helps to overcome the shortage of sign language data. In experiments, our proposed model outperforms a baseline BERT model by 6.22%.
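
The abstract above describes a self-supervised pretext task (Korean sentence to its back-translation) followed by fine-tuning on sentence-gloss pairs. The sketch below shows the two-stage shape of that idea with HuggingFace transformers; the mT5 checkpoint, the simple training loop, and the toy Korean/gloss pairs are illustrative assumptions, not the paper's actual backbone or data.

```python
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# "google/mt5-small" is only a stand-in multilingual seq2seq checkpoint.
tok = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")
opt = AdamW(model.parameters(), lr=3e-5)

def train_pairs(pairs, epochs=1):
    """Generic seq2seq training loop shared by both stages."""
    model.train()
    for _ in range(epochs):
        for src, tgt in pairs:
            batch = tok(src, text_target=tgt, return_tensors="pt",
                        truncation=True, padding=True)
            loss = model(**batch).loss
            loss.backward()
            opt.step()
            opt.zero_grad()

# Stage 1 (pretext): Korean sentence -> a back-translated paraphrase (hypothetical data).
pretext_pairs = [("박물관은 오전 9시에 문을 엽니다.", "박물관은 아침 9시에 개장합니다.")]
train_pairs(pretext_pairs)

# Stage 2 (downstream): Korean sentence -> sign gloss sequence (hypothetical data).
gloss_pairs = [("박물관은 오전 9시에 문을 엽니다.", "박물관 아침 9시 열다")]
train_pairs(gloss_pairs)
```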

A Study on the Web Building Assistant System Using GUI Object Detection and Large Language Model (웹 구축 보조 시스템에 대한 GUI 객체 감지 및 대규모 언어 모델 활용 연구)

  • Hyun-Cheol Jang;Hyungkuk Jang
    • Proceedings of the Korea Information Processing Society Conference / 2024.05a / pp.830-833 / 2024
  • As Large Language Models (LLMs) like OpenAI's ChatGPT [1] continue to grow in popularity, new applications and services are expected to emerge. This paper introduces an experimental study of a smart web-builder assistance system that combines computer-vision-based GUI object recognition with ChatGPT (an LLM). The research strategy employs computer vision technology in conjunction with the design principles of Microsoft's "ChatGPT for Robotics: Design Principles and Model Abilities" [2]. Additionally, this research explores the capabilities of large language models like ChatGPT in various application design tasks, specifically in assisting with web-builder tasks. The study examines ChatGPT's ability to synthesize code through both directed prompts and free-form conversation strategies. The researchers also explored ChatGPT's ability to perform various tasks within the builder domain, including functions, closed-loop inference, and basic logical and mathematical reasoning. Overall, this research proposes an efficient way to perform various application-system tasks by combining natural language commands with computer vision technology and an LLM (ChatGPT), allowing users to build applications through natural language commands.
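
The abstract above pairs a GUI object detector with an LLM that synthesizes code from directed prompts. The sketch below illustrates one way such a pipeline could be wired together with the OpenAI Python SDK; the detection output format, the prompt wording, and the model name are assumptions for illustration, not the system described in the paper.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical output of a GUI object detector: class labels and bounding boxes.
detected_objects = [
    {"label": "text_input", "text": "Email", "bbox": [40, 200, 400, 240]},
    {"label": "button", "text": "Sign up", "bbox": [40, 300, 160, 340]},
]

prompt = (
    "You are assisting a web builder. Given these detected GUI elements, "
    "generate the corresponding HTML form markup:\n"
    f"{detected_objects}"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```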

Neural computer systems (뇌세포형 컴퓨터 시스템)

  • Kim, Sung-Soo;Woo, Kwang-Bang
    • Proceedings of the Institute of Control, Robotics and Systems Conference / 1988.10a / pp.552-555 / 1988
  • In this paper, the authors introduce the concepts of neural computer systems, which have been studied for over 25 years in other countries, and illustrate the models of neural networks suggested by researchers. Our fundamental hypothesis is that these models are applicable to the construction of artificial neural systems, including neural computers. We therefore regard neural computer systems as abstract computer systems based on the computational properties of human brains, particularly well suited to problems in vision and language understanding.


CNN-based Sign Language Translation Program for the Deaf (CNN기반의 청각장애인을 위한 수화번역 프로그램)

  • Hong, Kyeong-Chan;Kim, Hyung-Su;Han, Young-Hwan
    • Journal of the Institute of Convergence Signal Processing / v.22 no.4 / pp.206-212 / 2021
  • Society continues to develop, and communication methods are developing in many ways. However, these developments serve the non-disabled and offer little benefit to the deaf. Therefore, in this paper, a CNN-based sign language translation program is designed and implemented to help deaf people communicate. The program translates sign language images captured through a webcam according to their meaning, based on the training data. It uses 24,000 directly produced pieces of Korean sign data and applies U-Net segmentation to train effective classification models. In the implemented program, 'ㅋ' showed the best performance among all sign data, with 97% accuracy and a 99% F1-score, while 'ㅣ' showed the highest performance among the vowel data, with 94% accuracy and a 95.5% F1-score.
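
The abstract above describes segmenting webcam frames with U-Net and then classifying the hand shape with a CNN. The small PyTorch classifier below sketches only the classification stage; the input size, channel counts, and number of classes are illustrative assumptions rather than the paper's actual architecture.

```python
import torch
import torch.nn as nn

class SignCNN(nn.Module):
    """Toy CNN that classifies a segmented hand image into Korean letter signs."""
    def __init__(self, num_classes=24):  # assumed number of letter classes
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 64 -> 32
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 32 -> 16
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):  # x: (batch, 1, 64, 64) segmented grayscale frame
        return self.classifier(self.features(x).flatten(1))

# Hypothetical usage on one 64x64 segmented frame.
model = SignCNN()
print(model(torch.rand(1, 1, 64, 64)).shape)  # torch.Size([1, 24])
```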

How is 'Contrast' Imposed on -Nun?

  • Kim, Ji-Eun
    • Language and Information / v.16 no.1 / pp.1-24 / 2012
  • -Nun is generally known as a topic marker in Korean. However, when it is combined with an accent, it is thought to have a different function, alleged to indicate 'contrast' (Kuno 1972). Although it is uncontroversial that a -nun-marked item generates some kind of 'contrastive meaning', what 'contrast(ive)' means is still unclear. In this paper, I propose that accented -nun generates two types of implicit propositions in addition to its at-issue meaning. A simple sentence is repeatedly tested in various models to see what type of proposition each corresponds to, and it is concluded that one is a presupposition and the other an implicature. This tedious-looking test forms the main part of the first half of this paper. The presupposition is the essential factor by which the -nun-marked item obtains its 'contrastive' meaning. Based on the generation of this presupposition, I argue that -nun works as a contrast operator in a sentence. Illustrating -nun's function as a contrast operator forms the latter part of this paper.


Spontaneous Speech Language Modeling using N-gram based Similarity (N-gram 기반의 유사도를 이용한 대화체 연속 음성 언어 모델링)

  • Park Young-Hee;Chung Minhwa
    • MALSORI / no.46 / pp.117-126 / 2003
  • This paper presents our language model adaptation for Korean spontaneous speech recognition. Korean spontaneous speech exhibits various characteristics of content and style, such as filled pauses, word omission, and contraction, compared with written text corpora. Our approach focuses on improving the estimation of domain-dependent n-gram models by relevance-weighting out-of-domain text data, where style is represented by an n-gram-based tf*idf similarity. In addition to relevance weighting, we use disfluencies as predictors of neighboring words. The best result yields a 9.7% relative reduction in word error rate, showing that n-gram-based relevance weighting reflects style differences well and that disfluencies are also good predictors.

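The abstract above weights out-of-domain sentences by an n-gram-based tf*idf similarity to the in-domain (spontaneous speech) corpus before n-gram estimation. A minimal sketch of that relevance scoring is given below; the character-bigram choice, the similarity formula, and the toy corpora are assumptions for illustration, not the paper's exact formulation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy in-domain (spontaneous) and out-of-domain (written) sentences.
in_domain = ["어 그게 음 내일 갈게요", "음 글쎄요 잘 모르겠어요"]
out_of_domain = ["내일 회의가 예정되어 있습니다", "주가가 큰 폭으로 하락했습니다"]

# Character-bigram tf-idf space shared by both corpora (word n-grams also work).
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(2, 2))
vectorizer.fit(in_domain + out_of_domain)
in_vecs = vectorizer.transform(in_domain)
out_vecs = vectorizer.transform(out_of_domain)

# Relevance weight of each out-of-domain sentence = its best similarity to any
# in-domain sentence; these weights would then scale its n-gram counts.
weights = cosine_similarity(out_vecs, in_vecs).max(axis=1)
for sent, w in zip(out_of_domain, weights):
    print(f"{w:.3f}  {sent}")
```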

A Study in Design and Construction of Structured Documents for Dialogue Corpus (대화형 코퍼스의 설계 및 구조적 문서화에 관한 연구)

  • Kang Chang-Qui;Nam Myung-Woo;Yang Ok-Yul
    • The Journal of the Korea Contents Association / v.4 no.4 / pp.1-10 / 2004
  • Dialogue speech corpora that contain sufficient dialogue speech features are needed for performance assessment of spoken language dialogue systems, and the labeling information of such corpora plays an important role in improving the recognition rates of acoustic and language models. In this paper, we examine methods by which the labeling information of dialogue speech corpora can be structured. More specifically, we examine how to represent the features of dialogue speech in XML-based structured documents and how to design the repository system for this information.

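The abstract above proposes representing dialogue-speech labeling information as XML-based structured documents. The short sketch below builds one plausible labeling record with Python's standard library; the element and attribute names are invented for illustration and are not the schema designed in the paper.

```python
import xml.etree.ElementTree as ET

# One hypothetical dialogue turn with an orthographic transcript and a disfluency tag.
dialogue = ET.Element("dialogue", id="D0001", domain="travel_reservation")
turn = ET.SubElement(dialogue, "turn", speaker="caller", start="0.00", end="2.31")
ET.SubElement(turn, "transcript").text = "어 내일 서울 가는 기차표 있나요"
ET.SubElement(turn, "disfluency", type="filled_pause", word="어")

# Serialize the record; a repository system would store and index many of these.
print(ET.tostring(dialogue, encoding="unicode"))
```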