• Title/Summary/Keyword: contrastive learning


On the Effectiveness of the Special Token Cutoff Method for Korean Sentence Representation in Unsupervised Contrastive Learning (비지도 대조 학습에서 한국어 문장 표현을 위한 특수 토큰 컷오프 방법의 유효성 분석)

  • Myeongsoo Han;Yoo Hyun Jeong;Dong-Kyu Chae
    • Annual Conference on Human and Language Technology / 2023.10a / pp.491-496 / 2023
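  • Various contrastive learning methods are being studied to improve pre-trained language models and derive high-quality sentence representations. However, most contrastive learning methods consider only the relationship between sentence pairs and are limited in capturing the degree of similarity between sentences, which undermines the fundamental objective of contrastive learning. Recent studies have therefore introduced a triplet loss function to capture the relative similarity of sentences and improve contrastive learning performance. Many of these studies, however, target pre-trained language models based on English, and verification and analysis of the effectiveness of the triplet loss for Korean unsupervised contrastive learning remain insufficient. In this paper, we carefully verify whether this methodology is also effective in Korean unsupervised contrastive learning and confirm its validity through a variety of evaluation metrics. We hope that our results contribute to future research on Korean sentence representation.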
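
The abstract above does not state the paper's exact loss, so the following is only a minimal sketch of a generic margin-based triplet loss over cosine similarity. The function names, the toy 2-D embeddings, and the margin of 0.5 are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style triplet loss (generic form, not the paper's exact loss).

    The loss is zero once the anchor is more similar to the positive
    than to the negative by at least `margin` (an assumed value).
    """
    sim_pos = cosine_similarity(anchor, positive)
    sim_neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - sim_pos + sim_neg)


# Toy sentence embeddings (hypothetical values).
anchor = [1.0, 0.0]
positive = [1.0, 0.0]
negative = [0.0, 1.0]

satisfied = triplet_loss(anchor, positive, negative)  # ordering already correct
violated = triplet_loss(anchor, negative, positive)   # roles swapped on purpose
```

Unlike a pairwise objective, the loss depends on the relative ordering of the two similarities, which is the property the abstract attributes to triplet-based methods.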


Multimedia Recommender System Based on Contrastive Learning with Modality-Reflective View (모달리티 반영 뷰를 활용하는 대조 학습 기반의 멀티미디어 추천 시스템)

  • SoHee Ban;Taeri Kim;Sang-Wook Kim
    • Proceedings of the Korea Information Processing Society Conference / 2024.05a / pp.635-638 / 2024
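  • Recently, contrastive learning-based multimedia recommender systems have been actively studied. They use the various modality features of items to generate embeddings (views) for users and items, and perform contrastive learning over these views. By using the learned views for recommendation, they have achieved considerably higher recommendation accuracy than earlier multimedia recommender systems. Nevertheless, we argue that existing contrastive learning-based multimedia recommender systems overlook the importance of correctly reflecting an item's modality features when generating its views, which limits their gains in recommendation accuracy. This argument builds on the finding from prior multimedia recommender systems that correctly reflecting an item's own modality features in its embedding helps improve recommendation accuracy. In this paper, we therefore propose a new multimedia recommender system that performs contrastive learning through a view that can correctly reflect an item's modality features (specifically, a modality-reflective view). On two real-world public datasets, the proposed approach achieved up to 6.78% higher recommendation accuracy than state-of-the-art multimedia recommender systems.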
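
The abstract does not name the contrastive objective it uses. As a rough illustration of contrasting two views of the same items, the sketch below uses InfoNCE, a common choice in contrastive recommenders, purely as an assumption. The function names, the temperature, and the toy embeddings are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(view_a, view_b, temperature=0.2):
    """Mean InfoNCE loss between two views of the same items.

    view_a[i] and view_b[i] are assumed to be L2-normalized embeddings of
    item i; every other row of view_b acts as a negative for view_a[i].
    """
    losses = []
    for i, a in enumerate(view_a):
        logits = [dot(a, b) / temperature for b in view_b]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits)
        )
        # Negative log-probability of picking the matching view.
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Two items, each with two views (toy values).
items_view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(items_view_a, [[1.0, 0.0], [0.0, 1.0]])     # views agree
misaligned = info_nce(items_view_a, [[0.0, 1.0], [1.0, 0.0]])  # views swapped
```

The loss is small when each item's two views agree and large when they do not, which is why the quality of the views matters for this kind of training.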

TAGS: Text Augmentation with Generation and Selection (생성-선정을 통한 텍스트 증강 프레임워크)

  • Kim Kyung Min;Dong Hwan Kim;Seongung Jo;Heung-Seon Oh;Myeong-Ha Hwang
    • KIPS Transactions on Software and Data Engineering / v.12 no.10 / pp.455-460 / 2023
  • Text augmentation is a methodology that creates new augmented texts by transforming or generating original texts in order to improve the performance of NLP models. However, existing text augmentation techniques have limitations such as a lack of expressive diversity, semantic distortion, and a limited number of augmented texts. Recent text augmentation using large language models and few-shot learning can overcome these limitations, but it also risks introducing noise through incorrect generation. In this paper, we propose a text augmentation method called TAGS that generates multiple candidate texts and selects the appropriate ones as augmented texts. TAGS generates varied expressions using few-shot learning, and uses contrastive learning and similarity comparison to select suitable data effectively even when only a small amount of original text is available. We applied this method to task-oriented chatbot data and achieved a more than sixty-fold quantitative increase. We also analyzed the generated texts and confirmed that they are semantically and expressively diverse compared to the original texts. Moreover, we trained and evaluated a classification model using the augmented texts and showed that performance improved by more than 0.1915, confirming that the method helps improve actual model performance.
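
The abstract does not detail how TAGS selects among generated candidates. One plausible reading of "similarity comparison" is sketched below, keeping candidates whose embedding is close to the original but not a near-copy. The `select_candidates` helper, both thresholds, and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_candidates(original_vec, candidates, min_sim=0.7, max_sim=0.99):
    """Keep candidate texts within an assumed similarity band.

    `candidates` is a list of (text, embedding) pairs. Candidates below
    `min_sim` are treated as semantically distorted, and candidates above
    `max_sim` as near-duplicates that add no diversity.
    """
    kept = []
    for text, vec in candidates:
        sim = cosine_similarity(original_vec, vec)
        if min_sim <= sim <= max_sim:
            kept.append(text)
    return kept


# Toy embeddings standing in for a sentence encoder's output.
original_vec = [1.0, 0.0]
candidates = [
    ("paraphrase", [0.8, 0.6]),  # close in meaning, different wording
    ("off-topic", [0.0, 1.0]),   # unrelated generation
    ("copy", [1.0, 0.0]),        # identical to the original
]
selected = select_candidates(original_vec, candidates)
```

In practice the embeddings would come from an encoder, possibly one tuned with contrastive learning as the abstract mentions; here they are fixed by hand.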

Small Sample Face Recognition Algorithm Based on Novel Siamese Network

  • Zhang, Jianming;Jin, Xiaokang;Liu, Yukai;Sangaiah, Arun Kumar;Wang, Jin
    • Journal of Information Processing Systems / v.14 no.6 / pp.1464-1479 / 2018
  • In face recognition, the number of available training samples for a single category is sometimes insufficient, so the performance of models trained with convolutional neural networks is not ideal. This paper proposes a small-sample face recognition algorithm based on a novel Siamese network, which does not need rich samples for training. The algorithm designs and implements a new Siamese network model, SiameseFace1, which takes pairs of face images as inputs and maps them to a target space so that the $L_2$ norm distance in the target space can represent the semantic distance in the input space. The mapping is represented by the neural network in supervised learning. Moreover, a more lightweight Siamese network model, SiameseFace2, is designed to reduce the number of network parameters without losing accuracy. We also present a new method to generate training data and expand the number of training samples for a single category in the AR and Labeled Faces in the Wild (LFW) datasets, which improves the recognition accuracy of the models. Four loss functions are used in experiments on the AR and LFW datasets. The results show that the contrastive loss function combined with the new Siamese network model can effectively improve the accuracy of face recognition.
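
As a rough illustration of how a contrastive loss ties the $L_2$ distance in the target space to identity, the sketch below implements the widely used pairwise form $y\,d^2 + (1-y)\max(0, m-d)^2$. The abstract does not give the paper's exact formulation, so the function names, the margin, and the toy embeddings are assumptions.

```python
import math


def l2_distance(u, v):
    """Euclidean (L2) distance between two embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(emb_a, emb_b, same_identity, margin=1.0):
    """Pairwise contrastive loss on the outputs of the two Siamese branches.

    Matching pairs are pulled together (loss grows with distance);
    non-matching pairs are pushed apart until they are at least
    `margin` (an assumed value) away from each other.
    """
    d = l2_distance(emb_a, emb_b)
    if same_identity:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy 2-D embeddings standing in for the network's outputs.
same_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_identity=True)
diff_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_identity=False)
diff_near = contrastive_loss([0.0, 0.0], [0.5, 0.0], same_identity=False)
```

A matching pair that lands far apart is penalized heavily, a non-matching pair already beyond the margin costs nothing, and a non-matching pair inside the margin is penalized.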

MATERIALS AND METHODS FOR TEACHING INTONATION

  • Ashby, Michael
    • Proceedings of the KSPS conference / 1997.07a / pp.228-229 / 1997
  • 1. Intonation is important. It cannot be ignored. To convince students of the importance of intonation, we can use sentences with two very different interpretations according to intonation. Example: "I thought it would rain" with a fall on "rain" means it did not rain, but with a fall on "thought" and a rise on "rain" it means that it did rain. 2. Although complex, intonation is structured. For both teacher and student, the big job of tackling intonation is made simpler by remembering that intonation can be analysed into systems and units. There are three main systems in English intonation: Tonality (division into phrases), Tonicity (selection of accented syllables), and Tone (the choice of pitch movements). Examples: Tonality: My brother who lives in London is a doctor. Tonicity: Hello. How ARE you. Hello. How are YOU. Tone: Ways to say "Thank you". 3. In deciding what to teach, we must distinguish what is universal from what is specifically English. This is where contrastive studies of intonation are very valuable. Usually, for instance, division into phrases (tonality) works in broadly similar ways across languages. Some uses of pitch are also similar across languages; for example, very high pitch may signal excitement or urgency. 4. Although most people think that intonation is mainly about pitch (the tone system), accent placement (tonicity) is probably the single most important aspect of English intonation. This is because it is connected with information focus, and the effects on interpretation are very clear-cut. Example: They asked for coffee, so I made them coffee. (The second occurrence of "coffee" must not be accented.) 5. Ear-training is the beginning of intonation training in the VeL approach. First, students learn to identify fall vs rise vs fall-rise. To begin with, single words are used, then phrases and sentences. When learning tones, the first words used should have unstressed syllables after the stressed syllable (Saturday) to make the pitch movement clearer. 6. In production drills, the first thing is to establish simple neutral patterns. There should be no drama or really special meanings. Simple drills can be used to teach important patterns. Example: A: Peter likes football. B: Yes, JOHN likes football TOO. A: Mary rides a bike. B: Yes, JENny rides a bike TOO. 7. The teacher must be systematic and let learners KNOW what they are learning. It is no good using new patterns and hoping that students will "pick them up" without noticing. 8. Visual feedback of fundamental frequency with a computer display can help students learn correct patterns. The teacher can use the display to demonstrate patterns, or students can practise by themselves, imitating recorded models.


Considerations for Helping Korean Students Write Better Technical Papers in English (한국 대학생들의 영어 기술 논문 작성 능력 향상을 위한 고찰)

  • Kim, Yee-Jin;Pak, Bo-Young;Lee, Chang-Ha;Kim, Moon-Kyum
    • Journal of Engineering Education Research / v.10 no.3 / pp.64-78 / 2007
  • For Korean researchers, English is essential. In fact, this is the case for any researcher who is a non-native English speaker, since recognition and success are predicated on being published, and the publications that reach the broadest audiences are in English. Unfortunately, university science and engineering programs in Korea often do not provide formal coursework to help students attain greater competence in English composition. Aggravating this situation is the general lack of literature covering this specific pedagogical issue. While there is plenty of information to help native speakers with technical writing, and much covering general English composition for EFL learners, there is very little information available to help EFL learners become better technical writers. Thus, the purpose of this report is twofold. First, as most Korean educators in science and engineering are not well acquainted with pedagogical issues of EFL writing, this report provides a general introduction to some relevant issues. It reviews the importance of contrastive rhetoric as well as some considerations for choosing the appropriate teaching approach, class arrangement, and use of computer-assisted learning tools. Second, a course proposal is discussed. Based on a review of student writing samples as well as student responses to a self-assessment questionnaire, the proposed course is intended to balance the needs of Korean EFL learners to develop the grammar, process, and genre skills involved in technical writing. Although the scope of this report is very modest, by sharing the considerations made in developing an EFL technical writing course, it seeks to provide a small example to a field that is perhaps lacking examples.