• Title/Abstract/Keywords: Cross Language

453 search results; processing time 0.025 s

The Bi-Cross Pretraining Method to Enhance Language Representation

  • 김성주;김선훈;박진성;유강민;강인호
    • 한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리) / 한국정보과학회언어공학연구회, 33rd 한글 및 한국어 정보처리 학술대회 (2021) / pp.320-325 / 2021
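  • BERT achieved strong performance on many downstream natural language tasks by learning next sentence prediction and masked word prediction during pretraining. This study focuses on next sentence prediction among BERT's pretraining tasks. Next sentence prediction has been used to improve performance on tasks that model the relationship between two arbitrary sentences, such as natural language inference and question answering. However, BERT's next sentence prediction learns only the cross-encoding scheme, in which two sentences are separated by a special token and fed to the model as a single string, so it does not account for downstream tasks that use bi-encoding, in which each sentence is encoded separately. This paper introduces the Bi-Cross pretraining method, which extends BERT's next sentence prediction by additionally pretraining on a bi-encoding version of the task, improving performance on single-sentence classification and on tasks that use sentence embeddings. On NSMC, a movie review sentiment classification dataset, in a setting that uses only 0.1% of the training data, Bi-Cross pretraining improved performance by about 5 points over the model without it. In the KorSTS bi-encoding sentence embedding evaluation, it improved performance by 1.5 points over the model without it.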
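The two input schemes contrasted in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: whitespace tokenization and the function names `cross_encoder_input` and `bi_encoder_inputs` are assumptions made for the example; only the `[CLS]` and `[SEP]` special tokens follow BERT's convention.

```python
from typing import List, Tuple

# BERT-style special tokens.
CLS, SEP = "[CLS]", "[SEP]"


def cross_encoder_input(sent_a: str, sent_b: str) -> List[str]:
    """Cross-encoding: both sentences enter the model as ONE sequence,
    separated by a special token."""
    # Whitespace splitting is a stand-in for a real subword tokenizer.
    return [CLS] + sent_a.split() + [SEP] + sent_b.split() + [SEP]


def bi_encoder_inputs(sent_a: str, sent_b: str) -> Tuple[List[str], List[str]]:
    """Bi-encoding: each sentence is its own sequence and is encoded
    independently; the two encodings are compared afterwards."""
    return (
        [CLS] + sent_a.split() + [SEP],
        [CLS] + sent_b.split() + [SEP],
    )


if __name__ == "__main__":
    a, b = "the movie was great", "i would watch it again"
    print(cross_encoder_input(a, b))
    print(bi_encoder_inputs(a, b))
```

In the cross-encoding case the model can attend across both sentences at once; in the bi-encoding case each sentence yields a standalone embedding, which is the setting that sentence-embedding tasks such as KorSTS evaluate.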


A Method of Chinese and Thai Cross-Lingual Query Expansion Based on Comparable Corpus

  • Tang, Peili;Zhao, Jing;Yu, Zhengtao;Wang, Zhuo;Xian, Yantuan
    • Journal of Information Processing Systems / Vol. 13, No. 4 / pp.805-817 / 2017
  • Cross-lingual query expansion is usually based on the relationships among monolingual words. A bilingual comparable corpus, however, contains relationships among bilingual words, so this paper proposes a query expansion method based on those relationships. First, word vectors that characterize the bilingual words are trained on a Chinese-Thai bilingual comparable corpus. Then, the correlation between Chinese query words and Thai words is computed from these word vectors, and Thai candidate expansion terms are selected according to the correlation value. Next, multiple groups of Thai query expansion sentences are built from the Thai candidate expansion words based on the Chinese query sentence. Finally, the optimal sentence is obtained using the Chinese-Thai query expansion method and is used to perform the Thai query expansion. Experimental results show that the proposed cross-lingual query expansion method effectively improves the accuracy of Chinese-Thai cross-language information retrieval.
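The candidate-selection step in the abstract above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the paper's method: cosine similarity stands in for the unspecified "correlation" measure, the toy two-dimensional vectors and the placeholder words `t1`, `t2`, `t3` are invented for the example, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expansion_candidates(
    query_vec: Sequence[float],
    target_vecs: Dict[str, Sequence[float]],
    top_k: int = 2,
) -> List[str]:
    """Rank target-language words by similarity to the query word's vector
    and return the top_k words as candidate expansion terms."""
    ranked = sorted(
        target_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [word for word, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Hypothetical vectors in a shared bilingual embedding space.
    thai_vectors = {"t1": [1.0, 0.0], "t2": [0.0, 1.0], "t3": [0.9, 0.1]}
    chinese_query_vector = [1.0, 0.0]
    print(expansion_candidates(chinese_query_vector, thai_vectors))
```

The sketch assumes both languages already share one vector space, which is what training on a bilingual comparable corpus is meant to provide.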

The Dilemma of Language in Education Policies in Ghana and Tanzania

  • Dzahene-Quarshie, Josephine;Moshi, Lioba
    • 비교문화연구 / Vol. 36 / pp.149-173 / 2014
  • This paper examines language policies of Ghana and Tanzania (former British Colonies) since independence. The view that language use in education is a problem for African countries is evident in the ever changing language in education policies in many African countries. Because of the inevitable multilingual situation in many African countries, there are unavoidable challenges in their quest to adopt a language policy that works for the entire country since it is not practical to adopt all the languages spoken in the country as Media of Instruction. Ghana is not immune to this challenge and has fallen victim to this tendency to change the language in education policy from time to time in an attempt to adopt a satisfactory policy which would yield the intended results. Tanzania, however, is one of the few African countries that have found a sustainable language in education policy since independence. Nonetheless, it has its fair share of challenges as a consequence of the perceived competition between Kiswahili and English as official languages. The paper discusses the challenges that both Ghana and Tanzania face against the background of colonization. The paper also offers a discussion on possible future perspectives for the two countries.

Cross-Texting Prevention System using Korean Chat Corpus

  • 이다영;조환규
    • 한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리) / 한국정보과학회언어공학연구회, 32nd 한글 및 한국어 정보처리 학술대회 (2020) / pp.377-382 / 2020
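  • Cross-texting is the act of mistakenly sending a message to an unintended recipient. As mobile messenger use has grown, such mistakes occur frequently, but the features that messengers provide are mostly after-the-fact remedies, and it is hard for users to catch the mistake beforehand. This paper proposes a model that analyzes the formal features of a sentence the user has written and determines whether that sentence is cross-texting in the conversation the user is currently in. The model extracts honorific-usage and surface-completeness features from sentences, uses them to model a particular user's conversation, and judges whether a given sentence fits that conversation. This approach can provide a hint that makes it easy to judge whether a sentence is cross-texting using only the chat room's previous history. On data built from a real messenger conversation corpus, the model detected cross-texting with 94% accuracy.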


The Importance of Learning Language and Culture Integration: Focused on TOEIC Reading Comprehension

  • Shin, Myeong-Hee;Lee, Eunpyo
    • 영어어문교육 / Vol. 18, No. 3 / pp.207-221 / 2012
  • This study examines the importance of integrating language and culture learning in a general English class focused on TOEIC reading comprehension. The understanding of cultural learning and learners' cultural awareness has long been a subject of debate. This study aimed not only to analyze the improvement in students' interest and TOEIC reading comprehension ability through cultural learning, but also to help students who learn American culture overcome cross-cultural miscommunication and improve their English reading comprehension skills. Pre- and post-surveys and pre- and post-TOEIC tests were used to measure the language proficiency and American cultural knowledge of two groups: an experimental group and a control group. The results were as follows. First, students had better TOEIC scores and improved motivation after understanding American culture relevant to the lessons. Second, reading comprehension skills with regard to TOEIC also improved compared with those of students who were not exposed to American culture due to lack of opportunity.


When 5004 is Said "Five Thousand Zero Hundred Remainder Four": The Influence of Language on Natural Number Transcoding: Cross-National Comparison

  • Nguyen, Hien Thi-Thu;Gregoire, Jacques
    • 한국수학교육학회지시리즈D:수학교육연구 / Vol. 18, No. 2 / pp.149-170 / 2014
  • The Vietnamese language has a specific property related to the zero in the name-number system. This study was conducted to examine the impact of linguistic differences and of the zero's position in a number on a transcoding task (verbal number into Arabic number). Vietnamese children and French-speaking Belgian children, from grades 3 to 6, participated in the study. The success rate and the type of errors they made varied, depending on their grade and language. At Grade 4, Vietnamese children showed performances equivalent to Grade 6 Belgian children. Our results confirmed the support provided by language to the understanding and performances in a transcoding task. Results also showed that a syntactic zero is easier to manipulate than a lexical zero for Vietnamese children. The relative influence of language and the source of errors are discussed.

Cross-linguistic Study of Perceptual Cues to F0 Variations

  • 윤은경;자오원카이
    • 한국어교육 / Vol. 28, No. 3 / pp.25-51 / 2017
  • This study aimed to identify the differences in pitch perception between tonal and non-tonal language listeners. A total of 60 Korean and Chinese listeners participated in the perception test. A two-syllable nonsense word /paba/ was manipulated in five steps. The pitch height or contour on the second syllable was raised or lowered. Both groups were asked to select which of the two syllables had the higher pitch. The findings showed that the majority of Korean listeners (GK) perceived decreased pitch as each peak of the syllable was lowered and perceived increased pitch as it was raised, which means that pitch height is a primary perceptual cue for GK. Chinese listeners (GC), however, were sensitive to pitch movements as the pitch contour changed. GC's perception may be affected by tone sandhi in their L1. It is reasonable to assume that language experience has a significant effect on the cross-linguistic perceptual differences between tonal and non-tonal language listeners.

Analysis of Russian Culture Education According to the Curriculum Changes

  • 어건주
    • 비교문화연구 / Vol. 29 / pp.479-501 / 2012
  • In this paper, I analyze the Russian cultural content of Russian textbooks across curriculum changes. The aim of this study is to analyze how Russian textbooks present Russian culture. In Korea, Russian language education begins in high school as a second foreign language, and Russian education in high school depends entirely on the textbook. Under these circumstances, Russian textbooks play a very important role in Russian language learning. For practical and efficient language learning, the acquisition of cultural knowledge is very important, because cultural content can motivate learning. However, the contents of the textbooks are not satisfactory enough to teach Russian culture. More effective textbooks must be developed to advance students' linguistic ability.

The Present State and Outcomes of Korean Language Education for Multicultural Family Members

  • 김선정
    • 비교문화연구 / Vol. 29 / pp.367-389 / 2012
  • The aim of this paper is to briefly review the present state of Korean language education for multicultural family members and to consider the outcomes produced so far. Married immigrant women and their children are one of the most significant groups for Korean language education, given their large number and their roles and meaning in Korean society. To improve the Korean communicative ability of multicultural family members, an effective operating system for Korean language education is needed, and lively, efficient Korean language instruction must be given by capable Korean language teachers with adequate teaching materials. Customized Korean language education must also be offered, based on research into the characteristics of multicultural family members as "Korean language learners". Korean language education for married immigrant women has largely been established in terms of teaching materials and the teacher training system. Therefore, an efficient operating system must be constructed so that the developed teaching materials can be actively used at Korean language education sites. Periodic retraining of Korean language teachers for multicultural family members is also necessary to improve the efficiency of Korean language teaching. However, Korean language education for multicultural children is still in its infancy because of its late start. With the support of the Korean government, the curriculum of Korean language education has recently been fixed, KSL textbooks are being developed, and a diagnostic tool for evaluating the children's Korean language ability is in progress. Continued attention and support are still needed to improve their Korean language ability and to foster them as competitive, capable Korean speakers.

Cross-speaker anaphora in dynamic semantics

  • Yeom, Jae-Il
    • 한국언어정보학회지:언어와정보 / Vol. 14, No. 2 / pp.103-129 / 2010
  • In this paper, I show that anaphora across speakers shows both dynamic and static sides. To capture them all formally, I will adopt semantics based on the assumption that variables range over individual concepts that connect epistemic alternatives. As information increases, a variable can take a different range of possible individual concepts. This is captured by the notion of virtual individual (= vi), a set of individual concepts which are indistinguishable in an information state. The use of a pronoun involves two information states, one for the antecedent, which is always part of the common ground, and the other for the pronoun. Information increase changes vis for variables in the common ground. A pronoun can be used felicitously if there is a unique virtual individual in the information state for the antecedent which does not split in two or more distinctive virtual individuals in the information state for the pronoun. The felicity condition for cross-speaker anaphora can be satisfied in declaratives involving modality, interrogatives and imperatives in a rather less demanding way, because in these cases the utterance does not necessarily require non-trivial personal information for proper use of a pronoun.
