• Title/Abstract/Keywords: online corpora

7 search results (processing time: 0.021 s)

A Study on the Method of Teaching Korean Synonyms Using Online Corpora

  • 전지은
    • 언어사실과 관점 / Vol. 47 / pp. 177-203 / 2019
  • This study explores the possibility of using online corpora to teach Korean synonyms, and shows how to develop effective concordance-based learning materials for this purpose using data-driven learning (DDL). Because synonyms are similar in meaning and usage, even native speakers cannot clearly explain the differences between them. Furthermore, it is not easy to provide suitable example sentences for each word, and Korean textbooks do not sufficiently differentiate synonyms. In recent years, it has been claimed that DDL helps students produce vocabulary as well as comprehend it. Nevertheless, there is little guidance on how concordance materials should be prepared for learners. In this study, we extract concordance examples from several kinds of online corpora: written and spoken corpora, Korean textbooks, and newspapers. We then show how to design corpus-based activities using concordance materials for teaching Korean synonyms. To examine the effects of DDL, five experimental lessons were given to a group of 15 advanced Korean learners at a university, followed by attitude questionnaires. The study is meaningful in that it proposes a new teaching method for Korean synonym education.

Using Small Corpora of Critiques to Set Pedagogical Goals in First Year ESP Business English

  • Wang, Yu-Chi;Davis, Richard Hill
    • 아시아태평양코퍼스연구 / Vol. 2, No. 2 / pp. 17-29 / 2021
  • The current study explores small corpora of critiques written by Chinese and non-Chinese university students and compares the strategies these writers use with those of highly rated L1 students. The data comprise three small corpora of student writing: 20 student critiques from 2017, 23 student critiques from 2018, and 23 critiques from the online MICUSP collection at the University of Michigan. The researchers employ Text Inspector and Lexical Complexity to identify university students' vocabulary knowledge and awareness of syntactic complexity. In addition, WMatrix4® is used to identify and support the comparison of lexical and semantic differences among the three corpora. The findings indicate that gaps exist between Chinese and non-Chinese writers in the same university classes in their knowledge of grammatical features and interactional metadiscourse. Chinese writers are also more likely to produce shorter clauses and sentences, and the mean value of complex nominals and coordinate phrases is smaller for Chinese students than for non-Chinese and MICUSP writers. Finally, in terms of lexical bundles, Chinese student writers prefer clausal bundles to phrasal bundles; according to previous studies, phrasal bundles are more often found in texts by skilled writers. The findings suggest incorporating implicit and explicit instruction through the use of corpora in language classrooms to advance the skills and strategies of all writers, but particularly of Chinese writers of English.

English No Matter Construction: A Construction-based Perspective

  • Kim, Jong-Bok;Lee, Seung Han
    • 영어영문학 / Vol. 57, No. 6 / pp. 959-976 / 2011
  • The expression no matter, combining with an interrogative clause X, expresses 'it doesn't matter what the value of X is' and displays many syntactic and semantic peculiarities. To better understand the grammatical properties of this construction, we investigate English corpora available online and suggest that some of its irreducible properties are best captured by the inheritance mechanism that plays a central role in HPSG and Construction Grammar. We show that the construction has its own constructional properties but also inherits properties from related major head constructions.

Analyzing Contextual Polarity of Unstructured Data for Measuring Subjective Well-Being

  • 최석재;송영은;권오병
    • 지능정보연구 / Vol. 22, No. 1 / pp. 83-105 / 2016
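  • The key to implementing subjective well-being services for mental health promotion, a promising area of healthcare IT services, is measuring an individual's subjective well-being state accurately, unobtrusively, and cost-effectively. The commonly used approaches, questionnaire-based self-reports and body-worn sensor measurements, are highly accurate but weak in cost-effectiveness and unobtrusiveness. Online text-based measurement methods, intended to improve on both, rely only on a pre-built emotion vocabulary and therefore fail to account for so-called contextual polarity, in which a word may count as an emotion word depending on the situation; as a result, their measurement accuracy is low. Meanwhile, for existing sentiment analysis that exploits contextual polarity, no emotion lexicon or ontology has been built that supports sentiment analysis in the context of subjective well-being, and building an ontology is itself very labor-intensive. The purpose of this study is therefore to propose a method that extracts contextual emotion words related to subjective well-being from unstructured online text in which users express their opinions, and uses them to improve the accuracy of identifying contextual polarity. The basic procedure is as follows. First, a general emotion lexicon is prepared; this study uses SentiWordNet, the most representative digital emotion lexicon. Second, corpora, the unstructured data needed to dynamically estimate a mental health index, were obtained through an online survey. Third, three kinds of resources were obtained from the corpora. Fourth, an inference model was built with these resources as input variables and the index value of a particular mental health state as the dependent variable, and inference rules were extracted. Finally, mental health states were inferred using the inference rules. The study is original in that, unlike previous work, it applies contextual emotion words to sentiment analysis and can therefore identify diverse emotion vocabulary according to the specific domain.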

Copy Raising Construction in English: A Usage-based Perspective

  • Kim, Jong-Bok
    • 한국언어정보학회지:언어와정보 / Vol. 16, No. 2 / pp. 1-15 / 2012
  • In accounting for so-called copy raising (CR) in English, the movement perspective has assumed that the embedded subject of the CR verb's sentential complement is raised to the matrix subject position, leaving behind a pronominal copy. This kind of movement-based analysis raises both empirical and analytical issues once variations in the pronominal copy constraint are considered. This paper investigates actual uses of the construction in corpora available online. Based on this corpus search, we distinguish two types of copy raising predicates (genuine and perception) and discuss their grammatical properties in detail. We suggest that a simple copying rule based on movement operations is not enough to capture the wide variation in the uses of the construction, and show that interpretive constraints, e.g., the perceptual characterization condition, play an important role in licensing the construction.

Multilayer Knowledge Representation of Customer's Opinion in Reviews

  • ;원광복;옥철영
    • 한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리) / 한국정보과학회언어공학연구회, 30th Conference on Hangul and Korean Information Processing (2018) / pp. 652-657 / 2018
  • With the rapid development of e-commerce, many customers can now express their opinions on various kinds of products in discussion groups, on merchant sites, on social networks, and elsewhere. Discerning a consensus opinion about a product sold online is difficult because more and more reviews are becoming available on the internet. Opinion mining, also known as sentiment analysis, is the task of automatically detecting and understanding sentiment expressions about a product in customers' textual reviews. Recently, researchers have proposed various approaches to sentiment mining, applying several techniques at the document, sentence, and aspect levels. Aspect-based sentiment analysis is attracting wide interest from researchers; however, more complex algorithms are needed to address this task precisely on larger corpora. This paper introduces a knowledge representation approach for the task of analyzing product aspect ratings. We focus on how to form sentiment representations from textual opinions using representation learning methods, including word embeddings and compositional vector models. Our experiment is performed on a dataset of reviews from the electronics domain, and the results show that the proposed system outperformed methods from previous studies.

Phrase-Chunk Level Hierarchical Attention Networks for Arabic Sentiment Analysis

  • Abdelmawgoud M. Meabed;Sherif Mahdy Abdou;Mervat Hassan Gheith
    • International Journal of Computer Science & Network Security / Vol. 23, No. 9 / pp. 120-128 / 2023
  • In this work, we present ATSA, a hierarchical attention deep learning model for Arabic sentiment analysis. ATSA addresses several challenges and limitations that arise when applying classical models to opinion mining in Arabic. Arabic-specific challenges, including morphological complexity and language sparsity, are addressed by modeling semantic composition at the level of Arabic morphological analysis after tokenization. ATSA performs phrase-chunk sentiment embedding to provide a broader set of features that cover syntactic, semantic, and sentiment information. We use a phrase structure parser to generate syntactic parse trees that serve as a reference for ATSA. This allows semantic and sentiment composition to be modeled following the natural order in which words and phrase-chunks are combined in a sentence. The proposed model was evaluated on three Arabic corpora that correspond to different genres (newswire, online comments, and tweets) and different writing styles (MSA and dialectal Arabic). Experiments showed that each of the proposed contributions in ATSA achieved significant improvement. The combination of all contributions, which makes up the complete ATSA model, improved classification accuracy by 3% and 2% on the Tweets and Hotel reviews datasets, respectively, compared to existing models.