• Title/Abstract/Keywords: 어휘평가 (vocabulary evaluation)


Automatic Error Correction System for Erroneous SMS Strings (SMS 변형된 문자열의 자동 오류 교정 시스템)

  • Kang, Seung-Shik;Chang, Du-Seong
    • Journal of KIISE:Software and Applications / v.35 no.6 / pp.386-391 / 2008
  • Spoken-style word errors that violate grammatical or writing rules occur frequently in communication environments such as mobile phones and messengers. These unexpected errors cause problems for language processing systems in many applications, such as speech recognition and text-to-speech translation. In this paper, we propose and implement an automatic correction system for ill-formed words and word spacing errors in SMS sentences, which have been a major cause of poor accuracy. We experimented with three methods of constructing the word correction dictionary and evaluated their results: (1) manual construction of error words from a vocabulary list of ill-formed communication language, (2) automatic construction of an error dictionary from a manually constructed corpus, and (3) a context-dependent method of automatic error dictionary construction.

A Study on the Emotion-Responsive Interior Design centered on a Color Coordinate Digital Wall (감성반응형 실내디자인에 관한 연구 - 감성어휘별 색채배색에 의한 디지털 벽면을 중심으로 -)

  • 김주연;이현수
    • Science of Emotion and Sensibility / v.6 no.2 / pp.1-7 / 2003
  • The objective of this study is to develop an adaptable digital wall model whose color changes dynamically according to the identified emotional state of a user. This study addresses how to capture a specific user's emotion through the web and use it to modify a VR model, mainly for color adaptation. The adaptation process consists of three phases: (1) identification of the user's emotional state, projected onto a list of emotional keywords; (2) translation of the captured emotional keywords into a pertinent set of color coordinations; and (3) automated color adaptation of the given model. This process derives an online viewer's emotional state, which can be used to find a new color scheme reflecting the identified emotion.


Comparison of Readability between Documents in the Community Question-Answering (질의응답 커뮤니티에서 문서 간 이독성 비교)

  • Mun, Gil-Seong
    • The Journal of the Korea Contents Association / v.20 no.10 / pp.25-34 / 2020
  • Community question answering services are one of the main sources of information and knowledge on the Web. The quality of information in question and answer documents is determined by the clarity of the question and the relevance of the answers, and the readability of a document is a key factor in evaluating that quality. This study measures the quality of documents used in a community question answering service. For this purpose, we compare the frequency of occurrence by vocabulary level in community documents and measure the readability index of documents by the author's institution. To measure the readability index, we used the Dale-Chall formula, which is calculated from vocabulary level and sentence length. The results show that the vocabulary used in the answers is more difficult than that in the questions and that the sentences are longer. A gap in readability between questions and answers is also found across writing institutions. The results of this study can be used as basic data for improving online counseling services.
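
The abstract names the Dale-Chall formula but does not give its terms. As a rough illustration only, the sketch below computes the New Dale-Chall score with the constants published for English text; the study's adaptation to Korean documents is not described in the abstract, and the function name `dale_chall_score` and the `familiar_words` set are assumptions made for this example.

```python
import re


def dale_chall_score(text, familiar_words):
    """New Dale-Chall readability score (English constants, illustrative only).

    familiar_words is a stand-in for the Dale-Chall list of familiar words;
    any word outside it counts as "difficult".
    """
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    difficult = sum(1 for w in words if w not in familiar_words)
    pct_difficult = 100.0 * difficult / len(words)   # vocabulary level
    avg_sentence_len = len(words) / len(sentences)   # sentence length

    score = 0.1579 * pct_difficult + 0.0496 * avg_sentence_len
    if pct_difficult > 5.0:
        # Adjustment applied when more than 5% of the words are difficult.
        score += 3.6365
    return score


if __name__ == "__main__":
    known = {"the", "cat", "sat", "dog", "ran"}
    print(dale_chall_score("The cat sat. The dog ran.", known))
```

A higher score means harder text, so the finding that answers are harder than questions corresponds to answers receiving higher scores on both the vocabulary term and the sentence-length term.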

A Sentiment Classification Method Using Context Information in Product Review Summarization (상품 리뷰 요약에서의 문맥 정보를 이용한 의견 분류 방법)

  • Yang, Jung-Yeon;Myung, Jae-Seok;Lee, Sang-Goo
    • Journal of KIISE:Databases / v.36 no.4 / pp.254-262 / 2009
  • As e-business activities grow, customers encounter products through online shopping sites, and many customers refer to product reviews before purchasing online. However, as the volume of product reviews grows, it takes a great deal of time and effort for customers to read and evaluate them. Lately, attention has been paid to Opinion Mining (OM) as an effective solution to this problem. In this paper, we propose an efficient method for sentiment classification of product reviews using product-specific context information of the words that occur in the reviews. We define the context information of words, propose its application to sentiment classification, and show the performance of our method through experiments. Additionally, for word corpus construction, we propose a method that builds the word corpus automatically from review texts and review scores, avoiding the traditional manual process. As a result, we can easily obtain accurate sentiment polarities of opinion words in product reviews.
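
The abstract does not say how polarities are derived from review texts and scores. One simple possibility, shown here purely as an assumption and not as the authors' method, is to score each word by how far the average rating of the reviews containing it lies from the overall average rating. The function name `build_polarity_lexicon`, the `min_count` threshold, and the sample reviews are all hypothetical.

```python
from collections import defaultdict


def build_polarity_lexicon(reviews, min_count=2):
    """Build {word: polarity} from (text, score) pairs.

    Polarity is the mean score of reviews containing the word minus the
    overall mean score: positive values suggest positive opinion words.
    This is an illustrative heuristic, not the paper's algorithm.
    """
    mean_score = sum(score for _, score in reviews) / len(reviews)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, score in reviews:
        # Count each word once per review so long reviews do not dominate.
        for word in set(text.lower().split()):
            totals[word] += score
            counts[word] += 1
    return {
        word: totals[word] / counts[word] - mean_score
        for word in counts
        if counts[word] >= min_count  # skip words seen too rarely
    }


if __name__ == "__main__":
    sample = [
        ("great battery", 5),
        ("great screen", 4),
        ("poor battery", 1),
        ("poor screen", 2),
    ]
    for word, polarity in sorted(build_polarity_lexicon(sample).items()):
        print(word, polarity)
```

In this toy data the feature words ("battery", "screen") come out neutral while "great" and "poor" receive opposite signs. The product-specific context information the abstract describes is not modeled in this sketch.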

Analysis of Gaze Related to Cooperation, Competition and Focus Levels (협력, 경쟁, 집중 수준에 따른 시선 분석)

  • Cho, Ji Eun;Lee, Dong Won;Park, MinJi;Whang, Min-Cheol
    • The Journal of the Korea Contents Association / v.17 no.9 / pp.281-291 / 2017
  • Emotional interaction in virtual reality is necessary for social communication. However, few attempts have been made to recognize social emotion quantitatively. This study aimed to identify the gaze patterns associated with social emotions in the business domain. A total of 417 emotion words were collected, and 16 were selected based on goodness of fit. The emotion words were mapped into a two-dimensional space through multidimensional scaling analysis. Then, through a focus group discussion (FGD), the X axis was defined as the cooperation-competition dimension and the Y axis as the low focus-high focus dimension. Emotional stimuli were presented to 52 subjects, and gaze movement data were collected. Independent t-test results showed that the gaze factor increased in the face, eye, and nose areas under cooperation, and in the right face and nose areas under low focus. These results are expected to serve as basic research for evaluating the emotions needed in business environments in virtual space.

Empirical Study for Automatic Evaluation of Abstractive Summarization by Error-Types (오류 유형에 따른 생성요약 모델의 본문-요약문 간 요약 성능평가 비교)

  • Seungsoo Lee;Sangwoo Kang
    • Korean Journal of Cognitive Science / v.34 no.3 / pp.197-226 / 2023
  • Generative text summarization is a natural language processing task that generates a short summary while preserving the content of a long text. ROUGE is a widely used metric based on lexical overlap for evaluating text summarization models on generative summarization benchmarks. Although models achieve very high ROUGE scores, studies report that 30% of generated summaries are still inconsistent with the source text. This paper proposes a methodology for evaluating the performance of a summarization model without using the reference summary. AggreFACT is a human-annotated dataset that classifies the types of errors made by neural text summarization models. Among all the test candidates, two cases showed the highest correlation: the generated summary, and the case in which errors occurred throughout the summary. We observed that the proposed evaluation score correlated highly for models fine-tuned from BART and PEGASUS, which are pretrained with large-scale Transformer architectures.
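
To make the limitation of lexical overlap concrete, the sketch below computes ROUGE-1 (unigram overlap) precision, recall, and F1 with whitespace tokenization. It is a simplified illustration, not the official ROUGE toolkit, which also supports stemming and other variants; the function name `rouge_1` and the example sentences are assumptions for this example.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Unigram-overlap ROUGE-1 scores using simple whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each token counts at most as often as in the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    reference = "the company reported a profit"
    candidate = "the company reported a loss"
    print(rouge_1(candidate, reference))
```

The candidate here contradicts the reference on the one word that matters, yet four of its five unigrams match. This is the kind of inconsistency that an overlap-based score can miss.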

A View on the Diversity of the Word and Mathematical Notation Expression Used in High School Mathematics Textbooks (고등학교 수학 교과서에서 사용되는 어휘(語彙)와 수학 기호 표현의 다양성에 대한 소고(小考))

  • Yang, Seong Hyun
    • Journal of the Korean School Mathematics Society / v.20 no.3 / pp.211-237 / 2017
  • Depending on the type of textbook, the words and mathematical notation used in high school mathematics textbooks varied, and there were also some differences in mathematical definitions and content description methods. Not only the composition of textbooks but also their various ways of expression have significant impacts on the teaching and learning of teachers and students. This diversity of expression has pros and cons, like two sides of a coin. On the positive side, it allows us to pursue pedagogical diversity. On the negative side, it may act as a learning burden from the student's viewpoint, and the fairness of evaluation may be undermined. In this study, we focused primarily on analyzing the actual situation rather than judging which expressions are more appropriate among the diverse words and notations used in mathematics textbooks based on the current curriculum. For this purpose, we analyzed 56 mathematics textbooks based on the 2009 revised mathematics curriculum and presented, with examples, four aspects (term expression, notation expression, mathematical definition, and content description method) of the differences among the various expressions used in textbooks, including 'terms and notations'.


Representative Labels Selection Technique for Document Cluster using WordNet (문서 클러스터를 위한 워드넷기반의 대표 레이블 선정 방법)

  • Kim, Tae-Hoon;Sohn, Mye
    • Journal of Internet Computing and Services / v.18 no.2 / pp.61-73 / 2017
  • In this paper, we propose a document cluster labeling method that uses the information content of words in clusters to understand what the clusters imply. To do so, we calculate the weight and frequency of the words. These two measures are used to determine the weight among the words in the cluster. As a next step, we identify candidate labels using WordNet. At this stage, the candidate labels are matched to the least common hypernym of the words in the cluster. Finally, the representative labels are determined with respect to the information content and the weight of the words. To demonstrate the superiority of our method, we perform a heuristic experiment using two measures: the suitability of the candidate label ($Suitability_{cl}$) and the appropriacy of the representative label ($Appropriacy_{rl}$). With the proposed method, the suitability of the candidate label decreases slightly compared with existing methods, but the computational cost is about 20% of that of conventional methods. We also confirmed that the appropriacy of the representative label is better than with existing methods. As a result, the method is expected to help data analysts interpret document clusters more easily.
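
The candidate-label step can be pictured as finding the most specific concept that subsumes every word in a cluster. The sketch below does this over a small hand-written taxonomy; the `HYPERNYM` table and both function names are hypothetical, and a real system would query WordNet itself rather than a toy dictionary. The weighting by information content described in the abstract is not modeled here.

```python
# Hypothetical toy taxonomy (child -> parent); it stands in for WordNet.
HYPERNYM = {
    "dog": "canine", "wolf": "canine", "canine": "mammal",
    "cat": "feline", "feline": "mammal", "mammal": "animal",
    "sparrow": "bird", "bird": "animal", "animal": "entity",
}


def ancestors(word):
    """Return the path from word up to the root, starting with word itself."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def lowest_common_hypernym(words):
    """Return the most specific node that is an ancestor of every word."""
    paths = [ancestors(w) for w in words]
    shared = set(paths[0]).intersection(*paths[1:])
    # paths[0] is ordered from specific to general, so the first shared
    # node found is the lowest common hypernym.
    for node in paths[0]:
        if node in shared:
            return node
    return None


if __name__ == "__main__":
    print(lowest_common_hypernym(["dog", "wolf"]))
    print(lowest_common_hypernym(["dog", "cat", "sparrow"]))
```

The more varied the words in a cluster, the more general the resulting label becomes, which is one reason a labeling method would also weigh how informative each candidate is.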

Probabilistic Parsing of Korean Sentences Based on Lexical Co-occurrence and Syntactic Rules (중심어간의 공기 정보와 구문 규칙을 기반으로 한 확률적 한국어 구문 분석)

  • Lee, Kong-Joo;Kim, Jae-Hoon;Kim, Gil-Chang
    • Annual Conference on Human and Language Technology / 1997.10a / pp.332-338 / 1997
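  • Lexical information can serve as an important source of information for resolving ambiguity in syntactic structure. In this paper, we propose a probabilistic model that resolves the structural ambiguity of an input sentence using not only probabilistic syntactic rules but also co-occurrence information between words. When evaluated on experimental data, the proposed model achieved a parsing accuracy of about 84%.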
