• Title/Summary/Keyword: Performance Criterion

Search Result 1,187, Processing Time 0.023 seconds

A Study on the Effect of Using Sentiment Lexicon in Opinion Classification (오피니언 분류의 감성사전 활용효과에 대한 연구)

  • Kim, Seungwoo;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.133-148
    • /
    • 2014
  • Recently, with the advent of various information channels, the number of has continued to grow. The main cause of this phenomenon can be found in the significant increase of unstructured data, as the use of smart devices enables users to create data in the form of text, audio, images, and video. In various types of unstructured data, the user's opinion and a variety of information is clearly expressed in text data such as news, reports, papers, and various articles. Thus, active attempts have been made to create new value by analyzing these texts. The representative techniques used in text analysis are text mining and opinion mining. These share certain important characteristics; for example, they not only use text documents as input data, but also use many natural language processing techniques such as filtering and parsing. Therefore, opinion mining is usually recognized as a sub-concept of text mining, or, in many cases, the two terms are used interchangeably in the literature. Suppose that the purpose of a certain classification analysis is to predict a positive or negative opinion contained in some documents. If we focus on the classification process, the analysis can be regarded as a traditional text mining case. However, if we observe that the target of the analysis is a positive or negative opinion, the analysis can be regarded as a typical example of opinion mining. In other words, two methods (i.e., text mining and opinion mining) are available for opinion classification. Thus, in order to distinguish between the two, a precise definition of each method is needed. In this paper, we found that it is very difficult to distinguish between the two methods clearly with respect to the purpose of analysis and the type of results. We conclude that the most definitive criterion to distinguish text mining from opinion mining is whether an analysis utilizes any kind of sentiment lexicon. We first established two prediction models, one based on opinion mining and the other on text mining. Next, we compared the main processes used by the two prediction models. Finally, we compared their prediction accuracy. We then analyzed 2,000 movie reviews. The results revealed that the prediction model based on opinion mining showed higher average prediction accuracy compared to the text mining model. Moreover, in the lift chart generated by the opinion mining based model, the prediction accuracy for the documents with strong certainty was higher than that for the documents with weak certainty. Most of all, opinion mining has a meaningful advantage in that it can reduce learning time dramatically, because a sentiment lexicon generated once can be reused in a similar application domain. Additionally, the classification results can be clearly explained by using a sentiment lexicon. This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small number of movie reviews. Additionally, various parameters in the parsing and filtering steps of the text mining may have affected the accuracy of the prediction models. However, this research contributes a performance and comparison of text mining analysis and opinion mining analysis for opinion classification. In future research, a more precise evaluation of the two methods should be made through intensive experiments.

Applying Meta-model Formalization of Part-Whole Relationship to UML: Experiment on Classification of Aggregation and Composition (UML의 부분-전체 관계에 대한 메타모델 형식화 이론의 적용: 집합연관 및 복합연관 판별 실험)

  • Kim, Taekyung
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.99-118
    • /
    • 2015
  • Object-oriented programming languages have been widely selected for developing modern information systems. The use of concepts relating to object-oriented (OO, in short) programming has reduced efforts of reusing pre-existing codes, and the OO concepts have been proved to be a useful in interpreting system requirements. In line with this, we have witnessed that a modern conceptual modeling approach supports features of object-oriented programming. Unified Modeling Language or UML becomes one of de-facto standards for information system designers since the language provides a set of visual diagrams, comprehensive frameworks and flexible expressions. In a modeling process, UML users need to consider relationships between classes. Based on an explicit and clear representation of classes, the conceptual model from UML garners necessarily attributes and methods for guiding software engineers. Especially, identifying an association between a class of part and a class of whole is included in the standard grammar of UML. The representation of part-whole relationship is natural in a real world domain since many physical objects are perceived as part-whole relationship. In addition, even abstract concepts such as roles are easily identified by part-whole perception. It seems that a representation of part-whole in UML is reasonable and useful. However, it should be admitted that the use of UML is limited due to the lack of practical guidelines on how to identify a part-whole relationship and how to classify it into an aggregate- or a composite-association. Research efforts on developing the procedure knowledge is meaningful and timely in that misleading perception to part-whole relationship is hard to be filtered out in an initial conceptual modeling thus resulting in deterioration of system usability. The current method on identifying and classifying part-whole relationships is mainly counting on linguistic expression. This simple approach is rooted in the idea that a phrase of representing has-a constructs a par-whole perception between objects. If the relationship is strong, the association is classified as a composite association of part-whole relationship. In other cases, the relationship is an aggregate association. Admittedly, linguistic expressions contain clues for part-whole relationships; therefore, the approach is reasonable and cost-effective in general. Nevertheless, it does not cover concerns on accuracy and theoretical legitimacy. Research efforts on developing guidelines for part-whole identification and classification has not been accumulated sufficient achievements to solve this issue. The purpose of this study is to provide step-by-step guidelines for identifying and classifying part-whole relationships in the context of UML use. Based on the theoretical work on Meta-model Formalization, self-check forms that help conceptual modelers work on part-whole classes are developed. To evaluate the performance of suggested idea, an experiment approach was adopted. The findings show that UML users obtain better results with the guidelines based on Meta-model Formalization compared to a natural language classification scheme conventionally recommended by UML theorists. This study contributed to the stream of research effort about part-whole relationships by extending applicability of Meta-model Formalization. Compared to traditional approaches that target to establish criterion for evaluating a result of conceptual modeling, this study expands the scope to a process of modeling. Traditional theories on evaluation of part-whole relationship in the context of conceptual modeling aim to rule out incomplete or wrong representations. It is posed that qualification is still important; but, the lack of consideration on providing a practical alternative may reduce appropriateness of posterior inspection for modelers who want to reduce errors or misperceptions about part-whole identification and classification. The findings of this study can be further developed by introducing more comprehensive variables and real-world settings. In addition, it is highly recommended to replicate and extend the suggested idea of utilizing Meta-model formalization by creating different alternative forms of guidelines including plugins for integrated development environments.

Development of Customer Sentiment Pattern Map for Webtoon Content Recommendation (웹툰 콘텐츠 추천을 위한 소비자 감성 패턴 맵 개발)

  • Lee, Junsik;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.67-88
    • /
    • 2019
  • Webtoon is a Korean-style digital comics platform that distributes comics content produced using the characteristic elements of the Internet in a form that can be consumed online. With the recent rapid growth of the webtoon industry and the exponential increase in the supply of webtoon content, the need for effective webtoon content recommendation measures is growing. Webtoons are digital content products that combine pictorial, literary and digital elements. Therefore, webtoons stimulate consumer sentiment by making readers have fun and engaging and empathizing with the situations in which webtoons are produced. In this context, it can be expected that the sentiment that webtoons evoke to consumers will serve as an important criterion for consumers' choice of webtoons. However, there is a lack of research to improve webtoons' recommendation performance by utilizing consumer sentiment. This study is aimed at developing consumer sentiment pattern maps that can support effective recommendations of webtoon content, focusing on consumer sentiments that have not been fully discussed previously. Metadata and consumer sentiments data were collected for 200 works serviced on the Korean webtoon platform 'Naver Webtoon' to conduct this study. 488 sentiment terms were collected for 127 works, excluding those that did not meet the purpose of the analysis. Next, similar or duplicate terms were combined or abstracted in accordance with the bottom-up approach. As a result, we have built webtoons specialized sentiment-index, which are reduced to a total of 63 emotive adjectives. By performing exploratory factor analysis on the constructed sentiment-index, we have derived three important dimensions for classifying webtoon types. The exploratory factor analysis was performed through the Principal Component Analysis (PCA) using varimax factor rotation. The three dimensions were named 'Immersion', 'Touch' and 'Irritant' respectively. Based on this, K-Means clustering was performed and the entire webtoons were classified into four types. Each type was named 'Snack', 'Drama', 'Irritant', and 'Romance'. For each type of webtoon, we wrote webtoon-sentiment 2-Mode network graphs and looked at the characteristics of the sentiment pattern appearing for each type. In addition, through profiling analysis, we were able to derive meaningful strategic implications for each type of webtoon. First, The 'Snack' cluster is a collection of webtoons that are fast-paced and highly entertaining. Many consumers are interested in these webtoons, but they don't rate them well. Also, consumers mostly use simple expressions of sentiment when talking about these webtoons. Webtoons belonging to 'Snack' are expected to appeal to modern people who want to consume content easily and quickly during short travel time, such as commuting time. Secondly, webtoons belonging to 'Drama' are expected to evoke realistic and everyday sentiments rather than exaggerated and light comic ones. When consumers talk about webtoons belonging to a 'Drama' cluster in online, they are found to express a variety of sentiments. It is appropriate to establish an OSMU(One source multi-use) strategy to extend these webtoons to other content such as movies and TV series. Third, the sentiment pattern map of 'Irritant' shows the sentiments that discourage customer interest by stimulating discomfort. Webtoons that evoke these sentiments are hard to get public attention. Artists should pay attention to these sentiments that cause inconvenience to consumers in creating webtoons. Finally, Webtoons belonging to 'Romance' do not evoke a variety of consumer sentiments, but they are interpreted as touching consumers. They are expected to be consumed as 'healing content' targeted at consumers with high levels of stress or mental fatigue in their lives. The results of this study are meaningful in that it identifies the applicability of consumer sentiment in the areas of recommendation and classification of webtoons, and provides guidelines to help members of webtoons' ecosystem better understand consumers and formulate strategies.

A Study on the Faculty Evaluation Model with Considering the Characteristics of Education-Based Colleges (전문대학의 특성을 고려한 교수업적평가 모델 연구)

  • Hwang, Il-Kyu;Kim, Kyeong-Sook;Kwon, O-Young;Ahn, Tae-Won;Park, Young-Tae
    • Journal of vocational education research
    • /
    • v.30 no.4
    • /
    • pp.23-49
    • /
    • 2011
  • Faculty performance evaluation system has been settled down as an uncomfortable but unavoidable system, and it is one of the most important factors to grow the college competitiveness up. In this study, we selected and surveyed faculty evaluation models of several universities and colleges in Korea, and analyzed by comparing each evaluation areas of educational achievement, college-industry collaboration, research, and service. We also identified the properties of the current faculty evaluation models of the junior colleges, and derived several problems from these models such as an imitation of four-year university model, a disorders of job evaluation with respect to the attributes of classified jobs, a large variation of individual item weights, and an insufficient reflection of major characteristics. Based on these surveys and analysis, an improved faculty evaluation model for the junior college is proposed in this study. This model proposed four basic areas-educational achievement, college-industry collaboration, research, and service by considering the importance of the college-industry collaboration in the junior college-as well as the team evaluation area. Weights of the SCI-class paper was selected as a criterion for the arrangement of objective comparison of each evaluation items. We showed the integration method of several different evaluation model with respect to the attributes of classified jobs of each faculties, and evaluation plan of variational characteristics according to the majors of individuals in this model. Finally, we introduced an area fail and rating system to operate efficiently the proposed faculty evaluation model.

A Recidivism Prediction Model Based on XGBoost Considering Asymmetric Error Costs (비대칭 오류 비용을 고려한 XGBoost 기반 재범 예측 모델)

  • Won, Ha-Ram;Shim, Jae-Seung;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.127-137
    • /
    • 2019
  • Recidivism prediction has been a subject of constant research by experts since the early 1970s. But it has become more important as committed crimes by recidivist steadily increase. Especially, in the 1990s, after the US and Canada adopted the 'Recidivism Risk Assessment Report' as a decisive criterion during trial and parole screening, research on recidivism prediction became more active. And in the same period, empirical studies on 'Recidivism Factors' were started even at Korea. Even though most recidivism prediction studies have so far focused on factors of recidivism or the accuracy of recidivism prediction, it is important to minimize the prediction misclassification cost, because recidivism prediction has an asymmetric error cost structure. In general, the cost of misrecognizing people who do not cause recidivism to cause recidivism is lower than the cost of incorrectly classifying people who would cause recidivism. Because the former increases only the additional monitoring costs, while the latter increases the amount of social, and economic costs. Therefore, in this paper, we propose an XGBoost(eXtream Gradient Boosting; XGB) based recidivism prediction model considering asymmetric error cost. In the first step of the model, XGB, being recognized as high performance ensemble method in the field of data mining, was applied. And the results of XGB were compared with various prediction models such as LOGIT(logistic regression analysis), DT(decision trees), ANN(artificial neural networks), and SVM(support vector machines). In the next step, the threshold is optimized to minimize the total misclassification cost, which is the weighted average of FNE(False Negative Error) and FPE(False Positive Error). To verify the usefulness of the model, the model was applied to a real recidivism prediction dataset. As a result, it was confirmed that the XGB model not only showed better prediction accuracy than other prediction models but also reduced the cost of misclassification most effectively.

『Chūn-qiū』Wáng-lì(『春秋』王曆)➂ - from Zhōu-lì(周曆) to Xià-lì(夏曆), and "Xíng-xià-zhī-shí(行夏之時)" Mentioned by Confucius (『춘추』 왕력(王曆)➂ - 주력(周曆)에서 하력(夏曆)으로, 그리고 공자의 "행하지시(行夏之時)")

  • Seo, Jeong-Hwa
    • The Journal of Korean Philosophical History
    • /
    • no.54
    • /
    • pp.153-184
    • /
    • 2017
  • During the Pre-Qin(秦) Dynasty era, there were the records that there had been many calendar systems, such as $g{\check{u}}-li{\grave{u}}-l{\grave{i}}$(古六曆 : six ancient calendar systems). Then, the fact that particularly $zh{\bar{o}}u-l{\grave{i}}$(周曆) and $xi{\grave{a}}-l{\grave{i}}$(夏曆) were mainly discussed among them resulted from a lot of discussions from the differences in the calendar system in "$Ch{\bar{u}}n-qi{\bar{u}}$(春秋)" known to have been written by Confucius from the calendar system in "$X{\acute{i}}ng-xi{\grave{a}}-zh{\bar{i}}-sh{\acute{i}}$(行夏之時 : implement the calendar of Ha dynasty.)" that Confucius mentioned himself to his disciple. $zh{\bar{o}}u-l{\grave{i}}$(周曆) with $d{\bar{o}}ngzh{\grave{i}}-yu{\grave{e}}$(冬至月 : the 11th month of the lunar calendar) as the first month of a year had the system of the lunar calendar, and $xi{\grave{a}}-l{\grave{i}}$(夏曆) called as the calendar of Ha(夏) dynasty had the system of $ji{\acute{e}}-q{\grave{i}}-l{\grave{i}}$(節氣曆 : a kind of the solar calendar that divides one year of 365 days into 24 solar terms) with $y{\acute{i}}n-yu{\grave{e}}$(寅月 :one month from the present Feb 5) as the first month of a year. These two calendars had definite differences in the first months of a year, names of seasons, and the lunar calendar and the solar calendar. The fundamental reason why Confucius recommended the performance of $xi{\grave{a}}-l{\grave{i}}$(夏曆) as a way to run the nation was not that it started from the philosophical view of the universe that among the 'three $zh{\bar{e}}ng$'(三正)' of $ti{\bar{a}}n-zh{\bar{e}}ng$(天正 : the first month of a year with the heaven as the standard), $d{\grave{i}}-zh{\bar{e}}ng$(地正 : the first month of a year with the earth as the standard) and $r{\acute{e}}n-zh{\bar{e}}ng$(人正 : the first month of a year with humans as the standard), but that he wanted to emphasize the importance of practical national economic policies to enhance agricultural productivity. It becomes the criterion that even though Confucius emphasized that politicians should not have moral flaws ideally, with regard to public policies, he wanted to stress politicians' duties based on the reality a lot.

Analyzing Mathematical Performances of ChatGPT: Focusing on the Solution of National Assessment of Educational Achievement and the College Scholastic Ability Test (ChatGPT의 수학적 성능 분석: 국가수준 학업성취도 평가 및 대학수학능력시험 수학 문제 풀이를 중심으로)

  • Kwon, Oh Nam;Oh, Se Jun;Yoon, Jungeun;Lee, Kyungwon;Shin, Byoung Chul;Jung, Won
    • Communications of Mathematical Education
    • /
    • v.37 no.2
    • /
    • pp.233-256
    • /
    • 2023
  • This study conducted foundational research to derive ways to use ChatGPT in mathematics education by analyzing ChatGPT's responses to questions from the National Assessment of Educational Achievement (NAEA) and the College Scholastic Ability Test (CSAT). ChatGPT, a generative artificial intelligence model, has gained attention in various fields, and there is a growing demand for its use in education as the number of users rapidly increases. To the best of our knowledge, there are very few reported cases of educational studies utilizing ChatGPT. In this study, we analyzed ChatGPT 3.5 responses to questions from the three-year National Assessment of Educational Achievement and the College Scholastic Ability Test, categorizing them based on the percentage of correct answers, the accuracy of the solution process, and types of errors. The correct answer rates for ChatGPT in the National Assessment of Educational Achievement and the College Scholastic Ability Test questions were 37.1% and 15.97%, respectively. The accuracy of ChatGPT's solution process was calculated as 3.44 for the National Assessment of Educational Achievement and 2.49 for the College Scholastic Ability Test. Errors in solving math problems with ChatGPT were classified into procedural and functional errors. Procedural errors referred to mistakes in connecting expressions to the next step or in calculations, while functional errors were related to how ChatGPT recognized, judged, and outputted text. This analysis suggests that relying solely on the percentage of correct answers should not be the criterion for assessing ChatGPT's mathematical performance, but rather a combination of the accuracy of the solution process and types of errors should be considered.