• Title/Summary/Keyword: Semantic Classification Model

The e-Business Agent Prototyping System with Component Based Development Architecture (CBD 아키텍처 기반 e-비즈니스 에이전트 프로토타이핑 시스템)

  • Shin, Ho-Jun;Kim, Haeng-Kon
    • The KIPS Transactions: Part D / v.11D no.1 / pp.133-142 / 2004
  • The next generation of web applications will need to be larger, more complex, and more flexible. Agent-oriented systems have great potential for these e-commerce applications: agents can dynamically discover and compose e-services and mediate interactions. Developing software agents with CBD (Component Based Development) has proved successful in increasing the speed to market of development projects, lowering development cost, and providing better quality. In this paper, we propose a systematic development process for software agents using components and UML (Unified Modeling Language). We suggest an ebA (e-business Agent) CBD reference architecture that layers the related components through the identification and classification of general agents and e-business agents. We also propose the ebA-CBD process, a guideline that takes the best features of existing agent-oriented software engineering methodologies while grounding agent-oriented concepts in the same underlying semantic framework used by UML. We first developed the agent component specification and modeled it with Goal, Role, Interaction, and Architecture Models. Based on this, we developed e-CPIMAS (e-Commerce Product Information Mailing Agent System) as a case study, which provides a product information mailing service following the proposed process. Finally, we describe how these concepts may help increase efficiency, reusability, productivity, and quality in developing business applications and e-business agents.

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.69-92 / 2015
  • The explosion of social media data has led researchers to apply text-mining techniques to analyze big social media data in a more rigorous manner. Although social media text analysis algorithms have improved, previous approaches to social media text analysis have some limitations. In sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common; some studies have added grammatical factors to the feature sets used to train classification models. The other approach applies semantic analysis to sentiment analysis, but it has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to deal with more extensive semantic features that were underestimated in existing sentiment analysis. The result of applying the Word2Vec algorithm is compared with the result of co-occurrence analysis to identify the difference between the two approaches. The results show that, among the related words extracted, those expressing some emotion about the keyword are three times more numerous with the Word2Vec algorithm than with co-occurrence analysis. The difference between the two results comes from Word2Vec's vectorization of semantic features. It is therefore possible to say that the Word2Vec algorithm can catch hidden related words that traditional analysis has not found. In addition, Part-Of-Speech (POS) tagging for Korean is used to detect adjectives as "emotional words" in Korean. The emotional words extracted from the text are then converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the "emotional word" in the sentence.
The process of extracting these trigger factors of an emotional word is named "Emotion Trigger" in this study. As a case study, the datasets were collected by searching with three keywords (professor, prosecutor, and doctor), chosen because these keywords carry rich public emotion and opinion. Preliminary data collection was conducted to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin Hae-chul Sky Hospital, drinking and plastic surgery, rebate), Prosecutor (lewd behavior, sponsor). The size of the text data is about 100,000 (Professor: 25,720; Doctor: 35,110; Prosecutor: 43,225), and the data were gathered from news, blogs, and Twitter to reflect various levels of public emotion in the text data analysis. Gephi (http://gephi.github.io) was used for visualization, and all programs used for text processing and analysis were written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, finding Emotion Triggers can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to say whether a word extracted by Emotion Trigger processing has a significant causal relationship with the emotional word in a sentence. Future work will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, the text data used in Emotion Trigger include Twitter data, which have a number of distinct features that we did not deal with in this study. These features will be considered in further study.
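
The Emotion Trigger steps described in this abstract (detect adjectives as emotional words, find related words by vector similarity, keep the nouns) can be sketched as below, alongside the co-occurrence baseline it is compared with. This is a minimal illustration and not the authors' implementation: the abstract states their programs were written in Java, while this sketch uses Python. The tagged sentences, the `ADJ`/`NOUN` tag names, the three-dimensional vectors, and all function names are hypothetical stand-ins; a real pipeline would use a Korean POS tagger and trained Word2Vec embeddings.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy corpus: each sentence is a list of (word, POS tag) pairs.
# The tag names "ADJ" and "NOUN" are assumptions, not a real Korean tag set.
SENTENCES = [
    [("professor", "NOUN"), ("shameful", "ADJ"), ("misconduct", "NOUN")],
    [("shameful", "ADJ"), ("misconduct", "NOUN"), ("report", "NOUN")],
    [("professor", "NOUN"), ("kind", "ADJ"), ("lecture", "NOUN")],
]

# Hypothetical 3-dimensional vectors standing in for trained Word2Vec embeddings.
VECTORS = {
    "shameful": (0.9, 0.1, 0.0),
    "misconduct": (0.8, 0.2, 0.1),
    "report": (0.3, 0.7, 0.2),
    "kind": (0.0, 0.2, 0.9),
    "lecture": (0.1, 0.3, 0.8),
    "professor": (0.4, 0.5, 0.4),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def emotion_words(sentences):
    """Collect adjectives, which are treated here as emotional words."""
    return sorted({w for s in sentences for w, tag in s if tag == "ADJ"})


def cooccurring_nouns(sentences, emotion_word):
    """Baseline: count nouns that share a sentence with the emotional word."""
    counts = Counter()
    for sentence in sentences:
        if any(w == emotion_word for w, _ in sentence):
            counts.update(w for w, tag in sentence if tag == "NOUN")
    return counts


def emotion_triggers(sentences, vectors, emotion_word, top_n=2):
    """Rank nouns by vector similarity to the emotional word; keep the top_n."""
    nouns = {w for s in sentences for w, tag in s if tag == "NOUN" and w in vectors}
    target = vectors[emotion_word]
    ranked = sorted(nouns, key=lambda n: cosine(target, vectors[n]), reverse=True)
    return ranked[:top_n]
```

With this toy data, `emotion_triggers(SENTENCES, VECTORS, "shameful")` returns `["misconduct", "professor"]`, while `cooccurring_nouns(SENTENCES, "shameful")` counts "misconduct" twice and "professor" and "report" once each. The similarity ranking can surface a noun that never shares a sentence with the emotional word, which is the kind of hidden relation the abstract attributes to Word2Vec.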