KNU Korean Sentiment Lexicon: Bi-LSTM-based Method for Building a Korean Sentiment Lexicon (Bi-LSTM 기반의 한국어 감성사전 구축 방안)
- Journal of Intelligence and Information Systems / v.24 no.4 / pp.219-240 / 2018
Sentiment analysis, one of the text mining techniques, is a method for extracting subjective content embedded in text documents. Sentiment analysis methods have recently come into wide use in many fields. For example, data-driven surveys analyze the subjectivity of text data posted by users, and market research analyzes users' review posts to quantify a target product's reputation among users. The basic method of sentiment analysis is to use a sentiment dictionary (or lexicon), a list of sentiment vocabulary items with positive, neutral, or negative semantics. In general, the meaning of many sentiment words is likely to differ across domains. For example, the sentiment word 'sad' carries a negative meaning in many fields but not in the movie domain. To perform accurate sentiment analysis, we therefore need to build a sentiment dictionary for the given domain. However, building such a sentiment lexicon is time-consuming, and many sentiment vocabulary items are missed unless a general-purpose sentiment lexicon is used. To address this problem, several studies have constructed sentiment lexicons suited to a specific domain based on 'OPEN HANGUL' and 'SentiWordNet', which are general-purpose sentiment lexicons. However, OPEN HANGUL is no longer in service, and SentiWordNet does not work well because of language differences that arise when converting Korean words into English words. These limitations restrict the use of such general-purpose sentiment lexicons as seed data for building a domain-specific sentiment lexicon. In this article, we construct the 'KNU Korean Sentiment Lexicon (KNU-KSL)', a new general-purpose Korean sentiment dictionary that is more advanced than existing general-purpose lexicons.
The proposed dictionary, a list of domain-independent sentiment words such as 'thank you', 'worthy', and 'impressed', is built so that a sentiment dictionary for a target domain can be constructed quickly. Specifically, it builds its sentiment vocabulary by analyzing the glosses contained in the Standard Korean Language Dictionary (SKLD) through the following procedure. First, we propose a sentiment classification model based on Bidirectional Long Short-Term Memory (Bi-LSTM). Second, the proposed deep learning model automatically classifies each gloss as having either positive or negative meaning. Third, positive words and phrases are extracted from the glosses classified as positive, while negative words and phrases are extracted from the glosses classified as negative. Our experimental results show that the average accuracy of the proposed sentiment classification model is up to 89.45%. In addition, the sentiment dictionary is further extended using various external sources, including SentiWordNet, SenticNet, Emotional Verbs, and Sentiment Lexicon 0603. Furthermore, we add sentiment information about frequently used coined words and emoticons that are used mainly on the Web. The KNU-KSL contains a total of 14,843 sentiment vocabulary items, each of which is a 1-gram, a 2-gram, a phrase, or a sentence pattern. Unlike existing sentiment dictionaries, it is composed of words that are not affected by particular domains. The recent trend in sentiment analysis is to use deep learning techniques without sentiment dictionaries, so the importance of developing sentiment dictionaries has gradually declined. However, a recent study shows that the words in a sentiment dictionary can be used as features of deep learning models, resulting in sentiment analysis with higher accuracy (Teng, Z., 2016). This result indicates that a sentiment dictionary is useful not only for sentiment analysis itself but also as a source of features for improving the accuracy of deep learning models.
The proposed dictionary can be used as basic data for constructing the sentiment lexicon of a particular domain and as features of deep learning models. It is also useful for automatically and quickly building large training sets for deep learning models.
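As a rough illustration of how a lexicon made of 1-grams and 2-grams might be used to score text and to assign weak labels for a training set, consider the following minimal sketch. The lexicon entries, the +1/-1 scores, and the function names `score_tokens` and `weak_label` are all hypothetical and are not taken from KNU-KSL. The sketch also assumes whitespace tokenization, which would normally be replaced by morphological analysis for Korean text.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: entries and scores are illustrative only,
# not taken from KNU-KSL. Keys are token tuples so that 1-grams and
# 2-grams can live in the same table.
TOY_LEXICON: Dict[Tuple[str, ...], int] = {
    ("thank", "you"): 1,
    ("worthy",): 1,
    ("impressed",): 1,
    ("disappointed",): -1,
    ("waste",): -1,
}


def score_tokens(tokens: List[str],
                 lexicon: Dict[Tuple[str, ...], int],
                 max_n: int = 2) -> int:
    """Sum polarity scores, preferring the longest n-gram match at each position."""
    total = 0
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate n-gram first, then shorter ones.
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in lexicon:
                total += lexicon[key]
                i += n  # skip past the matched n-gram
                matched = True
                break
        if not matched:
            i += 1  # no lexicon entry starts here; move on
    return total


def weak_label(text: str, lexicon: Dict[Tuple[str, ...], int]) -> str:
    """Assign a coarse label from the summed lexicon score (assumed thresholds at 0)."""
    score = score_tokens(text.lower().split(), lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in ["Thank you I am impressed", "what a waste", "the film starts at noon"]:
        print(sentence, "->", weak_label(sentence, TOY_LEXICON))
```

Labels produced this way are noisy, so in practice they would likely serve only as a starting point for a training set rather than as ground truth.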
Ⅰ. Introduction
Retailers in the 21st century are being told that the retailers of the future are those who can provide seamless multi-channel access. The reason is that retailers should be where shoppers want them, when they want them: anytime, anywhere, and in multiple formats. Multi-channel access is considered one of the top 10 trends for all businesses in the next decade (Patricia T. Warrington, et al., 2007), and most firms use both direct and indirect channels in their markets. Given this trend, channel equity needs to be evaluated more systematically than before, as the issue is expected to draw more attention from consumers as well as brand managers. Consumers are becoming quite confused about where to shop for durable goods, as there are at least 6-7 retail options. Manufacturers, on the other hand, have to deal with category killers, their dealer networks, Internet shopping malls, and other distribution channels, and they hope their retail channels will behave like extensions of their own companies. They would like their products to be foremost in the retailer's mind: the first to be proposed and effectively communicated to potential customers. To make this hope a reality, they should know each channel's advantages and disadvantages from the consumer's perspective. In addition, customer satisfaction is the key determinant of retail customer loyalty. However, there has been little research on the effects of shopping satisfaction and perceptions on consumers' channel choices and on the channels themselves. The purpose of this study was to assess Korean consumers' channel choices and their satisfaction with the channels they prefer to use when shopping for electronic goods. The Korean electronic goods retail market is a good example of a multi-channel shopping environment.
As the Korean retail market has undergone significant structural changes since it opened to global retailers in 1996, new formats such as hypermarkets, Internet shopping malls, and category killers have arrived over the last decade. Korean electronic goods shoppers have seven major channels: (1) category killers, (2) hypermarkets, (3) manufacturer dealer shops, (4) Internet shopping malls, (5) department stores, (6) TV home shopping, and (7) speciality shopping arcades. The Korean retail sector has modernized with remarkable speed over the last decade. The major retail channels can be summarized as follows: hypermarkets have been the number 1 retailer type by sales volume since 2003; non-store retailing has been number 2 since 2007; department stores are now number 3; and small-scale category killers are growing rapidly, particularly in electronics and office products. We evaluate each channel's equity using a consumer survey. The survey was conducted by telephone interview with 1,000 housewives living nationwide. Sampling followed the 2005 national census, and the average interview time was 10 to 15 minutes.
Ⅱ. Research Summary
We found that the seven major retail channels compete with each other in Korean consumers' minds in terms of price and service. Each channel seems to have its own unique selling points. Department stores were perceived as the best electronic goods shopping destinations because of after-sales service. Internet shopping malls were perceived as the convenient channel owing to easy price checking. Category killers and hypermarkets were more attractive in both price and location convenience. Manufacturers' dealer networks, on the other hand, attracted customers mainly through location and after-sales service. Category killers and hypermarkets were the retail channels most favored by Korean consumers. However, category killers compete mainly with department stores and shopping arcades, while hypermarkets tend to compete with Internet and TV home shopping channels.
Regarding channel satisfaction, the top 3 channels were service-driven retailers: department stores (4.27), dealer shops (4.21), and Internet shopping malls (4.21). Speciality shopping arcades (3.98) were the channel with which Korean consumers were least satisfied.
Ⅲ. Implications
We have tried to identify the whole picture of the multi-channel retail shopping environment and its implications in the context of Korean electronic goods. From the manufacturers' perspective, multiple channels may cause channel conflicts. Furthermore, inter-channel competition is drawing much more attention as hypermarkets and category killers have grown rapidly in recent years. At the same time, from the consumers' perspective, 'where to buy' is becoming an important buying decision, as it determines the level of shopping satisfaction. The concept of 'channel equity' needs to be developed in order to manage multi-channel distribution effectively. Firms should measure and monitor their prime channel equity on a regular basis to maximize their channel potential. A prototype channel equity positioning map has been developed as follows. We expect more studies to develop the concept of 'channel equity' in the future.
Research on the cultivation processes and settlement development of the Mangyoung river valley as a whole shows that they can be understood as having four 'Space-Time Continuities' through an [Origin-Destination] theory model. In the initial phase, cultivation began on the mountain slopes and tributary plains in the upper part of the river basin, from the Koryo Dynasty to the early Chosun Dynasty. At first, indigenous peasants burned forests on the mountain slopes to make 'dryfields' for cereal crops. As the population increased, a more stable food supply became necessary, inducing a change in production method to 'wetfields' on the tributary plains. The first sedentary agriculture may have been initiated on these mountain slopes and tributary plains in the upper part of the river basin through burning cultivation methods. The mountain slopes and tributary plains thus became the Origin area of the cultivation process. Cultivation expanded downward through the valleys at a steady pace in a 'bits of land' fashion, like terraced fields extended downward bit by bit. It expanded to the middle part of the river basin in the mid-Chosun Dynasty with dike construction techniques on the river bank. The lower part of the river was cultivated with embankment building techniques in the 1920s, and cultivation then naturally expanded to the tidal marshes on the estuaries and river inlets of the coastal areas. 'Pioneer fringes' were consolidated there in modern times. The landscape changes of each period show their own characteristics. The results of the study of the Mangyoung river valley as a whole are as follows. (1) The mountain slopes and tributary plains in the upper part of the river were first cultivated as 'dryfields' by indigenous peasants using burning cultivation methods, and sedentary settlements developed at the edges of the mountain slopes and on the river terraces near the fields. They formed a kind of 'periphery-located cluster type' of settlement.
This type of settlement became the prominent type in the upper part of the river basin. Under increasing population pressure, 'dryfields' were later changed into 'wetfields' on the narrow tributary plains. These wetfields were supplied with water by the Weir and Ponds Irrigation System(제언수리방법). Streams on the tributary plains attracted wetfields alongside them and formed a [water+land] complex. 'Wetfields' expanded downward in a terraced land pattern(adder like pattern, 붕전) according to the gradient of the valley. These periphery-located settlements formed an intimate ecological linkage with several sets of surroundings. Inner villages expanded into Outer villages as arable land expanded downward. (2) Cultivation of the mountain slopes and tributary plains expanded onto the alluvial plains in the middle part of the river valley, driven by the urgent need for new land caused by population increase. These alluvial plains were cultivated mainly in the mid-Chosun Dynasty. The irrigation method changed to the Dike Construction Irrigation method(천방수리방법) for flood control. The agents of cultivation tended to shift from communities, which had constructed Bochang along the tributaries to make rice paddies, to local government authorities, which could gather the large sums of capital, techniques, and labour needed for big dike construction works. Settlements advanced into the midst of the plains to avoid the friction of distance and formed a 'Central-located cluster type' of settlement. A hierarchical structure of settlements emerged, ranked and sized according to the merits of water supply and transportation convenience on the broad plains. Big towns developed there. A more prominent [water+land] complex was strengthened along the canals. Ecological linkages between settlements and their surroundings faded to a minor role in this area.
(3) Modern flood control technology is essential on rivers with a large volume of water and a broad width. The alluvial plains at the lower part of the river remained a wilderness overgrown with reeds until technology reached the level at which large-scale artificial levees could be built along the riverbank to protect arable land from floods. Cultivation then proceeded on a large scale, carried out by Japanese agricultural companies under the central government's [River Renovation Project] in the 1920s. Large-scale artificial levees were constructed along the riverbank. The agents of cultivation changed from Korean peasants to Japanese agricultural companies, and Korean peasants were reduced to tenants under the colonial situation in Korea at that time. They had no voice in planning the spatial structure, and their role in planning diminished. The newly cultivated lands reflected the companies' intentions, objectives, and perspectives in pursuing their goals for the sake of the colonial power. The newly cultivated lands were laid out in regular Rectangular Block settings of rice paddies, and a large-scale Bureaucratic-oriented Irrigation System was implanted on the cultivated plains. All settlements were located in the midst of the rice paddies as a Central-located Cluster type of settlement. The [water+land] complex along the canal system was further strengthened. The cultivated space has the character of [I-IT] landscapes. (4) In colonial times, artificial levees were connected to coastal embankments for the reclamation of the broad tidal marshes on the estuaries and inlets of rivers. The agents of reclamation grew into big agricultural companies that could act as large-scale cultivators.
From that time on, most tidal marsh reclamation projects were controlled by these agricultural companies, formed mostly by Japanese capitalists. The reclaimed lands on the estuaries and river inlets were in the hands of the agricultural companies, and all the spatial structures were shaped by their intentions, objectives, and perspectives. They constructed Unit Farming Areas for the sake of the companies. The spatial structure was planned in a regular form, with broad arable land in rectangular blocks for rice production, regular canal systems, and tank reservoirs to supply irrigation water to the reclaimed lands. A 'Central-located linear type' of settlement developed in the midst of the reclaimed land. These settlements were established on the newly reclaimed land all at once under a detailed program and a master plan, and they show planned patterns in their distribution, building materials, location, and form. The ecological linkage between the newly established settlements and their surroundings lost its colour and became more artificial in a human-centred environment. [I-IT] landscapes became more prominent. This region is the destination area of the [Origin-Destination] theory model and formed a 'Pioneer Fringe'. It is a kind of pioneer front that can advance or retreat discontinuously depending on the physical and socio-cultural conditions of the region.
To survive in a globally competitive environment, enterprises should be able to solve various problems and find optimal solutions effectively. Big data is perceived as a tool for solving enterprise problems effectively and improving competitiveness through its varied problem-solving and advanced predictive capabilities. Owing to its remarkable performance, the implementation of big data systems has increased among many enterprises around the world. Big data is currently called the 'crude oil' of the 21st century and is expected to provide competitive superiority. Big data is in the limelight because, while conventional IT technology has been falling behind in what it can make possible, big data has gone beyond technological possibility and can be used to create new value, such as business optimization and new business creation, through data analysis. However, since big data has been introduced too hastily, without considering the strategic value and achievements obtainable through it, enterprises face difficulties in deriving strategic value and utilizing data. According to a survey of 1,800 IT professionals from 18 countries worldwide, only 28% of corporations were utilizing big data well, and many respondents said they were having difficulties in deriving strategic value from big data and operating it. To introduce big data, the strategic value should be derived and environmental aspects, such as corporate internal and external regulations and systems, should be considered, but these factors were not well reflected. The cause of failure turned out to be that big data was introduced in response to IT trends and the surrounding environment, hastily and before the conditions for introduction were in place.
To introduce big data successfully, the strategic value obtainable through big data should be clearly understood, and systematic environmental analysis of its applicability is very important; however, since corporations consider only the partial achievements and technological aspects of big data, successful introduction is not being achieved. Previous studies show that most big data research focuses on big data concepts, cases, and practical suggestions, without empirical study. The purpose of this study is to provide a theoretically and practically useful implementation framework and strategies for big data systems by conducting a comprehensive literature review, identifying factors that influence successful big data system implementation, and analysing empirical models. To this end, in order to understand the factors at work when corporations introduce big data, the elements that can affect the intention to introduce big data were derived by reviewing information system success factors, strategic value perception factors, factors to consider in the information system introduction environment, and the big data literature, and a structured questionnaire was developed. The questionnaire survey was then administered to the people in charge of big data inside corporations, and statistical analysis was performed. The statistical analysis showed that the strategic value perception factor and the intra-industry environmental factors positively affected the intention to introduce big data. The theoretical, practical, and policy implications derived from the study results are as follows.
The first theoretical implication is that this study has theoretically proposed the factors that affect the intention to introduce big data, by reviewing strategic value perception, environmental factors, and previous big data studies, and has proposed variables and measurement items that were empirically analyzed and verified. The study is meaningful in that it measured the influence of each variable on introduction intention by verifying the relationships between the independent variables and the dependent variable through a structural equation model. Second, this study has defined the independent variables (strategic value perception, environment), the dependent variable (introduction intention), and the moderating variables (type of business and corporate size) for big data introduction intention, and has laid a theoretical base for subsequent empirical study of big data by developing measurement items with established reliability and validity. Third, by verifying the significance of the strategic value perception factors and environmental factors proposed in previous studies, this study can aid subsequent empirical studies of the factors affecting big data introduction. The operational implications are as follows. First, this study has laid an empirical base for the big data field by investigating the causal relationship between the strategic value perception factor and environmental factor and introduction intention, and by proposing measurement items with established appropriateness, reliability, and validity. Second, this study found that the strategic value perception factor positively affects big data introduction intention, and it is meaningful in that it demonstrates the importance of strategic value perception.
Third, the study proposes that a corporation introducing big data should do so through precise analysis of the industry's internal environment. Fourth, by presenting differences in the factors affecting big data introduction depending on corporate size and type of business, this study proposes that the size and type of business of the corporation should be considered when introducing big data. The policy implications are as follows. First, more varied utilization of big data is needed. The strategic value of big data can be approached in various ways, in the product and service field, the productivity field, the decision-making field, and so on, and can on that basis be utilized in all business fields, but what major domestic corporations are considering is limited to parts of the product and service fields. Accordingly, when introducing big data, it will be necessary to review the utilization aspects in detail and design the big data system in a form that maximizes the utilization rate. Second, the study identifies the cost burden of system introduction, the difficulty of utilizing the system, and the lack of credibility of supplier corporations as problems in the big data introduction phase. Since global IT corporations dominate the big data market, domestic corporations cannot help but depend on foreign corporations when introducing big data. Considering that our country has no global IT corporations even though it is a world IT powerhouse, big data can be seen as a chance to foster world-class corporations. Accordingly, the government needs to foster star corporations through active policy support. Third, corporations lack internal and external professional manpower for big data introduction and operation.
Big data is a system in which how valuable information can be derived from data matters more than the system construction itself. For this, talent equipped with academic knowledge and experience in various fields such as IT, statistics, strategy, and management is needed, and such talent should be trained through systematic education. This study has laid a theoretical base for empirical studies of big data by identifying and verifying the main variables that affect big data introduction intention, and by empirically analyzing that theoretical base it is expected to offer useful guidelines for corporations and policy developers who are considering big data implementation.
1. The 'Kao Zheng Pai'(考證派) comes from the 'Zhe Zhong Pai' and is a school influenced by the Confucianism of the Qing dynasty. In Japan, Inoue Kinga(井上金峨) and Yoshida Koton(吉田篁墩) became central members, and the rise of the methodology of historical research(考證學) influenced the members of the 'Zhe Zhong Pai'; the trend of historical research moved from Confucianism to medicine, producing a school of medicine based on the study of texts and on proving that the classics were right. 2. Driven by the function of 'Nei Qu Li'(內驅力), the 'Kao Zheng Pai', in the spirit of 'use Confucianism as the base', researched letters, meanings, and historical origins. Because they were influenced by the methodology of historical research(考證學) of the Qing era, they valued the evidential research of classic texts, and there was even one branch that did only historical research, the 'Ru Xue Kao Zheng Pai'(儒學考證派). Also, the 'Yi Xue Kao Zheng Pai'(醫學考證派) appeared under the influence of Yoshida Koton and Kariya Ekisai(狩谷掖齋). 3. As for the theories and views of the 'Kao Zheng Pai'(考證派), the 'Yi Xue Kao Zheng Pai' did not look at medical scriptures like the "Huang Di Nei Jing"("黃帝內經") and did not do research on 'medical' areas like acupuncture, the meridians, and medicinal herbs. Since they were doctors who used medicine, they were naturally based on 'formulas'(方劑), and since their thinking was grounded in historical ideologies, they valued the "Shang Han Za Bing Lun", which was revered as the 'ancestor of all formulas'(衆方之祖). 4. The lives of the important doctors of the 'Kao Zheng Pai', Meguro Dotaku(目黑道琢), Yamada Seichin(山田正珍), Yamada Kyoko(山田業廣), Mori Ritsi(森立之), and Kitamura Naohara(喜多村直寬), are as follows. 1) Meguro Dotaku(目黑道琢 1739
1. The 'Kao Zheng Pai'(考證派) comes from the 'Zhe Zhong Pai'(折衷派) and is a school influenced by the Confucianism of the Qing dynasty. In Japan, Inoue Kinga(井上金峨), Yoshida Koton(吉田篁墩