KNU Korean Sentiment Lexicon: Bi-LSTM-based Method for Building a Korean Sentiment Lexicon (Bi-LSTM 기반의 한국어 감성사전 구축 방안)
Journal of Intelligence and Information Systems, v.24 no.4, pp.219-240, 2018
Sentiment analysis, one of the text mining techniques, is a method for extracting subjective content embedded in text documents. Sentiment analysis methods have recently been widely used in many fields. For example, data-driven surveys analyze the subjectivity of text data posted by users, and market research analyzes users' review posts to quantify the reputation of a target product. The basic method of sentiment analysis is to use a sentiment dictionary (or lexicon), a list of sentiment vocabulary items with positive, neutral, or negative semantics. In general, the meaning of many sentiment words is likely to differ across domains. For example, the sentiment word 'sad' carries a negative meaning in many fields but not in the movie domain. To perform accurate sentiment analysis, we need to build a sentiment dictionary for the given domain. However, building such a domain-specific lexicon is time-consuming, and many sentiment words are missed unless a general-purpose sentiment lexicon is used. To address this problem, several studies have constructed sentiment lexicons for specific domains based on 'OPEN HANGUL' and 'SentiWordNet', which are general-purpose sentiment lexicons. However, OPEN HANGUL is no longer in service, and SentiWordNet does not work well because of the language differences that arise when Korean words are converted into English words. These limitations restrict the use of such general-purpose sentiment lexicons as seed data for building a domain-specific sentiment lexicon. In this article, we construct the 'KNU Korean Sentiment Lexicon (KNU-KSL)', a new general-purpose Korean sentiment dictionary that is more advanced than existing general-purpose lexicons.
The proposed dictionary, a list of domain-independent sentiment words such as 'thank you', 'worthy', and 'impressed', is built so that a sentiment dictionary for a target domain can be constructed quickly. Specifically, its sentiment vocabulary is constructed by analyzing the glosses contained in the Standard Korean Language Dictionary (SKLD) with the following procedure. First, we propose a sentiment classification model based on Bidirectional Long Short-Term Memory (Bi-LSTM). Second, the proposed deep learning model automatically classifies each gloss as having either a positive or a negative meaning. Third, positive words and phrases are extracted from the glosses classified as positive, and negative words and phrases are extracted from the glosses classified as negative. Our experimental results show that the average accuracy of the proposed sentiment classification model is up to 89.45%. In addition, the sentiment dictionary is further extended using various external sources, including SentiWordNet, SenticNet, Emotional Verbs, and Sentiment Lexicon 0603. Furthermore, we add sentiment information about frequently used coined words and emoticons that appear mainly on the Web. The KNU-KSL contains a total of 14,843 sentiment vocabulary items, each of which is a 1-gram, a 2-gram, a phrase, or a sentence pattern. Unlike existing sentiment dictionaries, it is composed of words that are not affected by particular domains. The recent trend in sentiment analysis is to use deep learning techniques without sentiment dictionaries, and the importance of developing sentiment dictionaries has gradually declined. However, a recent study shows that the words in a sentiment dictionary can be used as features of deep learning models, resulting in sentiment analysis with higher accuracy (Teng, Z., 2016). This result indicates that a sentiment dictionary can be used not only for sentiment analysis itself but also as a source of features for improving the accuracy of deep learning models.
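The third step of the procedure (extracting words and phrases from classified glosses) could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the labels are supplied by hand in place of the Bi-LSTM classifier's output, the glosses are toy English strings split on whitespace (Korean text would need morphological analysis), and the `extract_candidates` function, its `min_count` threshold, and the "occurs in one polarity only" filter are all hypothetical choices that the abstract does not specify.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_candidates(labeled_glosses, min_count=2):
    """Collect 1-grams and 2-grams that occur in glosses of one polarity only.

    labeled_glosses: iterable of (gloss_text, label) with label "pos" or "neg".
    The labels stand in for the output of a gloss classifier (assumption).
    """
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in labeled_glosses:
        tokens = text.lower().split()  # toy tokenizer, not suitable for Korean
        for n in (1, 2):
            counts[label].update(ngrams(tokens, n))

    lexicon = {}
    for label, other in (("pos", "neg"), ("neg", "pos")):
        for gram, count in counts[label].items():
            # Keep n-grams that are frequent in one class and absent from the other.
            if count >= min_count and counts[other][gram] == 0:
                lexicon[gram] = label
    return lexicon


# Hand-labeled toy glosses (hypothetical data).
glosses = [
    ("a feeling of deep joy", "pos"),
    ("deep joy and delight", "pos"),
    ("a feeling of great sorrow", "neg"),
    ("great sorrow or grief", "neg"),
]
lexicon = extract_candidates(glosses)
for gram, label in sorted(lexicon.items()):
    print(gram, label)
```

Words shared by both classes, such as "feeling", are dropped, which mirrors the goal of keeping only vocabulary that carries polarity. A real pipeline would likely need a softer filter than strict absence from the opposite class.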
The proposed dictionary can be used as basic data for constructing the sentiment lexicon of a particular domain and as features of deep learning models. It is also useful for automatically and quickly building large training sets for deep learning models.
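As a rough illustration of the last point, a lexicon can assign provisional labels to unlabeled text so that a training set is assembled without manual annotation. The sketch below is a minimal example under assumptions: the entries and their +1/-1 polarity values in `toy_lexicon` are invented for illustration and do not reflect the actual KNU-KSL file format, and `lexicon_score`, `weak_label`, and the `margin` parameter are hypothetical names.

```python
def lexicon_score(tokens, lexicon, max_n=2):
    """Sum the polarity of every lexicon entry (1-gram to max_n-gram) found in tokens."""
    score = 0
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            score += lexicon.get(" ".join(tokens[i:i + n]), 0)
    return score


def weak_label(texts, lexicon, margin=1):
    """Label texts whose lexicon score is at least `margin` away from zero."""
    labeled = []
    for text in texts:
        score = lexicon_score(text.lower().split(), lexicon)
        if score >= margin:
            labeled.append((text, "pos"))
        elif score <= -margin:
            labeled.append((text, "neg"))
        # Texts with no clear polarity are left out of the training set.
    return labeled


# Hypothetical lexicon entries with assumed polarity values.
toy_lexicon = {
    "thank you": 1,
    "worthy": 1,
    "impressed": 1,
    "disappointed": -1,
    "waste": -1,
}
reviews = [
    "Thank you for a worthy purchase",
    "Total waste and I am disappointed",
    "It arrived on Tuesday",
]
training_set = weak_label(reviews, toy_lexicon)
for text, label in training_set:
    print(label, text)
```

Labels produced this way are noisy, so raising `margin` trades training set size for label quality.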
This study aimed to contribute to the improvement of cropping systems by determining how the excretions and components of crop roots affect other crops as well as the crops themselves. Forage crops suitable for our country were selected for the study. Aqueous extracts of fresh roots, aqueous extracts of rotting roots, and aqueous solutions of root excretions of red clover, orchard grass and brome grass were tested for their effects on the germination and seedling growth of red clover, ladino clover, lespedeza, soybean, orchard grass, Italian ryegrass, brome grass, barley, wheat, sorghum, corn and hog millet. Because organic acids might be closely related to the root excretions and components associated with soil sickness, the acid components of the roots of the three species were analysed by paper chromatography and gas chromatography. The following results were obtained.
1. Effects of Aqueous Extracts of Fresh Roots: Aqueous extracts of red clover: The extracts inhibited the seedling growth of ladino clover and lespedeza and also inhibited the development of most crops among the Gramineae except sorghum. Aqueous extracts of orchard grass: The extracts promoted the seedling growth of red clover and soybean but inhibited the germination and growth of orchard grass. They also inhibited the growth of barley and hog millet and had no noticeable effect on the other crops. Aqueous extracts of brome grass: The extracts had no effect on Italian ryegrass but inhibited the other crops.
2. Effects of Aqueous Extracts of Rotting Roots: Aqueous extracts of red clover: The extracts promoted the seedling growth of red clover but inhibited the other crops except sorghum. Aqueous extracts of orchard grass: The extracts promoted the growth of red clover, ladino clover, soybean and sorghum but inhibited the germination and rooting of barley and hog millet. Aqueous extracts of brome grass: The extracts promoted the growth of red clover, soybean and sorghum but inhibited orchard grass, brome grass, barley and hog millet.
3. Effects of Aqueous Solutions of Root Excretions: The aqueous solution of red clover excretions inhibited the growth of the Gramineae, while the aqueous solutions of orchard grass and Italian ryegrass excretions promoted the growth of red clover.
4. Results of Organic Acid Analysis: Oxalic, citric, tartaric, malonic, malic and succinic acids were found in red clover roots as non-volatile organic acids, and oxalic, citric, tartaric and malic acids were found in orchard grass and brome grass. Formic acid was confirmed as a volatile organic acid in red clover, orchard grass and brome grass. In view of these results, the effects of the root excretions and components found in this study may be summarized as follows. 1) Red clover generally had a disadvantageous effect on the Gramineae. This trend was considered to be caused chiefly by the presence of many organic acids, namely oxalic, citric, tartaric, malonic, malic, succinic and formic acid. 2) Orchard grass generally had an advantageous effect on the Leguminosae. This may be because its roots contain only a few kinds of organic acid, namely oxalic, citric, tartaric, malic and formic acid. In addition, some growth-promoting substance was noted. 3) As long as its roots were not rotten, brome grass had a disadvantageous effect on the Leguminosae and Gramineae. This may be because several unidentified volatile organic acids were present in addition to the confirmed organic acids, namely oxalic, citric, tartaric, malic and formic acid.
5. Effects of Root Components on Soil Sickness: 1) The alleged soil sickness of red clover was considered not to result from toxic components of the roots. 2) Toxic components of the roots were recognized as a possible cause of soil sickness when orchard grass and brome grass are grown under long-term single cropping.
6. Effects of Root Components on Companion Crops in the Cropping System:
a) Aqueous extracts of fresh roots and aqueous root excretions (intercropping and mixed cropping):
1) Advantageous combinations: Orchard grass->Red clover and Soybean; Italian ryegrass->Red clover.
2) Disadvantageous combinations: Red clover->Ladino clover, Lespedeza, Orchard grass, Italian ryegrass, Fescue Ky-31, Brome grass, Barley, Wheat, Corn and Hog millet; Orchard grass->Lespedeza, Orchard grass, Barley and Hog millet; Brome grass->Red clover, Ladino clover, Lespedeza, Soybean, Orchard grass, Brome grass, Barley, Wheat, Sorghum, Corn and Hog millet.
3) Harmless combinations: Red clover->Red clover, Soybean and Sorghum; Orchard grass->Ladino clover, Italian ryegrass, Brome grass, Wheat, Sorghum and Corn; Brome grass->Italian ryegrass.
b) Aqueous extracts of rotting roots (after cropping):
1) Advantageous combinations: Red clover->Red clover and Sorghum; Orchard grass->Red clover, Ladino clover, Soybean, Sorghum and Corn; Brome grass->Red clover, Soybean and Sorghum.
2) Disadvantageous combinations: Red clover->Lespedeza, Orchard grass, Italian ryegrass, Brome grass, Barley, Wheat and Hog millet; Orchard grass->Barley and Hog millet; Brome grass->Orchard grass, Brome grass, Barley and Hog millet.
3) Harmless combinations: Red clover->Ladino clover, Soybean and Corn; Orchard grass->Lespedeza, Orchard grass, Italian ryegrass, Brome grass and Wheat; Brome grass->Ladino clover, Lespedeza, Italian ryegrass and Wheat.