Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

  • Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.59-83 / 2018
  • With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In deep learning-based sentiment analysis of English texts, the natural language sentences in the training and test datasets are usually converted into sequences of word vectors before being fed into the model. Here, word vectors generally refer to vector representations of the words obtained by splitting a sentence on space characters. There are several ways to derive word vectors, one of which is Word2Vec, which was used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data. These vectors have been widely used in sentiment analysis studies of reviews from various fields such as restaurants, movies, laptops, and cameras. Unlike in English, morphemes play an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with well-developed postpositions and endings. A morpheme can be defined as the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, the word '예쁘고' consists of the morphemes '예쁘' (adjective) and '고' (connective ending). Given the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use 'morpheme vectors' as the input to a deep learning model rather than the 'word vectors' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. This raises several questions. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors to improve the classification accuracy of a deep learning model? 
Is it appropriate to apply a typical word vector model, which relies primarily on the surface form of words, to Korean, which has a high proportion of homonyms? Will text preprocessing such as correcting spelling or spacing errors affect classification accuracy, especially when deriving morpheme vectors from Korean product reviews with many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which are likely to be encountered first when applying deep learning models to Korean texts. As a starting point, we summarize these issues as three central research questions. First, as the initial input of a deep learning model, which is more effective: morpheme vectors from grammatically correct texts of a domain other than the analysis target, or morpheme vectors from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean with respect to the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can a satisfactory level of classification accuracy be achieved when applying deep learning to Korean sentiment analysis? To address these research questions, we generate various types of morpheme vectors reflecting the questions and then compare classification accuracy using a non-static CNN (Convolutional Neural Network) model that takes the morpheme vectors as input. For the training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used. To derive morpheme vectors, we use data from the same domain as the target and data from another domain: about 2 million cosmetics product reviews from Naver Shopping and 520,000 Naver News items, which arguably correspond to Google's News data. The six primary sets of morpheme vectors constructed in this study differ in terms of the following three criteria. 
First, they come from two types of data source: Naver News, with high grammatical correctness, and Naver Shopping's cosmetics product reviews, with low grammatical correctness. Second, they differ in the degree of data preprocessing, namely, sentence splitting only, or sentence splitting followed by additional spelling and spacing corrections. Third, they vary in the form of input fed into the word vector model: the morphemes are entered either by themselves or with their POS tags attached. The morpheme vectors further vary depending on the range of POS tags considered, the minimum frequency of the morphemes included, and the random initialization range. All morpheme vectors are derived through the CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300. The results suggest that classification accuracy improves when same-domain text is used even if it is less grammatically correct, when spelling and spacing corrections are performed in addition to sentence splitting, and when morphemes of all POS tags, including the incomprehensible category, are incorporated. POS tag attachment, which was devised to handle the high proportion of homonyms in Korean, and the minimum frequency threshold for including a morpheme do not appear to have any definite influence on classification accuracy.
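
The input variations described in this abstract (bare morphemes versus morphemes with POS tags attached, a restricted range of POS tags, and a minimum frequency cutoff) can be illustrated with a minimal sketch. This is not the authors' code: the function name `build_tokens`, its parameters, the two toy reviews, and the Sejong-style tags (VA, EC, EF, NNG, JKS) are all assumptions made for illustration, and the morphological analysis is taken as already done.

```python
from collections import Counter

# Hypothetical, already-analyzed reviews as (morpheme, POS tag) pairs.
# The tags are assumed Sejong-style labels; the abstract does not name a tagset.
reviews = [
    [("예쁘", "VA"), ("고", "EC"), ("좋", "VA"), ("아요", "EF")],
    [("색", "NNG"), ("이", "JKS"), ("예쁘", "VA"), ("어요", "EF")],
]


def build_tokens(sentences, attach_pos=False, keep_tags=None, min_count=1):
    """Turn analyzed sentences into token lists for a word vector model.

    attach_pos: emit 'morpheme/TAG' so homonyms with different tags stay distinct.
    keep_tags:  set of POS tags to keep; None keeps every tag.
    min_count:  drop tokens that occur fewer than this many times overall.
    """
    tokenized = []
    for sentence in sentences:
        tokens = []
        for morpheme, tag in sentence:
            if keep_tags is not None and tag not in keep_tags:
                continue
            tokens.append(f"{morpheme}/{tag}" if attach_pos else morpheme)
        tokenized.append(tokens)

    # Frequency is counted after POS filtering and tag attachment.
    counts = Counter(token for tokens in tokenized for token in tokens)
    return [[t for t in tokens if counts[t] >= min_count] for tokens in tokenized]


plain = build_tokens(reviews)
tagged = build_tokens(reviews, attach_pos=True)
content_only = build_tokens(reviews, keep_tags={"VA", "NNG"})
frequent = build_tokens(reviews, min_count=2)
```

The resulting token lists could then be passed to any Word2Vec implementation configured as CBOW with a context window of 5 and 300 dimensions, as stated in the abstract. In gensim 4.x, for example, that would correspond to `Word2Vec(sentences, sg=0, window=5, vector_size=300)`, though the abstract does not say which toolkit was used.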

A Study on Lee, Man-Bu's Thought of Space and Siksanjeongsa with Special Reference of Prototype Landscape Analyzing Nuhangdo(陋巷圖) and Nuhangnok(陋巷錄) (누항도(陋巷圖)와 누항록(陋巷錄)을 통해 본 이만부의 공간철학과 식산정사의 원형경관)

  • Kahng, Byung-Seon;Lee, Seung-Yeon;Shin, Sang-Sup;Rho, Jae-Hyun
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.39 no.2 / pp.15-28 / 2021
  • 'Cheonunjeongsa (天雲精舍)', designated as Gyeongsangbukdo Folklore Cultural Property No. 76, is a Siksanjeongsa built in 1700 by Manbu Lee Siksan. In this study, we investigate the life and perspective of Manbu Lee in relation to Siksanjeongsa, and estimate its feng shui location, territoriality, and original landscape by analyzing 「Nuhangnok」 and 「Nuhangdo」, the results of his political management. The following results were derived by examining the philosophy that the scholar wanted to embody in his space. First, Manbu Lee Siksan was a representative hermit-type Confucian scholar of the late Joseon Dynasty. 'Siksan', the name of the government official and the nickname of Manbu Lee, is derived from the mountain behind the village, and he wanted to rest in the four areas of thought (思), body (躬), speech (言), and friendship (交). During the difficult years of King Sukjong, Manbu Lee, who came from a Namin family, expressed his will to live in seclusion through the title 'Siksan'. Second, there is a high possibility of a restoration close to the original. Manbu Lee recorded the location of Siksanjeongsa, its spatial structure, buildings and landscape facilities, trees, surrounding landscape, and usage behaviors in 「Nuhangnok」, and left a book of 《Nuhangdo》. Third, Manbu Lee refers to the feng shui view of geography that Oenogok appears closed in two when viewed from the outside, but is cozy and deep and can be seen from afar once one enters. The whole village of Nogok was called Siksanjeongsa, which suggests that the territory was formed and expanded through the name. Fourth, the spatial composition of Siksanjeongsa can be divided into a banquet space, an education space, a support space, a rest space, and a vegetable and herbal garden. The banquet space, composed of Dang, Lu, and Yeonji, was the personal space of Manbu Lee, who contemplated the unity of heaven and man, the virtue of the gentleman, and humanity; it served as a place for lectures and a place to live. 
Fifth, the Yangjeongjae area is an educational space; the name Yangjeongjae is taken from Monggwa (the Meng hexagram) of the Book of Changes and expresses the wish that young students grow up bright and scholarly. Sixth, the support space, composed of Ganjijeong, Gobandae, and Sehandan, is a place where the forested area in the innermost part of Siksanjeongsa was cleared and a small pavilion was built, using natural standing stones and pine trees as a folding screen. It embodies the virtue and grace of stopping, the meaning of leisure, and the wisdom of a gentleman. Seventh, outside the wall of Siksanjeongsa, across the eastern stream, an altar called Yeonggwisa was built in a place with many old trees, and a place of rest was made by piling up oddly shaped stones and planting flowers. Eighth, Manbu Lee, who like other scholars of the Joseon Dynasty knew the effects of vegetables and medicinal herbs in detail, cultivated a vegetable garden and an herbal garden in the Jeongsa. Ninth, it can be seen that Manbu Lee realized the Neo-Confucian utopia in his political life by giving meaning to each space of Siksanjeongsa, naming the buildings and landscape facilities and planting according to ancient events.

The Development of the Korean Evaluation Scale for Hearing Handicap (KESHH) for the Geriatric Hearing Loss (노인성난청을 위한 청각장애평가지수(KESHH)의 개발)

  • Ku, Ho-Lim;Kim, Jin-Sook
    • 한국노년학 / v.30 no.3 / pp.973-992 / 2010
  • Hearing impairment is a representative disorder that affects the quality of daily life in old age. This study aimed to develop the Korean Evaluation Scale for Hearing Handicap (KESHH), with which the social and psychological effects of hearing impairment can be evaluated. Applied clinically, the scale makes it possible to analyze geriatric hearing loss specifically and to improve the quality of aural rehabilitation, which can ease the difficulties of hearing impairment. Data were collected from 288 participants (176 hearing aid users and 112 non-hearing aid users), and the average age of the participants was 67.4 years (60.15 for the hearing aid users and 78.9 for the non-hearing aid users). Male and female participants made up 58.0% and 42.0%, and extroverted and introverted personalities 49.3% and 50.7%, showing a balanced composition. The tentative draft of the KESHH was produced with 30 items and 5 subscales. Using factor analysis, 6 items were removed and 4 subscales (social effect, psycho/emotional effect, interpersonal effect, and perception of hearing aids) were identified. As each subscale consisted of 6 items, a total of 24 items were revised and retained. In conclusion, the KESHH was developed with 24 items and 4 subscales, with 6 items in each subscale. In addition, the KESHH was divided into type 1 and type 2 according to whether the respondent uses hearing aids. The results of this study can be summarized in the following 5 parts. First, the reliability of the KESHH proved to be high, as the subscales' Cronbach's alpha values ranged from 0.723 to 0.895. Second, the KESHH score increased systematically as hearing impairment increased. The lowest score was 24 and the highest was 117, and the average scores of the hearing impaired and the non-hearing impaired were 72.06 (SD=15.67) and 66.98 (SD=20.94), a 5.08-point higher score for the hearing impaired. 
By degree of hearing loss, the scores were 52.63 for mild hearing loss or below, 67.29 for moderate hearing loss, 71.89 for moderately severe hearing loss, and 75.57 for severe hearing loss. The comparison of scores by hearing level indicated that the higher the hearing level, the higher the KESHH score, with statistical significance (p<0.001). Third, the correlations among the 4 subscales were 0.384~0.880 (p<0.001). In addition, the correlations among the pure tone average, personality, and the four subscales were statistically significant at 0.148~0.880, except for those between the pure tone average and personality and between the pure tone average and perception of hearing aids. Fourth, the total variance explained for each subscale was analyzed with multiple regression. The pure tone average, personality, and hearing aid use status explained 17.4% of the social effect. The pure tone average, personality, and age explained 14.4% of the psycho/emotional effect. The pure tone average, personality, and hearing aid use status explained 11.2% of the interpersonal effect. Personality alone explained 2.2% of the perception of hearing aids. Finally, test-retest reliability proved to be high at 0.791 (p<0.001). In conclusion, the KESHH, developed with Korean culture in mind, can be a useful instrument for expressing as scores the hearing handicap of elderly Koreans with hearing impairment, for both hearing aid users and non-users. It is also considered clinically useful for identifying changes in hearing handicap scores before and after wearing hearing aids and aural rehabilitation in diverse situations.
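
The subscale reliabilities reported in this abstract are Cronbach's alpha values. As a minimal sketch of how such a coefficient is typically computed, the function below applies the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy response matrix are invented for illustration and are not KESHH data; a real subscale would have 6 items per respondent.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for both the items and the totals.
    """
    k = len(responses[0])
    item_variances = [variance([r[i] for r in responses]) for i in range(k)]
    total_variance = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up scores for 5 respondents on a 3-item subscale (illustration only).
toy_subscale = [
    [4, 5, 4],
    [2, 3, 2],
    [3, 3, 4],
    [5, 4, 5],
    [1, 2, 1],
]
alpha = cronbach_alpha(toy_subscale)
```

Values closer to 1 indicate that the items of a subscale vary together, which is the sense in which the reported range of 0.723 to 0.895 is read as high reliability.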