Search | Korea Science

A Study of Pre-trained Language Models for Korean Language Generation (한국어 자연어생성에 적합한 사전훈련 언어모델 특성 연구)

Song, Minchae;Shin, Kyung-shik
- Journal of Intelligence and Information Systems / v.28 no.4 / pp.309-328 / 2022
This study empirically analyzed Korean pre-trained language models (PLMs) designed for natural language generation. The performance of two PLMs, BART and GPT, on the task of abstractive text summarization was compared. To investigate how performance depends on the characteristics of the inference data, ten different document types, containing six types of informational content and creation content, were considered. It was found that BART (which can both generate and understand natural language) performed better than GPT (which can only generate). Upon more detailed examination of the effect of inference data characteristics, the performance of GPT was found to be proportional to the length of the input text. However, even for the longest documents (with optimal GPT performance), BART still outperformed GPT, suggesting that the greatest influence on downstream performance is not the size of the training data or of the PLM's parameters but the structural suitability of the PLM for the applied downstream task. The performance of the PLMs was also compared by analyzing part-of-speech (POS) shares. BART's performance was inversely related to the proportion of prefixes, adjectives, adverbs, and verbs but positively related to that of nouns. This result emphasizes the importance of taking the inference data's characteristics into account when fine-tuning a PLM for its intended downstream task.
https://doi.org/10.13088/jiis.2022.28.4.309 KSCI
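The POS-share comparison mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the text has already been tagged by some morphological analyzer; the tag names, the `pos_shares` helper, and the toy document are hypothetical and are not taken from the study.

```python
from collections import Counter


def pos_shares(tagged_tokens):
    """Return each POS tag's share of all tokens as a fraction in [0, 1]."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


# Hypothetical, hand-tagged toy document (not data from the study).
doc = [
    ("model", "NOUN"),
    ("summary", "NOUN"),
    ("generates", "VERB"),
    ("quickly", "ADV"),
]

shares = pos_shares(doc)
print(shares["NOUN"])  # 0.5
```

Shares computed this way for each document could then be related to a per-document summarization score, which is roughly the kind of relationship the abstract reports.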

Affective Responses to ASMR Using Multidimensional Scaling and Classification (다차원척도법과 분류분석을 이용한 ASMR에 대한 정서표상)

Kim, Hyeonjung;Kim, Jongwan
- Science of Emotion and Sensibility / v.25 no.3 / pp.47-62 / 2022
Previous emotion studies revealed the two core affective dimensions of valence and arousal using affect-eliciting stimuli, such as pictures, music, and videos. Autonomous sensory meridian response (ASMR), a type of stimulus that has emerged recently, produces a sense of psychological stability and calmness. We explored whether ASMR could be represented on the core affect dimensions. In this study, we used three affective types of ASMR (negative, neutral, and positive) as stimuli. Auditory ASMR videos were used in Study 1, while auditory and audiovisual ASMR videos were used in Study 2. Participants were asked to rate how they felt on ten adjectives using five-point Likert scales. Multidimensional scaling (MDS) and classification analyses were performed. The results of the MDS showed that distinctions between auditory and audiovisual ASMR videos were represented well in the valence dimension. Additionally, the results of the classification showed that affective conditions could be predicted within and across individuals for within- and cross-modalities. Thus, we confirmed that the affective representations for individuals could be predicted and that the affective representations were consistent between individuals. These results suggest that ASMR videos, like other affect-eliciting videos, are also located in the core affect dimension space, supporting the core affect theory (Russell, 1980).
https://doi.org/10.14695/KJSOS.2022.25.3.47 KSCI

Assessment of Visual Landscape Image Analysis Method Using CNN Deep Learning - Focused on Healing Place - (CNN 딥러닝을 활용한 경관 이미지 분석 방법 평가 - 힐링장소를 대상으로 -)

Sung, Jung-Han;Lee, Kyung-Jin
- Journal of the Korean Institute of Landscape Architecture / v.51 no.3 / pp.166-178 / 2023
This study aims to introduce and assess CNN deep learning methods for analyzing visual landscape images on social media, which embed user perceptions and experiences. The study analyzed visual landscape images by focusing on healing places. Seven adjectives related to healing were selected through text mining and a review of previous studies. Subsequently, 50 evaluators were recruited to build a deep learning image dataset. Evaluators were asked to collect the three images most suitable for 'healing', 'healing landscape', and 'healing place' on portal sites. The collected images were refined and a data augmentation process was applied to build a CNN model. After that, 15,097 images of 'healing' and 'healing landscape' on portal sites were collected and classified to analyze the visual landscape of a healing place. As a result, 'quiet' was the most frequent category, excluding 'other' and 'indoor', with 2,093 images (22%), followed by 'open', 'joyful', 'comfortable', 'clean', 'natural', and 'beautiful'. The study found that CNN deep learning is an analysis method that can derive results from visual landscape images. It also suggests that the method can supplement existing visual landscape analysis methods, and that establishing a landscape image learning dataset would enable more in-depth and diverse visual landscape analysis in the future.
https://doi.org/10.9715/KILA.2023.51.3.166

A Study on the Fraud Detection in an Online Second-hand Market by Using Topic Modeling and Machine Learning (토픽 모델링과 머신 러닝 방법을 이용한 온라인 C2C 중고거래 시장에서의 사기 탐지 연구)

Dongwoo Lee;Jinyoung Min
- Information Systems Review / v.23 no.4 / pp.45-67 / 2021
As the transaction volume of the C2C second-hand market grows, the number of frauds, which seek unfair gains by sending products different from those specified or by not sending them to buyers at all, is also increasing. This study explores a model that can identify frauds in the online C2C second-hand market by examining transaction postings. To this end, this study collected 145,536 field data records from an actual C2C second-hand market. The model is then built with characteristics from postings, such as the topic and the linguistic characteristics of the product description, and the characteristics of products, postings, sellers, and transactions. The constructed model is trained with the machine learning algorithm XGBoost. The final analysis results show that fraudulent postings have less information, which is also less specific, fewer nouns and images, a higher ratio of numbers and white space, and a shorter length than genuine postings do. Also, while genuine postings focus on product information for nouns, delivery information for verbs, and actions for adjectives, fraudulent postings did not show those characteristics. This study shows that various features can be extracted from postings written in C2C second-hand transactions and used to construct an effective model for detecting fraud. The proposed model can also be considered and applied to other C2C platforms. Overall, the model proposed in this study can be expected to have positive effects on suppressing and preventing fraudulent behavior in online C2C markets.
https://doi.org/10.14329/isr.2021.23.4.045
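Some of the surface features named in the abstract above (posting length, ratio of numbers, ratio of white space) could be computed along the following lines. This is a sketch under stated assumptions: the `posting_features` helper and the sample text are hypothetical, "ratio of numbers" is read here as the share of digit characters, and the study's actual feature definitions are not given in the abstract.

```python
def posting_features(text):
    """Compute simple surface features of a posting's description."""
    length = len(text)
    if length == 0:
        # Avoid division by zero for empty descriptions.
        return {"length": 0, "digit_ratio": 0.0, "whitespace_ratio": 0.0}
    digits = sum(ch.isdigit() for ch in text)
    spaces = sum(ch.isspace() for ch in text)
    return {
        "length": length,
        "digit_ratio": digits / length,
        "whitespace_ratio": spaces / length,
    }


# Hypothetical posting text, for illustration only.
features = posting_features("call 010 1234")
print(features["length"])  # 13
```

Feature dictionaries like this one would typically be combined with topic and seller features and passed to a classifier such as XGBoost; that step is omitted here.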

Analysis of Urban-to-Rural Migrants' Perceptions of the 'Everyday Landscape' Using Diary-Based Text Mining (일기를 통해 본 귀농·귀촌인 '일상 경관' 인식 - 텍스트 마이닝 적용 -)

OH Jungshim
- Korean Journal of Heritage: History & Science / v.57 no.3 / pp.184-199 / 2024
This study was conducted in response to the global trend of emphasizing the importance of "everyday landscapes", focusing on the perspective of those who have returned to rural life. With a focus on the case of Gokseong-gun in Jeollanam-do, 460 diaries written by these individuals were collected and analyzed using text mining techniques such as "frequency analysis", "topic modeling", and "sentiment analysis". The analysis of noun morphemes was interpreted from a cognitive aspect, while adjective morphemes were interpreted from an emotional aspect. In particular, this study applied semantic network analysis to overcome the limitations of existing sentiment analysis, and extracted a word network list and examined the content of nouns connected to adjectives that express emotions to identify the targets and contents of sentiments. This method represents a differentiated approach that is not commonly found in existing research. One of the intriguing findings is that the urban-to-rural migrants identified everyday landscapes such as "flowers on neighborhood walking paths", "harvest of a garden", "neighborhood events", and "cozy cafe spaces" as important. These elements all contain visual and enjoyable aspects of everyday landscapes. Currently, many rural villages are attempting to add visual elements to their everyday landscapes by unifying roof colors or painting murals on walls. However, such artificial measures do not necessarily leave a lasting impression on people. A critical review of current policies and systems is necessary. This research is significant because it is the first to study everyday landscapes from the perspective of urban-to-rural migration using diaries and text mining. With a lack of domestic research on everyday landscapes, this study hopes to contribute to the activation of related research in Korea.
https://doi.org/10.22755/kjchs.2024.57.3.184

Stage Costume Design for Performance Hamlet (II) - The Study on Pattern and Manufactured Product - (햄릿 공연을 위한 무대의상 디자인 (II) - 패턴 및 실물제작 -)

Kim, Soon-Ku;Hwang, Seong-Won
- Fashion & Textile Research Journal / v.6 no.1 / pp.41-50 / 2004
This research proposes stage costumes for Shakespeare's play Hamlet as performed by Yunheedan Guhri Pae (the Street Theater Troupe). Stage costumes play an important role in conveying the characteristics of each character to the audience and have strong visual effects. To design the costumes from the objective viewpoint of the audience, a survey on the images of the characters was conducted among people who had actually watched the performance, and costume designs were proposed according to the survey results. Hamlet a: The result was applied to propose a black sweater, black leather pants, and a vest. Hamlet b: The result was applied to propose a hooded purple coat of medium brightness and color spectrum, and a yellow coat. For a free image, loose blue pants and a vest in the same color tone were proposed. Gertrude a: The result was applied to propose a dress based on a tailored suit, in purple (violet) with a reddish tone. Gertrude b: The result was applied to propose a purple gown and a one-piece dress with black lace. Ophelia a: The result was applied to propose a feminine white dress and a cape in a purple tone. Ophelia b: The result was applied to propose dyed and woven clothes. Through these surveys, the image of each character was derived as adjectives, and color images were proposed using the results derived for brightness, coloration, and color. A single costume cannot make up the stage costumes as a whole, and because costume exists as one element of stage production, it is limited in some respects. However, that limit can become the motive of the costume. Although designers cannot always produce costumes exactly as designed, I believe the core of stage costume is to display the characteristics of the characters according to the given concept.
A limitation of this research is that, because the costumes were designed to fit conditions already given, it was difficult to regard the design and production process as a project carried out through interaction. In the future, if possible, I hope that joint research with those responsible for stage art will take place as practical stage art. Because the costumes were produced for an actual performance, practical costumes could be made, and producing them with the dance steps, lines of movement, and acting in mind reduced trial and error on stage. Through this research, I felt that understanding of and smooth interaction with diverse areas beyond costume design are needed, and I believe this research proposes a new research method, since there has been little previous research on stage costumes for actual performances. This research depended on audience surveys to ensure objectivity; I hope it can contribute to defining effective processes and methods for stage costumes, alongside more active research using diverse methods in diverse areas. I regret that costumes for all the characters and all the scenes in Hamlet could not be produced due to many limitations. As a follow-up research task, I plan to design the costumes for all the scenes.
KSCI

Analyzing Contextual Polarity of Unstructured Data for Measuring Subjective Well-Being (주관적 웰빙 상태 측정을 위한 비정형 데이터의 상황기반 긍부정성 분석 방법)

Choi, Sukjae;Song, Yeongeun;Kwon, Ohbyung
- Journal of Intelligence and Information Systems / v.22 no.1 / pp.83-105 / 2016
Measuring an individual's subjective wellbeing in an accurate, unobtrusive, and cost-effective manner is a core success factor of the wellbeing support system, which is a type of medical IT service. However, measurements with a self-report questionnaire and wearable sensors are cost-intensive and obtrusive when the wellbeing support system should be running in real-time, despite being very accurate. Recently, reasoning the state of subjective wellbeing with conventional sentiment analysis and unstructured data has been proposed as an alternative to resolve the drawbacks of the self-report questionnaire and wearable sensors. However, this approach does not consider contextual polarity, which results in lower measurement accuracy. Moreover, there is no sentimental word net or ontology for the subjective wellbeing area. Hence, this paper proposes a method to extract keywords and their contextual polarity representing the subjective wellbeing state from the unstructured text in online websites in order to improve the reasoning accuracy of the sentiment analysis. The proposed method is as follows. First, a set of general sentimental words is proposed. SentiWordNet was adopted; this is the most widely used dictionary and contains about 100,000 words such as nouns, verbs, adjectives, and adverbs with polarities from -1.0 (extremely negative) to 1.0 (extremely positive). Second, corpora on subjective wellbeing (SWB corpora) were obtained by crawling online text. A survey was conducted to prepare a learning dataset that includes an individual's opinion and the level of self-report wellness, such as stress and depression. The participants were asked to respond with their feelings about online news on two topics. Next, three data sources were extracted from the SWB corpora: demographic information, psychographic information, and the structural characteristics of the text (e.g., the number of words used in the text, simple statistics on the special characters used). 
These were considered to adjust the level of a specific SWB. Finally, a set of reasoning rules was generated for each wellbeing factor to estimate the SWB of an individual based on the text written by the individual. The experimental results suggested that using contextual polarity for each SWB factor (e.g., stress, depression) significantly improved the estimation accuracy compared to conventional sentiment analysis methods incorporating SentiWordNet. Even though literature is available on Korean sentiment analysis, such studies used only a limited set of sentimental words. Due to the small number of words, many sentences are overlooked and ignored when estimating the level of sentiment. However, the proposed method can identify multiple sentiment-neutral words as sentiment words in the context of a specific SWB factor. The results also suggest that a specific type of senti-word dictionary containing contextual polarity needs to be constructed along with a dictionary based on common sense such as SenticNet. These efforts will enrich and enlarge the application area of sentic computing. The study is helpful to practitioners and managers of wellness services in that a couple of characteristics of unstructured text have been identified for improving SWB. Consistent with the literature, the results showed that gender and age affect the SWB state when the individual is exposed to an identical cue from the online text. In addition, the length of the textual response and the usage pattern of special characters were found to indicate the individual's SWB. These findings imply that better SWB measurement should involve collecting the textual structure and the individual's demographic conditions. In the future, the proposed method should be improved by automated identification of contextual polarity in order to enlarge the vocabulary in a cost-effective manner.
https://doi.org/10.13088/jiis.2016.22.1.083 KSCI
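The idea of contextual polarity described in the abstract above can be sketched as a general lexicon with polarities in [-1.0, 1.0] plus a context-specific lexicon that overrides or extends it. Everything in this sketch is an assumption made for illustration: the `score_text` function, the toy lexicons, and the averaging rule are hypothetical and do not reproduce the paper's reasoning rules or SentiWordNet's actual entries.

```python
def score_text(tokens, general_lexicon, context_lexicon=None):
    """Average the polarity of the tokens that have a known polarity.

    Context-specific entries override the general lexicon, so a word that is
    neutral or missing in general can still count for a given factor.
    """
    context_lexicon = context_lexicon or {}
    scores = []
    for tok in tokens:
        if tok in context_lexicon:
            scores.append(context_lexicon[tok])
        elif tok in general_lexicon:
            scores.append(general_lexicon[tok])
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy lexicons (not SentiWordNet values).
general = {"good": 0.5, "bad": -0.5}
stress_context = {"deadline": -0.5}  # neutral in general, negative for stress

tokens = ["deadline", "good", "tomorrow"]
print(score_text(tokens, general))                  # 0.5
print(score_text(tokens, general, stress_context))  # 0.0
```

In this toy case the word "deadline" is ignored by the general lexicon but counted in the stress context, which changes the estimate; this is the kind of effect the abstract attributes to contextual polarity.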

A Study on Analysis of consumer perception of YouTube advertising using text mining (텍스트 마이닝을 활용한 Youtube 광고에 대한 소비자 인식 분석)

Eum, Seong-Won
- Management & Information Systems Review / v.39 no.2 / pp.181-193 / 2020
This study analyzes consumer perception using text mining, a topic of recent interest. We analyzed consumers' perception of Samsung Galaxy by analyzing consumer reviews of Samsung Galaxy YouTube ads. For the analysis, 1,819 consumer reviews of YouTube ads were extracted. Through data pre-processing, keywords for the advertisements were classified and extracted as nouns, adjectives, and adverbs. After that, frequency analysis and sentiment analysis were performed. Finally, clustering was performed through CONCOR. The findings of this study are summarized as follows. First, the most frequently mentioned words were Galaxy Note (n = 217), Good (n = 135), Pen (n = 40), and Function (n = 29). These words suggest that consumers regard the functional aspects of Samsung mobile phone products favorably and perceive the Note pen positively. In addition, the appearance of "Samsung Pay", "Innovation", "Design", and "iPhone" shows that Samsung's mobile phone is highly regarded for its innovative design and the functional aspects of Samsung Pay. Second, in the sentiment analysis of the YouTube advertising, the positive share of emotional intensity (75.95%) was higher than the negative share (24.05%). This means that consumers perceive Samsung Galaxy mobile phones positively. In the sentiment keyword analysis, the positive keywords extracted included "good", "good", "innovative", "highest", "fast", and "pretty", and the negative keywords included "frightening", "I want to cry", "discomfort", "sorry", and "no". The implication of this study is that most existing studies of consumer perception of advertisements have relied on quantitative analysis methods. In this study, we departed from quantitative research methods for advertising and attempted to analyze consumer perception through qualitative research.
This is expected to have a great influence on future research, and I am sure that it will be a starting point for consumer awareness research through qualitative research.
https://doi.org/10.29214/damis.2020.39.2.011 KSCI

A Study on Developing Sensibility Model for Visual Display (시각 디스플레이에서의 감성 모형 개발 -움직임과 색을 중심으로-)

임은영;조경자;한광희
- Korean Journal of Cognitive Science / v.15 no.2 / pp.1-15 / 2004
The structure of sensibility from motion was developed in order to understand the relationship between sensibilities and physical factors and to apply it to dynamic visual displays. Seventy adjectives were collected by assessing their adequacy for expressing sensibilities from motion and by reporting sensibilities recalled from dynamic displays in achromatic color. Various motion displays with a single moving dot were rated according to the degree of sensibility corresponding to each adjective, on the basis of the Semantic Differential (SD) method. The ratings were analyzed by factor analysis to reduce the 70 words to 19 fundamental sensibilities from motion. The Multidimensional Scaling (MDS) technique constructed the sensibility space of motion, in which the 19 sensibilities were distributed along two dimensions, active-passive and bright-dark. Motion types that varied systematically in kinematic factors were placed in the two-dimensional space of motion sensibility in order to analyze important variables affecting sensibility from motion. The placement patterns indicate that speed, and both the cycle and amplitude of trajectories, tend to partially determine sensibility. Although color and motion both affected sensibility across the dimensions, their combination appeared to give each a dominant effect on one sensibility dimension: motion on active-passive and color on bright-dark.

Building a Korean Sentiment Lexicon Using Collective Intelligence (집단지성을 이용한 한글 감성어 사전 구축)

An, Jungkook;Kim, Hee-Woong
- Journal of Intelligence and Information Systems / v.21 no.2 / pp.49-67 / 2015
Recently, the emergence of big data and social media has brought us into a big bang of data. Social networking services are widely used by people around the world, and they have become major communication tools for all ages. Over the last decade, as online social networking sites have become increasingly popular, companies have tended to focus on advanced social media analysis for their marketing strategies. In addition to social media analysis, companies are mainly concerned about the propagation of negative opinions on social networking sites such as Facebook and Twitter, as well as e-commerce sites. Online word of mouth (WOM), such as product ratings, product reviews, and product recommendations, is very influential, and negative opinions have a significant impact on product sales. This trend has increased researchers' attention to natural language processing tasks such as sentiment analysis. Sentiment analysis, also referred to as opinion mining, is a process of identifying the polarity of subjective information and has been applied to various research and practical fields. However, there are obstacles when the Korean language (Hangul) is used in natural language processing, because it is an agglutinative language whose rich morphology poses problems. As a result, there is a lack of Korean natural language processing resources such as sentiment lexicons, and this has resulted in significant limitations for researchers and practitioners who are considering sentiment analysis. Our study builds a Korean sentiment lexicon with collective intelligence, and provides an API (Application Programming Interface) service to open and share the sentiment lexicon data with the public (www.openhangul.com). For pre-processing, we created a Korean lexicon database with over 517,178 words and classified them into sentiment and non-sentiment words.
In order to classify them, we first identified stop words, which are quite likely to play a negative role in sentiment analysis, and excluded them from our sentiment scoring. In general, sentiment words are nouns, adjectives, verbs, and adverbs, as they carry sentimental expressions such as positive, neutral, and negative. On the other hand, non-sentiment words are interjections, determiners, numerals, postpositions, etc., as they generally carry no sentimental expressions. To build a reliable sentiment lexicon, we adopted the concept of collective intelligence as a model for crowdsourcing. In addition, the concept of folksonomy was implemented in the taxonomy process to support collective intelligence. To make up for an inherent weakness of folksonomy, we adopted majority rule by building a voting system. Participants, as voters, were offered three voting options (positivity, negativity, and neutrality), and the voting was conducted on one of the largest social networking sites for college students in Korea. More than 35,000 votes have been cast by college students in Korea, and we keep this voting system open by maintaining the project as a perpetual study. Moreover, any change in the sentiment score of words can be an important observation because it enables us to keep track of temporal changes in Korean as a natural language. Our study also offers a RESTful, JSON-based API service through a web platform to make support easier for users such as researchers, companies, and developers. Finally, our study makes important contributions to both research and practice. In terms of research, our Korean sentiment lexicon plays an important role as a resource for Korean natural language processing. In terms of practice, practitioners such as managers and marketers can implement sentiment analysis effectively by using the Korean sentiment lexicon we built.
Moreover, our study sheds new light on the value of folksonomy by combining it with collective intelligence, and we also expect it to give a new direction and a new start to the development of Korean natural language processing.
https://doi.org/10.13088/jiis.2015.21.2.49 KSCI
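The voting step described in the abstract above (three options per word, resolved by majority rule) might look roughly like the sketch below. The `majority_label` function is hypothetical, and it assumes a strict majority (more than half of the votes), leaving a word unlabeled otherwise; the abstract does not say how ties or split votes were actually handled, and this is not the project's API.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def majority_label(votes):
    """Return the label chosen by more than half of the votes, else None."""
    for vote in votes:
        if vote not in LABELS:
            raise ValueError(f"unknown label: {vote!r}")
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # Strict majority: the winning label needs more than half of all votes.
    return label if count > len(votes) / 2 else None


# Hypothetical votes for a single word, for illustration only.
print(majority_label(["positive", "positive", "neutral"]))  # positive
print(majority_label(["positive", "negative"]))             # None
```

Returning `None` for split votes keeps undecided words out of the lexicon until more votes arrive, which fits a voting system that stays open, though a plurality rule would be an equally plausible reading.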

Search Result 435, Processing Time 0.026 seconds
