Search | Korea Science

<Field Action Report> Local Governance for COVID-19 Response of Daegu Metropolitan City (<사례보고> 코로나바이러스감염증-19 유행과 로컬 거버넌스 - 2020년 대구광역시 유행에 대한 대응을 중심으로 -)

Kyeong-Soo Lee;Jung Jeung Lee;Keon-Yeop Kim;Jong-Yeon Kim;Tae-Yoon Hwang;Nam-Soo Hong;Jun Hyun Hwang;Jaeyoung Ha
- Journal of agricultural medicine and community health, v.49 no.1, pp.13-36, 2024
Objectives: The purpose of this field case report is 1) to analyze the community's strategy and performance in responding to infectious diseases through the case of Daegu Metropolitan City's COVID-19 crisis response, 2) to interpret this case using governance theory and an infectious disease response governance framework, and 3) to propose a strategic model to prepare the community for future infectious disease outbreaks. Methods: Daegu Metropolitan City's infectious disease crisis response was analyzed through the researchers' participatory observation, together with a review of the COVID-19 White Paper of Daegu Metropolitan City, the Daegu Medical Association's COVID-19 White Paper, domestic and international literature on governance, and administrative documents. Results: Through the researchers' participatory observation and literature review, the following were derived: 1) establishment of leadership and a response system for the infectious disease crisis in Daegu Metropolitan City, 2) citizen participation and communication strategy through the pan-citizen response committee, 3) cooperation between Daegu Metropolitan City and the governance of public and private medical facilities, 4) decision-making and crisis response through participation and communication among the Daegu Metropolitan City Medical Association, the Medi-City Daegu Council, and private-sector medical experts, 5) symptom monitoring, patient triage strategies, and treatment of confirmed patients by members of the Daegu Medical Association, and 6) strategies and implications for establishing and utilizing a local infectious disease crisis response information system. Conclusions: The results of the study empirically demonstrate that collaborative governance of the community through the participation of citizens, private-sector experts, and community medical facilities is a key element of effective response to infectious disease crises.
https://doi.org/10.5393/JAMCH.2024.49.1.013

Comparative study of flood detection methodologies using Sentinel-1 satellite imagery (Sentinel-1 위성 영상을 활용한 침수 탐지 기법 방법론 비교 연구)

Lee, Sungwoo;Kim, Wanyub;Lee, Seulchan;Jeong, Hagyu;Park, Jongsoo;Choi, Minha
- Journal of Korea Water Resources Association, v.57 no.3, pp.181-193, 2024
The increasing atmospheric imbalance caused by climate change leads to increased precipitation, resulting in more frequent flooding. Consequently, there is a growing need for technology to detect and monitor flood events. To minimize flood damage, continuous monitoring is essential, and flooded areas can be detected with Synthetic Aperture Radar (SAR) imagery, which is not affected by weather conditions. The observed data undergo a preprocessing step that uses a median filter to reduce noise. Classification techniques were then employed to separate water bodies from non-water bodies, with the aim of evaluating the effectiveness of each method in flood detection. In this study, the Otsu method and the Support Vector Machine (SVM) technique were used for this classification, and the overall performance of the models was assessed using a confusion matrix. The suitability for flood detection was evaluated by comparing the Otsu method, an optimal threshold-based classifier, with SVM, a machine learning technique that minimizes misclassifications through training. The Otsu method was suitable for delineating boundaries between water and non-water bodies but exhibited a higher rate of misclassification due to the influence of mixed substances. Conversely, SVM produced a lower false positive rate and proved less sensitive to mixed substances. Consequently, SVM exhibited higher accuracy under non-flood conditions. While the Otsu method showed slightly higher accuracy than SVM in flood conditions, the difference in accuracy was less than 5% (Otsu: 0.93, SVM: 0.90). However, in pre-flooding and post-flooding conditions, the accuracy difference was more than 15% (Otsu: 0.77, SVM: 0.92), indicating that SVM is more suitable for water body and flood detection.
Based on the findings of this study, it is anticipated that more accurate detection of water bodies and floods could contribute to minimizing flood-related damages and losses.
https://doi.org/10.3741/JKWRA.2024.57.3.181
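
As a rough illustration of the threshold-based side of this comparison, the sketch below computes an Otsu threshold on a handful of made-up backscatter values and scores the resulting water mask with confusion-matrix counts. The sample values, the bin count, and the function names are assumptions for illustration and are not taken from the paper; the median filtering and the SVM classifier are not shown.

```python
def otsu_threshold(values, bins=64):
    """Return the histogram edge that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))
    best_t, best_score = lo, -1.0
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        # Proportional to the between-class variance for this split.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, lo + (i + 1) * width
    return best_t


def confusion_counts(predicted, actual):
    """Count true/false positives and negatives for boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    return tp, fp, fn, tn


# Hypothetical SAR backscatter values in dB: water tends to be darker than land.
backscatter = [-23.0, -22.5, -22.0, -21.5, -21.0, -11.0, -10.5, -10.0, -9.5, -9.0]
is_water = [True] * 5 + [False] * 5

threshold = otsu_threshold(backscatter)
predicted_water = [v < threshold for v in backscatter]
tp, fp, fn, tn = confusion_counts(predicted_water, is_water)
accuracy = (tp + tn) / len(backscatter)
```

On cleanly separated toy values like these a single threshold is enough; the mixed pixels the abstract mentions are exactly the case where such a threshold would start to misclassify.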

Evaluation of Park Service in Neighborhood Parks based on the Analysis of Walking Accessibility - Focused on Bundang-gu, Seongnam-si - (보행접근성 분석에 기반한 근린공원의 공원서비스 평가 - 성남시 분당구를 대상으로 -)

Hwang, Hae-Kwon;Son, Yong-Hoon
- Journal of the Korean Institute of Landscape Architecture, v.52 no.1, pp.59-70, 2024
As urbanization progresses, the demand for parks and green space is increasing. Parks and green spaces are important in the city because they are recognized as places where people can freely engage in outdoor activities. The park service area is a measure that shows the extent to which services are provided based on distance. Here the concept of accessibility plays an important role; walking in particular, as the most basic means of transportation, has a great influence on the use of parks. However, current park service area analysis focuses on discovering underserved areas, so detailed evaluation of the areas that do receive service is insufficient. This study seeks to evaluate park service areas based on pedestrian accessibility and the pedestrian network. Park services arise when users directly visit the park, so accessibility is expected to be reflected in usability. To quantify the pedestrian network, this study used space syntax to analyze pedestrian accessibility based on integration values. The integration value is an indicator that quantifies the level of accessibility of the pedestrian network; in this study, the higher the integration value, the higher the possibility of park use. The results of the study are as follows. First, Bundang-gu's park service area accounts for 43% and includes most sections with high pedestrian accessibility, but some sections with good pedestrian accessibility are excluded. This can be seen as a consequence of giving priority to residential, commercial, and business areas during the urban planning process and selecting park and green areas afterward. Second, the park service area and the pedestrian accessibility within it were classified by neighborhood unit for Bundang-gu. Differences appear among individual neighborhood units, and the availability of parks is expected to vary accordingly.
In addition, even in areas created during the same urban planning process, there were differences in the evaluation of park service areas according to pedestrian accessibility. This approach makes it possible to evaluate individual neighborhood units in a way that can be reflected in living area plans, and it can serve as a useful indicator for future park and green space policies.
https://doi.org/10.9715/KILA.2024.52.1.059

Epidemiological Characteristic and Risk Factor of COVID-19 Cluster Related to Educational Facilities in Gangwon-do, Korea (December 10, 2020-September 23, 2021) (강원도내 교육시설관련 코로나바이러스감염증19 집단발생의 역학적특성과 위험요인 (2020.12.10-2021.9.23))

Hyosug Choi;Mi Young Kim;Shinyoung Lee;Eunmi Kim;Yeo Jin Kim
- Pediatric Infection and Vaccine, v.31 no.1, pp.102-112, 2024
Purpose: To identify the epidemiological characteristics and risk factors of coronavirus disease 19 (COVID-19) outbreaks by type of educational facility by analyzing COVID-19 clusters associated with educational facilities. Methods: This study is based on the epidemiological investigation of COVID-19 clusters in Gangwon-do, Korea, from December 10, 2020 to September 23, 2021, as reported to the Korea Disease Control and Prevention Agency's Integrated Disease and Health Management System. The study population was 407 patients in 19 facilities, classified as clusters related to educational facilities. The preliminary epidemiological survey reports, in-depth epidemiological surveys by phone, and the risk assessments derived from field epidemiological investigation were retrospectively analyzed to evaluate infectivity and the characteristics of the risk factors. Results: There were a total of 407 confirmed patients related to 19 educational facilities, of whom 204 (50.1%) were students under the age of 19. Of the preceding spreaders, 155 (38.1%) were family members and 125 (30.7%) were teachers. The most common place of exposure to confirmed patients was the home, with 139 people (34.2%). Conclusions: Clusters related to educational facilities were caused more by family transmission than by risks within school facilities. Nevertheless, continuous efforts should be made to control infection in educational facilities, and teachers' adherence to personal hygiene principles for COVID-19 prevention in their daily lives should be strengthened.
https://doi.org/10.14776/piv.2024.31.e4

A Study on Nutritive Values and Salt Contents of Commercially Prepared Take-Out Boxed-Lunch In Korea (한국형 시판 도시락의 영양가 및 식염함량)

Kim, Bok-Hee;Lee, Eun-Wha;Kim, Won-Kyung;Lee, Yoon-Na;Kwak, Chung-Shil;Mo, Sumi
- Journal of Nutrition and Health, v.24 no.3, pp.230-242, 1991
This research was conducted on 10 take-out boxed lunches commercially prepared in department stores, chain stores, and public railroad trains in Korea. Sampling was conducted from February 1990 to March 1990. Nutritive values and sodium contents of the 10 boxed-lunch samples are summarized as follows: 1) The average weights (percentages) of the cooked rice and the side dishes were 304.6g (49.4%) and 312.4g (50.6%), respectively. These samples were significantly heavier than Japanese style boxed lunches. 2) The average number of side dishes was 12. The average numbers of food items classified by the five food groups were 6.1 in the protein food group, 0.3 in the calcium food group, 6.0 in the vitamin and mineral food group, 1.5 in the carbohydrate food group, and 1.5 in the oil and fat food group. 3) They contained on average 840.7kcal of energy, 38.9g of protein, 22.7g of fat, 120.4g of carbohydrate, 300.8mg of calcium, 410.8mg of phosphorus, 6.61mg of iron, 219.8 R.E. of vitamin A, 0.46mg of thiamin, 0.67mg of riboflavin, 10.5mg of niacin, and 27.5mg of ascorbic acid. Thus, except for vitamin A, the contents of all the nutrients were higher than 1/3 of the RDA for adults. 4) The high priced group (group 2) had higher protein, calcium, iron, and niacin contents than the cheaper group (group 1), probably because group 2 had more animal foods than group 1. 5) The average energy content per unit price (100 won) was 37.3kcal and the average protein content per unit price (100 won) was 1.64g. Korean style boxed lunches had higher energy and protein contents per unit price than Japanese style ones, and group 1 had higher contents than group 2. 6) The average energy proportions of protein, carbohydrate, and fat were 18.3%, 57.4%, and 24.3%, respectively. These proportions are acceptable.
7) The cooking methods for the side dishes, in decreasing order of frequency, were pan-frying, frying, braising, seasoning, kimchi, grilling, pickling, stir-frying, steaming, and fermenting. Generally simple cooking methods were used, so the menus lacked variety. 8) The colors of the side dishes, in decreasing order of frequency, were red, brown, yellow, green, black, and white. Too much red pepper was used. 9) The average capacities of the containers for the staples and the side dishes were 468.1ml and 590.6ml, respectively. The containers could not keep the food items well separated. 10) The average contents of sodium and salt were 2,287mg and 5.76g, ranging from 1,398mg to 3,489mg and from 3.53g to 8.80g, respectively. These values are much higher than the recommended amount of salt.
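
The reported energy proportions and salt figure can be roughly cross-checked from the mean values in the abstract. The sketch below assumes the general Atwater factors (4 kcal/g for protein and carbohydrate, 9 kcal/g for fat) and a sodium-to-salt factor of about 2.54; the abstract does not say which factors the authors used, so small differences from the published numbers are expected. It also reads the mean sodium content as 2,287 mg, which is the reading consistent with the stated range of 1,398 to 3,489 mg.

```python
# Assumed general Atwater factors (kcal per gram); not stated in the abstract.
KCAL_PER_GRAM = {"protein": 4.0, "carbohydrate": 4.0, "fat": 9.0}

# Assumed conversion: NaCl mass / Na mass is roughly 58.44 / 22.99.
SALT_PER_SODIUM = 2.54


def energy_shares(grams):
    """Return each macronutrient's share of total energy, in percent."""
    kcal = {name: g * KCAL_PER_GRAM[name] for name, g in grams.items()}
    total = sum(kcal.values())
    return {name: 100.0 * value / total for name, value in kcal.items()}


def salt_grams(sodium_mg):
    """Convert sodium in milligrams to salt (NaCl) equivalent in grams."""
    return sodium_mg * SALT_PER_SODIUM / 1000.0


# Mean contents per boxed lunch as reported in the abstract.
mean_grams = {"protein": 38.9, "carbohydrate": 120.4, "fat": 22.7}
shares = energy_shares(mean_grams)
salt = salt_grams(2287)

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}% of energy")
print(f"salt equivalent: {salt:.2f} g")
```

Under these assumptions the shares land within a few tenths of a percentage point of the reported 18.3%, 57.4%, and 24.3%, and the salt equivalent within about 0.05 g of the reported 5.76 g.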

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

Park, Jiae;Cho, Yoonho
- Journal of Intelligence and Information Systems, v.22 no.3, pp.143-163, 2016
The demographics of Internet users are the most basic and important source for target marketing or personalized advertising on digital marketing channels, which include email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are expensive, slow, and likely to include false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere on a webpage, the activity is logged in semi-structured website log files. Such data allows us to see what pages users visited, how long they stayed there, how often they visited, when they usually visited, which sites they prefer, what keywords they used to find the site, whether they purchased anything, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with the demographics. The variables include search keywords, frequency and intensity by time, day, and month, variety of websites visited, text information of web pages visited, and so on. The demographic attributes to predict also vary from paper to paper and cover gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision tree, neural network, logistic regression, and k-nearest neighbors, were used to build prediction models. However, this research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, the independent variables studied so far need to be reviewed, combined as needed, and evaluated in order to build the best prediction model.
The objective of this study is to choose the clickstream attributes most likely to be correlated with the demographics from the results of previous research, and then to identify which data mining method is best suited to predict each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job. From the results of previous research, 64 clickstream attributes are applied to predict the demographic attributes. The overall process of predictive model building is composed of 4 steps. In the first step, we create user profiles which include 64 clickstream attributes and 5 demographic attributes. The second step performs dimension reduction of the clickstream variables to address the curse of dimensionality and the overfitting problem. We utilize three approaches, based on decision tree, PCA, and cluster analysis. In the third step, we build alternative predictive models for each demographic variable. SVM, neural network, and logistic regression are used for modeling. The last step evaluates the alternative models in terms of model accuracy and selects the best model. For the experiments, we used clickstream data representing 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for our prediction process, and 5-fold cross validation was conducted to enhance the reliability of our experiments. The experimental results verify that there is a specific data mining method well suited to each demographic variable. For example, age prediction performs best when using decision tree based dimension reduction and a neural network, whereas the prediction of gender and marital status is most accurate when applying SVM without dimension reduction.
We conclude that the online behaviors of Internet users, captured through clickstream data analysis, can be used to predict their demographics and thereby be utilized in digital marketing.
https://doi.org/10.13088/jiis.2016.22.3.143
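
The last step of the process described above, comparing alternative models by cross-validated accuracy and keeping the best one, can be sketched in plain Python. This is only an illustration: the study used IBM SPSS Modeler with SVM, neural network, and logistic regression models, whereas the two toy classifiers, the single made-up feature, and all function names below are assumptions chosen to keep the example self-contained.

```python
import random
from collections import Counter


def k_fold_indices(n, k=5, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit, X, y, k=5):
    """Mean held-out accuracy of the model built by fit(X_train, y_train)."""
    scores = []
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / len(scores)


def fit_majority(X, y):
    """Baseline: always predict the most common training label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label


def fit_nearest(X, y):
    """1-nearest-neighbor on squared Euclidean distance."""
    def predict(x):
        j = min(range(len(X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)))
        return y[j]
    return predict


# Hypothetical user profiles: one clickstream feature, one demographic label.
X = [(float(i),) for i in range(20)]
y = ["A" if i < 10 else "B" for i in range(20)]

results = {
    "majority": cross_val_accuracy(fit_majority, X, y),
    "nearest": cross_val_accuracy(fit_nearest, X, y),
}
best = max(results, key=results.get)
```

Repeating this selection once per demographic attribute would mirror the paper's finding that the best-performing method differs by attribute.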

Methodology for Identifying Issues of User Reviews from the Perspective of Evaluation Criteria: Focus on a Hotel Information Site (사용자 리뷰의 평가기준 별 이슈 식별 방법론: 호텔 리뷰 사이트를 중심으로)

Byun, Sungho;Lee, Donghoon;Kim, Namgyu
- Journal of Intelligence and Information Systems, v.22 no.3, pp.23-43, 2016
As a result of the growth of Internet data and the rapid development of Internet technology, "big data" analysis has gained prominence as a major approach for evaluating and mining enormous data for various purposes. Especially, in recent years, people tend to share their experiences related to their leisure activities while also reviewing others' inputs concerning their activities. Therefore, by referring to others' leisure activity-related experiences, they are able to gather information that might guarantee them better leisure activities in the future. This phenomenon has appeared throughout many aspects of leisure activities such as movies, traveling, accommodation, and dining. Apart from blogs and social networking sites, many other websites provide a wealth of information related to leisure activities. Most of these websites provide information of each product in various formats depending on different purposes and perspectives. Generally, most of the websites provide the average ratings and detailed reviews of users who actually used products/services, and these ratings and reviews can actually support the decision of potential customers in purchasing the same products/services. However, the existing websites offering information on leisure activities only provide the rating and review based on one stage of a set of evaluation criteria. Therefore, to identify the main issue for each evaluation criterion as well as the characteristics of specific elements comprising each criterion, users have to read a large number of reviews. In particular, as most of the users search for the characteristics of the detailed elements for one or more specific evaluation criteria based on their priorities, they must spend a great deal of time and effort to obtain the desired information by reading more reviews and understanding the contents of such reviews. 
Although some websites break down the evaluation criteria and direct users to enter their reviews according to different levels of criteria, the excessive number of input sections makes the whole process inconvenient for users. Further, problems may arise if a user does not follow the instructions for the input sections or fills in the wrong input sections. Finally, treating the evaluation criteria breakdown as a realistic alternative is difficult, because identifying all the detailed criteria for each evaluation criterion is a challenging task. For example, when a review of a certain hotel is written, people tend to write only one-stage reviews for various components such as accessibility, rooms, services, or food. These may address the most frequently asked questions, such as the distance to the nearest subway station or the condition of the bathroom, but they still lack detailed information on these questions. In addition, if a breakdown of the evaluation criteria were provided along with various input sections, the user might fill in only the evaluation criterion for accessibility, or enter the wrong information, such as information about rooms under the criterion for accessibility. Thus, the reliability of the segmented review would be greatly reduced. In this study, we propose an approach to overcome the limitations of the existing leisure activity information websites, namely, (1) the reliability of reviews for each evaluation criterion and (2) the difficulty of identifying the detailed contents that make up the evaluation criteria. In our proposed methodology, we first identify the review content and construct the lexicon for each evaluation criterion by using the terms that are frequently used for each criterion. Next, the sentences in the review documents containing the terms in the constructed lexicon are decomposed into review units, which are then reconstructed by evaluation criterion.
Finally, the issues of the constructed review units are derived for each evaluation criterion and the summary results are provided. Apart from the derived issues, the review units themselves are also provided. This approach therefore aims to help users save time and effort, because they read only the relevant information they need for each evaluation criterion rather than going through the entire text of each review. Our proposed methodology is based on topic modeling, which is being actively used in text analysis. The review is decomposed into sentence units rather than treated as a single document unit. After being decomposed into individual review units, the review units are reorganized according to each evaluation criterion and then used in the subsequent analysis. In this respect the work differs substantially from existing topic modeling-based studies. In this paper, we collected 423 reviews from hotel information websites and decomposed these reviews into 4,860 review units. We then reorganized the review units according to six different evaluation criteria. By applying our methodology to these review units, we present the analysis results and demonstrate the utility of the proposed methodology.
https://doi.org/10.13088/jiis.2016.22.3.023
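
The decomposition step described above, splitting reviews into sentence-level review units and regrouping them by evaluation criterion through a lexicon, might look roughly like the following. The lexicon entries, the sample reviews, and the function names are hypothetical; the paper builds its lexicon from frequently used terms and then applies topic modeling to the grouped units, neither of which is reproduced here.

```python
import re
from collections import defaultdict

# Hypothetical lexicon: evaluation criterion -> terms that signal it.
LEXICON = {
    "accessibility": {"subway", "station", "airport", "walk"},
    "rooms": {"room", "bed", "bathroom"},
    "food": {"breakfast", "restaurant", "buffet"},
}


def split_sentences(review):
    """Split a review into sentences on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", review)
    return [p.strip() for p in parts if p.strip()]


def group_review_units(reviews, lexicon):
    """Assign each sentence to every criterion whose lexicon it touches."""
    units = defaultdict(list)
    for review in reviews:
        for sentence in split_sentences(review):
            tokens = set(re.findall(r"[a-z]+", sentence.lower()))
            for criterion, terms in lexicon.items():
                if tokens & terms:
                    units[criterion].append(sentence)
    return dict(units)


reviews = [
    "The subway station is a five minute walk. The bathroom was spotless!",
    "Breakfast buffet was great. Staff were kind.",
]
units = group_review_units(reviews, LEXICON)
```

Sentences that match no lexicon term, such as the one about the staff, are simply left out, which is one reason the quality of the lexicon matters in this kind of approach.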

The Pattern Analysis of Financial Distress for Non-audited Firms using Data Mining (데이터마이닝 기법을 활용한 비외감기업의 부실화 유형 분석)

Lee, Su Hyun;Park, Jung Min;Lee, Hyoung Yong
- Journal of Intelligence and Information Systems, v.21 no.4, pp.111-131, 2015
Research on pattern analysis of corporate distress is scarce compared with research on bankruptcy prediction. The few studies that exist mainly focus on audited firms because financial data collection is easier for these firms. In reality, however, corporate financial distress is a far more common and critical phenomenon for non-audited firms, which are mainly small and medium sized firms. The purpose of this paper is to classify non-audited firms under distress according to their financial ratios using a data mining technique, the Self-Organizing Map (SOM). SOM is a type of artificial neural network that is trained using unsupervised learning to produce a lower dimensional discretized representation of the input space of the training samples, called a map. SOM differs from other artificial neural networks in that it applies competitive learning as opposed to error-correction learning such as backpropagation with gradient descent, and in that it uses a neighborhood function to preserve the topological properties of the input space. It is one of the most popular and successful clustering algorithms. In this study, we classify types of financially distressed firms, specifically non-audited firms. In the empirical test, we collected 10 financial ratios of 100 non-audited firms under distress in 2004 for the previous two years (2002 and 2003). Using these financial ratios and the SOM algorithm, five distinct patterns were distinguished. In pattern 1, financial distress was very serious in almost all financial ratios; 12% of the firms were included in this pattern. In pattern 2, financial distress was weak in almost all financial ratios; 14% of the firms were included in pattern 2. In pattern 3, the growth ratio was the worst among all patterns. It is speculated that the firms of this pattern may be under distress due to severe competition in their industries. Approximately 30% of the firms fell into this group.
In pattern 4, the growth ratio was higher than in any other pattern, but the cash ratio and profitability ratio did not keep pace with the growth ratio. It is concluded that the firms of this pattern fell into distress while pursuing business expansion. About 25% of the firms were in this pattern. Last, pattern 5 encompassed very solvent firms. Perhaps firms of this pattern were distressed due to a bad short-term strategic decision or due to problems with the entrepreneurs running the firms. Approximately 18% of the firms were in this pattern. This study makes both an academic and an empirical contribution. Academically, it uses a data mining technique (Self-Organizing Map) to classify non-audited companies, which tend to go bankrupt easily and have unstructured or easily manipulated financial data, rather than large audited firms with well prepared and reliable financial data. Empirically, even though the analysis relies on the financial data of non-audited firms, it is useful for finding the first-order symptoms of financial distress, which makes it possible to forecast the bankruptcy of firms and to manage early warning and alert signals. The limitation of this research is that only 100 firms were analyzed, owing to the difficulty of collecting the financial data of non-audited firms, which makes it hard to carry out the analysis by category or firm size. In addition, non-financial qualitative data are crucial for the analysis of bankruptcy, so non-financial qualitative factors will be taken into account in the next study. This study sheds some light on future distress prediction for non-audited small and medium sized firms.
https://doi.org/10.13088/jiis.2015.21.4.111
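
The two SOM properties the abstract highlights, competitive learning (only the best matching unit and its neighbors are updated) and a neighborhood function over the map, can be seen in a minimal one-dimensional SOM. This is a sketch under stated assumptions: the map size, learning-rate and neighborhood schedules, and the two-ratio toy data are made up, and the paper's actual configuration for its 10 financial ratios is not given in the abstract.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the map node whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))


def train_som(data, n_nodes=3, epochs=200, lr0=0.5, sigma0=1.5, seed=0):
    """Train a 1-D SOM; nodes are arranged on a line indexed 0..n_nodes-1."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.3)    # shrinking neighborhood
        for x in rng.sample(data, len(data)):      # random presentation order
            bmu = best_matching_unit(weights, x)   # competitive step
            for j, w in enumerate(weights):
                # Gaussian neighborhood: nodes near the winner move more.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights


# Hypothetical firms described by two scaled ratios (e.g. growth, cash).
firms = [[0.10, 0.10], [0.12, 0.08], [0.90, 0.95], [0.88, 0.90]]
som = train_som(firms)
patterns = [best_matching_unit(som, f) for f in firms]
```

Each firm's pattern is the index of its best matching unit, so firms with similar ratios end up on the same or neighboring nodes.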

The Market Segmentation of Coffee Shops and the Difference Analysis of Consumer Behavior: A Case based on Caffe Bene (커피전문점의 시장세분화와 소비자행동 차이 분석 : 카페베네 사례를 중심으로)

Yu, Jong-Pil;Yoon, Nam-Soo
- Journal of Distribution Science, v.9 no.4, pp.5-13, 2011
This study analyzes the effectiveness of the domestic marketing strategies of the Korean coffee shop "Caffe Bene". It bases its evaluation on statistical outputs for "choice attributes," "market segmentation," "demographic characteristics," and "satisfaction differences." The results are summarized in four points. First, five choice attributes related to coffee shop selection behavior were extracted by factor analysis: price, atmosphere, comfort, taste, and location. Based on these five factors, cluster analysis was conducted, classifying customers into three major groups: atmosphere oriented, comfort oriented, and taste oriented. Second, discriminant analysis was used to test the cluster analysis and showed two discriminant functions: location and atmosphere. Third, cross-tabulation analysis showed distinctive demographic characteristics within the three groups. The atmosphere oriented group consisted mainly of people in their early twenties and women of all ages; they most often became aware of the shop by 'walking down the street' or 'through acquaintances', and mostly found the store through 'outdoor advertising' and 'introduction'. The comfort oriented group consisted mainly of women in their early twenties who were students or professionals, and appeared to be very loyal, recommending the shop to other customers more often than the other groups. The taste oriented group, unlike the other groups, consisted mainly of college graduates in their late twenties and showed low loyalty, with lower recommendation activity. Fourth, one-way ANOVA was conducted to analyze satisfaction differences. It shows that groups with high satisfaction in the five main factors also show high menu satisfaction and high overall satisfaction.
These results show that segmented marketing strategies are necessary, because customers consider price, atmosphere, comfort, taste, and location when they choose a coffee shop, and demographic attributes differ across the segmented groups. For example, the atmosphere oriented group is satisfied with shop interior and comfort but dissatisfied with price, because most of the customers in this group are in their early twenties and have limited financial means. Thus, price discount strategies tailored to individual situations through a CRM system are critical. The comfort oriented group shows a high satisfaction level for location and shop comfort. This group also includes many female customers in their early twenties, students, and self-employed people. Customers in this group show a high word of mouth tendency, so providing a positive brand image to these customers is important. In the taste oriented group, the scores for taste and location are high but the word of mouth score is low. This group is mainly composed of educated, professional customers in their late twenties; therefore, menu differentiation, better coffee taste, and price discrimination are critical to increasing customer satisfaction. However, it is hard to generalize the results of this study to other coffee shop brands, because the study examined only one domestic coffee shop, Caffe Bene. If future studies expand the scope of locations, brands, and occupations, they will provide more generalizable results. Finally, research on customer satisfaction with the menu, trust, loyalty, and switching cost will be important in future studies.

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
- Journal of Intelligence and Information Systems, v.21 no.2, pp.69-92, 2015
The explosion of social media data has led to the application of text-mining techniques to analyze big social media data in a more rigorous manner. Although social media text analysis algorithms have improved, previous approaches to social media text analysis have some limitations. In the field of sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common approach. Some studies have added grammatical factors to the feature sets used to train classification models. The other approach applies semantic analysis methods to sentiment analysis, but this approach has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to deal with more extensive semantic features that were underestimated in existing sentiment analysis. The result from the Word2Vec algorithm is compared with the result from co-occurrence analysis to identify the difference between the two approaches. The results show that, among the related words extracted, those expressing some emotion about the keyword were three times more numerous with the Word2Vec algorithm than with co-occurrence analysis. The difference between the two results comes from Word2Vec's vectorization of semantic features. It is therefore possible to say that the Word2Vec algorithm is able to catch hidden related words that have not been found by traditional analysis. In addition, Part Of Speech (POS) tagging for Korean is used to detect adjectives as "emotional words" in Korean. The emotional words extracted from the text are then converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the "emotional word" in the sentence.
The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets used in the study were collected by searching with three keywords: professor, prosecutor, and doctor, because these keywords attract rich public emotion and opinion. Preliminary data collection was conducted to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin hae-chul sky hospital, drinking and plastic surgery, rebate), Prosecutor (lewd behavior, sponsor). The text data comprise about 100,000 items (Professor: 25,720, Doctor: 35,110, Prosecutor: 43,225), gathered from news, blogs, and Twitter to reflect various levels of public emotion in the text data analysis. Gephi (http://gephi.github.io) was used for visualization, and all programs used in text processing and analysis were written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, finding Emotion Triggers can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to say that a word extracted by Emotion Trigger processing has a significant causal relationship with the emotional word in a sentence. Future studies will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, the text data used in Emotion Trigger are from Twitter, so the data have a number of distinct features which we did not deal with in this study.
These features will be considered in further study.
https://doi.org/10.13088/jiis.2015.21.2.69
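
The contrast the abstract draws between co-occurrence analysis and vector-based relatedness can be illustrated with a small sketch. Co-occurrence only finds words that appear in the same sentence as the emotional word, while vector similarity can surface words that never co-occur with it. The tokenized sentences and the two-dimensional vectors below are made up for illustration; in the study the vectors come from a trained Word2Vec model and the tokens from Korean POS tagging, neither of which is reproduced here.

```python
import math
from collections import Counter


def cooccurring_words(sentences, target, top_n=3):
    """Count words that share a sentence with the target word."""
    counts = Counter()
    for tokens in sentences:
        if target in tokens:
            counts.update(t for t in set(tokens) if t != target)
    return counts.most_common(top_n)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(vectors, target, top_n=3):
    """Rank the other words by cosine similarity to the target's vector."""
    scores = [(word, cosine(vectors[target], vec))
              for word, vec in vectors.items() if word != target]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical tokenized sentences (already POS-filtered).
sentences = [
    ["angry", "professor", "scandal"],
    ["angry", "scandal", "report"],
    ["calm", "doctor"],
]

# Hypothetical word vectors standing in for a trained Word2Vec model.
vectors = {
    "angry": [1.0, 0.0],
    "furious": [0.9, 0.1],
    "calm": [-1.0, 0.0],
}

by_cooccurrence = cooccurring_words(sentences, "angry", top_n=1)
by_vector = most_similar(vectors, "angry", top_n=1)
```

Here "furious" never shares a sentence with "angry", so only the vector-based ranking can relate the two words.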
