• Title/Summary/Keyword: web crawling (웹 크롤링)

Revisiting the cause of unemployment problem in Korea's labor market: The job seeker's interests-based topic analysis (취업준비생 토픽 분석을 통한 취업난 원인의 재탐색)

  • Kim, Jung-Su; Lee, Suk-Jun
    • Management & Information Systems Review / v.35 no.1 / pp.85-116 / 2016
  • This study explores the causes of employment difficulty on the basis of job applicants' interests, from a P-E (person-environment) fit perspective. The approach relies on text analysis to reveal applicants' situational interests during a job search amid labor market change. To investigate the main types of interest and psychological response, user-generated texts posted between January 1, 2013 and December 31, 2015 were collected by crawling an online community for sharing job-seeking information and opinions. Topic analysis indicated that users' primary interests fell into four types: perception of vocational expectations, employment pre-preparation behaviors, perception of the labor market, and job-seeking stress. Specifically, job applicants were concerned mainly with monetary reward and form of employment rather than with work values or career exploration, and young applicants expressed their psychological responses in contextualized language (e.g., slang, vulgarisms) that projected their unstable state under uncertainty in response to environmental changes. They also perceived a restricted set of preparation activities (e.g., certification, English exams) as determinants of employment success, and suffered from job-seeking stress. On the basis of these findings, the study attributes current unemployment problems entirely to a failure to pursue the value of vocation and work among individuals, organizations, and society. Concretely, job seekers are preoccupied with the social prestige of occupations and have not settled their vocational values. Most companies, for their part, do not recognize the importance of human resources and have overlooked the need to develop work environments that stimulate individual motivation. By reinterpreting the effect of the environment and classifying job applicants' interests with reference to linguistic and psychological theories, the study offers a more comprehensive understanding of this social issue and suggests new directions for future research on job applicants' psychological factors (e.g., attitudes, motivation) using topic analysis.

Multi-Category Sentiment Analysis for Social Opinion Related to Artificial Intelligence on Social Media (소셜 미디어 상에서의 인공지능 관련 사회적 여론에 대한 다 범주 감성 분석)

  • Lee, Sang Won; Choi, Chang Wook; Kim, Dong Sung; Yeo, Woon Young; Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.51-66 / 2018
  • As AI (artificial intelligence) technologies have evolved rapidly, many products and services are being developed in various fields to improve the user experience. Alongside this advance, the negative effects of AI technologies are being actively discussed, even as positive expectations persist. For instance, social issues such as the trolley dilemma and system security are being debated, while autonomous vehicles based on artificial intelligence have drawn attention for their potential to increase stability. It is therefore necessary to examine the major social issues surrounding artificial intelligence to support its development and societal acceptance. In this paper, multi-category sentiment analysis is conducted on online public opinion about artificial intelligence, after identifying the trending AI-related topics over the two years from January 2016 to December 2017, a period that includes the match between Lee Sedol and AlphaGo. Online news articles, news headlines, and news comments were crawled from the largest web portal in South Korea. Given the importance of trending topics, online public opinion was analysed by topic across seven sentiment categories (anger, dislike, fear, happiness, neutrality, sadness, and surprise) rather than as a simple positive or negative split. The top sentiment was "happiness" for most events, yet sentiments differed by keyword. In addition, when the research period was divided into four periods (the first and second halves of 2016 and the first and second halves of 2017), the sentiment of "anger" was found to decrease over time. The results make it possible to grasp the topics and trends currently discussed around artificial intelligence, and they can be used to prepare countermeasures. In future work, we hope to measure public opinion more precisely by incorporating the empathy level of news comments.
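
The period comparison described in this abstract can be illustrated with a small sketch. It assumes that each crawled comment has already been given one of the seven categories by some classifier (the abstract does not say how), and it only tallies category shares per half-year. The function names and the sample rows are hypothetical and are not data from the paper.

```python
from collections import Counter, defaultdict
from datetime import date

# The seven sentiment categories named in the abstract.
CATEGORIES = ("anger", "dislike", "fear", "happiness",
              "neutrality", "sadness", "surprise")

def half_year(d: date) -> str:
    """Map a date to a half-year label such as '2016-H1' or '2017-H2'."""
    return f"{d.year}-H{1 if d.month <= 6 else 2}"

def sentiment_share_by_period(labeled_comments):
    """Return {period: {category: share}} from (date, category) pairs."""
    counts = defaultdict(Counter)
    for posted_on, category in labeled_comments:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        counts[half_year(posted_on)][category] += 1
    shares = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        shares[period] = {c: counter[c] / total for c in CATEGORIES}
    return shares

# Hypothetical, hand-made sample (assumption: labels come from a classifier).
sample = [
    (date(2016, 3, 10), "anger"),
    (date(2016, 3, 12), "happiness"),
    (date(2016, 9, 1), "happiness"),
    (date(2017, 2, 5), "surprise"),
    (date(2017, 11, 20), "happiness"),
    (date(2017, 11, 21), "fear"),
]
shares = sentiment_share_by_period(sample)
for period in sorted(shares):
    print(period, round(shares[period]["anger"], 2))
```

Comparing the "anger" share across the sorted period labels is one simple way to check for a trend of the kind the abstract reports.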

Perception and Appraisal of Urban Park Users Using Text Mining of Google Maps Review - Cases of Seoul Forest, Boramae Park, Olympic Park - (구글맵리뷰 텍스트마이닝을 활용한 공원 이용자의 인식 및 평가 - 서울숲, 보라매공원, 올림픽공원을 대상으로 -)

  • Lee, Ju-Kyung; Son, Yong-Hoon
    • Journal of the Korean Institute of Landscape Architecture / v.49 no.4 / pp.15-29 / 2021
  • The study aims to grasp the perception and appraisal of urban park users through text analysis, using review data from Google Maps. Google Maps Review is an online review platform that provides evaluations of locations through social media and offers an understanding of locations from the perspective of general reviewers and local guides who are registered members of Google Maps. The study examined whether Google Maps reviews are useful for extracting meaningful information about user perceptions and appraisals for park management plans. Three urban parks in Seoul, South Korea, were chosen: Seoul Forest, Boramae Park, and Olympic Park. Review data for each park were collected via web crawling using Python. Through text analysis, the keywords and network structure characteristics of each park were analyzed. Park ratings were analyzed along with the text, and the reviews of residents and foreign tourists were compared. The keywords common to the reviews of all three parks were "walking", "bicycle", "rest" and "picnic" for activities, "family", "child" and "dogs" for companions, and "playground" and "walking trail" for park facilities. Looking at each park, Seoul Forest shows many nature-based outdoor activities, while the lack of parking spaces and weekend congestion negatively affected users. Boramae Park has the appearance of a city park, with various facilities supporting numerous activities, but reviewers often cited the park's complexity and negative experiences with dog-walking groups. At Olympic Park, large-scale complex facilities and cultural events were frequently mentioned, emphasizing its entertainment function. Google Maps Review can serve as useful data for identifying park users' overall experiences and general feelings. Compared with data from other social media sites, Google Maps Review data include ratings and help identify the factors behind user satisfaction and dissatisfaction.
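
As a rough illustration of the keyword and rating analysis described above, the following sketch counts keywords per park and averages ratings. It assumes the reviews have already been crawled into (park, rating, text) tuples; the stop-word list, function names, and sample reviews are hypothetical, and the network analysis mentioned in the abstract is not covered.

```python
import re
from collections import Counter
from statistics import mean

# Hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "is", "for", "with",
              "to", "of", "in", "on", "but", "at"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS]

def summarize_reviews(reviews):
    """Return {park: (mean_rating, keyword Counter)} from (park, rating, text)."""
    ratings = {}
    keywords = {}
    for park, rating, text in reviews:
        ratings.setdefault(park, []).append(rating)
        keywords.setdefault(park, Counter()).update(tokenize(text))
    return {park: (mean(ratings[park]), keywords[park]) for park in ratings}

# Hypothetical, hand-written reviews; not data from the study.
sample_reviews = [
    ("Seoul Forest", 5, "Great for a picnic and walking with family"),
    ("Seoul Forest", 3, "Parking is hard on weekends"),
    ("Olympic Park", 4, "Walking trail and bicycle paths, nice picnic spots"),
]
summary = summarize_reviews(sample_reviews)
for park, (avg_rating, counts) in summary.items():
    print(park, avg_rating, counts.most_common(3))
```

Keywords shared across parks could then be found by intersecting the key sets of the per-park counters.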

Prediction of Key Variables Affecting NBA Playoffs Advancement: Focusing on 3 Points and Turnover Features (미국 프로농구(NBA)의 플레이오프 진출에 영향을 미치는 주요 변수 예측: 3점과 턴오버 속성을 중심으로)

  • An, Sehwan; Kim, Youngmin
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.263-286 / 2022
  • This study acquires NBA statistics for a total of 32 years, from 1990 to 2022, using web crawling, observes the variables of interest through exploratory data analysis, and generates related derived variables. Unused variables were removed from the input data during data cleaning, and correlation analysis, t-tests, and ANOVA were performed on the remaining variables. For the variables of interest, the difference in means between teams that advanced to the playoffs and teams that did not was tested and, to complement this test, the mean difference among three ranking-based groups (higher/middle/lower) was rechecked. Only the current season's data were used as the test set, and 5-fold cross-validation was performed by dividing the rest into training and validation sets for model training. Overfitting was ruled out by confirming that the cross-validation results and the final test-set results showed no difference in performance metrics. Because the quality of the raw data is high and the statistical assumptions are satisfied, most of the models performed well despite the small data set. Beyond predicting NBA game results or classifying playoff advancement with machine learning, the study examines whether the variables of interest are among the most important variables by analyzing the importance of the input attributes. Visualizing SHAP values made it possible to overcome the limited interpretability of feature importance alone and to compensate for the inconsistency of importance calculations when variables are added or removed. A number of variables related to three-pointers and turnovers, the subjects of interest in this study, were found to be among the major variables affecting playoff advancement in the NBA. The study resembles existing sports data analysis research in covering topics such as match results, playoffs, and championship prediction and in comparing several machine learning models. It differs in that the features of interest are set in advance and statistically verified, and then compared with the machine learning results. It is further differentiated from existing studies by presenting explanatory visualizations using SHAP, an explainable AI (XAI) method.
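
The evaluation design described in this abstract (most recent season held out as the test set, 5-fold cross-validation on the remainder) might be sketched as follows. This is only an index-splitting illustration under assumed names: the `season` field, the helper functions, and the toy rows are hypothetical, and no model from the paper is reproduced.

```python
import random

def split_holdout_season(rows, season_key="season"):
    """Hold out the most recent season as the test set."""
    latest = max(r[season_key] for r in rows)
    train = [r for r in rows if r[season_key] != latest]
    test = [r for r in rows if r[season_key] == latest]
    return train, test

def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx

# Hypothetical toy rows: 10 seasons with 4 teams each (not real NBA data).
rows = [{"season": 2013 + i // 4, "team": f"T{i % 4}"} for i in range(40)]
train_rows, test_rows = split_holdout_season(rows)
splits = list(k_fold_indices(len(train_rows), k=5))
print(len(train_rows), len(test_rows), [len(v) for _, v in splits])
```

A model would be fit on each training fold and scored on the matching validation fold, and the averaged validation score could then be compared with the score on the held-out season.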