• Title/Summary/Keyword: Twitter analysis (트위터 분석)


Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People now create enormous amounts of data on social network services (SNS). In particular, the integration of SNS into mobile devices has led to massive data generation that greatly influences society. This phenomenon is unprecedented, and we now live in the age of Big Data. SNS data qualifies as Big Data because it satisfies the three defining conditions: the amount of data (volume), the speed of data input and output (velocity), and the variety of data types (variety). Because SNS data covers the whole of society, tracking the trend of an issue in it can serve as an important new source of value. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides four functions: (1) it provides the set of topic keywords corresponding to each daily ranking; (2) it visualizes the daily time-series graph of a topic over a month; (3) it shows the importance of a topic through a treemap based on a scoring system and frequency; and (4) it visualizes the daily time-series graph of a keyword retrieved by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process unrefined, unstructured data. It also requires current big data technology for rapidly processing large amounts of real-time data, such as the Hadoop distributed system or NoSQL, an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data, because Hadoop is designed to scale up from a single node to thousands of machines.
Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike relational databases, MongoDB has no fixed schema or tables, and its primary goals are data accessibility and data processing performance. In the age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine data easily and clearly. TITS therefore uses the d3.js library as its visualization tool. This library is designed for creating Data-Driven Documents that bind the Document Object Model (DOM) to arbitrary data; it makes interaction with data easy and is useful for handling real-time data streams with smooth animation. In addition, TITS uses Bootstrap, a collection of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS graphical user interface (GUI) is designed with these libraries and can detect issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the effectiveness of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time-series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets posted in Korea during March 2013.

The Effect of Public Frame, Stereotype, and Twitter Remediation on Country Reputation: Focused on Japan and China's Country Reputation around Diaoyudao Issue (공중 프레임, 고정관념, 트위터의 재매개(remediation)가 국가명성에 미치는 영향: 댜오위다오 이슈를 둘러싼 일본 및 중국 명성을 중심으로)

  • Cha, Hee-Won;Chang, Seo-Jin;Jang, Hyun-Ji
    • Korean journal of communication and information / v.62 / pp.286-314 / 2013
  • This paper explores the effects of stereotypes, public frames, and Twitter remediation on country reputation. The Diaoyudao/Senkaku sovereignty dispute, a long-standing territorial problem between Japan and China, was chosen as the case. According to a survey of 210 Koreans aged 20 to 40, stereotypes of Japan and China affected each country's reputation. The legitimacy-securing frame affected only China's country reputation. The stereotype of Japan interacted significantly with the 'attribution to Japan's responsibility' frame in its effect on Japan's country reputation, and it also interacted with the 'attribution to both sides' responsibility' frame. Twitter remediation activity, categorized into producing, distributing, and viewing, was also examined. Only producing and viewing affected Japan's reputation. In terms of interaction, the more opinion leaders engaged in distributing, the more they agreed with the 'attribution to both sides' responsibility' frame and the more positively they perceived China's reputation. Consequently, the results showed that Twitter opinion leaders interpreted public frames and perceived country reputation differently from ordinary Twitter users.


Keyword Filtering about Disaster and the Method of Detecting Area in Detecting Real-Time Event Using Twitter (트위터를 활용한 실시간 이벤트 탐지에서의 재난 키워드 필터링과 지명 검출 기법)

  • Ha, Hyunsoo;Hwang, Byung-Yeon
    • KIPS Transactions on Software and Data Engineering / v.5 no.7 / pp.345-350 / 2016
  • This research proposes disaster keyword filtering and area detection methods for a real-time event detection system that analyzes the contents of Twitter. The spread of smart mobile devices has led to the rapid spread of SNS, and various studies based on SNS are now under way. Among SNS, Twitter spreads information quickly because tweets are short messages of up to 140 characters. Tweets written by Twitter users can therefore act as sensors, and research that uses this property to detect events that have occurred has been carried out. However, people have become reluctant to disclose their location information because of reports that private information leaks are increasing. In addition, accuracy problems arise when analyzing tweet contents that do not follow spelling rules. Additional keyword filtering and area detection methods were therefore required in the real-time event detection process to improve accuracy. This research proposes a disaster keyword filtering method and two area detection methods. The first is area noise removal, which removes noise arising from place-name words. The second is area determination, which confirms place-name words by using landmarks. Applying disaster keyword filtering together with the two area detection methods improved accuracy: from 49% to 78% with area noise removal, and from 49% to 89% with area determination.
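
The two ideas in this abstract, filtering tweets by disaster keywords and resolving an area from landmarks, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword set, the landmark table, and the function names `is_disaster_tweet`, `determine_area`, and `detect_events` are all hypothetical, and the abstract does not describe the actual lists or matching rules.

```python
# Hypothetical tables for illustration only; the paper's real keyword and
# landmark lists are not given in the abstract.
DISASTER_KEYWORDS = {"fire", "flood", "earthquake"}
LANDMARK_TO_AREA = {
    "gangnam station": "Seoul",
    "haeundae beach": "Busan",
}


def is_disaster_tweet(text):
    """Return True if any word of the tweet is a disaster keyword."""
    words = (w.strip(".,!?") for w in text.lower().split())
    return any(w in DISASTER_KEYWORDS for w in words)


def determine_area(text):
    """Return the area of the first known landmark mentioned, else None."""
    lowered = text.lower()
    for landmark, area in LANDMARK_TO_AREA.items():
        if landmark in lowered:
            return area
    return None


def detect_events(tweets):
    """Keep disaster-related tweets and attach the area inferred from landmarks."""
    return [(t, determine_area(t)) for t in tweets if is_disaster_tweet(t)]
```

Tweets that pass the keyword filter but mention no known landmark are returned with `None` as the area, so a caller can decide whether to discard them or fall back to another location source.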

Analysis of Twitter Post with 'Self-Injury' and 'Suicide' Using Text Mining (텍스트 마이닝기법을 활용한 '자해' 및 '자살' 관련 트위터 게시물 분석)

  • Yuri Lee;Hoin Kwon
    • Korean Journal of Culture and Social Issue / v.29 no.1 / pp.147-170 / 2023
  • This study explored keywords and key topics by collecting Twitter posts related to 'self-injury' and 'suicide'. The study data were posts containing hashtags related to self-injury and suicide, posted from October 29, 2019 to November 30, 2020. Text mining of the collected posts yielded a total of 11 key topics: 6 related to 'self-injury' and 5 related to 'suicide'. The main messages in the topics are as follows. First, users candidly expressed online the self-harm and suicide experiences that are difficult to express offline, and used SNS as a channel for requesting help. Second, posts related to 'self-injury' and 'suicide' had both common and distinguishing characteristics. Topics related to 'self-injury' mainly revealed the emotional regulation and interpersonal functions of self-harm, whereas messages related to 'suicide' more clearly addressed suicide prevention and social problems. These results are meaningful in that they help in understanding the opinions of people who have experienced self-harm and suicidal thoughts and the public voice on self-harm and suicide-related issues, and in that they can inform effective prevention and intervention measures for self-harm and suicide.

Web crawling process of each social network service for recognizing water quality accidents in the water supply networks (물공급네트워크 수질사고인지를 위한 소셜네트워크 서비스 별 웹크롤링 방법론 개발)

  • Yoo, Do Guen;Hong, Seunghyeok;Moon, Gihoon
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.398-398 / 2022
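  • Recently, local water quality problems in the tap water supply process, such as red water and larvae, have caused direct and indirect harm to the public. When a water quality problem occurs, damage-related opinions posted on social network services (SNS) spread rapidly in time and space, and ultimately increase negative perception of, and reduce trust in, the entire water supply process. Efforts to minimize damage by applying various methodologies that quickly recognize water quality accidents in water supply systems are therefore essential. In general, water quality accidents can be identified from changes in the time-series data obtained from real-time sensors for various parameters, but applying such methods efficiently requires the prior introduction of advanced metering infrastructure. In this study, taking advantage of Korea's well-developed information and communication technology environment, we propose a web crawling methodology for each SNS for recognizing water quality accidents in water supply networks and analyze the results of its application. Before implementing the methodology, we checked, for each SNS (Twitter, Instagram, blogs, Naver Cafe, etc.), whether web crawling through programming is possible and over what period information can be obtained. We then present web crawling procedures focused on Naver Cafe and Twitter, which showed large influence and large numbers of related posts during similar past water quality accidents. For Naver Cafe, we list the cafes in which many citizens of the target water supply area participate, perform web crawling using combinations of the local government name and core keywords (tap water, larvae, red water), and establish a procedure for analyzing the number and meaning of related posts in real time. Following the developed methodology, we analyzed two or more local governments where water quality accidents had occurred in the past, and identified and presented the differences in results across SNS. The proposed method is expected to enable further analysis of how water quality accident information propagates and spreads in time and space.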
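
The keyword-combination step described in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the core keywords come from the abstract (given here in English), while the municipality names, the sample posts, and the function names `build_queries` and `count_matching_posts` are hypothetical. The sketch counts matches over already-collected post texts and does not perform any crawling itself.

```python
from itertools import product

# Core keywords named in the abstract (English renderings).
CORE_KEYWORDS = ["tap water", "larvae", "red water"]


def build_queries(municipalities, keywords=CORE_KEYWORDS):
    """Pair every municipality name with every core keyword."""
    return list(product(municipalities, keywords))


def count_matching_posts(posts, queries):
    """Count posts that mention both the municipality and the keyword."""
    counts = {query: 0 for query in queries}
    for post in posts:
        text = post.lower()
        for municipality, keyword in queries:
            if municipality.lower() in text and keyword.lower() in text:
                counts[(municipality, keyword)] += 1
    return counts
```

Tracking these counts per time window would give the kind of post-count signal the abstract refers to; how the authors analyze the meaning of posts is not described and is left out here.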


Event Template Extraction for the Decision Support based on Social Media (소셜미디어 기반 의사결정 지원을 위한 이벤트 템플릿 추출)

  • Heo, Jeong;Ryu, Pum-Mo;Choi, Yoon-Jae;Kim, Hyun-Ki
    • Annual Conference on Human and Language Technology / 2012.10a / pp.53-57 / 2012
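  • This paper introduces the event template extraction included in 'Social Wisdom' (소셜위즈덤), a social media-based decision support system. A decision support system is an information system that provides relevant information and insight so that important economic and social matters can be decided. Existing systems provided only the frequency of specific keywords or the relationships between co-occurring keywords. Social Wisdom, in contrast, extracts templates, which are sets of triples of subject, event property, and object that define an event, and provides event information based on them. Template extraction used the relation extraction technology of high-precision language analysis together with ontology-based template constraints and filtering rules. Evaluated on manually built evaluation data, template extraction performance (F-score) was 0.544 for news, 0.3386 for blogs, and 0.3251 for Twitter, with an overall combined performance of 0.4648. Filtering performance (accuracy) was 0.7257 for news, 0.6122 for blogs, and 0.6207 for Twitter, with an overall combined performance of 0.722.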


A Development Of A System For Earthquake Warning Using Social Media (소셜미디어를 이용한 지진정보전달 시스템 개발)

  • Jeon, Inchan;Choi, Seong-Jong;Lee, Yong-Tae;Hong, Sung-Dae
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.12 no.5 / pp.169-175 / 2012
  • The Great East Japan Earthquake left several lessons, notably cases in which alerts were delivered through social media. This paper proposes a system for posting earthquake information to microblogs such as Twitter and me2day. Microblogs are regarded as the most efficient and effective social media. The system receives earthquake information from the Earthquake Broadcast System of the Korea Meteorological Administration and posts it to Twitter and me2day. With this system, earthquake information can be delivered easily and responses can be checked.

A Study on Social Media Marketing Strategies for Digital Libraries (디지털도서관의 소셜미디어 마케팅 전략에 관한 연구)

  • Hwang, Jae-Young;Koo, Chan-Mi
    • Journal of Information Management / v.42 no.4 / pp.225-242 / 2011
  • With the development of IT and the internet, companies try to use blogs, Twitter, Facebook, and smartphones for marketing and customer relationship management. So-called social media marketing has emerged and is creating new value for companies. Until now, however, most libraries in Korea have shown little interest in marketing and PR. Recently, libraries have begun to take an interest in marketing and social network services, and are trying to use social network services for marketing. This research introduces various cases and the current status of social network service use for marketing in Korean and international libraries, and analyzes these cases from a marketing perspective. Finally, it suggests considerations and successful strategies for social media marketing in the library field.

Exploring Opinions on University Online Classes During the COVID-19 Pandemic Through Twitter Opinion Mining (트위터 오피니언 마이닝을 통한 코로나19 기간 대학 비대면 수업에 대한 의견 고찰)

  • Kim, Donghun;Jiang, Ting;Zhu, Yongjun
    • Journal of the Korean Society for Library and Information Science / v.55 no.4 / pp.5-22 / 2021
  • This study aimed to understand how people perceive the transition from offline to online classes at universities during the COVID-19 pandemic. To this end, we collected tweets related to online classes on Twitter and performed sentiment analysis and time-series topic analysis. The findings are as follows. First, the sentiment analysis showed that negative opinions outnumbered positive ones overall, but negative opinions gradually decreased over time. The monthly distribution of tweet sentiment scores showed that scores were more widely spread during the semesters than during the vacations, meaning that more diverse emotions and opinions were expressed during the semesters. Second, through time-series topic analysis, we identified five main topics of positive tweets: class environment and equipment, positive emotions, places of taking online classes, language classes, and tests and assignments. The four main topics of negative tweets were time (class and break time), tests and assignments, negative emotions, and class environment and equipment. In addition, we examined trends in public opinion on online classes by tracking changes in topic composition over time through the proportions of representative keywords in each topic. Unlike existing studies of public opinion on online classes, this study attempted to understand overall opinions from tweet data using sentiment analysis and time-series topic analysis. The results can be used to improve the quality of online classes and to help universities and instructors design and offer better online classes.

Understanding Public Opinion by Analyzing Twitter Posts Related to Real Estate Policy (부동산 정책 관련 트위터 게시물 분석을 통한 대중 여론 이해)

  • Kim, Kyuli;Oh, Chanhee;Zhu, Yongjun
    • Journal of the Korean Society for Library and Information Science / v.56 no.3 / pp.47-72 / 2022
  • This study aims to understand the trends of subjects related to real estate policies and the public's emotional opinions on those policies. Two keywords related to real estate policies, "real estate policy" and "real estate measure", were used to collect tweets created from February 25, 2008 to August 31, 2021. A total of 91,740 tweets were collected. After preprocessing the data and categorizing it into supply, real estate tax, interest rate, and population variance, we applied sentiment analysis and dynamic topic modeling to the final 18,925 tweets. The keywords of each category are as follows: supply (rental housing, greenbelt, newlyweds, homeless, supply, reconstruction, sale), real estate tax (comprehensive real estate tax, acquisition tax, holding tax, multiple homeowners, speculation), interest rate (interest rate), and population variance (Sejong, new city). The sentiment analysis showed that one person posted on average one or two positive tweets, whereas for negative and neutral tweets one person posted two or three. In addition, we found that some people hold positive as well as negative and neutral opinions towards real estate policies. The dynamic topic modeling analysis identified negative reactions to real estate speculators and unearned income as the major negative topics, and expectations of increased housing supply and benefits for homeless people who purchase houses as the major positive topics.
Unlike previous studies, which focused on changes in and evaluations of specific real estate policies, this study has academic significance in that it collected posts from Twitter, one of the social media platforms, applied sentiment analysis and dynamic topic modeling, and identified latent topics and trends in real estate policy over time. The results of the study can help create new policies that take public opinion on real estate policies into consideration.