• Title/Summary/Keyword: twitter data


Propagation Models for Structural Parameters in Online Social Networks (온라인 소셜 네트워크에서 구조적 파라미터를 위한 확산 모델)

  • Kong, Jong-Hwan;Kim, Ik Kyun;Han, Myung-Mook
    • Journal of Internet Computing and Services / v.15 no.1 / pp.125-134 / 2014
  • Social media, once a simple communication channel, has been energized by Twitter and Facebook, and its usability and importance have grown accordingly. Although many companies exploit its capacity for information diffusion in marketing, the adverse effects of that capacity are also growing. Because a social network is formed and communicates through friendships and relationships, spam and malware spread very quickly. In this paper, we derive the parameters that affect the diffusion of malicious data in a social network environment, and we compare and analyze the diffusion capacity of each parameter through propagation experiments with the XSS Worm and the Koobface Worm. In addition, we discuss the structural characteristics of the social network environment and propose a malicious data propagation model based on the parameters that affect information diffusion. For the experiments, we built BA and HK models on top of the SI model, a dynamic model; the results show that the parameters affecting the propagation of the XSS Worm and the Koobface Worm are the clustering coefficient and closeness centrality.
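
The abstract above describes running an SI (susceptible-infected) process on BA-type networks. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the infection probability `beta`, and the network size are assumptions chosen for illustration, and the HK model and the worm-specific behavior are not modeled.

```python
import random


def barabasi_albert(n, m, rng):
    """Grow an undirected graph by preferential attachment (BA-style).

    Returns a dict mapping each node to the set of its neighbours.
    """
    adj = {i: set() for i in range(n)}
    pool = []  # each node appears once per incident edge (degree-weighted)
    # Start from a complete core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            pool += [i, j]
    # Each new node links to m distinct existing nodes, chosen by degree.
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))
        for target in chosen:
            adj[new].add(target)
            adj[target].add(new)
            pool += [new, target]
    return adj


def si_spread(adj, seed, beta, steps, rng):
    """Discrete-time SI process: infected nodes never recover.

    At every step each infected node infects each susceptible neighbour
    with probability beta. Returns the infected count after each step.
    """
    infected = {seed}
    history = [len(infected)]
    for _ in range(steps):
        newly = set()
        for u in infected:
            for v in adj[u]:
                if v not in infected and rng.random() < beta:
                    newly.add(v)
        infected |= newly
        history.append(len(infected))
    return history


if __name__ == "__main__":
    rng = random.Random(42)           # hypothetical seed for repeatability
    graph = barabasi_albert(200, 3, rng)
    print(si_spread(graph, seed=0, beta=0.1, steps=20, rng=rng))
```

Comparing the infection curves of graphs with different clustering or centrality profiles is one way to probe which structural parameters matter, which is the kind of comparison the abstract reports.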

Online Social Capital Analysis on the Yeungnam Local Presses : Website and Social Media (영남지역 언론사의 온라인 사회자본 분석 : 웹사이트와 소셜미디어를 중심으로)

  • Kim, Ji Young;Ha, Young Ji;Park, Han Woo
    • The Journal of the Korea Contents Association / v.13 no.4 / pp.73-85 / 2013
  • This study examines the online social capital of the local press using websites and social media. It also visualizes web features (Web 1.0) and social features (Web 2.0) by applying correspondence analysis. The data cover 10 representative local press outlets in the Yeungnam area. Two coders coded web features from the websites, and we used NodeXL, an open-source software tool, to collect social media data. The results reveal that local press websites expand their online social capital through social media accounts. In particular, the social features of the local presses attach importance to Twitter, while the main presses keep a well-balanced use across all platforms.

Personalized Recommendation System using Level of Cosine Similarity of Emotion Word from Social Network (소셜 네트워크에서 감정단어의 단계별 코사인 유사도 기법을 이용한 추천시스템)

  • Kwon, Eungju;Kim, Jongwoo;Heo, Nojeong;Kang, Sanggil
    • Journal of Information Technology and Architecture / v.9 no.3 / pp.333-344 / 2012
  • This paper proposes a system that recommends movies using information from social network services that reflects personal interests and tastes. The data are established as follows. The system gathers movie information from websites and user information from social network services such as Facebook and Twitter. The data from social network services are categorized into six emotion levels so that processing follows users' emotional states more accurately. The gathered data are represented in a vector space model, which is well suited to the analysis and inference performed by the proposed system. The existing similarity measurement method for movie recommendation represents emotion levels as vectors and measures similarity on the coordinates using the cosine measure. The inference method proposed in this paper is a two-phase arithmetic operation. First, the system builds a movie list using the general cosine measure. Second, it determines the list of recommendable movies by a vector operation on the coordinates using the similarity measure. A comparative experiment against previous recommendation systems showed that the proposed system is more helpful than existing ones.
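
The cosine measure mentioned in the abstract above can be illustrated as follows. This is a small sketch under stated assumptions: the six-component vectors stand in for the six emotion levels, and the function names, the movie names, and all numbers are hypothetical. It covers only the first phase (ranking by plain cosine similarity), not the paper's second-phase vector operation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_movies(user_vec, movies):
    """Return movie names ordered from most to least similar to the user vector."""
    return sorted(
        movies,
        key=lambda name: cosine_similarity(user_vec, movies[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical emotion-level profiles (one count per emotion level).
    user = [4, 2, 0, 0, 0, 0]
    catalog = {
        "Movie A": [5, 1, 0, 0, 0, 0],
        "Movie B": [0, 0, 0, 0, 1, 5],
    }
    print(rank_movies(user, catalog))
```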

Improved Feature Extraction Method for the Contents Polluter Detection in Social Networking Service (SNS에서 콘텐츠 오염자 탐지를 위한 개선된 특징 추출 방법)

  • Han, Jin Seop;Park, Byung Joon
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.11 / pp.47-54 / 2015
  • The number of users of SNS such as Twitter and Facebook is increasing with the development of the Internet and the spread of mobile devices such as smartphones. At the same time, content pollution problems are increasing, in which SNS is polluted by posts such as product advertisements, defamatory comments, and adult content. This paper proposes an improved method of extracting content polluter features for detecting content polluters in SNS. In particular, it presents a feature extraction method based on an incremental approach that considers only the increment in data, rather than batch processing of the entire data set, in order to extract feature values of new user data efficiently at the stage of predicting and classifying a content polluter. Through experiments, it assesses whether the proposed method maintains classification accuracy and improves time efficiency in comparison with the batch processing method.
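
The contrast between incremental and batch feature extraction described above can be sketched with a single running-average feature. This is an illustration only: the class name and the feature (for example, links per post) are assumptions, and the paper's actual feature set is not reproduced here.

```python
class IncrementalMean:
    """Running average that is updated from new values only."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, new_values):
        """Fold in only the newly arrived values; earlier data is not re-read."""
        for value in new_values:
            self.count += 1
            self.mean += (value - self.mean) / self.count
        return self.mean


def batch_mean(values):
    """Batch alternative: recompute the average from the entire history."""
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    # Hypothetical feature: number of links in each post of one user.
    old_posts = [0, 2, 1, 3]
    new_posts = [4, 0]
    feature = IncrementalMean()
    feature.update(old_posts)
    print(feature.update(new_posts), batch_mean(old_posts + new_posts))
```

Both paths yield the same feature value, but the incremental update touches only the new posts, which is where the time saving claimed in the abstract would come from.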

The Study of Koreans' Perception about Vietnam using Social Big Data (베트남에 대한 한국인의 인식 연구 : 소셜 빅데이터를 활용하여)

  • Seo, Eun Hee;Lee, Jaeseong
    • The Journal of the Korea Contents Association / v.19 no.3 / pp.1-9 / 2019
  • The purposes of the study are to investigate Koreans' perception of Vietnam by analyzing social big data and to identify the direction in which that perception is changing. To this end, texts about Vietnam on Naver Blog and Twitter and the numbers of searches and clicks for Vietnam on Naver were analyzed with Social Metrics of Daum Soft and Datalab of Naver. The study also analyzed the annual change in interest in Vietnam based on social media. The results showed that Koreans still remember the Vietnam War, have a positive emotion toward Vietnam, and view Vietnam as a country with which mutual benefit can be gained through exchange. The findings also indicated that Koreans perceive Vietnam as a favorite tourist spot regardless of age. Meanwhile, children under 12 showed a different pattern of annual change in perception. This might be a positive sign that Koreans' areas of interest in Vietnam will diversify, because children under 12 will become the central axis of cultural content.

An Exploratory Analysis of Online Discussion of Library and Information Science Professionals in India using Text Mining

  • Garg, Mohit;Kanjilal, Uma
    • Journal of Information Science Theory and Practice / v.10 no.3 / pp.40-56 / 2022
  • This paper aims to implement a topic modeling technique for extracting the topics of online discussions among library professionals in India. Topic modeling is the established text mining technique popularly used for modeling text data from Twitter, Facebook, Yelp, and other social media platforms. The present study modeled the online discussions of Library and Information Science (LIS) professionals posted on Lis Links. The text data of these posts was extracted using a program written in R using the package "rvest." The data was pre-processed to remove blank posts, posts having text in non-English fonts, punctuation, URLs, emails, etc. Topic modeling with the Latent Dirichlet Allocation algorithm was applied to the pre-processed corpus to identify each topic associated with the posts. The frequency analysis of the occurrence of words in the text corpus was calculated. The results found that the most frequent words included: library, information, university, librarian, book, professional, science, research, paper, question, answer, and management. This shows that the LIS professionals actively discussed exams, research, and library operations on the forum of Lis Links. The study categorized the online discussions on Lis Links into ten topics, i.e. "LIS Recruitment," "LIS Issues," "Other Discussion," "LIS Education," "LIS Research," "LIS Exams," "General Information related to Library," "LIS Admission," "Library and Professional Activities," and "Information Communication Technology (ICT)." It was found that the majority of the posts belonged to "LIS Exam," followed by "Other Discussions" and "General Information related to the Library."
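
The pre-processing and word-frequency steps described above can be sketched as follows. The study itself used R with "rvest"; this Python version is only an illustration, the regular expressions and function names are assumptions, and the Latent Dirichlet Allocation step is not covered.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\S+@\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def preprocess(post):
    """Lower-case a post, strip URLs, e-mail addresses and punctuation, then tokenize."""
    text = post.lower()
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return text.split()


def word_frequencies(posts):
    """Count word occurrences over all posts, skipping posts that end up blank."""
    counts = Counter()
    for post in posts:
        tokens = preprocess(post)
        if tokens:
            counts.update(tokens)
    return counts


if __name__ == "__main__":
    # Hypothetical forum posts.
    sample = [
        "Library jobs: see http://example.com",
        "",
        "Mail me at a@b.org about the library!",
    ]
    print(word_frequencies(sample).most_common(3))
```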

The Analysis on the Relationship between Firms' Exposures to SNS and Stock Prices in Korea (기업의 SNS 노출과 주식 수익률간의 관계 분석)

  • Kim, Taehwan;Jung, Woo-Jin;Lee, Sang-Yong Tom
    • Asia pacific journal of information systems / v.24 no.2 / pp.233-253 / 2014
  • Can the stock market really be predicted? Stock market prediction has attracted much attention from many fields, including business, economics, statistics, and mathematics. Early research on stock market prediction was based on random walk theory (RWT) and the efficient market hypothesis (EMH). According to the EMH, stock markets are largely driven by new information rather than by present and past prices. Because such information is unpredictable, the stock market will follow a random walk. Despite these theories, Schumaker [2010] asserted that people keep trying to predict the stock market by using artificial intelligence, statistical estimates, and mathematical models. Mathematical approaches include percolation methods, log-periodic oscillations, and wavelet transforms to model future prices. Examples of artificial intelligence approaches that deal with optimization and machine learning are genetic algorithms, support vector machines (SVM), and neural networks. Statistical approaches typically predict the future by using past stock market data. Recently, financial engineers have started to predict stock price movement patterns by using SNS data. SNS is a place where people's opinions and ideas flow freely and affect others' beliefs. Through word-of-mouth in SNS, people share product usage experiences, subjective feelings, and the accompanying sentiment or mood with others. An increasing number of empirical analyses of sentiment and mood are based on textual collections of public user-generated data on the web. Opinion mining is a domain of data mining that extracts public opinions expressed in SNS. There have been many studies on opinion mining from web sources such as product reviews, forum posts, and blogs. In relation to this literature, we try to understand the effects of firms' SNS exposure on stock prices in Korea. Similarly to Bollen et al. [2011], we empirically analyze the impact of SNS exposure on stock return rates. We use Social Metrics by Daum Soft, an SNS big data analysis company in Korea. Social Metrics provides trends and public opinions on Twitter and blogs by using natural language processing and analysis tools. It collects sentences circulated on Twitter in real time, breaks them down into word units, and then extracts keywords. In this study, we classify firms' exposures in SNS into two groups: positive and negative. To test the correlation and causal relationship between SNS exposures and stock price returns, we first collect the stock prices of 252 firms and the KRX100 index on the Korea Stock Exchange (KRX) from May 25, 2012 to September 1, 2012. We also gather the public attitudes (positive, negative) toward these firms from Social Metrics over the same period. We conduct a regression analysis between stock prices and the number of SNS exposures. Having checked the correlation between the two variables, we perform a Granger causality test to see the direction of causation between them. The result is that the total number of SNS exposures is positively related to stock market returns. The number of positive mentions also has a positive relationship with stock market returns. In contrast, the number of negative mentions has a negative relationship with stock market returns, but this relationship is not statistically significant. This means that the impact of positive mentions is statistically greater than the impact of negative mentions. We also investigate whether the impacts are moderated by industry type and firm size. We find that the impact of SNS exposure is greater for IT firms than for non-IT firms, and greater for small firms than for large firms. The results of the Granger causality test show that changes in stock price returns are caused by SNS exposures, while causation in the other direction is not significant. Therefore, the correlation between SNS exposures and stock prices has unidirectional causality. The more a firm is exposed in SNS, the more its stock price is likely to increase, while stock price changes may not cause more SNS mentions.

Structural features and Diffusion Patterns of Gartner Hype Cycle for Artificial Intelligence using Social Network analysis (인공지능 기술에 관한 가트너 하이프사이클의 네트워크 집단구조 특성 및 확산패턴에 관한 연구)

  • Shin, Sunah;Kang, Juyoung
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.107-129 / 2022
  • It is important to preempt new technology because technology competition is getting much tougher. Stakeholders continuously conduct exploration activities to secure new technology at the right time. Gartner's Hype Cycle has significant implications for stakeholders. The Hype Cycle is an expectation graph for new technologies that combines the technology life cycle (S-curve) with the hype level. Stakeholders such as R&D investors, CTOs (Chief Technology Officers), and technical personnel are very interested in Gartner's Hype Cycle for new technologies, because high expectations for new technologies can bring opportunities to sustain investment by securing the legitimacy of R&D investment. However, in contrast to the industry's high interest, previous research faced limitations in its empirical methods and source data (news, academic papers, search traffic, patents, etc.). In this study, we focused on two research questions. The first research question was 'Is there a difference in the characteristics of the network structure at each stage of the hype cycle?'. To answer it, the structural characteristics of each stage were examined through the component cohesion size. The second research question was 'Is there a pattern of diffusion at each stage of the hype cycle?'. This question was addressed through the centralization index and network density. The centralization index is a variance-like concept, and a higher centralization index means that the network is centered on a small number of nodes. Concentration on a small number of nodes indicates a star network structure. The star network structure is a centralized structure and shows better diffusion performance than a decentralized network (circle structure), because the nodes at the center of information transfer can judge useful information and deliver it to other nodes the fastest. We therefore examined the out-degree centralization index and the in-degree centralization index for each stage. For this purpose, we examined the structural features of the community and the expectation diffusion patterns using social network service (SNS) data for 'Gartner Hype Cycle for Artificial Intelligence, 2021'. Twitter data for 30 technologies (excluding four technologies) listed in 'Gartner Hype Cycle for Artificial Intelligence, 2021' were analyzed. The analysis was performed using R (ver. 4.1.1) and Cyram NetMiner. From October 31, 2021 to November 9, 2021, 6,766 tweets were retrieved through the Twitter API and converted into relationships between a user's tweet (source) and users' retweets (target). As a result, an edge list of 4,124 entries was analyzed. In the study, we identified the structural features and diffusion patterns by analyzing the component cohesion size, degree centralization, and density. We confirmed that, across the stages, the number of components in each group increased over time and the density decreased. In addition, 'Innovation Trigger', a group interested in new technologies that corresponds to early adopters in innovation diffusion theory, had a high out-degree centralization index, while the other groups had a higher in-degree than out-degree centralization index. It can be inferred that the 'Innovation Trigger' group has the greatest influence and that diffusion gradually slows down in the subsequent groups. Unlike the methods of previous research, this study conducted network analysis using social network service data. This is significant in that it suggests a way to expand the methods of analysis for Gartner's hype cycle in the future. In addition, applying innovation diffusion theory to the stages of Gartner's hype cycle for artificial intelligence can be evaluated positively, because a lack of theoretical grounding has repeatedly been raised as a weakness of the Gartner hype cycle. This study is also expected to provide stakeholders with a new perspective on decision-making for technology investment.
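
The degree centralization and density measures used in the abstract above can be computed from a source-target edge list roughly as follows. This is a sketch, not the R or NetMiner procedure used in the study: the function names are hypothetical, and the normalization by (n - 1)^2 is Freeman's convention for directed graphs, assumed here; other tools may normalize differently.

```python
def _clean(edges):
    """Drop self-loops and duplicate edges; return (links, nodes)."""
    links = {(s, t) for s, t in edges if s != t}
    nodes = {node for link in links for node in link}
    return links, nodes


def degree_centralization(edges, direction="out"):
    """Freeman-style degree centralization of a directed graph.

    Sums the gaps between the highest degree and every node's degree,
    divided by the largest possible sum, (n - 1)^2, which a directed
    star attains. direction is "out" or "in".
    """
    links, nodes = _clean(edges)
    n = len(nodes)
    if n < 2:
        return 0.0
    degree = dict.fromkeys(nodes, 0)
    for source, target in links:
        degree[source if direction == "out" else target] += 1
    top = max(degree.values())
    return sum(top - d for d in degree.values()) / (n - 1) ** 2


def density(edges):
    """Share of the n * (n - 1) possible directed links that are present."""
    links, nodes = _clean(edges)
    n = len(nodes)
    return len(links) / (n * (n - 1)) if n > 1 else 0.0


if __name__ == "__main__":
    # Hypothetical retweet relations: one account retweeted by four others.
    star = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e")]
    print(degree_centralization(star, "out"),
          degree_centralization(star, "in"),
          density(star))
```

A star-shaped edge list scores the maximum out-degree centralization, while a ring in which every node has the same degree scores zero, matching the star versus circle contrast drawn in the abstract.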

An Update-Efficient, Disk-Based Inverted Index Structure for Keyword Search on Data Streams (데이터 스트림에 대한 키워드 검색을 위한, 효율적인 갱신이 가능한 디스크 기반 역색인 구조)

  • Park, Eun Ju;Lee, Ki Yong
    • KIPS Transactions on Software and Data Engineering / v.5 no.4 / pp.171-180 / 2016
  • As social networking services such as Twitter become increasingly popular, data streams are widely prevalent these days. In order to search data accumulated from data streams efficiently, the use of an index structure is essential. In this paper, we propose an update-efficient, disk-based inverted index structure for efficient keyword search on data streams. When new data arrive in the data stream, the index needs to be updated to incorporate them. The traditional inverted index is very inefficient to update in terms of disk I/O, because all index data stored on disk need to be read and written back each time the index is updated. To solve this problem, we divide the whole inverted index into a sequence of inverted indices of exponentially increasing size. When new data arrive, they are first inserted into the smallest index, and the small indices are later merged into the larger ones, which leads to a small amortized update cost for each new data item. Furthermore, when indices stored on disk are merged with each other, we minimize the disk I/O cost incurred by the merge operation, resulting in an even smaller update cost. Through various experiments, we compare the update efficiency of the proposed index structure with that of the previous one and show the performance advantage of the proposed structure in terms of update cost.
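
The idea of splitting the index into a sequence of indices of exponentially increasing size can be sketched in memory as follows. This is an assumption-laden illustration rather than the paper's structure: it uses a simple doubling scheme (level i holds 2^i documents), keeps everything in RAM instead of on disk, and the class and method names are hypothetical. The paper's additional merge I/O optimization is not modeled.

```python
from collections import defaultdict


def merge(a, b):
    """Merge two inverted indices (term -> sorted list of document ids)."""
    merged = defaultdict(list)
    for index in (a, b):
        for term, postings in index.items():
            merged[term].extend(postings)
    return {term: sorted(postings) for term, postings in merged.items()}


class LogMergeIndex:
    """Sequence of inverted indices whose sizes double from level to level."""

    def __init__(self):
        self.levels = []  # levels[i] is None or an index covering 2**i documents
        self.merges = 0   # number of merge operations performed so far

    def add(self, doc_id, text):
        """Insert one document; full levels cascade upward like a binary carry."""
        carry = {term: [doc_id] for term in set(text.lower().split())}
        level = 0
        while True:
            if level == len(self.levels):
                self.levels.append(carry)
                return
            if self.levels[level] is None:
                self.levels[level] = carry
                return
            carry = merge(self.levels[level], carry)
            self.levels[level] = None
            self.merges += 1
            level += 1

    def search(self, term):
        """Look the term up in every non-empty level and combine the postings."""
        hits = []
        for index in self.levels:
            if index is not None:
                hits.extend(index.get(term.lower(), []))
        return sorted(hits)


if __name__ == "__main__":
    index = LogMergeIndex()
    for doc_id, text in enumerate(["twitter data stream", "keyword search",
                                   "twitter keyword"], start=1):
        index.add(doc_id, text)
    print(index.search("twitter"))
```

Because a document at level i is rewritten only when that level overflows, each document takes part in at most a logarithmic number of merges, which is the source of the small amortized update cost.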

Analyzing Spatial Correlation between Location-Based Social Media Data and Real Estates Price Index through Rasterization (격자기반 분석을 통한 위치기반 소셜 미디어 데이터와 부동산 가격지수 간의 공간적 상관성 분석 연구)

  • Park, Woo Jin;Eo, Seung Won;Yu, Ki Yun
    • Journal of Korean Society for Geospatial Information Science / v.23 no.1 / pp.23-29 / 2015
  • In this study, the spatial relevance between regional housing price data and the spatial distribution of location-based social media data is explored. Spatial analysis with rasterization was applied because the two data sets have different forms. Geo-tagged Twitter data were collected for a month, and regional housing price indices for sales and leases were used. The spatial range of both data sets includes Seoul and some parts of the metropolitan area. A 2,000 m grid was constructed to account for the different spatial units of the two data sets, and both were aggregated into the grid cells. Hotspot analysis was performed on the combined data set to compare the spatial distributions, and bivariate spatial correlation coefficients between the two data sets were measured for quantitative analysis. The results show that the Seocho-gu area is detected as a common hotspot of tweets and the housing sales price index, although no spatial relevance is detected between tweets and the housing lease price index.
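
The rasterization step described above, in which both data sets are aggregated into a common 2,000 m grid, can be sketched as follows. The sketch assumes that coordinates are already in a projected system measured in metres; the function names and sample values are hypothetical, and the hotspot analysis and bivariate spatial correlation are not included.

```python
import math
from collections import Counter, defaultdict


def grid_cell(x, y, cell_size=2000.0):
    """Map a projected coordinate (metres) to its (column, row) grid cell."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def combine(tweet_points, price_points, cell_size=2000.0):
    """Aggregate both data sets onto one grid.

    tweet_points: iterable of (x, y)
    price_points: iterable of (x, y, price_index)
    Returns {cell: (tweet_count, mean_price_index)} for every cell that
    contains at least one price observation.
    """
    tweet_counts = Counter(grid_cell(x, y, cell_size) for x, y in tweet_points)
    sums = defaultdict(float)
    counts = Counter()
    for x, y, value in price_points:
        cell = grid_cell(x, y, cell_size)
        sums[cell] += value
        counts[cell] += 1
    return {
        cell: (tweet_counts.get(cell, 0), sums[cell] / counts[cell])
        for cell in counts
    }


if __name__ == "__main__":
    # Hypothetical coordinates in metres and hypothetical index values.
    tweets = [(100, 100), (1900, 1999), (2100, 100)]
    prices = [(500, 500, 100.0), (1500, 1500, 110.0), (2500, 500, 90.0)]
    print(combine(tweets, prices))
```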