• Title/Summary/Keyword: 소셜 데이터 분석

Search Result 739, Processing Time 0.026 seconds

Personal Information Overload and User Resistance in the Big Data Age (빅데이터 시대의 개인정보 과잉이 사용자 저항에 미치는 영향)

  • Lee, Hwansoo;Lim, Dongwon;Zo, Hangjung
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.125-139
    • /
    • 2013
  • Big data refers to the data that cannot be processes with conventional contemporary data technologies. As smart devices and social network services produces vast amount of data, big data attracts much attention from researchers. There are strong demands form governments and industries for bib data as it can create new values by drawing business insights from data. Since various new technologies to process big data introduced, academic communities also show much interest to the big data domain. A notable advance related to the big data technology has been in various fields. Big data technology makes it possible to access, collect, and save individual's personal data. These technologies enable the analysis of huge amounts of data with lower cost and less time, which is impossible to achieve with traditional methods. It even detects personal information that people do not want to open. Therefore, people using information technology such as the Internet or online services have some level of privacy concerns, and such feelings can hinder continued use of information systems. For example, SNS offers various benefits, but users are sometimes highly exposed to privacy intrusions because they write too much personal information on it. Even though users post their personal information on the Internet by themselves, the data sometimes is not under control of the users. Once the private data is posed on the Internet, it can be transferred to anywhere by a few clicks, and can be abused to create fake identity. In this way, privacy intrusion happens. This study aims to investigate how perceived personal information overload in SNS affects user's risk perception and information privacy concerns. Also, it examines the relationship between the concerns and user resistance behavior. A survey approach and structural equation modeling method are employed for data collection and analysis. This study contributes meaningful insights for academic researchers and policy makers who are planning to develop guidelines for privacy protection. The study shows that information overload on the social network services can bring the significant increase of users' perceived level of privacy risks. In turn, the perceived privacy risks leads to the increased level of privacy concerns. IF privacy concerns increase, it can affect users to from a negative or resistant attitude toward system use. The resistance attitude may lead users to discontinue the use of social network services. Furthermore, information overload is mediated by perceived risks to affect privacy concerns rather than has direct influence on perceived risk. It implies that resistance to the system use can be diminished by reducing perceived risks of users. Given that users' resistant behavior become salient when they have high privacy concerns, the measures to alleviate users' privacy concerns should be conceived. This study makes academic contribution of integrating traditional information overload theory and user resistance theory to investigate perceived privacy concerns in current IS contexts. There is little big data research which examined the technology with empirical and behavioral approach, as the research topic has just emerged. It also makes practical contributions. Information overload connects to the increased level of perceived privacy risks, and discontinued use of the information system. To keep users from departing the system, organizations should develop a system in which private data is controlled and managed with ease. This study suggests that actions to lower the level of perceived risks and privacy concerns should be taken for information systems continuance.

Professional Baseball Viewing Culture Survey According to Corona 19 using Social Network Big Data (소셜네트워크 빅데이터를 활용한 코로나 19에 따른 프로야구 관람문화조사)

  • Kim, Gi-Tak
    • Journal of Korea Entertainment Industry Association
    • /
    • v.14 no.6
    • /
    • pp.139-150
    • /
    • 2020
  • The data processing of this study focuses on the textom and social media words about three areas: 'Corona 19 and professional baseball', 'Corona 19 and professional baseball', and 'Corona 19 and professional sports' The data was collected and refined in a web environment and then processed in batch, and the Ucinet6 program was used to visualize it. Specifically, the web environment was collected using Naver, Daum, and Google's channels, and was summarized into 30 words through expert meetings among the extracted words and used in the final study. 30 extracted words were visualized through a matrix, and a CONCOR analysis was performed to identify clusters of similarity and commonality of words. As a result of analysis, the clusters related to Corona 19 and Pro Baseball were composed of one central cluster and five peripheral clusters, and it was found that the contents related to the opening of professional baseball according to the corona 19 wave were mainly searched. The cluster related to Corona 19 and unrelated to professional baseball consisted of one central cluster and five peripheral clusters, and it was found that the keyword of the position of professional baseball related to the professional baseball game according to Corona 19 was mainly searched. Corona 19 and the cluster related to professional sports consisted of one central cluster and five peripheral clusters, and it was found that the keywords related to the start of professional sports according to the aftermath of Corona 19 were mainly searched.

Evaluation of Preference by Bukhansan Dulegil Course Using Sentiment Analysis of Blog Data (블로그 데이터 감성분석을 통한 북한산둘레길 구간별 선호도 평가)

  • Lee, Sung-Hee;Son, Yong-Hoon
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.49 no.3
    • /
    • pp.1-10
    • /
    • 2021
  • This study aimed to evaluate preferences of Bukhansan dulegil using sentiment analysis, a natural language processing technique, to derive preferred and non-preferred factors. Therefore, we collected blog articles written in 2019 and produced sentimental scores by the derivation of positive and negative words in the texts for 21 dulegil courses. Then, content analysis was conducted to determine which factors led visitors to prefer or dislike each course. In blogs written about Bukhansan dulegil, positive words appeared in approximately 73% of the content, and the percentage of positive documents was significantly higher than that of negative documents for each course. Through this, it can be seen that visitors generally had positive sentiments toward Bukhansan dulegil. Nevertheless, according to the sentiment score analysis, all 21 dulegil courses belonged to both the preferred and non-preferred courses. Among courses, visitors preferred less difficult courses, in which they could walk without a burden, and in which various landscape elements (visual, auditory, olfactory, etc.) were harmonious yet distinct. Furthermore, they preferred courses with various landscapes and landscape sequences. Additionally, visitors appreciated the presence of viewpoints, such as observation decks, as a significant factor and preferred courses with excellent accessibility and information provisions, such as information boards. Conversely, the dissatisfaction with the dulegil courses was due to noise caused by adjacent roads, excessive urban areas, and the inequality or difficulty of the course which was primarily attributed to insufficient information on the landscape or section of the course. The results of this study can serve not only serve as a guide in national parks but also in the management of nearby forest green areas to formulate a plan to repair and improve dulegil. Further, the sentiment analysis used in this study is meaningful in that it can continuously monitor actual users' responses towards natural areas. However, since it was evaluated based on a predefined sentiment dictionary, continuous updates are needed. Additionally, since there is a tendency to share positive content rather than negative views due to the nature of social media, it is necessary to compare and review the results of analysis, such as with on-site surveys.

The Distribution and Characteristics of Protected Areas and Natural Resources in the Metropolitan Area in Blog Posts (블로그 게시물에 나타난 수도권 보전지역 및 자연자원의 분포 및 특성)

  • Lee, Sung-Hee;Son, Yong-Hoon
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.50 no.5
    • /
    • pp.30-39
    • /
    • 2022
  • This study aimed to evaluate the awareness of conservation areas and green resources and analyze their characteristics by utilizing accumulated blog data created for specific places and objects. Among all the conservation areas and resources located in the Seoul metropolitan area, places that can be evaluated were classified, and sites were evaluated by dividing them into ten categories based on the number of blog posts written. As a result of the study, the users' awareness of forests was the highest, and the awareness of conservation areas and green resources was higher in urban areas than suburban areas. The result shows that the conservation areas and green resources located around the metropolitan area serve as natural tourist destinations while being the object of conservation for users. In addition, these results are in the same vein as the research results in domestic and foreign studies on the importance of ecosystem services in urban areas. Unlike existing research methods, this study is meaningful in that it identified the level of user awareness through social media analysis and applied it to evaluating conservation areas and green resources. It can be used as basic data to prepare a management plan considering public interest and awareness or to establish a development plan to increase awareness. In addition, the cumulative amount of blog content used in the study is meaningful in that it can identify and monitor users' interest in the space. However, it was not possible to examine the contents of each blog in detail because it was evaluated based on the amount of social media content. In addition, in the case of conservation areas and green resources, it is necessary to review and supplement the evaluation contents by adding keyword analysis and content analysis for the site to be evaluated as content other than the pure viewpoint of users may be mixed with development issues.

Incremental Frequent Pattern Detection Scheme Based on Sliding Windows in Graph Streams (그래프 스트림에서 슬라이딩 윈도우 기반의 점진적 빈발 패턴 검출 기법)

  • Jeong, Jaeyun;Seo, Indeok;Song, Heesub;Park, Jaeyeol;Kim, Minyeong;Choi, Dojin;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.2
    • /
    • pp.147-157
    • /
    • 2018
  • Recently, with the advancement of network technologies, and the activation of IoT and social network services, many graph stream data have been generated. As the relationship between objects in the graph streams changes dynamically, studies have been conducting to detect or analyze the change of the graph. In this paper, we propose a scheme to incrementally detect frequent patterns by using frequent patterns information detected in previous sliding windows. The proposed scheme calculates values that represent whether the frequent patterns detected in previous sliding windows will be frequent in how many future silding windows. By using the values, the proposed scheme reduces the overall amount of computation by performing only necessary calculations in the next sliding window. In addition, only the patterns that are connected between the patterns are recognized as one pattern, so that only the more significant patterns are detected. We conduct various performance evaluations in order to show the superiority of the proposed scheme. The proposed scheme is faster than existing similar scheme when the number of duplicated data is large.

Fake News Detection Using CNN-based Sentiment Change Patterns (CNN 기반 감성 변화 패턴을 이용한 가짜뉴스 탐지)

  • Tae Won Lee;Ji Su Park;Jin Gon Shon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.12 no.4
    • /
    • pp.179-188
    • /
    • 2023
  • Recently, fake news disguises the form of news content and appears whenever important events occur, causing social confusion. Accordingly, artificial intelligence technology is used as a research to detect fake news. Fake news detection approaches such as automatically recognizing and blocking fake news through natural language processing or detecting social media influencer accounts that spread false information by combining with network causal inference could be implemented through deep learning. However, fake news detection is classified as a difficult problem to solve among many natural language processing fields. Due to the variety of forms and expressions of fake news, the difficulty of feature extraction is high, and there are various limitations, such as that one feature may have different meanings depending on the category to which the news belongs. In this paper, emotional change patterns are presented as an additional identification criterion for detecting fake news. We propose a model with improved performance by applying a convolutional neural network to a fake news data set to perform analysis based on content characteristics and additionally analyze emotional change patterns. Sentimental polarity is calculated for the sentences constituting the news and the result value dependent on the sentence order can be obtained by applying long-term and short-term memory. This is defined as a pattern of emotional change and combined with the content characteristics of news to be used as an independent variable in the proposed model for fake news detection. We train the proposed model and comparison model by deep learning and conduct an experiment using a fake news data set to confirm that emotion change patterns can improve fake news detection performance.

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.2
    • /
    • pp.69-92
    • /
    • 2015
  • The explosion of social media data has led to apply text-mining techniques to analyze big social media data in a more rigorous manner. Even if social media text analysis algorithms were improved, previous approaches to social media text analysis have some limitations. In the field of sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common approach. Some studies have been conducted by adding grammatical factors to feature sets for training classification model. The other approach adopts the semantic analysis method to sentiment analysis, but this approach is mainly applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm which is an extension of the neural network algorithms to deal with more extensive semantic features that were underestimated in existing sentiment analysis. The result from adopting the Word2Vec algorithm is compared to the result from co-occurrence analysis to identify the difference between two approaches. The results show that the distribution related word extracted by Word2Vec algorithm in that the words represent some emotion about the keyword used are three times more than extracted by co-occurrence analysis. The reason of the difference between two results comes from Word2Vec's semantic features vectorization. Therefore, it is possible to say that Word2Vec algorithm is able to catch the hidden related words which have not been found in traditional analysis. In addition, Part Of Speech (POS) tagging for Korean is used to detect adjective as "emotional word" in Korean. In addition, the emotion words extracted from the text are converted into word vector by the Word2Vec algorithm to find related words. Among these related words, noun words are selected because each word of them would have causal relationship with "emotional word" in the sentence. The process of extracting these trigger factor of emotional word is named "Emotion Trigger" in this study. As a case study, the datasets used in the study are collected by searching using three keywords: professor, prosecutor, and doctor in that these keywords contain rich public emotion and opinion. Advanced data collecting was conducted to select secondary keywords for data gathering. The secondary keywords for each keyword used to gather the data to be used in actual analysis are followed: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin hae-chul sky hospital, drinking and plastic surgery, rebate) Prosecutor (lewd behavior, sponsor). The size of the text data is about to 100,000(Professor: 25720, Doctor: 35110, Prosecutor: 43225) and the data are gathered from news, blog, and twitter to reflect various level of public emotion into text data analysis. As a visualization method, Gephi (http://gephi.github.io) was used and every program used in text processing and analysis are java coding. The contributions of this study are as follows: First, different approaches for sentiment analysis are integrated to overcome the limitations of existing approaches. Secondly, finding Emotion Trigger can detect the hidden connections to public emotion which existing method cannot detect. Finally, the approach used in this study could be generalized regardless of types of text data. The limitation of this study is that it is hard to say the word extracted by Emotion Trigger processing has significantly causal relationship with emotional word in a sentence. The future study will be conducted to clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing with the relationships manually tagged. Furthermore, the text data used in Emotion Trigger are twitter, so the data have a number of distinct features which we did not deal with in this study. These features will be considered in further study.

Clustering Method based on Genre Interest for Cold-Start Problem in Movie Recommendation (영화 추천 시스템의 초기 사용자 문제를 위한 장르 선호 기반의 클러스터링 기법)

  • You, Tithrottanak;Rosli, Ahmad Nurzid;Ha, Inay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.57-77
    • /
    • 2013
  • Social media has become one of the most popular media in web and mobile application. In 2011, social networks and blogs are still the top destination of online users, according to a study from Nielsen Company. In their studies, nearly 4 in 5active users visit social network and blog. Social Networks and Blogs sites rule Americans' Internet time, accounting to 23 percent of time spent online. Facebook is the main social network that the U.S internet users spend time more than the other social network services such as Yahoo, Google, AOL Media Network, Twitter, Linked In and so on. In recent trend, most of the companies promote their products in the Facebook by creating the "Facebook Page" that refers to specific product. The "Like" option allows user to subscribed and received updates their interested on from the page. The film makers which produce a lot of films around the world also take part to market and promote their films by exploiting the advantages of using the "Facebook Page". In addition, a great number of streaming service providers allows users to subscribe their service to watch and enjoy movies and TV program. They can instantly watch movies and TV program over the internet to PCs, Macs and TVs. Netflix alone as the world's leading subscription service have more than 30 million streaming members in the United States, Latin America, the United Kingdom and the Nordics. As the matter of facts, a million of movies and TV program with different of genres are offered to the subscriber. In contrast, users need spend a lot time to find the right movies which are related to their interest genre. Recent years there are many researchers who have been propose a method to improve prediction the rating or preference that would give the most related items such as books, music or movies to the garget user or the group of users that have the same interest in the particular items. One of the most popular methods to build recommendation system is traditional Collaborative Filtering (CF). The method compute the similarity of the target user and other users, which then are cluster in the same interest on items according which items that users have been rated. The method then predicts other items from the same group of users to recommend to a group of users. Moreover, There are many items that need to study for suggesting to users such as books, music, movies, news, videos and so on. However, in this paper we only focus on movie as item to recommend to users. In addition, there are many challenges for CF task. Firstly, the "sparsity problem"; it occurs when user information preference is not enough. The recommendation accuracies result is lower compared to the neighbor who composed with a large amount of ratings. The second problem is "cold-start problem"; it occurs whenever new users or items are added into the system, which each has norating or a few rating. For instance, no personalized predictions can be made for a new user without any ratings on the record. In this research we propose a clustering method according to the users' genre interest extracted from social network service (SNS) and user's movies rating information system to solve the "cold-start problem." Our proposed method will clusters the target user together with the other users by combining the user genre interest and the rating information. It is important to realize a huge amount of interesting and useful user's information from Facebook Graph, we can extract information from the "Facebook Page" which "Like" by them. Moreover, we use the Internet Movie Database(IMDb) as the main dataset. The IMDbis online databases that consist of a large amount of information related to movies, TV programs and including actors. This dataset not only used to provide movie information in our Movie Rating Systems, but also as resources to provide movie genre information which extracted from the "Facebook Page". Formerly, the user must login with their Facebook account to login to the Movie Rating System, at the same time our system will collect the genre interest from the "Facebook Page". We conduct many experiments with other methods to see how our method performs and we also compare to the other methods. First, we compared our proposed method in the case of the normal recommendation to see how our system improves the recommendation result. Then we experiment method in case of cold-start problem. Our experiment show that our method is outperform than the other methods. In these two cases of our experimentation, we see that our proposed method produces better result in case both cases.

Performance Evaluation and Analysis of NVMe SSD (Non-volatile Memory Express 인터페이스 기반 저장장치의 성능 평가 및 분석)

  • Son, Yongseok;Yeom, Heon Young;Han, Hyuck
    • KIISE Transactions on Computing Practices
    • /
    • v.23 no.7
    • /
    • pp.428-433
    • /
    • 2017
  • Recently, the demand for high performance non-volatile memory storage devices that can replace existing hard disks has been increasing in environments requiring high performance computing such as data-centers and social network services. The performance of such non-volatile memory can greatly depend on the interface between the host and the storage device. With the evolution of storage interfaces, the non-volatile memory express (NVMe) interface has emerged, which can replace serial attached SCSI and serial ATA (SAS/SATA) interfaces based on existing hard disks. The NVMe interface has a higher level of scalability and provides lower latency than traditional interfaces. In this paper, an evaluation and analysis are conducted of the performance of NVMe storage devices through various workloads. We also compare and evaluate the cost efficiency of NVMe SSD and SATA SSD.

TRED : Twitter based Realtime Event-location Detector (트위터 기반의 실시간 이벤트 지역 탐지 시스템)

  • Yim, Junyeob;Hwang, Byung-Yeon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.4 no.8
    • /
    • pp.301-308
    • /
    • 2015
  • SNS is a web-based online platform service supporting the formation of relations between users. SNS users have usually used a desktop or laptop for this purpose so far. However, the number of SNS users is greatly increasing and their access to the web is improving with the spread of smart phones. They share their daily lives with other users through SNSs. We can detect events if we analyze the contents that are left by SNS users, where the individual acts as a sensor. Such analyses have already been attempted by many researchers. In particular, Twitter is used in related spheres in various ways, because it has structural characteristics suitable for detecting events. However, there is a limitation concerning the detection of events and their locations. Thus, we developed a system that can detect the location immediately based on the district mentioned in Twitter. We tested whether the system can function in real time and evaluated its ability to detect events that occurred in reality. We also tried to improve its detection efficiency by removing noise.