• Title/Summary/Keyword: Social language use

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People now create a tremendous amount of data on social network services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, greatly influencing society. This phenomenon is unmatched in history, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of data amount (volume), data input and output speed (velocity), and diversity of data types (variety). Because this data covers the whole of society, the trend of an issue discovered in SNS Big Data can serve as an important new source of value. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) provide the topic keyword set that corresponds to the daily ranking; (2) visualize the daily time-series graph of a topic over a month; (3) show the importance of a topic through a treemap based on the score system and frequency; (4) visualize the daily time-series graph of a keyword found by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop word removal and noun extraction, to process various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to rapidly process large amounts of real-time data, such as the Hadoop distributed system or NoSQL, an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from a single node to thousands of machines. 
Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB does not require a predefined schema or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for creating Data-Driven Documents that bind the document object model (DOM) to arbitrary data; interaction with the data is easy, and the library is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, a set of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS graphical user interface (GUI) is designed using these libraries and can detect issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time-series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. Experiments were conducted with nearly 150 million tweets posted in Korea during March 2013.
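
The preprocessing and keyword time-series functions described in this abstract can be illustrated with a minimal Python sketch. It is not the TITS implementation: it uses plain keyword counting rather than topic modeling, whitespace tokenization stands in for Korean noun extraction, and the stop word list, function names, and sample tweets are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical stop word list; a real system would use a language-specific one.
STOP_WORDS = {"the", "a", "is", "on", "in", "of", "and"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words and non-alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOP_WORDS]


def daily_keyword_counts(tweets):
    """Map each day to a Counter of keyword frequencies for that day."""
    counts = defaultdict(Counter)
    for day, text in tweets:
        counts[day].update(tokenize(text))
    return counts


def top_keywords(counts, day, k):
    """Return the k most frequent keywords for one day (a daily ranking)."""
    return [word for word, _ in counts[day].most_common(k)]


def keyword_series(counts, keyword):
    """Return the daily frequency of one keyword, ordered by day."""
    return {day: counts[day][keyword] for day in sorted(counts)}


# Invented sample data, for illustration only.
tweets = [
    ("2013-03-01", "Election results in the news"),
    ("2013-03-01", "election debate on TV"),
    ("2013-03-02", "Baseball season opens and election talk fades"),
]
counts = daily_keyword_counts(tweets)
print(top_keywords(counts, "2013-03-01", 1))
print(keyword_series(counts, "election"))
```

The per-day dictionary returned by `keyword_series` is the kind of record a charting library could plot as a daily time series.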

The Effect of Corporate SNS Marketing on User Behavior: Focusing on Facebook Fan Page Analytics (기업의 SNS 마케팅 활동이 이용자 행동에 미치는 영향: 페이스북 팬페이지 애널리틱스를 중심으로)

  • Jeon, Hyeong-Jun;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.75-95 / 2020
  • With the growth of social networks, various forms of SNS have emerged. Driven by motivations such as interactivity, information exchange, and entertainment, the number of SNS users is also growing fast. Facebook is the main SNS channel, and companies have started using Facebook pages as a public relations channel. To this end, in the early stages of operation, companies worked to secure fans, and as a result the number of fans of a corporate Facebook page has recently grown to as many as millions. From a corporate perspective, Facebook attracts attention because it makes it easier to reach the desired customers. Facebook provides an efficient advertising platform based on the extensive data it holds. Advertising can be targeted using users' demographic characteristics, behavior, or contact information. The platform is optimized for exposing information to a desired target, so results can be obtained more effectively. Companies can also rethink and communicate their corporate brand image to customers through content. This study was conducted using Facebook advertising data and could be of great help to practitioners in the online advertising industry. For this reason, the independent variables were selected based on the content characteristics that practitioners are actually concerned with. Recently, the goal of operating a corporate Facebook page has gone beyond securing fans to promoting the brand and, further, communicating with major customers. The main figures for this assessment are Facebook's 'OK', 'Attachment', 'Share', and 'Number of Click', which are the dependent variables of this study. To measure the outcome, the consumer's response is set as a measurable key performance indicator (KPI), and a strategy is set and executed to achieve it. 
Here, the KPIs are Facebook's advertising figures 'reach', 'exposure', 'like', 'share', 'comment', 'clicks', and 'CPC', depending on the situation. To achieve the corresponding figures, content production must be considered first, and in this study the independent variables were organized into three considerations for content production. The effects of content material, content structure, and message style on Facebook user behavior were analyzed using regression analysis. Content material relates to the content's difficulty, company relevance, and daily involvement. According to existing research, how the content attracts users' interest is very important. Content can be divided into informative content and interesting content. Informative content is content related to the brand, for which information exchange with users is important. Interesting content is defined as posts unrelated to the brand, such as interesting movies or anecdotes. Based on this, this study started from the assumption that difficulty, company relevance, and daily involvement affect the dependent variables. In addition, previous studies have found that content type affects Facebook user activity, which is thought to depend on the combination of photos and text used in the content. Based on this, the use of actual photos and of hashtags was also examined as independent variables. Finally, we focused on the advertising message. In previous studies, the effect of advertising messages on users differed depending on whether they were narrative or non-narrative, and the influence of message intimacy differed as well. In this study, we examined whether Facebook users' behavior differs depending on the language and formality of the message. For the dependent variables, 'OK' and 'Full Click Count' are determined by each user's actions on the content. 
In this study, we defined each independent variable based on the existing research literature and analyzed its effect on the dependent variables. The factors with a significant effect on 'good' were 'self-association', 'reality use', 'concernal material difficulty', 'real-life involvement', and 'massive*difficulty'. In addition, variables such as 'self-connection', 'real-life involvement', and 'formative*attention' had significant effects on 'full-click'. By presenting a content strategy optimized for the purpose of the content, the results are expected to contribute to the operation and production strategies of corporate Facebook operators and content producers.

A study on the liquor package design of international competitive advantage - Focused on Soju and Sake - (국제 경쟁력을 위한 술 포장디자인 연구 - 국내소주 및 일본 Sake 중심으로 -)

  • 장욱선
    • Archives of design research / v.16 no.3 / pp.151-160 / 2003
  • Packages have been used for a wide variety of purposes: for protection, for display, for transportation of goods, or for keeping personal belongings. According to the demands of society and the times, liquor packages have become specialized and have appeared in almost every shape and size without restriction to one particular type of material. In spite of its rapid development and wide application in our society, liquor package design has rarely been considered as a subject of comprehensive study. Majoring in package design, I have become especially interested in the area of liquor package design, and I would like to explore it from several aspects. With the advent of new markets and the rise of a new consumer society, advertising and mass media have expanded rapidly. While convenience of use is not a major issue, serving size certainly is, as are quality, appeal of heritage, and health concerns. Heritage is a major consumer appeal in whisky, beer, wine, and spirits. Designers have drawn heavily on the tradition of alcoholic products and have used type and graphics to create the illusion of heritage for new products. A sidelight to the heritage aspect of spirits packaging is the evolution of outer boxes for international liquors. International liquor package design illustrates both past and current themes. The design is contemporary and spare. Colored panels correlated to the liquor flavor are used on clean white, black, or gold boxes. While this research does not deny the impact of structural innovation and convenience in package design, it does deny the existence of a graphic plateau. It is assumed, therefore, that developments in technology can facilitate communication between East and West. This can be accomplished because, as product containers are used in social settings, their form will gradually exert a strong influence on the need for economical, easily handled, easily utilized packaging. 
Typically, ethnic package designs are packages containing products that are prepared and marketed to a category of people who share cultural traits. They are liquor products sold in the metropolitan New York area that are marketed specifically to Asian, Hispanic, or European populations. These cultural groups share numerous traits, including religion, language, dietary habits, and traditional drinking styles. Therefore, products that are familiar or common in their native countries are often imported or marketed there to serve them. These packages and products are frequently found on the shelves of supermarkets in predominantly ethnic areas. That is, if packaging for Korean and Japanese products is designed correctly, it would appeal to the American market. My research suggests that an oriental beverage, Soju, is a good example of this precept. Presumably, there must be a degree of subjectivity, since it is a means by which consumers can relate to the advertising. The degree to which consumers relate and identify is the degree to which the package will be remembered and purchased. Subjectivity is intimately related to purchases, since there is no such thing as a rational purchase in a society that operates on mass consumption. It is essential that packages become more personal, human, entertaining, and more like advertising in order to maximize merchandising potential.

The Standard of Judgement on Plagiarism in Research Ethics and the Guideline of Global Journals for KODISA (KODISA 연구윤리의 표절 판단기준과 글로벌 학술지 가이드라인)

  • Hwang, Hee-Joong;Kim, Dong-Ho;Youn, Myoung-Kil;Lee, Jung-Wan;Lee, Jong-Ho
    • Journal of Distribution Science / v.12 no.6 / pp.15-20 / 2014
  • Purpose - In general, researchers try to abide by the code of research ethics, but many of them are not fully aware of plagiarism and unintentionally commit research misconduct when they write a research paper. This research aims to introduce to researchers, at a conference, a clear and easy guideline that helps them avoid accidental plagiarism by addressing the issue. This research is expected to contribute to building a climate of creative research and encouraging it among scholars. Research design, data, methodology & Results - Plagiarism is considered a form of research misconduct along with fabrication and falsification. It is defined as the improper use of another author's ideas, language, process, or results without giving appropriate credit. Plagiarism has nothing to do with examining the truth or assessing the value of research data, process, or results. Plagiarism is determined based on whether a piece of research conforms to widely used research ethics and contains proper citations. Within academia, plagiarism goes beyond the legal boundary, encompassing any kind of intentional wrongful appropriation of research created by another researcher. In summary, plagiarism is stealing other people's creative ideas, research models, hypotheses, methods, definitions, variables, images, tables, and graphs, and using them without reasonable attribution to their true sources. There are various types of plagiarism. Some people classify plagiarism into idea plagiarism, text plagiarism, mosaic plagiarism, and idea distortion. 
Others hold that plagiarism includes uncredited use of another person's work without appropriate citations, self-plagiarism (using part of a researcher's own previous research without proper citations), duplicate publication (publishing a researcher's own previous work under a different title), and unethical citation (using quoted parts of another person's research without proper citations, as if the parts were being cited by the current author). When an author wants to cite a part that was previously drawn from another source, the author is supposed to reveal that the part is re-cited. If it is hard to state all the sources, the author is allowed to mention the original source only. Today, various disciplines are developing their own measures to address these plagiarism issues, especially duplicate publication, by requiring researchers to clearly reveal true sources when they refer to any other research. Conclusions - Research misconduct, including plagiarism, has broad and unclear boundaries that allow ambiguous definitions and diverse interpretations. It seems difficult for researchers to have a clear understanding of how to avoid plagiarism and how to cite others' works properly. However, if guidelines for detecting and avoiding plagiarism are developed with the characteristics of each discipline in mind (for example, the social sciences and the natural sciences may have different standards on plagiarism) and shared among researchers, researchers will likely reach a consensus and understanding regarding the issue. In particular, since duplicate publication has appeared more frequently than plagiarism, academic institutions will need to provide advance warning and screening in evaluation processes in order to reduce researchers' mistakes and prevent duplicate publication. What is critical for researchers is to clearly reveal the true sources based on common citation rules and to borrow only the necessary amount of others' research.

Toward a Sociological Understanding of Koreans in Small Business in the United States (미국에서 한인 자영업에 관한 연구)

  • 최병목
    • Korea journal of population studies / v.19 no.2 / pp.139-173 / 1996
  • This study attempts to identify factors affecting the concentration of Korean immigrants in small business enterprises in the middleman minority sector, including the periphery and core sectors, with private wage workers and self-employed workers examined in each sector, using the 5 percent public use sample from the 1980 United States census. One out of five Koreans aged 25 to 64 is engaged in a self-employed small business, while the majority of Koreans (4 out of 5) are in the private wage sector. Contrary to expectations, English language difficulties and inferior education are not the prime factors affecting self-employment in small businesses. Korean self-employed small business owners in both the periphery sector and the core sector occupied the 'middle' stratum of the social structure in terms of their industry, occupation, earnings, etc.

A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

  • Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
    • Journal of Intelligence and Information Systems / v.20 no.3 / pp.77-92 / 2014
  • Recently, numerous documents containing unstructured data and text have been created owing to the rapid increase in the use of social media and the Internet. Each document is usually assigned a specific category for the convenience of users. In the past, categorization was performed manually. However, manual categorization not only cannot guarantee accuracy but also requires a large amount of time and cost. Many studies have addressed the automatic creation of categories to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to complex documents with multiple topics because they assume that a document belongs to one category only. To overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, these are also limited in that their learning process requires training on a multi-categorized document set. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets are provided. To remove the requirement of a multi-categorized training set imposed by traditional multi-categorization algorithms, we propose a new methodology that extends the category of a single-categorized document to multiple categories by analyzing the relationships among categories, topics, and documents. First, we find the relationship between documents and topics by using the results of topic analysis on single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate the matching scores of each document against multiple categories. 
The results imply that a document can be classified into a certain category if and only if the matching score is higher than the predefined threshold. For example, we can classify a document into the three categories whose matching scores exceed the predefined threshold. The main contribution of our study is that our methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles. News articles are clearly categorized by theme, and vulgar language and slang are less common in them than in other typical text documents. We collected news articles from July 2012 to June 2013. The number of articles varies widely across categories. This is because readers have different levels of interest in each category. The variation is also attributable to differences in the frequency of events in each category. To minimize the distortion caused by differing numbers of articles per category, we extracted 3,000 articles from each of the eight categories. Therefore, the total number of articles used in our experiments was 24,000. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." Using the collected news articles, we calculated the document/category correspondence scores from the topic/category and document/topic correspondence scores. The document/category correspondence score indicates the degree to which each document corresponds to a certain category. As a result, we could present two additional categories for each of the 23,089 documents. 
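
The scoring step can be sketched in Python under stated assumptions. The abstract does not say how the document/topic and topic/category scores are combined, so the weighted sum below is an assumption, and the function names, sample weights, and threshold value are hypothetical.

```python
def document_category_scores(doc_topics, topic_categories):
    """Combine document/topic and topic/category scores into document/category scores.

    Assumption: the score of a category is the sum, over topics, of
    (document/topic weight) * (topic/category weight).
    """
    scores = {}
    for topic, weight in doc_topics.items():
        for category, strength in topic_categories.get(topic, {}).items():
            scores[category] = scores.get(category, 0.0) + weight * strength
    return scores


def assign_categories(scores, threshold):
    """Return categories whose score exceeds the threshold, highest score first."""
    selected = [c for c, s in scores.items() if s > threshold]
    return sorted(selected, key=lambda c: -scores[c])


# Invented weights, for illustration only.
doc_topics = {"t1": 0.6, "t2": 0.3, "t3": 0.1}
topic_categories = {
    "t1": {"Economy": 0.8, "Politics": 0.2},
    "t2": {"Politics": 0.7, "World": 0.3},
    "t3": {"Sports": 1.0},
}
scores = document_category_scores(doc_topics, topic_categories)
print(assign_categories(scores, threshold=0.2))
```

With these sample weights, a document whose original single category was "Economy" would also be assigned "Politics", which illustrates how a single-categorized document gains additional categories.
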
Precision, recall, and F-score were 0.605, 0.629, and 0.617, respectively, when only the top 1 predicted category was evaluated, whereas they were 0.838, 0.290, and 0.431 when the top 1 to 3 predicted categories were considered. Notably, precision, recall, and F-score varied widely across the eight categories.
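
The reported F-scores are consistent with the balanced F1 measure, the harmonic mean of precision and recall. The short Python check below assumes that definition, which the abstract does not state explicitly; the function name is illustrative.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


top1 = f_score(0.605, 0.629)   # top 1 predicted category
top3 = f_score(0.838, 0.290)   # top 1 to 3 predicted categories
print(f"{top1:.3f}")  # 0.617
print(f"{top3:.3f}")  # 0.431
```

The second case shows the usual trade-off: precision rises, recall falls sharply, and the harmonic mean is pulled toward the lower value.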