• Title/Summary/Keyword: Internet Use


Utilization of Smart Farms in Open-field Agriculture Based on Digital Twin (디지털 트윈 기반 노지스마트팜 활용방안)

  • Kim, Sukgu
    • Proceedings of the Korean Society of Crop Science Conference / 2023.04a / pp.7-7 / 2023
  • The core technologies of the Fourth Industrial Revolution currently include big data, the Internet of Things, artificial intelligence, blockchain, mixed reality (MR), and drones. In particular, the "digital twin," which has recently become a global technology trend, is a virtual model that represents a physical object identically on a computer. By creating and simulating a digital twin of software-virtualized assets instead of real physical assets, accurate information about the characteristics of real farming (current state, agricultural productivity, agricultural work scenarios, etc.) can be obtained. This study aims to streamline agricultural work through automatic water management, remote growth forecasting, drone control, and pest forecasting, all run from an integrated control system, by constructing digital twin data for the main production areas of open-field agriculture and by designing and building a smart farm complex. In addition, it aims to spread digital environmental-control agriculture in Korea that can reduce labor and improve crop productivity while minimizing environmental load, by using big data analysis to apply appropriate amounts of fertilizers and pesticides. These open-field agricultural technologies can reduce labor through digital farming and cultivation management, optimize water use and prevent soil pollution in preparation for climate change, and enable quantitative growth management of open-field crops by securing digital data on the national cultivation environment. It is also a way to directly implement carbon-neutral RED++ activities by improving agricultural productivity. Analyzing and predicting growth status from high-precision, high-definition image-based crop growth data is very effective for managing digital farming work. 
The Southern Crop Department of the National Institute of Food Science has conducted research and development on various types of open-field smart farms, such as underground point and underground drainage. In particular, commercialization has begun in earnest this year through the establishment of smart farm facilities and the distribution of technology to agricultural technology complexes across the country. In this study, we describe a case of establishing an agricultural field that combines digital twin technology with open-field smart farm technology, along with plans for its future use.


Mediating Roles of Attachment for Information Sharing in Social Media: Social Capital Theory Perspective (소셜 미디어에서 정보공유를 위한 애착의 매개역할: 사회적 자본이론 관점)

  • Chung, Namho;Han, Hee Jeong;Koo, Chulmo
    • Asia pacific journal of information systems / v.22 no.4 / pp.101-123 / 2012
  • Social media has become a widely known keyword, and its related social trends and businesses have been rapidly applied in various contexts. Social media has become an important research area for scholars interested in online technologies, cyberspace, and their social impacts. Social media includes not only web-based services but also mobile application services that allow people to share various kinds of information and knowledge through online connections. Social media users tend to form common identity- and bond-attachment through interactions such as 'thumbs up', 'reply note', and 'forwarding', which may be driven by various factors and may result in delivering information and sharing knowledge and specific experiences. Furthermore, almost all social media sites connect strangers on the basis of shared interests, political views, enjoyable activities, and other matters involving the creation of content, which provides benefits to users. With the rapid development of digital devices and services, including smartphones, tablet PCs, internet-based blogging, and photo and video clips, scholars have begun to study diverse issues connecting people's motivations and behavioral outcomes, which may be framed as antecedents and consequences of the content that people create via social media. Users of social media such as Facebook, Twitter, and Cyworld are growing closer to one another and building relationships in new ways. In this sense, people use social media as tools for maintaining pre-existing networks, making new social contacts, and, at the same time, explicitly finding business opportunities through personal and unlimited public networks. As a theory for explaining this phenomenon, social capital is a concept that describes the benefits one receives from one's relationships with others. 
Social media use is therefore closely related to how people form connections: it is a bridge for achieving the informational benefits of a heterogeneous network of people, and it supports common identity- and bond-attachment, which emphasizes the emotional benefits obtained from community members or groups of friends. Social capital consists of resources accumulated through relationships among people; it can be considered an investment in social relations with expected returns, and it may yield benefits through greater access to and use of resources embedded in social networks. Using social media to build social capital has been widely adopted in the cyber world; however, there has been little theoretical explanation of how people may gain advantages or opportunities through interaction, or of why people willingly offer help or answers. Individuals consciously express themselves in online spaces through so-called common identity- or bond-attachment. Common-identity attachment centers on weak ties, which are loose connections between individuals who may provide useful information or new perspectives to one another but typically not emotional support, whereas common-bond attachment arises between individuals in tightly knit, emotionally close relationships such as family and close friends. Common identity- and bond-attachment have mainly been studied in on- and offline settings, in which individuals convey impressions of themselves and express their own interests to others. Thus, individuals expect to meet other people and try to present themselves in a way that engages their counterparts. As social media develops, individuals are motivated to make open and honest self-disclosures, using diverse cues such as verbal, nonverbal, pictorial, and video content, to their friends as well as to passing strangers. 
In the social media context, common identity- and bond-attachment for self-presentation appears to differ from that in the face-to-face context. In the realm of social media, users manage their self-impression by posting text messages, pictures, and video files. In digital environments, people interact to work, shop, learn, entertain, and play. Social media supports an increasing range of online intentions and behaviors. Typically, identity and bond social capital through self-presentation is the intentional and tangible component of identity. On social media, people try to engage others through a desired impression, which they can maintain by performing coherent and complementary communications, including displaying signs, symbols, and brands made of digital material (information, interests, pictures, etc.). In marketing, consumers traditionally show common identity by selecting clothes, hairstyles, automobiles, logos, and so on to impress others in a given context, such as a shopping mall or an opera. To examine social capital and attachment, we combined social capital theory with attachment theory in our research model. Our research model focuses on how common identity- and bond-attachment are formed through social capital (cognitive capital, structural capital, and relational capital) and individual characteristics. We found that individual online kindness, self-rated expertise, and social relations influence the building of common identity- and bond-attachment, and that attachment affects the willingness to help; common bond attachment, however, does not appear to have a direct impact on information sharing. As a result, we find that social capital and attachment theories are largely applicable to the context of social media and its use in individual networks. 
We collected sample data from 256 users of social media such as Facebook, Twitter, and Cyworld and tested the proposed hypotheses using structural equation modeling in AMOS. This study analyzes the direct and indirect relationships between social network service usage and outcomes. The antecedents of kindness, confidence in knowledge, and social relations significantly affect the mediators, common identity- and bond-attachment; interestingly, however, network externality has no impact. We had assumed that network size would have a negative effect, because group members would not contribute significantly if they did not intend to interact actively with each other. The mediating variables had a positive effect on willingness to help. Furthermore, common identity attachment has a stronger significant effect on information sharing.


Literature Analysis of Radiotherapy in Uterine Cervix Cancer for the Processing of the Patterns of Care Study in Korea (한국에서 자궁경부알 방사선치료의 Patterns of Care Study 진행을 위한 문헌 비교 연구)

  • Choi Doo Ho;Kim Eun Seog;Kim Yong Ho;Kim Jin Hee;Yang Dae Sik;Kang Seung Hee;Wu Hong Gyun;Kim Il Han
    • Radiation Oncology Journal / v.23 no.2 / pp.61-70 / 2005
  • Purpose: Uterine cervix cancer is one of the most prevalent cancers among women in Korea. We analysed papers published in Korea and compared them with Patterns of Care Study (PCS) articles from the United States and Japan for the purpose of developing and conducting a Korean PCS. Materials and Methods: We searched for PCS-related foreign papers on the PCS homepage (212 articles and abstracts) and in PubMed to identify the Structure and Process of the PCS. To compare these studies with Korean papers, we used the internet site 'Korean Pub Med' to search 99 articles on uterine cervix cancer and radiation therapy. We analysed the Korean papers by comparing them with selected PCS papers with respect to Structure, Process and Outcome, and compared their items between the period before the 1980s and the 1990s. Results: Evaluable papers that addressed cervix PCS items numbered 28 from the United States, 10 from Japan, and 73 from Korea. PCS papers from the United States and Japan commonly stratified facilities into 3-4 categories on the basis of the scale and characteristics of the facilities and the numbers of patients and doctors. Researchers restricted eligible patients strictly. For the process of the study, they analysed factors regarding pretreatment staging in chronological order, treatment-related factors, factors in addition to FIGO staging, and treatment machines. Papers from the United States dealt with racial and socioeconomic characteristics of the patients, tumor size (6), and bilaterality of parametrial or pelvic side wall invasion (5), whereas papers from Japan dealt with tumor markers. The common trend in the staging work-up was decreased use of lymphangiogram and barium enema and increased use of CT and MRI over time. Recent Korean papers dealt with concurrent chemoradiotherapy (9 papers), treatment duration (4), tumor markers (B) and unconventional fractionation. 
Conclusion: By comparing papers from the three nations, we collected items for a Korean uterine cervix cancer PCS. Through consensus meetings and close communication, survey items for the cervix cancer PCS were developed to measure the structure, process and outcome of radiation treatment of cervix cancer. Future research will focus on the use of brachytherapy and its impact on outcome, including complications. These findings and future PCS studies will direct the development of educational programs aimed at correcting identified deficits in care.

Methodology for Identifying Issues of User Reviews from the Perspective of Evaluation Criteria: Focus on a Hotel Information Site (사용자 리뷰의 평가기준 별 이슈 식별 방법론: 호텔 리뷰 사이트를 중심으로)

  • Byun, Sungho;Lee, Donghoon;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.23-43 / 2016
  • As a result of the growth of Internet data and the rapid development of Internet technology, "big data" analysis has gained prominence as a major approach for evaluating and mining enormous data for various purposes. Especially, in recent years, people tend to share their experiences related to their leisure activities while also reviewing others' inputs concerning their activities. Therefore, by referring to others' leisure activity-related experiences, they are able to gather information that might guarantee them better leisure activities in the future. This phenomenon has appeared throughout many aspects of leisure activities such as movies, traveling, accommodation, and dining. Apart from blogs and social networking sites, many other websites provide a wealth of information related to leisure activities. Most of these websites provide information of each product in various formats depending on different purposes and perspectives. Generally, most of the websites provide the average ratings and detailed reviews of users who actually used products/services, and these ratings and reviews can actually support the decision of potential customers in purchasing the same products/services. However, the existing websites offering information on leisure activities only provide the rating and review based on one stage of a set of evaluation criteria. Therefore, to identify the main issue for each evaluation criterion as well as the characteristics of specific elements comprising each criterion, users have to read a large number of reviews. In particular, as most of the users search for the characteristics of the detailed elements for one or more specific evaluation criteria based on their priorities, they must spend a great deal of time and effort to obtain the desired information by reading more reviews and understanding the contents of such reviews. 
Although some websites break down the evaluation criteria and direct the user to input their reviews according to different levels of criteria, there exist excessive amounts of input sections that make the whole process inconvenient for the users. Further, problems may arise if a user does not follow the instructions for the input sections or fill in the wrong input sections. Finally, treating the evaluation criteria breakdown as a realistic alternative is difficult, because identifying all the detailed criteria for each evaluation criterion is a challenging task. For example, if a review about a certain hotel has been written, people tend to only write one-stage reviews for various components such as accessibility, rooms, services, or food. These might be the reviews for most frequently asked questions, such as distance between the nearest subway station or condition of the bathroom, but they still lack detailed information for these questions. In addition, in case a breakdown of the evaluation criteria was provided along with various input sections, the user might only fill in the evaluation criterion for accessibility or fill in the wrong information such as information regarding rooms in the evaluation criteria for accessibility. Thus, the reliability of the segmented review will be greatly reduced. In this study, we propose an approach to overcome the limitations of the existing leisure activity information websites, namely, (1) the reliability of reviews for each evaluation criteria and (2) the difficulty of identifying the detailed contents that make up the evaluation criteria. In our proposed methodology, we first identify the review content and construct the lexicon for each evaluation criterion by using the terms that are frequently used for each criterion. Next, the sentences in the review documents containing the terms in the constructed lexicon are decomposed into review units, which are then reconstructed by using the evaluation criteria. 
Finally, the issues of the constructed review units by evaluation criteria are derived and the summary results are provided. Apart from the derived issues, the review units are also provided. Therefore, this approach aims to help users save on time and effort, because they will only be reading the relevant information they need for each evaluation criterion rather than go through the entire text of review. Our proposed methodology is based on the topic modeling, which is being actively used in text analysis. The review is decomposed into sentence units rather than considering the whole review as a document unit. After being decomposed into individual review units, the review units are reorganized according to each evaluation criterion and then used in the subsequent analysis. This work largely differs from the existing topic modeling-based studies. In this paper, we collected 423 reviews from hotel information websites and decomposed these reviews into 4,860 review units. We then reorganized the review units according to six different evaluation criteria. By applying these review units in our methodology, the analysis results can be introduced, and the utility of proposed methodology can be demonstrated.
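The decomposition step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the lexicon contents, the criterion names, and the function names `split_sentences` and `review_units` are hypothetical, and the topic-modeling step that the authors apply afterwards to derive issues is not shown.

```python
import re
from collections import defaultdict

# Hypothetical lexicon: terms assumed to be frequent for each evaluation criterion.
LEXICON = {
    "accessibility": {"subway", "station", "airport", "walk"},
    "rooms": {"room", "bed", "bathroom"},
    "services": {"staff", "service", "reception"},
}


def split_sentences(review):
    """Split a review into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def review_units(reviews, lexicon):
    """Group sentences (review units) under every criterion whose terms they contain."""
    units = defaultdict(list)
    for review in reviews:
        for sentence in split_sentences(review):
            tokens = set(re.findall(r"[a-z]+", sentence.lower()))
            for criterion, terms in lexicon.items():
                if tokens & terms:
                    units[criterion].append(sentence)
    return dict(units)


if __name__ == "__main__":
    sample = [
        "The subway station is a five minute walk away. The bathroom was spotless.",
        "Staff were friendly!",
    ]
    for criterion, sentences in review_units(sample, LEXICON).items():
        print(criterion, sentences)
```

Working at the sentence level rather than the whole-review level is what lets a single review contribute to several criteria at once.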

Visualizing the Results of Opinion Mining from Social Media Contents: Case Study of a Noodle Company (소셜미디어 콘텐츠의 오피니언 마이닝결과 시각화: N라면 사례 분석 연구)

  • Kim, Yoosin;Kwon, Do Young;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.89-105 / 2014
  • Since the emergence of the Internet, social media with highly interactive Web 2.0 applications has provided very user-friendly means for consumers and companies to communicate with each other. Users routinely publish content involving their opinions and interests on social media such as blogs, forums, chat rooms, and discussion boards, and the content is released in real time on the Internet. For that reason, many researchers and marketers regard social media content as a source of information for business analytics to develop business insights, and many studies have reported results on mining business intelligence from social media content. In particular, opinion mining and sentiment analysis, as techniques to extract, classify, understand, and assess the opinions implicit in text content, are frequently applied to social media content analysis because they emphasize determining sentiment polarity and extracting authors' opinions. A number of frameworks, methods, techniques and tools have been presented by these researchers. However, we have found some weaknesses in their methods, which are often technically complicated and not sufficiently user-friendly for supporting business decisions and planning. In this study, we attempted to formulate a more comprehensive and practical approach to conducting opinion mining with visual deliverables. First, we described the entire cycle of practical opinion mining using social media content, from the initial data gathering stage to the final presentation session. Our proposed approach to opinion mining consists of four phases: collecting, qualifying, analyzing, and visualizing. In the first phase, analysts have to choose the target social media. Each target medium requires a different means of access for analysts, such as open APIs, search tools, DB2DB interfaces, purchasing content, and so on. The second phase is pre-processing to generate useful materials for meaningful analysis. 
If we do not remove garbage data, results of social media analysis will not provide meaningful and useful business insights. To clean social media data, natural language processing techniques should be applied. The next step is the opinion mining phase where the cleansed social media content set is to be analyzed. The qualified data set includes not only user-generated contents but also content identification information such as creation date, author name, user id, content id, hit counts, review or reply, favorite, etc. Depending on the purpose of the analysis, researchers or data analysts can select a suitable mining tool. Topic extraction and buzz analysis are usually related to market trends analysis, while sentiment analysis is utilized to conduct reputation analysis. There are also various applications, such as stock prediction, product recommendation, sales forecasting, and so on. The last phase is visualization and presentation of analysis results. The major focus and purpose of this phase are to explain results of analysis and help users to comprehend its meaning. Therefore, to the extent possible, deliverables from this phase should be made simple, clear and easy to understand, rather than complex and flashy. To illustrate our approach, we conducted a case study on a leading Korean instant noodle company. We targeted the leading company, NS Food, with 66.5% of market share; the firm has kept No. 1 position in the Korean "Ramen" business for several decades. We collected a total of 11,869 pieces of contents including blogs, forum contents and news articles. After collecting social media content data, we generated instant noodle business specific language resources for data manipulation and analysis using natural language processing. In addition, we tried to classify contents in more detail categories such as marketing features, environment, reputation, etc. 
In these phases, we used freeware packages for the R project such as TM, KoNLP, ggplot2 and plyr. As a result, we presented several useful visualization outputs, such as domain-specific lexicons, volume and sentiment graphs, topic word clouds, heat maps, valence tree maps, and other visualized images, to provide vivid, full-colored examples using open library software packages of the R project. Business actors can detect at a glance which areas are weak, strong, positive, negative, quiet or loud. The heat map can explain movements of sentiment or volume in a matrix of categories and time, showing the density of color over time periods. The valence tree map, one of the most comprehensive and holistic visualization models, should be very helpful for analysts and decision makers to quickly understand the "big picture" of the business situation through a hierarchical structure, since a tree map can present buzz volume and sentiment as a visualized result for a certain period. This case study offers real-world business insights from market sensing, which demonstrates to practical-minded business users how they can use these types of results for timely decision making in response to ongoing changes in the market. We believe our approach can provide a practical and reliable guide to opinion mining with visualized results that are immediately useful, not just in the food industry but in other industries as well.

Context Sharing Framework Based on Time Dependent Metadata for Social News Service (소셜 뉴스를 위한 시간 종속적인 메타데이터 기반의 컨텍스트 공유 프레임워크)

  • Ga, Myung-Hyun;Oh, Kyeong-Jin;Hong, Myung-Duk;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.19 no.4 / pp.39-53 / 2013
  • The emergence of internet technology and SNS has increased the flow of information and changed the way people communicate from one-way to two-way communication. Users not only consume information; they can also create it and share it with their friends across social network services. This has also changed social media into one of the most important communication tools, which now includes Social TV. Social TV is a form in which people can watch a TV program and at the same time share information about it or its content with friends through social media. Social News, also known as participatory social media, is becoming popular. It influences user interest through the Internet by representing social issues, and it builds news credibility based on users' reputations. However, conventional news service platforms focus only on the news recommendation domain. Recent developments in SNS have changed this landscape by allowing users to share and disseminate news. Conventional platforms do not provide any special way for news to be shared. Currently, Social News Services only allow users to access a news item in its entirety; users cannot access just the parts of the content that relate to their interests. For example, if a user is interested in only part of a news item and wants to share that part, it is still hard to do so. In the worst case, users might understand the news in a different context. To solve this, a Social News Service must provide a method for supplying additional information. For example, Yovisto, an academic video search service, provides time-dependent metadata for videos. Users can search for and watch part of a video according to the time-dependent metadata, and they can share the content with friends on social media. Yovisto divides or synchronizes a video at each point where the presentation slides change to another page. 
However, we cannot employ this method on news video, since news video is not accompanied by PowerPoint presentation slides. A segmentation method is required to separate the news video and to create time-dependent metadata. In this paper, a time-dependent metadata-based framework is proposed to segment news content and to provide time-dependent metadata so that users can use context information to communicate with their friends. The transcript of the news is divided using the proposed story segmentation method. We provide a tag to represent the entire content of the news, and a sub tag, which includes the starting time, to indicate each news segment. The time-dependent metadata helps users track the news information. It also allows them to leave a comment on each segment of the news. Users may also share the news, based on the time metadata, as a segment or as a whole, which helps recipients understand the shared news. To demonstrate performance, we evaluate the accuracy of story segmentation and of tag generation. For this purpose, we measured the accuracy of story segmentation through semantic similarity and compared it with a benchmark algorithm. Experimental results show that the proposed method outperforms the benchmark algorithms in story segmentation accuracy. It is important to note that sub tag accuracy is the most important part of the proposed framework for sharing a specific news context with others. To extract more accurate sub tags, we created a stop word list of terms unrelated to the content of the news, such as the names of the anchor or reporter, and applied it to the framework. We analyzed the accuracy of the tags and sub tags that represent the context of the news. The analysis suggests that the proposed framework helps users share their opinions with context information on social media and social news.
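As a rough illustration of how a transcript might be turned into time-dependent sub tags, the sketch below starts a new segment whenever adjacent sentences share too few words. It is not the authors' method: Jaccard word overlap stands in for the semantic similarity measure the abstract mentions, and the stop word list, the threshold, and the function names are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop word list; the abstract also adds anchor and reporter names here.
STOP_WORDS = {"the", "a", "of", "in", "to", "and", "is", "on", "for", "this"}


def tokens(sentence):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP_WORDS]


def jaccard(a, b):
    """Word-overlap similarity, used here as a stand-in for semantic similarity."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def segment(transcript, threshold=0.1):
    """Turn (start_seconds, sentence) pairs into segments with a start time and sub tag."""
    segments = []
    for start, sentence in transcript:
        words = tokens(sentence)
        if segments and jaccard(segments[-1]["last"], words) >= threshold:
            # Similar enough to the previous sentence: extend the current segment.
            segments[-1]["words"].extend(words)
            segments[-1]["last"] = words
        else:
            # Topic shift: open a new segment at this sentence's start time.
            segments.append({"start": start, "words": list(words), "last": words})
    result = []
    for seg in segments:
        top = Counter(seg["words"]).most_common(1)
        result.append({"start": seg["start"], "sub_tag": top[0][0] if top else ""})
    return result


if __name__ == "__main__":
    news = [
        (0, "The election results were announced today."),
        (12, "Election officials counted votes overnight."),
        (30, "Heavy rain flooded roads in the south."),
        (41, "The rain will continue tomorrow."),
    ]
    for item in segment(news):
        print(item)
```

Each returned entry pairs a segment's starting time with its most frequent content word, which is the kind of information a sub tag would need in order to let users share or comment on one segment rather than the whole item.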

Emoticon by Emotions: The Development of an Emoticon Recommendation System Based on Consumer Emotions (Emoticon by Emotions: 소비자 감성 기반 이모티콘 추천 시스템 개발)

  • Kim, Keon-Woo;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.227-252 / 2018
  • The evolution of instant communication has mirrored the development of the Internet, and messenger applications are among the most representative manifestations of instant communication technologies. In messenger applications, senders use emoticons to supplement the emotions conveyed in the text of their messages. The fact that communication via messenger applications is not face-to-face makes it difficult for senders to communicate their emotions to message recipients. Emoticons have long been used as symbols that indicate the moods of speakers. However, at present, emoticon use is evolving into a means of conveying the psychological states of consumers who want to express individual characteristics and personality quirks while communicating their emotions to others. The fact that companies like KakaoTalk, Line, and Apple have begun conducting emoticon business, and that sales of related content are expected to gradually increase, testifies to the significance of this phenomenon. Nevertheless, despite the development of emoticons themselves and the growth of the emoticon market, no suitable emoticon recommendation system has yet been developed. Even KakaoTalk, a messenger application that commands more than 90% of domestic market share in South Korea, merely groups emoticons into popularity, most-recent, or brief categories. This means consumers face the inconvenience of constantly scrolling around to locate the emoticons they want. The creation of an emoticon recommendation system would improve consumer convenience and satisfaction and increase the sales revenue of companies that sell emoticons. To recommend appropriate emoticons, it is necessary to quantify the emoticons that consumers see and the emotions involved. Such quantification will enable us to analyze the characteristics and emotions felt by consumers who used similar emoticons, which, in turn, will facilitate our emoticon recommendations for consumers. One way to quantify emoticon use is metadata-ization. 
Metadata-ization is a means of structuring or organizing unstructured and semi-structured data to extract meaning. By structuring unstructured emoticon data through metadata-ization, we can easily classify emoticons based on the emotions consumers want to express. To determine emoticons' precise emotions, we had to consider sub-detail expressions: not only the seven common emotional adjectives but also the metaphorical expressions that appear only in South Korean, as identified by previous emotion studies focusing on emoticons' characteristics. We therefore collected the sub-detail expressions of emotion based on "Shape", "Color" and "Adumbration". Moreover, to design a highly accurate recommendation system, we considered both emoticon-technical indexes and emoticon-emotional indexes. We then identified 14 features of emoticon-technical indexes and selected 36 emotional adjectives. The 36 emotional adjectives consisted of contrasting adjectives, which we reduced to 18, and we measured the 18 emotional adjectives using 40 emoticon sets randomly selected from the top-ranked emoticons in the KakaoTalk shop. We surveyed 277 consumers in their mid-twenties who had experience purchasing emoticons; we recruited them online and asked them to evaluate five different emoticon sets. After data acquisition, we conducted a factor analysis of emoticon-emotional factors. We extracted four factors that we named "Comic", "Softness", "Modernity" and "Transparency". We analyzed both the relationship between the indexes and consumer attitude and the relationship between emoticon-technical indexes and emoticon-emotional factors. Through this process, we confirmed that the emoticon-technical indexes did not directly affect consumer attitudes but had a mediating effect on consumer attitudes through emoticon-emotional factors. 
The results of the analysis revealed the mechanism consumers use to evaluate emoticons; the results also showed that consumers' emoticon-technical indexes affected emoticon-emotional factors and that the emoticon-emotional factors affected consumer satisfaction. We therefore designed the emoticon recommendation system using only four emoticon-emotional factors; we created a recommendation method to calculate the Euclidean distance from each factors' emotion. In an attempt to increase the accuracy of the emoticon recommendation system, we compared the emotional patterns of selected emoticons with the recommended emoticons. The emotional patterns corresponded in principle. We verified the emoticon recommendation system by testing prediction accuracy; the predictions were 81.02% accurate in the first result, 76.64% accurate in the second, and 81.63% accurate in the third. This study developed a methodology that can be used in various fields academically and practically. We expect that the novel emoticon recommendation system we designed will increase emoticon sales for companies who conduct business in this domain and make consumer experiences more convenient. In addition, this study served as an important first step in the development of an intelligent emoticon recommendation system. The emotional factors proposed in this study could be collected in an emotional library that could serve as an emotion index for evaluation when new emoticons are released. Moreover, by combining the accumulated emotional library with company sales data, sales information, and consumer data, companies could develop hybrid recommendation systems that would bolster convenience for consumers and serve as intellectual assets that companies could strategically deploy.
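The distance-based recommendation described in this abstract can be illustrated with a short sketch. It assumes each emoticon set and the consumer's target emotion are scored on the four named factors; the scores, the set names, and the `recommend` function are hypothetical and do not reproduce the authors' data or implementation.

```python
import math

FACTORS = ("comic", "softness", "modernity", "transparency")


def euclidean(a, b):
    """Euclidean distance between two factor-score dictionaries."""
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in FACTORS))


def recommend(target, catalog, k=3):
    """Return the names of the k emoticon sets closest to the target emotion profile."""
    ranked = sorted(catalog.items(), key=lambda item: euclidean(target, item[1]))
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    # Made-up factor scores for three emoticon sets.
    catalog = {
        "set_a": {"comic": 4.5, "softness": 2.0, "modernity": 3.0, "transparency": 1.5},
        "set_b": {"comic": 1.0, "softness": 4.5, "modernity": 2.0, "transparency": 4.0},
        "set_c": {"comic": 3.5, "softness": 3.0, "modernity": 3.5, "transparency": 2.5},
    }
    target = {"comic": 4.2, "softness": 2.2, "modernity": 3.2, "transparency": 1.8}
    print(recommend(target, catalog, k=2))
```

Because only four factor scores per emoticon set are compared, a newly released set could be ranked as soon as it has been scored on those factors.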

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia pacific journal of information systems
    • /
    • v.21 no.1
    • /
    • pp.103-122
    • /
    • 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful. For this reason, some online documents are accompanied by a list of keywords specified by the authors to guide users and facilitate the filtering process. In this way, a set of keywords is often considered a condensed version of the whole document and therefore plays an important role in document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could also benefit from the use of keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, the implementation itself is the obstacle: manually assigning keywords to all documents is a daunting, even impractical, task because it is extremely tedious and time-consuming and requires a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are two main approaches to achieving this aim: the keyword assignment approach and the keyword extraction approach. Both approaches use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given vocabulary, and the aim is to match its terms to the texts. In other words, the keyword assignment approach seeks to select the words from a controlled vocabulary that best describe a document.
Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. In the latter approach, on the other hand, the aim is to extract keywords according to their relevance in the text, without a prior vocabulary. In this approach, automatic keyword generation is treated as a classification task, and keywords are commonly extracted using supervised learning techniques. Thus, keyword extraction algorithms classify candidate keywords in a document into positive or negative examples. Several systems, such as Extractor and Kea, were developed using the keyword extraction approach. The most indicative words in a document are selected as keywords for that document; as a result, keyword extraction is limited to terms that appear in the document. Therefore, keyword extraction cannot generate implicit keywords that are not included in a document. According to the experimental results of Turney, about 64% to 90% of the keywords assigned by authors can be found in the full text of an article. Conversely, this means that 10% to 36% of the keywords assigned by authors do not appear in the article and so cannot be generated by keyword extraction algorithms. Our preliminary experiment also shows that 37% of the keywords assigned by authors are not included in the full text. This is why we decided to adopt the keyword assignment approach. In this paper, we propose a new approach to automatic keyword assignment, namely IVSM (Inverse Vector Space Model). The model is based on the vector space model, a conventional information retrieval model that represents documents and queries as vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets.
The keyword assignment process of IVSM is as follows: (1) calculating the vector length of each keyword set based on each keyword weight; (2) preprocessing and parsing a target document that does not have keywords; (3) calculating the vector length of the target document based on the term frequency; (4) measuring the cosine similarity between each keyword set and the target document; and (5) generating keywords that have high similarity scores. Two keyword generation systems were implemented by applying IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The first IVSM system is implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers and has been tested on a number of academic papers, including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. In our experiment, the precision of IVSM applied to the Web-based community service and to academic journals was 0.75 and 0.71, respectively. The performance of both systems is much better than that of baseline systems that generate keywords based on simple probability. IVSM also shows performance comparable to that of Extractor, a representative system of the keyword extraction approach developed by Turney. As the number of electronic documents increases, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
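The five steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `KEYWORD_PROFILES` table (each candidate keyword mapped to weighted terms), the whitespace tokenizer, and the function names are hypothetical, and the abstract does not specify how the authors built or weighted their keyword-set vectors.

```python
import math
from collections import Counter

# Hypothetical keyword profiles: each candidate keyword maps to the weighted
# terms of documents that carried it. All values are invented for illustration.
KEYWORD_PROFILES = {
    "logistics": {"port": 2.0, "shipping": 3.0, "cargo": 1.0},
    "fashion": {"style": 2.0, "clothing": 3.0, "trend": 1.0},
    "information retrieval": {"query": 2.0, "document": 3.0, "search": 2.0},
}


def vector_length(weights):
    """Steps 1 and 3: Euclidean length of a sparse term-weight vector."""
    return math.sqrt(sum(w * w for w in weights.values()))


def cosine(doc_tf, profile):
    """Step 4: cosine similarity between a document and a keyword profile."""
    dot = sum(doc_tf.get(term, 0) * w for term, w in profile.items())
    denom = vector_length(doc_tf) * vector_length(profile)
    return dot / denom if denom else 0.0


def assign_keywords(text, profiles, top_n=1):
    """Steps 2 and 5: parse the document, then keep the best-scoring keywords."""
    doc_tf = Counter(text.lower().split())  # naive tokenizer (assumption)
    scores = {kw: cosine(doc_tf, prof) for kw, prof in profiles.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [kw for kw in ranked[:top_n] if scores[kw] > 0]


doc = "shipping cargo arrives at the port by shipping lanes"
print(assign_keywords(doc, KEYWORD_PROFILES))  # ['logistics']
```

The word "logistics" never occurs in the sample text, which illustrates how an assignment approach can produce implicit keywords that pure extraction cannot.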

An Analytical Approach Using Topic Mining for Improving the Service Quality of Hotels (호텔 산업의 서비스 품질 향상을 위한 토픽 마이닝 기반 분석 방법)

  • Moon, Hyun Sil;Sung, David;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.21-41
    • /
    • 2019
  • Thanks to the rapid development of information technologies, the data available on the Internet have grown rapidly. In this era of big data, many studies have attempted to offer insights and demonstrate the effects of data analysis. In the tourism and hospitality industry, many firms and studies have paid attention to online reviews on social media because of their large influence on customers. As tourism is an information-intensive industry, the effect of these information networks on social media platforms is more remarkable than that of any other type of media. However, there are some limitations to the improvements in service quality that can be made based on opinions on social media platforms. Users on social media platforms express their opinions as text, images, and so on, so the raw data sets from these reviews are unstructured. Moreover, these data sets are too big for people to extract new information and hidden knowledge from them unaided. To use them for business intelligence and analytics applications, proper big data techniques such as Natural Language Processing and data mining are needed. This study suggests an analytical approach that directly yields insights from these reviews to improve the service quality of hotels. Our proposed approach consists of topic mining, to extract the topics contained in the reviews, and decision tree modeling, to explain the relationship between topics and ratings. Topic mining refers to a method for finding, from a collection of documents, a group of words that represents a document. Among several topic mining methods, we adopted the Latent Dirichlet Allocation (LDA) algorithm, which is considered the most universal algorithm. However, LDA alone is not enough to find insights that can improve service quality because it cannot find the relationship between topics and ratings. To overcome this limitation, we also use the Classification and Regression Tree (CART) method, which is a kind of decision tree technique.
Through the CART method, we can find which topics are related to positive or negative ratings of a hotel and visualize the results. This study therefore aims to present an analytical approach for improving hotel service quality from unstructured review data sets. Through experiments on four hotels in Hong Kong, we identify the strengths and weaknesses of each hotel's services and suggest improvements to aid customer satisfaction. From positive reviews in particular, we find what these hotels should maintain in their service quality. For example, the positive reviews of one hotel highlight its good location and room condition compared with the other hotels. In contrast, from negative reviews we find what the hotels should change in their services. For example, one hotel should improve the soundproofing of its rooms. These results indicate that our approach is useful for finding insights into the service quality of hotels. That is, from an enormous volume of review data, our approach can provide practical suggestions for hotel managers to improve their service quality. In the past, studies on improving service quality relied on surveys or interviews of customers. However, these methods are often costly and time-consuming, and the results may be skewed by biased sampling or untrustworthy answers. The proposed approach directly obtains honest feedback from customers' online reviews and draws insights through a type of big data analysis, so it can be a more useful tool that overcomes the limitations of surveys and interviews. Moreover, our approach can easily obtain service quality information for other hotels or services in the tourism industry because it needs only open online reviews and ratings as input data. Furthermore, the performance of our approach should improve if other structured and unstructured data sources are added.
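The core of the CART step, choosing the topic and threshold that best separate positive from negative ratings, can be sketched as follows. This is a minimal illustration under assumptions: the topic names, the per-review topic weights (standing in for LDA output), and the binary rating labels are all invented, and only a single Gini-based split is searched rather than a full tree.

```python
# Hypothetical per-review topic weights, as a topic model such as LDA might
# produce, in the order given by TOPICS. Labels: 1 = positive, 0 = negative.
TOPICS = ["location", "noise"]
REVIEWS = [
    (0.8, 0.1), (0.7, 0.5), (0.6, 0.3),   # positive reviews
    (0.2, 0.7), (0.3, 0.4), (0.1, 0.9),   # negative reviews
]
RATINGS = [1, 1, 1, 0, 0, 0]


def gini(labels):
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Search every topic and observed threshold for the lowest weighted Gini."""
    n = len(rows)
    best = None  # (weighted impurity, topic index, threshold)
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            if not left or not right:
                continue  # a split must put reviews on both sides
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or impurity < best[0]:
                best = (impurity, j, threshold)
    return best


impurity, topic_index, threshold = best_split(REVIEWS, RATINGS)
print(TOPICS[topic_index], threshold, impurity)  # location 0.3 0.0
```

In this toy data, reviews whose "location" weight exceeds 0.3 are all positive, so that topic would sit at the root of the tree; a full CART model repeats the same search recursively on each branch.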

Differences of news aspect about Asia and West in Korean newspapers and its reason: Focusing on news topic, amount of news, news tone and media sources (한국신문의 아시아와 서구에 대한 보도양상의 차이와 이유 연구: 뉴스주제, 보도량, 보도태도, 미디어 정보원을 중심으로)

  • Oh, Day-Young
    • Korean journal of communication and information
    • /
    • v.61
    • /
    • pp.74-97
    • /
    • 2013
  • Asia is developing rapidly in the 21st century. Human and material exchanges between Korea and Asian countries have greatly increased, and Korea has become a multicultural society. It has become important for Korean people to understand Asia more accurately, and Korean media can play a key role in this. From this standpoint, I analyzed 1,786 news items reported in 2011 by four Korean newspapers (Chosun Ilbo, Dong-A Ilbo, Hankyoreh newspaper, Kyunghyang Daily News) to examine the differences between news about Asia and news about the West and the reasons for them, focusing on news topic, amount of news, news tone, and foreign media sources. In terms of amount, the share of news about the West (54.3%) was higher than that of news about Asia (45.7%). In terms of tone, negative items were the most common in Asia news but the least common in West news. Korean newspapers showed a more positive attitude toward the West than toward Asia. The 1,786 news items were classified into seven topics (morality and justice, politics, economics and science, society, diplomacy and national defense, human interest, people). Across the seven topics, Korean newspapers reported more hard news, such as morality and justice, than soft news, such as human interest, about Asia. About the West, however, they reported a great deal of soft news in addition to hard news. With respect to topic and tone, hard news was most often negative, while soft news was most often neutral or positive. As a result, Korean news showed a negative attitude toward Asia and a positive attitude toward the West. Among the five main sources (media, government, private organization, individual, and material), only the media source affected the difference in news attitude toward Asia and the West. Asian media sources took a more positive attitude toward Asia than toward the West. Western media were most often negative toward Asia and most often neutral toward the West. Korean newspapers used Western media as their main sources for news about all areas except East Asia. As a result, Korean newspapers showed a West-centered attitude and reported more negative news than neutral or positive news about Asia.
The study suggests that Korean newspapers should increase their coverage of Asia across diverse areas through direct reporting by correspondents and greater use of Asian media via the Internet.
