• Title/Summary/Keyword: Research Information Systems

Search Result 12,210, Processing Time 0.041 seconds

A Comparison Study of Cost Components to Estimate the Economic Loss from Foodborne Disease in Foreign Countries (국외 식중독으로 인한 손실비용 추정을 위한 항목 비교 연구)

  • Hyun, Jeong-Eun;Jin, Hyun Joung;Kim, Yesol;Ju, Hyo Jung;Kang, Woo In;Lee, Sun-Young
    • Journal of Food Hygiene and Safety
    • /
    • v.36 no.1
    • /
    • pp.68-76
    • /
    • 2021
  • Foodborne outbreaks frequently occur worldwide and result in huge economic losses. It is the therefore important to estimate the costs associated with foodborne diseases to minimize the economic damage. At the same time, it is difficult to accurately estimate the economic loss from foodborne disease due to a wide variety of cost components. In Korea, there are a limited number of analytical studies attempting to estimate such costs. In this study we investigated the components of economic cost used in foreign countries to better estimate the cost of foodborne disease in Korea. Seven recent studies investigated the cost components used to estimate the cost of foodborne disease in humans. This study categorized the economic loss into four types of cost: direct costs, indirect costs, food business costs, and government administration costs. The healthcare costs most often included were medical (outpatient) and hospital costs (inpatient). However, these cost components should be selected according to the systems and budgets of medical services by country. For non-healthcare costs, several other studies considered transportation costs to the hospital as an exception to the cost of inpatient care. So, further discussion is needed on whether to consider inpatient care costs. Among the indirect costs, premature mortality, lost productivity, lost leisure time, and lost quality of life/pain, grief and suffering costs were considered, but the opportunity costs for hospital visits were not considered in any of the above studies. As with healthcare costs, government administration costs should also be considered appropriate cost components due to the difference in government budget systems, for example. Our findings will provide fundamental information for economic analysis associated with foodborne diseases to improve food safety policy in Korea.

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.109-122
    • /
    • 2014
  • People are nowadays creating a tremendous amount of data on Social Network Service (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive amounts of data generation, thereby greatly influencing society. This is an unmatched phenomenon in history, and now we live in the Age of Big Data. SNS Data is defined as a condition of Big Data where the amount of data (volume), data input and output speeds (velocity), and the variety of data types (variety) are satisfied. If someone intends to discover the trend of an issue in SNS Big Data, this information can be used as a new important source for the creation of new values because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and established to meet the needs of analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) Provide the topic keyword set that corresponds to daily ranking; (2) Visualize the daily time series graph of a topic for the duration of a month; (3) Provide the importance of a topic through a treemap based on the score system and frequency; (4) Visualize the daily time-series graph of keywords by searching the keyword; The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including the removal of stop words, and noun extraction for processing various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to process rapidly a large amount of real-time data, such as the Hadoop distributed system or NoSQL, which is an alternative to relational database. We built TITS based on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from single node computing to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. In addition, MongoDB is an open source platform, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational database, there are no schema or tables with MongoDB, and its most important goal is that of data accessibility and data processing performance. In the Age of Big Data, the visualization of Big Data is more attractive to the Big Data community because it helps analysts to examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for the purpose of creating Data Driven Documents that bind document object model (DOM) and any data; the interaction between data is easy and useful for managing real-time data stream with smooth animation. In addition, TITS uses a bootstrap made of pre-configured plug-in style sheets and JavaScript libraries to build a web system. The TITS Graphical User Interface (GUI) is designed using these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system, and make the system available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.

A Study on Developing a VKOSPI Forecasting Model via GARCH Class Models for Intelligent Volatility Trading Systems (지능형 변동성트레이딩시스템개발을 위한 GARCH 모형을 통한 VKOSPI 예측모형 개발에 관한 연구)

  • Kim, Sun-Woong
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.2
    • /
    • pp.19-32
    • /
    • 2010
  • Volatility plays a central role in both academic and practical applications, especially in pricing financial derivative products and trading volatility strategies. This study presents a novel mechanism based on generalized autoregressive conditional heteroskedasticity (GARCH) models that is able to enhance the performance of intelligent volatility trading systems by predicting Korean stock market volatility more accurately. In particular, we embedded the concept of the volatility asymmetry documented widely in the literature into our model. The newly developed Korean stock market volatility index of KOSPI 200, VKOSPI, is used as a volatility proxy. It is the price of a linear portfolio of the KOSPI 200 index options and measures the effect of the expectations of dealers and option traders on stock market volatility for 30 calendar days. The KOSPI 200 index options market started in 1997 and has become the most actively traded market in the world. Its trading volume is more than 10 million contracts a day and records the highest of all the stock index option markets. Therefore, analyzing the VKOSPI has great importance in understanding volatility inherent in option prices and can afford some trading ideas for futures and option dealers. Use of the VKOSPI as volatility proxy avoids statistical estimation problems associated with other measures of volatility since the VKOSPI is model-free expected volatility of market participants calculated directly from the transacted option prices. This study estimates the symmetric and asymmetric GARCH models for the KOSPI 200 index from January 2003 to December 2006 by the maximum likelihood procedure. Asymmetric GARCH models include GJR-GARCH model of Glosten, Jagannathan and Runke, exponential GARCH model of Nelson and power autoregressive conditional heteroskedasticity (ARCH) of Ding, Granger and Engle. Symmetric GARCH model indicates basic GARCH (1, 1). Tomorrow's forecasted value and change direction of stock market volatility are obtained by recursive GARCH specifications from January 2007 to December 2009 and are compared with the VKOSPI. Empirical results indicate that negative unanticipated returns increase volatility more than positive return shocks of equal magnitude decrease volatility, indicating the existence of volatility asymmetry in the Korean stock market. The point value and change direction of tomorrow VKOSPI are estimated and forecasted by GARCH models. Volatility trading system is developed using the forecasted change direction of the VKOSPI, that is, if tomorrow VKOSPI is expected to rise, a long straddle or strangle position is established. A short straddle or strangle position is taken if VKOSPI is expected to fall tomorrow. Total profit is calculated as the cumulative sum of the VKOSPI percentage change. If forecasted direction is correct, the absolute value of the VKOSPI percentage changes is added to trading profit. It is subtracted from the trading profit if forecasted direction is not correct. For the in-sample period, the power ARCH model best fits in a statistical metric, Mean Squared Prediction Error (MSPE), and the exponential GARCH model shows the highest Mean Correct Prediction (MCP). The power ARCH model best fits also for the out-of-sample period and provides the highest probability for the VKOSPI change direction tomorrow. Generally, the power ARCH model shows the best fit for the VKOSPI. All the GARCH models provide trading profits for volatility trading system and the exponential GARCH model shows the best performance, annual profit of 197.56%, during the in-sample period. The GARCH models present trading profits during the out-of-sample period except for the exponential GARCH model. During the out-of-sample period, the power ARCH model shows the largest annual trading profit of 38%. The volatility clustering and asymmetry found in this research are the reflection of volatility non-linearity. This further suggests that combining the asymmetric GARCH models and artificial neural networks can significantly enhance the performance of the suggested volatility trading system, since artificial neural networks have been shown to effectively model nonlinear relationships.

Marketing Standardization and Firm Performance in International E.Commerce (국제전자상무중적영소표준화화공사표현(国际电子商务中的营销标准化和公司表现))

  • Fritz, Wolfgang;Dees, Heiko
    • Journal of Global Scholars of Marketing Science
    • /
    • v.19 no.3
    • /
    • pp.37-48
    • /
    • 2009
  • The standardization of marketing has been one of the most focused research topics in international marketing. The term "global marketing" was often used to mean an internationally standardized marketing strategy based on similarities between foreign markets. Marketing standardization was discussed only within the context of traditional physical marketplaces. Since then, the digital "marketspace" of the Internet had emerged in the 90's, and it became one of the most important drivers of the globalization process opening new opportunities for the standardization of global marketing activities. On the other hand, the opinion that a greater adoption of the Internet by customers may lead to a higher degree of customization and differentiation of products rather than standardization is also quite popular. Considering this disagreement, it is notable that comprehensive studies which focus upon the marketing standardization especially in the context of global e-commerce are missing to a high degree. On this background, the two basic research questions being addressed in this study are: (1) To what extent do companies standardize their marketing in international e-commerce? (2) Is there an impact of marketing standardization on the performance (or success) of these companies? Following research hypotheses were generated based upon literature review: H 1: Internationally engaged e-commerce firms show a growing readiness for marketing standardization. H 2: Marketing standardization exerts positive effects on the success of companies in international e-commerce. H 3: In international e-commerce, marketing mix standardization exerts a stronger positive effect on the economic as well as the non-economic success of companies than marketing process standardization. H 4: The higher the non-economic success in international e-commerce firms, the higher the economic success. The data for this research were obtained from a questionnaire survey conducted from February to April 2005. The international e-commerce companies of various industries in Germany and all subsidiaries or headquarters of foreign e-commerce companies based in Germany were included in the survey. 118 out of 801 companies responded to the questionnaire. For structural equation modelling (SEM), the Partial-Least. Squares (PLS) approach in the version PLS-Graph 3.0 was applied (Chin 1998a; 2001). All of four research hypotheses were supported by result of data analysis. The results show that companies engaged in international e-commerce standardize in particular brand name, web page design, product positioning, and the product program to a high degree. The companies intend to intensify their efforts for marketing mix standardization in the future. In addition they want to standardize their marketing processes also to a higher degree, especially within the range of information systems, corporate language and online marketing control procedures. In this study, marketing standardization exerts a positive overall impact on company performance in international e-commerce. Standardization of marketing mix exerts a stronger positive impact on the non-economic success than standardization of marketing processes, which in turn contributes slightly stronger to the economic success. Furthermore, our findings give clear support to the assumption that the non-economic success is highly relevant to the economic success of the firm in international e-commerce. The empirical findings indicate that marketing standardization is relevant to the companies' success in international e-commerce. But marketing mix and marketing process standardization contribute to the firms' economic and non-economic success in different ways. The findings indicate that companies do standardize numerous elements of their marketing mix on the Internet. This practice is in part contrary to the popular concept of a "differentiated standardization" which argues that some elements of the marketing mix should be adapted locally and others should be standardized internationally. Furthermore, the findings suggest that the overall standardization of marketing -rather than the standardization of one particular marketing mix element - is what brings about a positive overall impact on success.

  • PDF

Analyzing Contextual Polarity of Unstructured Data for Measuring Subjective Well-Being (주관적 웰빙 상태 측정을 위한 비정형 데이터의 상황기반 긍부정성 분석 방법)

  • Choi, Sukjae;Song, Yeongeun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.83-105
    • /
    • 2016
  • Measuring an individual's subjective wellbeing in an accurate, unobtrusive, and cost-effective manner is a core success factor of the wellbeing support system, which is a type of medical IT service. However, measurements with a self-report questionnaire and wearable sensors are cost-intensive and obtrusive when the wellbeing support system should be running in real-time, despite being very accurate. Recently, reasoning the state of subjective wellbeing with conventional sentiment analysis and unstructured data has been proposed as an alternative to resolve the drawbacks of the self-report questionnaire and wearable sensors. However, this approach does not consider contextual polarity, which results in lower measurement accuracy. Moreover, there is no sentimental word net or ontology for the subjective wellbeing area. Hence, this paper proposes a method to extract keywords and their contextual polarity representing the subjective wellbeing state from the unstructured text in online websites in order to improve the reasoning accuracy of the sentiment analysis. The proposed method is as follows. First, a set of general sentimental words is proposed. SentiWordNet was adopted; this is the most widely used dictionary and contains about 100,000 words such as nouns, verbs, adjectives, and adverbs with polarities from -1.0 (extremely negative) to 1.0 (extremely positive). Second, corpora on subjective wellbeing (SWB corpora) were obtained by crawling online text. A survey was conducted to prepare a learning dataset that includes an individual's opinion and the level of self-report wellness, such as stress and depression. The participants were asked to respond with their feelings about online news on two topics. Next, three data sources were extracted from the SWB corpora: demographic information, psychographic information, and the structural characteristics of the text (e.g., the number of words used in the text, simple statistics on the special characters used). These were considered to adjust the level of a specific SWB. Finally, a set of reasoning rules was generated for each wellbeing factor to estimate the SWB of an individual based on the text written by the individual. The experimental results suggested that using contextual polarity for each SWB factor (e.g., stress, depression) significantly improved the estimation accuracy compared to conventional sentiment analysis methods incorporating SentiWordNet. Even though literature is available on Korean sentiment analysis, such studies only used only a limited set of sentimental words. Due to the small number of words, many sentences are overlooked and ignored when estimating the level of sentiment. However, the proposed method can identify multiple sentiment-neutral words as sentiment words in the context of a specific SWB factor. The results also suggest that a specific type of senti-word dictionary containing contextual polarity needs to be constructed along with a dictionary based on common sense such as SenticNet. These efforts will enrich and enlarge the application area of sentic computing. The study is helpful to practitioners and managers of wellness services in that a couple of characteristics of unstructured text have been identified for improving SWB. Consistent with the literature, the results showed that the gender and age affect the SWB state when the individual is exposed to an identical queue from the online text. In addition, the length of the textual response and usage pattern of special characters were found to indicate the individual's SWB. These imply that better SWB measurement should involve collecting the textual structure and the individual's demographic conditions. In the future, the proposed method should be improved by automated identification of the contextual polarity in order to enlarge the vocabulary in a cost-effective manner.

A Study on Analyzing Sentiments on Movie Reviews by Multi-Level Sentiment Classifier (영화 리뷰 감성분석을 위한 텍스트 마이닝 기반 감성 분류기 구축)

  • Kim, Yuyoung;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.3
    • /
    • pp.71-89
    • /
    • 2016
  • Sentiment analysis is used for identifying emotions or sentiments embedded in the user generated data such as customer reviews from blogs, social network services, and so on. Various research fields such as computer science and business management can take advantage of this feature to analyze customer-generated opinions. In previous studies, the star rating of a review is regarded as the same as sentiment embedded in the text. However, it does not always correspond to the sentiment polarity. Due to this supposition, previous studies have some limitations in their accuracy. To solve this issue, the present study uses a supervised sentiment classification model to measure a more accurate sentiment polarity. This study aims to propose an advanced sentiment classifier and to discover the correlation between movie reviews and box-office success. The advanced sentiment classifier is based on two supervised machine learning techniques, the Support Vector Machines (SVM) and Feedforward Neural Network (FNN). The sentiment scores of the movie reviews are measured by the sentiment classifier and are analyzed by statistical correlations between movie reviews and box-office success. Movie reviews are collected along with a star-rate. The dataset used in this study consists of 1,258,538 reviews from 175 films gathered from Naver Movie website (movie.naver.com). The results show that the proposed sentiment classifier outperforms Naive Bayes (NB) classifier as its accuracy is about 6% higher than NB. Furthermore, the results indicate that there are positive correlations between the star-rate and the number of audiences, which can be regarded as the box-office success of a movie. The study also shows that there is the mild, positive correlation between the sentiment scores estimated by the classifier and the number of audiences. To verify the applicability of the sentiment scores, an independent sample t-test was conducted. For this, the movies were divided into two groups using the average of sentiment scores. The two groups are significantly different in terms of the star-rated scores.

Measuring the Economic Impact of Item Descriptions on Sales Performance (온라인 상품 판매 성과에 영향을 미치는 상품 소개글 효과 측정 기법)

  • Lee, Dongwon;Park, Sung-Hyuk;Moon, Songchun
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.4
    • /
    • pp.1-17
    • /
    • 2012
  • Personalized smart devices such as smartphones and smart pads are widely used. Unlike traditional feature phones, theses smart devices allow users to choose a variety of functions, which support not only daily experiences but also business operations. Actually, there exist a huge number of applications accessible by smart device users in online and mobile application markets. Users can choose apps that fit their own tastes and needs, which is impossible for conventional phone users. With the increase in app demand, the tastes and needs of app users are becoming more diverse. To meet these requirements, numerous apps with diverse functions are being released on the market, which leads to fierce competition. Unlike offline markets, online markets have a limitation in that purchasing decisions should be made without experiencing the items. Therefore, online customers rely more on item-related information that can be seen on the item page in which online markets commonly provide details about each item. Customers can feel confident about the quality of an item through the online information and decide whether to purchase it. The same is true of online app markets. To win the sales competition against other apps that perform similar functions, app developers need to focus on writing app descriptions to attract the attention of customers. If we can measure the effect of app descriptions on sales without regard to the app's price and quality, app descriptions that facilitate the sale of apps can be identified. This study intends to provide such a quantitative result for app developers who want to promote the sales of their apps. For this purpose, we collected app details including the descriptions written in Korean from one of the largest app markets in Korea, and then extracted keywords from the descriptions. Next, the impact of the keywords on sales performance was measured through our econometric model. Through this analysis, we were able to analyze the impact of each keyword itself, apart from that of the design or quality. The keywords, comprised of the attribute and evaluation of each app, are extracted by a morpheme analyzer. Our model with the keywords as its input variables was established to analyze their impact on sales performance. A regression analysis was conducted for each category in which apps are included. This analysis was required because we found the keywords, which are emphasized in app descriptions, different category-by-category. The analysis conducted not only for free apps but also for paid apps showed which keywords have more impact on sales performance for each type of app. In the analysis of paid apps in the education category, keywords such as 'search+easy' and 'words+abundant' showed higher effectiveness. In the same category, free apps whose keywords emphasize the quality of apps showed higher sales performance. One interesting fact is that keywords describing not only the app but also the need for the app have asignificant impact. Language learning apps, regardless of whether they are sold free or paid, showed higher sales performance by including the keywords 'foreign language study+important'. This result shows that motivation for the purchase affected sales. While item reviews are widely researched in online markets, item descriptions are not very actively studied. In the case of the mobile app markets, newly introduced apps may not have many item reviews because of the low quantity sold. In such cases, item descriptions can be regarded more important when customers make a decision about purchasing items. This study is the first trial to quantitatively analyze the relationship between an item description and its impact on sales performance. The results show that our research framework successfully provides a list of the most effective sales key terms with the estimates of their effectiveness. Although this study is performed for a specified type of item (i.e., mobile apps), our model can be applied to almost all of the items traded in online markets.

Design Evaluation Model Based on Consumer Values: Three-step Approach from Product Attributes, Perceived Attributes, to Consumer Values (소비자 가치기반 디자인 평가 모형: 제품 속성, 인지 속성, 소비자 가치의 3단계 접근)

  • Kim, Keon-Woo;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.57-76
    • /
    • 2017
  • Recently, consumer needs are diversifying as information technologies are evolving rapidly. A lot of IT devices such as smart phones and tablet PCs are launching following the trend of information technology. While IT devices focused on the technical advance and improvement a few years ago, the situation is changed now. There is no difference in functional aspects, so companies are trying to differentiate IT devices in terms of appearance design. Consumers also consider design as being a more important factor in the decision-making of smart phones. Smart phones have become a fashion items, revealing consumers' own characteristics and personality. As the design and appearance of the smartphone become important things, it is necessary to examine consumer values from the design and appearance of IT devices. Furthermore, it is crucial to clarify the mechanisms of consumers' design evaluation and develop the design evaluation model based on the mechanism. Since the influence of design gets continuously strong, various and many studies related to design were carried out. These studies can classify three main streams. The first stream focuses on the role of design from the perspective of marketing and communication. The second one is the studies to find out an effective and appealing design from the perspective of industrial design. The last one is to examine the consumer values created by a product design, which means consumers' perception or feeling when they look and feel it. These numerous studies somewhat have dealt with consumer values, but they do not include product attributes, or do not cover the whole process and mechanism from product attributes to consumer values. In this study, we try to develop the holistic design evaluation model based on consumer values based on three-step approach from product attributes, perceived attributes, to consumer values. Product attributes means the real and physical characteristics each smart phone has. They consist of bezel, length, width, thickness, weight and curvature. Perceived attributes are derived from consumers' perception on product attributes. We consider perceived size of device, perceived size of display, perceived thickness, perceived weight, perceived bezel (top - bottom / left - right side), perceived curvature of edge, perceived curvature of back side, gap of each part, perceived gloss and perceived screen ratio. They are factorized into six clusters named as 'Size,' 'Slimness,' 'No-Frame,' 'Roundness,' 'Screen Ratio,' and 'Looseness.' We conducted qualitative research to find out consumer values, which are categorized into two: look and feel values. We identified the values named as 'Silhouette,' 'Neatness,' 'Attractiveness,' 'Polishing,' 'Innovativeness,' 'Professionalism,' 'Intellectualness,' 'Individuality,' and 'Distinctiveness' in terms of look values. Also, we identifies 'Stability,' 'Comfortableness,' 'Grip,' 'Solidity,' 'Non-fragility,' and 'Smoothness' in terms of feel values. They are factorized into five key values: 'Sleek Value,' 'Professional Value,' 'Unique Value,' 'Comfortable Value,' and 'Solid Value.' Finally, we developed the holistic design evaluation model by analyzing each relationship from product attributes, perceived attributes, to consumer values. This study has several theoretical and practical contributions. First, we found consumer values in terms of design evaluation and implicit chain relationship from the objective and physical characteristics to the subjective and mental evaluation. That is, the model explains the mechanism of design evaluation in consumer minds. Second, we suggest a general design evaluation process from product attributes, perceived attributes to consumer values. It is an adaptable methodology not only smart phone but also other IT products. Practically, this model can support the decision-making when companies initiative new product development. It can help product designers focus on their capacities with limited resources. Moreover, if its model combined with machine learning collecting consumers' purchasing data, most preferred values, sales data, etc., it will be able to evolve intelligent design decision support system.

The Research on scheme for revitalization of Conversion into World Geodetic Reference System (세계측지계 전환 활성화를 위한 방안 연구)

  • Sohn, Duk-Jae;Lee, Hyun-Jik;Yu, Young-Geol
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.27 no.3
    • /
    • pp.311-321
    • /
    • 2009
  • The Nation Geodetic Reference System which presents a consistent location standard used in creating a map or developing national land is defined and managed by the law in a nation. Each nation had used its own geodetic system created by astronomical surveying until recently, when Geodetic Reference System(World Geodetic Reference System) has been developed and used to progress in space and satellite geodetic technologies. Korea also amended its geodetic law in December 2001, converting its national geodetic system whose reference an oval figure is Bessel ellipsoid into the World Geodetic Reference System which uses GRS80 ellipsoid as reference ellipsoid. Accordingly, the National Geography Information Institute improved law and systems related to the change for the effective conversion from its national geodetic system into the World Geodetic Reference System. In addition National geographic information institute of the results of various studies is drawn to the World Geodetic Reference System for switching technology-met some of the institutional foundation Despite of accordance with formalities, National geographic information institute, Ministry of Land, Transport and Maritime Affairs and some local government of the World Geodetic Reference System, and local government has or has not spread in public institutions. Therefore, in order to promote the switch to the World Geodetic Reference System, it is required to analyze current technical and institutional problems and obstacles of the switch to the World Geodetic Reference System and to present the resolutions and to establish policy to achieve them. Accordingly, for the promotion of the switch to the World Geodetic Reference System, this study analyzed the results of previous studies, the current state of the switch to the World Geodetic Reference System and the problems of the switch, and then offered technological and institutional supplements. Furthermore, it standardized the subject and type of the conversion, defined the scope of the tasks of the National Geographic Information Institute and its related organizations, and presented the policy direction for the overall use of the World Geodetic Reference System by 2010.

Financial Fraud Detection using Text Mining Analysis against Municipal Cybercriminality (지자체 사이버 공간 안전을 위한 금융사기 탐지 텍스트 마이닝 방법)

  • Choi, Sukjae;Lee, Jungwon;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.119-138
    • /
    • 2017
  • Recently, SNS has become an important channel for marketing as well as personal communication. However, cybercrime has also evolved with the development of information and communication technology, and illegal advertising is distributed to SNS in large quantity. As a result, personal information is lost and even monetary damages occur more frequently. In this study, we propose a method to analyze which sentences and documents, which have been sent to the SNS, are related to financial fraud. First of all, as a conceptual framework, we developed a matrix of conceptual characteristics of cybercriminality on SNS and emergency management. We also suggested emergency management process which consists of Pre-Cybercriminality (e.g. risk identification) and Post-Cybercriminality steps. Among those we focused on risk identification in this paper. The main process consists of data collection, preprocessing and analysis. First, we selected two words 'daechul(loan)' and 'sachae(private loan)' as seed words and collected data with this word from SNS such as twitter. The collected data are given to the two researchers to decide whether they are related to the cybercriminality, particularly financial fraud, or not. Then we selected some of them as keywords if the vocabularies are related to the nominals and symbols. With the selected keywords, we searched and collected data from web materials such as twitter, news, blog, and more than 820,000 articles collected. The collected articles were refined through preprocessing and made into learning data. The preprocessing process is divided into performing morphological analysis step, removing stop words step, and selecting valid part-of-speech step. In the morphological analysis step, a complex sentence is transformed into some morpheme units to enable mechanical analysis. In the removing stop words step, non-lexical elements such as numbers, punctuation marks, and double spaces are removed from the text. In the step of selecting valid part-of-speech, only two kinds of nouns and symbols are considered. Since nouns could refer to things, the intent of message is expressed better than the other part-of-speech. Moreover, the more illegal the text is, the more frequently symbols are used. The selected data is given 'legal' or 'illegal'. To make the selected data as learning data through the preprocessing process, it is necessary to classify whether each data is legitimate or not. The processed data is then converted into Corpus type and Document-Term Matrix. Finally, the two types of 'legal' and 'illegal' files were mixed and randomly divided into learning data set and test data set. In this study, we set the learning data as 70% and the test data as 30%. SVM was used as the discrimination algorithm. Since SVM requires gamma and cost values as the main parameters, we set gamma as 0.5 and cost as 10, based on the optimal value function. The cost is set higher than general cases. To show the feasibility of the idea proposed in this paper, we compared the proposed method with MLE (Maximum Likelihood Estimation), Term Frequency, and Collective Intelligence method. Overall accuracy and was used as the metric. As a result, the overall accuracy of the proposed method was 92.41% of illegal loan advertisement and 77.75% of illegal visit sales, which is apparently superior to that of the Term Frequency, MLE, etc. Hence, the result suggests that the proposed method is valid and usable practically. In this paper, we propose a framework for crisis management caused by abnormalities of unstructured data sources such as SNS. We hope this study will contribute to the academia by identifying what to consider when applying the SVM-like discrimination algorithm to text analysis. Moreover, the study will also contribute to the practitioners in the field of brand management and opinion mining.