
Optimization and Development of Prediction Model on the Removal Condition of Livestock Wastewater using a Response Surface Method in the Photo-Fenton Oxidation Process (Photo-Fenton 산화공정에서 반응표면분석법을 이용한 축산폐수의 COD 처리조건 최적화 및 예측식 수립)

  • Cho, Il-Hyoung;Chang, Soon-Woong;Lee, Si-Jin
    • Journal of Korean Society of Environmental Engineers / v.30 no.6 / pp.642-652 / 2008
  • The aim of our research was to apply experimental design methodology to optimizing the conditions of Photo-Fenton oxidation of residual livestock wastewater after the coagulation process. The Photo-Fenton oxidation reactions were described mathematically as a function of the parameters amount of Fe(II) ($x_1$), $H_2O_2$ ($x_2$), and pH ($x_3$), modeled using the Box-Behnken method, which fits second-order response surface models and is an alternative to central composite designs. The application of RSM with the Box-Behnken method yielded the following regression equation, an empirical relationship between the removal (%) of livestock wastewater and the test variables in coded units: $Y = 79.3 + 15.61x_1 - 7.31x_2 - 4.26x_3 - 18x_1^2 - 10x_2^2 - 11.9x_3^2 + 2.49x_1x_2 - 4.4x_2x_3 - 1.65x_1x_3$. The model predictions also agreed with the experimentally observed results ($R^2$ = 0.96). The results show that the treatment removal (%) in Photo-Fenton oxidation of livestock wastewater was significantly and synergistically affected by the linear terms (Fe(II) ($x_1$), $H_2O_2$ ($x_2$), pH ($x_3$)), whereas the quadratic terms Fe(II) $\times$ Fe(II) ($x_1^2$), $H_2O_2$ $\times$ $H_2O_2$ ($x_2^2$), and pH $\times$ pH ($x_3^2$) had significant antagonistic effects. Among the cross-product terms, $H_2O_2$ $\times$ pH ($x_2x_3$) also had an antagonistic effect. By canonical analysis, the estimated ridge of the expected maximum response and the optimal conditions for Y were 84 $\pm$ 0.95% at Fe(II) ($X_1$) = 0.0146 mM, $H_2O_2$ ($X_2$) = 0.0867 mM, and pH ($X_3$) = 4.704. The optimal Fe/$H_2O_2$ ratio was 0.17 at pH 4.7.
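As a quick illustration (not the authors' code), the fitted second-order response surface reported above can be evaluated directly in coded units; the function name below is made up for this sketch.

```python
# Illustrative sketch: evaluating the reported second-order regression
# equation for COD removal (%) in coded units.

def removal_response(x1, x2, x3):
    """Predicted removal (%) for coded Fe(II) (x1), H2O2 (x2), and pH (x3),
    per the regression equation reported in the abstract."""
    return (79.3
            + 15.61 * x1 - 7.31 * x2 - 4.26 * x3
            - 18.0 * x1**2 - 10.0 * x2**2 - 11.9 * x3**2
            + 2.49 * x1 * x2 - 4.4 * x2 * x3 - 1.65 * x1 * x3)

# At the design center (all coded variables zero) the prediction is the intercept.
print(removal_response(0.0, 0.0, 0.0))  # 79.3
```

The strongly negative quadratic coefficients are what produce an interior maximum, consistent with the ridge of about 84% found by canonical analysis.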

A Conceptual Review of the Transaction Costs within a Distribution Channel (유통경로내의 거래비용에 대한 개념적 고찰)

  • Kwon, Young-Sik;Mun, Jang-Sil
    • Journal of Distribution Science / v.10 no.2 / pp.29-41 / 2012
  • This paper undertakes a conceptual review of transaction cost to broaden the understanding of the transaction cost analysis (TCA) approach. More than 40 years have passed since Coase's fundamental insight that transaction, coordination, and contracting costs must be considered explicitly in explaining the extent of vertical integration. Coase (1937) forced economists to identify previously neglected constraints on the trading process to foster efficient intrafirm, rather than interfirm, transactions. The transaction cost approach to the study of economic organization regards transactions as the basic units of analysis and holds that understanding transaction cost economy is central to organizational study. The approach applies to determining efficient boundaries, as between firms and markets, and to internal transaction organization, including the design of employment relations. TCA, developed principally by Oliver Williamson (1975, 1979, 1981a), blends institutional economics, organizational theory, and contract law. Further progress in transaction cost research awaits the identification of the critical dimensions in which transaction costs differ and an examination of the economizing properties of alternative institutional modes for organizing transactions. The crucial investment distinction is: to what degree are transaction-specific (non-marketable) expenses incurred? Unspecialized items pose few hazards, since buyers can turn to alternative sources and suppliers can sell output intended for one order to other buyers. Non-marketability problems arise when specific parties' identities have important cost-bearing consequences; transactions of this kind are labeled idiosyncratic. The summarized results of the review are as follows. First, firms' distribution decisions often prompt examination of the make-or-buy question: should a marketing activity be performed within the organization by company employees or contracted to an external agent?
Second, manufacturers introducing an industrial product to a foreign market face a difficult decision: should the product be marketed primarily by captive agents (the company sales force and distribution division) or by independent intermediaries (outside sales agents and distribution)? Third, the authors develop a theoretical extension to the basic transaction cost model by combining insights from various theories with the TCA approach. Fourth, other such extensions are likely required for the general model to be applied to different channel situations; it is naive to assume the basic model applies across markedly different channel contexts without modifications and extensions. Although this study contributes to scholastic research, it is limited by several factors. First, the theoretical perspective of TCA has attracted considerable recent interest in the area of marketing channels; the analysis aims to match the properties of efficient governance structures with the attributes of the transaction. Second, empirical evidence about TCA's basic propositions is sketchy. Apart from Anderson's (1985) study of the vertical integration of the selling function and John's (1984) study of opportunism by franchised dealers, virtually no marketing studies involving the constructs implicated in the analysis have been reported. We hope, therefore, that further research will clarify distinctions between the different aspects of specific assets. Another important line of future research is the integration of efficiency-oriented TCA with organizational approaches that emphasize the conceptual definition of specific assets and industry structure. Finally, research on transaction costs, uncertainty, opportunism, and switching costs is critical to future study.


An Analysis of the Comparative Importance of Systematic Attributes for Developing an Intelligent Online News Recommendation System: Focusing on the PWYW Payment Model (지능형 온라인 뉴스 추천시스템 개발을 위한 체계적 속성간 상대적 중요성 분석: PWYW 지불모델을 중심으로)

  • Lee, Hyoung-Joo;Chung, Nuree;Yang, Sung-Byung
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.75-100 / 2018
  • Mobile devices have become an important channel for news content in our daily lives. However, online news readers' resistance to monetization is more serious than in other digital content businesses, such as webtoons, music, videos, and games. Since major portal sites distribute online news content free of charge to increase their traffic, customers have become accustomed to free news content, which makes it more difficult for online news providers to switch their business models (i.e., monetization policies). As a result, most online news providers are highly dependent on the advertising business model, which can lead to an increasing number of false, exaggerated, or sensational advertisements inside the news website to maximize advertising revenue. To reduce this advertising dependency, many online news providers have attempted to switch their 'free' readers to 'paid' users, but most have failed. Recently, however, some online news media have successfully applied the Pay-What-You-Want (PWYW) payment model, which allows readers to voluntarily pay fees for their favorite news content. These successful cases suggest to managers of online news content providers that the PWYW model can serve as an alternative business model. In this study, therefore, we collected 379 online news articles from Ohmynews.com, which has successfully employed the PWYW model, and analyzed the comparative importance of systematic attributes of online news content for readers' voluntary payment. More specifically, we derived six systematic attributes (i.e., Type of Article Title, Image Stimulation, Article Readability, Article Type, Dominant Emotion, and Article-Image Similarity) and three or four levels within each attribute based on previous studies. We then conducted content analysis to measure the five attributes other than Article Readability, which was measured by the Flesch readability score.
Before the main content analysis, the face reliability of the chosen attributes was checked by three doctoral-level researchers on 37 sample articles, and the inter-coder reliability of the three coders was verified. The main content analysis was then conducted for two months from March 2017 on 379 online news articles. All 379 articles were reviewed by the same three coders, and 65 articles that showed inconsistency among the coders were excluded before conjoint analysis. Finally, we examined the comparative importance of the six systematic attributes (Study 1) and of the levels within each attribute (Study 2) through conjoint analysis of 314 online news articles. From the results of the conjoint analysis, we found that Article Readability, Article-Image Similarity, and Type of Article Title are the most significant factors affecting online news readers' voluntary payment. First, this can be interpreted to mean that if the readability of an online news article is in line with readers' reading level, readers will voluntarily pay more. Second, similarity between the content of an article and the image within it helps readers accept the information and transmits the article's message more effectively. Third, readers expect the article title to reveal the content of the article, and this expectation influences their understanding of and satisfaction with the article. Therefore, it is necessary to write articles at an appropriate readability level and to use images and titles well matched with the content to make readers voluntarily pay more. We also examined the comparative importance of the levels within each attribute in more detail. Based on the findings of the two studies, two major and nine minor propositions are suggested for future empirical research.
This study has academic implications in that it is one of the first to apply content analysis and conjoint analysis together to examine readers' voluntary payment behavior, rather than their intention to pay. In addition, online news content creators, providers, and managers can find practical insights in this research on how to produce news content that makes readers voluntarily pay more for online news.
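The Article Readability attribute above is scored with the Flesch readability measure. As a hedged sketch, the standard Flesch Reading Ease formula is shown below from its raw counts; note it was defined for English, and the study's exact adaptation to Korean text is not detailed in the abstract.

```python
# Sketch of the standard Flesch Reading Ease formula (higher = easier).
# The study applies a Flesch-style score; this is the textbook formula,
# not necessarily the authors' exact implementation.

def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease from raw word, sentence, and syllable counts."""
    return (206.835
            - 1.015 * (words / sentences)
            - 84.6 * (syllables / words))

# Example counts: 100 words, 5 sentences, 130 syllables.
print(flesch_reading_ease(100, 5, 130))
```

Longer sentences (more words per sentence) and longer words (more syllables per word) both lower the score, which is why matching the score to the audience's reading level matters for voluntary payment.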

Smart Store in Smart City: The Development of Smart Trade Area Analysis System Based on Consumer Sentiments (Smart Store in Smart City: 소비자 감성기반 상권분석 시스템 개발)

  • Yoo, In-Jin;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.25-52 / 2018
  • This study performs social network analysis based on consumer sentiment related to locations in Seoul, using data reflecting consumers' web search activities and emotional evaluations associated with commerce. The study focuses on large commercial districts in Seoul. In addition, to consider their various aspects, social network indexes were combined with the trading areas' public data to verify factors affecting the areas' sales. A comparison of R-square values shows that a model including only the districts' public data (static data) already achieves a fairly high R-square, but the R-square of the model combined with the network indexes derived from the social network analysis is improved considerably. A regression analysis of the trading areas' public data showed that five of the twenty-two variables, 'number of market districts,' 'residential area per person,' 'satisfaction with residential environment,' 'rate of change of trade,' and 'survival rate over 3 years,' had a significant influence on the sales of the trading areas. According to the results, 'residential area per person' has the highest standardized beta value and therefore the strongest influence on commercial sales. In addition, 'residential area per person,' 'number of market districts,' and 'survival rate over 3 years' were found to have positive effects on the sales of all trading areas. Thus, sales increase as the number of market districts in the trading area increases, as residential area per person increases, and as the 3-year survival rate of each store in the trading area increases. On the other hand, 'satisfaction with residential environment' and 'rate of change of trade' were found to have a negative effect on sales: in the case of 'satisfaction with residential environment,' sales increase when the satisfaction level is low.
Therefore, as consumer dissatisfaction with the residential environment increases, sales increase. The 'rate of change of trade' result shows that sales increase as the acceleration of transaction frequency decreases. According to the social network analysis, of the 25 regional trading areas in Seoul, Yangcheon-gu has the highest degree of connection; in other words, it shares common sentiments with many other trading areas. On the other hand, Nowon-gu and Jungrang-gu have the lowest degree of connection, meaning they have relatively distinct sentiments from other trading areas. The social network indexes used in the combined model are 'density of ego network,' 'degree centrality,' 'closeness centrality,' 'betweenness centrality,' and 'eigenvector centrality.' The combined-model analysis confirmed that degree centrality and eigenvector centrality have a significant influence on sales and the highest influence in the model. 'Degree centrality' has a negative effect on the sales of the districts. This implies that sales decrease when a trading area shares the various sentiments of many other trading areas, which runs counter to common intuition. This result can be interpreted to mean that a trading area with low 'degree centrality' delivers unique and special sentiments to consumers. The findings can also be interpreted to mean that sales can be increased if a trading area raises consumer recognition by forming a unique sentiment and city atmosphere that distinguish it from other trading areas. On the other hand, 'eigenvector centrality' has the greatest effect on sales in the combined model, and its effect on sales was confirmed to be positive. This finding shows that sales increase more when a trading area is connected to others with strong centrality than when it merely shares common sentiments with many others.
This study can be used as an empirical basis for establishing and implementing city and trading-area strategy plans that consider consumers' desired sentiments. In addition, we expect to provide entrepreneurs and potential entrepreneurs entering a trading area with the sentiments held by those in the area and with directions into the area that take the district-sentiment structure into account.
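Two of the network indexes named above, degree centrality and eigenvector centrality, can be sketched on a toy sentiment-sharing network. This is purely illustrative (not the authors' data or code); the district labels and edges are invented, and eigenvector centrality is computed by simple power iteration.

```python
# Illustrative sketch of two network indexes from the combined model,
# computed on a toy graph where an edge means two trading areas share
# common consumer sentiments. Node names and edges are made up.

def degree_centrality(adj):
    """Fraction of the other nodes each node is connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def eigenvector_centrality(adj, iters=100):
    """Eigenvector centrality via power iteration (max-normalized)."""
    c = {v: 1.0 for v in adj}
    for _ in range(iters):
        nxt = {v: sum(c[u] for u in adj[v]) for v in adj}
        norm = max(nxt.values()) or 1.0
        c = {v: s / norm for v, s in nxt.items()}
    return c

# Toy graph: area A shares sentiments with everyone; D only with A.
adj = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
dc = degree_centrality(adj)
ec = eigenvector_centrality(adj)
print(max(dc, key=dc.get))  # A
```

In the study's terms, a node like D (low degree centrality, i.e., few shared sentiments) corresponds to a trading area with a distinct sentiment profile, which the regression found to be associated with higher sales.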

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.167-181 / 2018
  • Over the past decade, deep learning has been in the spotlight among machine learning algorithms. In particular, the CNN (Convolutional Neural Network), known as an effective solution for recognizing and classifying images or voices, has been popularly applied to classification and prediction problems. In this study, we investigate how to apply a CNN to business problem solving. Specifically, this study proposes to apply a CNN to stock market prediction, one of the most challenging tasks in machine learning research. As mentioned, a CNN's strength is interpreting images. Thus, the model proposed in this study adopts a CNN as a binary classifier that predicts the stock market direction (upward or downward) using time series graphs as its inputs. That is, our proposal is to build a machine learning algorithm that mimics the experts called 'technical analysts', who examine graphs of past price movements and predict future price movements. Our proposed model, named 'CNN-FG (Convolutional Neural Network using Fluctuation Graph)', consists of five steps. In the first step, it divides the dataset into intervals of 5 days. In step 2, it creates time series graphs for the divided dataset; the image in which each graph is drawn is 40 (pixels) $\times$ 40 (pixels), and the graph of each independent variable is drawn in a different color. In step 3, the model converts the images into matrices: each image becomes a combination of three matrices expressing the color values on the R (red), G (green), and B (blue) scales. In the next step, it splits the dataset of graph images into training and validation datasets; we used 80% of the total dataset for training and the remaining 20% for validation. In the final step, CNN classifiers are trained on the images of the training dataset.
Regarding the parameters of CNN-FG, we adopted two convolution filters ($5{\times}5{\times}6$ and $5{\times}5{\times}9$) in the convolution layer. In the pooling layer, a $2{\times}2$ max pooling filter was used. The numbers of nodes in the two hidden layers were set to 900 and 32, respectively, and the number of nodes in the output layer was set to 2 (one for the prediction of an upward trend, the other for a downward trend). The activation functions for the convolution layer and the hidden layers were set to ReLU (Rectified Linear Unit), and that of the output layer to the Softmax function. To validate CNN-FG, we applied it to the prediction of KOSPI200 over 2,026 days in eight years (from 2009 to 2016). To match the proportions of the two groups of the dependent variable (i.e., tomorrow's stock market movement), we selected 1,950 samples by random sampling. Finally, we built the training dataset from 80% of the total dataset (1,560 samples) and the validation dataset from the remaining 20% (390 samples). The independent variables of the experimental dataset were twelve technical indicators popularly used in previous studies, including Stochastic %K, Stochastic %D, Momentum, ROC (rate of change), LW %R (Larry William's %R), A/D oscillator (accumulation/distribution oscillator), OSCP (price oscillator), and CCI (commodity channel index). To confirm the superiority of CNN-FG, we compared its prediction accuracy with those of other classification models. Experimental results showed that CNN-FG outperforms LOGIT (logistic regression), ANN (artificial neural network), and SVM (support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models on these graphs can be effective in terms of prediction accuracy.
Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
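Steps 2-3 of CNN-FG (rasterizing a 5-day window into a 40$\times$40 image and splitting it into R/G/B matrices) can be sketched as below. The abstract does not give the authors' plotting details, so the pixel placement here is an assumption for illustration; each indicator series would be drawn in its own color and the channels fed to the CNN.

```python
# Hedged sketch of CNN-FG steps 2-3: plot a 5-day indicator series as
# colored pixels on a 40x40 grid, then split the image into R/G/B matrices.
# The exact rasterization used by the authors is not specified.

SIZE = 40

def series_to_rgb(series, color):
    """Rasterize a short series onto a SIZE x SIZE grid and return the
    three per-channel matrices (R, G, B) the model would consume."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0
    img = [[[0, 0, 0] for _ in range(SIZE)] for _ in range(SIZE)]
    for day, value in enumerate(series):
        x = day * (SIZE - 1) // (len(series) - 1)               # spread days across width
        y = SIZE - 1 - round((value - lo) / span * (SIZE - 1))  # higher value = higher row
        img[y][x] = list(color)
    # split into per-channel matrices
    return [[[px[ch] for px in row] for row in img] for ch in range(3)]

r, g, b = series_to_rgb([1.0, 1.2, 1.1, 1.4, 1.3], (255, 0, 0))
print(len(r), len(r[0]))  # 40 40
```

In the full model, one such image per 5-day window (with all twelve indicators overlaid in different colors) becomes a training example for the binary up/down classifier.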

A Study on Risk Parity Asset Allocation Model with XGBoost (XGBoost를 활용한 리스크패리티 자산배분 모형에 관한 연구)

  • Kim, Younghoon;Choi, HeungSik;Kim, SunWoong
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.135-149 / 2020
  • Artificial intelligence is changing the world, and financial markets are no exception. Robo-advisors are being actively developed, making up for the weaknesses of traditional asset allocation methods and replacing the parts that are difficult for those methods. A robo-advisor makes automated investment decisions with artificial intelligence algorithms and is used with various asset allocation models such as the mean-variance model, the Black-Litterman model, and the risk parity model. The risk parity model is a typical risk-based asset allocation model focused on the volatility of assets. Because it avoids investment risk structurally, it is stable for managing large funds and has been widely used in finance. XGBoost is a parallel tree-boosting method: an optimized gradient boosting model designed to be highly efficient and flexible. It not only scales to billions of examples in limited-memory environments but also learns much faster than traditional boosting methods, and it is frequently used in many fields of data analysis. In this study, therefore, we propose a new asset allocation model that combines the risk parity model with the XGBoost machine learning model. The model uses XGBoost to predict the risk of assets and applies the predicted risk to the covariance estimation process. Because an optimized asset allocation model estimates investment proportions from historical data, there are estimation errors between the estimation period and the actual investment period, and these errors adversely affect the optimized portfolio's performance. This study aims to improve the stability and portfolio performance of the model by predicting the volatility of the next investment period, thereby reducing the estimation errors of the optimized asset allocation model. As a result, it narrows the gap between theory and practice and proposes a more advanced asset allocation model.
For the empirical test of the suggested model, we used Korean stock market price data covering 17 years, from 2003 to 2019. The datasets comprise the energy, finance, IT, industrial, material, telecommunication, utility, consumer, health care, and staples sectors. We accumulated predictions using a moving-window method with 1,000 in-sample and 20 out-of-sample observations, producing a total of 154 rebalancing back-testing results. We analyzed portfolio performance in terms of cumulative rate of return and obtained a large amount of sample data thanks to the long test period. Compared with the traditional risk parity model, this experiment recorded improvements in both cumulative return and reduction of estimation errors: the total cumulative return is 45.748%, about 5 percentage points higher than that of the risk parity model, and the estimation errors are reduced in 9 of the 10 industry sectors. The reduction of estimation errors increases the stability of the model and makes it easier to apply in practical investment. The results of the experiment showed improved portfolio performance through reduced estimation errors in the optimized asset allocation model. Many financial and asset allocation models are limited in practical investment by the fundamental question of whether the past characteristics of assets will persist into the future in a changing financial market. This study, however, not only takes advantage of traditional asset allocation models but also supplements their limitations and increases stability by predicting the risks of assets with a state-of-the-art algorithm. There are various studies on parametric estimation methods to reduce estimation errors in portfolio optimization; we suggest a new way to reduce estimation errors in an optimized asset allocation model using machine learning.
This study is thus meaningful in that it proposes an advanced artificial-intelligence asset allocation model for fast-developing financial markets.
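The core allocation idea can be sketched in its simplest (naive) form: weight each asset in inverse proportion to its volatility, ignoring correlations. This is illustrative only; in the proposed model the volatility inputs would come from XGBoost forecasts for the next investment period rather than from historical estimates, and the full model works through the covariance matrix.

```python
# Illustrative sketch (not the authors' code): naive risk parity via
# inverse-volatility weights. Correlations are ignored in this sketch.

def inverse_volatility_weights(predicted_vols):
    """Weight each asset inversely to its predicted volatility so each
    contributes roughly equal risk (correlation-free approximation)."""
    inv = [1.0 / v for v in predicted_vols]
    total = sum(inv)
    return [w / total for w in inv]

# Example: three sectors with predicted volatilities of 10%, 20%, 40%.
weights = inverse_volatility_weights([0.10, 0.20, 0.40])
print([round(w, 4) for w in weights])  # [0.5714, 0.2857, 0.1429]
```

Substituting forecast volatilities for historical ones is exactly where the estimation-error reduction is claimed: better next-period risk estimates shift weight away from assets whose risk is about to rise.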

The Mediating Effect of Experiential Value on Customers' Perceived Value of Digital Content: China's Anti-virus Program Market (경험개치대소비자대전자내용적인지개치적중개영향(经验价值对消费者对电子内容的认知价值的中介影响): 중국살독연건시장(中国杀毒软件市场))

  • Jia, Weiwei;Kim, Sae-Bum
    • Journal of Global Scholars of Marketing Science / v.20 no.2 / pp.219-230 / 2010
  • Digital content is bringing big changes to our daily lives along with opportunities and challenges for companies. Creative firms integrate pictures, texts, videos, audio, and data through digitalization to develop new products or services, and create digital experiences to promote their brands. Most articles on digital content in the literature address its basic concepts or the development of its marketing. In fact, compared with traditional value chains for common products or services, the digital content industry seems to have more potential value. Because quite a bit of digital content is free to the consumer, price is not necessarily perceived as an indicator of the quality or value of information (Rowley 2008). It is evident that a current theme in digital content is the issue of "value," and research on customers' perceived value of digital content is a necessity. This article argues that experiential value has an advantage in customers' evaluations of digital content. Two different but related contributions to the understanding of the "value" of digital content are made here. First, based on a comparison of digital content with products and services, the article proposes two key characteristics that make an experiential strategy suitable for digital content: intangibility and near-zero reproduction cost. Second, based on a discussion of the gap between a company's idealized value and customers' perceived value, this article emphasizes that the pricing of digital content differs from that of products and services. As a result of intangibility, prices may not reflect customer value. Moreover, the cost of digital content in the development stage may be very high while reproduction costs shrink dramatically, and because of the value gap mentioned above, pricing policies vary across different kinds of digital content.
For example, a flat price policy is generally used for movies and music (Magiera 2001; Netherby 2002), while for digital content with continuous demand, such as online games and anti-virus programs, pricing involves a more complicated matter of utility and competitive price levels. Digital content companies have to explore various strategies to overcome this gap, so rethinking marketing tools such as advertisements, images, and word-of-mouth and their effect on customers' perceived value becomes essential. China's digital content industry is becoming more and more globalized and is drawing special attention from countries and regions with their own competitive advantages. The 2008-2009 Annual Report on the Development of China's Digital Content Industry (CCIDConsulting 2009) indicates that, driven by domestic demand and governmental policy support, the country's digital content industry maintained fast growth of some 30 percent in 2008, clearly indicating the initial stage of industry expansion. In China, anti-virus programs and other software that needs to be updated use a quarter-based pricing policy: customers can download a trial version for free and use it for six months or a year, and continued payment is needed to use it longer. They examine the quality of the digital content during this trial period and decide whether to pay for continued usage. In China's music and movie industries, which are at an early stage of development, the experiential strategy has not been applied much, even though firms in other countries have found trial experiences to be an important strategy (such as letting customers listen to several seconds of a song for free before downloading it). For these reasons, anti-virus programs may be representative of the digital content industry in China, and an exploratory study of the advantage of experiential value in customers' perceived value of digital content was conducted in China's anti-virus market.
To enhance the reliability of the survey data, this study focused on experienced users of anti-virus programs. The empirical results revealed that experiential value has a positive effect on customers' perceived value of digital content. In other words, because digital content is intangible and its reproduction costs are nearly zero, customers' evaluations are based heavily on their experience. Moreover, image and word-of-mouth do not have a positive effect on perceived value, only on experiential value. That is to say, the digital content value chain differs from that of a general product or service: experiential value has a notable advantage and mediates the effect of image and word-of-mouth on perceived value. The results of this study help explain why free digital content downloads exist in developing countries: customers can perceive the value of digital content only by using and experiencing it. This is also why such governments support the development of digital content. Other developing countries whose digital content businesses are also in the beginning stage can make use of the suggestions here. Moreover, given the advantage of the experiential strategy, companies should invest more in customers' experience. As a result of the characteristics and value gap of digital content, customers perceive more value in intangible digital content only by experiencing what they really want, and because of the near-zero reproduction costs, companies can use an experiential strategy to enhance customers' understanding of digital content.

Customer Behavior Prediction of Binary Classification Model Using Unstructured Information and Convolution Neural Network: The Case of Online Storefront (비정형 정보와 CNN 기법을 활용한 이진 분류 모델의 고객 행태 예측: 전자상거래 사례를 중심으로)

  • Kim, Seungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.221-241 / 2018
  • Deep learning has been getting attention recently. The deep learning technique applied in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and in AlphaGo is the Convolutional Neural Network (CNN). A CNN is characterized by dividing the input image into small sections to recognize partial features and combining them to recognize the whole. Deep learning technologies are expected to bring many changes to our lives, but until now their applications have been limited to image recognition and natural language processing. The use of deep learning techniques for business problems is still at an early research stage. If their performance is proved, they can be applied to traditional business problems such as marketing response prediction, fraudulent transaction detection, bankruptcy prediction, and so on. It is therefore a very meaningful experiment to diagnose the possibility of solving business problems with deep learning technologies, based on the case of online shopping companies, which have big data, can identify customer behavior relatively easily, and can make high use of the results. In online shopping especially, the competitive environment is changing rapidly and becoming more intense, so analysis of customer behavior to maximize profit is becoming more and more important. In this study, we propose a 'CNN model of heterogeneous information integration' as a way to improve the prediction of customer behavior in online shopping enterprises.
To optimize performance, the proposed model learns from both structured and unstructured information through a convolutional neural network combined with a multi-layer perceptron structure. We design three components, 'heterogeneous information integration', 'unstructured information vector conversion', and 'multi-layer perceptron design', evaluate the performance of each architecture, and confirm the proposed model based on the results. The target variables for predicting customer behavior are defined as six binary classification problems: re-purchaser, churn, frequent shopper, frequent refund shopper, high amount shopper, and high discount shopper. To verify the usefulness of the proposed model, we conducted experiments using actual transaction, customer, and VOC (voice of customer) data of a specific online shopping company in Korea. The data cover 47,947 customers who registered at least one VOC in January 2011 (one month): their customer profiles, a total of 19 months of transaction data from September 2010 to March 2012, and the VOCs posted during that month. The experiment is divided into two stages. In the first stage, we evaluate the three architectures that affect the performance of the proposed model and select the optimal parameters; in the second, we evaluate the performance of the proposed model. Experimental results show that the proposed model, which combines structured and unstructured information, is superior to NBC (Naïve Bayes classification), SVM (support vector machine), and ANN (artificial neural network). It is therefore significant that the use of unstructured information contributes to predicting customer behavior, and that CNNs can be applied to business problems as well as to image recognition and natural language processing.
The experiments confirm that the CNN is effective in understanding and interpreting contextual meaning in textual VOC data. It is also significant that this empirical study, based on actual data of an e-commerce company, shows that very meaningful information for customer behavior prediction can be extracted from VOC data written in free text directly by customers. Finally, through various experiments, the proposed model provides useful information for future research on parameter selection and performance.
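The 'heterogeneous information integration' described above can be sketched as a single forward pass: a convolution over embedded VOC tokens yields a text feature vector, which is concatenated with structured customer features before a multi-layer perceptron predicts a binary target such as churn. The following is a minimal NumPy sketch, not the authors' implementation; all dimensions, parameter shapes, and the churn target are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): a 50-token VOC text with
# 32-dim embeddings, plus 9 structured customer/transaction features.
SEQ_LEN, EMB_DIM, N_STRUCT, N_FILTERS, KERNEL = 50, 32, 9, 16, 3

def conv1d_maxpool(x, W, b):
    """Valid 1D convolution over the token axis, then global max pooling."""
    windows = np.stack([x[i:i + KERNEL].ravel()
                        for i in range(SEQ_LEN - KERNEL + 1)])
    feature_maps = np.maximum(windows @ W + b, 0.0)     # ReLU
    return feature_maps.max(axis=0)                     # shape: (N_FILTERS,)

def predict(tokens_embedded, structured, params):
    W_c, b_c, W_h, b_h, w_o, b_o = params
    text_vec = conv1d_maxpool(tokens_embedded, W_c, b_c)
    fused = np.concatenate([text_vec, structured])      # heterogeneous integration
    hidden = np.maximum(fused @ W_h + b_h, 0.0)         # MLP layer
    return 1.0 / (1.0 + np.exp(-(hidden @ w_o + b_o)))  # P(churn = 1)

params = (
    rng.normal(0, 0.1, (KERNEL * EMB_DIM, N_FILTERS)), np.zeros(N_FILTERS),
    rng.normal(0, 0.1, (N_FILTERS + N_STRUCT, 8)), np.zeros(8),
    rng.normal(0, 0.1, 8), 0.0,
)
p = predict(rng.normal(size=(SEQ_LEN, EMB_DIM)), rng.normal(size=N_STRUCT), params)
print(0.0 < p < 1.0)  # True
```

In a real system the parameters would of course be trained end to end with cross-entropy loss; the point here is only the fusion step, where text-derived and structured features enter the same hidden layer.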

Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

  • Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.59-83
    • /
    • 2018
  • With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In deep-learning sentiment analysis of English texts, the natural language sentences in the training and test datasets are usually converted into sequences of word vectors before being fed into the models. In this case, word vectors generally refer to vector representations of the tokens obtained by splitting a sentence on space characters. There are several ways to derive word vectors; one is Word2Vec, used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data, which have been widely used in sentiment analysis of reviews from fields such as restaurants, movies, laptops, and cameras. Unlike in English, the morpheme plays an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with developed postpositions and endings. A morpheme is the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, the word '예쁘고' consists of the morphemes '예쁘' (adjective stem) and '고' (connective ending). Reflecting the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use 'morpheme vectors' as input to a deep learning model rather than the 'word vectors' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word-vector derivation mechanism to sentences divided into their constituent morphemes. Several questions arise here. What is the desirable range of POS (part-of-speech) tags when deriving morpheme vectors for improving the classification accuracy of a deep learning model?
Is it proper to apply a typical word vector model, which primarily relies on the surface form of words, to Korean with its high homonym ratio? Will text preprocessing such as correcting spelling or spacing errors affect the classification accuracy, especially when drawing morpheme vectors from Korean product reviews containing many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which may be encountered first when applying deep learning models to Korean texts. As a starting point, we summarized them in three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors derived from grammatically correct texts of a domain other than the analysis target, or morpheme vectors derived from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme-vector derivation method for Korean regarding the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can we reach a satisfactory level of classification accuracy when applying deep learning to Korean sentiment analysis? To address these questions, we generate various types of morpheme vectors reflecting them and then compare the classification accuracy through a non-static CNN (convolutional neural network) model that takes the morpheme vectors as input. As training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used. To derive the morpheme vectors, we use both data from the same domain as the target and data from another domain: about 2 million Naver Shopping cosmetics product reviews and 520,000 Naver News articles, arguably corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ in terms of the following three criteria.
First, they come from two data sources: Naver News, with high grammatical correctness, and Naver Shopping's cosmetics product reviews, with low grammatical correctness. Second, they differ in the degree of preprocessing: either sentence splitting only, or additional spelling and spacing corrections after sentence separation. Third, they vary in the form of input fed into the word vector model: the morphemes alone, or the morphemes with their POS tags attached. The morpheme vectors further vary in the considered range of POS tags, the minimum frequency for a morpheme to be included, and the random initialization range. All morpheme vectors are derived through the CBOW (continuous bag-of-words) model with a context window of 5 and a vector dimension of 300. The results suggest that using text from the same domain even with lower grammatical correctness, performing spelling and spacing corrections in addition to sentence splitting, and incorporating morphemes of all POS tags including the incomprehensible category lead to better classification accuracy. POS-tag attachment, devised for the high proportion of homonyms in Korean, and the minimum-frequency threshold for including a morpheme do not seem to have any definite influence on the classification accuracy.
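The morpheme-vector pipeline varied in this study (POS-tag attachment, CBOW with window 5) can be illustrated through its preprocessing step alone. The sketch below assumes the morphological analysis (e.g., '예쁘고' into '예쁘'/adjective + '고'/ending) has already been done by an external analyzer; it only shows the optional tag attachment used to disambiguate homonyms and the (context, target) pair extraction a CBOW model with window 5 would train on. The tag labels and the toy sentence are illustrative.

```python
# Minimal sketch of CBOW-style preprocessing over morphemes (illustrative data).
WINDOW = 5

def attach_tags(morphs, with_tags=True):
    # Optionally suffix each morpheme with its POS tag, e.g. '예쁘' -> '예쁘/VA',
    # so homonymous forms with different tags get separate vectors.
    return [f"{m}/{t}" if with_tags else m for m, t in morphs]

def cbow_pairs(tokens, window=WINDOW):
    # For each target morpheme, collect up to `window` neighbors on each side.
    pairs = []
    for i, target in enumerate(tokens):
        ctx = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if ctx:
            pairs.append((ctx, target))
    return pairs

# '예쁘고 좋아요' analyzed into morphemes with hypothetical POS tags.
analyzed = [("예쁘", "VA"), ("고", "EC"), ("좋", "VA"), ("아요", "EF")]
tokens = attach_tags(analyzed)      # ['예쁘/VA', '고/EC', '좋/VA', '아요/EF']
pairs = cbow_pairs(tokens)
print(len(pairs), pairs[0][1])      # 4 예쁘/VA
```

In practice, corpora tokenized this way would be passed to a CBOW implementation (for instance gensim's `Word2Vec` with `sg=0`, `window=5`, `vector_size=300`, and a `min_count` threshold) to obtain the 300-dimensional morpheme vectors the study feeds into the non-static CNN.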

An Empirical Study on the Determinants of Supply Chain Management Systems Success from Vendor's Perspective (참여자관점에서 공급사슬관리 시스템의 성공에 영향을 미치는 요인에 관한 실증연구)

  • Kang, Sung-Bae;Moon, Tae-Soo;Chung, Yoon
    • Asia pacific journal of information systems
    • /
    • v.20 no.3
    • /
    • pp.139-166
    • /
    • 2010
  • Supply chain management (SCM) systems have emerged as strong managerial tools for manufacturing firms in enhancing competitive strength. Despite large investments in SCM systems, many companies are not fully realizing the promised benefits. A review of the literature on adoption, implementation, and success factors of IOS (inter-organizational systems) and EDI (electronic data interchange) systems shows that this issue has been examined from multiple theoretical perspectives, and many researchers have attempted to identify the factors that influence the success of system implementation. However, existing studies have two drawbacks in revealing the determinants of implementation success. First, previous research raises questions about the appropriateness of the research subjects selected. Most SCM systems operate in the form of private industrial networks, where the participants consist of two distinct groups: focus companies and vendors. The focus companies are the primary actors in developing and operating the systems, while vendors are passive participants connected to the systems in order to supply raw materials and parts to the focus companies. Under these circumstances, there are three ways of selecting research subjects: focus companies only, vendors only, or the two parties grouped together. It is hard to find research that uses focus companies exclusively as subjects, probably due to insufficient sample sizes for statistical analysis, so most research has been conducted using data collected from both groups. We argue that the SCM success factors cannot be correctly identified in this case. The focus companies and the vendors differ in many areas relevant to system implementation: firm size, managerial resources, bargaining power, organizational maturity, etc. There is no obvious reason to believe that the success factors of the two groups are identical.
Grouping the two together also raises questions about measuring system success. The benefits of the systems may not be distributed evenly between the two groups; one group's benefits might be realized at the expense of the other, considering that vendors participating in SCM systems are under continuous pressure from the focus companies with respect to prices, quality, and delivery time. Therefore, by combining the system outcomes of both groups we cannot correctly measure the benefits obtained by each group. Second, the measures of system success adopted in previous research fall short in measuring SCM success. User satisfaction, system utilization, and user attitudes toward the systems are the most commonly used success measures in existing studies. These were developed as proxy variables in studies of decision support systems (DSS), where the contribution of the systems to organizational performance is very difficult to measure. Unlike DSS, SCM systems have more specific goals, such as cost saving, inventory reduction, quality improvement, shorter lead times, and higher customer service. We maintain that more specific measures can be developed instead of proxy variables in order to measure the system benefits correctly. The purpose of this study is to find the determinants of SCM systems success from the perspective of vendor companies. In developing the research model, we focused on selecting success factors appropriate for the vendors through a review of past research, and on developing more accurate success measures. The variables are classified into technological, organizational, and environmental factors on the basis of the TOE (Technology-Organization-Environment) framework.
The model consists of three independent variables (competition intensity, top management support, and information system maturity), one mediating variable (collaboration), one moderating variable (government support), and a dependent variable (system success). The system success measures were developed to reflect the operational benefits of SCM systems: improvement in planning and analysis capabilities, faster throughput, cost reduction, task integration, and improved product and customer service. The model was validated using survey data collected from 122 vendors participating in SCM systems in Korea. To test for mediation, a hierarchical regression analysis was estimated on collaboration; to test for moderation, a moderated multiple regression was estimated to examine the effect of government support. The results show that information system maturity and top management support are the most important determinants of SCM system success. Supply chain technologies that standardize data formats and enhance information sharing may be adopted by the supply chain leader organization, because of the influence of the focal company in private industrial networks, in order to streamline transactions and improve inter-organizational communication. In particular, the need to develop and sustain information system maturity provides the focus and purpose needed to successfully overcome information system obstacles and resistance to innovation diffusion within the supply chain network organization. The support of top management helps focus efforts toward the realization of inter-organizational benefits and lends credibility to the functional managers responsible for implementation. The active involvement, vision, and direction of high-level executives provide the impetus needed to sustain SCM implementation. The quality of collaboration relationships is also positively related to the outcome variable.
The collaboration variable is found to mediate between the influencing factors and implementation success. Higher levels of inter-organizational collaboration behaviors, such as shared planning and flexibility in coordinating activities, were found to be strongly linked to the vendors' trust in the supply chain network. Government support moderates the effect of IS maturity, competitive intensity, and top management support on collaboration and on SCM implementation success. In general, vendor companies face substantially greater risks in SCM implementation than larger companies do, because of severe constraints on financial and human resources and limited education on SCM systems. Beyond resources, vendors generally lack computer experience and sufficient internal SCM expertise. For these reasons, government support may establish requirements for firms doing business with the government or provide incentives to adopt and implement SCM systems or practices. Government support yields significant improvements in SCM implementation success when IS maturity, competitive intensity, top management support, and collaboration are low. The environmental characteristic of competition intensity has no direct effect on SCM system success from the vendor perspective, but vendors facing above-average competition intensity will have a greater need for changing technology. This suggests that companies trying to implement SCM systems should set up compatible supply chain networks and high-quality collaboration relationships for implementation and performance.
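The mediation and moderation tests this abstract describes (hierarchical regression on the mediator, then a moderated multiple regression with an interaction term) can be sketched with ordinary least squares on synthetic data. This is an illustrative reconstruction of the analysis style only, not the paper's actual data or results; all coefficients below are invented, and only the sample size of 122 vendors comes from the source.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 122  # sample size matching the 122 surveyed vendors

def fit(X, y):
    """OLS with intercept; returns coefficients [b0, b1, ...]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0]

# Synthetic illustration: IS maturity -> collaboration -> SCM success,
# with government support moderating the maturity effect.
maturity = rng.normal(size=n)
gov = rng.normal(size=n)
collab = 0.6 * maturity + rng.normal(scale=0.5, size=n)            # mediator
success = (0.5 * collab + 0.3 * maturity * gov
           + rng.normal(scale=0.5, size=n))

# Mediation step 1: the predictor must explain the mediator.
a = fit(maturity[:, None], collab)[1]
# Mediation step 2: the mediator predicts the outcome, controlling
# for the predictor.
b = fit(np.column_stack([maturity, collab]), success)[2]
# Moderation: add the interaction term in a moderated multiple regression.
inter = fit(np.column_stack([maturity, gov, maturity * gov]), success)[3]

print(round(a, 2), round(b, 2), round(inter, 2))
```

On this synthetic sample the estimates recover the planted positive paths, which is the pattern the study reports for IS maturity, collaboration, and the government-support moderation.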