Rough Set Analysis for Stock Market Timing (러프집합분석을 이용한 매매시점 결정)

  • Huh, Jin-Nyung;Kim, Kyoung-Jae;Han, In-Goo
    • Journal of Intelligence and Information Systems, v.16 no.3, pp.77-97, 2010
  • Market timing is an investment strategy used to obtain excess returns from the financial market. In general, detecting market timing means determining when to buy and sell in order to earn excess returns from trading. In many market timing systems, trading rules have been used as an engine to generate trade signals. On the other hand, some researchers have proposed rough set analysis as a proper tool for market timing because, by using a control function, it does not generate a trade signal when the pattern of the market is uncertain. Numeric data must be discretized for rough set analysis because rough sets only accept categorical data. Discretization searches for proper "cuts" in numeric data that determine intervals, and all values that lie within an interval are transformed into the same value. In general, there are four methods for data discretization in rough set analysis: equal frequency scaling, expert's knowledge-based discretization, minimum entropy scaling, and naïve and Boolean reasoning-based discretization. Equal frequency scaling fixes the number of intervals, examines the histogram of each variable, and then determines cuts so that approximately the same number of samples falls into each interval. Expert's knowledge-based discretization determines cuts according to the knowledge of domain experts obtained through literature review or interviews. Minimum entropy scaling recursively partitions the value set of each variable so that a local measure of entropy is optimized. Naïve and Boolean reasoning-based discretization derives categorical values by naïve scaling of the data and then finds optimized discretization thresholds through Boolean reasoning. Although rough set analysis is promising for market timing, there is little research on how the various data discretization methods affect trading performance. In this study, we compare stock market timing models that use rough set analysis with various data discretization methods. The research data are the KOSPI 200 from May 1996 to October 1998. The KOSPI 200 is the underlying index of the KOSPI 200 futures, the first derivative instrument in the Korean stock market. It is a market-value-weighted index consisting of 200 stocks selected by criteria on liquidity and their status in the corresponding industries, including manufacturing, construction, communication, electricity and gas, distribution and services, and financing. The total number of samples is 660 trading days. In addition, this study uses popular technical indicators as independent variables. The experimental results show that the most profitable method for the training sample is naïve and Boolean reasoning-based discretization, but expert's knowledge-based discretization is the most profitable for the validation sample. In addition, expert's knowledge-based discretization produced robust performance for both the training and validation samples. We also compared rough set analysis with a decision tree, using C4.5 for the comparison. The results show that rough set analysis with expert's knowledge-based discretization produced more profitable rules than C4.5.
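
To make the discretization step concrete, below is a minimal Python sketch of equal frequency scaling, one of the four methods listed in the abstract. The data and the number of intervals are illustrative assumptions, not values from the paper.

```python
import numpy as np

def equal_frequency_cuts(values, n_intervals):
    """Find cut points so that roughly the same number of samples
    falls into each interval (equal frequency scaling)."""
    # Interior quantiles become the cuts between intervals.
    quantiles = np.linspace(0, 1, n_intervals + 1)[1:-1]
    return np.quantile(np.asarray(values, dtype=float), quantiles)

def discretize(values, cuts):
    """Map each numeric value to the index of the interval it falls in,
    turning a numeric variable into a categorical one."""
    return np.digitize(values, cuts)

# Illustrative: discretize one technical indicator observed over
# 660 trading days (the paper's sample size) into 3 categories.
indicator = np.random.default_rng(0).normal(size=660)
cuts = equal_frequency_cuts(indicator, n_intervals=3)
categories = discretize(indicator, cuts)  # values in {0, 1, 2}
```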

Structural features and Diffusion Patterns of Gartner Hype Cycle for Artificial Intelligence using Social Network analysis (인공지능 기술에 관한 가트너 하이프사이클의 네트워크 집단구조 특성 및 확산패턴에 관한 연구)

  • Shin, Sunah;Kang, Juyoung
    • Journal of Intelligence and Information Systems, v.28 no.1, pp.107-129, 2022
  • It is important to preempt new technology because technology competition is getting much tougher. Stakeholders continuously conduct exploration activities in order to preoccupy new technology at the right time, and Gartner's Hype Cycle has significant implications for them. The Hype Cycle is an expectation graph for new technologies that combines the technology life cycle (S-curve) with the hype level. Stakeholders such as R&D investors, CTOs (Chief Technology Officers), and technical personnel are very interested in Gartner's Hype Cycle for new technologies, because high expectations for a new technology can create opportunities to maintain investment by securing the legitimacy of R&D spending. However, contrary to the high interest of industry, preceding research has faced limitations in its empirical methods and source data (news, academic papers, search traffic, patents, etc.). In this study, we focused on two research questions. The first was: 'Is there a difference in the characteristics of the network structure at each stage of the hype cycle?' To answer it, the structural characteristics of each stage were examined through component cohesion size. The second was: 'Is there a pattern of diffusion at each stage of the hype cycle?' This question was addressed through the centralization index and network density. The centralization index is a variance-like concept: a higher centralization index means that a small number of nodes are central in the network. Concentration in a small number of nodes implies a star network structure. Among network structures, the star network is centralized and shows better diffusion performance than a decentralized (circle) network, because the nodes at the center of information transfer can judge which information is useful and deliver it to other nodes fastest. We therefore examined the out-degree and in-degree centralization indices for each stage. For this purpose, we analyzed the structural features of the communities and the expectation diffusion patterns using social network service (SNS) data for 'Gartner Hype Cycle for Artificial Intelligence, 2021'. Twitter data for 30 of the technologies listed there (excluding four technologies) were analyzed. The analysis was performed using the R program (ver. 4.1.1) and Cyram NetMiner. From October 31, 2021 to November 9, 2021, 6,766 tweets were collected through the Twitter API, and the relationships between tweeting users (source) and retweeting users (target) were converted into edges; as a result, 4,124 edge-list entries were analyzed. We confirmed the structural features and diffusion patterns by analyzing component cohesion size, degree centralization, and density. Through this study, we confirmed that the number of components in each stage's group increased as time passed while density decreased. Also, 'Innovation Trigger', a group interested in new technologies as early adopters in innovation diffusion theory, had a high out-degree centralization index, while the other groups had higher in-degree than out-degree centralization indices. It can be inferred that the 'Innovation Trigger' group has the biggest influence and that diffusion gradually slows down in the subsequent groups. In this study, network analysis was conducted using social network service data, unlike the methods of previous research.
This is significant in that it provides an idea for expanding the methods of analysis in future studies of Gartner's hype cycle. In addition, applying innovation diffusion theory to the stages of Gartner's hype cycle for artificial intelligence can be evaluated positively, because the hype cycle's theoretical weakness has been repeatedly discussed. It is also expected that this study will offer stakeholders a new perspective on decision making for technology investment.
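
The paper's diffusion measures are standard social network statistics. Below is a minimal Python sketch (using networkx, rather than the R/NetMiner tools the authors used) of Freeman-style degree centralization and density on a retweet edge list; the edges are invented for illustration.

```python
import networkx as nx

def degree_centralization(G, mode="out"):
    """Freeman-style degree centralization for a directed graph.
    Higher values mean a few nodes dominate (star-like), which the
    paper links to faster diffusion."""
    n = G.number_of_nodes()
    view = G.out_degree() if mode == "out" else G.in_degree()
    degrees = [d for _, d in view]
    max_deg = max(degrees)
    # (n-1)^2 is the maximum possible sum of differences (a perfect star).
    return sum(max_deg - d for d in degrees) / ((n - 1) ** 2)

# Edges run from the tweeting user (source) to the retweeting user (target).
edges = [("alice", "bob"), ("alice", "carol"), ("bob", "carol")]  # toy data
G = nx.DiGraph(edges)

print("out-degree centralization:", degree_centralization(G, "out"))
print("in-degree centralization:", degree_centralization(G, "in"))
print("density:", nx.density(G))  # observed edges / possible edges
```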

An Exploratory Study on the Components of Visual Merchandising of Internet Shopping Mall (인터넷쇼핑몰의 VMD 구성요인에 대한 탐색적 연구)

  • Kim, Kwang-Seok;Shin, Jong-Kuk;Koo, Dong-Mo
    • Journal of Global Scholars of Marketing Science, v.18 no.2, pp.19-45, 2008
  • This study empirically examines the primary dimensions of visual merchandising (VMD) of an internet shopping mall, namely store design, merchandise, and merchandising cues, that make a virtual store attractive to shoppers. The authors reviewed the literature on the major components of VMD from the perspective of the AIDA model, which has mainly been applied to offline store settings. The major purposes of the study are as follows: first, to derive the variables related to the components of visual merchandising from the existing literature, establish hypotheses, and test them empirically; second, to examine the relationships between the components of VMD and the attitude toward the VMD, while putting more emphasis on finding the component structure of the VMD. VMD needs to be examined from the perspective that an online shopping mall is a virtual self-service or clerkless store, which can reduce the number of employees and help shoppers search, evaluate, and purchase by themselves, and it should be explored in terms of customers' in-store persuasion processes. This study reviewed the literature on store design, merchandise, and merchandising cues, which correspond to the store, product, and promotion respectively. VMD is a total communication tool, and the AIDA model can explain in-store consumer behavior in online shopping. Store design has to do with triggering consumer attention to the online mall, merchandise with product-related interest, and merchandising cues with promotions such as recommendations and links that induce the desire to purchase. These three steps can be seen as the processes leading to purchase actions. The theoretical rationale for the relationship between VMD and AIDA can be found in Tyagi (2005), where the three steps of consumer-oriented merchandising are the store, the product assortment, and placement; in Omar (1999), where the three types of interior display are architectural design display, commodity display, and point-of-sale (POS) display; and in Davies and Ward (2005), where the retail store interior image is related to atmosphere, merchandise, and in-store promotion. Lee et al. (2000) suggested as web merchandising components the merchandising cues, a shopping metaphor that assists search, store design, layout (web design), and product assortment. Store design, which includes differentiation, simplicity, and navigation, is supposed to be related to attention to the virtual store. Second, the merchandise dimensions, comprising product assortment, visual information, and product reputation, have to do with interest in the product offerings. Finally, the merchandising cues, which refer to the merchandiser (MD)'s recommendation of products and the provision of hyperlinks to relevant goods, are concerned with attempts to induce the desire to purchase. A questionnaire survey was carried out on consumers who shop at internet shopping malls frequently. To select the subject malls, mall ranking data announced by a mall rating agency were used to identify the five most popular and five least popular malls. Subjects were instructed to answer the questions after navigating the designated mall for five minutes. Three hundred questionnaires were distributed to consumers, and 166 samples were used in the final analysis.
The empirical testing focused on identifying and confirming the dimensionality of VMD and its subdimensions using structural equation modeling. The confirmatory factor analysis for the endogenous and exogenous variables was carried out in four parts: second-order factor analyses for store design, merchandise, and merchandising cues, and a first-order confirmatory factor analysis for the attitude toward the VMD. The model test results show that the chi-square value of the structural equation is 144.39 (d.f. 49), significant at the 0.01 level, which means the proposed model was rejected; but the ratio of the chi-square value to degrees of freedom was 2.94, smaller than the acceptable level of 3.0. RMR is 0.087, which is higher than the generally acceptable level of 0.08. GFI and AGFI turned out to be 0.90 and 0.84 respectively; both NFI and NNFI are 0.94, and CFI is 0.95. The major test results are as follows. First, the second-order factor analysis and structural equation modeling reveal that differentiation, simplicity, and ease of identifying the current status of the transaction are confirmed to be subdimensions of store design and significant predictors of the dependent variable. This implies that when designing an online shopping mall, it is necessary to differentiate it visually from other malls to improve the communication effectiveness of store design: differentiated store design raises the contrast stimulus to the sensory organs, promotes memory of the store, and fosters a favorable attitude toward the store's VMD. The finding that navigation, i.e., the ease of identifying the current status of shopping, affects the attitude toward VMD can be interpreted as follows: navigating via hyperlinks, which is characteristic of internet shopping, is a complex cognitive process, and shoppers are likely to lack a sense of the overall structure of the store; consequently, they are likely to get lost amid shopping, not knowing where to go. An orientation tool enhances the accessibility of information and raises perception of the store environment (Titus & Everett 1995). Second, the primary dimension of merchandise and its subdimensions were each confirmed to be unidimensional, with construct validity and nomological validity, in that the VMD dimensions are supposed to have a positive correlation with the dependent variable. The subdimensions of product assortment, brand fame, and information provision proved to have a positive effect on the attitude toward the VMD. This can be interpreted to mean that the more plentiful the product and brand assortment of the mall, the more likely shoppers are to favor it. Brand fame and information provision affect the VMD attitude as well: the more famous the brand, the more likely shoppers are to trust and feel familiar with the mall, and plentifully and visually presented information can lead the shopper to a favorable attitude toward the store's VMD. Third, it turned out that the merchandising cues of product recommendation and hyperlinks affect the VMD attitude. Recommended products can reduce the uncertainty of the purchase decision, and hyperlinks to relevant products help the shopper save the cognitive effort exerted in information search and gathering, which can lead to a favorable attitude toward the VMD.
This study tried to shed new light on the VMD of online stores by reviewing the variables mentioned as relevant to offline VMD in the existing literature, and tried to link the VMD components from the perspective of the AIDA model. The effect size of the VMD dimensions on the attitude was in the order of merchandise, store design, and merchandising cues. It is said that the internet offers unlimited display space, but the virtual store is not unlimited, since consumers have a limited amount of cognitive ability to process external information and limited internal memory. In particular, shoppers are likely to face difficulties in decision making on account of too many alternatives and information overload. Therefore, the internet shopping mall manager should take into consideration the consumer's cost of information search in order to establish optimal product placements and search routes. An efficient store composition becomes possible by reducing the psychological burden and cognitive effort exerted in information search and alternative evaluation. The store image is for the most part determined by the product categories and brands it deals in. The results of this study support the proposition that merchandise matters more to the VMD attitude than the other components, so the manager is required to take a strategic approach to VMD. Internet users are becoming more accustomed to and knowledgeable about the internet medium, and more likely to accept the internet as a shopping channel, as the time they have used the internet to shop grows longer. The web merchandiser should be aware that product introduction using moving pictures and bulletin boards becomes more important in order to present interactive product information visually and to communicate with customers more actively, thereby making the quantity and quality of product information richer.
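
As a reading aid for the fit statistics above, here is a tiny Python sketch that checks a set of reported SEM fit indices against rule-of-thumb thresholds. The chi-square/df and RMR cutoffs come from the abstract; the remaining cutoffs are conventional assumptions, not values stated by the authors.

```python
def assess_sem_fit(chi2, df, rmr, gfi, agfi, nfi, nnfi, cfi):
    """Compare SEM fit indices with rule-of-thumb thresholds.
    Only chi2/df < 3.0 and RMR <= 0.08 are cited in the abstract;
    the other cutoffs are common conventions (assumed here)."""
    return {
        "chi2/df < 3.0": chi2 / df < 3.0,
        "RMR <= 0.08": rmr <= 0.08,
        "GFI >= 0.90": gfi >= 0.90,
        "AGFI >= 0.80": agfi >= 0.80,
        "NFI >= 0.90": nfi >= 0.90,
        "NNFI >= 0.90": nnfi >= 0.90,
        "CFI >= 0.90": cfi >= 0.90,
    }

# Values reported in the abstract: chi2/df = 144.39/49 = 2.94 passes,
# while RMR = 0.087 narrowly misses the 0.08 guideline.
print(assess_sem_fit(chi2=144.39, df=49, rmr=0.087, gfi=0.90,
                     agfi=0.84, nfi=0.94, nnfi=0.94, cfi=0.95))
```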

A Study on Interactions of Competitive Promotions Between the New and Used Cars (신차와 중고차간 프로모션의 상호작용에 대한 연구)

  • Chang, Kwangpil
    • Asia Marketing Journal, v.14 no.1, pp.83-98, 2012
  • In a market where new and used cars compete with each other, we run the risk of obtaining biased estimates of the cross elasticity between them if we focus only on new cars or only on used cars. Unfortunately, most previous studies of the automobile industry have focused only on new car models, without taking into account the effect of used cars' pricing policy on new cars' market shares and vice versa, resulting in inadequate prediction of reactive pricing in response to competitors' rebates or price discounts. There are some exceptions, however. Purohit (1992) and Sullivan (1990) looked at both new and used car markets at the same time to examine the effect of new car model launches on used car prices, but their studies are limited in that they employed the average used car prices reported in the NADA Used Car Guide instead of actual transaction prices; some of their conflicting results may be due to this problem in the data. Park (1998) recognized this problem and used actual prices in his study. His work is notable in that he investigated the qualitative effect of new car model launches on the pricing policy of used cars in terms of reinforcement of brand equity. The current work also uses actual prices, as in Park (1998), but explores the quantitative aspect of competitive price promotion between new and used cars of the same model. In this study, I develop a model that assumes the cross elasticity between new and used cars of the same model is higher than that among new and used cars of different models. Specifically, I apply a nested logit model that assumes car model choice at the first stage and the choice between new and used cars at the second stage. This proposed model is compared to the IIA (Independence of Irrelevant Alternatives) model, which assumes there is no decision hierarchy and that new and used cars of different models are all substitutable at the first stage. The data for this study are drawn from the Power Information Network (PIN), an affiliate of J.D. Power and Associates. PIN collects sales transaction data from a sample of dealerships in the major metropolitan areas of the U.S. These are retail transactions, i.e., sales or leases to final consumers, excluding fleet sales and including both new car and used car sales. Each observation in the PIN database contains the transaction date, the manufacturer, model year, make, model, trim and other car information, the transaction price, consumer rebates, the interest rate, term, amount financed (when the vehicle is financed or leased), etc. I used data for compact cars sold during the period January 2009 to June 2009. The new and used cars of the top nine selling models are included in the study: Mazda 3, Honda Civic, Chevrolet Cobalt, Toyota Corolla, Hyundai Elantra, Ford Focus, Volkswagen Jetta, Nissan Sentra, and Kia Spectra. These models accounted for 87% of category unit sales. Empirical application of the nested logit model showed that the proposed model outperformed the IIA model in both the calibration and holdout samples. The other comparison model, which assumes the choice between new and used cars at the first stage and car model choice at the second stage, turned out to be mis-specified, since the dissimilarity parameter (i.e., inclusive or category value parameter) was estimated to be greater than 1.
Post hoc analysis based on the estimated parameters was conducted employing a modified Lanczos iterative method. This method is intuitively appealing. For example, suppose a new car offers a certain amount of rebate and gains market share at first. In response to this rebate, the used car of the same model keeps decreasing its price until it regains the lost market share and restores the status quo; the new car settles down to a lowered market share due to the used car's reaction. The method enables us to find the amount of price discount needed to maintain the status quo and the equilibrium market shares of the new and used cars. In the first simulation, I used the Jetta as a focal brand to see how its new and used cars set prices, rebates, or APR interactively, assuming that reacting cars respond to price promotion so as to maintain the status quo. The simulation results showed that the IIA model underestimates cross elasticities, and therefore suggests a less aggressive used car price discount in response to a new car's rebate than the proposed nested logit model does. In the second simulation, I used the Elantra to reconfirm the result for the Jetta and came to the same conclusion. In the third simulation, I had the Corolla offer a $1,000 rebate to see what the best response of the Elantra's new and used cars would be. Interestingly, the Elantra's used car could maintain the status quo by offering a smaller price discount ($160) than the new car ($205). In future research, we might want to explore the plausibility of alternative nested logit models. For example, the NUB model, which assumes the choice between new and used cars at the first stage and brand choice at the second stage, could be a possibility, even though it was rejected in the current study because of mis-specification (a dissimilarity parameter turned out to be higher than 1). The NUB model may have been rejected due to true mis-specification or due to the data structure transmitted from a typical car dealership, where both new and used cars of the same model are displayed. Because of this, the BNU model, which assumes brand choice at the first stage and the choice between new and used cars at the second stage, may have been favored in the current study, since customers first choose a dealership (brand) and then choose between new and used cars in this market environment. However, if there are dealerships that carry both new and used cars of various models, the NUB model might fit the data as well as the BNU model. Which model better describes the data is an empirical question. In addition, it would be interesting to test a probabilistic mixture model of the BNU and NUB on a new data set.
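
To illustrate the two-stage structure the paper favors (BNU: brand/model first, then new vs. used), here is a minimal Python sketch of nested logit choice probabilities. The utility values and dissimilarity parameter are toy numbers, not estimates from the paper.

```python
import math

def nested_logit_probs(utilities, lam):
    """Two-level nested logit: car model (nest) at the first stage,
    new vs. used at the second. `utilities[b]` maps each model to the
    utilities of its new and used alternatives; `lam` is the
    dissimilarity parameter, which must lie in (0, 1] for consistency.
    (The paper rejected a specification where it exceeded 1.)"""
    inclusive = {b: lam * math.log(sum(math.exp(v / lam) for v in alts.values()))
                 for b, alts in utilities.items()}
    denom = sum(math.exp(iv) for iv in inclusive.values())
    probs = {}
    for b, alts in utilities.items():
        p_nest = math.exp(inclusive[b]) / denom
        within = sum(math.exp(v / lam) for v in alts.values())
        for alt, v in alts.items():
            probs[(b, alt)] = p_nest * math.exp(v / lam) / within
    return probs

# Toy utilities: a rebate would enter through the new-car utility.
utilities = {"Jetta": {"new": 1.2, "used": 0.8},
             "Elantra": {"new": 1.0, "used": 0.9}}
shares = nested_logit_probs(utilities, lam=0.6)  # shares sum to 1
```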

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae;Lee, Bomi;Kim, Jong Woo
    • Journal of Intelligence and Information Systems, v.23 no.1, pp.95-108, 2017
  • Recently AlphaGo, Google DeepMind's artificial intelligence program for Baduk (Go), won a landmark victory against Lee Sedol. Many people thought machines would not be able to beat a human at Go because, unlike chess, the number of possible game paths exceeds the number of atoms in the universe, but the result was the opposite of what people predicted. After the match, artificial intelligence was spotlighted as a core technology of the fourth industrial revolution and attracted attention from various application domains. In particular, deep learning has drawn attention as the core artificial intelligence technique used in the AlphaGo algorithm. Deep learning is already being applied to many problems and shows especially good performance in image recognition. It also performs well on high-dimensional data such as voice, images, and natural language, where it was difficult to obtain good performance with existing machine learning techniques. In contrast, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we tried to find out whether the deep learning techniques studied so far can be used not only for recognition of high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction. We compare the performance of deep learning techniques with that of traditional artificial neural network models. The experimental data in the paper are the telemarketing response data of a bank in Portugal. They include input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, and a binary target variable recording whether the customer intends to open an account. To evaluate the applicability of deep learning algorithms and techniques to binary classification, we compared the performance of various models using the CNN and LSTM algorithms and the dropout technique, which are widely used in deep learning, with that of MLP models, a traditional artificial neural network. However, since not all network design alternatives can be tested, given the nature of artificial neural networks, the experiment was conducted with restricted settings on the number of hidden layers, the number of neurons per hidden layer, the number of output filters, and the application conditions of the dropout technique. The F1 score was used to evaluate model performance, showing how well the models classify the class of interest rather than overall accuracy. The detailed methods for applying each deep learning technique in the experiment are as follows. The CNN algorithm reads adjacent values around a specific value and recognizes local features; however, the distance between business data fields usually carries no meaning because each field is independent. In this experiment, we therefore set the filter size of the CNN to the number of fields so that the whole pattern of the data is learned at once, and added a hidden layer to make decisions based on the extracted features. For the model with two LSTM layers, the input direction of the second layer is reversed relative to the first layer in order to reduce the influence of each field's position.
For the dropout technique, we set neurons to drop out with a probability of 0.5 in each hidden layer. The experimental results show that the model with the highest F1 score was the CNN model using dropout, and the next best model was the MLP model with two hidden layers using dropout. From the experiments we obtained several findings. First, models using dropout make slightly more conservative predictions than those without it and generally show better classification performance. Second, CNN models show better classification performance than MLP models. This is interesting because the CNN performed well on a binary classification problem, to which it has rarely been applied, as well as in the fields where its effectiveness has already been proven. Third, the LSTM algorithm seems unsuitable for binary classification problems because the training time is too long relative to the performance improvement. From these results, we can confirm that some deep learning algorithms can be applied to solve business binary classification problems.
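
A minimal Keras sketch of the two best-performing configurations described above: a CNN whose filter spans all fields at once, and a two-hidden-layer MLP, each with 0.5 dropout. The field count, layer widths, and filter count are assumptions for illustration; the paper does not report exact architecture sizes in the abstract.

```python
import tensorflow as tf
from tensorflow.keras import layers

n_fields = 16  # illustrative; the bank telemarketing data has its own count

# CNN variant from the abstract: the kernel spans all fields at once,
# followed by a hidden layer, with dropout at probability 0.5.
cnn = tf.keras.Sequential([
    layers.Conv1D(32, kernel_size=n_fields, activation="relu",
                  input_shape=(n_fields, 1)),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
cnn.compile(optimizer="adam", loss="binary_crossentropy")

# Runner-up: MLP with two hidden layers and dropout after each.
mlp = tf.keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(n_fields,)),
    layers.Dropout(0.5),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
mlp.compile(optimizer="adam", loss="binary_crossentropy")

# Evaluation would use F1 rather than accuracy, e.g.
# sklearn.metrics.f1_score(y_true, model.predict(X) > 0.5).
```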

The detection of collapsible airways contributing to airflow limitation (기류 제한에 영향을 미치는 허탈성 기도의 분석)

  • Kim, Yun Seong;Park, Byung Gyu;Lee, Kyong In;Son, Seok Man;Lee, Hyo Jin;Lee, Min Ki;Son, Choon Hee;Park, Soon Kew
    • Tuberculosis and Respiratory Diseases, v.43 no.4, pp.558-570, 1996
  • Background: The detection of collapsible airways has important therapeutic implications in chronic airway disease and bronchial asthma. Distinguishing purely collapsible airways disease from asthma is important because treatment of the former may include pursed-lip breathing or nasal positive pressure ventilation, whereas in the latter pharmacologic approaches are used. One form of irreversible airflow limitation is collapsible airways, which has been shown to be a component of asthma or emphysema; it can be assessed by the difference between the volume exiting the lung, as determined by a spirometer, and the volume compressed, as measured by plethysmography. Method: To investigate whether the volume difference between slow and forced vital capacity (SVC-FVC) measured by spirometry may be used as a surrogate index of airway collapse, we prospectively examined pulmonary function parameters before and after bronchodilator inhalation by spirometry and body plethysmography in 20 patients with evidence of airflow limitation (chronic obstructive pulmonary disease 12 cases, stable bronchial asthma 7 cases, combined chronic obstructive pulmonary disease with asthma 1 case) and 20 normal subjects without evidence of airflow limitation, referred to the Pusan National University Hospital pulmonary function laboratory from January 1995 to July 1995. Results: 1) The mean and standard deviation of age, height, and weight were 58.3±7.24 yr, 166±8.0 cm, and 59.0±9.9 kg in the patients with airflow limitation and 56.3±12.47 yr, 165.9±6.9 cm, and 64.4±10.4 kg in the normal subjects, respectively. The physical characteristics of the two groups did not differ significantly, and the male-to-female ratio was 14:6 in both groups. 2) The difference between slow vital capacity and forced vital capacity was 395±317 ml in the patient group and 154±176 ml in the normal group, a statistically significant difference (p<0.05). Sensitivity and specificity were highest at a cut-off value of 208 ml. 3) After bronchodilator inhalation, reversible airway obstruction was shown by spirometry or body plethysmography in 16 cases of the patient group and 7 cases of the control group (p<0.05), and the SVC-FVC differences in the bronchodilator-response and non-response groups were 300.4±306 ml and 144.7±180 ml, respectively, a statistically significant difference. 4) The SVC-FVC difference before bronchodilator inhalation was correlated with pre-bronchodilator airway resistance (r=0.307, p=0.05); the SVC-FVC difference after bronchodilator was correlated with the pre-bronchodilator SVC-FVC difference (r=0.559, p=0.0002) and thoracic gas volume (r=0.488, p=0.002), and with post-bronchodilator airway resistance (r=0.583, p=0.0001) and thoracic gas volume (r=0.375, p=0.0170). 5) The SVC-FVC difference in smokers and nonsmokers was 257.5±303 ml and 277.5±276 ml, respectively, a difference that did not reach statistical significance (p>0.05). Conclusion: The difference between slow vital capacity and forced vital capacity measured by spirometry may be useful for detecting collapsible airways and may help in planning therapy.
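
The decision rule reported above reduces to a simple threshold on the SVC-FVC difference. A minimal illustrative Python sketch follows (not a clinical tool; the example volumes are invented):

```python
def collapsible_airway_flag(svc_ml, fvc_ml, cutoff_ml=208):
    """Flag a possibly collapsible airway when slow vital capacity
    exceeds forced vital capacity by more than the cutoff. The 208 ml
    cutoff is where the paper reports the best sensitivity/specificity
    trade-off."""
    return (svc_ml - fvc_ml) > cutoff_ml

# Example: SVC 3,900 ml vs. FVC 3,500 ml -> difference 400 ml -> flagged.
print(collapsible_airway_flag(3900, 3500))  # True
```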

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems, v.26 no.1, pp.1-21, 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, they have been employed to discover new market and technology opportunities and to support the rational decision making of business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for market information at the specific product level. However, such information has generally been provided at the industry level or in broad categories based on classification standards, making it difficult to obtain specific and appropriate information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than previously offered. We applied the Word2Vec algorithm, a neural network based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows. First, the data related to product information are collected, refined, and restructured into a form suitable for the Word2Vec model. Next, the preprocessed data are embedded into a vector space by Word2Vec, and product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales of the extracted products are summed to estimate the market size of each product group. As experimental data, product-name text from Statistics Korea's microdata (345,103 cases) was mapped into a multidimensional vector space by Word2Vec training. We performed parameter optimization and then applied a vector dimension of 300 and a window size of 15 in further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. Product names similar to the KSIC index words were extracted based on cosine similarity, and the market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items; the Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied to market size estimation for the first time, overcoming the limitations of traditional methods based on sampling or requiring multiple assumptions. In addition, the level of market category can be easily and efficiently adjusted, according to the purpose of the information use, by changing the cosine similarity threshold. Furthermore, the method has high potential for practical application since it can resolve unmet needs for detailed market size information in the public and private sectors.
Specifically, it can be utilized in the technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis reports published by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantics-based word embedding module could be advanced by imposing a proper ordering on the preprocessed dataset or by combining another measure, such as Jaccard similarity, with Word2Vec. Also, the product group clustering could be replaced with other types of unsupervised machine learning algorithms. Our group is currently working on subsequent studies, and we expect them to further improve the performance of the basic model conceptually proposed in this study.
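
A minimal gensim sketch of the pipeline described above: train Word2Vec with the parameters the paper reports (vector_size=300, window=15), group products near a KSIC index word by cosine similarity, and sum their sales. The corpus, sales figures, and the 0.5 threshold are toy assumptions.

```python
from gensim.models import Word2Vec

# Tokenized product-name corpus; the real input would be the 345,103
# product names from Statistics Korea microdata. Records are invented.
corpus = [["stainless", "kitchen", "sink"],
          ["kitchen", "sink", "faucet"],
          ["steel", "pipe"]]
sales = {"sink": 120.0, "faucet": 45.0, "pipe": 300.0}  # per-product sales

# Parameters from the paper's optimization: vector_size=300, window=15.
model = Word2Vec(corpus, vector_size=300, window=15, min_count=1)

def market_size(index_word, threshold=0.5):
    """Sum the sales of products whose names are close (by cosine
    similarity) to a KSIC index word; the threshold controls how
    broad the product group is."""
    neighbors = model.wv.most_similar(index_word, topn=50)
    group = [index_word] + [w for w, sim in neighbors if sim >= threshold]
    return sum(sales.get(w, 0.0) for w in group)

print(market_size("sink"))
```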

Herbicidal Phytotoxicity under Adverse Environments and Countermeasures (불량환경하(不良環境下)에서의 제초제(除草劑) 약해(藥害)와 경감기술(輕減技術))

  • Kwon, Y.W.;Hwang, H.S.;Kang, B.H.
    • Korean Journal of Weed Science, v.13 no.4, pp.210-233, 1993
  • The herbicide has become as indispensable as nitrogen fertilizer in Korean agriculture from 1970 onwards. It is estimated that in 1991 more than 40 herbicides were registered for the rice crop and treated to an area 1.41 times the rice acreage; more than 30 herbicides were registered for field crops and treated to 89% of the crop area; and the treatment acreage of 3 non-selective foliar-applied herbicides reached 2,555 thousand hectares. During the last 25 years herbicides have benefited Korean farmers substantially in labor, cost, and time of farming. Any herbicide that causes crop injury in ordinary use is not allowed to be registered in most countries. Herbicides, however, can cause crop injury when they are misused, abused, or used under adverse environments. Herbicide use exceeding 100% of crop acreage implies an increased probability that herbicides are used wrongly or under adverse conditions. This is borne out by the authors' nationwide surveys in 1992 and 1993, in which about 25% of farmers had experienced herbicide-caused crop injury more than once during the previous 10 years; one-half of the injury incidences involved crop yield losses greater than 10%. Crop injury caused by herbicides did not occur to a serious extent in the 1960s, when fewer than 5 herbicides were used by farmers on less than 12% of the total acreage. Farmers ascribed about 53% of the herbicidal injury incidences in their fields to their own misuse, such as overdose, careless or improper application, off-time application, or wrong choice of herbicide, while 47% of the incidences were mainly due to adverse natural conditions. Such misuse can be reduced to a minimum through enhanced education and extension services for correct use and, although undesirable, through farmers' growing experience with phytotoxicity. The most difficult primary problem arises from the lack of countermeasures enabling farmers to cope with various adverse environmental conditions. At present almost all herbicides carry "Do not use!" instructions on the label to avoid crop injury under adverse environments. These "Do not use!" situations include sandy, highly percolating, or infertile soils; paddies fed by gushing cool water; poorly draining paddies; terraced paddies; too wet or too dry soils; days of abnormally cool or high air temperature, etc. Meanwhile, the cultivated lands are in poor condition: the average organic matter content ranges from 2.5 to 2.8% in paddy soil and from 2.0 to 2.6% in upland soil; the cation exchange capacity ranges from 8 to 12 meq; approximately 43% of paddies and 56% of upland are of sandy to sandy-gravel soil; and only 42% of paddy and 16% of upland fields are on flat land. This situation means that about 40 to 50% of soil-applied herbicides are used on fields where the label instructs "Do not use!". Yet no positive effort has been made over the past 25 years by the government or by companies to develop countermeasures. It is a genuinely complicated social problem. In the 1960s and 1970s a subsidy program to incorporate hillside red clayish soil into sandy paddies, as well as a campaign for increased application of compost to the fields, was in operation; yet the majority of the sandy soils remain sandy, and the program and campaign have been stopped. With regard to this sandy soil problem, the authors have developed a method of "split application of a herbicide onto sandy soil fields". A model case study has been carried out with success and is introduced with its key procedure in this paper. Climate is variable by nature.
Among the climatic components, a sudden fall or rise in temperature is hardly avoidable for a crop plant. Spring air temperature in Korea fluctuates widely; for example, the daily mean air temperature of Inchon city varied from 6.31 to 16.81°C on April 20, an early seeding time for crops, within the ±2 SD range of 30-year records. Seeding early in the season means an increased liability to phytotoxicity, and this is more evident in the direct water-seeding of rice. About 20% of farmers depend on cold underground water pumped for rice irrigation. If the well is deeper than 70 m, the fresh water may be as cold as about 10°C; it should be warmed to about 20°C before irrigation, but farmers do not practice this well. In addition to the aforementioned adverse conditions, many other aspects need amending. Among them, the worst for liquid-spray herbicides is the almost total lack of proper knowledge of nozzle types, and of concern for even spraying, among administrators, rural extension officers, companies, and farmers. Nozzles and sprayers appropriate for herbicide spraying are not even available in the market. Most people perceive all pesticide sprayers as the same and care more about the speed and ease of spraying than about spraying correctly. There are many points to be improved to minimize herbicidal phytotoxicity in Korea, and many ways to achieve this goal. First of all it is suggested that: 1) the present evaluation of a new herbicide at standard and double doses in registration trials be extended to standard, double, and triple doses, so that the response slope can be exploited when deciding on approval and on recommending different doses for different situations on the label; 2) the government recognize the facts and nature of the present problem, correct the present misperceptions, and develop an appropriate national program for improving soil conditions, spray equipment, and extension manpower and services; 3) researchers enhance research on countermeasures; and 4) herbicide makers and dealers correct their misperceptions and sales policies, develop databases on the detailed use conditions of individual consumers, and serve consumers with direct counsel based on those databases.

A New Exploratory Research on Franchisor's Provision of Exclusive Territories (가맹본부의 배타적 영업지역보호에 대한 탐색적 연구)

  • Lim, Young-Kyun;Lee, Su-Dong;Kim, Ju-Young
    • Journal of Distribution Research, v.17 no.1, pp.37-63, 2012
  • In the franchise business, exclusive sales territory (sometimes EST in tables) protection is a very important issue from economic, social, and political points of view. It affects the growth and survival of both franchisor and franchisee and often raises social and political conflicts. When franchisees are not familiar with the related laws and regulations, franchisors have a high chance of exploiting this. Exclusive sales territory protection by the manufacturer and distributors (wholesalers or retailers) means a sales area restriction under which only certain distributors have the right to sell products or services. A distributor who has been granted an exclusive sales territory can protect its own territory but may be prohibited from entering other regions. Even though exclusive sales territory is quite a critical problem in the franchise business, there is not much rigorous research on its reasons, results, evaluation, and future direction based on empirical data. This paper tries to address the problem not only in terms of logical and nomological validity but also through empirical validation. In pursuing an empirical analysis, we take into account the difficulties of real data collection and of statistical analysis techniques: we use a set of disclosure document data collected by the Korea Fair Trade Commission instead of the conventional survey method, which is usually criticized for its measurement error. Existing theories about exclusive sales territory can be summarized into two groups, as shown in the table below. The first concerns the effectiveness of exclusive sales territory from both the franchisor's and the franchisee's points of view. In fact, the outcome of exclusive sales territory can be positive for franchisors but negative for franchisees; it can also be positive in terms of sales but negative in terms of profit. Therefore, variables and viewpoints should be set properly. The second concerns the motives or reasons why exclusive sales territory is protected. The reasons can be classified into four groups: industry characteristics, franchise system characteristics, the capability to maintain exclusive sales territory, and strategic decisions. Within these four groups there are more specific variables and theories, as below. Based on these theories, we develop nine hypotheses, which are briefly shown in the last table below together with the results. To validate the hypotheses, data were collected from the government (FTC) homepage, which is an open source. The sample consists of 1,896 franchisors and contains about three years of operation data, from 2006 to 2008. Within the sample, 627 have an exclusive sales territory protection policy, and those with such a policy are not evenly distributed over the 19 representative industries. Additional data were also collected from other government agency homepages, such as Statistics Korea, and data from various secondary sources were combined to create meaningful variables, as shown in the table below. All variables were dichotomized by mean or median split if they were not inherently dichotomous by definition, since each hypothesis is composed of multiple variables and there is no solid statistical technique that incorporates all these conditions to test the hypotheses. This paper uses a simple chi-square test because the hypotheses and theories are built upon quite specific conditions, such as industry type, economic condition, company history, and various strategic purposes.
It is almost impossible to find samples that satisfy all those conditions, and they cannot be manipulated in experimental settings. More advanced statistical techniques work well on clean data without exogenous variables, but not on real, complex data. The chi-square test is applied by grouping samples into four cells using two criteria: whether they protect exclusive sales territory or not, and whether they satisfy the conditions of each hypothesis. The test then examines whether the proportion of franchisors that satisfy the conditions and protect exclusive sales territory significantly exceeds the proportion that satisfy the conditions but do not protect. In fact, the chi-square test is equivalent to Poisson regression, which allows more flexible application. As a result, only three hypotheses are accepted. When the attitude toward risk is high, so that the royalty fee is determined according to sales performance, EST protection produces poor results, as expected. When the franchisor protects the EST in order to recruit franchisees easily, protection produces better results. Also, when EST protection is intended to improve the efficiency of the franchise system as a whole, it shows better performance: high efficiency is achieved because the EST prohibits free riding by franchisees who would exploit others' marketing efforts, encourages proper investment, and distributes franchisees evenly across regions. The other hypotheses are not supported by the significance tests. Exclusive sales territory should be protected for proper motives and administered for mutual benefit. Legal restrictions driven by government agencies such as the FTC could be misused and cause misunderstandings, so more careful monitoring of real practices and more rigorous studies by both academics and practitioners are needed.
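
A minimal Python sketch of the 2x2 chi-square test the paper describes. The counts below are hypothetical placeholders, not the FTC figures.

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 counts: rows = whether a franchisor satisfies the
# hypothesis condition, columns = whether it protects exclusive sales
# territory. The real table would come from the 1,896 FTC franchisors.
table = [[210, 140],    # condition satisfied: protect / not protect
         [417, 1129]]   # condition not satisfied
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p:.4f}")  # small p -> proportions differ
```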

Recognition and attitude to functional division between physicians and pharmacists of practising physicians and pharmacists in Taegu city (대구시 개원의사와 개국약사의 의약분업에 대한 인식과 태도)

  • Lee, Moo-Sik;Yoon, Nung-Ki;Suh, Suk-Kwon;Park, Jae-Yong
    • Journal of Preventive Medicine and Public Health, v.26 no.1 s.41, pp.1-19, 1993
  • A mail questionnaire was administered from April to May 1992 to 370 practicing physicians and 388 pharmacists in Taegu city, selected by systematic sampling, to examine the utilization of and opinions about the pharmacy-under-medical-insurance program and attitudes toward the functional division between physicians and pharmacists. Regarding the outcome of the drug-store-under-medical-insurance program, 71.2 percent of practicing physicians answered that it was a failure, whereas only 13.4 percent of practicing pharmacists did. Fifty percent of practicing physicians asserted that the functional division between physicians and pharmacists should be introduced, while 66.9 percent of practicing pharmacists answered that the drug-store-under-medical-insurance program itself is successful. The average daily number of medicine preparations was 32.2 cases. Drug-store utilization under medical insurance accounted for 20 percent of the average daily cases of medicine preparation, and utilization with a physician's prescription for 0.7 percent; 58.7 percent of practicing physicians had experience with prescriptions filled outside the institution. Regarding the pros and cons of enforcing the functional division, 59.2 percent of practicing physicians were in favor and 17.7 percent against, while 38 percent of practicing pharmacists were in favor and 45.5 percent against. Pharmacists knew the content of the functional division better than physicians. As a reason in favor of enforcing the functional division, practicing physicians emphasized preventing the misuse or abuse of medicine, while practicing pharmacists emphasized displaying the professional abilities of physicians and pharmacists. As to the style of implementation, among respondents in favor, practicing physicians favored mandatory enforcement (52.3%), while practicing pharmacists favored a partial, incomplete functional division (81.7%). As the method of prescription under an enforced functional division, both practicing physicians and pharmacists mostly preferred generic names (44.0% and 89%, respectively), with brand names (35.3%) the physicians' second choice. Regarding why the functional division has not been implemented to date, both physicians and pharmacists cited the problem of business rights between the two professions, followed by a lack of recognition and interest among the public and a lack of governmental will. Regarding the conditions to be settled before enforcing the functional division, both groups mostly named the uneven distribution of medical facilities and drug stores between rural and urban areas, the inequality of physician and pharmacist manpower, and the problem of manpower demand and supply; in addition, practicing physicians pointed to establishing an attitude of acceptance on the part of pharmacists, while practicing pharmacists favored establishing an attitude of acceptance on the part of physicians, showing the different attitudes of the two groups. The following conclusions were reached: 1. The current drug-store-under-medical-insurance program yields insufficient outcomes, so conversion from this program to a functional division between physicians and pharmacists should be considered. 2.
There are problems of business rights and conflicts between physicians and pharmacists in enforcing the functional division, so the government should formulate a plan to resolve them and maintain a neutral will for the protection of national health.
