• Title/Summary/Keyword: business information

Search Result 14,464, Processing Time 0.04 seconds

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.147-168
    • /
    • 2017
  • There have been many studies on accurate stock market forecasting in academia for a long time, and now there are also various forecasting models using various techniques. Recently, many attempts have been made to predict the stock index using various machine learning methods including Deep Learning. Although the fundamental analysis and the technical analysis method are used for the analysis of the traditional stock investment transaction, the technical analysis method is more useful for the application of the short-term transaction prediction or statistical and mathematical techniques. Most of the studies that have been conducted using these technical indicators have studied the model of predicting stock prices by binary classification - rising or falling - of stock market fluctuations in the future market (usually next trading day). However, it is also true that this binary classification has many unfavorable aspects in predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we try to predict the stock index by expanding the stock index trend (upward trend, boxed, downward trend) to the multiple classification system in the existing binary index method. In order to solve this multi-classification problem, a technique such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA) or Artificial Neural Networks (ANN) we propose an optimization model using Genetic Algorithm as a wrapper for improving the performance of this model using Multi-classification Support Vector Machines (MSVM), which has proved to be superior in prediction performance. In particular, the proposed model named GA-MSVM is designed to maximize model performance by optimizing not only the kernel function parameters of MSVM, but also the optimal selection of input variables (feature selection) as well as instance selection. In order to verify the performance of the proposed model, we applied the proposed method to the real data. The results show that the proposed method is more effective than the conventional multivariate SVM, which has been known to show the best prediction performance up to now, as well as existing artificial intelligence / data mining techniques such as MDA, MLOGIT, CBR, and it is confirmed that the prediction performance is better than this. Especially, it has been confirmed that the 'instance selection' plays a very important role in predicting the stock index trend, and it is confirmed that the improvement effect of the model is more important than other factors. To verify the usefulness of GA-MSVM, we applied it to Korea's real KOSPI200 stock index trend forecast. Our research is primarily aimed at predicting trend segments to capture signal acquisition or short-term trend transition points. The experimental data set includes technical indicators such as the price and volatility index (2004 ~ 2017) and macroeconomic data (interest rate, exchange rate, S&P 500, etc.) of KOSPI200 stock index in Korea. Using a variety of statistical methods including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, was classified into three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). 70% of the total data for each class was used for training and the remaining 30% was used for verifying. To verify the performance of the proposed model, several comparative model experiments such as MDA, MLOGIT, CBR, ANN and MSVM were conducted. MSVM has adopted the One-Against-One (OAO) approach, which is known as the most accurate approach among the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.

The Changing Aspects of North Korea's Terror Crimes and Countermeasures : Focused on Power Conflict of High Ranking Officials after Kim Jong-IL Era (북한 테러범죄의 변화양상에 따른 대응방안 -김정일 정권 이후 고위층 권력 갈등을 중심으로)

  • Byoun, Chan-Ho;Kim, Eun-Jung
    • Korean Security Journal
    • /
    • no.39
    • /
    • pp.185-215
    • /
    • 2014
  • Since North Korea has used terror crime as a means of unification under communism against South Korea, South Korea has been much damaged until now. And the occurrence possibility of terror crime by North Korean authority is now higher than any other time. The North Korean terror crimes of Kim Il Sung era had been committed by the dictator's instruction with the object of securing governing fund. However, looking at the terror crimes committed for decades during Kim Jung Il authority, it is revealed that these terror crimes are expressed as a criminal behavior because of the conflict to accomplish the power and economic advantage non powerful groups target. This study focused on the power conflict in various causes of terror crimes by applying George B. Vold(1958)'s theory which explained power conflict between groups became a factor of crime, and found the aspect by ages of terror crime behavior by North Korean authority and responding plan to future North Korean terror crime. North Korean authority high-ranking officials were the Labor Party focusing on Juche Idea for decades in Kim Il Sung time. Afterwards, high-ranking officials were formed focusing on military authorities following Military First Policy at the beginning of Kim Jung Il authority, rapid power change has been done for recent 10 years. To arrange the aspect by times of terror crime following this power change, alienated party executives following the support of positive military first authority by Kim Jung Il after 1995 could not object to forcible terror crime behavior of military authority, and 1st, 2nd Yeongpyeong maritime war which happened this time was propelled by military first authority to show the power of military authority. After 2006, conservative party union enforced censorship and inspection on the trade business and foreign currency-earning of military authority while executing drastic purge. The shooting on Keumkangsan tourists that happened this time was a forcible terror crime by military authority following the pressure of conservative party. After October, 2008, first military reign union executed the launch of Gwanmyungsung No.2 long-range missile, second nuclear test, Daechung marine war, and Cheonanham attacking terror in order to highlight the importance and role of military authority. After September 2010, new reign union went through severe competition between new military authority and new mainstream and new military authority at this time executed highly professionalized terror crime such as cyber/electronic terror unlike past military authority. After July 2012, ICBM test launch, third nuclear test, cyber terror on Cheongwadae homepage of new mainstream association was the intention of Km Jung Eun to display his ability and check and adjust the power of party/military/cabinet/ public security organ, and he can attempt the unexpected terror crime in the future. North Korean terror crime has continued since 1980s when Kim Jung Il's power succession was carried out, and the power aspect by times has rapidly changed since 1994 when Kim Il Sung died and the terror crime became intense following the power combat between high-ranking officials and power conflict for right robbery. Now South Korea should install the specialized department which synthesizes and analyzes the information on North Korean high-ranking officials and reinforce the comprehensive information-collecting system through the protection and management of North Korean defectors and secret agents in order to determine the cause of North Korean terror crime and respond to it. And South Korea should participate positively in the international collaboration related to North Korean terror and make direct efforts to attract the international agreement to build the international cooperation for the response to North Korean terror crime. Also, we should try more to arrange the realistic countermeasure against North Korean cyber/electronic terror which was more diversified with the expertise terror escaping from existing forcible terror through enactment/revision of law related to cyber terror crime, organizing relevant institute and budget, training professional manpower, and technical development.

  • PDF

Differential Effects of Recovery Efforts on Products Attitudes (제품태도에 대한 회복노력의 차별적 효과)

  • Kim, Cheon-GIl;Choi, Jung-Mi
    • Journal of Global Scholars of Marketing Science
    • /
    • v.18 no.1
    • /
    • pp.33-58
    • /
    • 2008
  • Previous research has presupposed that the evaluation of consumer who received any recovery after experiencing product failure should be better than the evaluation of consumer who did not receive any recovery. The major purposes of this article are to examine impacts of product defect failures rather than service failures, and to explore effects of recovery on postrecovery product attitudes. First, this article deals with the occurrence of severe and unsevere failure and corresponding service recovery toward tangible products rather than intangible services. Contrary to intangible services, purchase and usage are separable for tangible products. This difference makes it clear that executing an recovery strategy toward tangible products is not plausible right after consumers find out product failures. The consumers may think about backgrounds and causes for the unpleasant events during the time gap between product failure and recovery. The deliberation may dilutes positive effects of recovery efforts. The recovery strategies which are provided to consumers experiencing product failures can be classified into three types. A recovery strategy can be implemented to provide consumers with a new product replacing the old defective product, a complimentary product for free, a discount at the time of the failure incident, or a coupon that can be used on the next visit. This strategy is defined as "a rewarding effort." Meanwhile a product failure may arise in exchange for its benefit. Then the product provider can suggest a detail explanation that the defect is hard to escape since it relates highly to the specific advantage to the product. The strategy may be called as "a strengthening effort." Another possible strategy is to recover negative attitude toward own brand by giving prominence to the disadvantages of a competing brand rather than the advantages of its own brand. The strategy is reflected as "a weakening effort." This paper emphasizes that, in order to confirm its effectiveness, a recovery strategy should be compared to being nothing done in response to the product failure. So the three types of recovery efforts is discussed in comparison to the situation involving no recovery effort. The strengthening strategy is to claim high relatedness of the product failure with another advantage, and expects the two-sidedness to ease consumers' complaints. The weakening strategy is to emphasize non-aversiveness of product failure, even if consumers choose another competitive brand. The two strategies can be effective in restoring to the original state, by providing plausible motives to accept the condition of product failure or by informing consumers of non-responsibility in the failure case. However the two may be less effective strategies than the rewarding strategy, since it tries to take care of the rehabilitation needs of consumers. Especially, the relative effect between the strengthening effort and the weakening effort may differ in terms of the severity of the product failure. A consumer who realizes a highly severe failure is likely to attach importance to the property which caused the failure. This implies that the strengthening effort would be less effective under the condition of high product severity. Meanwhile, the failing property is not diagnostic information in the condition of low failure severity. Consumers would not pay attention to non-diagnostic information, and with which they are not likely to change their attitudes. This implies that the strengthening effort would be more effective under the condition of low product severity. A 2 (product failure severity: high or low) X 4 (recovery strategies: rewarding, strengthening, weakening, or doing nothing) between-subjects design was employed. The particular levels of product failure severity and the types of recovery strategies were determined after a series of expert interviews. The dependent variable was product attitude after the recovery effort was provided. Subjects were 284 consumers who had an experience of cosmetics. Subjects were first given a product failure scenario and were asked to rate the comprehensibility of the failure scenario, the probability of raising complaints against the failure, and the subjective severity of the failure. After a recovery scenario was presented, its comprehensibility and overall evaluation were measured. The subjects assigned to the condition of no recovery effort were exposed to a short news article on the cosmetic industry. Next, subjects answered filler questions: 42 items of the need for cognitive closure and 16 items of need-to-evaluate. In the succeeding page a subject's product attitude was measured on an five-item, six-point scale, and a subject's repurchase intention on an three-item, six-point scale. After demographic variables of age and sex were asked, ten items of the subject's objective knowledge was checked. The results showed that the subjects formed more favorable evaluations after receiving rewarding efforts than after receiving either strengthening or weakening efforts. This is consistent with Hoffman, Kelley, and Rotalsky (1995) in that a tangible service recovery could be more effective that intangible efforts. Strengthening and weakening efforts also were effective compared to no recovery effort. So we found that generally any recovery increased products attitudes. The results hint us that a recovery strategy such as strengthening or weakening efforts, although it does not contain a specific reward, may have an effect on consumers experiencing severe unsatisfaction and strong complaint. Meanwhile, strengthening and weakening efforts were not expected to increase product attitudes under the condition of low severity of product failure. We can conclude that only a physical recovery effort may be recognized favorably as a firm's willingness to recover its fault by consumers experiencing low involvements. Results of the present experiment are explained in terms of the attribution theory. This article has a limitation that it utilized fictitious scenarios. Future research deserves to test a realistic effect of recovery for actual consumers. Recovery involves a direct, firsthand experience of ex-users. Recovery does not apply to non-users. The experience of receiving recovery efforts can be relatively more salient and accessible for the ex-users than for non-users. A recovery effort might be more likely to improve product attitude for the ex-users than for non-users. Also the present experiment did not include consumers who did not have an experience of the products and who did not perceive the occurrence of product failure. For the non-users and the ignorant consumers, the recovery efforts might lead to decreased product attitude and purchase intention. This is because the recovery trials may give an opportunity for them to notice the product failure.

  • PDF

Evaluating Reverse Logistics Networks with Centralized Centers : Hybrid Genetic Algorithm Approach (집중형센터를 가진 역물류네트워크 평가 : 혼합형 유전알고리즘 접근법)

  • Yun, YoungSu
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.4
    • /
    • pp.55-79
    • /
    • 2013
  • In this paper, we propose a hybrid genetic algorithm (HGA) approach to effectively solve the reverse logistics network with centralized centers (RLNCC). For the proposed HGA approach, genetic algorithm (GA) is used as a main algorithm. For implementing GA, a new bit-string representation scheme using 0 and 1 values is suggested, which can easily make initial population of GA. As genetic operators, the elitist strategy in enlarged sampling space developed by Gen and Chang (1997), a new two-point crossover operator, and a new random mutation operator are used for selection, crossover and mutation, respectively. For hybrid concept of GA, an iterative hill climbing method (IHCM) developed by Michalewicz (1994) is inserted into HGA search loop. The IHCM is one of local search techniques and precisely explores the space converged by GA search. The RLNCC is composed of collection centers, remanufacturing centers, redistribution centers, and secondary markets in reverse logistics networks. Of the centers and secondary markets, only one collection center, remanufacturing center, redistribution center, and secondary market should be opened in reverse logistics networks. Some assumptions are considered for effectively implementing the RLNCC The RLNCC is represented by a mixed integer programming (MIP) model using indexes, parameters and decision variables. The objective function of the MIP model is to minimize the total cost which is consisted of transportation cost, fixed cost, and handling cost. The transportation cost is obtained by transporting the returned products between each centers and secondary markets. The fixed cost is calculated by opening or closing decision at each center and secondary markets. That is, if there are three collection centers (the opening costs of collection center 1 2, and 3 are 10.5, 12.1, 8.9, respectively), and the collection center 1 is opened and the remainders are all closed, then the fixed cost is 10.5. The handling cost means the cost of treating the products returned from customers at each center and secondary markets which are opened at each RLNCC stage. The RLNCC is solved by the proposed HGA approach. In numerical experiment, the proposed HGA and a conventional competing approach is compared with each other using various measures of performance. For the conventional competing approach, the GA approach by Yun (2013) is used. The GA approach has not any local search technique such as the IHCM proposed the HGA approach. As measures of performance, CPU time, optimal solution, and optimal setting are used. Two types of the RLNCC with different numbers of customers, collection centers, remanufacturing centers, redistribution centers and secondary markets are presented for comparing the performances of the HGA and GA approaches. The MIP models using the two types of the RLNCC are programmed by Visual Basic Version 6.0, and the computer implementing environment is the IBM compatible PC with 3.06Ghz CPU speed and 1GB RAM on Windows XP. The parameters used in the HGA and GA approaches are that the total number of generations is 10,000, population size 20, crossover rate 0.5, mutation rate 0.1, and the search range for the IHCM is 2.0. Total 20 iterations are made for eliminating the randomness of the searches of the HGA and GA approaches. With performance comparisons, network representations by opening/closing decision, and convergence processes using two types of the RLNCCs, the experimental result shows that the HGA has significantly better performance in terms of the optimal solution than the GA, though the GA is slightly quicker than the HGA in terms of the CPU time. Finally, it has been proved that the proposed HGA approach is more efficient than conventional GA approach in two types of the RLNCC since the former has a GA search process as well as a local search process for additional search scheme, while the latter has a GA search process alone. For a future study, much more large-sized RLNCCs will be tested for robustness of our approach.

A Study on Developing a VKOSPI Forecasting Model via GARCH Class Models for Intelligent Volatility Trading Systems (지능형 변동성트레이딩시스템개발을 위한 GARCH 모형을 통한 VKOSPI 예측모형 개발에 관한 연구)

  • Kim, Sun-Woong
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.2
    • /
    • pp.19-32
    • /
    • 2010
  • Volatility plays a central role in both academic and practical applications, especially in pricing financial derivative products and trading volatility strategies. This study presents a novel mechanism based on generalized autoregressive conditional heteroskedasticity (GARCH) models that is able to enhance the performance of intelligent volatility trading systems by predicting Korean stock market volatility more accurately. In particular, we embedded the concept of the volatility asymmetry documented widely in the literature into our model. The newly developed Korean stock market volatility index of KOSPI 200, VKOSPI, is used as a volatility proxy. It is the price of a linear portfolio of the KOSPI 200 index options and measures the effect of the expectations of dealers and option traders on stock market volatility for 30 calendar days. The KOSPI 200 index options market started in 1997 and has become the most actively traded market in the world. Its trading volume is more than 10 million contracts a day and records the highest of all the stock index option markets. Therefore, analyzing the VKOSPI has great importance in understanding volatility inherent in option prices and can afford some trading ideas for futures and option dealers. Use of the VKOSPI as volatility proxy avoids statistical estimation problems associated with other measures of volatility since the VKOSPI is model-free expected volatility of market participants calculated directly from the transacted option prices. This study estimates the symmetric and asymmetric GARCH models for the KOSPI 200 index from January 2003 to December 2006 by the maximum likelihood procedure. Asymmetric GARCH models include GJR-GARCH model of Glosten, Jagannathan and Runke, exponential GARCH model of Nelson and power autoregressive conditional heteroskedasticity (ARCH) of Ding, Granger and Engle. Symmetric GARCH model indicates basic GARCH (1, 1). Tomorrow's forecasted value and change direction of stock market volatility are obtained by recursive GARCH specifications from January 2007 to December 2009 and are compared with the VKOSPI. Empirical results indicate that negative unanticipated returns increase volatility more than positive return shocks of equal magnitude decrease volatility, indicating the existence of volatility asymmetry in the Korean stock market. The point value and change direction of tomorrow VKOSPI are estimated and forecasted by GARCH models. Volatility trading system is developed using the forecasted change direction of the VKOSPI, that is, if tomorrow VKOSPI is expected to rise, a long straddle or strangle position is established. A short straddle or strangle position is taken if VKOSPI is expected to fall tomorrow. Total profit is calculated as the cumulative sum of the VKOSPI percentage change. If forecasted direction is correct, the absolute value of the VKOSPI percentage changes is added to trading profit. It is subtracted from the trading profit if forecasted direction is not correct. For the in-sample period, the power ARCH model best fits in a statistical metric, Mean Squared Prediction Error (MSPE), and the exponential GARCH model shows the highest Mean Correct Prediction (MCP). The power ARCH model best fits also for the out-of-sample period and provides the highest probability for the VKOSPI change direction tomorrow. Generally, the power ARCH model shows the best fit for the VKOSPI. All the GARCH models provide trading profits for volatility trading system and the exponential GARCH model shows the best performance, annual profit of 197.56%, during the in-sample period. The GARCH models present trading profits during the out-of-sample period except for the exponential GARCH model. During the out-of-sample period, the power ARCH model shows the largest annual trading profit of 38%. The volatility clustering and asymmetry found in this research are the reflection of volatility non-linearity. This further suggests that combining the asymmetric GARCH models and artificial neural networks can significantly enhance the performance of the suggested volatility trading system, since artificial neural networks have been shown to effectively model nonlinear relationships.

Efficient Topic Modeling by Mapping Global and Local Topics (전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안)

  • Choi, Hochang;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.69-94
    • /
    • 2017
  • Recently, increase of demand for big data analysis has been driving the vigorous development of related technologies and tools. In addition, development of IT and increased penetration rate of smart devices are producing a large amount of data. According to this phenomenon, data analysis technology is rapidly becoming popular. Also, attempts to acquire insights through data analysis have been continuously increasing. It means that the big data analysis will be more important in various industries for the foreseeable future. Big data analysis is generally performed by a small number of experts and delivered to each demander of analysis. However, increase of interest about big data analysis arouses activation of computer programming education and development of many programs for data analysis. Accordingly, the entry barriers of big data analysis are gradually lowering and data analysis technology being spread out. As the result, big data analysis is expected to be performed by demanders of analysis themselves. Along with this, interest about various unstructured data is continually increasing. Especially, a lot of attention is focused on using text data. Emergence of new platforms and techniques using the web bring about mass production of text data and active attempt to analyze text data. Furthermore, result of text analysis has been utilized in various fields. Text mining is a concept that embraces various theories and techniques for text analysis. Many text mining techniques are utilized in this field for various research purposes, topic modeling is one of the most widely used and studied. Topic modeling is a technique that extracts the major issues from a lot of documents, identifies the documents that correspond to each issue and provides identified documents as a cluster. It is evaluated as a very useful technique in that reflect the semantic elements of the document. Traditional topic modeling is based on the distribution of key terms across the entire document. Thus, it is essential to analyze the entire document at once to identify topic of each document. This condition causes a long time in analysis process when topic modeling is applied to a lot of documents. In addition, it has a scalability problem that is an exponential increase in the processing time with the increase of analysis objects. This problem is particularly noticeable when the documents are distributed across multiple systems or regions. To overcome these problems, divide and conquer approach can be applied to topic modeling. It means dividing a large number of documents into sub-units and deriving topics through repetition of topic modeling to each unit. This method can be used for topic modeling on a large number of documents with limited system resources, and can improve processing speed of topic modeling. It also can significantly reduce analysis time and cost through ability to analyze documents in each location or place without combining analysis object documents. However, despite many advantages, this method has two major problems. First, the relationship between local topics derived from each unit and global topics derived from entire document is unclear. It means that in each document, local topics can be identified, but global topics cannot be identified. Second, a method for measuring the accuracy of the proposed methodology should be established. That is to say, assuming that global topic is ideal answer, the difference in a local topic on a global topic needs to be measured. By those difficulties, the study in this method is not performed sufficiently, compare with other studies dealing with topic modeling. In this paper, we propose a topic modeling approach to solve the above two problems. First of all, we divide the entire document cluster(Global set) into sub-clusters(Local set), and generate the reduced entire document cluster(RGS, Reduced global set) that consist of delegated documents extracted from each local set. We try to solve the first problem by mapping RGS topics and local topics. Along with this, we verify the accuracy of the proposed methodology by detecting documents, whether to be discerned as the same topic at result of global and local set. Using 24,000 news articles, we conduct experiments to evaluate practical applicability of the proposed methodology. In addition, through additional experiment, we confirmed that the proposed methodology can provide similar results to the entire topic modeling. We also proposed a reasonable method for comparing the result of both methods.

Analysis of the Time-dependent Relation between TV Ratings and the Content of Microblogs (TV 시청률과 마이크로블로그 내용어와의 시간대별 관계 분석)

  • Choeh, Joon Yeon;Baek, Haedeuk;Choi, Jinho
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.163-176
    • /
    • 2014
  • Social media is becoming the platform for users to communicate their activities, status, emotions, and experiences to other people. In recent years, microblogs, such as Twitter, have gained in popularity because of its ease of use, speed, and reach. Compared to a conventional web blog, a microblog lowers users' efforts and investment for content generation by recommending shorter posts. There has been a lot research into capturing the social phenomena and analyzing the chatter of microblogs. However, measuring television ratings has been given little attention so far. Currently, the most common method to measure TV ratings uses an electronic metering device installed in a small number of sampled households. Microblogs allow users to post short messages, share daily updates, and conveniently keep in touch. In a similar way, microblog users are interacting with each other while watching television or movies, or visiting a new place. In order to measure TV ratings, some features are significant during certain hours of the day, or days of the week, whereas these same features are meaningless during other time periods. Thus, the importance of features can change during the day, and a model capturing the time sensitive relevance is required to estimate TV ratings. Therefore, modeling time-related characteristics of features should be a key when measuring the TV ratings through microblogs. We show that capturing time-dependency of features in measuring TV ratings is vitally necessary for improving their accuracy. To explore the relationship between the content of microblogs and TV ratings, we collected Twitter data using the Get Search component of the Twitter REST API from January 2013 to October 2013. There are about 300 thousand posts in our data set for the experiment. After excluding data such as adverting or promoted tweets, we selected 149 thousand tweets for analysis. The number of tweets reaches its maximum level on the broadcasting day and increases rapidly around the broadcasting time. This result is stems from the characteristics of the public channel, which broadcasts the program at the predetermined time. From our analysis, we find that count-based features such as the number of tweets or retweets have a low correlation with TV ratings. This result implies that a simple tweet rate does not reflect the satisfaction or response to the TV programs. Content-based features extracted from the content of tweets have a relatively high correlation with TV ratings. Further, some emoticons or newly coined words that are not tagged in the morpheme extraction process have a strong relationship with TV ratings. We find that there is a time-dependency in the correlation of features between the before and after broadcasting time. Since the TV program is broadcast at the predetermined time regularly, users post tweets expressing their expectation for the program or disappointment over not being able to watch the program. The highly correlated features before the broadcast are different from the features after broadcasting. This result explains that the relevance of words with TV programs can change according to the time of the tweets. Among the 336 words that fulfill the minimum requirements for candidate features, 145 words have the highest correlation before the broadcasting time, whereas 68 words reach the highest correlation after broadcasting. Interestingly, some words that express the impossibility of watching the program show a high relevance, despite containing a negative meaning. Understanding the time-dependency of features can be helpful in improving the accuracy of TV ratings measurement. This research contributes a basis to estimate the response to or satisfaction with the broadcasted programs using the time dependency of words in Twitter chatter. More research is needed to refine the methodology for predicting or measuring TV ratings.

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring financial risk of companies and for determining the investment returns of investors. As a result, it has been a popular research topic for researchers to predict companies' credit ratings by applying statistical and machine learning techniques. The statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have been traditionally used in bond rating. However, one major drawback is that it should be based on strict assumptions. Such strict assumptions include linearity, normality, independence among predictor variables and pre-existing functional forms relating the criterion variablesand the predictor variables. Those strict assumptions of traditional statistics have limited their application to the real world. Machine learning techniques also used in bond rating prediction models include decision trees (DT), neural networks (NN), and Support Vector Machine (SVM). Especially, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that can maximize the margin between two categories. SVM is simple enough to be analyzed mathematical, and leads to high performance in practical applications. SVM implements the structuralrisk minimization principle and searches to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum and thus, overfitting is unlikely to occur with SVM. In addition, SVM does not require too many data sample for training since it builds prediction models by only using some representative sample near the boundaries called support vectors. A number of experimental researches have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can be potential causes for degrading SVM's performance. First, SVM is originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification such as One-Against-One, One-Against-All have been proposed, but they do not improve the performance in multi-class classification problem as much as SVM for binary-class classification. Second, approximation algorithms (e.g. decomposition methods, sequential minimal optimization algorithm) could be used for effective multi-class computation to reduce computation time, but it could deteriorate classification performance. Third, the difficulty in multi-class prediction problems is in data imbalance problem that can occur when the number of instances in one class greatly outnumbers the number of instances in the other class. Such data sets often cause a default classifier to be built due to skewed boundary and thus the reduction in the classification accuracy of such a classifier. SVM ensemble learning is one of machine learning methods to cope with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing weight on the misclassified observations through iterations. The observations that are incorrectly predicted by previous classifiers are chosen more often than examples that are correctly predicted. Thus Boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve multiclass prediction problem. Since MGM-Boost introduces the notion of geometric mean into AdaBoost, it can perform learning process considering the geometric mean-based accuracy and errors of multiclass. This study applies MGM-Boost to the real-world bond rating case for Korean companies to examine the feasibility of MGM-Boost. 10-fold cross validations for threetimes with different random seeds are performed in order to ensure that the comparison among three different classifiers does not happen by chance. For each of 10-fold cross validation, the entire data set is first partitioned into tenequal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, cross-validated folds have been tested independently of each algorithm. Through these steps, we have obtained the results for classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows the higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%)in terms of geometric mean-based prediction accuracy. T-test is used to examine whether the performance of each classifiers for 30 folds is significantly different. The results indicate that performance of MGM-Boost is significantly different from AdaBoost and SVM classifiers at 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-classproblems such as bond rating.

A Study of the Reactive Movement Synchronization for Analysis of Group Flow (그룹 몰입도 판단을 위한 움직임 동기화 연구)

  • Ryu, Joon Mo;Park, Seung-Bo;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.79-94
    • /
    • 2013
  • Recently, the high value added business is steadily growing in the culture and art area. To generated high value from a performance, the satisfaction of audience is necessary. The flow in a critical factor for satisfaction, and it should be induced from audience and measures. To evaluate interest and emotion of audience on contents, producers or investors need a kind of index for the measurement of the flow. But it is neither easy to define the flow quantitatively, nor to collect audience's reaction immediately. The previous studies of the group flow were evaluated by the sum of the average value of each person's reaction. The flow or "good feeling" from each audience was extracted from his face, especially, the change of his (or her) expression and body movement. But it was not easy to handle the large amount of real-time data from each sensor signals. And also it was difficult to set experimental devices, in terms of economic and environmental problems. Because, all participants should have their own personal sensor to check their physical signal. Also each camera should be located in front of their head to catch their looks. Therefore we need more simple system to analyze group flow. This study provides the method for measurement of audiences flow with group synchronization at same time and place. To measure the synchronization, we made real-time processing system using the Differential Image and Group Emotion Analysis (GEA) system. Differential Image was obtained from camera and by the previous frame was subtracted from present frame. So the movement variation on audience's reaction was obtained. And then we developed a program, GEX(Group Emotion Analysis), for flow judgment model. After the measurement of the audience's reaction, the synchronization is divided as Dynamic State Synchronization and Static State Synchronization. The Dynamic State Synchronization accompanies audience's active reaction, while the Static State Synchronization means to movement of audience. The Dynamic State Synchronization can be caused by the audience's surprise action such as scary, creepy or reversal scene. And the Static State Synchronization was triggered by impressed or sad scene. Therefore we showed them several short movies containing various scenes mentioned previously. And these kind of scenes made them sad, clap, and creepy, etc. To check the movement of audience, we defined the critical point, ${\alpha}$and ${\beta}$. Dynamic State Synchronization was meaningful when the movement value was over critical point ${\beta}$, while Static State Synchronization was effective under critical point ${\alpha}$. ${\beta}$ is made by audience' clapping movement of 10 teams in stead of using average number of movement. After checking the reactive movement of audience, the percentage(%) ratio was calculated from the division of "people having reaction" by "total people". Total 37 teams were made in "2012 Seoul DMC Culture Open" and they involved the experiments. First, they followed induction to clap by staff. Second, basic scene for neutralize emotion of audience. Third, flow scene was displayed to audience. Forth, the reversal scene was introduced. And then 24 teams of them were provided with amuse and creepy scenes. And the other 10 teams were exposed with the sad scene. There were clapping and laughing action of audience on the amuse scene with shaking their head or hid with closing eyes. And also the sad or touching scene made them silent. If the results were over about 80%, the group could be judged as the synchronization and the flow were achieved. As a result, the audience showed similar reactions about similar stimulation at same time and place. Once we get an additional normalization and experiment, we can obtain find the flow factor through the synchronization on a much bigger group and this should be useful for planning contents.

Robo-Advisor Algorithm with Intelligent View Model (지능형 전망모형을 결합한 로보어드바이저 알고리즘)

  • Kim, Sunwoong
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.39-55
    • /
    • 2019
  • Recently banks and large financial institutions have introduced lots of Robo-Advisor products. Robo-Advisor is a Robot to produce the optimal asset allocation portfolio for investors by using the financial engineering algorithms without any human intervention. Since the first introduction in Wall Street in 2008, the market size has grown to 60 billion dollars and is expected to expand to 2,000 billion dollars by 2020. Since Robo-Advisor algorithms suggest asset allocation output to investors, mathematical or statistical asset allocation strategies are applied. Mean variance optimization model developed by Markowitz is the typical asset allocation model. The model is a simple but quite intuitive portfolio strategy. For example, assets are allocated in order to minimize the risk on the portfolio while maximizing the expected return on the portfolio using optimization techniques. Despite its theoretical background, both academics and practitioners find that the standard mean variance optimization portfolio is very sensitive to the expected returns calculated by past price data. Corner solutions are often found to be allocated only to a few assets. The Black-Litterman Optimization model overcomes these problems by choosing a neutral Capital Asset Pricing Model equilibrium point. Implied equilibrium returns of each asset are derived from equilibrium market portfolio through reverse optimization. The Black-Litterman model uses a Bayesian approach to combine the subjective views on the price forecast of one or more assets with implied equilibrium returns, resulting a new estimates of risk and expected returns. These new estimates can produce optimal portfolio by the well-known Markowitz mean-variance optimization algorithm. If the investor does not have any views on his asset classes, the Black-Litterman optimization model produce the same portfolio as the market portfolio. What if the subjective views are incorrect? A survey on reports of stocks performance recommended by securities analysts show very poor results. Therefore the incorrect views combined with implied equilibrium returns may produce very poor portfolio output to the Black-Litterman model users. This paper suggests an objective investor views model based on Support Vector Machines(SVM), which have showed good performance results in stock price forecasting. SVM is a discriminative classifier defined by a separating hyper plane. The linear, radial basis and polynomial kernel functions are used to learn the hyper planes. Input variables for the SVM are returns, standard deviations, Stochastics %K and price parity degree for each asset class. SVM output returns expected stock price movements and their probabilities, which are used as input variables in the intelligent views model. The stock price movements are categorized by three phases; down, neutral and up. The expected stock returns make P matrix and their probability results are used in Q matrix. Implied equilibrium returns vector is combined with the intelligent views matrix, resulting the Black-Litterman optimal portfolio. For comparisons, Markowitz mean-variance optimization model and risk parity model are used. The value weighted market portfolio and equal weighted market portfolio are used as benchmark indexes. We collect the 8 KOSPI 200 sector indexes from January 2008 to December 2018 including 132 monthly index values. Training period is from 2008 to 2015 and testing period is from 2016 to 2018. Our suggested intelligent view model combined with implied equilibrium returns produced the optimal Black-Litterman portfolio. The out of sample period portfolio showed better performance compared with the well-known Markowitz mean-variance optimization portfolio, risk parity portfolio and market portfolio. The total return from 3 year-period Black-Litterman portfolio records 6.4%, which is the highest value. The maximum draw down is -20.8%, which is also the lowest value. Sharpe Ratio shows the highest value, 0.17. It measures the return to risk ratio. Overall, our suggested view model shows the possibility of replacing subjective analysts's views with objective view model for practitioners to apply the Robo-Advisor asset allocation algorithms in the real trading fields.