• Title/Summary/Keyword: Traditional techniques

Search Result 1,650, Processing Time 0.026 seconds

A Study on the Effect of Using Sentiment Lexicon in Opinion Classification (오피니언 분류의 감성사전 활용효과에 대한 연구)

  • Kim, Seungwoo;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.133-148
    • /
    • 2014
  • Recently, with the advent of various information channels, the number of has continued to grow. The main cause of this phenomenon can be found in the significant increase of unstructured data, as the use of smart devices enables users to create data in the form of text, audio, images, and video. In various types of unstructured data, the user's opinion and a variety of information is clearly expressed in text data such as news, reports, papers, and various articles. Thus, active attempts have been made to create new value by analyzing these texts. The representative techniques used in text analysis are text mining and opinion mining. These share certain important characteristics; for example, they not only use text documents as input data, but also use many natural language processing techniques such as filtering and parsing. Therefore, opinion mining is usually recognized as a sub-concept of text mining, or, in many cases, the two terms are used interchangeably in the literature. Suppose that the purpose of a certain classification analysis is to predict a positive or negative opinion contained in some documents. If we focus on the classification process, the analysis can be regarded as a traditional text mining case. However, if we observe that the target of the analysis is a positive or negative opinion, the analysis can be regarded as a typical example of opinion mining. In other words, two methods (i.e., text mining and opinion mining) are available for opinion classification. Thus, in order to distinguish between the two, a precise definition of each method is needed. In this paper, we found that it is very difficult to distinguish between the two methods clearly with respect to the purpose of analysis and the type of results. We conclude that the most definitive criterion to distinguish text mining from opinion mining is whether an analysis utilizes any kind of sentiment lexicon. We first established two prediction models, one based on opinion mining and the other on text mining. Next, we compared the main processes used by the two prediction models. Finally, we compared their prediction accuracy. We then analyzed 2,000 movie reviews. The results revealed that the prediction model based on opinion mining showed higher average prediction accuracy compared to the text mining model. Moreover, in the lift chart generated by the opinion mining based model, the prediction accuracy for the documents with strong certainty was higher than that for the documents with weak certainty. Most of all, opinion mining has a meaningful advantage in that it can reduce learning time dramatically, because a sentiment lexicon generated once can be reused in a similar application domain. Additionally, the classification results can be clearly explained by using a sentiment lexicon. This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small number of movie reviews. Additionally, various parameters in the parsing and filtering steps of the text mining may have affected the accuracy of the prediction models. However, this research contributes a performance and comparison of text mining analysis and opinion mining analysis for opinion classification. In future research, a more precise evaluation of the two methods should be made through intensive experiments.

Analysis of Gwonbeop(拳法) in traditional martial arts literature (전통무예서의 권법 분석)

  • Kwak, Nak-hyun;Lim, Tae-hee
    • (The)Study of the Eastern Classic
    • /
    • no.54
    • /
    • pp.289-318
    • /
    • 2014
  • The purpose of this study was to compare Gwonbeop motions among "Gihyosinseo", "Mubiji", "Muyejaebobunyuksokjib" and to analysis catalogue of books on Gwonbeop for comprehensive interpretation in "Muyedobotongji". the conclusion through literature review was as follows. First, There were total 16 references which were composed of 14 references of China and 2 references of Korea. In particular there was no reference of Japan for Gwonbeop. In detail, unrepresentative references of China were "Hanseo", "Gihyosinseo", "Mubiji" and unrepresentative reference of Korea was Muyejaebobunyuksokjib". Second, We compared motions of Gwonbeop between China and Korea. There were located 5 motions such as Gyungakguheese(False Prey Posture), Gigose(Flag Beating Posture), Ahnshichukshinse(Goose Wing Posture), Jumjoose(Picking Elbow Posture), Pogase(Throwing Shelf Postere). In other three references unmentioned "Mubiji" there were located 8 motions such as Jungranse(Spring Railing Posture), Gichukgakse(Ghost Kicking and Leg Striking Postere), Jidangse(Open Finger Attacking Posture), Soodoose(Beast Head Shield Posture) Shingwon(Heavenly Fist Posture), Il-jopyunse(Whipping Lunging Posture), Jakjiryongse(Dragon Prey Snatching Posture), Joyangsoose(Slanting Hero Hand Posture), In two references of Korea there were located 2 unique motions such as Nachaluichoolmun Gakabyunhase(Low Encountering Posture), Gumgyedongnipse(Single Leg Throwing Postere). Third, Most of all we found two kinds of unique motions such as Chukcheonse and Eungswaeik on "Muyejaebobunyuksokjib" and such as Nachaluichoolmun Gakabyunhase(Low Encountering Posture), Gumgyedongnipse (Single Leg Throwing Postere) on "Muyedobotongji". Based on chronological table although "Gihyosinseo" is the longest literature, there was begun changing techniques in details on literatures of Korea. Transformed into techniques of Gwonbeop on Korea could be supposed that those skills were reflected in society and culture of the Joseon Dynasty. To sum it up, Gwonbeop of "Muyedobotongji" was written by "Gihyosinseo", "Mubiji", "Muyejaebobunyuksokjib" but most motions of Gwonbeop were begun to change gradually except 5 motions of "Gihyosinseo". Especially, there were 8 unique motions which could not be found in references of China. Those unique motions of Korea literatures were living proof of attempting transfiguration from motions of China. The significance of this study was to be able to put stepping-stone to interpretate history of Taekwondo which takes center stage on bare hands martial arts and analyzed the meaning of historical martial arts on Gwonbeop in Joseon Dynasty.

Semantic Visualization of Dynamic Topic Modeling (다이내믹 토픽 모델링의 의미적 시각화 방법론)

  • Yeon, Jinwook;Boo, Hyunkyung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.131-154
    • /
    • 2022
  • Recently, researches on unstructured data analysis have been actively conducted with the development of information and communication technology. In particular, topic modeling is a representative technique for discovering core topics from massive text data. In the early stages of topic modeling, most studies focused only on topic discovery. As the topic modeling field matured, studies on the change of the topic according to the change of time began to be carried out. Accordingly, interest in dynamic topic modeling that handle changes in keywords constituting the topic is also increasing. Dynamic topic modeling identifies major topics from the data of the initial period and manages the change and flow of topics in a way that utilizes topic information of the previous period to derive further topics in subsequent periods. However, it is very difficult to understand and interpret the results of dynamic topic modeling. The results of traditional dynamic topic modeling simply reveal changes in keywords and their rankings. However, this information is insufficient to represent how the meaning of the topic has changed. Therefore, in this study, we propose a method to visualize topics by period by reflecting the meaning of keywords in each topic. In addition, we propose a method that can intuitively interpret changes in topics and relationships between or among topics. The detailed method of visualizing topics by period is as follows. In the first step, dynamic topic modeling is implemented to derive the top keywords of each period and their weight from text data. In the second step, we derive vectors of top keywords of each topic from the pre-trained word embedding model. Then, we perform dimension reduction for the extracted vectors. Then, we formulate a semantic vector of each topic by calculating weight sum of keywords in each vector using topic weight of each keyword. In the third step, we visualize the semantic vector of each topic using matplotlib, and analyze the relationship between or among the topics based on the visualized result. The change of topic can be interpreted in the following manners. From the result of dynamic topic modeling, we identify rising top 5 keywords and descending top 5 keywords for each period to show the change of the topic. Existing many topic visualization studies usually visualize keywords of each topic, but our approach proposed in this study differs from previous studies in that it attempts to visualize each topic itself. To evaluate the practical applicability of the proposed methodology, we performed an experiment on 1,847 abstracts of artificial intelligence-related papers. The experiment was performed by dividing abstracts of artificial intelligence-related papers into three periods (2016-2017, 2018-2019, 2020-2021). We selected seven topics based on the consistency score, and utilized the pre-trained word embedding model of Word2vec trained with 'Wikipedia', an Internet encyclopedia. Based on the proposed methodology, we generated a semantic vector for each topic. Through this, by reflecting the meaning of keywords, we visualized and interpreted the themes by period. Through these experiments, we confirmed that the rising and descending of the topic weight of a keyword can be usefully used to interpret the semantic change of the corresponding topic and to grasp the relationship among topics. In this study, to overcome the limitations of dynamic topic modeling results, we used word embedding and dimension reduction techniques to visualize topics by era. The results of this study are meaningful in that they broadened the scope of topic understanding through the visualization of dynamic topic modeling results. In addition, the academic contribution can be acknowledged in that it laid the foundation for follow-up studies using various word embeddings and dimensionality reduction techniques to improve the performance of the proposed methodology.

Application of Remote Sensing Techniques to Survey and Estimate the Standing-Stock of Floating Debris in the Upper Daecheong Lake (원격탐사 기법 적용을 통한 대청호 상류 유입 부유쓰레기 조사 및 현존량 추정 연구)

  • Youngmin Kim;Seon Woong Jang ;Heung-Min Kim;Tak-Young Kim;Suho Bak
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.5_1
    • /
    • pp.589-597
    • /
    • 2023
  • Floating debris in large quantities from land during heavy rainfall has adverse social, economic, and environmental impacts, but the monitoring system for the concentration area and amount is insufficient. In this study, we proposed an efficient monitoring method for floating debris entering the river during heavy rainfall in Daecheong Lake, the largest water supply source in the central region, and applied remote sensing techniques to estimate the standing-stock of floating debris. To investigate the status of floating debris in the upper of Daecheong Lake, we used a tracking buoy equipped with a low-orbit satellite communication terminal to identify the movement route and behavior characteristics, and used a drone to estimate the potential concentration area and standing-stock of floating debris. The location tracking buoys moved rapidly during the period when the cumulative rainfall for 3 days increased by more than 200 to 300 mm. In the case of Hotan Bridge, which showed the longest distance, it moved about 72.8 km for one day, and the maximum moving speed at this time was 5.71 km/h. As a result of calculating the standing-stock of floating debris using a drone after heavy rainfall, it was found to be 658.8 to 9,165.4 tons, with the largest amount occurring in the Seokhori area. In this study, we were able to identify the main concentrations of floating debris by using location-tracking buoys and drones. It is believed that remote sensing-based monitoring methods, which are more mobile and quicker than traditional monitoring methods, can contribute to reducing the cost of collecting and processing large amounts of floating debris that flows in during heavy rain periods in the future.

Performance Analysis of Frequent Pattern Mining with Multiple Minimum Supports (다중 최소 임계치 기반 빈발 패턴 마이닝의 성능분석)

  • Ryang, Heungmo;Yun, Unil
    • Journal of Internet Computing and Services
    • /
    • v.14 no.6
    • /
    • pp.1-8
    • /
    • 2013
  • Data mining techniques are used to find important and meaningful information from huge databases, and pattern mining is one of the significant data mining techniques. Pattern mining is a method of discovering useful patterns from the huge databases. Frequent pattern mining which is one of the pattern mining extracts patterns having higher frequencies than a minimum support threshold from databases, and the patterns are called frequent patterns. Traditional frequent pattern mining is based on a single minimum support threshold for the whole database to perform mining frequent patterns. This single support model implicitly supposes that all of the items in the database have the same nature. In real world applications, however, each item in databases can have relative characteristics, and thus an appropriate pattern mining technique which reflects the characteristics is required. In the framework of frequent pattern mining, where the natures of items are not considered, it needs to set the single minimum support threshold to a too low value for mining patterns containing rare items. It leads to too many patterns including meaningless items though. In contrast, we cannot mine any pattern if a too high threshold is used. This dilemma is called the rare item problem. To solve this problem, the initial researches proposed approximate approaches which split data into several groups according to item frequencies or group related rare items. However, these methods cannot find all of the frequent patterns including rare frequent patterns due to being based on approximate techniques. Hence, pattern mining model with multiple minimum supports is proposed in order to solve the rare item problem. In the model, each item has a corresponding minimum support threshold, called MIS (Minimum Item Support), and it is calculated based on item frequencies in databases. The multiple minimum supports model finds all of the rare frequent patterns without generating meaningless patterns and losing significant patterns by applying the MIS. Meanwhile, candidate patterns are extracted during a process of mining frequent patterns, and the only single minimum support is compared with frequencies of the candidate patterns in the single minimum support model. Therefore, the characteristics of items consist of the candidate patterns are not reflected. In addition, the rare item problem occurs in the model. In order to address this issue in the multiple minimum supports model, the minimum MIS value among all of the values of items in a candidate pattern is used as a minimum support threshold with respect to the candidate pattern for considering its characteristics. For efficiently mining frequent patterns including rare frequent patterns by adopting the above concept, tree based algorithms of the multiple minimum supports model sort items in a tree according to MIS descending order in contrast to those of the single minimum support model, where the items are ordered in frequency descending order. In this paper, we study the characteristics of the frequent pattern mining based on multiple minimum supports and conduct performance evaluation with a general frequent pattern mining algorithm in terms of runtime, memory usage, and scalability. Experimental results show that the multiple minimum supports based algorithm outperforms the single minimum support based one and demands more memory usage for MIS information. Moreover, the compared algorithms have a good scalability in the results.

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.147-168
    • /
    • 2017
  • There have been many studies on accurate stock market forecasting in academia for a long time, and now there are also various forecasting models using various techniques. Recently, many attempts have been made to predict the stock index using various machine learning methods including Deep Learning. Although the fundamental analysis and the technical analysis method are used for the analysis of the traditional stock investment transaction, the technical analysis method is more useful for the application of the short-term transaction prediction or statistical and mathematical techniques. Most of the studies that have been conducted using these technical indicators have studied the model of predicting stock prices by binary classification - rising or falling - of stock market fluctuations in the future market (usually next trading day). However, it is also true that this binary classification has many unfavorable aspects in predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we try to predict the stock index by expanding the stock index trend (upward trend, boxed, downward trend) to the multiple classification system in the existing binary index method. In order to solve this multi-classification problem, a technique such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA) or Artificial Neural Networks (ANN) we propose an optimization model using Genetic Algorithm as a wrapper for improving the performance of this model using Multi-classification Support Vector Machines (MSVM), which has proved to be superior in prediction performance. In particular, the proposed model named GA-MSVM is designed to maximize model performance by optimizing not only the kernel function parameters of MSVM, but also the optimal selection of input variables (feature selection) as well as instance selection. In order to verify the performance of the proposed model, we applied the proposed method to the real data. The results show that the proposed method is more effective than the conventional multivariate SVM, which has been known to show the best prediction performance up to now, as well as existing artificial intelligence / data mining techniques such as MDA, MLOGIT, CBR, and it is confirmed that the prediction performance is better than this. Especially, it has been confirmed that the 'instance selection' plays a very important role in predicting the stock index trend, and it is confirmed that the improvement effect of the model is more important than other factors. To verify the usefulness of GA-MSVM, we applied it to Korea's real KOSPI200 stock index trend forecast. Our research is primarily aimed at predicting trend segments to capture signal acquisition or short-term trend transition points. The experimental data set includes technical indicators such as the price and volatility index (2004 ~ 2017) and macroeconomic data (interest rate, exchange rate, S&P 500, etc.) of KOSPI200 stock index in Korea. Using a variety of statistical methods including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, was classified into three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). 70% of the total data for each class was used for training and the remaining 30% was used for verifying. To verify the performance of the proposed model, several comparative model experiments such as MDA, MLOGIT, CBR, ANN and MSVM were conducted. MSVM has adopted the One-Against-One (OAO) approach, which is known as the most accurate approach among the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.

Efficient Topic Modeling by Mapping Global and Local Topics (전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안)

  • Choi, Hochang;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.69-94
    • /
    • 2017
  • Recently, increase of demand for big data analysis has been driving the vigorous development of related technologies and tools. In addition, development of IT and increased penetration rate of smart devices are producing a large amount of data. According to this phenomenon, data analysis technology is rapidly becoming popular. Also, attempts to acquire insights through data analysis have been continuously increasing. It means that the big data analysis will be more important in various industries for the foreseeable future. Big data analysis is generally performed by a small number of experts and delivered to each demander of analysis. However, increase of interest about big data analysis arouses activation of computer programming education and development of many programs for data analysis. Accordingly, the entry barriers of big data analysis are gradually lowering and data analysis technology being spread out. As the result, big data analysis is expected to be performed by demanders of analysis themselves. Along with this, interest about various unstructured data is continually increasing. Especially, a lot of attention is focused on using text data. Emergence of new platforms and techniques using the web bring about mass production of text data and active attempt to analyze text data. Furthermore, result of text analysis has been utilized in various fields. Text mining is a concept that embraces various theories and techniques for text analysis. Many text mining techniques are utilized in this field for various research purposes, topic modeling is one of the most widely used and studied. Topic modeling is a technique that extracts the major issues from a lot of documents, identifies the documents that correspond to each issue and provides identified documents as a cluster. It is evaluated as a very useful technique in that reflect the semantic elements of the document. Traditional topic modeling is based on the distribution of key terms across the entire document. Thus, it is essential to analyze the entire document at once to identify topic of each document. This condition causes a long time in analysis process when topic modeling is applied to a lot of documents. In addition, it has a scalability problem that is an exponential increase in the processing time with the increase of analysis objects. This problem is particularly noticeable when the documents are distributed across multiple systems or regions. To overcome these problems, divide and conquer approach can be applied to topic modeling. It means dividing a large number of documents into sub-units and deriving topics through repetition of topic modeling to each unit. This method can be used for topic modeling on a large number of documents with limited system resources, and can improve processing speed of topic modeling. It also can significantly reduce analysis time and cost through ability to analyze documents in each location or place without combining analysis object documents. However, despite many advantages, this method has two major problems. First, the relationship between local topics derived from each unit and global topics derived from entire document is unclear. It means that in each document, local topics can be identified, but global topics cannot be identified. Second, a method for measuring the accuracy of the proposed methodology should be established. That is to say, assuming that global topic is ideal answer, the difference in a local topic on a global topic needs to be measured. By those difficulties, the study in this method is not performed sufficiently, compare with other studies dealing with topic modeling. In this paper, we propose a topic modeling approach to solve the above two problems. First of all, we divide the entire document cluster(Global set) into sub-clusters(Local set), and generate the reduced entire document cluster(RGS, Reduced global set) that consist of delegated documents extracted from each local set. We try to solve the first problem by mapping RGS topics and local topics. Along with this, we verify the accuracy of the proposed methodology by detecting documents, whether to be discerned as the same topic at result of global and local set. Using 24,000 news articles, we conduct experiments to evaluate practical applicability of the proposed methodology. In addition, through additional experiment, we confirmed that the proposed methodology can provide similar results to the entire topic modeling. We also proposed a reasonable method for comparing the result of both methods.

The Conservation Treatment for the Mattress from National Folklore Cultural Heritage, the Red-lacquered Furniture with Inlaid Mother-of-pearl Design Used by Empress Sunjeonghyo and Comparative Study of Manufacturing Techniques (국가민속문화재 전 순정효황후 주칠 나전가구(傳 純貞孝皇后 朱漆 螺鈿家具) 매트리스의 보존처리 및 제작 기법 비교)

  • Park, Hyungho;Kim, Jongsu;Kim, Suchul;Keum, Jongsuk;Jang, Jongmin;Kim, Suha;Park, Changyuel
    • Korean Journal of Heritage: History & Science
    • /
    • v.54 no.1
    • /
    • pp.220-237
    • /
    • 2021
  • This study carried out the conservation treatment for the mattress put on the bed, which is one of 4 items in National Folklore Cultural Heritage, the Red-lacquered Furniture with the inlaid mother-of-pearl design used by Empress Sunjeonghyo (presumed), after identifying the characteristics of the manufacturing techniques and the used materials. And the study intends to compare it with the mattress placed in the Daejojeong in the Changdeokgung Palace in order to identify the characteristics of mattresses domestically used during the 1920s and 1930s. From the analysis of the mattress presumably used by Empress Sunjeonghyo, it was identified that the mattress frame was made of pinaceous hemlock spruce while the webbing and twine in the structural parts were made of jute. The findings are as follows: the burlap had a filling material that was made of jute; the straw mat was made from Oryza; and, the rest of the filling material was cotton. Rayon was used for the top cover while cotton was used for the bottom. As a result of research on the materials and the inner structure, it was found that mattress was manufactured in the form of the upholstery style mainly found in chairs and day-beds in Western furniture. Based on analysis results, materials identical to the original were adopted during the conservation treatment. Next, the process of dismantling, cleaning, repair, reinforcement and assembling was conducted. During the dismantling process, the top cover was newly discovered and some letters (Yokohama, Kobe, and Joseon) were found in the burlap filling, but there was no trace which can clarify its maker or production place. dry cleaning was carried out on the structural parts, filling materials, and the cover, and then the repair and reinforcement were done, preserving the existing materials in the upholstery structure and using the same materials for conservation. The webbing in the structural parts was reinforced using materials identical to the original, and the twine was used for arranging and fixing the springs into wooden frames. For the damaged cotton cloth and burlap, reinforcement materials identical to the original were put over it and sown. For the damaged area of the top cover, reinforcement cloth was cut and then added inside and the damaged area was sown. Assembling was carried out in the reverse order of the dismantling. After the burlap identical to the original material was inserted into the areas in contact with the springs and then fastened, a filling pad, reinforcement cloth, a straw mat, cotton cloth, cotton felt, wide cotton cloth for protecting the cover, and the cover were layered and fastened with tacks. The two mattresses used by Empress Sunjeonghyo differed only by the period of production and followed the same Western upholstery style consisting of the frames, filling materials, and covers. During the conservation treatment process, a velvet cover was newly discovered and the traces of repair in the past were found. Furthermore, identifying straw mats, straw bags, and straws for filling material, this study confirmed changes in the materials used according to the production environment. In the future, it is expected to see changes in the conservation materials during the conservation treatment and manufacturing techniques used for chairs and sofas in the upholstery style belonging to the modern cultural artifacts.

A Study on Commodity Asset Investment Model Based on Machine Learning Technique (기계학습을 활용한 상품자산 투자모델에 관한 연구)

  • Song, Jin Ho;Choi, Heung Sik;Kim, Sun Woong
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.127-146
    • /
    • 2017
  • Services using artificial intelligence have begun to emerge in daily life. Artificial intelligence is applied to products in consumer electronics and communications such as artificial intelligence refrigerators and speakers. In the financial sector, using Kensho's artificial intelligence technology, the process of the stock trading system in Goldman Sachs was improved. For example, two stock traders could handle the work of 600 stock traders and the analytical work for 15 people for 4weeks could be processed in 5 minutes. Especially, big data analysis through machine learning among artificial intelligence fields is actively applied throughout the financial industry. The stock market analysis and investment modeling through machine learning theory are also actively studied. The limits of linearity problem existing in financial time series studies are overcome by using machine learning theory such as artificial intelligence prediction model. The study of quantitative financial data based on the past stock market-related numerical data is widely performed using artificial intelligence to forecast future movements of stock price or indices. Various other studies have been conducted to predict the future direction of the market or the stock price of companies by learning based on a large amount of text data such as various news and comments related to the stock market. Investing on commodity asset, one of alternative assets, is usually used for enhancing the stability and safety of traditional stock and bond asset portfolio. There are relatively few researches on the investment model about commodity asset than mainstream assets like equity and bond. Recently machine learning techniques are widely applied on financial world, especially on stock and bond investment model and it makes better trading model on this field and makes the change on the whole financial area. In this study we made investment model using Support Vector Machine among the machine learning models. There are some researches on commodity asset focusing on the price prediction of the specific commodity but it is hard to find the researches about investment model of commodity as asset allocation using machine learning model. We propose a method of forecasting four major commodity indices, portfolio made of commodity futures, and individual commodity futures, using SVM model. The four major commodity indices are Goldman Sachs Commodity Index(GSCI), Dow Jones UBS Commodity Index(DJUI), Thomson Reuters/Core Commodity CRB Index(TRCI), and Rogers International Commodity Index(RI). We selected each two individual futures among three sectors as energy, agriculture, and metals that are actively traded on CME market and have enough liquidity. They are Crude Oil, Natural Gas, Corn, Wheat, Gold and Silver Futures. We made the equally weighted portfolio with six commodity futures for comparing with other commodity indices. We set the 19 macroeconomic indicators including stock market indices, exports & imports trade data, labor market data, and composite leading indicators as the input data of the model because commodity asset is very closely related with the macroeconomic activities. They are 14 US economic indicators, two Chinese economic indicators and two Korean economic indicators. Data period is from January 1990 to May 2017. We set the former 195 monthly data as training data and the latter 125 monthly data as test data. In this study, we verified that the performance of the equally weighted commodity futures portfolio rebalanced by the SVM model is better than that of other commodity indices. The prediction accuracy of the model for the commodity indices does not exceed 50% regardless of the SVM kernel function. On the other hand, the prediction accuracy of equally weighted commodity futures portfolio is 53%. The prediction accuracy of the individual commodity futures model is better than that of commodity indices model especially in agriculture and metal sectors. The individual commodity futures portfolio excluding the energy sector has outperformed the three sectors covered by individual commodity futures portfolio. In order to verify the validity of the model, it is judged that the analysis results should be similar despite variations in data period. So we also examined the odd numbered year data as training data and the even numbered year data as test data and we confirmed that the analysis results are similar. As a result, when we allocate commodity assets to traditional portfolio composed of stock, bond, and cash, we can get more effective investment performance not by investing commodity indices but by investing commodity futures. Especially we can get better performance by rebalanced commodity futures portfolio designed by SVM model.

A Study on Recent Research Trend in Management of Technology Using Keywords Network Analysis (키워드 네트워크 분석을 통해 살펴본 기술경영의 최근 연구동향)

  • Kho, Jaechang;Cho, Kuentae;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.2
    • /
    • pp.101-123
    • /
    • 2013
  • Recently due to the advancements of science and information technology, the socio-economic business areas are changing from the industrial economy to a knowledge economy. Furthermore, companies need to do creation of new value through continuous innovation, development of core competencies and technologies, and technological convergence. Therefore, the identification of major trends in technology research and the interdisciplinary knowledge-based prediction of integrated technologies and promising techniques are required for firms to gain and sustain competitive advantage and future growth engines. The aim of this paper is to understand the recent research trend in management of technology (MOT) and to foresee promising technologies with deep knowledge for both technology and business. Furthermore, this study intends to give a clear way to find new technical value for constant innovation and to capture core technology and technology convergence. Bibliometrics is a metrical analysis to understand literature's characteristics. Traditional bibliometrics has its limitation not to understand relationship between trend in technology management and technology itself, since it focuses on quantitative indices such as quotation frequency. To overcome this issue, the network focused bibliometrics has been used instead of traditional one. The network focused bibliometrics mainly uses "Co-citation" and "Co-word" analysis. In this study, a keywords network analysis, one of social network analysis, is performed to analyze recent research trend in MOT. For the analysis, we collected keywords from research papers published in international journals related MOT between 2002 and 2011, constructed a keyword network, and then conducted the keywords network analysis. Over the past 40 years, the studies in social network have attempted to understand the social interactions through the network structure represented by connection patterns. In other words, social network analysis has been used to explain the structures and behaviors of various social formations such as teams, organizations, and industries. In general, the social network analysis uses data as a form of matrix. In our context, the matrix depicts the relations between rows as papers and columns as keywords, where the relations are represented as binary. Even though there are no direct relations between papers who have been published, the relations between papers can be derived artificially as in the paper-keyword matrix, in which each cell has 1 for including or 0 for not including. For example, a keywords network can be configured in a way to connect the papers which have included one or more same keywords. After constructing a keywords network, we analyzed frequency of keywords, structural characteristics of keywords network, preferential attachment and growth of new keywords, component, and centrality. The results of this study are as follows. First, a paper has 4.574 keywords on the average. 90% of keywords were used three or less times for past 10 years and about 75% of keywords appeared only one time. Second, the keyword network in MOT is a small world network and a scale free network in which a small number of keywords have a tendency to become a monopoly. Third, the gap between the rich (with more edges) and the poor (with fewer edges) in the network is getting bigger as time goes on. Fourth, most of newly entering keywords become poor nodes within about 2~3 years. Finally, keywords with high degree centrality, betweenness centrality, and closeness centrality are "Innovation," "R&D," "Patent," "Forecast," "Technology transfer," "Technology," and "SME". The results of analysis will help researchers identify major trends in MOT research and then seek a new research topic. We hope that the result of the analysis will help researchers of MOT identify major trends in technology research, and utilize as useful reference information when they seek consilience with other fields of study and select a new research topic.