• Title/Summary/Keyword: vector-matrix method


Multivariate conditional tail expectations (다변량 조건부 꼬리 기대값)

  • Hong, C.S.;Kim, T.W.
    • The Korean Journal of Applied Statistics / v.29 no.7 / pp.1201-1212 / 2016
  • Value at Risk (VaR) is a popular measure for market risk management among financial companies; however, it cannot explain the amount of loss incurred when a specific investment fails. Conditional Tail Expectation (CTE) is an alternative risk measure defined as the conditional expectation of the loss given that it exceeds VaR. In real financial markets, multivariate loss rates are transformed into a univariate distribution in order to obtain and estimate the CTE of a portfolio. We propose multivariate CTEs using multivariate quantile vectors. A relationship among multivariate CTEs is also derived by extending univariate CTEs. Multivariate CTEs are obtained from bivariate and trivariate normal distributions, and the relationships among them are explored. We then discuss the extension to higher dimensions and illustrate some examples. Multivariate CTEs (using the variance-covariance matrix and multivariate quantile vector) are found to have smaller values than CTEs transformed to the univariate case. Therefore, it can be concluded that the proposed multivariate CTEs provide smaller estimates that represent less risk than the others, and that an aggressive investment using this CTE is also possible when a diversified investment strategy includes many companies in a portfolio.
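The univariate baseline described in this abstract can be sketched for normally distributed losses. This is only an illustration of the conventional approach (collapsing a portfolio to one dimension through the variance-covariance matrix), not the multivariate CTE the authors propose; the function names and the sample weights are hypothetical.

```python
from statistics import NormalDist


def var_normal(mu, sigma, alpha):
    """VaR at level alpha: the alpha-quantile of a Normal(mu, sigma) loss."""
    return NormalDist(mu, sigma).inv_cdf(alpha)


def cte_normal(mu, sigma, alpha):
    """CTE at level alpha: E[L | L > VaR_alpha] for a Normal(mu, sigma) loss.

    Closed form for the normal case: mu + sigma * phi(z_alpha) / (1 - alpha).
    """
    std = NormalDist()
    z = std.inv_cdf(alpha)
    return mu + sigma * std.pdf(z) / (1.0 - alpha)


def portfolio_sigma(weights, cov):
    """Standard deviation of a weighted portfolio loss: sqrt(w^T Sigma w)."""
    n = len(weights)
    variance = sum(
        weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return variance ** 0.5


# Hypothetical two-asset portfolio with uncorrelated unit-variance losses.
sigma_p = portfolio_sigma([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])
print(var_normal(0.0, sigma_p, 0.95), cte_normal(0.0, sigma_p, 0.95))
```

By construction the CTE is never below the VaR at the same level, since it averages only the losses beyond that quantile.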

Incremental Regression based on a Sliding Window for Stream Data Prediction (스트림 데이타 예측을 위한 슬라이딩 윈도우 기반 점진적 회귀분석)

  • Kim, Sung-Hyun;Jin, Long;Ryu, Keun-Ho
    • Journal of KIISE:Databases / v.34 no.6 / pp.483-492 / 2007
  • Conventional time series prediction techniques use a model generated in the training step, and this model is applied to new input data without any change. If such a model is applied directly to stream data, prediction accuracy decreases. This paper proposes a stream data prediction technique using a sliding window and regression. The technique takes into account that the characteristics of a time series may change over time. It is composed of two steps. The first step executes a fractional process for applying input data to the regression model. The second step updates the model by using its information as new data. Additionally, the model is maintained using only the recent data held in a queue. This approach has two advantages. It keeps the minimum information of the model in a matrix, so space complexity is reduced. Moreover, it prevents the error rate from increasing by updating the model over time. The accuracy of the proposed method is measured by RME (Relative Mean Error) and RMSE (Root Mean Square Error). The stream data prediction experiments show that the proposed technique, IMQR (Incremental Multiple Quadratic Regression), is more efficient than MLR (Multiple Linear Regression) and SVR (Support Vector Regression).
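The idea of keeping only recent data in a queue while storing the model as a small set of running sums can be sketched as follows. This is a simplified single-variable linear regression, not the IMQR method itself (whose details the abstract does not give); the class and method names are hypothetical.

```python
from collections import deque


class SlidingWindowRegression:
    """Least-squares line fitted over the most recent `window_size` points.

    Only five running sums are stored as the model, so updating costs O(1)
    per arriving point regardless of how long the stream has been running.
    """

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()  # queue of recent (x, y) pairs
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def _accumulate(self, x, y, sign):
        # sign=+1 adds a point to the sums, sign=-1 removes an expired one.
        self.n += sign
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.sxy += sign * x * y

    def update(self, x, y):
        """Add a new observation and drop the oldest one if the window is full."""
        self.window.append((x, y))
        self._accumulate(x, y, +1)
        if len(self.window) > self.window_size:
            old_x, old_y = self.window.popleft()
            self._accumulate(old_x, old_y, -1)

    def coefficients(self):
        """Return (slope, intercept) of the current least-squares line."""
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two points with distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return slope, intercept

    def predict(self, x):
        slope, intercept = self.coefficients()
        return slope * x + intercept
```

Because expired points are subtracted from the sums, the fitted line follows a drifting stream instead of averaging over its whole history.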

Characterization of Korean Archaeological Artifacts by Neutron Activation Analysis (I). Multivariate Classification of Korean Ancient Coins. (중성자 방사화분석에 의한 한국산 고고학적 유물의 특성화 연구 (I). 다변량 해석법에 의한 고전 (古錢) 의 분류 연구)

  • Chul Lee;Oh Cheun Kwun;Hyung Tae Kang;Ihn Chong Lee;Nak Bae Kim
    • Journal of the Korean Chemical Society / v.31 no.6 / pp.555-566 / 1987
  • Fifty ancient Korean coins from the Yi Dynasty have been analyzed for 9 elements (Sn, Fe, As, Ag, Co, Sb, Ir, Ru and Ni) by instrumental neutron activation analysis and for 3 elements (Cu, Pb and Zn) by atomic absorption spectrometry. Bronze coins from the early days of the dynasty contain Cu, Pb and Sn as major constituents approximately in the ratio 90 : 4 : 3, whereas those from the latter days contain them in the ratio 7 : 2 : 0. Brass coins, whose minting began in the 17th century, contain Cu, Zn and Pb as major constituents approximately in the ratio 7 : 1 : 1. The multivariate data have been analyzed for the relations among elemental contents through the variance-covariance matrix. The data have been further analyzed by a principal component mapping method. As a result, a training set of 8 classes has been chosen, based on the spread of sample points in an eigenvector plot and on archaeological data such as age and the office of minting. The training set and the test set of samples have finally been analyzed for assignment to certain classes or as outliers through statistical isolinear multiple component analysis (SIMCA).
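The two multivariate steps named in this abstract, building a variance-covariance matrix and projecting onto principal components, can be sketched in plain Python. The power-iteration routine below is one generic way to obtain the leading eigenvector and is an assumption, not the authors' procedure; the sample rows are made-up numbers, not coin measurements.

```python
def covariance_matrix(rows):
    """Sample variance-covariance matrix of rows (one row per sample)."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(dims)
        ]
        for i in range(dims)
    ]


def leading_component(cov, iterations=200):
    """Approximate the leading eigenvector of cov by power iteration.

    Starts from the all-ones vector, so it assumes that vector is not
    orthogonal to the leading eigenvector.
    """
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


# Hypothetical two-element data in which the second column is twice the first.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
cov = covariance_matrix(samples)
print(cov, leading_component(cov))
```

Plotting each sample's projection onto the first two such eigenvectors gives the kind of eigenvector plot the abstract uses to choose classes.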


WebPR : A Dynamic Web Page Recommendation Algorithm Based on Mining Frequent Traversal Patterns (WebPR :빈발 순회패턴 탐사에 기반한 동적 웹페이지 추천 알고리즘)

  • Yoon, Sun-Hee;Kim, Sam-Keun;Lee, Chang-Hoon
    • The KIPS Transactions:PartB / v.11B no.2 / pp.187-198 / 2004
  • The World-Wide Web is the largest distributed information space and has grown to encompass diverse information resources. However, although the Web is growing exponentially, an individual's capacity to read and digest content is essentially fixed. Web users can be confused by the explosion of Web information, by constantly changing Web environments, and by a lack of understanding of their needs. In these Web environments, mining traversal patterns is an important problem in Web mining, with a host of application domains including system design and information services. Conventional traversal pattern mining systems use the inter-page association in sessions with only a very restricted mechanism (based on a vector or matrix) for generating frequent k-pagesets. We develop a family of novel algorithms (termed WebPR - Web Page Recommend) for mining frequent traversal patterns and then the pagesets to recommend. Our algorithms provide Web users with new page views, which include the pagesets to recommend, so that users can effectively traverse a Web site. The main distinguishing factors are a consistent spanning scheme that applies inter-page association for mining frequent traversal patterns, and the proposal of the most efficient tree model. Our experiments with two real data sets, from the Lady Asiana and KBS media server sites, clearly validate that our method outperforms conventional methods.
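Mining frequent traversal patterns from sessions and using them to recommend a next page can be sketched with simple counting. This is a generic baseline written for illustration, not the WebPR tree model, which the abstract does not specify; the function names, the contiguous-path definition of a pattern, and the sample sessions are all assumptions.

```python
from collections import Counter


def frequent_paths(sessions, k, min_support):
    """Return contiguous k-page paths that occur in at least min_support sessions."""
    counts = Counter()
    for session in sessions:
        # Collect into a set first so a path counts once per session.
        seen = {tuple(session[i:i + k]) for i in range(len(session) - k + 1)}
        counts.update(seen)
    return {path: c for path, c in counts.items() if c >= min_support}


def recommend(sessions, current_path, min_support):
    """Recommend next pages that frequently follow current_path, best first."""
    prefix = tuple(current_path)
    freq = frequent_paths(sessions, len(prefix) + 1, min_support)
    candidates = [(c, path[-1]) for path, c in freq.items() if path[:-1] == prefix]
    # Sort by descending support, then by page name for a stable order.
    return [page for c, page in sorted(candidates, key=lambda t: (-t[0], t[1]))]


# Hypothetical click sessions.
sessions = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "B", "C", "E"],
    ["B", "C"],
]
print(frequent_paths(sessions, 2, 2))
print(recommend(sessions, ["A", "B"], 2))
```

Recounting from scratch for every query is the inefficiency that a tree-structured index of traversal paths is meant to avoid.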

Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.175-197 / 2021
  • Recently, with the development of deep learning technology, research on unstructured data analysis has been actively conducted and is showing remarkable results in various fields such as classification, summarization, and generation. Among the various text analysis fields, text classification is the most widely used technology in academia and industry. Text classification includes binary classification with one label among two classes, multi-class classification with one label among several classes, and multi-label classification with multiple labels among several classes. In particular, multi-label classification requires a different training method from binary and multi-class classification because each instance has multiple labels. In addition, since the number of labels to be predicted increases as the number of labels and classes increases, performance improvement is difficult because prediction becomes harder. To overcome these limitations, research is being actively conducted on label embedding, which (i) compresses the initially given high-dimensional label space into a low-dimensional latent label space, (ii) trains a model to predict the compressed label, and (iii) restores the predicted label to the high-dimensional original label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, since these techniques consider only the linear relationship between labels or compress the labels by random transformation, they cannot capture the non-linear relationship between labels, and so cannot create a latent label space that sufficiently contains the information of the original labels.
Recently, there have been increasing attempts to improve performance by applying deep learning technology to label embedding. A representative example is label embedding using an autoencoder, a deep learning model that is effective for data compression and restoration. However, traditional autoencoder-based label embedding suffers a large amount of information loss when compressing a high-dimensional label space with a myriad of classes into a low-dimensional latent label space. This can be attributed to the vanishing gradient problem that occurs during backpropagation in training. To solve this problem, the skip connection was devised: by adding the input of a layer to its output to prevent gradient loss during backpropagation, efficient learning is possible even when the network is deep. Skip connections are mainly used for image feature extraction in convolutional neural networks, but studies using skip connections in autoencoders or in the label embedding process are still lacking. Therefore, in this study, we propose an autoencoder-based label embedding methodology in which skip connections are added to both the encoder and the decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. In addition, the proposed methodology was applied to actual paper keywords to derive the high-dimensional keyword label space and the low-dimensional latent label space. Using these, we conducted an experiment that predicts the compressed keyword vector in the latent label space from the paper abstract and evaluates multi-label classification by restoring the predicted keyword vector to the original label space. As a result, in terms of accuracy, precision, recall, and F1 score, multi-label classification based on the proposed methodology showed far superior performance compared to traditional multi-label classification methods.
This shows that the low-dimensional latent label space derived through the proposed methodology reflected the information of the high-dimensional label space well, which ultimately improved the performance of the multi-label classification itself. In addition, the utility of the proposed methodology was examined by comparing its performance according to the domain characteristics and the number of dimensions of the latent label space.
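The skip connection described in this abstract, adding a layer's input to its output, can be sketched as a forward pass in plain Python. This is a minimal illustration under the assumption that the layer keeps the same width; how the authors handle the dimension changes inside the encoder and decoder is not stated in the abstract, and the function names are hypothetical.

```python
def dense_relu(x, weights, bias):
    """One fully connected layer followed by ReLU: max(0, W x + b)."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def skip_block(x, weights, bias):
    """Layer output plus the unchanged input (skip connection).

    Assumes the layer preserves dimensionality so the two can be added.
    """
    h = dense_relu(x, weights, bias)
    if len(h) != len(x):
        raise ValueError("skip connection needs matching input/output sizes")
    return [hi + xi for hi, xi in zip(h, x)]


x = [1.0, -2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(skip_block(x, identity, [0.0, 0.0]))
```

Even if the layer's own weights contribute nothing, the block still passes its input through unchanged, which is why the added path helps gradients reach earlier layers.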

A Study on Music Summarization (음악요약 생성에 관한 연구)

  • Kim Sung-Tak;Kim Sang-Ho;Kim Hoi-Rin;Choi Ji-Hoon;Lee Han-Kyu;Hong Jin-Woo
    • Journal of Broadcast Engineering / v.11 no.1 s.30 / pp.3-14 / 2006
  • Music summarization is a technique that automatically generates the most important and representative part or parts of music content. Music summarization techniques have been studied in two categories according to summary characteristics. The first provides the repeated part as the music summary, and the second provides a combination of segments with different characteristics in the music content as the music summary. In this paper, we propose and evaluate two kinds of music summarization techniques. The algorithm using multi-level vector quantization, which provides a repeated part as a fixed-length music summary, is evaluated by the overlapping ratio between hand-made repeated parts and the automatically generated summary. As a result, the overlapping ratios of the conventional methods are 42.2% and 47.4%, while that of the proposed method with a fixed-length summary is 67.1%. The optimal-length music summary is evaluated by the portion of overlap between the summary and the repeated part, whose length differs according to the music content, and the result shows that the automatically generated summary with optimal length captures a more effective part than the fixed-length summary. The cluster-based algorithm, using a 2-D similarity matrix and the k-means algorithm, provides the combined segments as the music summary. To evaluate this algorithm, we use a MOS test consisting of two questions (How many similar segments are in the summarized music? How many segments are included in the same structure?), and the results show good performance.
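The 2-D similarity matrix used by the cluster-based algorithm can be sketched as pairwise similarity between per-frame feature vectors. Cosine similarity is an assumption here, since the abstract does not name the measure, and the feature vectors are made-up values rather than real audio features.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def similarity_matrix(frames):
    """2-D matrix whose entry [i][j] is the similarity of frames i and j."""
    return [[cosine(a, b) for b in frames] for a in frames]


# Hypothetical per-frame features: frames 0 and 2 point the same way.
frames = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
for row in similarity_matrix(frames):
    print(row)
```

Repeated sections of a piece show up as off-diagonal stripes of high similarity, and the rows of this matrix can be fed to a clustering step such as k-means.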

Incorporating Social Relationship discovered from User's Behavior into Collaborative Filtering (사용자 행동 기반의 사회적 관계를 결합한 사용자 협업적 여과 방법)

  • Thay, Setha;Ha, Inay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.1-20 / 2013
  • Nowadays, social networks are huge communication platforms that let people connect with one another and bring users together to share common interests, experiences, and daily activities. Users spend hours per day maintaining personal information and interacting with other people via posting, commenting, messaging, games, social events, and applications. Due to the growth of users' distributed information in social networks, there is great potential to use social data to enhance the quality of recommender systems. Some studies focusing on social network analysis investigate how social networks can be used in the recommendation domain. Among these, we are interested in taking advantage of the interaction between a user and others in a social network, which can be determined and is known as a social relationship. Furthermore, users' decisions before purchasing products mostly depend on suggestions from people who have either the same preferences or a closer relationship. For this reason, we believe that users' relationships in social networks can provide an effective way to increase the quality of a recommender system's predictions of user interests. Therefore, the social relationship between users found in a social network is a common factor for improving the prediction of users' preferences in the conventional approach. Recommender systems are dramatically increasing in popularity and are currently used by many e-commerce sites such as Amazon.com, Last.fm, and eBay.com. The collaborative filtering (CF) method is one of the essential and powerful techniques in recommender systems for suggesting appropriate items to a user by learning the user's preferences. The CF method focuses on user data and generates automatic predictions about a user's interests by gathering information from users who share a similar background and preferences.
Specifically, the intention of the CF method is to find users who have similar preferences and to suggest to the target user the items most preferred by those nearest-neighbor users. There are two basic units that the CF method needs to consider: the user and the item. Each user needs to provide a rating value for items (e.g. movies, products, books) to indicate interest in those items. In addition, CF uses the user-rating matrix to find a group of users whose ratings are similar to those of the target user. Then, it predicts the unknown rating values for items that the target user has not rated. Currently, CF has been successfully implemented in both information filtering and e-commerce applications. However, some important challenges remain, such as cold start, data sparsity, and scalability, which affect the quality and accuracy of prediction. To overcome these challenges, many researchers have proposed various kinds of CF methods such as hybrid CF, trust-based CF, and social network-based CF. To improve the recommendation performance and prediction accuracy of standard CF, in this paper we propose a method that integrates the traditional CF technique with the social relationship between users discovered from user behavior in a social network, namely Facebook. We identify a user's relationships from user behavior such as posts and comments exchanged with friends on Facebook. We believe that the social relationship implicitly inferred from user behavior can be applied to compensate for the limitations of the conventional approach. Therefore, we extract the posts and comments of each user by using the Facebook Graph API and calculate a feature score for each term to obtain a feature vector for computing user similarity. Then, we combine the result with the similarity value computed using the traditional CF technique. Finally, our system provides a list of recommended items according to the neighbor users who have the largest total similarity value to the target user.
To verify and evaluate our proposed method, we performed an experiment on data collected from our Movies Rating System. Prediction accuracy is evaluated in terms of MAE to demonstrate how correct the recommendations given by our algorithm are. Then, performance is evaluated in terms of precision, recall, and F1-measure to show the effectiveness of our method. An evaluation of coverage is also included in our experiment to assess the ability to generate recommendations. The experimental results show that our proposed method outperforms the benchmark methods and is more accurate in suggesting items to users. In particular, using user behavior in the social network improves recommendation accuracy by up to 6%. Moreover, the experiment on recommendation performance shows that incorporating the social relationship observed from user behavior into CF is beneficial and useful for generating recommendations, with a 7% improvement in performance compared with the benchmark methods. Finally, we confirm that interaction between users in a social network can enhance accuracy and give better recommendations than the conventional approach.
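Combining a rating-based similarity with a behavior-based social similarity, predicting an unknown rating, and scoring it with MAE can be sketched as follows. The weighted blend, the `weight` parameter, the function names, and the sample data are assumptions for illustration; the abstract says only that the two similarity values are combined, not how.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two rating dicts over their co-rated items."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = sqrt(sum(u[i] ** 2 for i in common))
    nv = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv) if nu and nv else 0.0


def blended_similarity(rating_sim, social_sim, weight=0.5):
    """Assumed combination rule: a weighted average of the two similarities."""
    return weight * rating_sim + (1.0 - weight) * social_sim


def predict(target, item, ratings, social, weight=0.5):
    """Similarity-weighted average of neighbors' ratings for item, or None."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == target or item not in other_ratings:
            continue
        sim = blended_similarity(
            cosine(ratings[target], other_ratings),
            social.get((target, other), 0.0),
            weight,
        )
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None


def mae(predictions, actuals):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(predictions)


# Hypothetical ratings and social similarities (e.g. derived from posts/comments).
ratings = {"u": {"a": 5}, "v": {"a": 5, "b": 4}, "w": {"a": 1, "b": 2}}
social = {("u", "v"): 1.0, ("u", "w"): 0.0}
print(predict("u", "b", ratings, social))
```

In this toy data the rating similarity alone cannot separate the two neighbors, because each shares only one co-rated item with the target; the social term is what tilts the prediction toward the closer friend.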