• Title/Summary/Keyword: Easy of Use


A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.57-73 / 2021
  • Maintaining ICT infrastructure and preventing failures through anomaly detection is becoming increasingly important. System monitoring data is multidimensional time series data, and it is difficult to account for the characteristics of multidimensional data and of time series data at the same time. With multidimensional data, correlations between variables must be considered. Existing probability-based, linear, and distance-based methods degrade because of the curse of dimensionality. In addition, time series data is typically preprocessed with a sliding window and time series decomposition for autocorrelation analysis. These techniques increase the dimension of the data, so they need to be supplemented. Anomaly detection is a long-established research field; statistical methods and regression analysis were used in the early days, and machine learning and artificial neural networks are now being actively applied. Statistical methods are difficult to apply when the data is non-homogeneous, and they do not detect local outliers well. Regression-based methods learn a regression formula based on parametric statistics and then detect anomalies by comparing predicted values with actual values. Their performance drops when the model is not robust or when the data contains noise or outliers, so the training data must be free of noise and outliers. An autoencoder built on an artificial neural network is trained to produce output as similar as possible to its input. It has many advantages over existing probabilistic and linear models, cluster analysis, and supervised learning. It can be applied to data that does not satisfy probability distribution or linearity assumptions. 
In addition, it can be trained in an unsupervised manner, without labeled data. However, autoencoder-based anomaly detection is limited in identifying local outliers in multidimensional data, and the characteristics of time series data greatly increase the dimension of the data. In this study, we propose a CMAE (Conditional Multimodal Autoencoder) that improves anomaly detection performance by considering local outliers and time series characteristics. First, we applied a Multimodal Autoencoder (MAE) to overcome the limited identification of local outliers in multidimensional data. Multimodal models are commonly used to learn from different types of input, such as voice and image. The modalities share the autoencoder's bottleneck, through which correlations between them are learned. In addition, a CAE (Conditional Autoencoder) was used to learn the characteristics of time series data effectively without increasing the dimension of the data. Conditional inputs are usually categorical variables, but in this study time was used as the condition in order to learn periodicity. The proposed CMAE model was verified by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). The reconstruction performance of the autoencoders for 41 variables was examined in the proposed model and the comparison models. Reconstruction performance differs by variable; the loss is small for the Memory, Disk, and Network modalities in all three autoencoder models, indicating that reconstruction works well. The Process modality showed no significant difference among the three models, and the CPU modality was reconstructed best by CMAE. ROC curves were drawn to evaluate the anomaly detection performance of the proposed model and the comparison models, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order CMAE, MAE, and AE. 
In particular, the recall of CMAE was 0.9828, which confirms that it detects almost all anomalies. The accuracy of the model also improved to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. From a practical standpoint, the proposed model has advantages beyond the performance improvement. Techniques such as time series decomposition and sliding windows add procedures that must be managed, and the dimensional increase they cause can slow down computation at inference time. The proposed model is easy to apply in practice in terms of inference speed and model management.
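The evaluation described in this abstract can be illustrated with a minimal Python sketch. It assumes that an anomaly is flagged when an autoencoder's reconstruction error exceeds a threshold; the abstract does not state the scoring rule, so the threshold, the function names, and the toy numbers below are all hypothetical and are not the authors' implementation.

```python
def reconstruction_error(x, x_hat):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def flag_anomalies(errors, threshold):
    """Mark a sample as anomalous when its error exceeds the (assumed) threshold."""
    return [e > threshold for e in errors]


def detection_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from boolean labels (True = anomaly)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy usage with made-up reconstruction errors and ground-truth labels.
errors = [0.01, 0.50, 0.02, 0.90]
predicted = flag_anomalies(errors, threshold=0.1)
metrics = detection_metrics([False, True, True, True], predicted)
```

A high recall, such as the 0.9828 reported for CMAE, means that few true anomalies fall below the decision threshold.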

Visualizing the Results of Opinion Mining from Social Media Contents: Case Study of a Noodle Company (소셜미디어 콘텐츠의 오피니언 마이닝결과 시각화: N라면 사례 분석 연구)

  • Kim, Yoosin;Kwon, Do Young;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.89-105 / 2014
  • Since the emergence of the Internet, social media built on highly interactive Web 2.0 applications has provided a very user-friendly means for consumers and companies to communicate with each other. Users routinely publish content reflecting their opinions and interests on social media such as blogs, forums, chat rooms, and discussion boards, and this content is released on the Internet in real time. For that reason, many researchers and marketers regard social media content as a source of information for business analytics, and many studies have reported results on mining business intelligence from social media content. In particular, opinion mining and sentiment analysis, techniques for extracting, classifying, understanding, and assessing the opinions implicit in text, are frequently applied to social media content analysis because they emphasize determining sentiment polarity and extracting authors' opinions. Researchers have presented a number of frameworks, methods, techniques, and tools. However, we found that these methods are often technically complicated and not sufficiently user-friendly to support business decisions and planning. In this study, we attempted to formulate a more comprehensive and practical approach to opinion mining with visual deliverables. First, we described the entire cycle of practical opinion mining using social media content, from the initial data gathering stage to the final presentation session. Our proposed approach consists of four phases: collecting, qualifying, analyzing, and visualizing. In the first phase, analysts choose the target social media. Each target medium requires a different means of access, such as an open API, search tools, a DB2DB interface, or purchasing content. The second phase is pre-processing, which generates useful material for meaningful analysis. 
If garbage data is not removed, the results of social media analysis will not provide meaningful and useful business insights. To clean social media data, natural language processing techniques should be applied. The next step is the opinion mining phase, in which the cleansed social media content set is analyzed. The qualified data set includes not only user-generated content but also content identification information such as creation date, author name, user ID, content ID, hit counts, review or reply, favorite, etc. Depending on the purpose of the analysis, researchers or data analysts can select a suitable mining tool. Topic extraction and buzz analysis are usually related to market trend analysis, while sentiment analysis is used for reputation analysis. There are also various applications, such as stock prediction, product recommendation, and sales forecasting. The last phase is visualization and presentation of the analysis results. The major focus and purpose of this phase are to explain the results of the analysis and help users comprehend their meaning. Therefore, to the extent possible, deliverables from this phase should be simple, clear, and easy to understand rather than complex and flashy. To illustrate our approach, we conducted a case study on a leading Korean instant noodle company. We targeted the leading company, NS Food, which has a 66.5% market share and has kept the No. 1 position in the Korean "Ramen" business for several decades. We collected a total of 11,869 pieces of content, including blogs, forum posts, and news articles. After collecting the social media content, we generated language resources specific to the instant noodle business for data manipulation and analysis using natural language processing. In addition, we tried to classify the content into more detailed categories such as marketing features, environment, and reputation. 
In these phases, we used freeware packages of the R project such as TM, KoNLP, ggplot2, and plyr. As a result, we presented several useful visualization outputs, including domain-specific lexicons, volume and sentiment graphs, topic word clouds, heat maps, and valence tree maps, to provide vivid, full-color examples using open library packages of the R project. At a glance, business actors can quickly detect areas that are weak, strong, positive, negative, quiet, or loud. A heat map shows the movement of sentiment or volume in a category-by-time matrix, using color density across time periods. A valence tree map, one of the most comprehensive and holistic visualization models, should help analysts and decision makers quickly understand the "big picture" of the business situation through its hierarchical structure, since a tree map can present buzz volume and sentiment for a given period in a single visual. This case study offers real-world business insights from market sensing and demonstrates to practical-minded business users how they can use these types of results for timely decision making in response to ongoing changes in the market. We believe our approach can provide a practical and reliable guide to opinion mining with visualized results that are immediately useful, not just in the food industry but in other industries as well.

An Exploratory Study on the Effects of Relational Benefits and Brand Identity : mediating effect of brand identity (관계혜택과 브랜드 동일시의 역할에 관한 탐색적 연구: 브랜드 동일시의 매개역할을 중심으로)

  • Bang, Jounghae;Jung, Jiyeon;Lee, Eunhyung;Kang, Hyunmo
    • Asia Marketing Journal / v.12 no.2 / pp.155-175 / 2010
  • Most service industries, including finance and telecommunications, have become mature and saturated. Competition has become severe while the differences among brands have become smaller. Maintaining good relationships with customers has therefore become critical for service providers. Credit cards and debit cards show a similar pattern. Card companies need to maintain good relationships with customers, so they run marketing programs that provide customized services and make use of membership programs. They not only build and maintain good relationships but also highlight the emotional aspects of their brands. For example, KB Card and Hyundai Card use the works of well-known designers in their credit card designs, and they differentiate the designs of their cards to stress their brand personalities. BC Card introduced a credit card scented with a perfume of the customer's choice. Even though a credit card is small and not readily shown in public, it has become more important for these companies to touch customers' feelings with their brand personalities and images. This is partly because of changes in consumers' lifestyles. Generation Y is more likely than Generation X to express itself in many different ways and is more emotional. For Generation Y, therefore, even the credit cards in a wallet should be personalized and well designed. In line with this, credit cards with good design can be seen as an example of brand identity, where a different design for each customer can be used to signal the membership groups that customers want to belong to. On the other hand, these credit card companies offer special treatment benefits to customers who are heavy users of their cards. For example, customers who love sports receive special discounts when they use their credit cards for sports-related products. 
This study therefore attempted to explore the relationships among relational benefits, brand identification, and loyalty. Many studies have shown that relational benefits and brand identification each lead to loyalty independently, but few studies have examined all three variables together in one research model. Furthermore, as reviewed above, many companies in the card industry attempt to associate a brand image with their products to fit their customers' lifestyles, while relational benefits still play an important role in their business. Our research model therefore includes relational benefits, brand identification, and loyalty, and we focus on the mediating effect of brand identification. From the relational benefits perspective, only special treatment benefit and confidence benefit are included. Social benefit is not applicable to the credit card industry because few cases of face-to-face interaction can be found. From the brand identification perspective, personal brand identity and social brand identity are reviewed and included in the model. Overall, the research model emphasizes that the relationship between relational benefits and loyalty is mediated by brand identification. The effects of the relational benefits, namely confidence benefit and special treatment benefit, on loyalty are realized when they fit the personal brand identity and the social brand identity. The research model therefore hypothesizes relationships between confidence benefit and both social and personal brand identity, and between special treatment benefit and both social and personal brand identity. Loyalty, in turn, is hypothesized to be positively related to personal brand identity and social brand identity. 
In addition, confidence benefit is expected to have a direct, positive relationship with loyalty because it has been recognized as a critical factor for good relationships and satisfaction. Data were collected from college students who had been using either credit cards or debit cards. College students were regarded as good subjects because they belong to the Generation Y cohort and tend to express themselves more. The initial sample size was 203, but after deleting responses with many missing values, 197 data points remained and were used for model testing. Measurement items were taken from previous literature and modified for this research. To test reliability, Cronbach's α was examined using SPSS 14; all values ranged from .874 to .928, exceeding .7. Confirmatory factor analysis was conducted with AMOS 7.0 to investigate the measurement model. The measurement model showed a good fit, with χ2(67)=188.388 (p=.000), GFI=.886, AGFI=.821, CFI=.941, RMSEA=.096. Structural equation modeling with AMOS 7.0 was used to analyze the research model. The overall fit of the research model was χ2(68)=188.670 (p=.000), GFI=.886, AGFI=.824, CFI=.942, RMSEA=.095, indicating a good fit. In detail, all the paths hypothesized in the research model were significant except for the path from social brand identity to loyalty. Personal brand identity leads to loyalty, while both confidence benefit and special treatment benefit have positive relationships with personal and social brand identity. Confidence benefit also has a direct positive effect on loyalty. The results indicate the following. First, personal brand identity plays an important role in credit/debit card usage. Therefore, even for products that are not readily shown in public, design and emotional aspects can be important for fitting customers' lifestyles. 
Second, confidence benefit and special treatment benefit have positive effects on personal brand identity. Marketers therefore need to associate special treatment, trust, and confidence benefits with personal image, personality, and personal identity. Third, this study reconfirmed the importance of confidence and trust. Interestingly, however, social brand identity was not found to be significantly related to loyalty. This may be explained by the fact that the sample consists mainly of college students: strategies to facilitate social brand identity focus on high social status groups, while college students have not yet established their status.


Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. 
Furthermore, because HDFS (Hadoop Distributed File System) stores data by replicating the blocks of the aggregated log data, the proposed system offers automatic restore functions so that the system can continue operating after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods for effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data. Further, databases with strict schemas, such as relational databases, cannot easily expand to additional nodes when the stored data must be distributed across nodes as the amount of data rapidly increases. NoSQL does not provide the complex computations that relational databases may provide, but it can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. The data models of NoSQL are usually classified as key-value, column-oriented, and document-oriented types. Of these, MongoDB, a representative document-oriented database with a schema-free structure, is used in the proposed system. MongoDB is adopted because its flexible schema makes it easy to process unstructured log data, it facilitates flexible node expansion when the amount of data is rapidly increasing, and it provides an auto-sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. 
When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies data according to the type of log data and distributes it to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require a real-time log data analysis are stored in the MySQL module and provided real-time by the log graph generator module. The aggregated log data per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are parallel-distributed and processed by the Hadoop-based analysis module. A comparative evaluation is carried out against a log data processing system that uses only MySQL for inserting log data and estimating query performance; this evaluation proves the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.
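The routing performed by the log collector module can be sketched in Python as follows. This is only an illustration under stated assumptions: the record fields (`type`, `ts`, `realtime`) and both function names are hypothetical, and plain in-memory lists stand in for the MySQL and MongoDB modules so that the sketch runs without any database driver.

```python
from collections import defaultdict


def route_logs(records):
    """Send real-time logs to the MySQL stand-in and the rest to the MongoDB stand-in."""
    stores = {"mysql": [], "mongodb": []}
    for record in records:
        # Assumption: a boolean "realtime" field marks logs needing real-time analysis.
        target = "mysql" if record.get("realtime") else "mongodb"
        stores[target].append(record)
    return stores


def aggregate_per_unit_time(records, unit_seconds=60):
    """Count log records per (time bucket, log type), as input for graph plotting."""
    counts = defaultdict(int)
    for record in records:
        bucket = record["ts"] - record["ts"] % unit_seconds  # start of the time unit
        counts[(bucket, record["type"])] += 1
    return dict(counts)


# Toy usage with made-up banking log records.
logs = [
    {"type": "login", "ts": 5, "realtime": True},
    {"type": "transfer", "ts": 5},
    {"type": "transfer", "ts": 30},
    {"type": "transfer", "ts": 65},
]
stores = route_logs(logs)
per_minute = aggregate_per_unit_time(stores["mongodb"])
```

In a real deployment, each stand-in list would be replaced by writes to the corresponding database, and the aggregation would more likely run in the Hadoop-based analysis module.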

Interpretation of Praying Letter and Estimation of Production Period on Samsaebulhoedo at Yongjusa Temple (용주사(龍珠寺) <삼세불회도(三世佛會圖)>의 축원문(祝願文) 해석(解釋)과 제작시기(製作時期) 추정(推定))

  • Kang, Kwan-shik
    • MISULJARYO - National Museum of Korea Art Journal / v.96 / pp.155-180 / 2019
  • Samsaebulhoedo(三世佛會圖) at Yongjusa Temple(龍珠寺), regarded as a monumental masterpiece consisting of different elements such as Confucian and Buddhist ideas, palace academy garden and Buddhist artist styles, and unique traditional and Western painting styles, is one of the representative works that symbolically illustrate the development and innovation of painting in the late Joseon dynasty. However, the absence of a painting inscription has fueled persistent controversy among researchers over the past half century regarding its production period, its original painter, and its stylistic characteristics. As a result, the work has not gained recognition commensurate with its historical significance and value. Estimating the production period of the extant painting is particularly vital because it is the starting point for all other discussions. However, the issue has remained contested because the evidence is difficult to interpret consistently: the records on the painters differ, and the work mixes traditional Buddhist painting styles used by Buddhist painters with innovative Western styles used by ordinary painters. Unlike ordinary Buddhist paintings, Samsaebulhoedo has a praying letter for the royal establishment at the center of the main altar. Notably, the original text (His Royal Highness King, Her Majesty, His Royal Crown Prince; 主上殿下, 王妃殿下, 世子邸下) was erased, and Her Love Majesty(慈宮邸下) was added in front of Her Majesty. This praying letter can be regarded as significant, objective evidence for estimating the production period. The new argument for a late 19th-century production focused on this praying letter and pointed out that King Sunjo was then the first-born son, not yet Crown Prince, when Yongjusa Temple was built in 1790, and that it was not until January 1, 1800 that he was invested as Crown Prince. 
In this light, the existing praying letter with the eulogistic title-Crown Prince世子-should be considered revised after his ascension to the throne. Styles and icons bore some resemblance to Samsaebulhoedo at Cheongryongsa Temple or Bongeunsa Temple portrayed by Buddhist painters in the late 19th century. Therefore, the remaining Samsaebulhoedo should be depicted by them in the same period as western styles were introduced in Buddhist painting in later days. Following extensive investigations, praying letters in Buddhist paintings in the late 19th century show that it was usual to record specification such as class, birth date and family name of people during the dynasty at the point of producing Buddhist paintings. It is easy to find that those who passed away decades ago cannot be revised to use eulogistic titles as seen by the praying letters in Samsaebulhoedo at Yongju Temple. As "His Royal Highness King, Her Majesty, His Royal Crown Prince" was generally used around 1790 regardless of the presence of first-born son or Crown Prince, it was rather natural to write the eulogistic title "His Royal Crown Prince" in the praying letter of Samsaebulhoedo. Contrary to ordinary royal hierarchy, Her Love Majesty was placed in front of Her Majesty. Based on this, the praying letter was assumed to be revised since King Jeongjo placed royal status of Hyegyeonggung before the Queen, which was an exceptional case during King Jeongjo's reign, due to unusual relationships among King Jeongjo, Hyegyeonggung and the Queen arising from the death of Crown Prince(思悼世子). At that time, there was a special case of originally writing a formal tripod praying letter, as can be seen from ordinary praying letter in Buddhist paintings, erasing it and adding a special eulogistic title: Her Love Majesty. 
This indicates that King Jeongjo identified that Hyegyeonggung was erased, and commanded that it be added; nevertheless, ceremony leaders of Yongju Temple, built as a palace for holding ceremonies of Hyeonryungwon(顯隆園) are Jeongjo, the son of his father and his wife Hyegyeonggung (Her Love Majesty)(惠慶宮(慈宮)). This revision is believed to have been ordered by King Jeongjo on January 17, 1791, when the King paid his first visit to Hyeonryungwon since the establishment of Hyeonryungwon and Yongju Temple, stopped by Yongju Temple on his way back to the palace, and saw Samsaebulhoedo for the first and last time. As shown above, this letter, with its special content and form, can be seen as clear, objective evidence that the extant Samsaebulhoedo is the original painted in 1790, when Yongju Temple was built.

A Proposal of a Keyword Extraction System for Detecting Social Issues (사회문제 해결형 기술수요 발굴을 위한 키워드 추출 시스템 제안)

  • Jeong, Dami;Kim, Jaeseok;Kim, Gi-Nam;Heo, Jong-Uk;On, Byung-Won;Kang, Mijung
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.1-23 / 2013
  • To discover significant social issues such as unemployment, economic crisis, and social welfare that urgently need to be solved in modern society, researchers usually collect opinions from professional experts and scholars through online or offline surveys. However, such a method is not always effective. Because of the expense, a large number of survey replies is seldom gathered. In some cases, it is also hard to find professionals who deal with specific social issues. Thus, the sample set is often small and may be biased. Furthermore, several experts may draw totally different conclusions about a social issue because each expert has a subjective point of view and a different background. In this case, it is considerably hard to figure out what the current social issues are and which of them are really important. To overcome the shortcomings of the current approach, in this paper we develop a prototype system that semi-automatically detects keywords representing social issues and problems from about 1.3 million news articles published by about 10 major domestic news outlets in Korea from June 2009 until July 2012. Our proposed system consists of (1) collecting news articles and extracting their texts, (2) identifying only news articles related to social issues, (3) analyzing the lexical items of Korean sentences, (4) finding a set of topics regarding social keywords over time based on probabilistic topic modeling, (5) matching relevant paragraphs to a given topic, and (6) visualizing social keywords for easy understanding. In particular, we propose a novel matching algorithm relying on generative models. The goal of our proposed matching algorithm is to best match paragraphs to each topic. 
Technically, using a topic model such as Latent Dirichlet Allocation (LDA), we can obtain a set of topics, each of which has relevant terms and their probability values. In our problem, given a set of text documents (e.g., news articles), LDA produces a set of topic clusters, and each topic cluster is then labeled by human annotators, where each topic label stands for a social keyword. For example, suppose there is a topic (e.g., Topic1 = {(unemployment, 0.4), (layoff, 0.3), (business, 0.3)}) and a human annotator labels Topic1 "Unemployment Problem". In this example, it is non-trivial to understand what happened regarding the unemployment problem in our society. In other words, by looking only at social keywords, we have no idea of the detailed events occurring in our society. To tackle this matter, we develop a matching algorithm that computes the probability value of a paragraph given a topic, relying on (i) topic terms and (ii) their probability values. For instance, given a set of text documents, we segment each text document into paragraphs. Meanwhile, using LDA, we extract a set of topics from the text documents. Based on our matching process, each paragraph is assigned to the topic it best matches. As a result, each topic has several best-matched paragraphs. Furthermore, suppose there are a topic (e.g., Unemployment Problem) and its best-matched paragraph (e.g., Up to 300 workers lost their jobs in XXX company at Seoul). In this case, we can grasp the detailed information of the social keyword, such as "300 workers", "unemployment", "XXX company", and "Seoul". In addition, our system visualizes social keywords over time. Therefore, through our matching process and keyword visualization, most researchers will be able to detect social issues easily and quickly. 
Through this prototype system, we have detected various social issues appearing in our society and have shown the effectiveness of our proposed methods in our experimental results. Note that our proof-of-concept system is also available at http://dslab.snu.ac.kr/demo.html.
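The paragraph-to-topic matching step can be illustrated with a rough Python sketch. The abstract does not give the authors' generative formula, so the score used here, the average topic-term probability over a paragraph's tokens, is a simplified stand-in and should be read as an assumption. Topic1 comes from the abstract; the second topic, the sample paragraph, and both function names are hypothetical.

```python
def paragraph_score(tokens, topic_terms):
    """Average probability of the paragraph's tokens under a topic's term table.

    Tokens that are not topic terms contribute 0. This is an assumed scoring
    rule, not the matching algorithm proposed in the paper.
    """
    if not tokens:
        return 0.0
    return sum(topic_terms.get(token, 0.0) for token in tokens) / len(tokens)


def best_topic(tokens, topics):
    """Return the label of the topic that scores highest for the paragraph."""
    return max(topics, key=lambda label: paragraph_score(tokens, topics[label]))


# Topic1 is taken from the abstract; "Social Welfare" is a made-up second topic.
topics = {
    "Unemployment Problem": {"unemployment": 0.4, "layoff": 0.3, "business": 0.3},
    "Social Welfare": {"welfare": 0.5, "pension": 0.5},
}
paragraph = "Up to 300 workers lost their jobs after a layoff"
tokens = paragraph.lower().split()
label = best_topic(tokens, topics)
```

Running the assignment for every paragraph and grouping the results by label would give each topic its list of best-matched paragraphs.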