A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)
Journal of Intelligence and Information Systems, v.27 no.3, pp.57-73, 2021
Preventing failures and maintaining ICT infrastructure through anomaly detection is becoming increasingly important. System monitoring data are multidimensional time series, and it is difficult to account for the characteristics of multidimensional data and of time series data at the same time. With multidimensional data, the correlation between variables must be considered. Existing probability-based, linear, and distance-based methods degrade because of the curse of dimensionality. Time series data, in turn, are usually preprocessed with sliding windows and time series decomposition for autocorrelation analysis. These techniques increase the dimension of the data, so they need to be supplemented. Anomaly detection is a long-established research field; early work used statistical methods and regression analysis, and there is now active research on applying machine learning and artificial neural networks. Statistical methods are difficult to apply when the data are non-homogeneous, and they do not detect local outliers well. Regression-based methods learn a regression formula based on parametric statistics and detect anomalies by comparing predicted and actual values. Their performance drops when the model is not robust or when the data contain noise or outliers, so the training data must be free of noise and outliers. An autoencoder built from artificial neural networks is trained to produce output as similar as possible to its input. It has many advantages over existing probability and linear models, cluster analysis, and supervised learning. It can be applied to data that do not satisfy a probability distribution or linear assumption. 
In addition, it can be trained in an unsupervised manner, without labeled data. However, autoencoder-based anomaly detection still has difficulty identifying local outliers in multidimensional data, and the dimension of the data grows greatly because of the characteristics of time series data. In this study, we propose a Conditional Multimodal Autoencoder (CMAE) that improves anomaly detection performance by considering local outliers and time series characteristics. First, we applied a Multimodal Autoencoder (MAE) to address the limitation in identifying local outliers in multidimensional data. Multimodal models are commonly used to learn from different types of input, such as voice and image. The different modalities share the autoencoder's bottleneck and thereby learn their correlation. In addition, a Conditional Autoencoder (CAE) was used to learn the characteristics of time series data effectively without increasing the dimension of the data. Conditional inputs are usually categorical variables, but in this study time was used as the condition in order to learn periodicity. The proposed CMAE was validated by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). The reconstruction performance for 41 variables was examined in the proposed model and the comparison models. Reconstruction performance differs by variable: in all three autoencoder models the loss is small for the Memory, Disk, and Network modalities, so reconstruction works well for them. The Process modality showed no significant difference among the three models, and the CPU modality was reconstructed best by CMAE. To evaluate anomaly detection performance, ROC curves were drawn for the proposed and comparison models, and AUC, accuracy, precision, recall, and F1-score were compared. On every metric, performance ranked in the order CMAE, MAE, UAE. 
In particular, the recall of CMAE was 0.9828, which indicates that it detects almost all anomalies. Its accuracy also improved to 87.12%, and its F1-score was 0.8883, so the model is considered suitable for anomaly detection. From a practical standpoint, the proposed model has advantages beyond the performance improvement. Techniques such as time series decomposition and sliding windows add procedures that must be managed, and the resulting increase in dimension can slow down inference. The proposed model avoids both, which makes it easy to apply in practice in terms of inference speed and model management.
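The abstract says that time is used as the condition input so the model can learn periodicity without widening the data through sliding windows. It does not say how time is encoded. The sketch below shows one common option, a sine/cosine encoding of the hour of day, purely as an assumption; the function names, the 24-hour period, and the choice to append the condition to the feature vector are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def time_condition(hour: float, period: float = 24.0) -> Tuple[float, float]:
    """Map a time of day onto the unit circle as (sin, cos).

    Assumed encoding, not the paper's: it keeps 23:30 and 00:30 close
    together, which a raw hour value would not.
    """
    angle = 2.0 * math.pi * (hour % period) / period
    return (math.sin(angle), math.cos(angle))


def conditioned_input(features: Sequence[float], hour: float) -> List[float]:
    """Append the two condition values to one observation's features."""
    return list(features) + list(time_condition(hour))


# One observation with 41 monitored variables (placeholder zeros) at 13:30.
x = conditioned_input([0.0] * 41, 13.5)
print(len(x))  # 41 features plus 2 condition values
```

With this kind of encoding the condition adds two inputs per observation regardless of how long the period is, whereas a sliding window of width w multiplies the input width by w.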
The evolution of instant communication has mirrored the development of the Internet, and messenger applications are among the most representative instant communication technologies. In messenger applications, senders use emoticons to supplement the emotions conveyed in the text of their messages, because communication that is not face-to-face makes it difficult to convey emotions to recipients. Emoticons have long been used as symbols that indicate the mood of the speaker. At present, however, emoticon use is evolving into a means of conveying the psychological states of consumers who want to express individual characteristics and personality quirks while communicating their emotions to others. The fact that companies such as KakaoTalk, Line, and Apple have entered the emoticon business, and that sales of related content are expected to increase gradually, testifies to the significance of this phenomenon. Nevertheless, despite the development of emoticons and the growth of the emoticon market, no suitable emoticon recommendation system has yet been developed. Even KakaoTalk, a messenger application with more than 90% of the domestic market in South Korea, merely groups emoticons into categories such as popular, most recent, or brief. Consumers therefore face the inconvenience of constantly scrolling to locate the emoticons they want. An emoticon recommendation system would improve consumer convenience and satisfaction and increase the sales revenue of companies that sell emoticons. To recommend appropriate emoticons, it is necessary to quantify the emotions that consumers perceive in emoticons. Such quantification makes it possible to analyze the characteristics and emotions felt by consumers who used similar emoticons, which in turn supports emoticon recommendations. One way to quantify emoticon use is metadata-ization. 
Metadata-ization is a means of structuring or organizing unstructured and semi-structured data to extract meaning. By structuring unstructured emoticon data through metadata-ization, we can easily classify emoticons based on the emotions consumers want to express. To determine the precise emotions of emoticons, we had to consider detailed sub-expressions: not only the seven common emotional adjectives but also the metaphorical expressions that appear only in Korean, as established by previous emotion studies focusing on emoticon characteristics. We therefore collected the detailed sub-expressions of emotion based on "Shape", "Color" and "Adumbration". Moreover, to design a highly accurate recommendation system, we considered both emoticon-technical indexes and emoticon-emotional indexes. We identified 14 features as emoticon-technical indexes and selected 36 emotional adjectives. The 36 adjectives formed contrasting pairs, which we reduced to 18, and we measured these 18 emotional adjectives using 40 emoticon sets randomly selected from the top-ranked emoticons in the KakaoTalk shop. We surveyed 277 consumers in their mid-twenties who had experience purchasing emoticons; we recruited them online and asked each to evaluate five different emoticon sets. After data acquisition, we conducted a factor analysis of the emoticon-emotional indexes and extracted four factors, which we named "Comic", "Softness", "Modernity" and "Transparency". We analyzed both the relationship between the indexes and consumer attitude and the relationship between emoticon-technical indexes and emoticon-emotional factors. Through this process, we confirmed that the emoticon-technical indexes did not directly affect consumer attitudes but affected them indirectly, with the emoticon-emotional factors acting as mediators. 
The results of the analysis revealed the mechanism consumers use to evaluate emoticons: the emoticon-technical indexes affected the emoticon-emotional factors, and the emoticon-emotional factors affected consumer satisfaction. We therefore designed the emoticon recommendation system using only the four emoticon-emotional factors, with a recommendation method that calculates the Euclidean distance between emoticons over their scores on each factor. To increase the accuracy of the emoticon recommendation system, we compared the emotional patterns of the selected emoticons with those of the recommended emoticons; the patterns corresponded in principle. We verified the system by testing prediction accuracy: the predictions were 81.02% accurate in the first test, 76.64% in the second, and 81.63% in the third. This study developed a methodology that can be used in various fields, both academically and practically. We expect that the emoticon recommendation system we designed will increase emoticon sales for companies in this domain and make consumer experiences more convenient. In addition, this study serves as an important first step in the development of an intelligent emoticon recommendation system. The emotional factors proposed here could be collected in an emotional library that serves as an emotion index for evaluating newly released emoticons. Moreover, by combining the accumulated emotional library with company sales data, sales information, and consumer data, companies could develop hybrid recommendation systems that bolster convenience for consumers and serve as intellectual assets that companies can deploy strategically.
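The recommendation method in this abstract rests on Euclidean distance over the four emotional factors. A minimal sketch of that idea follows, assuming each emoticon set is described by one score per factor and that the nearest sets to a chosen one are recommended. The catalog names and all scores are invented for illustration, and the function `recommend` is hypothetical rather than the authors' implementation.

```python
import math
from typing import Dict, List, Sequence

# Factor order used for every score vector below.
FACTORS = ("Comic", "Softness", "Modernity", "Transparency")


def recommend(target: Sequence[float],
              catalog: Dict[str, Sequence[float]],
              k: int = 3) -> List[str]:
    """Return the k catalog entries nearest to target in factor space."""
    ranked = sorted(catalog, key=lambda name: math.dist(target, catalog[name]))
    return ranked[:k]


# Made-up factor scores for three hypothetical emoticon sets.
catalog = {
    "set_a": (0.9, 0.2, 0.1, 0.3),
    "set_b": (0.1, 0.8, 0.4, 0.6),
    "set_c": (0.8, 0.3, 0.2, 0.2),
}

# Scores of the emoticon set the consumer just selected (also made up).
selected = (0.9, 0.25, 0.1, 0.3)
print(recommend(selected, catalog, k=2))
```

Plain Euclidean distance weights the four factors equally; if the factor scores are on different scales they would need to be standardized first.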
Objectives: This study analyzes the length of hospital stay and the expenses for frequent diseases among patients admitted to a Veterans Hospital. Methods: Data were collected from the claim records of 9,640 patients of a Veterans Hospital from January 1, 2001 to December 31, 2003. Results: The results were as follows:
1. By sex and age, 70.9% of patients were male and 29.1% female, and 35.8% were in the 70s age group. By insurance program, 78.1% had Health insurance, 14.2% Medical aid, 4.1% no insurance, and 3.6% others. By department, the distribution was internal medicine 28.3%, orthopedic surgery 21.3%, surgery 16.6%, neurosurgery 7.1%, and pediatrics 5.9%. In the veterans group, 99.3% were male and 0.7% female, 51.6% were over 70 years old, and 43.5% lived in Daejeon.
2. The most frequent diseases were gastroenteritis 4.8%, pneumonia 3.8%, cataract 3.7%, cerebral infarct 3.2%, and hyperplasia of prostate 3.0%. By the Korean standard classification of diseases, the most frequent categories were injury and poisoning and certain other consequences of external causes 17.1%, diseases of digestive system 16.1%, diseases of musculoskeletal system and connective tissue 13.9%, diseases of respiratory system 9.4%, and diseases of genitourinary system 8.6%. In the veterans group, they were diseases of musculoskeletal system and connective tissue 19.4%, diseases of digestive system 16.8%, injury and poisoning and certain other consequences of external causes 15.7%, diseases of genitourinary system 9.7%, and diseases of circulatory system 8.2%.
3. The average length of hospital stay was 29.0 days for all patients, 51.8 days for the veterans group, and 15.7 days for the non-veterans group. The average total expenses were 3,669,579 won for all patients, 7,263,877 won for the veterans group, and 1,560,333 won for the non-veterans group. The ratio of insurer to insuree was 55.2 : 44.8; the ratio of the amount paid by the patient was 61.7% in the veterans group and 33.0% in the non-veterans group.
4. Among the items of medical expenses, the fee for hospital accommodation was 34.7%, the fee for medication 13.2% (injection 7.8%, drug 5.4%), the fee for service 48.6% (physical therapy 26.3%, operation 9.7%, laboratory examination 5.2%, radiological examination 3.1%, etc.), and others 3.4%. For the veterans group, the fee for physical therapy was 35.3%, hospital accommodation 35.2%, injection 6.2%, and operation 5.9%; for the non-veterans group, the fee for hospital accommodation was 35.7%, operation 16.4%, injection 11.4%, and laboratory examination 8.3%.
5. Comparing disease frequency by the Korean standard classification of diseases with the distance between the hospital and home, patients living within 21.5 km were more frequent for symptoms, signs and abnormal clinical and laboratory findings 56.0%, injury and poisoning and certain other consequences of external causes 55.6%, and diseases of the eye and adnexa 52.9%, while patients living more than 21.5 km away were more frequent for neoplasms 57.4%, diseases of musculoskeletal system and connective tissue 55.9%, and diseases of genitourinary system 53.5%.
As social data come into the spotlight, mainstream web search engines provide data that indicate how many people searched for a specific keyword: web search traffic data. Web search traffic information aggregates the crowd of users who search for a specific keyword. In various areas, web search traffic can serve as a useful variable that represents the attention of ordinary users to specific interests. Many studies use web search traffic data to nowcast or forecast social phenomena, for example in epidemic prediction, consumer pattern analysis, product life cycle analysis, and financial investment modeling. Web search traffic data have also begun to be applied to predicting inbound tourism. Proper demand prediction is needed because tourism is a high value-added industry that increases employment and foreign exchange earnings. Among inbound tourists, the number of Chinese tourists (Youke) in particular is growing continuously; Youke have been the largest inbound tourist group in Korean tourism for many years, and they also lead in tourism profit per tourist. Research into proper approaches to predicting Youke demand is therefore important for both the public and private sectors. Accurate tourism demand prediction is important for efficient decision making with limited resources. This study proposes an improved model that reflects the latest social issues by capturing the attention of groups of individuals. Travel abroad is generally a high-involvement activity, so potential tourists are likely to search deeply for information about their trip. Web search traffic data capture tourists' attention during trip preparation in an instantaneous and dynamic way. This study therefore attempted to select keywords that potential Chinese tourists are likely to search for on the internet. Baidu, the largest web search engine in China with a share of over 80%, provides users with access to web search traffic data. 
Qualitative interviews with potential tourists helped us understand information search behavior before a trip and identify the keywords for this study. The selected keywords were categorized into three levels according to how directly they relate to "Korean Tourism". Classifying the keywords helps determine which of them can explain Youke inbound demand, from the closest category to the farthest. The web search traffic data for each keyword were gathered by a web crawler developed to collect search data from Baidu Index. Using the automatically gathered variables, a linear model was built by multiple regression analysis, which suits operational decision and policy making because the relationships between variables are easy to explain. After the regression models were built, a model composed of traditional variables was compared with a model that adds web search traffic variables to the traditional model, in terms of significance and R squared. After comparing the performance of the models, the final model was composed. The final regression model has better explanatory power than the traditional model, along with the advantages of real-time immediacy and convenience. Furthermore, this study demonstrates a system that is visualized intuitively for general use: the Youke Mining solution offers several functions for tourism decision making, including the embedded final regression model. The Youke Mining solution combines algorithms based on data science with a well-designed, simple interface. In the end, this research has three significant implications, in its theoretical, practical, and policy aspects. Theoretically, the Youke Mining system and the model in this research are a first step toward predicting Youke inbound demand using an interactive and instant variable: web search traffic information, which represents tourists' attention while they prepare their trip. Baidu accounts for more than 80% of the web search engine market. 
Practically, Baidu data can represent in real time the attention of potential tourists who are preparing their own tour. Finally, in terms of policy, the designed model for predicting Chinese tourist demand based on web search traffic can be used in tourism decision making to manage resources efficiently and to optimize opportunities for successful policy.
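The model comparison described in this abstract, a regression on traditional variables versus the same regression with web search traffic variables added, judged by R squared, can be sketched as follows. This is an illustration only: the data are invented, the variable names are hypothetical, and ordinary least squares is solved here with plain normal equations in the standard library instead of a statistics package. The sketch does not test coefficient significance and does not handle collinear predictors.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """Fit y on the predictor columns plus an intercept and return R^2."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Invented monthly data: one traditional variable, one search-traffic index.
traditional = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
search_index = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
arrivals = [9.0, 8.0, 19.0, 18.0, 29.0, 28.0]

r2_base = r_squared([traditional], arrivals)
r2_aug = r_squared([traditional, search_index], arrivals)
print(r2_base, r2_aug)
```

Because adding a predictor never lowers R squared on the fitting data, a comparison like this is usually paired with a significance test or adjusted R squared, which is consistent with the abstract's mention of significance alongside R squared.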
To obtain basic information on Korean local corn lines, a total of 57 lines were selected from the 1,000 Korean local collections at Chungnam National University, classified by principal component analysis, and investigated for their genetic nature. The results are summarized as follows.
1. There was great variation in the mean values of plant characters among the lines. The mean values of all plant characters except density of kernels varied with the type of crossing. All characters except tasselling date decreased in magnitude when selfed and increased when topcrossed.
2. The correlation coefficients among the characters studied ranged from 0.99 to -0.59 and did not change greatly with the type of cross.
3. To classify the lines more effectively, 12 selected plant characters were used to classify the 57 local lines by principal component analysis. The first four components explained 86.4%, 83.4% and 81.1% of the total variation in sibbed lines, selfed lines and topcrossed lines, respectively.
4. The contribution of characters to the principal components was high for the upper principal components and low for the lower ones.
5. The biological meaning of the principal components and the plant types corresponding to each principal component were explained clearly by the correlation coefficients between principal components and characters. The first principal component appeared to correspond to the size of plant and ear. The second principal component appeared to correspond to the degree of differentiation in organs and the duration of the vegetative growing period. The biological meaning of the third and fourth principal components, however, was not clear.
6. The lines were classified into 4 lineal groups by taxonomic distance. Group I included 52 lines (91.2% of the total), group II 3 lines, group III 1 line and group IV 1 line. The four groups can be characterized as follows. Group I: early maturity, short-culmed, medium height plant, small ears, medium kernels and medium yielding. Group II: late maturity, medium height plant, small ears, small kernels, prolific ears and higher yielding. Group III: medium maturity, tall-culmed, small ears, small kernels and low yielding. Group IV: medium maturity, tall-culmed, large ears, one-ear plant and medium yielding.
7. Inbreeding depression varied with plant characters and lines. Characters such as yield, kernel weight per ear, ear weight and plant height showed a great degree of inbreeding depression. Group I showed high inbreeding depression in characters such as 100 kernel weight, leaf number, plant height and days to tasselling, while group II showed high inbreeding depression in the other plant characters.
8. Heterosis of plant characters also varied with lines. Ear weight, kernel weight per ear, yield, 100 kernel weight, and plant height were among the plant characters showing high heterosis. Group II showed high heterosis in characters such as ear length, ear diameter, ear weight, kernel weight per ear, 100 kernel weight, and leaf length, while group I showed high heterosis in the other plant characters.
9. The degree of homozygosity was highest in ear weight (79.1%) and lowest in ear number per plant (-21%). Group II showed a higher degree of homozygosity than group I.
10. Correlation coefficients between characters of sibbed and topcrossed lines were positive for all characters. Highly significant correlation coefficients between sibbed and topcrossed lines were obtained especially for characters such as ear number per plant, plant height, leaf length and yield per plot.
1. If a value of unity is assigned to prongs whose ends touch each other, for the purpose of estimating the internal stresses occurring in them, the internal stresses developed in open prongs can be evaluated as a ratio to unity. In accordance with the above statement, an equation was derived as follows. To use this equation, the prongs should be made as shown in Fig. 1, and A and B' should be measured as indicated in Fig. 1. A more precise value will result as the angle (J becomes smaller.
AirAsia Flight QZ8501 departed from Juanda International Airport in Surabaya, Indonesia at 05:35 on Dec. 28, 2014 and was scheduled to arrive at Changi International Airport in Singapore at 08:30 the same day. The aircraft, an Airbus A320-200 carrying 162 passengers and crew, crashed into the Java Sea on Dec. 28, 2014, off the coast of Surabaya, Indonesia's second largest city, on its way to Singapore. It lost contact with ground control and vanished from radar 42 minutes after departure. The aircraft's debris was found about 66 miles from the plane's last detected position. Flight QZ8501 had on board 155 passengers (137 adults, 17 children and one infant) and seven crew members (two pilots and five other crew members), a majority of them Indonesian nationals. On board were 155 Indonesians, three South Koreans, and one person each from Singapore, Malaysia and the UK. Malaysia Airlines Flight 370 departed from Kuala Lumpur International Airport on March 8, 2014 at 00:41 local time and was scheduled to land at Beijing's Capital International Airport at 06:30 local time. The flight, also marketed as China Southern Airlines Flight 748 (CZ748) through a code-share agreement, was a scheduled international passenger flight that disappeared en route from Kuala Lumpur International Airport to Beijing's Capital International Airport (a distance of 2,743 miles; 4,414 km). The aircraft, a Boeing 777-200ER operated by Malaysia Airlines (MAS), last made contact with air traffic control less than an hour after takeoff. It carried 12 crew members and 227 passengers from 15 nations; according to records, the passengers included 153 Chinese and 38 Malaysians. 
Nearly two-thirds of the passengers on Flight 370 were from China. On April 5, 2014, what could be the wreckage of the ill-fated Malaysia Airlines aircraft was found: what appeared to be the remnants of Flight MH370 were spotted drifting in a remote section of the Indian Ocean. Compensation for loss of life differs vastly between U.S. passengers and non-U.S. passengers. "If the claim is brought in the U.S. court, it's of significantly more value than if it's brought into any other court." Some victims and survivors in the Indonesian and Malaysian air crash cases would like to bring suit in a United States court, rather than in an Indonesian or Malaysian court, in order to receive larger compensation for the damage caused by the accidents that occurred in the Java Sea and the Indian Ocean. Under Article 21(1) of the 1999 Montreal Convention (an absolute, strict, no-fault liability system), each victim and survivor will receive an unconditional 113,100 Units of Account (SDR) as compensation for damage from Indonesia AirAsia and Malaysia Airlines. Under Article 21(2) (a presumed-fault system), however, Indonesia AirAsia and Malaysia Airlines will bear unlimited liability to each victim and survivor unless they can prove one of the following two points: (1) the damage was not due to the negligence or other wrongful act or omission of the air carrier or its servants or agents, or (2) the damage was solely due to the negligence or other wrongful act or omission of a third party. 
In this researcher's view, for the aforementioned reasons and under the laws of China, Indonesia, Malaysia and Korea, some Chinese, Indonesian, Malaysian and Korean victims and survivors of the crashes of the two flights may be entitled to receive from more than 113,100 SDR up to 5 million US$ from the two airlines or from the aviation insurance company, based on the decision of an American court. It could also be argued that it is reasonable and necessary to revise the clause referring to bodily injury in Article 17 of the 1999 Montreal Convention into a clause referring to personal injury, so that mental injury and condolence are included in the near future.
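The two-tier liability structure of Article 21 described in this abstract can be expressed as a small decision rule. The sketch below is a simplified reading, not legal advice and not the author's model: it assumes that recoverable damages must still be proven within the first tier, it uses the 113,100 SDR threshold quoted in the abstract, and it ignores other defences such as the passenger's own contributory negligence. The function and parameter names are hypothetical.

```python
# First-tier threshold per passenger, in SDR, as quoted in the abstract.
FIRST_TIER_SDR = 113_100.0


def carrier_liability(proven_damages_sdr: float,
                      carrier_proves_defence: bool) -> float:
    """Amount recoverable from the carrier for one passenger, in SDR.

    carrier_proves_defence is True when the carrier proves either that the
    damage was not due to its negligence or wrongful act or omission, or
    that it was solely due to a third party.
    """
    if proven_damages_sdr <= FIRST_TIER_SDR:
        # First tier: the carrier cannot exclude or limit liability.
        return proven_damages_sdr
    if carrier_proves_defence:
        # Second tier defeated: recovery stops at the first-tier threshold.
        return FIRST_TIER_SDR
    # Second tier applies: liability is unlimited, so full proven damages.
    return proven_damages_sdr


for damages, defence in [(50_000.0, True), (500_000.0, True), (500_000.0, False)]:
    print(damages, defence, carrier_liability(damages, defence))
```

The rule shows why the burden of proof matters so much in these cases: above the threshold, the outcome depends entirely on whether the carrier can establish one of the two defences.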
This study analyzes the rhythm of six pieces of PanYeombul sung by Kim Janggil in the Ogu exorcism of the East Coast in order to find their correlation with Buddhist music. The findings are summarized as follows. First, PanYeombul by Kim Janggil, performed on Oct. 16, 2016, was composed of
Thecodiplosis japonensis is sweeping through the Pinus densiflora forests from southwest to northeast, destroying almost all the large aged trees and even the young ones. The front line of infestation is moving slowly but ceaselessly northwards like a long battle front. It is estimated that more than 40 percent of the area of P. densiflora forest has already been damaged; however, some individuals escape the damage and contribute to restoring the site to its previous vegetation composition. When stands are attacked by this insect, the upper canopy story, formed exclusively by P. densiflora, usually opens drastically, and environmental factors such as light, temperature, litter accumulation, soil moisture and others are naturally modified. With these changes after insect invasion, phytosociological changes in the vegetation proceed gradually as time passes. If forests are classified by the history of the insect outbreak into non-attacked (healthy) forest and four progressive stages, namely recently damaged (the outbreak occurred about 1-2 years ago), severely damaged (occurred 5-6 years ago), damage prolonged (occurred 10 years ago) and restored (occurred about 20 years ago), any directional changes in vegetation composition can be traced along these four stages. To elucidate these changes, three survey districts were set: (1) "Gongju", where the damage was severe and the outbreak occurred in 1977, (2) "Buyeo", where the damage was prolonged, and (3) "Gochang", as restored (see Tab. 1). All of these are located in the south temperate forest zone, which is delimited mainly by the temperature factor and is generally accepted at present without opposition. In view of temperature, the amount and distribution of precipitation, and various soil factors, the overall homogeneity of environmental conditions among the survey districts may be accepted. 
However, this did not mean that small changes in edaphic and topographic conditions and microclimates could not induce any alteration of vegetation patterns. Four survey plots were set in each district, with an inter-plot distance of 3 to 4 km, and four subplots were set within each survey plot. The size of a subplot was
The wall shear stress in the vicinity of end-to-end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer (FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flow with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor in anastomotic neointimal fibrous hyperplasia (ANFH) formation and graft failure. The present study suggests a correlation between regions of low wall shear stress and the development of ANFH in end-to-end anastomoses.
Air pressure decay (APD) rate and ultrafiltration rate (UFR) tests were performed on new and saline-rinsed dialyzers as well as on those reused in patients several times. Reused C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline-rinsed and reused dialyzers showed a considerable amount of decay. C-DAK dialyzers had a larger APD(11.70