• Title/Summary/Keyword: log data analysis

Integrated Monitoring System using Log Data (로그 데이터를 이용한 통합모니터링 시스템)

  • Jeon, Byung-Jin;Yoon, Deok-Byeong;Shin, Seung-Soo
    • Journal of Convergence for Information Technology / v.7 no.1 / pp.35-42 / 2017
  • In this paper, we propose an integrated monitoring system that uses log data to reduce the analysis workload of information security officers and to detect information leaks in advance. To do this, we developed a transmission module between heterogeneous DBMSs that transfers the large volume of log data generated by the individual security system (MSSQL) to the integrated monitoring system (ORACLE). The transmitted log data is converted to numerical values for each individual, and we study continuous inspection of, and countermeasures against, malicious users when symptoms of information leakage are detected in the numerical data.

Analysis of Web Log Using Clementine Data Mining Solution (클레멘타인 데이터마이닝 솔루션을 이용한 웹 로그 분석)

  • Kim, Jae-Kyeong;Lee, Kun-Chang;Chung, Nam-Ho;Kwon, Soon-Jae;Cho, Yoon-Ho
    • Information Systems Review / v.4 no.1 / pp.47-67 / 2002
  • Since the mid-1990s, most firms that use the web as a communication channel with customers have been keenly interested in web log files, which contain many of the traces customers leave on the web, such as IP address, referrer address, cookie files, and visit duration. An appropriate analysis of the web log file therefore leads to an understanding of customers' behavior on the web, and the results can serve as effective marketing information for locating potential target customers. In this study, we introduce a web mining technique using Clementine from SPSS and analyze a real web log data file from an Internet hub site. We also suggest a process for building various strategies based on the web mining results.

A Development Study of Tool for Web Log Analysis

  • Choi, Seungbae;Kang, Changwan;Kim, Kyukon;Son, Jongkwan
    • Communications for Statistical Applications and Methods / v.11 no.1 / pp.93-106 / 2004
  • Recently, with advances in computing, large amounts of data of various types have been collected in many fields. In particular, web log data generated by a web site provides useful information about an organization, and an enterprise's fate can depend on how it uses the information gained from its web site. In this paper, we present a tool called WebBizi for web log analysis, intended to extract such useful information. It should be helpful to enterprises that operate web sites.

Comparison of Methods for Reducing the Dimension of Compositional Data with Zero Values

  • Song, Taeg-Youn;Choi, Byung-Jin
    • Communications for Statistical Applications and Methods / v.19 no.4 / pp.559-569 / 2012
  • Compositional data consist of compositions that are non-negative vectors of proportions with the unit-sum constraint. In disciplines such as petrology and archaeometry, it is fundamental to statistically analyze this type of data. Aitchison (1983) introduced a log-contrast principal component analysis that involves logratio transformed data, as a dimension-reduction technique to understand and interpret the structure of compositional data. However, the analysis is not usable when zero values are present in the data. In this paper, we introduce 4 possible methods to reduce the dimension of compositional data with zero values. Two real data sets are analyzed using the methods and the obtained results are compared.
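To see why zeros block a log-ratio analysis, and what a workaround can look like, here is a minimal sketch that applies a centered log-ratio (clr) transform after a simple multiplicative zero replacement. The replacement value `delta` and the function names are assumptions made for illustration. This is one commonly used workaround and is not necessarily one of the four methods the paper compares.

```python
import math

def replace_zeros(composition, delta=1e-3):
    """Multiplicative zero replacement (illustrative assumption).

    Each zero part becomes `delta`; the non-zero parts are scaled down
    so that a composition summing to 1 still sums to 1.
    """
    n_zero = sum(1 for x in composition if x == 0)
    scale = 1.0 - n_zero * delta
    return [delta if x == 0 else x * scale for x in composition]

def clr(composition):
    """Centered log-ratio transform: log of each part minus the mean log.

    Raises ValueError if any part is zero, since log(0) is undefined.
    """
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical three-part composition with one zero part.
raw = [0.6, 0.4, 0.0]
adjusted = replace_zeros(raw)
transformed = clr(adjusted)
```

Calling `clr(raw)` directly would fail on the zero part, which is the problem the abstract describes. The clr-transformed values always sum to zero.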

THE STUDY OF FLOOD FREQUENCY ESTIMATES USING CAUCHY VARIABLE KERNEL

  • Moon, Young-Il;Cha, Young-Il;Ashish Sharma
    • Water Engineering Research / v.2 no.1 / pp.1-10 / 2001
  • The frequency analyses for the precipitation data in Korea were performed. We used daily maximum series, monthly maximum series, and annual series. For nonparametric frequency analyses, variable kernel estimators were used. Nonparametric methods do not require assumptions about the underlying populations from which the data are obtained. Therefore, they are better suited for multimodal distributions with the advantage of not requiring a distributional assumption. In order to compare their performance with parametric distributions, we considered several probability density functions. They are Gamma, Gumbel, Log-normal, Log-Pearson type III, Exponential, Generalized logistic, Generalized Pareto, and Wakeby distributions. The variable kernel estimates are comparable and are in the middle of the range of the parametric estimates. The variable kernel estimates show a very small probability in extrapolation beyond the largest observed data in the sample. However, the log-variable kernel estimates remedied these defects with the log-transformed data.
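A variable kernel density estimate with a Cauchy kernel, and its log-transformed variant, can be sketched as follows. The abstract does not state the bandwidth rule, so the k-nearest-neighbor local bandwidth used here, the parameters `h` and `k`, and all function names are assumptions for illustration only.

```python
import math

def cauchy_kernel(u):
    """Standard Cauchy density, used as the kernel function."""
    return 1.0 / (math.pi * (1.0 + u * u))

def knn_distance(data, i, k):
    """Distance from data[i] to its k-th nearest neighbor (k >= 1)."""
    dists = sorted(abs(data[i] - x) for j, x in enumerate(data) if j != i)
    return dists[k - 1]

def variable_kernel_density(x, data, h=1.0, k=1):
    """Variable kernel estimate: each point gets its own bandwidth.

    The local bandwidth is h times the k-th nearest-neighbor distance
    (an assumed rule). Assumes the data values are distinct, so that
    no local bandwidth is zero.
    """
    total = 0.0
    for i, xi in enumerate(data):
        hi = h * knn_distance(data, i, k)
        total += cauchy_kernel((x - xi) / hi) / hi
    return total / len(data)

def log_variable_kernel_density(x, data, h=1.0, k=1):
    """Estimate on log-transformed data, mapped back to the original scale.

    If Y = log X, then f_X(x) = f_Y(log x) / x for x > 0.
    """
    log_data = [math.log(v) for v in data]
    return variable_kernel_density(math.log(x), log_data, h, k) / x
```

Working on the log scale keeps the estimate on positive values and changes how the tail beyond the largest observation behaves, which is the remedy the abstract refers to.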

A Study on Data Pre-filtering Methods for Fault Diagnosis (시스템 결함원인분석을 위한 데이터 로그 전처리 기법 연구)

  • Lee, Yang-Ji;Kim, Duck-Young;Hwang, Min-Soon;Cheong, Young-Soo
    • Korean Journal of Computational Design and Engineering / v.17 no.2 / pp.97-110 / 2012
  • High performance sensors and modern data logging technology with real-time telemetry facilitate system fault diagnosis in a very precise manner. Fault detection, isolation and identification in fault diagnosis systems are typical steps to analyze the root cause of failures. This systematic failure analysis provides not only useful clues to rectify the abnormal behaviors of a system, but also key information to redesign the current system for retrofit. The main barriers to effective failure analysis are: (i) the gathered data (event) logs are too large in general, and further (ii) they usually contain noise and redundant data that make precise analysis difficult. This paper therefore applies suitable pre-processing techniques to data reduction and feature extraction, and then converts the reduced data log into a new format of event sequence information. Finally the event sequence information is decoded to investigate the correlation between specific event patterns and various system faults. The efficiency of the developed pre-filtering procedure is examined with a terminal box data log of a marine diesel engine.
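As a generic illustration of reducing a redundant event log and turning it into an event sequence, the sketch below collapses consecutive repeats of the same event code. The record layout `(timestamp, event_code)` and the function names are hypothetical; the abstract does not specify the paper's actual pre-filtering procedure, so this should not be read as the authors' method.

```python
from itertools import groupby

def reduce_event_log(records):
    """Collapse runs of consecutive identical event codes.

    `records` is a time-ordered list of (timestamp, event_code) tuples.
    Returns (first_timestamp, event_code, run_length) for each run.
    """
    reduced = []
    for code, run in groupby(records, key=lambda r: r[1]):
        run = list(run)
        reduced.append((run[0][0], code, len(run)))
    return reduced

def to_event_sequence(reduced):
    """Keep only the ordered event codes from the reduced log."""
    return [code for _, code, _ in reduced]

# Hypothetical log: event "A" is reported twice in a row.
log = [(0, "A"), (1, "A"), (2, "B"), (3, "A")]
sequence = to_event_sequence(reduce_event_log(log))
```

The resulting sequence can then be scanned for event patterns that tend to precede a given fault.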

A Study of Web Usage Mining for eCRM

  • Hyuncheol Kang;Jung, Byoung-Cheol
    • Communications for Statistical Applications and Methods / v.8 no.3 / pp.831-840 / 2001
  • In this study, we introduce the process of web usage mining, which has recently attracted considerable attention with the rapid spread of the World Wide Web, and explain web log data, which is the main subject of web usage mining. We also illustrate some real examples of web log data analysis and look into practical applications of web usage mining for eCRM.

Mobile Gamer Categorization with Archetypal Analysis and Cognitive-Psychological Features from Log Data (로그 데이터의 유형분석 및 인지심리적 속성 추출을 이용한 모바일 게이머 유형화 연구)

  • Jeon, Jihoon;Yoon, Dumim;Yang, Seongil;Kim, Kyungjoong
    • Journal of KIISE / v.45 no.3 / pp.234-241 / 2018
  • Classifying gamer types and analyzing the characteristics of gamers are topics of interest for data analysis researchers, and much research has been done on gamer categorization and gamer analysis. However, most studies use surveys or bio-signals, which is not practical because it is difficult to obtain large amounts of data. Even when the game log is used, it is difficult to analyze the psychology of the gamer because the gamer is categorized and analyzed by extracting only statistical values. If we can extract cognitive-psychological information about the gamer from the basic game log, we can analyze the gamer more intuitively and easily. In this paper, we extracted eight cognitive-psychological features representing the behavior and psychological information of the gamer from the game log of Crazy Dragon, a mobile role-playing game (RPG). We then classified gamers based on these eight features and analyzed the resulting types. As a result, most gamers were highly correlated with one or two types.

On Sample Size Calculation in Bioequivalence Trials

  • Kang, Seung-Ho
    • Proceedings of the PSK Conference / 2003.04a / pp.117.2-118 / 2003
  • Sample size calculation plays an important role in bioequivalence trials and is determined by considering power under the alternative hypothesis. The regulatory guideline recommends that a $2 \times 2$ crossover design be conducted and that the raw data be log-transformed for statistical analysis. In this paper, we discuss sample size calculation in the $2 \times 2$ crossover design with log-transformed data.

Diagnosis Analysis of Patient Process Log Data (환자의 프로세스 로그 정보를 이용한 진단 분석)

  • Bae, Joonsoo
    • Journal of Korean Society of Industrial and Systems Engineering / v.42 no.4 / pp.126-134 / 2019
  • Now that big data is widely available, it can be used with analysis methods such as data mining to find information that improves design and operation. In particular, if we have event log data recording an organization's execution history, with fields such as case_id, event_time, event (activity), and performer, we can apply process mining to discover the organization's main process model. Once the main process has been found, it can be used to improve the current working environment. In this paper, we developed a new method, based on a process mining approach, for finding the final diagnosis of a patient who needs several procedures (medical tests and examinations) before the disease can be diagnosed. Some patients can be diagnosed with only one procedure, but others are very difficult to diagnose and need several procedures to identify the exact disease. We used 2 million procedure log records, which include 397 thousand patients who underwent two or more procedures before a final diagnosis was reached. These multi-procedure cases are not frequent, but they are critical for preventing misdiagnosis. From the multi-procedure patients, four procedures were discovered to form the main process model in the hospital. Using this main process model, we can understand the sequence of procedures in the hospital and the relationship between diagnoses and the corresponding procedures.
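The first steps of process mining on such an event log can be sketched as follows: group events into per-case traces, select the cases with two or more procedures, and count which procedure directly follows which. The field names `case_id`, `event_time`, and `event` come from the abstract; the sample records, procedure names, and function names are hypothetical, and the directly-follows count is a generic process mining building block rather than the paper's specific method.

```python
from collections import Counter, defaultdict

def build_traces(event_log):
    """Group events by case_id, ordered by event_time within each case."""
    traces = defaultdict(list)
    ordered = sorted(event_log, key=lambda r: (r["case_id"], r["event_time"]))
    for row in ordered:
        traces[row["case_id"]].append(row["event"])
    return dict(traces)

def multi_procedure_cases(traces, minimum=2):
    """Keep only cases with at least `minimum` procedures."""
    return {cid: ev for cid, ev in traces.items() if len(ev) >= minimum}

def directly_follows(traces):
    """Count how often one event is immediately followed by another."""
    counts = Counter()
    for events in traces.values():
        counts.update(zip(events, events[1:]))
    return counts

# Hypothetical procedure log for two patients.
log = [
    {"case_id": 1, "event_time": 2, "event": "MRI"},
    {"case_id": 1, "event_time": 1, "event": "X-ray"},
    {"case_id": 2, "event_time": 1, "event": "X-ray"},
]
traces = build_traces(log)
follows = directly_follows(multi_procedure_cases(traces))
```

The most frequent directly-follows pairs give a first approximation of the main process model, which can then be related to the final diagnosis recorded for each case.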