• Title/Summary/Keyword: Outlier analysis


The Assessing Comparative Study for Statistical Process Control of Software Reliability Model Based on Musa-Okumo and Power-law Type (Musa-Okumoto와 Power-law형 NHPP 소프트웨어 신뢰모형에 관한 통계적 공정관리 접근방법 비교연구)

  • Kim, Hee-Cheul
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.8 no.6 / pp.483-490 / 2015
  • Many software reliability models are based on the times at which errors occur during software debugging. Likelihood inference is shown to be possible for software reliability models based on finite failure models and non-homogeneous Poisson processes (NHPP). For someone deciding when to release software to market, the conditional failure rate is an important variable. Infinite failure models are used in a wide variety of practical situations; many studies document their use in characterization problems, outlier detection, linear estimation, system reliability, life testing, survival analysis, data compression, and other fields. Statistical process control (SPC) can monitor forecasts of software failures and thereby contribute significantly to improving software reliability. Control charts are widely used for software process control in the software industry. This paper proposes a control mechanism based on an NHPP, using the mean value functions of the Musa-Okumoto and power-law type models.
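As a rough illustration of the two mean value functions named in this abstract, the sketch below uses the forms commonly given for the Musa-Okumoto logarithmic model, m(t) = a ln(1 + bt), and the power-law model, m(t) = a t^b. The parameter names and the numeric values are assumptions made for illustration; the paper's own parameterization and its control-limit construction are not given in the abstract and may differ.

```python
import math

def musa_okumoto_mean(t, a, b):
    """Expected cumulative failures by time t: m(t) = a * ln(1 + b*t)."""
    return a * math.log(1.0 + b * t)

def musa_okumoto_intensity(t, a, b):
    """Failure intensity, the derivative of m(t): a*b / (1 + b*t)."""
    return a * b / (1.0 + b * t)

def power_law_mean(t, a, b):
    """Expected cumulative failures by time t: m(t) = a * t**b."""
    return a * t ** b

def power_law_intensity(t, a, b):
    """Failure intensity, the derivative of m(t): a*b * t**(b - 1)."""
    return a * b * t ** (b - 1.0)

# Hypothetical parameter values, chosen only to exercise the functions.
for t in (10.0, 100.0, 1000.0):
    print(t, musa_okumoto_mean(t, a=20.0, b=0.05), power_law_mean(t, a=1.5, b=0.6))
```

Neither mean value function is bounded as t grows, which is what places both models in the infinite failure class.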

Design of Anomaly Detection System Based on Big Data in Internet of Things (빅데이터 기반의 IoT 이상 장애 탐지 시스템 설계)

  • Na, Sung Il;Kim, Hyoung Joong
    • Journal of Digital Contents Society / v.19 no.2 / pp.377-383 / 2018
  • The Internet of Things (IoT) produces a wide variety of data as smart environments emerge. Collected IoT data are an important basis for judging a system's status, so it is important to monitor sensors for anomalous states in real time and to detect anomalous data. Because data structures and protocols vary, however, IoT data must first be converted into a normalized data structure for anomaly detection. Doing so can be expected to improve quality, including the quality of the analysis data and of the service. In this paper, we propose a big-data-based anomaly detection system that works on collected sensor data. The proposed system is applied to detect anomalies and maintain data quality. In addition, we applied a support vector machine model to anomaly detection on time-series data. As a result, machine learning on the preprocessed data was able to detect and predict anomalies accurately.

A Study on the Spatial Distribution Patterns of Urban Green Spaces Using Local Spatial Autocorrelation Statistics (국지적 공간자기상관통계를 이용한 도시녹지의 공간적 분포패턴에 관한 연구)

  • Kim, Yun-Ki
    • Journal of Cadastre & Land InformatiX / v.50 no.1 / pp.25-45 / 2020
  • The primary purpose of this study is to compare and analyze the performance of local spatial autocorrelation techniques in identifying the spatial distribution patterns of green spaces. To that end, the study uses satellite image analysis and spatial autocorrelation techniques. The results show that the LISA cluster map with spatial outlier clusters is superior to the other analytical methods in identifying the spatial distribution pattern of urban green space. The study contributes to related fields in that it uses several research methods that differ from existing ones. Despite this distinctiveness and usefulness, the study is limited by its reliance on low-resolution satellite imagery and on NDVI alone among vegetation indices. These limitations may be overcome in future studies by using UAV images or by using several vegetation indices simultaneously.
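The LISA cluster map mentioned in this abstract is built from local Moran's I. The sketch below is a minimal version, assuming a binary adjacency matrix, row-standardized spatial lags, and at least two distinct values; the function name and the sample data are hypothetical, and the permutation-based significance test that a real LISA map applies before labeling a location is left out.

```python
def local_morans_i(values, weights):
    """Return (I_i, quadrant) per location using row-standardized spatial lags.

    Quadrants: HH and LL are clusters; HL and LH are spatial outliers.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n  # second moment of the deviations
    results = []
    for i in range(n):
        row_sum = sum(weights[i])
        # Average deviation of the neighbors (0.0 for a location with no neighbors).
        lag = sum(weights[i][j] * z[j] for j in range(n)) / row_sum if row_sum else 0.0
        quadrant = ("H" if z[i] >= 0 else "L") + ("H" if lag >= 0 else "L")
        results.append((z[i] * lag / m2, quadrant))
    return results

# Hypothetical greenness scores for four cells in a row; neighbors are adjacent cells.
scores = [10, 10, 0, 10]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
for cell, (i_value, quadrant) in enumerate(local_morans_i(scores, adjacency)):
    print(cell, round(i_value, 3), quadrant)
```

A negative local I with an HL or LH label marks a location that differs from its neighbors, which is the spatial outlier case the abstract refers to.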

Development on Crop Yield Forecasting Model for Major Vegetable Crops using Meteorological Information of Main Production Area (주산지 기상정보를 활용한 주요 채소작물의 단수 예측 모형 개발)

  • Lim, Chul-Hee;Kim, Gang Sun;Lee, Eun Jung;Heo, Seongbong;Kim, Teayeon;Kim, Young Seok;Lee, Woo-Kyun
    • Journal of Climate Change Research / v.7 no.2 / pp.193-203 / 2016
  • Forecasting agricultural production is receiving growing attention as climate change accelerates. This study proposes three types of crop yield forecasting models for major vegetable crops using meteorological information for the main production areas downscaled to farmland level, addressing a limitation identified in previous studies. First, we analyzed the correlation between seven types of farm-level downscaled meteorological information and the reported crop yields of the main production areas. We then selected the three meteorological factors most strongly correlated with yield for each crop species and region. Parameters were derived from the highly correlated meteorological factors, but the number of crop species was not taken into account. The yield of each crop was then estimated with the three proposed model types. Chinese cabbage showed high accuracy overall, while the accuracy for daikon and onion improved considerably once outliers were excluded. Chili and garlic results differed by region, but chili in Kyungbuk and garlic in Chungnam and Kyungsang showed significant accuracy. We also identified, for each crop, the key meteorological factor most strongly related to yield. Where that factor is significantly related to yield, yield variation is better explained by variation in the key factor. This study contributes to the methodology of future studies by estimating the yields of different crop species using farmland meteorological information and relatively simple multiple linear regression models.

Background Concentration and Contamination Assessment of Heavy Metals in Korean Coastal Sediments (한반도 연안 퇴적물의 중금속 배경농도 및 오염도 평가)

  • WOO, JUNSIK;LEE, HYOJIN;PARK, JONGKYU;PARK, KYOUNGKYU;CHO, DONGJIN;JANG, DONGJUN;PARK, SOJUNG;CHOI, MANSIK;YOO, JEONGKYU
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.24 no.1 / pp.64-78 / 2019
  • Background concentrations of heavy metals in Korean coastal sediments were estimated from heavy metal data for 495 sediment samples obtained in the 'National Marine Ecosystem Survey (Coastal ecosystem) in 2016-2017', and the extent of contamination was assessed. Al, Cs, and Li were chosen as appropriate indicators of sediment grain size. In the relationships between heavy metal and indicator concentrations, the data with the lowest slope were selected through outlier removal and residual analysis, and the background concentrations were expressed as linear regression lines between metal and indicator. Compared with previous studies of background concentrations of heavy metals in Korean coastal sediments, the concentration levels were generally consistent; background concentrations for As and Cd, and those using Li as the indicator, are presented here for the first time.

Application of Discrete Wavelet Transforms to Identify Unknown Attacks in Anomaly Detection Analysis (이상 탐지 분석에서 알려지지 않는 공격을 식별하기 위한 이산 웨이블릿 변환 적용 연구)

  • Kim, Dong-Wook;Shin, Gun-Yoon;Yun, Ji-Young;Kim, Sang-Soo;Han, Myung-Mook
    • Journal of Internet Computing and Services / v.22 no.3 / pp.45-52 / 2021
  • Among the many studies on identifying unknown attacks in cybersecurity intrusion detection systems, outlier-based approaches are attracting attention. Accordingly, we identify outliers by defining categories of unknown attacks. Prior work on unknown attacks falls into two categories: studies of the factors that generate variant attacks, and studies that classify attacks into new types. Our outlier study belongs to the first category and targets data that resemble known attacks, such as variants. A major problem in identifying anomalies in an intrusion detection system is that normal and attack behavior share the same space. To address this, we applied the discrete wavelet transform to separate normal and attack behavior more clearly, and then detected anomalies. We confirmed that outliers can be identified with a One-Class SVM in the data reconstructed by the discrete wavelet transform.
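The abstract does not say which wavelet or decomposition depth the authors used. As a minimal sketch of the transform step only, the code below assumes a one-level Haar wavelet on an even-length sequence; the function names and the sample values are hypothetical, and the One-Class SVM stage is not shown.

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT of an even-length sequence -> (approximation, detail)."""
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt; pass zeros for `detail` to get a smoothed reconstruction."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend(((a + d) / s, (a - d) / s))
    return signal

# Hypothetical feature sequence (for example, packet counts per time window).
x = [4.0, 6.0, 10.0, 12.0, 3.0, 5.0]
approx, detail = haar_dwt(x)
smoothed = haar_idwt(approx, [0.0] * len(detail))  # drop the detail coefficients
print(smoothed)
```

Dropping the detail coefficients replaces each pair of samples with its mean; a reconstructed signal of this kind could then be passed to an outlier detector.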

Analysis of behavior by duration of extreme rainfall based on radar precipitation data (레이더 강수 데이터 기반 극한 강우의 지속시간별 거동 분석)

  • Soohyun Kim;Dongkyun Kim
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.116-116 / 2023
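  • Because the failure of hydraulic structures such as large dams causes substantial damage, the probable maximum precipitation (PMP) criterion is applied in their design. The envelope curve method estimates PMP by drawing an envelope over the most extreme recorded rainfall depths, and it is used when PMP is hard to estimate because meteorological and precipitation data are scarce. The envelope is approximated by a power function of duration; in Korea, equations with different coefficients and exponents are used for durations shorter and longer than about one day. These approximations need to be re-examined as the frequency and magnitude of extreme floods in Korea increase. In addition, the limited number of ground observations used in PMP estimation cannot fully capture spatiotemporal variability. To overcome these limitations, this study uses weather radar data to analyze the maximum rainfall depth-duration relationship across Korea and presents a new PMP envelope curve. The radar product used was CMAX (Column Maximum), collected at 10-minute intervals for 2009-2018. For comparison with the radar data, AWS ground observations were also collected: 1-minute data for 1997-2022 from 547 stations across Korea. The radar data were converted using a Z-R relationship, and outliers were removed and corrected. The CM (Conditional Merging) technique was then applied to merge the radar data with a ground-observation rainfall field generated by ordinary kriging. The estimated maximum rainfall depth-duration relationship for Korea showed that the existing envelope curve gives values that are too low, which is attributed to recent extreme rainfall events arising from climate change and other factors. In addition, the actual relationship deviated from power-law behavior, and the station observations tended to underestimate rainfall relative to the weather radar values. In particular, over the same period, AWS and radar precipitation differed by a factor of about two at shorter rainfall durations, which confirms the existence of localized heavy rainfall in areas without gauging stations. With longer radar time series in the future, the results are expected to become more reliable.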
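To make the power-function form of the envelope concrete, the sketch below fits depth = a × duration^b by least squares in log-log space and then raises the coefficient until no observed maximum lies above the curve. This is one simple way to build an envelope, not necessarily the authors' procedure; the function name and the depth-duration pairs are hypothetical.

```python
import math

def fit_power_law_envelope(durations_h, max_depths_mm):
    """Return (a, b) such that a * D**b lies on or above every observed maximum."""
    xs = [math.log(d) for d in durations_h]
    ys = [math.log(p) for p in max_depths_mm]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx  # exponent from ordinary least squares in log-log space
    # Use the largest intercept implied by any point, so the curve envelopes all of them.
    log_a = max(y - b * x for x, y in zip(xs, ys))
    return math.exp(log_a), b

# Hypothetical maximum rainfall depths (mm) for several durations (hours).
durations = [1, 3, 6, 12, 24]
depths = [80, 140, 200, 290, 400]
a, b = fit_power_law_envelope(durations, depths)
print(f"envelope: depth = {a:.1f} * D^{b:.3f}")
```

Fitting separate curves to the durations below and above one day would mirror the two-segment practice described in the abstract.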


Probabilistic Distribution and Variability of Geotechnical Properties with Randomness Characteristic (무작위성을 보이는 지반정수의 확률분포 및 변동성)

  • Kim, Dong-Hee;Lee, Ju-Hyoung;Lee, Woo-Jin
    • Journal of the Korean Geotechnical Society / v.25 no.11 / pp.87-103 / 2009
  • To determine a reliable probability distribution model for geotechnical properties, three steps must be performed in sequence: outlier and randomness tests on the data, parameter estimation for the distribution model, and goodness-of-fit tests for the model parameters and the distribution model. In this paper, probability distribution models for the geotechnical properties of the Songdo area in Incheon are estimated with this procedure. The coefficient of variation (COV), which represents the variability of geotechnical properties, is also determined for several properties. Reliable probability distribution models and COVs of geotechnical properties can be used in probability-based design procedures and to choose reasonable design values in deterministic design methods.
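The coefficient of variation in this abstract is the standard deviation divided by the mean. The sketch below assumes the sample (n - 1) standard deviation, which the abstract does not specify; the function name and the sample values are hypothetical and are not data from the paper.

```python
import statistics

def coefficient_of_variation(samples):
    """COV = sample standard deviation / mean (dimensionless)."""
    return statistics.stdev(samples) / statistics.mean(samples)

# Hypothetical undrained shear strength measurements (kPa).
strengths = [2, 4, 4, 4, 5, 5, 7, 9]
print(f"COV = {coefficient_of_variation(strengths):.3f}")
```

Because the COV is dimensionless, it allows the variability of properties measured in different units to be compared directly.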

A Study on a Measure of Asset Management Information Systems for Highway Transportation Facilities using AHP (계층적 분석기법을 이용한 도로시설 자산관리정보시스템 평가에 관한 연구)

  • Jeong, Seong Yun;Choi, Won Sik;Kim, Woo Je
    • KSCE Journal of Civil and Environmental Engineering Research / v.30 no.6D / pp.663-673 / 2010
  • We are developing asset management information systems to introduce a preventive, proactive approach to the operation and management (O&M) of infrastructure. The objective of this study is to explore the future direction for developing and operating such systems. To that end, we developed a success model and selected evaluation criteria for analyzing user satisfaction, in order to assess the expected performance of asset management information systems and the degree to which asset management functions affect the O&M of highway transportation facilities. We estimated the relative importance weights of the selected evaluation criteria through AHP analysis. We verified the logical consistency of the importance weights and excluded biased outliers from the group of importance weights using the concept of compatibility.
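The weight estimation and consistency check described here can be sketched with the usual AHP steps: approximate the principal eigenvector of the pairwise comparison matrix, then compute the consistency ratio from Saaty's random index. The comparison matrix, the function names, and the use of power iteration are assumptions for illustration; the compatibility-based exclusion of outlying respondents is not shown.

```python
# Saaty's random consistency index for matrix sizes 3 to 9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(matrix, iterations=100):
    """Return (weights, lambda_max) for a reciprocal pairwise comparison matrix."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):  # power iteration toward the principal eigenvector
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max

def consistency_ratio(lambda_max, n):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1)."""
    return ((lambda_max - n) / (n - 1)) / RANDOM_INDEX[n]

# Hypothetical judgments comparing three evaluation criteria.
judgments = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights, lam = ahp_weights(judgments)
print([round(x, 3) for x in weights], round(consistency_ratio(lam, 3), 4))
```

A consistency ratio below about 0.1 is the conventional threshold for accepting a set of judgments as logically consistent.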

Regional Frequency Analysis for Future Precipitation from RCP Scenarios (대표농도경로 시나리오에 의한 미래 강수량의 지역빈도해석)

  • Kim, Duck Hwan;Hong, Seung Jin;Choi, Chang Hyun;Han, Dae Gun;Lee, So Jong;Kim, Hung Soo
    • Journal of Wetlands Research / v.17 no.1 / pp.80-90 / 2015
  • The variability of precipitation patterns and intensity is increasing because of climate change and of urbanization and industrialization, which expand impervious areas. More severe urban inundation and flood damage from localized heavy precipitation events are therefore expected in the future. In this study, we analyze future frequency-based precipitation under climate change using regional frequency analysis. Observed precipitation data covering more than 30 years were collected from 58 stations of the Korea Meteorological Administration (KMA), and frequency-based precipitation for the observed data was estimated by regional frequency analysis. To remove the bias in the precipitation simulated under RCP scenarios, the quantile mapping method and an outlier test were used. Regional frequency analysis using the L-moment method (Hosking and Wallis, 1997) was then performed, and future frequency-based precipitation was estimated for return periods of 80, 100, and 200 years. The results indicate that future frequency-based precipitation in South Korea will increase by 25 to 27 percent, and that the rate of increase for Jeju Island will be higher than for other areas. Severe heavy precipitation could occur more and more frequently because of climate change, and runoff characteristics will also change with urbanization, industrialization, and climate change. Flood prevention measures therefore need to be prepared for future flood safety.
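The L-moment method referred to in this abstract starts from sample L-moments of each station's record. The sketch below computes the first three from probability-weighted moments and derives the L-CV and L-skewness ratios; the function name and the sample values are hypothetical, and the regional steps (discordancy and heterogeneity measures, distribution fitting, quantile estimation) are omitted.

```python
def sample_l_moments(data):
    """Return the first three sample L-moments (l1, l2, l3); needs at least 3 values."""
    x = sorted(data)
    n = len(x)
    # Unbiased probability-weighted moments b0, b1, b2 (i is the 0-based rank).
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return l1, l2, l3

# Hypothetical annual maximum daily precipitation values (mm).
annual_max = [112.0, 95.5, 180.2, 140.8, 210.4, 99.0, 156.3, 131.7]
l1, l2, l3 = sample_l_moments(annual_max)
print(f"L-CV = {l2 / l1:.3f}, L-skewness = {l3 / l2:.3f}")
```

Because L-moments are linear combinations of order statistics, they are less sensitive to a single extreme value than conventional moments, which is one reason they are favored for short hydrological records.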