• Title/Summary/Keyword: Outlier Analysis


Error analysis of 3-D surface parameters from space encoding range imaging (공간 부호화 레인지 센서를 이용한 3차원 표면 파라미터의 에러분석에 관한 연구)

  • 정흥상;권인소;조태훈
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 1997.10a
    • /
    • pp.375-378
    • /
    • 1997
  • This research deals with the problem of reconstructing 3D surface structures from their 2D projections, an important research topic in computer vision. To provide a robust reconstruction algorithm that remains reliable in the presence of uncertainty in the range images, we first present a detailed model and analysis of several error sources and their effects on measuring three-dimensional surface properties using the space-encoded range imaging technique. Our approach has two key elements. The first is the error modeling for the space encoding range sensor and its propagation to the 3D surface reconstruction problem. The second is the algorithm for removing outliers in the range image. Such analyses, to our knowledge, have never been attempted before. Experimental results show that our approach is highly reliable.


Space Time Data Analysis for Greenhouse Whitefly (온실가루이의 공간시계열 분석)

  • 박진모;신기일
    • The Korean Journal of Applied Statistics
    • /
    • v.17 no.3
    • /
    • pp.403-418
    • /
    • 2004
  • Space-time models have recently become widely used in spatial data analysis. In this paper we applied such a model to the analysis of greenhouse whitefly data. To handle the time component, we used an ARMA model and an autoregressive error model; for outliers, we adapted Mugglestone's method. We compared the space-time models and a geostatistical model using MSE and MAPE.
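The two comparison criteria named in this abstract, MSE and MAPE, can be sketched as follows. This is a minimal illustration using the textbook definitions; the function names and the sample numbers are made up and do not come from the paper.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Every actual value must be non-zero, otherwise the ratio is undefined.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical observed counts and model forecasts (illustration only).
observed = [10.0, 20.0, 40.0]
forecast = [12.0, 18.0, 44.0]

print(mse(observed, forecast))   # squared errors 4, 4, 16 averaged
print(mape(observed, forecast))  # relative errors 20%, 10%, 10% averaged
```

MSE penalizes large misses more heavily, while MAPE is scale-free, which is one reason the two are often reported together when comparing models.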

An improvement of MT transfer function estimates using by pre-screening scheme based on the statistical distribution of electromagnetic fields (통계적 사전 처리방법을 통한 MT 전달함수 추정의 향상 기법 연구)

  • Yang Junmo;Kwon Byung-Doo;Lee Duk-Kee;Song Youn-Ho;Youn Yong-Hoon
    • 한국지구물리탐사학회:학술대회논문집
    • /
    • 2005.05a
    • /
    • pp.273-280
    • /
    • 2005
  • Robust magneto-telluric (MT) response function estimators are now in standard use in electromagnetic induction research. Properly devised and applied, these methods can reduce the influence of unusual data (outliers) in the response (electric field) variable, but they are often not sensitive to exceptional predictor (magnetic field) data, which are termed leverage points. A bounded influence estimator is described which simultaneously limits the influence of both outliers and leverage points, and which has proven to consistently yield more reliable MT response function estimates than the conventional robust approach. The bounded influence estimator combines a standard robust M-estimator with leverage weighting based on the statistics of the hat matrix diagonal, which is a standard statistical measure of unusual predictors. Further extensions to MT data analysis are proposed, including the establishment of a data rejection criterion that minimizes the influence of both electric and magnetic outliers in the frequency domain, based on the statistical distribution of the electromagnetic fields. The rejection scheme developed in this study appears effective at eliminating extreme data in the frequency domain, including data that are not removed by the bounded influence (BI) estimator. The effectiveness and advantages of these developments are illustrated using real MT data.
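The idea of combining an M-estimator weight with a leverage weight derived from the hat matrix diagonal can be sketched on a toy problem. The example below is a simplified, real-valued illustration for a regression with an intercept and one predictor, where the hat diagonal has the closed form h_i = 1/n + (x_i - mean(x))^2 / sum_j (x_j - mean(x))^2. The Huber tuning constant, the 2p/n leverage cutoff, and the cutoff/h weight form are common rules of thumb assumed here; they are not taken from the paper, whose MT data are complex-valued.

```python
def hat_diagonal(x):
    """Hat matrix diagonal for simple regression (intercept + one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor must not be constant")
    return [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]


def huber_weight(scaled_residual, c=1.5):
    """Huber M-estimator weight: 1 inside [-c, c], c/|r| outside."""
    r = abs(scaled_residual)
    return 1.0 if r <= c else c / r


def leverage_weight(h, cutoff):
    """Down-weight points whose hat diagonal exceeds the cutoff (assumed form)."""
    return 1.0 if h <= cutoff else cutoff / h


def combined_weights(x, scaled_residuals, c=1.5):
    """Product of the residual-based and leverage-based weights."""
    h = hat_diagonal(x)
    cutoff = 2.0 * 2 / len(x)  # rule-of-thumb 2p/n with p = 2 parameters
    return [
        huber_weight(r, c) * leverage_weight(hi, cutoff)
        for r, hi in zip(scaled_residuals, h)
    ]


# Hypothetical data: the last predictor value is a leverage point,
# and the third residual is a response outlier.
x = [1.0, 2.0, 3.0, 4.0, 20.0]
residuals = [0.1, -0.2, 3.0, 0.3, 0.0]
print(combined_weights(x, residuals))
```

In this toy case the third point is down-weighted because of its residual, and the fifth is down-weighted purely because of its leverage, even though its residual is zero. That second case is the one a residual-only robust estimator would miss.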


On Feasibility of Ambulatory KDRGs for the Classification of Health Insurance Claims (KDRG를 이용한 건강보험 외래 진료비 분류 타당성)

  • 박하영;박기동;신영수
    • Health Policy and Management
    • /
    • v.13 no.1
    • /
    • pp.98-115
    • /
    • 2003
  • Concerns about growing health insurance expenditures became a national issue in 2001, when the National Health Insurance went into deficit. Increases in spending for ambulatory care accounted for the largest portion of the problem. Methods and systems to control the spending should be developed, and a system to measure the case mix of providers is one of the core components of such a control system. The objective of this article is to examine the feasibility of applying Korean Diagnosis Related Groups (KDRGs) to classify health insurance claims for ambulatory care and to identify problem areas of the classification. A database of 11,586,270 claims for ambulatory care delivered during January 2002 was obtained for the study, and the final number of claims analyzed was 8,319,494 after KDRG numbers were assigned to the data and records with an error KDRG were excluded. The unit of analysis was a claim, and resource use was measured by the sum of charges incurred during a month at a department of a hospital or at a clinic. Within-group variance was assessed by the coefficient of variation (CV), and the classification accuracy was evaluated by the variance reduction achieved by the KDRG classification. The analyses were performed on both all data and non-outlier data, and on a subset of the database to examine the validity of the study results. Data were assigned to 787 of the 1,244 KDRGs defined in the classification system. For non-outlier data, 77.4% of KDRGs had a CV of charges below 100% for tertiary care hospitals, and 95.43% of KDRGs did for clinics. The variance reduction achieved by the KDRG classification was 40.80% for non-outlier claims from tertiary care hospitals, 51.98% for general hospitals, 40.89% for hospitals, and 54.99% for clinics. Similar results were obtained from the analyses performed on a subset of the study database.
The study results indicated that KDRGs developed for the classification of inpatient care could be used for ambulatory care, although there were areas where the classification should be refined. Their power to predict resource utilization showed potential for application in measuring the case mix of providers for monitoring and managing the delivery of ambulatory care. The quality of diagnostic information contained in insurance claims remains to be improved, and future studies of other classification systems based on visits or episodes are warranted.
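The two measures used in this abstract, the within-group coefficient of variation and the variance reduction achieved by a grouping, can be illustrated with a short sketch. The definitions below are the usual ones (CV as sample standard deviation over mean, variance reduction as one minus the ratio of within-group to total sum of squares) and are an assumption about what the study computed; the group labels and charges are invented.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


def variance_reduction(groups):
    """Percent of total variation in charges explained by the grouping.

    `groups` maps a group label to the list of charges in that group.
    """
    all_values = [v for charges in groups.values() for v in charges]
    grand_mean = mean(all_values)
    total_ss = sum((v - grand_mean) ** 2 for v in all_values)
    within_ss = sum(
        (v - mean(charges)) ** 2
        for charges in groups.values()
        for v in charges
    )
    return 100.0 * (1.0 - within_ss / total_ss)


# Hypothetical charges for two made-up groups (illustration only).
claims = {
    "group_A": [10.0, 12.0, 14.0],
    "group_B": [30.0, 32.0, 34.0],
}

for label, charges in claims.items():
    print(label, round(coefficient_of_variation(charges), 2))
print(round(variance_reduction(claims), 2))
```

A low CV within each group and a high variance reduction across groups both indicate that the classification separates claims into homogeneous cost categories.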

Deduction of Data Quality Control Strategy for High Density Rain Gauge Network in Seoul Area (서울시 고밀도 지상강우자료 품질관리방안 도출)

  • Yoon, Seongsim;Lee, Byongju;Choi, Youngjean
    • Journal of Korea Water Resources Association
    • /
    • v.48 no.4
    • /
    • pp.245-255
    • /
    • 2015
  • This study used a high-density network of integrated meteorological sensors operated by SK Planet, together with KMA weather stations, to estimate the quantitative precipitation field in the Seoul area. We introduced the SK Planet network and analyzed the quality of its observations over three months, from 1 July to 30 September 2013. The quality analysis showed that most SK Planet stations recorded observations similar to those of the existing KMA stations. We developed a real-time quality check and adjustment method to reduce the effect of errors caused by missing and outlier values in hydrological applications, and confirmed that the method can correct missing and outlier values. Using this method, we selected 190 stations (34 KMA stations and 156 SK Planet stations) with a missing ratio below 20% and the smallest outlier effect for quantitative precipitation estimation. We also evaluated the reproducibility of the rainfall field obtained from the high-density rain gauge network, which has a density of $3\,\mathrm{km}^2$/gauge. The spatial relative frequency of the rainfall field from the SK Planet and KMA stations is similar to that of the radar rainfall field, and the SK Planet network fills the gaps in the KMA observation network. In particular, this research allows us to exploit the density of the network to estimate a rainfall field that can be considered a very good approximation of the true value.

A Prediction Method of Learning Outcomes based on Regression Model for Effective Peer Review Learning (효율적인 피어리뷰 학습을 위한 회귀 모델 기반 학습성과 예측 방법)

  • Shin, Hyo-Joung;Jung, Hye-Wuk;Cho, Kwang-Su;Lee, Jee-Hyoung
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.22 no.5
    • /
    • pp.624-630
    • /
    • 2012
  • Peer review learning is a method that improves students' learning outcomes through feedback among students and through the observation and analysis of other students. An important problem in a peer review system is finding suitable evaluators for each learner, taking student characteristics into account, in order to improve learning outcomes. Some peer review systems assign evaluators to learners at random or choose evaluators using limited strategies. However, these systems do not consider the various characteristics of the learners and evaluators who participate in peer reviews. In this paper, we propose a novel approach for predicting learning outcomes, for use in peer review systems, that considers various characteristics of learners and evaluators. The proposed approach extracts representative attributes from student profiles and predicts learning outcomes using various regression models. To verify how much outliers affect the prediction of learning outcomes, we also apply several outlier removal methods to the regression models and compare their predictive performance. The experimental results show that the SVR model without outlier removal has an average error rate of 0.47% and the best predictive performance.

Privacy-Preserving Outlier Detection in Healthcare Services (IoT환경에서 프라이버시를 보장하는 의료데이터 이상치 탐색 기법)

  • Lee, Bo Young;Choi, Wonsuk;Lee, Dong Hoon
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.25 no.5
    • /
    • pp.1187-1199
    • /
    • 2015
  • As high-quality sensors have been developed in recent years, many kinds of data can now be measured conveniently. Healthcare services are being combined with the Internet of Things (IoT), and applications are emerging that use remotely measured user data such as heart rate, blood oxygen level, and temperature. Typical examples are applications that find an ideal spouse using a user's genetic information, or that indicate the presence or absence of a disease. Such information is closely tied to the user's privacy, so biometric information must be protected. That is, the service provider must deliver the service while preserving the user's privacy. In this paper, we propose a scheme that enables privacy-preserving outlier detection in healthcare services.

Principal Components Logistic Regression based on Robust Estimation (로버스트추정에 바탕을 둔 주성분로지스틱회귀)

  • Kim, Bu-Yong;Kahng, Myung-Wook;Jang, Hea-Won
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.3
    • /
    • pp.531-539
    • /
    • 2009
  • Logistic regression is widely used as a data mining technique for customer relationship management. The maximum likelihood estimator has highly inflated variance when multicollinearity exists among the regressors, and it is not robust against outliers. Thus we propose robust principal components logistic regression to deal with both the multicollinearity and outlier problems. A procedure based on the condition index is suggested for the selection of principal components. When a condition index is larger than the cutoff value obtained from the model constructed on the basis of the conjoint analysis, the corresponding principal component is removed from the logistic model. In addition, we employ an algorithm for robust estimation, which strives to dampen the effect of outliers by applying appropriate weights and factors to the leverage points and vertical outliers identified by the V-mask type criterion. The Monte Carlo simulation results indicate that the proposed procedure yields a higher rate of correct classification than the existing method.

A Novel Network Anomaly Detection Method based on Data Balancing and Recursive Feature Addition

  • Liu, Xinqian;Ren, Jiadong;He, Haitao;Wang, Qian;Sun, Shengting
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.7
    • /
    • pp.3093-3115
    • /
    • 2020
  • Network anomaly detection systems play an essential role in detecting network anomalies and ensuring network security. Anomaly detection systems based on machine learning have become an increasingly popular solution. However, because network traffic is imbalanced and high-dimensional, existing methods are unable to achieve both high accuracy and a low false alarm rate. To address this problem, a new network anomaly detection method based on data balancing and recursive feature addition is proposed. First, a data balancing algorithm based on improved KNN outlier detection is designed to select a representative part of the data in each category. The parameters of the improved KNN outlier detection are jointly optimized by a genetic algorithm. Next, a recursive feature addition algorithm based on correlation analysis is proposed to select effective features, in which a cross contingency test is used to analyze correlation and obtain a feature subset with strong correlation. Then, a random forest model is used as the classification model to detect anomalies. Finally, the proposed algorithm is evaluated on the benchmark datasets KDD Cup 1999 and UNSW_NB15. The results show that the proposed strategies improve accuracy and recall and decrease the false alarm rate. Compared with other algorithms, the proposed algorithm still achieves significant improvements, especially in recall for the small categories.
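As background for the KNN outlier detection step mentioned in this abstract, the sketch below shows the plain k-th nearest neighbor distance score, in which a point far from its neighbors receives a high score. It is the basic textbook variant, not the improved version or the genetic-algorithm tuning described in the paper; the function name, the choice of k, and the sample points are illustrative assumptions.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor.

    Larger scores mean the point lies farther from the rest of the data.
    Uses a brute-force O(n^2) search, which is fine for small samples.
    """
    if k < 1 or k >= len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        distances = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )
        scores.append(distances[k - 1])
    return scores


# Hypothetical 2-D feature vectors: four clustered points and one far away.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_outlier_scores(samples, k=2)
most_isolated = scores.index(max(scores))
print(scores)
print(samples[most_isolated])
```

For data balancing, such a score could be used to drop the most isolated samples of an over-represented class, which is one plausible reading of the selection step in the abstract.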

Underwater Navigation of AUVs Using Uncorrelated Measurement Error Model of USBL

  • Lee, Pan-Mook;Park, Jin-Yeong;Baek, Hyuk;Kim, Sea-Moon;Jun, Bong-Huan;Kim, Ho-Sung;Lee, Phil-Yeob
    • Journal of Ocean Engineering and Technology
    • /
    • v.36 no.5
    • /
    • pp.340-352
    • /
    • 2022
  • This article presents a modeling method for the uncorrelated measurement error of the ultra-short baseline (USBL) acoustic positioning system for aiding navigation of underwater vehicles. The Mahalanobis distance (MD) and principal component analysis are applied to decorrelate the errors of USBL measurements, which are correlated in the x- and y-directions and vary according to the relative direction and distance between a reference station and the underwater vehicles. The proposed method can decouple the radial-direction error and angular-direction error from each USBL measurement, where the former is independent of, and the latter dependent on, the distance between the reference station and the vehicle. With the decorrelation of the USBL errors along the trajectory of the vehicles in every time step, the proposed method can reduce the threshold of the outlier decision level. To demonstrate the effectiveness of the proposed method, simulation studies were performed with motion data obtained from a field experiment involving an autonomous underwater vehicle and USBL signals generated numerically by matching the specifications of a specific USBL with the data of a global positioning system. The simulations indicated that the navigation system is more robust in rejecting outliers of the USBL measurements than conventional ones. In addition, it was shown that the erroneous estimation of the navigation system after a long USBL blackout can converge to the true states using the MD of the USBL measurements. The navigation systems using the uncorrelated error model of the USBL, therefore, can effectively eliminate USBL outliers without loss of uncontaminated signals.
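Outlier rejection with the Mahalanobis distance can be sketched for a two-dimensional position residual as follows. This is a generic gating example, not the paper's error model: the covariance values, the 99% confidence level, and the function names are assumptions. For two degrees of freedom the chi-square quantile has the closed form -2 ln(1 - p), which the sketch uses as the gate.

```python
import math


def mahalanobis_sq(residual, cov):
    """Squared Mahalanobis distance of a 2-D residual for a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    rx, ry = residual
    # r^T * inverse(cov) * r, with inverse(cov) = [[d, -b], [-c, a]] / det
    return (d * rx * rx - (b + c) * rx * ry + a * ry * ry) / det


def chi2_gate_2dof(confidence=0.99):
    """Chi-square quantile for 2 degrees of freedom (closed form)."""
    return -2.0 * math.log(1.0 - confidence)


def is_outlier(residual, cov, confidence=0.99):
    """Flag a measurement whose squared distance exceeds the gate."""
    return mahalanobis_sq(residual, cov) > chi2_gate_2dof(confidence)


# Hypothetical error covariance: larger variance along x than along y.
cov = ((4.0, 0.0), (0.0, 1.0))

print(is_outlier((4.0, 0.0), cov))  # 4 m error along the noisy axis
print(is_outlier((0.0, 4.0), cov))  # 4 m error along the precise axis
```

The same 4 m error is accepted along the noisy axis and rejected along the precise one, which shows why a direction-aware error model permits a tighter outlier threshold than a single isotropic bound.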