• Title/Summary/Keyword: removal of outliers

Impact of Outliers on the Statistical Measures of the Environmental Monitoring Data in Busan Coastal Sea (이상자료가 연안 환경자료의 통계 척도에 미치는 영향)

  • Cho, Hong-Yeon;Lee, Ki-Seop;Ahn, Soon-Mo
    • Ocean and Polar Research / v.38 no.2 / pp.149-159 / 2016
  • The statistical measures of coastal environmental data are used in a variety of statistical inferences, hypothesis tests, and data-driven models. If the measures are biased, the resulting estimates and models may also be biased, and the potential for bias is large when the data contain outliers, defined as extraordinarily large or small values. This study aims to suggest more robust statistical measures as alternatives to the commonly used ones and to assess the performance of these robust measures through a quantitative comparison with the typical measures of location, spread, and shape, using environmental monitoring data from the Busan coastal sea. Outliers were detected with Rosner's test. Based on this test, about 5-10% of the nutrient data were found to contain outliers. After removal (zero-weighting) of the outliers, the relative change ratios of the mean and standard deviation between the before- and after-removal conditions were 13% and 33%, respectively. The skewness and kurtosis decreased by 1.36 and 8.11, respectively. In contrast, the change ratios of the robust counterparts of the mean and standard deviation were 3.7-10.5%, and the variation magnitudes of the robust skewness and kurtosis were only about 2-4% of those of the non-robust measures. Because they change relatively little between the before- and after-removal conditions, the robust measures can be regarded as outlier-resistant statistical measures.
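
The comparison described in this abstract can be illustrated with a minimal Python sketch. It contrasts the mean and standard deviation with the median and the scaled median absolute deviation (MAD), before and after one outlier is dropped. The sample values, the choice of median and MAD as the robust measures, and the helper names `scaled_mad` and `relative_change` are assumptions made for illustration; the abstract does not name the robust measures the authors used, and the outlier here is flagged by hand rather than by Rosner's test.

```python
import statistics


def scaled_mad(values):
    """Median absolute deviation, scaled by 1.4826 so that it is
    comparable to the standard deviation for normally distributed data."""
    med = statistics.median(values)
    return 1.4826 * statistics.median([abs(v - med) for v in values])


def relative_change(before, after):
    """Relative change ratio in percent between two values of a measure."""
    return abs(before - after) / abs(before) * 100.0


# Hypothetical nutrient concentrations; the last value plays the outlier.
raw = [0.8, 1.0, 1.1, 0.9, 1.2, 1.0, 0.95, 1.05, 1.1, 9.0]
cleaned = raw[:-1]  # outlier removed by hand (no Rosner's test here)

change = {
    "mean": relative_change(statistics.mean(raw), statistics.mean(cleaned)),
    "stdev": relative_change(statistics.stdev(raw), statistics.stdev(cleaned)),
    "median": relative_change(statistics.median(raw), statistics.median(cleaned)),
    "mad": relative_change(scaled_mad(raw), scaled_mad(cleaned)),
}

for name, pct in change.items():
    print(f"{name:>6}: {pct:5.1f}% change after outlier removal")
```

In this toy data set the median and MAD shift far less than the mean and standard deviation when the outlier is removed, which is the kind of outlier resistance the abstract reports for its robust measures.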

Noise Removal of Terrestrial LiDAR Data Using Tensor Voting Method (텐서보팅(Tensor Voting)기법을 이용한 지상라이다 자료의 노이즈 처리)

  • Seo, Il-Hong;Sohn, Hong-Gyoo;Kim, Chang-Jae;Lim, Jin-Hee
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference / 2010.04a / pp.157-160 / 2010
  • Terrestrial LiDAR data contain outliers that are not needed for processing. This noise must be removed manually, which is inefficient in terms of productivity. The purpose of this research is to demonstrate the feasibility of automatic outlier removal from LiDAR data using the 3D Tensor Voting method. To this end, the article presents the procedure for applying the Tensor Voting algorithm to real terrestrial LiDAR data.

Single Outlier Removal Technology for TWR based High Precision Localization (TWR 기반 고정밀 측위를 위한 단일 이상측정치 제거 기술)

  • Lee, Chang-Eun;Sung, Tae-Kyung
    • The Journal of Korea Robotics Society / v.12 no.3 / pp.350-355 / 2017
  • UWB (Ultra Wide Band) refers to a system with a bandwidth of over 500 MHz or a bandwidth of at least 20% of the center frequency. It is robust against channel fading and has a wide signal bandwidth. With an IR-UWB based ranging system, decimeter-level ranging accuracy can be obtained. Furthermore, an IR-UWB system enables high-resolution acquisition through glass or cement. In recent years, IR-UWB-based ranging chipsets have become cheap and popular, making it possible to implement positioning systems with accuracy of several tens of centimeters. The system can be configured as a one-way ranging (OWR) positioning system for fast ranging or as a two-way ranging (TWR) positioning system for cheap and robust ranging. On the other hand, a ranging-based positioning system is limited in the number of terminals it can localize, because the communication procedure required for ranging takes time. To overcome this problem, code multiplexing and channel multiplexing are used. However, measurement errors occur due to interference between channels and codes, multipath, and so on. Measurement filtering is used to reduce the measurement error, but more fundamentally, techniques for removing such erroneous measurements should be studied. First, TWR-based positioning was analyzed from a stochastic point of view and the effects of outlier measurements were summarized. A positioning algorithm that analytically identifies and removes a single outlier is then summarized and extended to three dimensions. The algorithm's ability to detect and remove single outliers was verified through simulation.

A Development of Preprocessing Models of Toll Collection System Data for Travel Time Estimation (통행시간 추정을 위한 TCS 데이터의 전처리 모형 개발)

  • Lee, Hyun-Seok;NamKoong, Seong J.
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.8 no.5 / pp.1-11 / 2009
  • TCS data reflect the characteristics of traffic conditions. However, TCS data contain outliers that do not represent the travel time of the section concerned, and if these outliers are not eliminated, the estimated travel time may be distorted. Travel times can also vary widely for the same section and time period, because the variation in travel time increases with section distance, which makes it difficult to calculate a representative travel time. Accordingly, it is important to understand travel time characteristics in order to compute the representative travel time from TCS data. In this study, after analyzing the variation ratio of travel time according to link distance and level of congestion, an outlier elimination model and a smoothing model for TCS data were proposed. The results show that the proposed models can be used to estimate reliable travel times for long-distance paths in which travel times vary for the same departure time, the intervals are large, and the representative travel time changes irregularly over short periods.
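
The two-stage idea in this abstract (eliminate outliers, then smooth the representative travel time) can be sketched as follows. This is not the authors' model, which the abstract does not specify: the median/MAD cutoff, the trailing moving average, the threshold `k`, the window size, and the sample travel times are all assumptions chosen for illustration.

```python
import statistics


def drop_outliers(times, k=3.0):
    """Keep travel times within k scaled MADs of the interval median
    (an assumed rule, not the paper's elimination model)."""
    med = statistics.median(times)
    mad = statistics.median([abs(t - med) for t in times])
    if mad == 0:
        return list(times)  # no spread to judge against; keep everything
    return [t for t in times if abs(t - med) <= k * 1.4826 * mad]


def moving_average(values, window=2):
    """Trailing moving average; early points use the samples available."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical travel times (minutes) for one section, grouped by
# consecutive departure intervals.
intervals = [
    [31, 30, 33, 95, 32],   # 95: e.g. a vehicle that stopped at a rest area
    [34, 35, 33, 36, 12],   # 12: e.g. a mismatched entry/exit record
    [38, 37, 40, 39, 41],
]

representative = []
for times in intervals:
    kept = drop_outliers(times)
    representative.append(sum(kept) / len(kept))

smoothed = moving_average(representative, window=2)
print(representative)
print(smoothed)
```

A median-based cutoff is used here because a mean-based one would itself be pulled toward the outliers it is meant to detect.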

Improved Lexicon-driven based Chord Symbol Recognition in Musical Images

  • Dinh, Cong Minh;Do, Luu Ngoc;Yang, Hyung-Jeong;Kim, Soo-Hyung;Lee, Guee-Sang
    • International Journal of Contents / v.12 no.4 / pp.53-61 / 2016
  • Although extensively developed, optical music recognition systems have mostly focused on musical symbols (notes, rests, etc.), while disregarding the chord symbols. The process becomes difficult when the images are distorted or slurred, although this can be resolved using optical character recognition systems. Moreover, the appearance of outliers (lyrics, dynamics, etc.) increases the complexity of the chord recognition. Therefore, we propose a new approach addressing these issues. After binarization, un-distortion, and stave and lyric removal of a musical image, a rule-based method is applied to detect the potential regions of chord symbols. Next, a lexicon-driven approach is used to optimally and simultaneously separate and recognize characters. The score that is returned from the recognition process is used to detect the outliers. The effectiveness of our system is demonstrated through impressive accuracy of experimental results on two datasets having a variety of resolutions.

Application of Vocal Properties and Vocal Independent Features to Classifying Sasang Constitution (음성 특성 및 음성 독립 변수의 사상체질 분류로의 적용 방법)

  • Kim, Keun-Ho;Kang, Nam-Sik;Ku, Bon-Cho;Kim, Jong-Yeol
    • Journal of Sasang Constitutional Medicine / v.23 no.4 / pp.458-470 / 2011
  • 1. Objectives: Vocal characteristics are commonly considered an important factor in determining the Sasang constitution and health condition. We sought a classification procedure that distinguishes the constitution objectively and quantitatively by analyzing the characteristics of a subject's voice free of noise and error. 2. Methods: In this study, we extract vocal features from voice samples selected with prior information, remove outliers, minimize correlated features, normalize the features according to gender and age, and build discriminant functions adapted to gender and age from the features to improve diagnostic accuracy. 3. Results and Conclusions: The discriminant functions classified the constitution with about 45% accuracy for every age interval and gender, and this diagnostic accuracy is meaningful given that it was obtained from the voice alone.

Preparation and evaluation of limestone reference material for a proficiency test (국내산 석회석의 비교숙련도 시험용 시료 제조 및 평가)

  • Jung, Choong-Ho;Park, Deok-Won;Kim, Sung-Min;Yu, Eung-Chul
    • Analytical Science and Technology / v.22 no.1 / pp.82-91 / 2009
  • Limestone samples for a proficiency test were prepared from domestic limestone and evaluated. Statistical methods were used to evaluate the XRF and instrumental analysis results. Some outliers were found in the XRF and ICP-OES results for each sample. After removing 5 outliers from the 50 samples, we obtained samples that the statistical analysis showed to be homogeneous at a 95% reliability level.

Improvement of PM Forecasting Performance by Outlier Data Removing (Outlier 데이터 제거를 통한 미세먼지 예보성능의 향상)

  • Jeon, Young Tae;Yu, Suk Hyun;Kwon, Hee Yong
    • Journal of Korea Multimedia Society / v.23 no.6 / pp.747-755 / 2020
  • In this paper, we deal with outlier data problems that occur when constructing a PM2.5 fine dust forecasting system using a neural network. In general, when training a neural network, some of the data do not help learning and instead disturb it. Such data are called outlier data. When they are included in the training data, various problems such as overfitting occur. In building a PM2.5 fine dust concentration forecasting system using a neural network, we found several outlier data in the training data. We therefore remove them and train the network in three ways. The Over_outlier model removes outlier data for which the target concentration is low but the model forecast is high. The Under_outlier model removes outlier data for which the target concentration is high but the model forecast is low. The All_outlier model removes both Over_outlier and Under_outlier data. We compare the three models with a conventional outlier removal model and a non-removal model. Our outlier removal model shows better performance than the others.
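
The three removal strategies named in this abstract can be sketched as simple filters over the training samples. The thresholds `low` and `high`, the `(target, forecast)` sample layout, the function name `split_outliers`, and the data values are all hypothetical; the abstract does not say how "low" and "high" concentrations are defined.

```python
def split_outliers(samples, low=30.0, high=60.0):
    """Return (over, under) outlier lists from (target, forecast) pairs.

    over:  target concentration is low but the forecast is high
    under: target concentration is high but the forecast is low
    The low/high cutoffs are assumed values, not taken from the paper.
    """
    over = [s for s in samples if s[0] < low and s[1] > high]
    under = [s for s in samples if s[0] > high and s[1] < low]
    return over, under


# Hypothetical training samples: (target PM2.5, model forecast PM2.5).
samples = [(12, 15), (20, 80), (75, 70), (90, 25), (40, 45), (10, 65)]

over, under = split_outliers(samples)

# Training sets for the three models described in the abstract.
over_removed = [s for s in samples if s not in over]    # Over_outlier model
under_removed = [s for s in samples if s not in under]  # Under_outlier model
all_removed = [s for s in samples if s not in over and s not in under]  # All_outlier model

print(len(over_removed), len(under_removed), len(all_removed))
```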

A NEW LANDSAT IMAGE CO-REGISTRATION AND OUTLIER REMOVAL TECHNIQUES

  • Kim, Jong-Hong;Heo, Joon;Sohn, Hong-Gyoo
    • Proceedings of the KSRS Conference / v.2 / pp.594-597 / 2006
  • Image co-registration is the process of overlaying two images of the same scene: one is the reference image, and the other (the sensed image) is geometrically transformed to match it. Numerous methods have been developed for automated image co-registration, which is known to be a time-consuming and/or computation-intensive procedure. To improve the efficiency and effectiveness of co-registration of satellite imagery, this paper proposes pre-qualified area matching, which consists of feature extraction with a Laplacian filter and an area matching algorithm using the correlation coefficient. Moreover, to improve the accuracy of co-registration, outliers among the initial matching points should be removed. For this purpose, two outlier detection techniques, the studentized residual and a modified RANSAC algorithm, are used in this study. Three pairs of Landsat images were used for the performance test, and the results were compared and evaluated in terms of robustness and efficiency.
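
To show how RANSAC can reject bad matching points, here is a minimal sketch of plain RANSAC, not the modified variant the abstract mentions. It assumes a translation-only transform between the sensed and reference images, which is a simplification for illustration; the function name `ransac_translation`, the tolerance, the iteration count, and the matched points are all hypothetical.

```python
import random


def ransac_translation(matches, tol=1.0, iterations=100, seed=0):
    """Estimate a (dx, dy) shift from point matches with plain RANSAC.

    matches: list of ((x_ref, y_ref), (x_sensed, y_sensed)) pairs.
    Returns the mean shift over the largest consensus set, and that set.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # A translation is fully determined by a single correspondence.
        (xr, yr), (xs, ys) = rng.choice(matches)
        dx, dy = xr - xs, yr - ys
        inliers = [
            (ref, sen) for ref, sen in matches
            if abs((ref[0] - sen[0]) - dx) <= tol
            and abs((ref[1] - sen[1]) - dy) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    n = len(best_inliers)
    shift = (
        sum(ref[0] - sen[0] for ref, sen in best_inliers) / n,
        sum(ref[1] - sen[1] for ref, sen in best_inliers) / n,
    )
    return shift, best_inliers


# Hypothetical matches: six agree on a shift near (5, -3); two are mismatches.
matches = [
    ((15.0, 7.0), (10.0, 10.0)),
    ((25.1, 17.0), (20.0, 20.0)),
    ((34.9, 27.1), (30.0, 30.0)),
    ((45.0, 36.9), (40.0, 40.0)),
    ((55.2, 47.0), (50.0, 50.0)),
    ((64.8, 57.0), (60.0, 60.0)),
    ((90.0, 40.0), (70.0, 70.0)),  # mismatch
    ((60.0, 95.0), (80.0, 80.0)),  # mismatch
]

shift, inliers = ransac_translation(matches)
outliers = [m for m in matches if m not in inliers]
print(shift, len(inliers), len(outliers))
```

A real co-registration would typically fit an affine or polynomial transform from several correspondences per iteration; the consensus-counting loop stays the same.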

A New Landsat Image Co-Registration and Outlier Removal Techniques

  • Kim, Jong-Hong;Heo, Joon;Sohn, Hong-Gyoo
    • Korean Journal of Remote Sensing / v.22 no.5 / pp.439-443 / 2006
  • Image co-registration is the process of overlaying two images of the same scene: one is the reference image, and the other (the sensed image) is geometrically transformed to match it. Numerous methods have been developed for automated image co-registration, which is known to be a time-consuming and/or computation-intensive procedure. To improve the efficiency and effectiveness of co-registration of satellite imagery, this paper proposes pre-qualified area matching, which consists of feature extraction with a Laplacian filter and an area matching algorithm using the correlation coefficient. Moreover, to improve the accuracy of co-registration, outliers among the initial matching points should be removed. For this purpose, two outlier detection techniques, the studentized residual and a modified RANSAC algorithm, are used in this study. Three pairs of Landsat images were used for the performance test, and the results were compared and evaluated in terms of robustness and efficiency.