• Title/Abstract/Keywords: outlier removal

57 search results (processing time: 0.027 s)

Improvement of PM Forecasting Performance by Outlier Data Removing

  • 전영태;유숙현;권희용
    • 한국멀티미디어학회논문지 / Vol. 23, No. 6 / pp. 747-755 / 2020
  • In this paper, we address the outlier data problems that arise when constructing a PM2.5 fine dust forecasting system with a neural network. In general, some of the data used to train a neural network do not help learning and instead disturb it; these are called outlier data. When they are included in the training data, various problems such as overfitting occur. While building a PM2.5 fine dust concentration forecasting system with a neural network, we found several outliers in the training data. We therefore remove them and train the network in three ways. The Over_outlier model removes outlier data for which the target concentration is low but the model forecast is high. The Under_outlier model removes outlier data for which the target concentration is high but the model forecast is low. The All_outlier model removes both Over_outlier and Under_outlier data. We compare the three models with a conventional outlier removal model and a non-removal model. Our outlier removal model performs better than the others.
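
The three removal variants described in the abstract above can be sketched roughly as follows. This is only an illustration: the abstract does not say how "low" and "high" are defined, so the thresholds `low` and `high`, the function names, and the sample values are all hypothetical.

```python
def label_outliers(targets, forecasts, low=15.0, high=35.0):
    """Label each training sample as 'over', 'under', or 'keep'.

    low and high are hypothetical PM2.5 thresholds, not values from the paper.
    """
    labels = []
    for target, forecast in zip(targets, forecasts):
        if target <= low and forecast >= high:
            labels.append("over")   # target is low, model forecast is high
        elif target >= high and forecast <= low:
            labels.append("under")  # target is high, model forecast is low
        else:
            labels.append("keep")
    return labels


def remove_outliers(samples, labels, mode="all"):
    """Drop samples for the chosen variant: 'over', 'under', or 'all'."""
    drop = {"over": {"over"}, "under": {"under"}, "all": {"over", "under"}}[mode]
    return [s for s, lab in zip(samples, labels) if lab not in drop]


# Made-up target concentrations and forecasts for four training samples.
targets = [10.0, 40.0, 12.0, 50.0]
forecasts = [45.0, 10.0, 14.0, 48.0]
labels = label_outliers(targets, forecasts)
kept_all = remove_outliers([0, 1, 2, 3], labels, mode="all")
kept_over = remove_outliers([0, 1, 2, 3], labels, mode="over")
```

Each variant would then retrain the network on the samples that remain; the retraining step itself is not shown.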

Fast Outlier Removal for Image Registration based on Modified K-means Clustering

  • Soh, Young-Sung;Qadir, Mudasar;Kim, In-Taek
    • 융합신호처리학회논문지 / Vol. 16, No. 1 / pp. 9-14 / 2015
  • Outlier detection and removal is a crucial step in various image processing applications such as image registration. Random Sample Consensus (RANSAC) is known to be the best algorithm so far for outlier detection and removal. However, RANSAC requires considerable computation time. To drastically reduce the computation time while preserving comparable quality, an outlier detection and removal method based on modified K-means is proposed. The original K-means is first applied to the matching point pairs, and then cluster merging and member exclusion steps are performed in the modification step. We applied the method to various images with highly repetitive patterns under several geometric distortions and obtained successful results. We compared the proposed method with RANSAC and showed that it runs 3 to 10 times faster than RANSAC.

Multiple Imputation Reducing Outlier Effect using Weight Adjustment Methods

  • 김진영;신기일
    • 응용통계연구 / Vol. 26, No. 4 / pp. 635-647 / 2013
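  • Multiple imputation is the most commonly used method when missing values occur in sample surveys. Its performance depends on several factors and is particularly affected by outliers. In this study, we investigated a method that uses weight adjustment to reduce the influence of outliers and thereby improve the performance of multiple imputation. The final weights obtained with the weight adjustment method were used in the multiple imputation, which was carried out with PROC MI in SAS. The superiority of the proposed method was confirmed through simulation and through an analysis of real data from the Monthly Labor Statistics.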

Improving the Accuracy of Image Matching using Various Outlier Removal Algorithms

  • 이용일;김준철;이영란;신성웅
    • 한국측량학회지 / Vol. 27, No. 1 / pp. 667-675 / 2009
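  • Image matching is applied very widely in image application fields such as remote sensing and GIS. In general, the initial matching point data contain outliers (mismatches) that degrade the accuracy of image matching. The purpose of this paper is to develop a robust approach for detecting and removing outliers in order to maintain accuracy in image matching. To detect outliers automatically, this paper uses a backward-matching similarity transformation and the RANSAC algorithm, and for fast and efficient image matching it uses preprocessing steps such as computation of the overlap area and block-based processing. The proposed method was applied to real aerial photograph image pairs, and the results were analyzed in terms of robustness and efficiency.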

Efficient outlier removal algorithm for real-time panoramic stitching

  • 김범수;조남익
    • 한국방송∙미디어공학회:학술대회논문집 / 한국방송공학회 2011 Summer Conference / pp. 513-516 / 2011
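  • In existing real-time panorama stitching algorithms, it is difficult to identify and remove outliers among the matching points and the input images, so distortion readily occurs in noisy images or in images with many repetitive patterns. This paper therefore proposes a method that effectively removes outliers among the matching points and the input images within the existing real-time panorama stitching framework while still satisfying the real-time stitching constraint. To this end, we modify the RANSAC algorithm, which is mainly used to remove outliers in linear models, so that it can be applied to the nonlinear model used in real-time panorama stitching, and we reduce the number of model parameters to improve speed. With this approach, outliers among the matching points are removed, and outlier images in the input image sequence are removed using the inlier ratio among all matching points. Experiments confirmed that distortion in the stitching result is reduced compared with existing methods.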

Improved LTE Fingerprint Positioning Through Clustering-based Repeater Detection and Outlier Removal

  • Kwon, Jae Uk;Chae, Myeong Seok;Cho, Seong Yun
    • Journal of Positioning, Navigation, and Timing / Vol. 11, No. 4 / pp. 369-379 / 2022
  • In the weighted k-nearest neighbor (WkNN)-based fingerprinting positioning step, the requested positioning signal is compared with the signal information stored in the fingerprint DB for each reference point. The more base station identifiers match, the more likely it is that the terminal is at the corresponding location, and an additional weight is given to the location in proportion to the number of matching base stations. Conversely, if few base stations match, the selected candidate reference point depends heavily on the similarity value of the signal. A problem arises here: the positioning signal may be compared with a repeater signal in the signal information stored in the DB, and the corresponding reference point may be selected as a candidate location. Such a reference point is likely to be an outlier, and if a weight is applied to it, the error of the estimated location increases. To solve this problem, this paper proposes a WkNN technique that includes an outlier removal function. To this end, it is first determined whether a repeater signal is included in the DB information of the matched base station. If a reference point for the repeater signal is selected as a candidate position, the reference position corresponding to the outlier is removed using a clustering technique. The performance of the proposed technique is verified with data acquired in Seocho 1-dong and 2-dong in Seoul.

Big Data Smoothing and Outlier Removal for Patent Big Data Analysis

  • Choi, JunHyeog;Jun, Sunghae
    • 한국컴퓨터정보학회논문지 / Vol. 21, No. 8 / pp. 77-84 / 2016
  • General statistical analysis requires a normality assumption. If this assumption is not satisfied, we cannot expect good results from statistical data analysis. Most statistical methods for handling outliers and noise also need this assumption. However, the assumption is not satisfied in big data because of its large volume and heterogeneity. We therefore propose a methodology based on box plots and data smoothing for controlling outliers and noise in big data analysis. The proposed methodology does not depend on the normality assumption. We select patent documents as the target big data domain because patent big data analysis is an important issue in the management of technology. We analyze patent documents using big data learning methods for technology analysis. The patent data collected from patent databases around the world are preprocessed and analyzed by text mining and statistics. However, most research on patent big data analysis has not considered the outlier and noise problem, which decreases prediction accuracy and increases the variance of parameter estimates. In this paper, we check for the existence of outliers and noise in patent big data using box plots and smoothing visualization. We use patent documents related to three-dimensional printing technology to illustrate how the proposed methodology can be used to find noise in the searched patent big data.
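
As a rough illustration of the box-plot and smoothing ideas named in the abstract above, the sketch below flags values outside the usual box-plot whiskers and applies a simple moving average. The whisker factor `k=1.5` is the common box-plot convention, and the moving average is an assumed stand-in, since the abstract does not say which smoother the authors use; the function names and the data are hypothetical.

```python
import statistics


def boxplot_outliers(values, k=1.5):
    """Return the values lying outside the box-plot whiskers (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


def moving_average(values, window=3):
    """Simple moving average; yields len(values) - window + 1 smoothed points."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Made-up yearly counts with one extreme value.
counts = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
flagged = boxplot_outliers(counts)
smoothed = moving_average([1, 2, 3, 4], window=3)
```

Neither step relies on a normality assumption, which is the property the abstract emphasizes.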

Outlier Detection and Treatment for the Conversion of Chemical Oxygen Demand to Total Organic Carbon

  • 조범준;조홍연;김성
    • 한국해안·해양공학회논문집 / Vol. 26, No. 4 / pp. 207-216 / 2014
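  • Total organic carbon (TOC) is an important parameter used as a direct biological indicator in studies of the marine carbon cycle. Because available TOC data are scarce relative to chemical oxygen demand (COD) data, TOC can be estimated from COD data. When converting COD to TOC, outliers in the COD observations directly affect the TOC estimate, so they must be detected and treated in a rational and objective manner. This study presents an optimal regression model for salinity, COD, and TOC data observed in Korean coastal waters. The optimal regression model was selected by diagnosing outliers and influential data with several detection methods and then comparing the change in the number of data points, the coefficient of variation, and the RMS error before and after removal. The combination of Cook's diagnostic method and the SIQR boxplot method was found to be the most appropriate. The optimal regression function is $\mathrm{TOC}\,(\mathrm{mg/L}) = 0.44 \cdot \mathrm{COD}\,(\mathrm{mg/L}) + 1.53$, with a coefficient of determination of about 0.47 and an RMS error of 0.85 mg/L. The RMS error and the coefficient of variation of the leverage values decreased substantially, to 31% and 80%, respectively, of their values before outlier removal. Because the proposed method diagnoses and removes the excessive influence of outliers and influential data in the COD and TOC observations, a more appropriate regression equation could be presented.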
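
A short worked example of the regression reported in the abstract above. The coefficients 0.44 and 1.53 come from the abstract; the function name and the input value of 2.0 mg/L are made up for illustration.

```python
def cod_to_toc(cod_mg_per_l):
    """Estimate TOC (mg/L) from COD (mg/L) using the regression in the abstract."""
    return 0.44 * cod_mg_per_l + 1.53


# A hypothetical COD observation of 2.0 mg/L.
estimate = cod_to_toc(2.0)
print(round(estimate, 2))  # prints 2.41
```

With the reported coefficient of determination of about 0.47 and RMS error of 0.85 mg/L, such an estimate should be read as approximate.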

Impact of Outliers on the Statistical Measures of the Environmental Monitoring Data in Busan Coastal Sea

  • 조홍연;이기섭;안순모
    • Ocean and Polar Research
    • /
    • Vol. 38, No. 2
    • /
    • pp.149-159
    • /
    • 2016
  • The statistical measures of coastal environmental data are used in a variety of statistical inferences, hypothesis tests, and data-driven modeling. If the measures are biased, the statistical estimates and models may also be biased, and the potential for bias is great when the data contain outliers, defined as extraordinarily large or small data values. This study aims to suggest more robust statistical measures as alternatives to the commonly used measures and to assess the performance of these robust measures through a quantitative comparison with the more typical measures of location, spread, and shape, using environmental monitoring data from the Busan coastal sea. Outliers in the data were detected on the basis of Rosner's test. About 5-10% of the nutrient data were found to contain outliers based on Rosner's test. After removal (zero-weighting) of the outliers, the relative change ratios of the mean and standard deviation between the before- and after-removal conditions were 13% and 33%, respectively. The skewness and kurtosis changed by 1.36 and 8.11, respectively, both decreasing. In contrast, the change ratios for the more robust counterparts of the mean and standard deviation are 3.7-10.5%, and the variation magnitudes of the robust skewness and kurtosis are only about 2-4% of those of the non-robust measures. The robust measures can therefore be regarded as outlier-resistant statistical measures, given their relatively small changes between the before- and after-removal conditions.
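
The contrast between classical and robust measures in the abstract above can be illustrated with a small sketch. The abstract does not name its robust measures, so the median and the median absolute deviation (MAD) are assumed here as typical examples. Rosner's test is not implemented, the outlier is simply removed by hand, and the data values are made up.

```python
import statistics


def classical_measures(values):
    """Mean and sample standard deviation (sensitive to outliers)."""
    return statistics.mean(values), statistics.stdev(values)


def robust_measures(values):
    """Median and median absolute deviation (assumed robust counterparts)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return med, mad


def relative_change(before, after):
    """Relative change in percent between the before- and after-removal values."""
    return abs(after - before) / abs(before) * 100.0


# Made-up nutrient concentrations; 9.0 plays the role of an outlier.
with_outlier = [1.0, 1.2, 0.9, 1.1, 1.0, 9.0]
without_outlier = with_outlier[:-1]

mean_change = relative_change(classical_measures(with_outlier)[0],
                              classical_measures(without_outlier)[0])
median_change = relative_change(robust_measures(with_outlier)[0],
                                robust_measures(without_outlier)[0])
```

In this toy data set the median moves far less than the mean when the outlier is removed, which mirrors the qualitative conclusion of the abstract.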

A Fast Image Matching Method for Oblique Video Captured with UAV Platform

  • Byun, Young Gi;Kim, Dae Sung
    • 한국측량학회지 / Vol. 38, No. 2 / pp. 165-172 / 2020
  • There is growing interest in vision-based video image matching owing to the constantly developing technology of unmanned systems. The purpose of this paper is to develop a fast and effective matching technique for UAV oblique video images. We first extract initial matching points using the NCC (Normalized Cross-Correlation) algorithm and improve its computational efficiency using an integral image. Furthermore, we develop a triangulation-based outlier removal algorithm to extract more robust matching points from the initial matching points. To evaluate the performance of the proposed method, it was quantitatively compared with existing image matching approaches. Experimental results demonstrate that the proposed method can process 2.57 frames per second for video image matching and is up to 4 times faster than existing methods. The proposed method therefore has good potential for various video-based applications that require image matching as a pre-processing step.
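
For reference, the NCC score mentioned in the abstract above can be computed for two equal-size patches as sketched below. This is the zero-mean form of normalized cross-correlation, written without the integral-image speed-up the authors describe; the function name, the flattened-list patch representation, and the handling of flat patches are assumptions.

```python
import math


def ncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equal-length intensity lists."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denom == 0.0:
        return 0.0  # flat patch: correlation is undefined, treated as no match here
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom


# Patches flattened to 1-D lists; the second is a brighter copy of the first.
same_pattern = ncc([1, 2, 3, 4], [2, 4, 6, 8])
inverted = ncc([1, 2, 3, 4], [4, 3, 2, 1])
```

Because the score is normalized, a uniformly brighter copy of a patch still correlates at about 1, while an inverted pattern correlates at about -1.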