• Title/Summary/Keyword: DBSCAN

Discriminant analysis for unbalanced data using HDBSCAN (불균형자료를 위한 판별분석에서 HDBSCAN의 활용)

  • Lee, Bo-Hui;Kim, Tae-Heon;Choi, Yong-Seok
    • The Korean Journal of Applied Statistics / v.34 no.4 / pp.599-609 / 2021
  • Data in which the number of objects differs greatly between clusters are called unbalanced data. In discriminant analysis of unbalanced data, correctly classifying objects in minority categories is more important than classifying objects in majority categories well, yet minority objects are often misclassified into majority categories. In this study, we propose a method that combines hierarchical DBSCAN (HDBSCAN) with the synthetic minority oversampling technique (SMOTE) to address this problem. HDBSCAN is first used to remove noise from both the minority and majority categories, and SMOTE is then applied to create new data. The area under the ROC curve (AUC) and F1 scores were used to compare performance with existing methods. In most cases, the method combining HDBSCAN and SMOTE showed high performance indices, and it was found to be an excellent method for classifying unbalanced data.
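
The oversampling step in the abstract above can be sketched roughly as follows. This is a minimal illustration of the SMOTE interpolation idea only, not the authors' implementation: the HDBSCAN noise-removal step is omitted, and the function name `smote`, its parameters, and the sample points are assumptions made for the example.

```python
import random
from math import dist

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of base among the other minority samples
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Hypothetical minority-class points (illustrative only)
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
new_points = smote(minority, n_new=6)
```

Each synthetic point lies on a line segment between two existing minority points, so the new data stay inside the region already occupied by the minority class.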

Identifying and Clustering the Flood Impacted Areas for Strategic Information Provision (전략적 정보제공을 위한 침수영향구역 클러스터링)

  • Park, Eun Mi;Bilal, Muhammad
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.20 no.6 / pp.100-109 / 2021
  • Flooding usually disrupts the roadway network and aggravates congestion. The right information should therefore be provided so that road users can avoid flood-impacted areas and city officials can recover the network. However, information about individual link congestion may not be conveyed to road users and city officials, because too many links are congested at the same time. More significant information may therefore be desired, especially in a disaster situation. Such information may include 1) which places to avoid during flooding and 2) which places remain feasible to drive through while avoiding flooding. This paper aims to develop a framework for identifying the flood-impacted areas in a roadway network and their criticality. Various impacted clusters and their spatiotemporal properties were identified from field data. With this information, road users can reroute their trips, and city officials can take the right actions to recover the affected areas. The information resulting from the developed framework would be significant enough for road users and city officials to cope with flooding.

Analysis of Characteristics of NPS Runoff and Pollution Contribution Rate in Songya-stream Watershed (송야천 유역의 비점오염물질 유출 특성 및 오염기여율 분석)

  • Kang Taeseong;Yu Nayeong;Shin Minhwan;Lim Kyoungjae;Park Minji;Park Baekyung;Kim Jonggun
    • Journal of Korean Society on Water Environment / v.39 no.4 / pp.316-328 / 2023
  • In this study, the characteristics of nonpoint pollutant outflow and the pollution contribution rate in the mainstream and tributaries of Songya-stream were analyzed, and water pollution management and improvement measures for pollution-oriented rivers were proposed. An on-site investigation was conducted to determine the inflow of major pollutants into the basin, and it was found that pollutants generated from agricultural land and livestock facilities flowed into the river, resulting in a high concentration of turbid water. Based on the pollution load data calculated through actual measurement monitoring (flow and water quality) and the occurrence and emission load data calculated using the national pollution source survey data, tributaries S3 and S6 were selected as the tributaries of pollution concern in the Songya-stream basin. Results of cluster analysis using Pearson correlation coefficient evaluation and the density-based spatial clustering of applications with noise (DBSCAN) technique showed that S3 and S6 were most consistent with the C2 cluster (a cluster of Songya-stream mainstream owned area) corresponding to the mainstream of Songya-stream. Analysis of the major pollutants in these tributaries showed that livestock and land pollutants were dominant. Consequently, optimal management techniques such as fertilizer management, water gate management in paddy fields, vegetated filter strips, and public treatment of livestock manure were proposed to reduce livestock and land pollutants.

Moving Object Detection and Tracking Techniques for Error Reduction (오인식률 감소를 위한 이동 물체 검출 및 추적 기법)

  • Hwang, Seung-Jun;Ko, Ha-Yoon;Baek, Joong-Hwan
    • Journal of Advanced Navigation Technology / v.22 no.1 / pp.20-26 / 2018
  • Existing studies on moving object detection and tracking have problems with detection error and tracking speed. In this paper, we propose a moving object detection and tracking algorithm based on multi-frame feature point tracking information to reduce false positives. We first calculate corner feature points and the optical flow over multiple frames for camera movement compensation and object tracking. Next, the tracking error of the optical flow is reduced by multi-frame forward-backward tracking, and the tracked feature points are divided into background and moving object candidates based on homography and the RANSAC algorithm for camera movement compensation. Among the transformed corner feature points, the outlier points removed by RANSAC are clustered, and outlier clusters of a certain size are classified as moving object candidates. These candidates are then tracked using data association analysis based on label tracking. Detection and tracking experiments on quadrotor images show that the proposed algorithm improves both precision and recall compared with existing algorithms.
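
The forward-backward check mentioned in the abstract above can be illustrated with a small sketch. It assumes the common formulation in which a point is tracked forward through the frames and then backward to the first frame, and is kept only if it returns close to where it started. The abstract does not state the error measure or threshold, so `max_error`, the function name, and the sample coordinates are hypothetical.

```python
from math import dist

def forward_backward_filter(start_pts, back_pts, max_error=1.0):
    """Return indices of points whose forward-backward error is small.

    start_pts: point positions in the first frame.
    back_pts:  positions obtained by tracking forward, then backward
               to the first frame again.
    """
    return [i for i, (p, q) in enumerate(zip(start_pts, back_pts))
            if dist(p, q) <= max_error]

# Hypothetical tracked points (pixel coordinates, illustrative only)
start_pts = [(10.0, 10.0), (50.0, 20.0), (30.0, 40.0)]
back_pts = [(10.2, 9.9), (57.0, 25.0), (30.1, 40.3)]
reliable = forward_backward_filter(start_pts, back_pts)
```

Here the second point drifts several pixels on the round trip and is discarded, while the other two are kept as reliable tracks.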

Optimal Parameter Analysis and Evaluation of Change Detection for SLIC-based Superpixel Techniques Using KOMPSAT Data (KOMPSAT 영상을 활용한 SLIC 계열 Superpixel 기법의 최적 파라미터 분석 및 변화 탐지 성능 비교)

  • Chung, Minkyung;Han, Youkyung;Choi, Jaewan;Kim, Yongil
    • Korean Journal of Remote Sensing / v.34 no.6_3 / pp.1427-1443 / 2018
  • Object-based image analysis (OBIA) offers higher computational efficiency and better use of the information inherent in an image, as it reduces the complexity of the image while maintaining its properties. Superpixel methods oversegment the image into units smaller than an ordinary object segment and preserve image edges well. SLIC (simple linear iterative clustering) is known to outperform earlier superpixel methods, with high image segmentation quality. Although the input parameter of SLIC, the number of superpixels, has considerable influence on segmentation results, the impact of this parameter has not been investigated enough. In this study, we performed optimal parameter analysis and evaluated change detection for SLIC-based superpixel techniques using KOMPSAT data. For superpixel generation, three superpixel methods (SLIC; SLIC0, the zero-parameter version of SLIC; and SNIC, simple non-iterative clustering) were used with superpixel sizes ranging from $5 \times 5$ to $50 \times 50$ pixels. The segmentation results were then analyzed for how well they preserve the edges of the change detection reference data. Based on the optimal parameter analysis, image segmentation boundaries were obtained from the difference image of the bi-temporal images. DBSCAN (density-based spatial clustering of applications with noise) was then applied to cluster the superpixels into objects of a certain size for change detection. Changes in features were detected for each superpixel and compared with the reference data for evaluation. The change detection results showed that better change detection can be achieved even with larger superpixels if the superpixels are generated with highly regular size and shape.
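
For readers unfamiliar with DBSCAN, the clustering step named in the abstract above works roughly as sketched here. This is a minimal textbook-style version on 2-D points, not the authors' pipeline: how superpixels are turned into points is not described in the abstract, and the values of `eps` and `min_pts` and the sample data are assumptions.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def region(i):
        # indices of all points within eps of point i (including i itself)
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = NOISE              # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster        # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region(j)
            if len(j_neighbours) >= min_pts:   # j is a core point: keep expanding
                queue.extend(j_neighbours)
    return labels

# Two dense groups and one isolated point (illustrative data)
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

Points in dense neighbourhoods are grouped into clusters, and isolated points are labelled as noise rather than being forced into a cluster, which is what distinguishes DBSCAN from partitioning methods that require the number of clusters in advance.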

Color-related Query Processing for Intelligent E-Commerce Search (지능형 검색엔진을 위한 색상 질의 처리 방안)

  • Hong, Jung A;Koo, Kyo Jung;Cha, Ji Won;Seo, Ah Jeong;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.109-125 / 2019
  • As interest in intelligent search engines increases, various studies have been conducted to extract and utilize product-related features intelligently. In particular, when users search for goods in e-commerce search engines, the 'color' of a product is an important feature that describes the product. It is therefore necessary to handle synonyms of color terms in order to return accurate results for users' color-related queries. Previous studies have suggested a dictionary-based approach to processing synonyms for color features. However, the dictionary-based approach cannot handle unregistered color-related terms in user queries. To overcome this limitation of conventional methods, this research proposes a model that extracts RGB values from an internet search engine in real time and outputs similar color names based on the designated color information. First, a color term dictionary was constructed for basic color search; it includes color names and the R, G, and B values of each color from the Korean color standard digital palette program and the Wikipedia color list. The dictionary was made more robust by adding 138 color names, converted from English color names into Korean loanwords, with their corresponding RGB values. The final color dictionary thus includes a total of 671 color names and corresponding RGB values. The proposed method starts by taking the specific color that a user searched for and checking whether it is present in the built-in color dictionary. If the color exists in the dictionary, its RGB values in the dictionary are used as the reference values of the retrieved color. If the searched color does not exist in the dictionary, the top-5 Google image search results for the searched color are crawled and average RGB values are extracted from a certain middle area of each image.
To extract the RGB values from images, a variety of approaches were attempted, since simply averaging the RGB values of the center area of images has limits. As a result, clustering the RGB values in a certain area of each image and taking the average value of the cluster with the highest density as the reference values showed the best performance. The reference RGB values of the searched color are then compared with the RGB values of all the colors in the previously constructed color dictionary. A color list is created from the colors within a range of $\pm 50$ for each of the R, G, and B values. Finally, using the Euclidean distance between these candidates and the reference RGB values of the searched color, the color with the highest similarity among up to five colors becomes the final outcome. To evaluate the usefulness of the proposed method, we performed an experiment in which 300 color names and their corresponding RGB values were obtained through questionnaires. These were compared with the RGB values obtained from four different methods, including the proposed method. The average Euclidean distance in CIE-Lab using our method was about 13.85, a relatively low distance compared to 3088 for the case using the synonym dictionary only and 30.38 for the case using the dictionary with the Korean synonym website WordNet. The variant of the proposed method without clustering showed an average Euclidean distance of 13.88, which implies that the DBSCAN clustering of the proposed method can reduce the Euclidean distance. This research suggests a new color synonym processing method based on RGB values that combines the dictionary method with real-time synonym processing for new color names. This method removes the limitation of the dictionary-based approach, which is a conventional synonym processing method.
This research can contribute to improving the intelligence of e-commerce search systems, especially their color search feature.
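
The matching step described in the abstract above (a per-channel window of ±50 followed by ranking on Euclidean distance) might look roughly like this. It is a sketch under assumptions: the function name `similar_colors`, the tiny dictionary, and the reference value are illustrative, and the real dictionary of 671 names is not reproduced here.

```python
from math import dist

def similar_colors(reference, dictionary, window=50, top_n=5):
    """Return up to top_n color names closest to the reference RGB value.

    Step 1: keep colors whose R, G and B each lie within +/- window.
    Step 2: rank the survivors by Euclidean distance in RGB space.
    """
    candidates = [
        (name, rgb) for name, rgb in dictionary.items()
        if all(abs(c - r) <= window for c, r in zip(rgb, reference))
    ]
    candidates.sort(key=lambda item: dist(item[1], reference))
    return [name for name, _ in candidates[:top_n]]

# Small illustrative dictionary (not the 671-entry dictionary from the paper)
color_dictionary = {
    "red": (255, 0, 0),
    "crimson": (220, 20, 60),
    "tomato": (255, 99, 71),
    "navy": (0, 0, 128),
}
matches = similar_colors((230, 30, 50), color_dictionary)
```

The first entry of the returned list corresponds to the color with the highest similarity; the window filter simply discards colors that differ too much in any single channel before distances are compared.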