• Title/Abstract/Keywords: K-means Clustering Analysis

462 search results (processing time: 0.023 s)

Corrosion image analysis on galvanized steel by using superpixel DBSCAN clustering algorithm

  • 김범수;김연원;이경황;양정현
    • 한국표면공학회지 / Vol. 55, No. 3 / pp. 164-172 / 2022
  • Hot-dip galvanized (GI) steel is widely used throughout industry as a corrosion-resistant material. Corrosion of steel is a common phenomenon that causes gradual degradation under various environmental conditions. Corrosion monitoring tracks the progress of this degradation over a long period. Corrosion on a steel plate appears as discoloration and irregularities on the surface. This study developed a method for quantitatively evaluating the rust formed on GI steel plate by applying a superpixel-based DBSCAN clustering method and k-means clustering to the corroded area in a given image. The superpixel-based DBSCAN clustering method decreases computational cost and achieves automatic segmentation. The color of the rusty surface was analyzed quantitatively in the HSV (Hue, Saturation, Value) color space. In addition, the two segmentation methods are compared for a particular spatial region using their histograms.
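
As a rough illustration of HSV-based color quantification, the sketch below converts RGB pixels to HSV with Python's standard `colorsys` module and reports the fraction of pixels whose hue falls in a reddish-brown band. The pixel values, the hue band, the saturation threshold, and the function name `rust_fraction` are assumptions made for illustration and are not taken from the paper; the superpixel and DBSCAN stages are omitted.

```python
import colorsys


def rust_fraction(pixels, hue_band=(0.0, 0.11), min_saturation=0.3):
    """Return the fraction of RGB pixels whose HSV hue lies in hue_band.

    pixels: iterable of (r, g, b) tuples with 0-255 channel values.
    hue_band and min_saturation are hypothetical thresholds.
    """
    pixels = list(pixels)
    if not pixels:
        return 0.0
    hits = 0
    for r, g, b in pixels:
        # colorsys works on floats in [0, 1] and returns hue in [0, 1).
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if hue_band[0] <= h <= hue_band[1] and s >= min_saturation:
            hits += 1
    return hits / len(pixels)


# Toy "image": two rust-colored pixels and two gray, zinc-like pixels.
sample = [(150, 75, 30), (170, 80, 40), (180, 180, 185), (200, 200, 200)]
print(rust_fraction(sample))  # 0.5
```

In practice the same test would be applied per segmented region rather than to the whole image, so that the rust ratio reflects only the corroded area.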

Creating a Smartphone User Recommendation System Using Clustering

  • Jin Hyoung AN
    • Journal of Korea Artificial Intelligence Association / Vol. 2, No. 1 / pp. 1-6 / 2024
  • In this paper, we develop an AI-based recommendation system that matches users to the specifications of smartphones from company 'S'. The system aims to simplify consumers' complex decision-making process and guide them to the smartphone that best suits their daily needs. The recommendation system analyzes five smartphone specifications (price, battery capacity, weight, camera quality, capacity) to help users make informed decisions without searching through extensive information. This approach not only saves time but also improves user satisfaction by ensuring that the selected smartphone closely matches the user's lifestyle and needs. The system uses unsupervised learning, i.e., clustering (K-means, DBSCAN, hierarchical clustering), and provides personalized recommendations by evaluating the clusterings with silhouette scores, so that similar smartphone models are grouped reliably. The analysis can identify subtle patterns and preferences that might not be immediately apparent to consumers. This paper discusses the data collection, preprocessing, development, implementation, and potential impact of the system using Pandas, crawling, scikit-learn, etc., and highlights the benefits of helping consumers explore the available options and confidently choose the smartphone that best suits their daily lives.
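
To make the silhouette-based evaluation concrete, here is a minimal pure-Python sketch of the mean silhouette coefficient. The paper reports using scikit-learn; this standalone version, the function name, and the two-feature toy data (normalized price and battery capacity) are assumptions for illustration only.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient for points (tuples) and cluster labels."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical normalized (price, battery capacity) for four phone models.
phones = [(0.10, 0.20), (0.15, 0.25), (0.90, 0.80), (0.85, 0.90)]
good = silhouette_score(phones, [0, 0, 1, 1])  # budget vs. flagship
bad = silhouette_score(phones, [0, 1, 0, 1])   # mixed groups
print(good > bad)  # True
```

Scores close to 1 indicate compact, well-separated clusters, which is why the score can be used to compare the output of different clustering algorithms on the same data.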

A New Approach for Hierarchical Dividing to Passenger Nodes in Passenger Dedicated Line

  • Zhao, Chanchan;Liu, Feng;Hai, Xiaowei
    • Journal of Information Processing Systems / Vol. 14, No. 3 / pp. 694-708 / 2018
  • China has a large-scale passenger dedicated line system with unevenly distributed passenger flow intensity and passenger nodes with complicated relations. Consequently, the significance of the passenger nodes should be considered and their dissimilarity analyzed when compiling passenger train operations and conducting transportation allocation. For this purpose, the passenger nodes need to be divided hierarchically. To address problems in current research, such as a hierarchical dividing process that is vulnerable to subjective factors and to local optima, we propose a clustering approach based on the self-organizing map (SOM) and k-means, and use it to divide the passenger nodes of passenger dedicated lines hierarchically. Specifically, objective passenger node parameters are first selected and SOM is used to produce a preliminary clustering of the passenger nodes; second, the Davies-Bouldin index is used to determine the number of clusters; and third, k-means is used to perform the accurate clustering, yielding the hierarchical division of the passenger nodes. An example analysis demonstrates the feasibility and rationality of the algorithm.
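
The role of the Davies-Bouldin index in choosing the number of clusters can be sketched as follows. This is a minimal standalone version of the standard index (lower is better); the toy points and the two hand-made candidate partitions are assumptions, and the SOM and k-means stages of the paper are not reproduced.

```python
import math


def centroid(cluster):
    """Coordinate-wise mean of a list of points."""
    dim = len(cluster[0])
    return tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dim))


def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition given as a list of point lists."""
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance of each cluster's points to its centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    worst = []
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        worst.append(max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i
        ))
    return sum(worst) / k


# Six toy points forming three tight pairs.
pts = [(0, 0), (0, 1), (10, 0), (10, 1), (20, 0), (20, 1)]
db_k2 = davies_bouldin([pts[:4], pts[4:]])
db_k3 = davies_bouldin([pts[:2], pts[2:4], pts[4:]])
print(db_k3 < db_k2)  # True: the three-cluster partition scores better
```

Evaluating the index for several candidate values of k and keeping the minimum gives the cluster count that is then passed to k-means.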

Parallel Algorithm For Level Clustering

  • 배용근
    • 한국정보처리학회논문지 / Vol. 2, No. 2 / pp. 148-155 / 1995
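  • When analyzing a large number of patterns, the patterns need to be clustered into several groups according to some evaluation function. When the number of input patterns is large, this process requires a considerable amount of computation, which calls for a parallel algorithm. To address this problem, this paper proposes a parallel clustering algorithm that parallelizes the K-means algorithm, and runs it on a message-passing MIMD parallel computer. Experiments and performance analysis show that the parallel algorithm is well suited to cases with many input patterns.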
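
One common way to parallelize K-means is to split the input patterns across workers, let each worker compute per-cluster partial sums and counts, and merge them in a coordinator. The sketch below illustrates that decomposition with Python threads; it is an assumption about the general approach, not the paper's algorithm, and threads stand in for the message-passing MIMD nodes only to show the data flow (they give no real speedup in CPython).

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_stats(chunk, centers):
    """Worker step: assign each point to its nearest center and return
    per-center coordinate sums and counts for this chunk only."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for p in chunk:
        idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        counts[idx] += 1
        for d in range(dim):
            sums[idx][d] += p[d]
    return sums, counts


def parallel_kmeans(points, centers, workers=2, iterations=10):
    """Coordinator: scatter chunks, gather partial results, update centers."""
    chunks = [points[i::workers] for i in range(workers)]
    dim = len(centers[0])
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(iterations):
            results = list(pool.map(lambda c: partial_stats(c, centers), chunks))
            new_centers = []
            for k in range(len(centers)):
                count = sum(r[1][k] for r in results)
                if count == 0:
                    new_centers.append(centers[k])  # keep an empty cluster's center
                    continue
                new_centers.append(tuple(
                    sum(r[0][k][d] for r in results) / count for d in range(dim)
                ))
            if new_centers == centers:
                break  # converged
            centers = new_centers
    return centers


pts = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(parallel_kmeans(pts, [(1, 1), (9, 1)]))  # [(0.0, 1.0), (10.0, 1.0)]
```

Only the small per-cluster sums and counts travel between workers and coordinator, which is what makes this scheme attractive when the number of input patterns is large.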

Bootstrap Analysis and Major DNA Markers of BM4311 Microsatellite Locus in Hanwoo Chromosome 6

  • Yeo, Jung-Sou;Kim, Jae-Woo;Shin, Hyo-Sub;Lee, Jea-Young
    • Asian-Australasian Journal of Animal Sciences / Vol. 17, No. 8 / pp. 1033-1038 / 2004
  • LOD scores related to marbling scores and a permutation test were applied to detect quantitative trait loci (QTL), and BM4311 was selected as a major locus. K-means clustering was then used to mine the major DNA markers of the BM4311 microsatellite locus on Hanwoo chromosome 6: five traits were divided into three cluster groups, and the three cluster groups were classified according to six DNA markers. Finally, a bootstrap test, which calculates confidence intervals by resampling, was adopted to find the major DNA markers. It was concluded that the major markers of the BM4311 locus on Hanwoo chromosome 6 are the 100 bp and 95 bp DNA markers.
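
The resampling idea behind a bootstrap confidence interval can be sketched as below. This is a generic percentile bootstrap for the mean; the marbling-score values, the number of resamples, and the function name are hypothetical and do not come from the paper.

```python
import random
import statistics


def bootstrap_ci(values, stat=statistics.mean, n_resamples=2000,
                 alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(values)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat([rng.choice(values) for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical marbling scores for animals carrying one marker allele.
scores = [3, 4, 2, 5, 4, 3, 6, 4, 5, 3]
low, high = bootstrap_ci(scores)
print(low, high)
```

Comparing such intervals between marker groups is one way to judge whether a marker is associated with a clearly different trait level.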

Comparison of the Performance of Clustering Analysis using Data Reduction Techniques to Identify Energy Use Patterns

  • Song, Kwonsik;Park, Moonseo;Lee, Hyun-Soo;Ahn, Joseph
    • International Conference Proceedings / The 6th International Conference on Construction Engineering and Project Management / pp. 559-563 / 2015
  • Identifying energy use patterns in buildings offers a great opportunity for energy saving. To find which energy use patterns exist, clustering analysis such as K-means and hierarchical clustering has commonly been used. For high-dimensional data such as energy use time series, data reduction should be considered to avoid the curse of dimensionality. Principal Component Analysis, Autocorrelation Function, Discrete Fourier Transform and Discrete Wavelet Transform have been widely used to map the original data into lower-dimensional spaces. However, an open issue remains because the performance of clustering analysis depends on data type, purpose and application. Therefore, we need to understand which data reduction techniques are suitable for energy use management. This research aims to find the best clustering method using energy use data obtained from the Seoul National University campus. The results show that most experiments with data reduction techniques perform better. The results can also help facility managers optimally control energy systems such as HVAC to reduce energy use in buildings.
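
As an example of one of the data reduction techniques named above, the sketch below reduces a daily load profile to the magnitudes of its first few Discrete Fourier Transform coefficients, which could then be fed to a clustering algorithm. The number of retained coefficients, the function name, and the synthetic 24-hour profile are assumptions for illustration.

```python
import cmath
import math


def dft_features(series, n_coeffs=3):
    """Magnitudes of the first n_coeffs DFT coefficients, scaled by 1/n."""
    n = len(series)
    feats = []
    for k in range(n_coeffs):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(series)
        )
        feats.append(abs(coeff) / n)
    return feats


# Synthetic hourly profile: base load 5 plus one daily cycle of amplitude 2.
profile = [5 + 2 * math.cos(2 * math.pi * t / 24) for t in range(24)]
features = dft_features(profile)
print([round(f, 6) for f in features])  # [5.0, 1.0, 0.0]
```

Here 24 hourly readings shrink to 3 features: the mean level, the strength of the daily cycle, and the (absent) half-day cycle.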

Classification of Weather Patterns in the East Asia Region using the K-means Clustering Analysis

  • 조영준;이현철;임병환;김승범
    • 대기 (Atmosphere) / Vol. 29, No. 4 / pp. 451-461 / 2019
  • Medium-range forecasting is highly dependent on ensemble forecast data. However, operational weather forecasters do not have enough time to digest all the detailed features revealed in ensemble forecast data. To use the ensemble data effectively in medium-range forecasting, this study defines representative weather patterns in East Asia. K-means clustering analysis is applied to classify the weather patterns objectively. The input data are daily Mean Sea Level Pressure (MSLP) anomalies from the ECMWF ReAnalysis-Interim (ERA-Interim) for 1981-2010 (30 years), provided by the European Centre for Medium-Range Weather Forecasts (ECMWF). Based on the Explained Variance (EV), the optimal study area is defined as 20-60°N, 100-150°E. The number of clusters determined by the Explained Cluster Variance (ECV) is thirty (k = 30). The 30 representative weather patterns and their frequencies are summarized. Weather pattern #1 occurred in all seasons, with about 56% of its occurrences in summer (June to September). The relatively rare weather pattern #30 occurred mainly in winter. Additionally, we investigate the relationship between weather patterns and extreme weather events such as heat waves, cold waves, heavy rainfall, and snowfall. The weather patterns associated with heavy rainfall exceeding 110 mm day⁻¹ were #1, #4, and #9, each with a share of days of more than 10%. Heavy snowfall events exceeding 24 cm day⁻¹ mainly occurred in weather patterns #28 (4%) and #29 (6%). High and low temperature events (> 34℃ and < -14℃) were associated with weather patterns #1-4 (14-18%) and #28-29 (27-29%), respectively. These results suggest that the classification of weather patterns can serve as a reference for grouping all ensemble forecast data, which will be useful for scenario-based medium-range ensemble forecasting in the future.
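
The abstract does not define EV or ECV. As a hedged illustration of the general idea, the sketch below computes one common explained-variance measure for a partition, 1 - WSS/TSS (within-cluster over total sum of squares). The formula choice, the function name, and the one-dimensional toy data are assumptions and may differ from the paper's definitions.

```python
def explained_variance(clusters):
    """Fraction of total variance explained by a partition: 1 - WSS / TSS.

    clusters: list of clusters, each a list of equal-length point tuples.
    Assumes the points are not all identical (TSS > 0).
    """
    all_points = [p for c in clusters for p in c]
    dim = len(all_points[0])

    def mean(points):
        return [sum(p[d] for p in points) / len(points) for d in range(dim)]

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    grand_mean = mean(all_points)
    tss = sum(sq_dist(p, grand_mean) for p in all_points)
    wss = sum(sq_dist(p, mean(c)) for c in clusters for p in c)
    return 1.0 - wss / tss


# Toy 1-D "pressure anomaly" values split into two clusters.
ev_two = explained_variance([[(0,), (2,)], [(10,), (12,)]])
ev_one = explained_variance([[(0,), (2,), (10,), (12,)]])
print(round(ev_two, 4), ev_one)  # 0.9615 0.0
```

Plotting such a measure against k and looking for the point where gains level off is a typical way to settle on a cluster count.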

A Study on the Integration Between Smart Mobility Technology and Information Communication Technology (ICT) Using Patent Analysis

  • Alkaabi, Khaled Sulaiman Khalfan Sulaiman;Yu, Jiwon
    • 한국컴퓨터정보학회논문지 / Vol. 24, No. 6 / pp. 89-97 / 2019
  • This study proposes a method for investigating current patents related to information communication technology and smart mobility to provide insights into future technology trends. The method is based on text mining clustering analysis. The method consists of two stages, which are data preparation and clustering analysis, respectively. In the first stage, tokenizing, filtering, stemming, and feature selection are implemented to transform the data into a usable format (structured data) and to extract useful information for the next stage. In the second stage, the structured data is partitioned into groups. The K-medoids algorithm is selected over the K-means algorithm for this analysis owing to its advantages in dealing with noise and outliers. The results of the analysis indicate that most current patents focus mainly on smart connectivity and smart guide systems, which play a major role in the development of smart mobility.
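
The robustness argument for K-medoids can be seen in a one-dimensional toy case: a cluster center computed as a mean is dragged toward an outlier, whereas a medoid must be an actual data point and stays near the bulk of the data. The data values and function names below are illustrative assumptions.

```python
def mean_center(values):
    """Cluster center as used by K-means: the arithmetic mean."""
    return sum(values) / len(values)


def medoid(values):
    """Cluster center as used by K-medoids: the data point with the
    smallest total absolute distance to all other points."""
    return min(values, key=lambda c: sum(abs(c - v) for v in values))


data = [1.0, 2.0, 3.0, 4.0, 100.0]  # 100.0 is an outlier
print(mean_center(data))  # 22.0
print(medoid(data))       # 3.0
```

A full K-medoids run repeats this medoid selection per cluster after each assignment step.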

An eigenspace projection clustering method for structural damage detection

  • Zhu, Jun-Hua;Yu, Ling;Yu, Li-Li
    • Structural Engineering and Mechanics / Vol. 44, No. 2 / pp. 179-196 / 2012
  • An eigenspace projection clustering method is proposed for structural damage detection by combining a projection algorithm with a fuzzy clustering technique. The integrated procedure includes data selection, data normalization, projection, damage feature extraction, and a clustering algorithm for structural damage assessment. The frequency response functions (FRFs) of the healthy and the damaged structure are used as initial data, the median values of the projections are taken as damage features, and the fuzzy c-means (FCM) algorithm is used to categorize these features. The performance of the proposed method has been validated using a three-story frame structure built and tested by Los Alamos National Laboratory, USA. Two projection algorithms, principal component analysis (PCA) and kernel principal component analysis (KPCA), are compared for better extraction of damage features; furthermore, six kinds of distances adopted in the FCM process are studied and discussed. The results reveal that the choice of distance depends on the distribution of the features. For the optimal choice of projections, the Cosine distance is recommended for the PCA, while the Seuclidean distance and the Cityblock distance are suitable for the KPCA. The PCA method is recommended when a large amount of data needs to be processed, owing to its higher rate of correct decisions and lower computational cost.
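
For reference, the three distances singled out above can be written in a few lines. "Seuclidean" is read here as the standardized Euclidean distance (each squared difference divided by that feature's variance), following common numerical-library naming; that reading and the function names are assumptions.

```python
import math


def cityblock(u, v):
    """Cityblock (Manhattan) distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def cosine_distance(u, v):
    """Cosine distance: one minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def seuclidean(u, v, variances):
    """Standardized Euclidean distance with per-feature variances."""
    return math.sqrt(sum(
        (a - b) ** 2 / var for a, b, var in zip(u, v, variances)
    ))


print(cityblock((1, 2), (4, 6)))             # 7
print(cosine_distance((1, 0), (0, 1)))       # 1.0
print(seuclidean((0, 0), (2, 0), (4, 1)))    # 1.0
```

Swapping one of these functions into the distance computation of FCM changes which features end up in the same cluster, which is why the best choice depends on how the features are distributed.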

A Simulation Study on The Behavior Analysis of The Degree of Membership in Fuzzy c-means Method

  • Okazaki, Takeo;Aibara, Ukyo;Setiyani, Lina
    • IEIE Transactions on Smart Processing and Computing / Vol. 4, No. 4 / pp. 209-215 / 2015
  • The fuzzy c-means method is a typical soft clustering method that requires a degree of membership, which indicates the degree of belonging to each cluster at the time of clustering. By convention, parameter values greater than 1 and less than 2 have been used. From the proposed data-generation scheme and the simulation results, some behaviors of the degree of "fuzziness" were derived.
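
How the fuzziness parameter shapes the memberships can be sketched with the standard fuzzy c-means membership formula, u_i = 1 / Σ_k (d_i / d_k)^(2/(m-1)). The symbol m, the function name, and the example distances are assumptions; the abstract does not name the parameter or give the formula.

```python
def fcm_memberships(distances, m):
    """Memberships of one point in each cluster, given its distances to
    the cluster centers and a fuzziness parameter m > 1."""
    if m <= 1:
        raise ValueError("m must be greater than 1")
    if any(d == 0 for d in distances):
        # The point coincides with one or more centers: share membership
        # equally among those centers only.
        hits = [1.0 if d == 0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** power for d_k in distances)
        for d_i in distances
    ]


dists = [1.0, 3.0]  # distances from one point to two cluster centers
soft = fcm_memberships(dists, m=2.0)
crisp = fcm_memberships(dists, m=1.1)
print([round(u, 3) for u in soft])  # [0.9, 0.1]
print(crisp[0] > soft[0])           # True: smaller m gives crisper memberships
```

As m approaches 1 the memberships approach a hard assignment, and larger m spreads membership more evenly across clusters.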