• Title/Summary/Keyword: K-means cluster

Classification Tree-Based Feature-Selective Clustering Analysis: Case of Credit Card Customer Segmentation (분류나무를 활용한 군집분석의 입력특성 선택: 신용카드 고객세분화 사례)

  • Yoon Hanseong
    • Journal of Korea Society of Digital Industry and Information Management / v.19 no.4 / pp.1-11 / 2023
  • Clustering analysis is used in various fields, including customer segmentation, and clustering methods such as k-means are actively applied to credit card customer segmentation. In this paper, we summarize an input feature selection method for k-means clustering in the credit card customer segmentation problem and evaluate its feasibility through the analysis results. Using the label values of the k-means clustering results as the target feature of a decision tree classification, we compose a method for prioritizing input features by the information gain of each branch. Effectiveness is not easy to determine from clustering validity indices alone, but for the CH index, cluster validity improves evidently with the presented method compared with randomly determined priorities. The suggested method can be used to improve the effectiveness of widely used clustering analyses, including the k-means method.
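The prioritization idea in this abstract can be illustrated with a minimal sketch. It assumes a simplification that is not taken from the paper: instead of growing a full decision tree, each feature is scored by the best information gain of a single binary threshold split against the cluster labels. The toy data, the feature names `monthly_spend` and `age`, and the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split_gain(values, labels):
    """Largest information gain over all binary threshold splits of one feature."""
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        best = max(best, gain)
    return best

def rank_features(rows, labels, names):
    """Return (feature name, gain) pairs sorted from most to least informative."""
    gains = {name: best_split_gain([r[j] for r in rows], labels)
             for j, name in enumerate(names)}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data: (monthly_spend, age) per customer.
# The labels stand in for the output of a k-means run (assumption).
rows = [(10, 30), (12, 55), (11, 41), (90, 33), (95, 52), (88, 47)]
labels = [0, 0, 0, 1, 1, 1]
ranking = rank_features(rows, labels, ["monthly_spend", "age"])
print(ranking)
```

In this toy case `monthly_spend` separates the two clusters perfectly and ranks first, while `age` carries less information about the cluster labels.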

Adaptive k-means clustering for Flying Ad-hoc Networks

  • Raza, Ali;Khan, Muhammad Fahad;Maqsood, Muazzam;Haider, Bilal;Aadil, Farhan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.6 / pp.2670-2685 / 2020
  • Flying ad-hoc networks (FANETs) are a vibrant research area. This type of network has applications ranging from military to civilian uses. A FANET is formed by micro and macro UAVs. Among many other problems, there are two main issues in FANETs: the limited energy and high mobility of FANET nodes directly affect flight time and routing. Clustering is a remedy for these problems. In this paper, an efficient clustering technique is proposed to handle the routing and energy problems. The transmission range of FANET nodes is dynamically tuned according to their operational requirements. By optimizing the transmission range, the packet loss ratio (PLR) is minimized and link quality is improved, which leads to reduced energy consumption. We use k-means to elect optimal cluster heads (CHs) based on their fitness. Selection of optimal CHs reduces the routing overhead and improves energy consumption. Our proposed scheme outperforms the existing state-of-the-art techniques, the ACO-based CACONET and the PSO-based CLPSO, in terms of energy consumption and cluster building time.

Partial Discharge Diagnosis of Interface Defect by the Distribution Statistical Analysis (분포 통계 해석에 의한 계면 결함 부분방전 진단)

  • Cho, Kyung-Soon;Lee, Kang-Won;Kim, Won-Jong;Hong, Jin-Woong;Shin, Jong-Yeol
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.21 no.4 / pp.348-353 / 2008
  • Most high-voltage insulation systems, such as power cable joints with hetero interfaces, are composed of more than two different insulators to improve insulating performance. Partial discharge (PD) at these hetero interfaces is expected to affect the total insulation performance, so it is important to study the electrical properties of these interfaces. This study describes the influence of copper and semiconductive substance defects on the $\Phi$-q-n distribution at the interface of model cable joints in order to classify the PD source. PD was sequentially detected for 600 cycles of the applied voltage. K-means cluster analysis was applied to investigate the $\Phi$-q-n distribution. A skewness-kurtosis (Sk-Ku) plot was defined from the K-means clustering results to quantify the cluster distribution and classify distribution patterns. The Sk-Ku plot has skewness along the abscissa and kurtosis along the ordinate, which indicate the asymmetry and the sharpness of the distribution, respectively. The Sk-Ku plot confirmed that the data were distributed in the 1st, 2nd, and 3rd quadrants for the copper foreign substance defect, but only in the 2nd quadrant for the semiconductive foreign substance defect.
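The Sk-Ku coordinates described in this abstract can be sketched as follows. The sketch assumes population (biased) moment estimators and excess kurtosis (kurtosis minus 3), so that a normal distribution sits at the origin; the abstract does not state which convention the authors used. The sample values and the treatment of points lying exactly on an axis are also assumptions.

```python
def skewness_kurtosis(xs):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0  # excess kurtosis (assumed convention)
    return skew, kurt

def quadrant(skew, kurt):
    """Quadrant of the Sk-Ku plane: skewness on the abscissa, kurtosis on the ordinate."""
    if skew >= 0 and kurt >= 0:
        return 1
    if skew < 0 and kurt >= 0:
        return 2
    if skew < 0 and kurt < 0:
        return 3
    return 4

# Hypothetical samples standing in for the members of one k-means cluster.
right_tailed = [1, 1, 1, 1, 10]
left_tailed = [-10, 1, 1, 1, 1]
sk_r, ku_r = skewness_kurtosis(right_tailed)
sk_l, ku_l = skewness_kurtosis(left_tailed)
print(quadrant(sk_r, ku_r), quadrant(sk_l, ku_l))
```

Under these conventions a left-tailed, sharply peaked cluster falls in the 2nd quadrant, which is where the abstract reports the semiconductive-defect data.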

Multiscale Clustering and Profile Visualization of Malocclusion in Korean Orthodontic Patients : Cluster Analysis of Malocclusion

  • Jeong, Seo-Rin;Kim, Sehyun;Kim, Soo Yong;Lim, Sung-Hoon
    • International Journal of Oral Biology / v.43 no.2 / pp.101-111 / 2018
  • Understanding the classification of malocclusion is a crucial issue in orthodontics. It can also help us to diagnose, treat, and understand malocclusion and to establish a standard for definite classes of patients. Principal component analysis (PCA) and k-means algorithms have been emerging as data analytic methods for cephalometric measurements because of their intuitive concepts and application potential. This study analyzed the macro- and meso-scale classification structure and feature basis vectors of 1020 orthodontic patients (415 male, 605 female; mean age, 25 years) using statistical preprocessing, PCA, random matrix theory (RMT), and k-means algorithms. RMT results show that 7 principal components (PCs) are significant for feature extraction. Using k-means algorithms, 3 and 6 clusters were identified, and the axes of PC1 to PC3 were determined to be significant for patient classification. The macro-scale classification corresponds to skeletal Class I, II, and III; PC1 represents the anteroposterior discrepancy of the maxilla and mandible and the mandibular position. PC2 and PC3 represent the vertical pattern and the maxillary position, respectively; they played significant roles in the meso-scale classification. In conclusion, the typical patient profile (TPP) of each class showed that the data-based classification corresponds with the clinical classification of orthodontic patients. This data-based study can provide insight into the development of new diagnostic classifications.

Evaluation of Shopping Items: Focused on Purchase of Foreign Tourists in South Korea

  • Jeong, Dong-Bin
    • East Asian Journal of Business Economics (EAJBE) / v.7 no.2 / pp.21-30 / 2019
  • Purpose - In this work, we categorize the 21 shopping items that foreign tourists purchase in South Korea and examine the level of dissimilarity (or similarity) between items by using a distance matrix together with hierarchical and k-means cluster analyses, based on several purpose-of-visit attributes in 2017. In addition, multidimensional scaling (MDS) is applied to visualize the proximities among shopping items based on the purpose-of-visit attributes. Research design and methodology - This study uses a face-to-face survey, carried out in 2017 by the Ministry of Culture, Sports and Tourism, of foreign tourists from 20 countries who purchased shopping items in South Korea. The CLUSTER, PROXIMITIES and ALSCAL modules in IBM SPSS 23.0 are used to perform this work. Results - We ascertain through two-step cluster analysis that the 21 shopping items can be classified into five groups with homogeneous traits. We can position the clusters and the shopping items belonging to each cluster. Conclusions - We can assess the relative patterns and characteristics of each shopping item, obtain useful information for promoting shopping tourism based on how foreign tourists actually perceive it, and apply the results in practice to each tourism industry.

Fast Search Algorithm for Determining the Optimal Number of Clusters using Cluster Validity Index (클러스터 타당성 평가기준을 이용한 최적의 클러스터 수 결정을 위한 고속 탐색 알고리즘)

  • Lee, Sang-Wook
    • The Journal of the Korea Contents Association / v.9 no.9 / pp.80-89 / 2009
  • A fast and efficient search algorithm for determining the optimal number of clusters in clustering algorithms is presented. The method is based on a cluster validity index, which is a measure of clustering optimality. As the clustering procedure progresses and reaches an optimal cluster configuration, the cluster validity index is expected to be minimized or maximized. In this paper, a fast non-exhaustive search method for finding the optimal number of clusters is designed and shown to work well in clustering. The proposed algorithm is implemented with the k-means++ algorithm as the underlying clustering technique, using CB and PBM as cluster validity indices. Experimental results show that the proposed method improves computation time without loss of accuracy on several artificial and real-life data sets.
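As a rough illustration of how a validity index scores a partition, the sketch below computes the PBM index in its commonly cited form, PBM(K) = ((1/K) · (E_1/E_K) · D_K)^2, where E_1 is the summed distance of all points to the global centroid, E_K the summed distance of points to their own cluster centroids, and D_K the largest distance between two cluster centroids. The abstract does not give the formula, so this form is an assumption. The paper's fast search procedure and the CB index are not reproduced here, and the toy points are hypothetical.

```python
import math

def centroid(pts):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(pts)
    return tuple(sum(c) / n for c in zip(*pts))

def pbm_index(points, labels):
    """PBM validity index of a labeled partition (larger is better).

    Assumes at least two clusters and a nonzero within-cluster spread.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    members = list(clusters.values())
    centers = [centroid(m) for m in members]
    grand = centroid(points)
    e1 = sum(math.dist(p, grand) for p in points)
    ek = sum(math.dist(p, c) for c, m in zip(centers, members) for p in m)
    dk = max(math.dist(a, b) for a in centers for b in centers)
    return (e1 / ek * dk / k) ** 2

# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = pbm_index(points, [0, 0, 0, 1, 1, 1])  # partition matching the groups
bad = pbm_index(points, [0, 1, 0, 1, 0, 1])   # partition mixing the groups
print(good, bad)
```

A search over the number of clusters would evaluate such an index for candidate values of K and keep the K that maximizes it; the partition that matches the two groups scores far higher here than the mixed one.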

The Study on Typology of Internet Shopping Style in Internet Shopping Mall Users (인터넷쇼핑몰 이용 소비자의 쇼핑스타일 유형에 관한 연구)

  • Moon Sook-jae;Lee Youn Hee;Cheon Hyejung
    • Journal of the Korean Home Economics Association / v.43 no.9 s.211 / pp.1-13 / 2005
  • The purposes of this study were to classify internet shopping mall users by their shopping styles and to define the characteristics of the classified clusters. Questionnaires were completed by 338 men and women who had used internet shopping malls at least once during the previous 6 months. The internet shopping styles were classified into 4 clusters after factor analysis and k-means cluster analysis. Cluster I, named 'high brand proneness', can be described as having a low score on devotee tendency. Cluster II, named 'high value proneness', is characterized by a high score on seeking substance. Cluster III, called 'steadiness orientation', can be described as having a low score on seeking trend and substance. Cluster IV, named 'individuality inclination', can be described as having a low score on seeking trend. These four clusters differ in terms of socio-demographic and environmental characteristics such as gender, age, educational level, occupation, and internet usage time. Theoretical and practical implications are discussed.

Analysis of Area Type Classification of Seoul Using Geodemographics Methods (Geodemographics의 연구기법을 활용한 서울시 지역유형 분석 연구)

  • Woo, Hyun-Jee;Kim, Young-Hoon
    • Journal of the Korean association of regional geographers / v.15 no.4 / pp.510-523 / 2009
  • Geodemographics (GD) can be defined as an analytical approach to socio-economic and behavioral data about people that investigates geographical patterns. GD is based on the assumption that the demographic and behavioral characteristics of people who live in the same neighborhood are similar, so that neighborhoods can be categorized by geographical classification. To assess the applicability of GD geographical classification, this paper applies the concepts of geodemographics to Seoul city areas using Korean census data sets that contain key characteristics of the demographic profiles of each area. This paper then attempts to explain each area classification profile by using clustering techniques, namely Ward's method and k-means. For this purpose, this paper employs the 2005 Census dataset released by the Korea National Statistics Office, and the neighborhood unit is the Dong, the smallest administrative boundary unit in Korea. After variables are selected and standardized, the areas are categorized by the clustering techniques into 13 clusters with distinctive cluster profiles. These cluster profiles are used to create a short description and expand on the cluster names. Finally, the classification results offer a reasonable basis for judging target area types, which benefits people who make spatial decisions in spatial problem solving.
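The standardize-then-cluster step mentioned in this abstract might look like the minimal sketch below. It assumes z-score standardization and plain Lloyd-style k-means with caller-supplied initial centers; the abstract does not specify the standardization used, and Ward's method is not shown. The variables `population_density` and `mean_age` and all values are hypothetical, not taken from the 2005 census data.

```python
import math

def zscore_columns(rows):
    """Standardize each column to mean 0 and population standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) for c, m in zip(cols, means)]
    return [tuple((x - m) / s if s else 0.0 for x, m, s in zip(r, means, sds))
            for r in rows]

def kmeans(points, centers, iters=100):
    """Lloyd's algorithm: alternate nearest-center assignment and centroid update."""
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        new = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            new.append(tuple(sum(c) / len(members) for c in zip(*members))
                       if members else centers[j])
        if new == centers:
            break
        centers = new
    return labels, centers

# Hypothetical neighborhood rows: (population_density, mean_age).
rows = [(20000, 35), (21000, 36), (19500, 34), (5000, 50), (5500, 52), (4800, 49)]
z = zscore_columns(rows)
labels, centers = kmeans(z, [z[0], z[3]])  # initial centers chosen by hand (assumption)
print(labels)
```

Standardizing first keeps a large-valued variable such as density from dominating the Euclidean distance, so both variables contribute to the grouping.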

CLUSTER ANALYSIS FOR REGION ELECTRIC LOAD FORECASTING SYSTEM

  • Park, Hong-Kyu;Kim, Young-Il;Park, Jin-Hyoung;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2007.10a / pp.591-593 / 2007
  • This paper clusters AMR (Automatic Meter Reading) data. A load survey system has been applied to record the power consumption of a sample of the contract assortment in KEPRI AMR. The effect of a change in contract assortment on customer power consumption is determined by clustering the load survey results. Power can then be supplied to customers according to their usage, based on the analyzed clusters. Korea has fewer classes of electricity supply type than other countries because the Korean electricity market has a single electricity provider. The supply types need to be divided further for more efficient supply. We found consumption patterns that differ from the supply type assigned to customers. Our experiments use Clementine, a data mining tool.

A Image Contrast Enhancement by Clustering of Image Histogram (영상의 히스토그램 군집화에 의한 영상 대비 향상)

  • Hong, Seok-Keun;Lee, Ki-Hwan;Cho, Seok-Je
    • Journal of the Institute of Convergence Signal Processing / v.10 no.4 / pp.239-244 / 2009
  • Image contrast enhancement plays an important role in image processing applications. Conventional contrast enhancement techniques, such as histogram stretching, histogram equalization, and the many methods based on histogram equalization, often fail to produce satisfactory results for a broad variety of low-contrast images. This paper therefore proposes a new image contrast enhancement method based on clustering. The number of histogram clusters is found by analyzing the histogram of the original image. The histogram components are classified using the K-means algorithm. Histogram stretching and histogram equalization are then applied selectively to these histogram components by comparing the cluster range with the pixel rate of each cluster. Experimental results show that the proposed method is more effective than conventional contrast enhancement techniques.
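The histogram-clustering step can be sketched as a one-dimensional k-means over gray levels, with each level weighted by its pixel count. This is an illustration under assumptions: the number of clusters is passed in rather than derived from the histogram as in the paper, the initial centers are spread evenly over the occupied gray range, and the selective stretching and equalization step is not reproduced. The function name and the toy histogram are hypothetical.

```python
def kmeans_histogram(hist, k, iters=50):
    """Cluster occupied gray levels with 1-D k-means, weighted by pixel counts."""
    levels = [g for g, c in enumerate(hist) if c > 0]
    lo, hi = levels[0], levels[-1]
    # Deterministic start: centers spread evenly over the occupied range (assumption).
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    assign = {}
    for _ in range(iters):
        # Assign every occupied gray level to its nearest center.
        assign = {g: min(range(k), key=lambda j: abs(g - centers[j])) for g in levels}
        new = []
        for j in range(k):
            weight = sum(hist[g] for g in levels if assign[g] == j)
            total = sum(g * hist[g] for g in levels if assign[g] == j)
            # Count-weighted mean gray level; keep the old center if the cluster is empty.
            new.append(total / weight if weight else centers[j])
        if new == centers:
            break
        centers = new
    return centers, assign

# Hypothetical 256-bin histogram with a dark mode and a bright mode.
hist = [0] * 256
for g in range(10, 21):
    hist[g] = 5
for g in range(200, 211):
    hist[g] = 5
centers, assign = kmeans_histogram(hist, 2)
print(centers)
```

Each resulting cluster gives a gray-level range and a pixel share, which are the two quantities the abstract says are compared when choosing between stretching and equalization.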
