• Title/Summary/Keyword: K means clustering


A Study on Market Segmentation Based on E-Commerce User Reviews Using Clustering Algorithm (클러스터링 기법을 활용한 이커머스 사용자 리뷰에 따른 시장세분화 연구)

  • Kim, Mingyeong;Huh, Jaeseok;Sa, Aejin;Jun, Ahreum;Lee, Hanbyeol
    • The Journal of Society for e-Business Studies / v.27 no.2 / pp.21-36 / 2022
  • As COVID-19 has rapidly expanded the e-commerce market, customers with diverse consumption patterns have appeared. Because companies can obtain customers' opinions and information from reviews, they increasingly need to manage customer reviews on online platforms. In this study, we analyze customers and carry out market segmentation to classify and define customer types in e-commerce. Specifically, K-means clustering was conducted on customer review data collected from the Wemakeprice online shopping platform, which yielded six clusters. Finally, we define the characteristics of each cluster and propose a customer management plan. This paper can serve as material for identifying customer types, and it can help online platforms reduce customer management costs and increase profit.
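
The K-means step named in this abstract can be illustrated with a minimal sketch of Lloyd's algorithm. It assumes each review has already been converted into a numeric feature vector. The function name `kmeans`, the toy vectors, the choice of k, and the first-k initialization are illustrative assumptions and are not taken from the paper.

```python
def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples."""
    # Assumption: the first k points serve as initial centers to keep the sketch
    # deterministic; practical implementations usually initialize randomly.
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    return centers, labels


# Hypothetical two-feature review vectors (for example rating and review length); not real data.
reviews = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centers, labels = kmeans(reviews, k=2)
```

On this toy input the first three vectors end up in one cluster and the last three in the other. Choosing the number of clusters (six in the paper) is a separate step that this sketch does not cover.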

The preprocessing effect using K-means clustering and merging algorithms in cardiac left ventricle segmentation

  • Cho, Ik-Hwan;Do, Ki-Bum;Oh, Jung-Su;Song, In-Chan;Chang, Kee-Hyun;Jeong, Dong-Seok
    • Proceedings of the KSMRM Conference / 2002.11a / pp.126-126 / 2002
  • Purpose: For quantitative analysis of cardiac diseases, it is necessary to segment the left ventricle (LV) in cardiac MR images. Snake or active contour models have been used to segment the LV boundary. With these models, however, the contour of the LV may not converge to the desired one, because it may fall into a local minimum caused by image artifacts in the inner region of the LV. Therefore, in this paper, we propose a new preprocessing method using K-means clustering and merging algorithms that can improve the performance of the active contour model.


A Classification Algorithm Based on Data Clustering and Data Reduction for Intrusion Detection System over Big Data

  • Wang, Qiuhua;Ouyang, Xiaoqin;Zhan, Jiacheng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.7 / pp.3714-3732 / 2019
  • With the rapid development of networks, Intrusion Detection Systems (IDS) play an increasingly important role in network applications. Many data mining algorithms are used to build IDS. However, with the advent of the big data era, massive amounts of data are generated. When dealing with large-scale data sets, most data mining algorithms suffer from a high computational burden, which makes IDS much less efficient. To build an efficient IDS over big data, we propose a classification algorithm based on data clustering and data reduction. In the training stage, the training data are divided into clusters of similar size by the Mini Batch K-Means algorithm, and the center of each cluster is used as its index. Then, we select representative instances from each cluster to perform data reduction and use the clusters of representative instances to build a K-Nearest Neighbor (KNN) detection model. In the detection stage, we sort the clusters according to the distances between the test sample and the cluster indexes, and obtain the k nearest clusters, in which we find the k nearest neighbors. Experimental results show that searching for neighbors by cluster indexes significantly reduces the computational complexity, and that classification with the reduced data of representative instances not only improves efficiency but also maintains high accuracy.
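
The detection stage described in this abstract can be sketched roughly as follows. The sketch assumes the training stage has already produced clusters of representative instances together with their centers, so it does not implement Mini Batch K-Means or instance selection. The names `indexed_knn_predict` and `n_clusters`, the separate parameters for the number of clusters and the number of neighbors, and the toy data are assumptions for illustration only.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def indexed_knn_predict(sample, clusters, n_clusters=2, k=3):
    """clusters: list of (center, [(vector, label), ...]) pairs."""
    # Rank clusters by the distance between the sample and each cluster index (center).
    nearest = sorted(clusters, key=lambda c: sq_dist(sample, c[0]))[:n_clusters]
    # Search for neighbors only among instances stored in the selected clusters.
    candidates = [inst for _, members in nearest for inst in members]
    neighbors = sorted(candidates, key=lambda inst: sq_dist(sample, inst[0]))[:k]
    # Majority vote over the neighbor labels.
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


# Hypothetical clusters of representative instances; not data from the paper.
clusters = [
    ((0.0, 0.0), [((0.1, 0.0), "normal"), ((0.0, 0.2), "normal")]),
    ((5.0, 5.0), [((5.1, 5.0), "attack"), ((4.9, 5.2), "attack")]),
    ((9.0, 0.0), [((9.0, 0.1), "normal"), ((8.8, 0.0), "attack")]),
]
prediction = indexed_knn_predict((4.8, 4.9), clusters, n_clusters=1, k=2)
```

Because only the instances in the nearest clusters are compared with the sample, the number of distance computations depends on the size of those clusters rather than on the full training set.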

An Optimized Partner Searching System for B2B Marketplace Applying Clustering Techniques (군집화 기법을 이용한 B2B Marketplace상의 최적 파트너 검색 시스템)

  • Kim Shin-Young;Kim Soo-Young
    • Proceedings of the Korean Operations and Management Science Society Conference / 2003.05a / pp.572-579 / 2003
  • With the expansion of e-commerce, the e-marketplace has become one of the most discussed topics in recent years. Limited theoretical work, however, has been done to optimize the practical use of e-marketplace systems. Other potential issues aside, this research focuses on one problem: participants spend too much time, effort, and cost to find their best partner in a B2B marketplace. To solve this problem, this paper proposes a system that provides the user company with an automated and customized brokering service. The proposed system assesses the weights of a user company's priorities and runs a two-stage clustering algorithm that combines a self-organizing map with the K-means clustering technique. Subsequently, the system shows the clustering result and a user guideline. This system makes transactions in a B2B marketplace more efficient by reducing the pool of partners to be searched.


A Study on Korean isolated word recognition using LPC cepstrum and clustering (LPC Cepstrum과 집단화를 이용한 한국어 고립단어 인식에 관한 연구)

  • Kim, Jin-Yeong
    • The Journal of the Acoustical Society of Korea / v.6 no.4 / pp.44-54 / 1987
  • In this paper, the problem of the LP model and its solution by liftering in the cepstrum domain are investigated for speaker-independent isolated-word recognition. In addition, clustering techniques for obtaining the reference templates are discussed. The KMA (K-means iteration with average) method, derived from the UWA method and the K-iteration method, is proposed, and the three methods are compared for clustering. The recognition experiments show a maximum recognition rate of 95% when the raised-sine lifter and the KMA clustering method are applied.


Korean Phoneme Recognition by Combining Self-Organizing Feature Map with K-means clustering algorithm

  • Jeon, Yong-Ku;Lee, Seong-Kwon;Yang, Jin-Woo;Lee, Hyung-Jun;Kim, Soon-Hyob
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.1046-1051 / 1994
  • The SOFM is known to effectively create a topographically organized map of the various features of input signals, so it can be applied effectively to the recognition of Korean phonemes. However, the SOFM algorithm does not guarantee that the network is sufficiently trained. To solve this problem, we propose a learning algorithm that is combined with the conventional K-means clustering algorithm in the fine-tuning stage. To evaluate the proposed algorithm, we performed a speaker-dependent recognition experiment using six phoneme classes. Comparing the performance of Kohonen's algorithm with that of the proposed algorithm, we show that the proposed algorithm outperforms the conventional SOFM algorithm.


Nonlinear Characteristics of Non-Fuzzy Inference Systems Based on HCM Clustering Algorithm (HCM 클러스터링 알고리즘 기반 비퍼지 추론 시스템의 비선형 특성)

  • Park, Keon-Jun;Lee, Dong-Yoon
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.11 / pp.5379-5388 / 2012
  • In fuzzy modeling of nonlinear processes, the fuzzy rules are typically formed by selecting the input variables, the number of space divisions, and the membership functions. Generating fuzzy rules for nonlinear processes has the problem that the number of fuzzy rules increases exponentially. To solve this problem, a complex nonlinear process can be modeled by generating the fuzzy rules through fuzzy division of the input space. Therefore, in this paper, the rules of non-fuzzy inference systems are generated by partitioning the input space in scatter form using the HCM clustering algorithm. The premise parameters of the rules are determined by the membership matrix obtained with the HCM clustering algorithm. The consequence part of each rule is represented as a polynomial function, and the consequence parameters of each rule are identified by the standard least-squares method. Finally, we evaluate the performance and the nonlinear characteristics using data widely used for nonlinear processes. Through this experiment, we show that high-dimensional nonlinear systems can be modeled with a very small number of rules.

A Study on Research Paper Classification Using Keyword Clustering (키워드 군집화를 이용한 연구 논문 분류에 관한 연구)

  • Lee, Yun-Soo;Pheaktra, They;Lee, JongHyuk;Gil, Joon-Min
    • KIPS Transactions on Software and Data Engineering / v.7 no.12 / pp.477-484 / 2018
  • Due to the advancement of computer and information technologies, numerous papers have been published. As new research fields continue to be created, users have considerable trouble finding and categorizing the papers that interest them. To alleviate this difficulty, this paper presents a method for grouping similar papers by clustering. The presented method extracts primary keywords from the abstract of each paper by using TF-IDF. Applying the K-means clustering algorithm to the extracted TF-IDF values, our method clusters papers that have similar contents. To demonstrate the practicality of the proposed method, we use paper data from the FGCS journal as actual data. Based on these data, we derive the number of clusters using the Elbow scheme and show the clustering performance using the Silhouette scheme.
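
The keyword-extraction step in this abstract can be illustrated with a small TF-IDF sketch. TF-IDF has several variants. The one below (term frequency normalized by document length, multiplied by the natural log of N over document frequency) is an assumption, as are the function names `tfidf` and `top_keywords` and the toy abstracts. The paper's exact weighting is not stated in the abstract.

```python
import math
from collections import Counter


def tfidf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()})
    return weights


def top_keywords(weight, n=2):
    # Sort by descending weight, then alphabetically so that ties are deterministic.
    return [t for t, _ in sorted(weight.items(), key=lambda kv: (-kv[1], kv[0]))[:n]]


# Hypothetical tokenized abstracts; not the FGCS data used in the paper.
abstracts = [
    "cloud scheduling for cloud workloads".split(),
    "grid scheduling for workflows".split(),
    "cloud storage for sensors".split(),
]
weights = tfidf(abstracts)
keywords = [top_keywords(w) for w in weights]
```

A term that occurs in every document, such as "for" here, receives a weight of zero, which is why TF-IDF tends to surface distinctive keywords. The resulting weight dictionaries could then be turned into fixed-length vectors and passed to a K-means routine.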

Bootstrap Analysis and Major DNA Markers of BM4311 Microsatellite Locus in Hanwoo Chromosome 6

  • Yeo, Jung-Sou;Kim, Jae-Woo;Shin, Hyo-Sub;Lee, Jea-Young
    • Asian-Australasian Journal of Animal Sciences / v.17 no.8 / pp.1033-1038 / 2004
  • LOD scores related to marbling scores and a permutation test were applied to detect quantitative trait loci (QTL), and we selected BM4311 as a considerable major locus. K-means clustering was applied to mine the major DNA markers of the BM4311 microsatellite locus in Hanwoo chromosome 6, and five traits were divided into three cluster groups. Then, the three cluster groups were classified according to six DNA markers. Finally, a bootstrap test, which calculates confidence intervals by resampling, was adopted to find the major DNA markers. It could be concluded that the major markers of the BM4311 locus in Hanwoo chromosome 6 were the DNA markers of 100 and 95 bp.
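
The resampling idea behind the bootstrap confidence intervals mentioned in this abstract can be sketched as a percentile bootstrap. The statistic (a mean), the function name `bootstrap_ci`, the number of resamples, and the toy scores are assumptions for illustration. The abstract does not state which statistic or bootstrap variant the authors used.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))  # resample with replacement
        for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical trait scores for animals carrying one marker; not data from the paper.
scores = [3.1, 2.8, 3.6, 4.0, 3.3, 2.9, 3.8, 3.5, 3.0, 3.7]
lower, upper = bootstrap_ci(scores)
```

Intervals computed this way for different marker groups can be compared to judge whether one group's trait values differ clearly from another's.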

Fully Automatic Segmentation Method of Pathological Periventricular White Matter Changes Using Morphological Features

  • Cho Ik-Hwan;Song In-Chan;Oh Jung-Su;Jeong Dong-Seok
    • Journal of Biomedical Engineering Research / v.26 no.6 / pp.383-391 / 2005
  • Age-related White Matter Changes (WMC) on Magnetic Resonance Imaging (MRI) are known to appear frequently in multiple sclerosis (MS) and Alzheimer's disease and to be related to cognitive impairment. The characterization of these WMC is very important to the study of psychology and aging. These changes consist of periventricular and subcortical types. However, it is difficult to detect and segment WMC using only intensity-based methods, because their intensity level is similar to that of the gray matter (GM). In this paper, we propose a new method of segmenting periventricular WMC using K-means clustering and morphological features.