• Title/Summary/Keyword: K means clustering

A Study on Data Clustering Method Using Local Probability (국부 확률을 이용한 데이터 분류에 관한 연구)

  • Son, Chang-Ho;Choi, Won-Ho;Lee, Jae-Kook
    • Journal of Institute of Control, Robotics and Systems / v.13 no.1 / pp.46-51 / 2007
  • In this paper, we propose a new data clustering method based on local probability and statistical hypothesis theory. To cluster a test data set, we analyze its local regions using the local probability distribution and select candidate classes for the data using statistics such as the mean, standard deviation, and variance. Statistical hypothesis theory is then applied to the candidate classes to decide the class of each test datum. For evaluation, the proposed method is compared with the conventional fuzzy c-means method, the k-means algorithm, and discriminant analysis. The simulation results show that the proposed method is more accurate than all three.

Design and Implementation of a Body Fat Classification Model using Human Body Size Data

  • Taejun Lee;Hakseong Kim;Hoekyung Jung
    • Journal of information and communication convergence engineering / v.21 no.2 / pp.110-116 / 2023
  • Recently, as machine learning has been applied in the healthcare field, deep learning technology has been used for various tasks, such as electrocardiogram examination and body composition analysis using wearable devices such as smart watches. To utilize deep learning, securing data is the most important step, and it requires human intervention such as data classification. In this study, we propose a model that uses a clustering algorithm, namely K-means clustering, to label body fat according to gender and age, considering body size measurements such as chest circumference and waist circumference, and that classifies body fat into five groups from high risk to low risk using a convolutional neural network (CNN). In model validation, accuracy, precision, and recall of more than 95% were obtained. Thus, the proposed method can support rational decision making in healthcare and obesity analysis.
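The labeling step described in this abstract can be illustrated with a minimal sketch. It assumes plain Lloyd-style K-means on hypothetical (chest, waist) measurements in centimeters and uses two clusters for brevity, whereas the paper uses five groups. Ranking clusters by the waist coordinate of their centers is an illustrative assumption, not the authors' procedure, and the CNN stage is not shown.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's K-means; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers (assumption)
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else centers[j]  # keep the center of an empty cluster
            )
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

def ordinal_labels(centers, labels, feature=1):
    """Hypothetical helper: rank clusters by one coordinate of their center
    (here index 1, the waist) so that cluster ids become ordered groups."""
    order = sorted(range(len(centers)), key=lambda j: centers[j][feature])
    rank = {j: r for r, j in enumerate(order)}
    return [rank[lab] for lab in labels]

# Hypothetical (chest_cm, waist_cm) measurements, not data from the paper.
measurements = [(85.0, 70.0), (86.0, 72.0), (84.0, 71.0),
                (110.0, 105.0), (112.0, 108.0), (111.0, 104.0)]
centers, labels = kmeans(measurements, k=2)
print(ordinal_labels(centers, labels))
```

The ordered group ids could then serve as training targets for a supervised classifier, which is the role the abstract assigns to the CNN.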

Robust k-means Clustering-based High-speed Barcode Decoding Method to Blur and Illumination Variation (블러와 조명 변화에 강인한 k-means 클러스터링 기반 고속 바코드 정보 추출 방법)

  • Kim, Geun-Jun;Cho, Hosang;Kang, Bongsoon
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.1 / pp.58-64 / 2016
  • This paper presents a high-speed barcode decoding method based on k-means clustering that is robust to blur and illumination variation. For fast operation and blur-robust decoding, the proposed method uses adaptive local threshold binarization, which computes threshold values separately for blurred and non-blurred regions. In addition, to prevent decoding failures caused by noise, a decoder based on the k-means clustering algorithm is implemented using area data obtained by summing the pixel widths of lines with the same number of elements. In simulations using samples captured in various worst-case environments, the proposed method achieved an average decoding success rate of 98.47%, the highest among the three programs compared.

Nonlinear Process Modeling Using Hard Partition-based Inference System (Hard 분산 분할 기반 추론 시스템을 이용한 비선형 공정 모델링)

  • Park, Keon-Jun;Kim, Yong-Kab
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.7 no.4 / pp.151-158 / 2014
  • In this paper, we introduce an inference system based on a hard scatter partition method and use it to model a nonlinear process. The hard scatter partition method partitions the input space in scatter form with membership degrees of 0 or 1. The proposed method is implemented with the C-Means clustering algorithm, and, to compensate for the algorithm's sensitivity to initial center values, the initial centers are obtained by binary splitting using the LBG algorithm. The hard-scatter-partitioned input space forms the rules of the rule-based system model. The premise parameters of the rules are determined by the membership matrix produced by the C-Means clustering algorithm. The consequence part of each rule is expressed as a polynomial function, and the coefficient parameters of each rule are determined by the standard least-squares method. Data widely used in nonlinear process modeling are used to model the nonlinear process and evaluate its characteristics.
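The initialization idea in this abstract can be sketched as follows, assuming the common form of LBG binary splitting: start from the global mean, split every center into two slightly perturbed copies, refine with hard C-Means, and repeat. The function names, the additive perturbation `eps`, and the sample data are illustrative assumptions; the rule construction and least-squares fitting from the paper are not shown.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))

def hard_c_means(points, centers, iters=100):
    """Hard C-Means (Lloyd iterations) starting from the given centers."""
    centers = list(centers)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    return centers

def lbg_init(points, n_centers, eps=0.01):
    """LBG-style binary splitting; n_centers is assumed to be a power of two."""
    centers = [tuple(sum(col) / len(points) for col in zip(*points))]
    while len(centers) < n_centers:
        split = []
        for c in centers:
            split.append(tuple(x + eps for x in c))  # perturbed copy 1
            split.append(tuple(x - eps for x in c))  # perturbed copy 2
        centers = hard_c_means(points, split)
    return centers

def membership_matrix(points, centers):
    """Hard membership: entry is 1 for the nearest center and 0 otherwise."""
    return [[1 if j == nearest(p, centers) else 0 for j in range(len(centers))]
            for p in points]

# Hypothetical one-dimensional inputs forming four groups.
data = [(0.0,), (0.2,), (0.1,), (5.0,), (5.1,), (4.9,),
        (10.0,), (10.2,), (9.8,), (15.0,), (15.1,), (14.9,)]
centers = lbg_init(data, 4)
print(sorted(centers))
print(membership_matrix(data, centers)[0])
```

Because every split starts from centers that already summarize the data, the result does not depend on a random draw, which is one way to address the sensitivity to initial centers that the abstract mentions.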

Development of Path Finding System using K-means clustering for Intelligent Wheelchair (K-means clustering을 이용한 지능형 휠체어의 경로 선정 시스템 구현)

  • Kwak, Dongseok;Lee, Jaekook;Ju, Jin Sun;Ko, Eunjeong;Kim, Eun Yi
    • Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.381-382 / 2009
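  • This paper develops automatic obstacle detection and avoidance technology for an intelligent wheelchair to support the safe mobility of elderly and disabled people. To detect and avoid obstacles accurately in diverse environments, a learning-based, vision-based path selection method is proposed. The proposed system consists of a background classifier, an occupancy grid map generator, and a path selector. To detect obstacles robustly during path selection, the input image is converted into an occupancy grid map and compared with representative templates generated by the K-means clustering algorithm to select a traversable direction. To demonstrate the effectiveness of the proposed system, experiments were conducted indoors and outdoors with various types of obstacles. The system achieved an accuracy of 81.7%, showing that it can provide safe mobility to intelligent wheelchair users.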

Clustering-based Monitoring and Fault detection in Hot Strip Roughing Mill (군집기반 열간조압연설비 상태모니터링과 진단)

  • SEO, MYUNG-KYO;YUN, WON YOUNG
    • Journal of Korean Society for Quality Management / v.45 no.1 / pp.25-38 / 2017
  • Purpose: A hot strip rolling mill consists of many mechanical and electrical units. In the condition monitoring and diagnosis phase, various units can fail for unknown reasons. In this study, we propose an effective method for the early detection of units with abnormal status in order to minimize system downtime. Methods: The early warning problem with various units is defined. The K-means and PAM algorithms with Euclidean and Manhattan distances were applied to detect abnormal status. In addition, the performance of the proposed algorithm is investigated through field data analysis. Results: PAM with Manhattan distance (PAM_ManD) showed better results than the K-means algorithm with Euclidean distance (K-means_ED). In addition, multivariate field data analysis showed that the system reliability of a hot strip rolling mill can be increased by detecting abnormal status early. Conclusion: In this paper, a clustering-based monitoring and fault detection algorithm using Manhattan distance is proposed. Experiments were performed to study the benefit of PAM with Manhattan distance over K-means with Euclidean distance.
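The contrast between medoid-based and mean-based clustering in this abstract can be illustrated with a small sketch. It uses a simplified alternating k-medoids with Manhattan distance, not the full PAM swap search, and takes the first k points as initial medoids. The data are hypothetical two-dimensional readings, not the field data from the paper.

```python
def manhattan(p, q):
    """Manhattan (L1) distance between two equal-length tuples."""
    return sum(abs(a - b) for a, b in zip(p, q))

def k_medoids(points, k, iters=100):
    """Simplified alternating k-medoids (assign, then re-pick medoids).
    Initial medoids are the first k points, which is a naive assumption."""
    medoids = list(points[:k])
    clusters = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest medoid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda j: manhattan(p, medoids[j]))
            clusters[j].append(p)
        # Update step: the new medoid is the member with the smallest
        # total distance to the other members of its cluster.
        new_medoids = [
            min(c, key=lambda cand: sum(manhattan(cand, q) for q in c))
            if c else medoids[j]
            for j, c in enumerate(clusters)
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, clusters

def mean_point(cluster):
    """Coordinate-wise mean, i.e. the center K-means would use."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))

# Hypothetical readings; (20, 30) is an outlier in the second group.
readings = [(1, 2), (19, 20), (2, 2), (3, 2), (2, 1), (2, 3),
            (20, 20), (21, 20), (20, 30)]
medoids, clusters = k_medoids(readings, k=2)
print(medoids)                   # [(2, 2), (20, 20)]
print(mean_point(clusters[1]))   # (20.0, 22.5)
```

In this toy data the medoid of the second group stays on an observed reading, while the mean is pulled toward the outlier. This is one commonly cited reason for medoid-based methods behaving differently from K-means; the abstract itself does not state why PAM performed better.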

Automatic Intelligent Asymmetry Detection Using Digital Infrared Imaging with K-Means Clustering

  • Kim, Kwang Baek;Song, Doo Hoen
    • International Journal of Fuzzy Logic and Intelligent Systems / v.15 no.3 / pp.180-185 / 2015
  • Digital infrared thermal imaging is a non-invasive adjunctive diagnostic technique that allows an examiner to visualize and quantify changes in skin surface temperature. The asymmetry of temperature differences between the diseased and the contralateral healthy body parts can be automatically analyzed and has been studied in many areas of medical science. In this paper, we propose a method for intelligent automatic asymmetry detection based on a K-means analysis and a YCbCr color model. The implemented software successfully visualizes an asymmetric distribution of colors with respect to the patients’ health status.

Design and Implementation of the Ensemble-based Classification Model by Using k-means Clustering

  • Song, Sung-Yeol;Khil, A-Ra
    • Journal of the Korea Society of Computer and Information / v.20 no.10 / pp.31-38 / 2015
  • In this paper, we propose an ensemble-based classification model that uses clustering to extract only new data patterns from streaming data and generates new classification models to add to the ensemble, reducing the amount of data labeling while maintaining the accuracy of the existing system. The proposed technique clusters similarly patterned data from the stream and labels each cluster once a certain amount of data has been gathered. It applies the K-NN technique to each classification model unit in order to maintain the accuracy of the existing system while using a small amount of data. Simulation results on benchmarks show that, by using clustering, the proposed technique uses about 3% less data than the existing technique.

Product Recommendation System on VLDB using k-means Clustering and Sequential Pattern Technique (k-means 클러스터링과 순차 패턴 기법을 이용한 VLDB 기반의 상품 추천시스템)

  • Shim, Jang-Sup;Woo, Seon-Mi;Lee, Dong-Ha;Kim, Yong-Sung;Chung, Soon-Key
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.1027-1038 / 2006
  • Recommendation systems based on a very large database (VLDB) face many technical problems. It is therefore necessary to study the structure of the recommendation system and data-mining techniques suitable for large-scale Internet shopping malls. We thus design and implement a product recommendation system using the k-means clustering algorithm and a sequential pattern technique that can be used in a large-scale Internet shopping mall. The system processes user information in batches, defines the various categories in a hierarchical structure, and uses a sequential pattern mining technique for the search engine. For predictive modeling and experiments, we use real data (users' interests and preferences for given categories) extracted from 30 days of log files of a major Internet shopping mall in Korea. For evaluation, we define PRP (Predictive Recommend Precision), PRR (Predictive Recommend Recall), and PF1 (Predictive Factor One-measure). In the experiments, the best recommendation time and the best learning time of our system are O(N), and the values of the evaluation measures are excellent.

Combining Distributed Word Representation and Document Distance for Short Text Document Clustering

  • Kongwudhikunakorn, Supavit;Waiyamai, Kitsana
    • Journal of Information Processing Systems / v.16 no.2 / pp.277-300 / 2020
  • This paper presents a method for clustering short text documents, such as news headlines, social media statuses, or instant messages. Due to the characteristics of these documents, which are usually short and sparse, an appropriate technique is required to discover hidden knowledge. The objective of this paper is to identify the combination of document representation, document distance, and document clustering that yields the best clustering quality. Document representations are expanded by external knowledge sources represented by a Distributed Representation. To cluster documents, a K-means partitioning-based clustering technique is applied, where the similarities of documents are measured by word mover's distance. To validate the effectiveness of the proposed method, experiments were conducted to compare the clustering quality against several leading methods. The proposed method produced clusters of documents that resulted in higher precision, recall, F1-score, and adjusted Rand index for both real-world and standard data sets. Furthermore, manual inspection of the clustering results was conducted to observe the efficacy of the proposed method. The topics of each document cluster are undoubtedly reflected by members in the cluster.