• Title/Summary/Keyword: grouping algorithms

Combined Artificial Bee Colony for Data Clustering (융합 인공벌군집 데이터 클러스터링 방법)

  • Kang, Bum-Su;Kim, Sung-Soo
    • Journal of Korean Society of Industrial and Systems Engineering / v.40 no.4 / pp.203-210 / 2017
  • Data clustering is one of the most difficult and challenging problems and can be formally considered a particular kind of NP-hard grouping problem. The K-means algorithm is one of the most popular and widely used clustering methods because it is easy to implement and very efficient. However, it is prone to becoming trapped in local optima, and for large data sets its solutions vary widely with the initial values. An efficient computational intelligence method is therefore needed to find the global optimal solution to the data clustering problem within limited computational time. The objective of this paper is to propose a combined artificial bee colony (CABC) method that uses K-means for initialization and finalization to find optimal solutions to the data clustering optimization problem. The artificial bee colony (ABC) is an algorithm motivated by the intelligent behavior exhibited by honeybees when searching for food. The performance of ABC is better than or similar to that of other population-based algorithms, with the added advantage of employing fewer control parameters. The proposed CABC method provides near-optimal solutions within reasonable time by balancing converged and diversified searches. Experiments and analysis on clustering problems show that CABC is competitive with previous partitioning approaches with respect to solution quality. We validate the performance of CABC on the Iris, Wine, Glass, Vowel, and Cloud datasets from the UCI machine learning repository, comparing against previous studies. In our simulations, the proposed KABCK (K-means+ABC+K-means) performs better than ABCK (ABC+K-means), KABC (K-means+ABC), ABC, and K-means.
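The sensitivity of K-means to its initial centroids, which motivates the paper above, can be illustrated with a minimal sketch. This is plain Lloyd's K-means only, not the authors' CABC or KABCK method; the function names (`sq_dist`, `mean`, `kmeans`) and the four-point data set are hypothetical and chosen purely for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(cluster):
    """Coordinate-wise mean of a non-empty list of tuples."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def kmeans(points, init_centroids, max_iters=100):
    """Plain Lloyd's K-means; returns (centroids, sum of squared errors)."""
    centroids = list(init_centroids)
    k = len(centroids)
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break  # converged, possibly to a local optimum
        centroids = updated
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


# Hypothetical data: two tight groups, far apart on the x axis.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
_, sse_bad = kmeans(points, [(0.0, 0.0), (0.0, 1.0)])    # both initial centroids in the left group
_, sse_good = kmeans(points, [(0.0, 0.0), (10.0, 0.0)])  # one initial centroid per group
print(sse_bad, sse_good)  # prints: 100.0 1.0
```

With both initial centroids in the same group, the algorithm converges to a poor local optimum, which is the kind of behavior a global search such as ABC is meant to counter.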

Design of knowledge search algorithm for PHR based personalized health information system (PHR 기반 개인 맞춤형 건강정보 탐사 알고리즘 설계)

  • SHIN, Moon-Sun
    • Journal of Digital Convergence / v.15 no.4 / pp.191-198 / 2017
  • A PHR-based Personal Health Care Service Platform needs to support intelligent, customized health information services for user convenience. In this paper, we specify an ontology-based health data model for the Personal Health Care Service Platform. We also design a knowledge search algorithm that finds similar health records by applying machine learning and data mining techniques. The axis-based mining algorithm that we propose operates on axis attributes to improve the relevance of knowledge exploration and to shorten search time by reducing the size of the candidate item set. The K-Nearest Neighbor algorithm is used to group users according to the similarity of their user profiles. These algorithms improve the efficiency of customized information exploration according to the user's disease and health condition. The proposed algorithm can be applied to the inference process in the Personal Health Care Service Platform, making it possible to recommend customized health information to the user. It can help people manage smart health care in an aging society.

Group Node Contention Algorithm for Avoiding Continuous Collisions in LR-WPAN (무선 저속 PAN에서 연속된 충돌 회피를 위한 그룹 노드 경쟁 알고리즘)

  • Lee, Ju-Hyun;Yoo, Sang-Jo
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.12B / pp.1066-1074 / 2008
  • In this paper, we propose an efficient algorithm for LR-WPAN that uses a pulse signal based on group node contention. IEEE 802.15.4 targets low speed, low cost, and low power consumption. As applications of LR-WPAN have expanded, the probability of collision has also grown, and most collisions occur because of the hidden node problem. Moreover, if collisions occur continuously because of hidden nodes, network performance can degrade. Although several papers address hidden node collisions, their algorithms waste channel resources when continuous collisions occur frequently. We assume that the PAN has already been divided into groups; using a pulse signal, the coordinator allocates the channel and the order, and the nodes in the allocated group then compete with each other. Hence, the number of contending nodes is reduced significantly, channel wastage caused by collisions decreases, and the data transmission rate improves. Finally, this algorithm can protect the network from disruption caused by frequent collisions. Simulation shows that the algorithm improves performance.

A 2D FLIR Image-based 3D Target Recognition using Degree of Reliability of Contour (윤곽선의 신뢰도를 고려한 2차원 적외선 영상 기반의 3차원 목표물 인식 기법)

  • 이훈철;이청우;배성준;이광연;김성대
    • The Journal of Korean Institute of Communications and Information Sciences / v.24 no.12B / pp.2359-2368 / 1999
  • In this paper, we propose a 2D FLIR image-based 3D target recognition system that performs group-to-ground vehicle recognition using the target contour and its degree of reliability extracted from a FLIR image. First, we extract the target from the background in the FLIR image. Then we define the contour points of the extracted target that have high edge gradient magnitude and brightness as reliable contour points, and form a reliable contour by grouping all reliable contour points. After that, we extract the corresponding reliable contours from the model contour image and compare scene and model features, which are calculated by the DST (discrete sine transform) of the reliable contours. Experiments show that the proposed algorithm works well and that, even when target extraction is imperfect, it performs better than conventional 2D contour-based matching algorithms.

A study on classification of textile design and extraction of regions of interest (텍스타일 디자인 분류 및 관심 영역 도출에 대한 연구)

  • Chae, Seung Wan;Lee, Woo Chang;Lee, Byoung Woo;Lee, Choong Kwon
    • Smart Media Journal / v.10 no.2 / pp.70-75 / 2021
  • Grouping and classifying similar designs increases efficiency in terms of management and provides convenience in terms of use. Using artificial intelligence algorithms, this study attempted to classify textile designs into four categories: dots, flower patterns, stripes, and geometry. In particular, we explored whether it is possible to find and explain the regions of interest underlying classification from the perspective of artificial intelligence. We randomly split a total of 4,536 designs at a ratio of 8:2, giving 3,629 for training and 907 for testing. The models used for classification were VGG-16 and ResNet-34, both of which showed excellent classification performance, with precision on flower pattern designs of 0.79% and 0.89% and recall of 0.95% and 0.38%, respectively. Analysis using the Local Interpretable Model-agnostic Explanations (LIME) technique showed that, for geometry and flower-patterned designs, the regions of interest on which classification was based were shapes and petals, respectively.

Superpixel Segmentation Scheme Using Image Complexity (영상의 복잡도를 고려한 슈퍼픽셀 분할 방법)

  • Park, Sanghyun
    • The Journal of Korean Institute of Information Technology / v.16 no.12 / pp.85-92 / 2018
  • Superpixels are used to reduce computational complexity when applying complicated image processing algorithms. Superpixel segmentation groups pixels with similar characteristics into one group. Because superpixels are used as a preprocessing step in image processing, they should be generated quickly and should preserve the edge components of the image well. In this paper, we propose a method that generates superpixels with a small amount of computation while preserving edge components well. In the proposed method, superpixels of an image are generated using the existing k-means method, and similar superpixels among them are merged to form the final superpixels. When merging, similarity is calculated only for superpixels, so the amount of computation remains small. Experimental results show that the superpixel images produced by the proposed method preserve the edge information of the original image better than those produced by the existing method.
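The merging step described above can be sketched roughly as follows. The abstract does not state the similarity measure or which superpixel pairs are compared, so this sketch assumes mean-intensity difference between adjacent superpixels and a fixed threshold; the function name `merge_superpixels` and its inputs are hypothetical.

```python
def merge_superpixels(means, adjacent_pairs, threshold):
    """Merge adjacent superpixels whose mean intensities differ by <= threshold.

    `means` maps a superpixel id to its mean intensity.
    `adjacent_pairs` lists (id, id) pairs of neighboring superpixels.
    Returns a dict mapping each original id to its merged label.
    Note: original means are compared; they are not recomputed after a merge.
    """
    parent = {s: s for s in means}  # union-find forest

    def find(s):
        # Follow parent links to the root, halving the path on the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in adjacent_pairs:
        if abs(means[a] - means[b]) <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                # Keep the smaller id as the label of the merged region.
                parent[max(root_a, root_b)] = min(root_a, root_b)

    return {s: find(s) for s in means}


# Hypothetical superpixels: two dark neighbors and two bright neighbors.
means = {0: 10, 1: 12, 2: 200, 3: 205}
pairs = [(0, 1), (1, 2), (2, 3)]
print(merge_superpixels(means, pairs, threshold=10))
# prints: {0: 0, 1: 0, 2: 2, 3: 2}
```

Because the comparison runs over superpixels rather than individual pixels, the cost scales with the number of superpixel pairs, which is consistent with the low-computation claim in the abstract.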

Hierarchical grouping recommendation system based on the attributes of contents: a case study of 'The Movie Dataset' (콘텐츠 속성에 따른 계층적 그룹화 추천시스템: 'The Movie Dataset' 분석사례연구)

  • Kim, Yoon Kyoung;Yeo, In-Kwon
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.833-842 / 2020
  • Global platforms such as Netflix, Amazon, and YouTube have developed precise recommendation systems based on varied information from large sets of customers, and many of the items they recommend lead to actual purchases. In this paper, cluster analysis was conducted according to the attributes of the content, on the expectation that user preferences differ according to the attributes of the recommended content. Gower distance was used so that the analysis applies regardless of variable type. Using data from the movie rating site 'The Movie Dataset', users were grouped hierarchically and movies were recommended based on genre, director, and actor variables. To evaluate the proposed recommendation systems, the user group was divided into a training set and a test set and precision was examined. The results showed that the proposed algorithms have far higher precision than UBCF (user-based collaborative filtering).
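Gower distance, used above to handle mixed variable types, can be sketched in its simplest unweighted form: numeric variables contribute a range-normalized absolute difference, categorical variables contribute 0 for a match and 1 for a mismatch, and the result is the average over variables. The function name `gower_distance`, the `ranges` convention, and the example records are assumptions for illustration; the paper's actual variables and any weighting are not given in the abstract.

```python
def gower_distance(x, y, ranges):
    """Unweighted Gower distance between two records with no missing values.

    `ranges[i]` is the observed range (max - min) of variable i if it is
    numeric, or None if variable i is categorical.
    """
    total = 0.0
    for xi, yi, r in zip(x, y, ranges):
        if r is None:
            # Categorical variable: simple mismatch indicator.
            total += 0.0 if xi == yi else 1.0
        elif r > 0:
            # Numeric variable: absolute difference scaled into [0, 1].
            total += abs(xi - yi) / r
        # A numeric variable with zero range contributes 0.
    return total / len(ranges)


# Hypothetical records: (genre, runtime in minutes, director).
movie_1 = ("drama", 120, "director_a")
movie_2 = ("drama", 90, "director_b")
ranges = (None, 60, None)  # runtime assumed to span 60 minutes in the data
print(gower_distance(movie_1, movie_2, ranges))  # prints: 0.5
```

The resulting pairwise distances can then be passed to any hierarchical clustering routine that accepts a precomputed distance matrix.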

Mean-shortfall portfolio optimization via sorted L-one penalized estimation (슬로프 방식을 이용한 평균-숏폴 포트폴리오 최적화)

  • Haein Cho;Seyoung Park
    • The Korean Journal of Applied Statistics / v.37 no.3 / pp.265-282 / 2024
  • Research in the area of financial portfolio optimization, with the dual goals of increasing expected returns and reducing financial risk, has actively explored various risk measurement indicators. At the same time, the incorporation of various penalty terms to construct efficient portfolios with limited assets has been investigated. In this study, we present a novel portfolio optimization formulation that combines the mean-shortfall portfolio and the SLOPE penalty term. Specifically, we formulate this optimization problem, which differs from a linear program, by introducing new variables and using the alternating direction method of multipliers (ADMM) algorithm. Through simulations, we validate the automatic grouping property of the SLOPE penalty term within the proposed mean-shortfall portfolio. Furthermore, using the model introduced in this paper, we propose and evaluate four different types of portfolio compositions relevant to real-world investment scenarios through empirical data analysis.
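The SLOPE (sorted L-one) penalty named in the title pairs the largest coefficient magnitude with the largest penalty weight, the second largest with the second largest weight, and so on. A minimal sketch of evaluating that penalty is given here; it does not implement the mean-shortfall objective or the ADMM solver from the paper, and the function name `slope_penalty` and the example weights are hypothetical.

```python
def slope_penalty(weights, lambdas):
    """Sorted L-one penalty: sum_i lambdas[i] * |w|_(i).

    `lambdas` must be non-negative and non-increasing; |w|_(i) denotes the
    i-th largest absolute value among `weights`.
    """
    if len(weights) != len(lambdas):
        raise ValueError("weights and lambdas must have the same length")
    if any(lam < 0 for lam in lambdas):
        raise ValueError("lambdas must be non-negative")
    if any(a < b for a, b in zip(lambdas, lambdas[1:])):
        raise ValueError("lambdas must be non-increasing")
    magnitudes = sorted((abs(w) for w in weights), reverse=True)
    return sum(lam * m for lam, m in zip(lambdas, magnitudes))


# Hypothetical portfolio weights and penalty sequence.
print(slope_penalty([1.0, -3.0, 2.0], [3.0, 2.0, 1.0]))  # prints: 14.0
# With a constant sequence the penalty reduces to a scaled L1 norm.
print(slope_penalty([1.0, -3.0, 2.0], [1.0, 1.0, 1.0]))  # prints: 6.0
```

Because the penalty depends on the ordering of the magnitudes, coefficients of similar size tend to be pulled to a common value, which is the grouping property the abstract refers to.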

Improved SIM Algorithm for Contents-based Image Retrieval (내용 기반 이미지 검색을 위한 개선된 SIM 방법)

  • Kim, Kwang-Baek
    • Journal of Intelligence and Information Systems / v.15 no.2 / pp.49-59 / 2009
  • Content-based image retrieval methods are in general more objective and effective than text-based image retrieval algorithms, since they use color and texture in search and avoid the need to annotate all images. SIM (Self-organizing Image browsing Map) is a content-based image retrieval algorithm that uses only the browsable mapping results obtained by a SOM (Self-Organizing Map). However, the SOM may select the wrong BMU (best matching unit) in the learning phase if there are similar nodes whose color information is distorted by light intensity or by object movement in the image. Such images may be mapped to other grouping nodes, which can lower the search rate. In this paper, we propose an improved SIM that uses the HSV color model with color quantization to extract image features. To avoid the learning error described above, our SOM consists of two layers. In the learning phase, SOM layer 1 takes the color feature vectors as input. After SOM layer 1 has been trained, its connection weights become the input to SOM layer 2, which is then trained in turn. With this multi-layered SOM learning, we can avoid mapping errors among similar nodes with different color information. In search, we feed the query image vector into SOM layer 2 and select the nodes of SOM layer 1 that connect to the chosen BMU of SOM layer 2. In experiments, we verified that the proposed SIM outperforms the original SIM and avoids mapping errors effectively.

Performance Comparison of Clustering using Discretization Algorithm (이산화 알고리즘을 이용한 계층적 클러스터링의 실험적 성능 평가)

  • Won, Jae Kang;Lee, Jeong Chan;Jung, Yong Gyu;Lee, Young Ho
    • Journal of Service Research and Studies / v.3 no.2 / pp.53-60 / 2013
  • Various data mining techniques have been developed for obtaining information from large data sets. In recent years, pattern recognition and machine learning have been among the most actively pursued areas, and most existing learning algorithms build rules or decision models from categorical attributes. Real-world data, however, often consist of numeric attributes, or mix numeric attributes with ordinary categorical ones. In such cases, a preprocessing step is required to convert the attributes into a type appropriate for learning. In this paper, the domains of numeric attributes are divided into several segments using discretization techniques. Clustering is described together with other data mining techniques. Clustering splits a large database into smaller groups of records with similar characteristics; among a finite set of patterns in the pattern space, patterns that are close to each other form a cluster. Patterns are extracted from the given data without specifying particular categories in advance. We describe clustering as a technique that classifies data by grouping similar data.
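Dividing the domain of a numeric attribute into segments, as described above, can be illustrated with equal-width binning. The abstract does not name the discretization method used, so equal-width binning is an assumption here, and the function name `equal_width_bins` and the sample values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0 .. n_bins - 1.

    The attribute's domain [min, max] is split into n_bins segments of
    equal width; the maximum value is placed in the last bin.
    """
    low, high = min(values), max(values)
    if high == low:
        return [0] * len(values)  # constant attribute: a single segment
    width = (high - low) / n_bins
    return [min(int((v - low) / width), n_bins - 1) for v in values]


# Hypothetical numeric attribute split into four segments.
print(equal_width_bins([0, 5, 10, 15, 20], 4))  # prints: [0, 1, 2, 3, 3]
```

The resulting bin indices can be treated as categorical values by learning algorithms that expect categorical attributes.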
