• Title/Summary/Keyword: K means clustering

Flower Recognition System Using OpenCV on Android Platform (OpenCV를 이용한 안드로이드 플랫폼 기반 꽃 인식 시스템)

  • Kim, Kangchul;Yu, Cao
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.1 / pp.123-129 / 2017
  • New mobile phones with high-tech cameras and large memory have recently been launched, and people upload pictures of beautiful scenes or unknown flowers to social networking services (SNS). This paper develops a flower recognition system that can retrieve information on flowers even in places where mobile communication is not available. It consists of a registration part for reference flowers and a recognition part based on OpenCV for the Android platform. A new color classification method using RGB color channels and K-means clustering is proposed to reduce the recognition processing time. ORB is used for feature extraction and the Brute-Force Hamming algorithm for matching. We use 12 kinds of flowers in four color groups; 60 images are used to build the reference DB and 60 images for testing. Simulation results show that the success rate is 83.3% and the average recognition time is 2.58 s on a Huawei ALEUL00, and that the proposed system is suitable for a mobile phone without a network.
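
The color pre-classification idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain Lloyd's K-means over RGB pixel tuples with a deterministic initialization, and the four color group names and reference RGB values in `COLOR_GROUPS` are hypothetical, since the abstract does not name them. The dominant cluster centroid is mapped to the nearest reference color, so that feature matching would only need to run against reference flowers in that group.

```python
from math import dist

# Hypothetical color groups; the abstract only says there are four.
COLOR_GROUPS = {
    "red": (255, 0, 0),
    "yellow": (255, 255, 0),
    "white": (255, 255, 255),
    "purple": (128, 0, 128),
}

def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means; centroids start at the first k distinct points."""
    centroids = []
    for p in points:
        if p not in centroids:
            centroids.append(p)
        if len(centroids) == k:
            break
    clusters = []
    for _ in range(iters):
        # Assignment step: attach each pixel to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            tuple(sum(ch) / len(cl) for ch in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters

def color_group(pixels, k=2):
    """Return the name of the color group nearest to the dominant cluster."""
    centroids, clusters = kmeans(pixels, k)
    biggest = max(range(len(clusters)), key=lambda i: len(clusters[i]))
    dominant = centroids[biggest]
    return min(COLOR_GROUPS, key=lambda name: dist(dominant, COLOR_GROUPS[name]))

# Toy image: four reddish petal pixels and two greenish leaf pixels.
pixels = [(250, 10, 10), (20, 120, 30), (240, 20, 15),
          (245, 5, 20), (25, 130, 35), (255, 15, 5)]
print(color_group(pixels))
```

Restricting the later ORB and Hamming matching to one color group is what would reduce the processing time; the sketch covers only the grouping step.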

Processing large-scale data with Apache Spark (Apache Spark를 활용한 대용량 데이터의 처리)

  • Ko, Seyoon;Won, Joong-Ho
    • The Korean Journal of Applied Statistics / v.29 no.6 / pp.1077-1094 / 2016
  • Apache Spark is a fast and general-purpose cluster computing package. It provides a new abstraction named the resilient distributed dataset, which supports fault tolerance while keeping data in memory. This abstraction yields a significant speedup compared to the legacy large-scale data framework MapReduce. In particular, the Spark framework is suitable for iterative machine learning applications such as logistic regression and K-means clustering, and for interactive data querying. Thanks to its versatility, Spark also supports high-level libraries for various applications such as machine learning, streaming data processing, database querying, and graph data mining. In this work, we introduce the concept and programming model of Spark and show some implementations of simple statistical computing applications. We also review the machine learning package MLlib and the R language interface SparkR.

The Development of Evaluation Criteria Model for Discriminating Specialized General Hospital (종합전문요양기관 인정기준 모형 개발)

  • Chun Ki Hong;Kang Hye-Young;Kang Dae Ryong;Nam Chung Mo;Lee Gye-Cheol
    • Health Policy and Management / v.15 no.4 / pp.46-64 / 2005
  • This study was conducted to verify the current criteria and classification system used to determine specialized general hospital status. We propose a new classification system which is simpler and more convenient than the current one. In the new classification system, the clinical procedure was chosen as the unit of analysis in order to reflect all the resource consumption and the complexity and level of medical technology in determining specialized general hospitals. We developed a statistical model and applied it to 117 general hospitals which claim their national insurance through electronic data interchange (EDI). An analysis based on 984 clinical procedures and the medical facilities' characteristic variables discriminated the present specialized general hospitals without misclassification. This means that the designation of specialized general hospitals can be determined in a new way without using the current complicated criteria. This study discriminated specialized general hospitals with the newly proposed model based on the clinical procedures provided by each hospital. To cluster medical facilities of the same type using the 984 clinical procedures, we performed multidimensional scaling analysis and divided the 117 hospitals into 4 groups along two axes: the variety of procedures and the proportion of high-technology procedures. One of the 4 groups was considered to be the specialized general hospitals. In the discriminant analysis, we extracted the proportions of 16 clinical procedures which affect the discrimination of specialized general hospitals in the statistical system, and we identified discriminant functions which include these variables. As a result, we identified 2 discriminant functions: one is for the current discrimination system and the other two are for the new discrimination system of specialized general hospitals.

Effects of Cu and Ag Addition on Nanocluster Formation Behavior in Al-Mg-Si Alloys

  • Kim, Jae-Hwang;Tezuka, Hiroyasu;Kobayashi, Equo;Sato, Tatsuo
    • Korean Journal of Materials Research / v.22 no.7 / pp.329-334 / 2012
  • Two types of nanoclusters, termed Cluster (1) and Cluster (2) here, both play an important role in the age-hardening behavior of Al-Mg-Si alloys. Small additions of Cu and Ag affect the formation of nanoclusters. Two exothermic peaks were clearly detected in differential scanning calorimetry (DSC) curves by means of peak separation with the Gaussian method in the base, Cu-added, Ag-added and Cu-Ag-added Al-Mg-Si alloys. The formation of nanoclusters in the initial stage of natural aging was suppressed in the Ag-added and Cu-Ag-added alloys, while the formation of nanoclusters was enhanced at natural aging times longer than 259.2 ks (3 days) with the addition of Cu and Ag. The formation of nanoclusters while aging at $100^{\circ}C$ was accelerated in the Cu-added, Ag-added and Cu-Ag-added alloys due to the attractive interaction between the Cu and Ag atoms and the Mg atoms. The influence of additions of Cu and Ag on the clustering behavior during low-temperature aging was well characterized based on the interaction energies among solute atoms and vacancies derived from first-principles calculations using the full-potential Korringa-Kohn-Rostoker (FPKKR) Green function method. The effects of small Cu and Ag additions on the formation of nanoclusters were also discussed based on the age-hardening phenomena.

A Study on Heavy Rainfall Guidance Realized with the Aid of Neuro-Fuzzy and SVR Algorithm Using AWS Data (AWS자료 기반 SVR과 뉴로-퍼지 알고리즘 구현 호우주의보 가이던스 연구)

  • Kim, Hyun-Myung;Oh, Sung-Kwun;Kim, Yong-Hyuk;Lee, Yong-Hee
    • The Transactions of The Korean Institute of Electrical Engineers / v.63 no.4 / pp.526-533 / 2014
  • In this study, we introduce a design methodology for developing guidance for issuing heavy rainfall warnings using both RBFNNs (radial basis function neural networks) and an SVR (support vector regression) model, and then carry out a comparative study of the two pattern classifiers. Each classifier is designed as an architecture realized with the aid of optimization and pre-processing algorithms. Because the predictive performance of the existing heavy rainfall forecast system is commonly affected by the diverse processing techniques applied to meteorological data, an under-sampling method is used to pre-process the input data; in addition, data discretization and feature extraction are used for SVR, and FCM clustering and PSO are used for RBFNNs. The observed AWS (automatic weather station) data supplied by the KMA (Korea Meteorological Administration) are used for training and testing the proposed classifiers. The proposed classifiers provide the information needed to issue a heavy rain warning 1 to 3 hours in advance, using the selected meteorological data and the precipitation accumulated over 1 to 12 hours from the AWS data. For performance evaluation of each classifier, the ETS (Equitable Threat Score) is used as the standard verification measure of predictive ability. The comparative study of the two classifiers shows that the neuro-fuzzy method is effective in improving performance and yields stable predictive results for guidance on issuing heavy rainfall warnings.

Preference Differences in Interior Images of Restaurants according to Lifestyles (라이프스타일 유형에 따른 레스토랑 실내이미지 선호도 차이에 관한 연구)

  • Kim, Tae-Hee;Park, Young-Seok
    • Journal of the Korean Home Economics Association / v.43 no.10 s.212 / pp.69-79 / 2005
  • The purpose of this study was to determine restaurant patrons' preference differences in the interior design style of restaurants according to their lifestyles. Written questionnaires were handed out to 500 adults in Seoul and its surroundings, selected by convenience sampling. The questionnaire covered respondents' general characteristics, lifestyles, and preferences for 10 types of interior design style. A total of 415 questionnaires were usable for data analysis, resulting in a response rate of 83%. To analyze the collected data, frequency, factor, and reliability analyses, K-means quick clustering, and one-way ANOVA were conducted using SPSS 10.0. The results showed that there were preference differences in the 10 types of interior design style of restaurants according to lifestyle types, which were categorized into 4 groups. The conservative and self-convinced group showed the lowest preference scores across the 10 types of interior design style, which are the Romantic, Ethnic, Classic, High-Tech, Elegant, Country, Modern, Minimal, Natural, and Casual styles. The quality-life-pursuing group and the extroverted individuality group showed high preference scores in most of the styles, especially in the Classic and Elegant styles. The realistic self-centered group showed the highest preference score for the Casual style among the 4 groups. These findings indicate that restaurants should take their patrons' lifestyles into account as a means of market segmentation, and respond to their tastes and preferences when establishing a suitable servicescape.

Short-term Forecasting of Power Demand based on AREA (AREA 활용 전력수요 단기 예측)

  • Kwon, S.H.;Oh, H.S.
    • Journal of Korean Society of Industrial and Systems Engineering / v.39 no.1 / pp.25-30 / 2016
  • It is critical for our industry and national economy to forecast the maximum daily and monthly demand for power with as little error as possible. In general, long-term forecasting of power demand has been studied both from the consumer's perspective and with econometric models in the form of a generalized linear model with predictors. Time series techniques with no predictors are used for short-term forecasting, because predictors must themselves be predicted before the response variable is forecast, and estimation error during this process is inevitable. In previous research, various approaches have been applied to short-term power supply forecasting, including the seasonal exponential smoothing method, SARMA (Seasonal Auto Regressive Moving Average) with consideration of weekly patterns, the Neuro-Fuzzy model, the SVR (Support Vector Regression) model with predictors explored through machine learning, and the K-means clustering technique. In this paper, SARMA and intervention models are fitted to forecast the daily, weekly, and monthly maximum power load using empirical data from 2011 through 2013. $ARMA(2,\;1,\;2)(1,\;1,\;1)_7$ and $ARMA(0,\;1,\;1)(1,\;1,\;0)_{12}$ are fitted to the daily and monthly power demand, respectively, but the weekly power demand is not fitted by AREA because of a unit root series. In our fitted intervention model, the factors of long holidays, summer, and winter are significant in the form of indicator functions. The SARMA model with a MAPE (Mean Absolute Percentage Error) of 2.45% and the intervention model with a MAPE of 2.44% are more efficient than the present seasonal exponential smoothing with a MAPE of about 4%. A dynamic regression model with humidity, temperature, and seasonal dummies as predictors was also applied to forecast the daily power demand, but it led to a high MAPE of 3.5%, and it additionally carries the estimation error of the predictors.
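
The models in this abstract are compared by MAPE. As a brief sketch, the function below computes it in its usual form, the mean of |actual - forecast| / |actual| expressed as a percentage. The demand values in the usage line are invented for illustration and are unrelated to the paper's data.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Assumes both sequences have the same length and that no actual
    value is zero (the ratio is undefined there).
    """
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)

# Invented daily peak loads and forecasts, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(round(mape(actual, forecast), 2))
```

Because each error is scaled by the actual value, MAPE lets daily and monthly models be compared on the same footing even though their load levels differ.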

Impact of Difference in Korean Wave Awareness among Chinese Women on Quality Perception and Purchasing Behavior of Korean Cosmetic Products (중국여성의 한류 인지도 차이가 한국 화장품에 대한 품질인식과 구매행동에 미치는 영향)

  • Lee, Jeong-Suk
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.10 / pp.5097-5104 / 2013
  • To derive implications for the marketing strategy of Korean cosmetic products in China, an analysis was conducted on the difference in quality perception and purchase behavior between two groups of Chinese women classified by their awareness of the Korean Wave. Analytical methods including K-means clustering, the independent samples t-test, and factor analysis were applied to the survey results of Chinese women residing in Guangzhou city. The positive impact of the Korean Wave on quality perception and brand image is much stronger for the higher-awareness group than for the lower-awareness group, which leads to higher product satisfaction and willingness to recommend purchases. Thus, marketing strategies need to be adjusted based on the difference in customers' awareness of the Korean Wave. However, since low price is the primary inducement to purchase for both groups, increased efforts to enhance brand image and product quality as premium products are strongly required, together with the utilization of the Korean Wave.

The Effect of the Number of Phoneme Clusters on Speech Recognition (음성 인식에서 음소 클러스터 수의 효과)

  • Lee, Chang-Young
    • The Journal of the Korea institute of electronic communication sciences / v.9 no.11 / pp.1221-1226 / 2014
  • In an effort to improve the efficiency of speech recognition, we investigate the effect of the number of phoneme clusters. For this purpose, codebooks with varied numbers of phoneme clusters are prepared by a modified k-means clustering algorithm. The subsequent processing uses fuzzy vector quantization (FVQ) and a hidden Markov model (HMM) for the speech recognition test. The result shows that there are two distinct regimes. For a large number of phoneme clusters, the recognition performance is roughly independent of that number. For a small number of phoneme clusters, however, the recognition error rate increases nonlinearly as the number decreases. From numerical calculation, it is found that this nonlinear regime might be modeled by a power law function. The result also shows that about 166 phoneme clusters would be the optimal number for recognition of 300 isolated words. This amounts to roughly 3 variations per phoneme.
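
One common way to check whether a relationship such as error rate versus number of clusters follows a power law is a straight-line fit in log-log space. The sketch below assumes the form error = a * n^b and ordinary least squares on the logarithms; it is not the numerical procedure used in the paper, and the sample points are synthetic.

```python
from math import exp, log

def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y); returns (a, b)."""
    log_x = [log(x) for x in xs]
    log_y = [log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the regression line in log-log space is the exponent b.
    b = (sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
         / sum((u - mean_x) ** 2 for u in log_x))
    # Intercept is log(a).
    a = exp(mean_y - b * mean_x)
    return a, b

# Synthetic points generated from y = 2 * x**-1.5, for illustration only.
cluster_counts = [10, 20, 40, 80]
error_rates = [2.0 * n ** -1.5 for n in cluster_counts]
a, b = fit_power_law(cluster_counts, error_rates)
print(round(a, 6), round(b, 6))
```

The fit applies only within the nonlinear regime; in the regime where performance is roughly independent of the cluster count, a power law would not be expected to hold.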

A Design on Face Recognition System Based on pRBFNNs by Obtaining Real Time Image (실시간 이미지 획득을 통한 pRBFNNs 기반 얼굴인식 시스템 설계)

  • Oh, Sung-Kwun;Seok, Jin-Wook;Kim, Ki-Sang;Kim, Hyun-Ki
    • Journal of Institute of Control, Robotics and Systems / v.16 no.12 / pp.1150-1158 / 2010
  • In this study, Polynomial-based Radial Basis Function Neural Networks (pRBFNNs) are proposed as the recognition part of an overall face recognition system that consists of two parts, a preprocessing part and a recognition part. The design methodology and procedure of the proposed pRBFNNs are presented to obtain a solution to the high-dimensional pattern recognition problem. First, in the preprocessing part, we use a CCD camera to obtain a picture frame in real time. By using histogram equalization, we can partially enhance images distorted by natural as well as artificial illumination. We use the AdaBoost algorithm proposed by Viola and Jones to separate the facial image area from the non-facial image area. The PCA method is used as the feature extraction algorithm to reduce the dimension of the facial image area, which is formed by high-dimensional information. Second, we use pRBFNNs to identify the ID by recognizing the unique pattern of each person. The proposed pRBFNNs architecture consists of three functional modules, the condition part, the conclusion part, and the inference part, as fuzzy rules formed in 'If-then' format. In the condition part of the fuzzy rules, the input space is partitioned with Fuzzy C-Means clustering. In the conclusion part of the rules, the connection weights of pRBFNNs are represented as three kinds of polynomials: constant, linear, and quadratic. The coefficients of the connection weights are identified with back-propagation using the gradient descent method. The output of the pRBFNNs model is obtained by the fuzzy inference method in the inference part of the fuzzy rules. The essential design parameters of the networks (including the learning rate, momentum coefficient, and fuzzification coefficient) are optimized by means of Particle Swarm Optimization. The proposed pRBFNNs are applied to a real-time face recognition system and then evaluated from the viewpoint of output performance and recognition rate.
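
The condition part described above partitions the input space with Fuzzy C-Means clustering. As a minimal sketch of that one step, the function below computes the standard FCM membership degrees of a single point to a set of fixed cluster centers, with `m` playing the role of the fuzzification coefficient. The centers and the point are hypothetical, and the center-update step of the full algorithm is omitted.

```python
from math import dist

def fcm_memberships(point, centers, m=2.0):
    """Standard FCM memberships: u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)).

    Requires m > 1. The returned degrees sum to 1 across the centers.
    """
    distances = [dist(point, c) for c in centers]
    # If the point coincides with one or more centers, share full
    # membership among them to avoid dividing by zero.
    zero_hits = [d == 0.0 for d in distances]
    if any(zero_hits):
        share = 1.0 / sum(zero_hits)
        return [share if hit else 0.0 for hit in zero_hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
            for d_i in distances]

# Hypothetical 2-D example: the point is twice as close to the first center.
memberships = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)], m=2.0)
print([round(u, 3) for u in memberships])
```

Unlike hard K-means assignment, every center receives a nonzero degree, which is what allows each fuzzy rule to contribute to the network output in proportion to its activation.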