• Title/Summary/Keyword: k-Means clustering

Impact of Difference in Korean Wave Awareness among Chinese Women on Quality Perception and Purchasing Behavior of Korean Cosmetic Products (중국여성의 한류 인지도 차이가 한국 화장품에 대한 품질인식과 구매행동에 미치는 영향)

  • Lee, Jeong-Suk
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.10 / pp.5097-5104 / 2013
  • To derive implications for the marketing strategy of Korean cosmetic products in China, this study analyzed differences in quality perception and purchasing behavior between two groups of Chinese women classified by their awareness of the Korean Wave. K-means clustering, independent-samples t-tests, and factor analysis were applied to survey results from Chinese women residing in Guangzhou. The positive impact of the Korean Wave on quality perception and brand image is much stronger for the higher-awareness group than for the lower-awareness group, which leads to higher product satisfaction and greater willingness to recommend purchases. Marketing strategies therefore need to be adjusted according to differences in customers' awareness of the Korean Wave. However, because low price is the primary inducement to purchase for both groups, increased efforts to enhance brand image and product quality as premium products are strongly required, together with the utilization of the Korean Wave.

The Effect of the Number of Phoneme Clusters on Speech Recognition (음성 인식에서 음소 클러스터 수의 효과)

  • Lee, Chang-Young
    • The Journal of the Korea institute of electronic communication sciences / v.9 no.11 / pp.1221-1226 / 2014
  • In an effort to improve the efficiency of speech recognition, we investigate the effect of the number of phoneme clusters. For this purpose, codebooks with varying numbers of phoneme clusters are prepared by a modified k-means clustering algorithm. Subsequent processing uses fuzzy vector quantization (FVQ) and hidden Markov models (HMM) for the speech recognition test. The results show two distinct regimes. For a large number of phoneme clusters, recognition performance is roughly independent of that number. For a small number of phoneme clusters, however, the recognition error rate increases nonlinearly as the number decreases. Numerical calculation suggests that this nonlinear regime might be modeled by a power-law function. The results also show that about 166 phoneme clusters would be optimal for recognition of 300 isolated words, which amounts to roughly 3 variations per phoneme.
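The power-law regime mentioned in this abstract can be illustrated with a least-squares fit in log-log space. This is only a sketch: the functional form `err ≈ a * k**(-b)`, the function name `fit_power_law`, and all numbers below are assumptions made for illustration, and the data are synthetic rather than taken from the paper.

```python
import numpy as np

def fit_power_law(k, err):
    """Fit err ≈ a * k**(-b) by linear least squares on (log k, log err)."""
    slope, intercept = np.polyfit(np.log(k), np.log(err), 1)
    return np.exp(intercept), -slope

# Synthetic (hypothetical) data generated from a known power law,
# standing in for "error rate versus number of phoneme clusters".
clusters = np.array([8, 16, 32, 64, 128], dtype=float)
error_rate = 2.0 * clusters ** -0.7

a, b = fit_power_law(clusters, error_rate)
print(f"a={a:.3f}, b={b:.3f}")
```

Because the synthetic data follow the assumed law exactly, the fit recovers the generating parameters; with measured error rates the fit would apply only to the small-cluster regime.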

A Design on Face Recognition System Based on pRBFNNs by Obtaining Real Time Image (실시간 이미지 획득을 통한 pRBFNNs 기반 얼굴인식 시스템 설계)

  • Oh, Sung-Kwun;Seok, Jin-Wook;Kim, Ki-Sang;Kim, Hyun-Ki
    • Journal of Institute of Control, Robotics and Systems / v.16 no.12 / pp.1150-1158 / 2010
  • In this study, Polynomial-based Radial Basis Function Neural Networks (pRBFNNs) are proposed as the recognition part of an overall face recognition system that consists of a preprocessing part and a recognition part. The design methodology and procedure of the proposed pRBFNNs are presented to obtain a solution to the high-dimensional pattern recognition problem. First, in the preprocessing part, we use a CCD camera to obtain picture frames in real time. Histogram equalization partially corrects images distorted by natural as well as artificial illumination. The AdaBoost algorithm proposed by Viola and Jones is used to separate the facial image area from the non-facial image area. PCA is then used as the feature extraction algorithm to reduce the dimension of the high-dimensional facial image area. Second, we use pRBFNNs to identify the ID by recognizing the unique pattern of each person. The proposed pRBFNNs architecture consists of three functional modules, namely the condition part, the conclusion part, and the inference part, expressed as fuzzy rules in 'If-then' format. In the condition part of the fuzzy rules, the input space is partitioned with Fuzzy C-Means clustering. In the conclusion part of the rules, the connection weights of the pRBFNNs are represented as three kinds of polynomials: constant, linear, and quadratic. The coefficients of the connection weights are identified with back-propagation using the gradient descent method. The output of the pRBFNNs model is obtained by the fuzzy inference method in the inference part of the fuzzy rules. The essential design parameters of the networks (including the learning rate, momentum coefficient, and fuzzification coefficient) are optimized by means of Particle Swarm Optimization. The proposed pRBFNNs are applied to a real-time face recognition system and then evaluated from the viewpoint of output performance and recognition rate.
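The Fuzzy C-Means partitioning used in the condition part can be sketched as follows. This is a minimal textbook version, not the authors' implementation: the function name `fuzzy_c_means`, the random membership initialisation, the fixed iteration count, and the toy data are all assumptions. The parameter `m` plays the role of the fuzzification coefficient mentioned in the abstract.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    """Return (centers, U); U[i, j] is the membership of sample i in cluster j."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)           # memberships of each sample sum to 1
    for _ in range(n_iter):
        W = U ** m                               # fuzzified membership weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance from every sample to every center (epsilon avoids division by zero).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True) # standard FCM membership update
    return centers, U

# Toy data (hypothetical): two well-separated 2-D blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
centers, U = fuzzy_c_means(X, c=2)
```

Unlike hard k-means, each sample keeps a graded membership in every cluster, which is what lets the memberships serve as rule activation levels in a fuzzy-rule-based network.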

A Study on Price Volatility and Properties of Time-series for the Tangerine Price in Jeju (제주지역 감귤가격의 시계열적 특성 및 가격변동성에 관한 연구)

  • Ko, Bong-Hyun
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.6 / pp.212-217 / 2020
  • The purpose of this study was to analyze the volatility and time-series properties of tangerine prices in Jeju using the GARCH model of Bollerslev (1986). First, the time series of the rate of change in tangerine prices was found to have thicker tails than a normal distribution. At a significance level of 1%, the Jarque-Bera statistic led to rejection of the null hypothesis that the rate of change in tangerine prices is normally distributed. Second, the correlation in the time series was high according to the Ljung-Box Q statistic, which was statistically verified through the ARCH-LM test. Third, the GARCH(1,1) model estimates were statistically significant at the 1% level, except for the constant of the mean equation. The persistence parameter of the variance equation was estimated to be close to 1, which means that a similar level of volatility is highly likely to persist in the future. Finally, the results of this study are expected to serve as basic data for optimizing the government's tangerine supply and demand control policy.
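For reference, the standard GARCH(1,1) specification can be written as below. The symbols are the conventional ones and are not taken from the paper itself; here $r_t$ would be the rate of change in price and $\mu$ the constant of the mean equation.

```latex
r_t = \mu + \varepsilon_t, \qquad \varepsilon_t = \sigma_t z_t, \qquad z_t \sim \text{i.i.d.}(0, 1)

\sigma_t^2 = \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2,
\qquad \omega > 0,\ \alpha \ge 0,\ \beta \ge 0
```

In this notation the persistence is usually measured by $\alpha + \beta$. A value close to 1 means that volatility shocks decay slowly, which matches the abstract's reading that similar volatility is likely to continue.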

Mixed-effects zero-inflated Poisson regression for analyzing the spread of COVID-19 in Daejeon (혼합효과 영과잉 포아송 회귀모형을 이용한 대전광역시 코로나 발생 동향 분석)

  • Kim, Gwanghee;Lee, Eunjee
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.375-388 / 2021
  • This paper aims to help prevent the spread of COVID-19 by analyzing confirmed cases of COVID-19 in Daejeon. A high volume of visitors, downtown areas, and psychological fatigue with prolonged social distancing were considered as risk factors associated with the spread of COVID-19. We considered the weekly confirmed cases in each administrative district as the response variable. The explanatory variables were the number of passengers getting off at bus stations in each administrative district and the time elapsed since the Korean government imposed distancing in daily life. We employed a mixed-effects zero-inflated Poisson regression model because the number of cases was measured repeatedly and contained excess zero counts. We conducted k-means clustering to identify three groups of administrative districts with different characteristics in terms of the number of bars, the population size, and the distance to the closest college. Considering that the number of confirmed cases might vary with districts' characteristics, the clustering information was incorporated as a categorical explanatory variable. We found that COVID-19 was more prevalent in districts with larger populations and in downtown districts. As the number of passengers getting off in a downtown district increased, the confirmed cases increased significantly.
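The clustering step described here (three groups of districts from three features) can be sketched with a plain Lloyd-style k-means. Everything below is illustrative: the function name `kmeans`, the farthest-point initialisation, the standardisation step, and the synthetic data are assumptions, not details from the paper. The three columns stand in for number of bars, population size, and distance to the closest college.

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                # assign each point to nearest center
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):    # assignments have stabilised
            break
        centers = new_centers
    return labels, centers

# Synthetic (hypothetical) districts: 3 groups of 10, 3 features each.
rng = np.random.default_rng(0)
blob_means = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0], [20.0, 0.0, 10.0]])
X = np.vstack([mu + rng.normal(0.0, 0.5, (10, 3)) for mu in blob_means])
Z = (X - X.mean(axis=0)) / X.std(axis=0)         # put features on a common scale
labels, centers = kmeans(Z, k=3)
```

The resulting `labels` array is the kind of categorical variable the abstract describes feeding into the regression model. Standardising first keeps a feature with large units, such as population, from dominating the distance.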

Online news-based stock price forecasting considering homogeneity in the industrial sector (산업군 내 동질성을 고려한 온라인 뉴스 기반 주가예측)

  • Seong, Nohyoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.1-19 / 2018
  • Since forecasting stock movements is an important issue both academically and practically, studies on stock price prediction have been actively conducted. Stock price forecasting research is classified by its use of structured or unstructured data, and more specifically into technical analysis, fundamental analysis, and media effect analysis. In the big data era, research on stock price prediction that incorporates big data is actively underway, and such research mainly focuses on machine learning techniques. Methods that incorporate media effects have attracted particular attention recently, and among them studies that analyze online news to forecast stock prices are becoming mainstream. Previous studies that predict stock prices from online news mostly perform sentiment analysis of news, build a separate corpus for each company, and construct a dictionary that predicts stock prices by recording responses to past stock prices. Existing studies have therefore examined the impact of online news on individual companies. For example, stock movements of Samsung Electronics are predicted using only online news about Samsung Electronics. A method that considers influences among highly relevant companies has also been studied recently. For example, stock movements of Samsung Electronics are predicted using news about Samsung Electronics and a highly related company such as LG Electronics. These previous studies examine the effects of news from a homogeneous industrial sector on the individual company, where homogeneous industries are classified according to the Global Industrial Classification Standard. In other words, the existing studies assumed that industries defined by the Global Industrial Classification Standard are homogeneous. However, existing studies are limited in that they do not take into account influential companies with high relevance or reflect heterogeneity within the same Global Industrial Classification Standard sector. Our examination of various sectors shows that some industrial sectors are not homogeneous groups. To overcome these limitations, our study suggests a methodology that reflects the heterogeneous effects of the industrial sector on the stock price by applying k-means clustering. Multiple Kernel Learning is mainly used to integrate data with various characteristics; it has several kernels, each of which receives different data and makes predictions from it. We used Multiple Kernel Learning to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel was assigned to predict stock prices from financial-news variables of the target firm or of the industry groups obtained by k-means cluster analysis. To show that the suggested methodology is appropriate, experiments were conducted on three years of online news and stock prices. The results of this study are as follows. (1) We confirmed that information about the industrial sectors related to the target company is also meaningful for predicting the target company's stock movements, and that the machine learning algorithm has better predictive power when the news of relevant companies and the target company's news are considered together. (2) It is important to predict stock movements with a number of clusters that varies according to the level of homogeneity in the industrial sector. In other words, when stock prices are homogeneous within an industrial sector, it is important to use the relational effect at the level of the industry group without clustering, or to use a small number of clusters. When stock prices are heterogeneous within an industry group, it is important to cluster the firms into groups. This study contributes by showing that firms classified under the Global Industrial Classification Standard are heterogeneous and by suggesting that relevance should be defined through machine learning and statistical analysis rather than simply by the Global Industrial Classification Standard. It also contributes by demonstrating the efficiency of a prediction model that reflects heterogeneity.

A STUDY OF MANDIBULAR DENTAL ARCH OF KOREAN ADULTS (한국 성인 유치악자의 하악 치열궁에 관한 조사)

  • Kim, Il-Han;Choi, Dae-Gyun
    • The Journal of Korean Academy of Prosthodontics / v.36 no.1 / pp.166-182 / 1998
  • The purposes of this study are to evaluate the Korean mandibular dental arch and to classify its shape and size based on the incisal angle, canine angle, and inter-second molar width and height. Mandibular study models were fabricated using irreversible hydrocolloid impression material from 225 volunteers with a mean age of 23.62 years (range 19-29). The study models were measured with a 3-dimensional measuring device, the mandibular dental arch was classified by means of the K-means clustering method and visual inspection, and the obtained data were analyzed with the t-test. The results were as follows:
    1. The average canine height was 5.19mm (s.d. 1.17) in both sexes combined, 5.34mm in males, and 4.95mm in females. The sex difference was significant.
    2. The average second molar height was 39.81mm (s.d. 2.44) in both sexes combined, 40.19mm in males, and 39.21mm in females. The sex difference was significant.
    3. The average inter-canine width was 27.16mm (s.d. 1.78) in both sexes combined, 27.41mm in males, and 26.77mm in females. The sex difference was significant.
    4. The average inter-first molar width was 46.93mm (s.d. 2.67) in both sexes combined, 47.72mm in males, and 45.7mm in females. The sex difference was significant.
    5. The average inter-second molar width was 56.09mm (s.d. 3.01) in both sexes combined, 57.24mm in males, and 54.32mm in females. The sex difference was significant.
    6. The arch form was classified into three shapes based on the incisal and canine angles. The V-shape showed an incisal angle of $124.88^{\circ}$ and a canine angle of $141.64^{\circ}$, the U-shape showed $152.76^{\circ}$ and $125.35^{\circ}$, and the O-shape showed $138.03^{\circ}$ and $33.66^{\circ}$, respectively. Of the 225 study models, 14.2% were V-shaped, 14.7% were U-shaped, and 71.1% were O-shaped.
    7. The second molar width was judged more reasonable than the height for classifying dental arch size. The arch size was classified into four sizes based on the second molar width: size 1 ranged over 42.24-48.23mm, size 2 over 48.24-54.23mm, size 3 over 54.24-60.23mm, and size 4 over 60.24-66.23mm. Of the 225 study models, 1.3% were size 1, 27.1% were size 2, 63.6% were size 3, and 8.0% were size 4.

Spatio-Temporal Clustering Analysis of HPAI Outbreaks in South Korea, 2014 (2014년 국내 발생 HPAI(고병원성 조류인플루엔자)의 시·공간 군집 분석)

  • MOON, Oun-Kyong;CHO, Seong-Beom;BAE, Sun-Hak
    • Journal of the Korean Association of Geographic Information Studies / v.18 no.3 / pp.89-101 / 2015
  • Outbreaks of highly pathogenic avian influenza (HPAI) subtype H5N8 began in Korea in January 2014 and continued for more than a year, into 2015. More than 5 million poultry on 196 farms had been affected by May 2014. We therefore studied the spatial, temporal, and spatio-temporal patterns of the HPAI epidemics to understand the propagation and diffusion characteristics of the 2014 HPAI. The results are expressed using GIS. Three epidemic waves occurred over the study period, and the outbreaks formed three clusters in space. The first spatial cluster covers the adjacent areas of Chungcheongbuk-do, Chungcheongnam-do, and Gyeonggi-do. The second is the Gomso Bay area of Jeollabuk-do. The last is Naju and Yeongam in Jeollanam-do. Most spatio-temporal clusters formed in areas of high spatial clustering. In the Gomso Bay area in particular, spatial density and spatio-temporal density coincided, which means that effective prevention activity for HPAI was carried out. There are exceptions, however, such as the area where Chungcheongbuk-do, Chungcheongnam-do, and Gyeonggi-do adjoin. In these areas the outbreak density was high in space but no spatio-temporal cluster formed, which means that the HPAI virus continued to flow in over a long period.

A Study on Recommendation Technique Using Mining and Clustering of Weighted Preference based on FRAT (마이닝과 FRAT기반 가중치 선호도 군집을 이용한 추천 기법에 관한 연구)

  • Park, Wha-Beum;Cho, Young-Sung;Ko, Hyung-Hwa
    • Journal of Digital Contents Society / v.14 no.4 / pp.419-428 / 2013
  • Real-time accessibility and agility are required in u-commerce under a ubiquitous computing environment. Most existing recommendation techniques evaluate customers based on personal profiles, an approach that has difficulty accurately analyzing customers' levels of interest and tendencies and that also incurs cost problems, consequently leaving customers unsatisfied. Research has been conducted to improve the accuracy of information such as customers' levels of interest and tendencies. However, the problem lies not in the preconstructed database but in generating the new and diverse profiles that are used to evaluate the existing data. It is also difficult for existing recommendation techniques to apply a distinct, hierarchical recommendation method to each customer, given customers' varied characteristics. Accordingly, unlike other evaluation techniques, this dissertation used an implicit method based on purchasing data, without burdening users with questions and answers. We applied the FRAT technique, which can analyze diverse personalization tendencies and accurately characterize customers.

A Post-Verification Method of Near-Duplicate Image Detection using SIFT Descriptor Binarization (SIFT 기술자 이진화를 이용한 근-복사 이미지 검출 후-검증 방법)

  • Lee, Yu Jin;Nang, Jongho
    • Journal of KIISE / v.42 no.6 / pp.699-706 / 2015
  • In recent years, as near-duplicate images have increased explosively with the spread of the Internet and of image-editing technology that allows easy access to image content, related research has been actively pursued. However, BoF (Bag-of-Feature), the most frequently used method for near-duplicate image detection, can treat identical features as different, or different features as identical, in the quantization process of approximating high-level local features as low-level ones. A post-verification method for BoF is therefore required to overcome this limitation of vector quantization. In this paper, we propose and analyze the performance of a post-verification method for BoF that converts SIFT (Scale Invariant Feature Transform) descriptors into 128-bit binary codes and uses these codes to compare binary distances over the short ranked list returned by BoF. An experiment using 1500 original images showed that the near-duplicate detection accuracy improved by approximately 4% over the previous BoF method.
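The idea of turning a 128-dimensional descriptor into a 128-bit code and comparing codes by Hamming distance can be sketched as follows. The abstract does not state the binarization rule, so thresholding each component at the descriptor's own median is an assumption made purely for illustration. The function names `binarize` and `hamming` and the random stand-in descriptors are likewise hypothetical.

```python
import numpy as np

def binarize(desc):
    """Map a 128-dim descriptor to 128 bits (16 bytes).

    Assumed rule: a bit is 1 where the component exceeds the descriptor's median.
    """
    desc = np.asarray(desc, dtype=float)
    return np.packbits(desc > np.median(desc))

def hamming(a, b):
    """Number of differing bits between two packed binary codes."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

# Random stand-ins for SIFT descriptors (not real image features).
rng = np.random.default_rng(0)
d1 = rng.integers(0, 256, 128)
d2 = d1 + rng.integers(-2, 3, 128)   # slightly perturbed copy of d1
d3 = rng.integers(0, 256, 128)       # unrelated descriptor

c1, c2, c3 = binarize(d1), binarize(d2), binarize(d3)
print(hamming(c1, c2), hamming(c1, c3))
```

In a post-verification step of this kind, candidates from the BoF ranked list whose matched descriptors have small Hamming distances would be kept, while matches that only arose from coarse quantization would tend to show large distances.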