• Title/Summary/Keyword: K-means Cluster Analysis

The recent research wave in ecotourism research using keyword network analysis (키워드 네트워크 분석을 활용한 생태관광연구 경향 분석)

  • Lee, Jae-Hyuck;Son, Yong-Hoon
    • Journal of Korean Society of Rural Planning / v.22 no.2 / pp.45-55 / 2016
  • Since 1970, when the concept of ecotourism was introduced, many ecotourism studies have appeared, and reviewing them is necessary to guide future research. Some review studies exist, but they are limited by subjectivity, and some kinds of papers have not been reviewed. To overcome these limitations, this study uses keyword network analysis, a method used in big data analysis. A total of 2,455 foreign and 163 domestic studies with 'ecotourism' among their keywords were analyzed. As a result, three clusters ('Sustainable tourism development', 'Ecological conservation', and 'Ecotourist analysis') appeared in ecotourism studies. These clusters are also closely related to region: 'Sustainable tourism development' is related to Eurasia, Australia, and Europe; 'Ecological conservation' is related to Africa; and 'Ecotourist analysis' is related to North America. In particular, 'Resident participation' and 'Stakeholder' appeared many times in the Asia region. These results show that ecotourism studies are interpreted in regional contexts. Because the single word 'ecotourism' is used in different contexts, a regional approach is needed for its precise use. In Korea, the keywords focus on ecotourists and development. As Korea has many ecotour villages, studies on resident participation need to be supplemented.

A Study on K-Means Clustering

  • Bae, Wha-Soo;Roh, Se-Won
    • Communications for Statistical Applications and Methods / v.12 no.2 / pp.497-508 / 2005
  • This paper studies K-means clustering, focusing on initialization, which affects the results of K-means cluster analysis. Four methods (the MA method, the KA method, the Max-Min method, and the Space Partition method) were compared. The clustering results differed among the methods: the MA method sometimes led to incorrect clustering because of inappropriate initialization, depending on the type of data, while the Max-Min method was more effective than the other methods, especially when the data size was large.
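
The abstract does not define the four initialization methods, so the following is only a minimal sketch of one common reading of "Max-Min" initialization (farthest-first seeding), followed by the standard K-means iteration. The function names `max_min_init` and `kmeans`, the choice of the first point as the starting seed, and the sample data are all assumptions made for illustration, not the authors' procedure.

```python
import math

def max_min_init(points, k):
    """Farthest-first seeding (assumed reading of 'Max-Min'):
    start from the first point, then repeatedly add the point whose
    distance to its nearest already-chosen center is largest."""
    centers = [points[0]]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(farthest)
    return centers

def kmeans(points, centers, iters=100):
    """Standard K-means: alternate assignment and mean-update steps
    until the centers stop changing or `iters` is reached."""
    for _ in range(iters):
        # Assignment step: index of the nearest center for every point.
        labels = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Hypothetical data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (0, 20), (1, 20), (0, 21)]
centers, labels = kmeans(points, max_min_init(points, 3))
```

Because the seeds are spread as far apart as possible, each group in this toy data receives exactly one initial center, which is the kind of behavior that makes the result less sensitive to a poor starting configuration.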

Classification of Daily Precipitation Patterns in South Korea using Multivariate Statistical Methods

  • Mika, Janos;Kim, Baek-Jo;Park, Jong-Kil
    • Journal of Environmental Science International / v.15 no.12 / pp.1125-1139 / 2006
  • A cluster analysis of diurnal precipitation patterns is performed using daily precipitation at 59 stations in South Korea from 1973 to 1996, separately for the four seasons of each year. The four seasons are shifted forward by 15 days compared to the conventional ones. The number of clusters is 15 in winter, 16 in spring and autumn, and 26 in summer. One of the classes in each season is the totally dry day, on which precipitation is not observed at any station; this class is treated separately in this study. The distribution of days among the clusters is rather uneven, with rather low area-mean precipitation occurring most frequently. These 4 (seasons)$\times$2 (wet and dry days) classes represent more than half (59%) of all days of the year. On the other hand, even the smallest seasonal clusters have at least $5\sim9$ members in the 24-year (1973-1996) classification period. To account for spatial correlation, the cluster analysis is performed directly on the major $5\sim8$ non-correlated coefficients of the diurnal precipitation patterns obtained by factor analysis. More specifically, hierarchical clustering based on Euclidean distance and Ward's method of agglomeration is applied. The relative variance explained by the clustering is 63% on average, with better capability in spring (66%) and winter (69%) but below average in autumn (60%) and summer (59%). By applying weighted relative variances, i.e. dividing the squared deviations by the cluster averages, we obtain even better values, i.e. 78% on average, compared to the same index without clustering. This means that the highest variance remains in the clusters with more precipitation. Besides all statistics necessary for validating the final classification, 4 cluster centers are mapped for each season to illustrate the range of typical extremes, paired according to their area-mean precipitation or negative pattern correlation. Possible alternatives to the performed classification and the reasons for their rejection are also discussed, together with a wide spectrum of recommended applications.
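
The abstract does not state how "relative variance explained by the clustering" is computed. A minimal sketch, assuming the usual definition (one minus the within-cluster sum of squares divided by the total sum of squares), is shown below; the function name `explained_variance` and the sample values are hypothetical, and the weighted variant mentioned in the abstract is not reproduced here.

```python
def explained_variance(points, labels):
    """Share of total squared deviation removed by the clustering:
    1 - (within-cluster sum of squares) / (total sum of squares)."""
    def sum_sq_dev(group):
        # Squared deviations of every point from the group's mean vector.
        mean = [sum(x) / len(group) for x in zip(*group)]
        return sum(sum((v - m) ** 2 for v, m in zip(p, mean)) for p in group)

    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    total = sum_sq_dev(points)
    within = sum(sum_sq_dev(group) for group in clusters.values())
    return 1 - within / total

# Hypothetical one-dimensional data split into two clusters.
points = [(0.0,), (2.0,), (10.0,), (12.0,)]
labels = [0, 0, 1, 1]
ratio = explained_variance(points, labels)
```

Under this definition a value near 1 means the cluster means capture most of the variability, while a value of 0 means the clustering explains nothing beyond the overall mean.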

An Analysis of Replication Enhancement for a High Availability Cluster

  • Park, Sehoon;Jung, Im Y.;Eom, Heonsang;Yeom, Heon Y.
    • Journal of Information Processing Systems / v.9 no.2 / pp.205-216 / 2013
  • In this paper, we analyze a technique for building a high-availability (HA) cluster system. We propose what we have termed the 'Selective Replication Manager (SRM),' which improves the throughput performance and reduces the latency of disk devices by means of a Distributed Replicated Block Device (DRBD), which is integrated in the recent Linux Kernel (version 2.6.33 or higher) and that still provides HA and failover capabilities. The proposed technique can be applied to any disk replication and database system with little customization and with a reasonably low performance overhead. We demonstrate that this approach using SRM increases the disk replication speed and reduces latency by 17% and 7%, respectively, as compared to the existing DRBD solution. This approach represents a good effort to increase HA with a minimum amount of risk and cost in terms of commodity hardware.

Comparison of Reflection Hierarchy, Team Learning Climate, and Learning Organization Building on Nursing Competency in Clinical Nurses (간호역량 군집 유형에 따른 성찰 수준, 팀학습 분위기 및 학습조직 구축정도 비교)

  • Kim, Heeyoung;Jang, Keum Seong
    • Journal of Korean Academy of Nursing Administration / v.19 no.2 / pp.282-291 / 2013
  • Purpose: The purpose of this study was to identify clusters of nursing competency, and investigate the influence of reflective thinking, team learning climate, and learning organization building according to nursing competency clusters. Methods: Participants were 244 clinical nurses who worked in 4 general hospitals in Gwangju Metropolitan City. Data were collected by self-report questionnaires during June and July, 2011. Nursing competency, levels of reflection hierarchy, team learning climate, and learning organization building were measured. Data were analyzed using frequencies, means, t-test, one-way ANOVA, Pearson correlation coefficients, and K-means cluster analysis with SPSS/WIN 20.0 version. Results: Nursing competency correlated positively with intensive reflection, reflection, team learning climate, and learning organization building (p<.001). There were three clusters of nursing competency in a clinical ladder, which were derived from cluster analysis, grouped as high, middle, and low competency. Intensive reflection, reflection, team learning climate, and learning organization building showed significant differences according to grouping of nursing competency. Conclusion: The results indicate that developing intensive reflection, reflection, team learning climate, and learning organization building would be useful strategies for enhancement of nursing competency.

Cluster Analysis of Climate Data for Applying Weather Marketing (날씨 마케팅 적용을 위한 기후 데이터의 군집 분석)

  • Lee, Yang-Koo;Kim, Won-Tae;Jung, Young-Jin;Kim, Kwang-Deuk;Ryu, Keun-Ho
    • Journal of Korea Spatial Information System Society / v.7 no.3 s.15 / pp.33-44 / 2005
  • Recently, the weather has been affected by environmental pollution, and oil prices have risen because of resource shortages. As a result, weather and energy strongly influence not only enterprises and nations but also individuals' daily lives and economic activities. For these reasons, there is much research on managing the solar radiation data needed to develop solar energy as an alternative energy source, and many researchers are also interested in identifying areas according to the changing characteristics of climate data. However, this research has not addressed how to apply cluster analysis, retrieval, and analytical results according to the characteristics of each area through data mining. In this paper, we design a data model for storing and managing climate data collected in twenty domestic cities. We cluster the domestic climate data using the k-means clustering algorithm and provide information according to the characteristics of each area. We also suggest how to apply the results to department stores and amusement parks as examples of weather marketing. The proposed system is useful for constructing a weather marketing database and for providing the related elements and analysis information.

COUNTING OF FLOWERS BASED ON K-MEANS CLUSTERING AND WATERSHED SEGMENTATION

  • PAN ZHAO;BYEONG-CHUN SHIN
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.27 no.2 / pp.146-159 / 2023
  • This paper proposes a hybrid algorithm combining K-means clustering and watershed algorithms for flower segmentation and counting. We use the K-means clustering algorithm to obtain the main colors in a complex background from the cluster centers and then apply a color space transformation to extract pixel values for the hue, saturation, and value of the flower color. Next, we apply a threshold segmentation technique to segment the flowers precisely and obtain a binary image of the flowers. Based on this, we apply the Euclidean distance transform to obtain the distance map and use it to find the local maxima of the connected components. Afterward, the proposed algorithm adaptively determines a minimum distance between peaks and applies it to label connected components using watershed segmentation with eight-connectivity. On a dataset of 30 images, the test results reveal that the proposed method is more efficient and precise for counting overlapped flowers, regardless of the degree of overlap, the number of overlaps, and relatively irregular shapes.

A Comparison of Cluster Analyses and Clustering of Sensory Data on Hanwoo Bulls (군집분석 비교 및 한우 관능평가데이터 군집화)

  • Kim, Jae-Hee;Ko, Yoon-Sil
    • The Korean Journal of Applied Statistics / v.22 no.4 / pp.745-758 / 2009
  • Cluster analysis is the automated search for groups of related observations in a data set. Many techniques have been proposed to group observations into clusters, and a variety of measures aimed at validating the results of a cluster analysis have been suggested. In this paper, we compare complete linkage, Ward's method, K-means, and model-based clustering, and compute validity measures such as connectivity, the Dunn index, and silhouette width on simulated data from multivariate distributions. We also select a clustering algorithm and determine the number of clusters of Korean consumers based on their palatability scores for Hanwoo bull prepared with the BBQ cooking method.
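
Two of the validity measures named in the abstract can be sketched in a few lines. The code below assumes the textbook definitions (mean silhouette width, and the Dunn index as the smallest between-cluster distance divided by the largest cluster diameter) with Euclidean distance; the paper's exact variants are not given, and the function names and toy data are hypothetical. It also assumes at least two clusters and no duplicate points.

```python
import math

def silhouette(points, labels):
    """Mean silhouette width: for each point, (b - a) / max(a, b), where
    a is the mean distance to its own cluster and b is the smallest mean
    distance to any other cluster. Singletons score 0 by convention."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # dist(p, p) is 0, so dividing by len(own) - 1 excludes the point itself.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def dunn_index(points, labels):
    """Smallest distance between points in different clusters divided by
    the largest distance between points in the same cluster."""
    pairs = [(i, j) for i in range(len(points)) for j in range(i + 1, len(points))]
    min_separation = min(math.dist(points[i], points[j])
                         for i, j in pairs if labels[i] != labels[j])
    max_diameter = max(math.dist(points[i], points[j])
                       for i, j in pairs if labels[i] == labels[j])
    return min_separation / max_diameter

# Hypothetical data: the first labeling matches the two visible groups,
# the second deliberately mixes them.
points = [(0, 0), (0, 1), (5, 0), (5, 1)]
good, bad = [0, 0, 1, 1], [0, 1, 0, 1]
scores = {name: (silhouette(points, lab), dunn_index(points, lab))
          for name, lab in (("good", good), ("bad", bad))}
```

For both measures larger values indicate more compact, better-separated clusters, so the labeling that follows the visible groups should score higher on each.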

Detection of onset of failure in prestressed strands by cluster analysis of acoustic emissions

  • Ercolino, Marianna;Farhidzadeh, Alireza;Salamone, Salvatore;Magliulo, Gennaro
    • Structural Monitoring and Maintenance / v.2 no.4 / pp.339-355 / 2015
  • Corrosion of prestressed concrete structures is one of the main challenges that engineers face today. In response to this national need, this paper presents the results of a long-term project that aims at developing a structural health monitoring (SHM) technology for the nondestructive evaluation of prestressed structures. In this paper, the use of permanently installed low profile piezoelectric transducers (PZT) is proposed in order to record the acoustic emissions (AE) along the length of the strand. The results of an accelerated corrosion test are presented and k-means clustering is applied via principal component analysis (PCA) of AE features to provide an accurate diagnosis of the strand health. The proposed approach shows good correlation between acoustic emissions features and strand failure. Moreover, a clustering technique for the identification of false alarms is proposed.

Re-evaluation of Obesity Syndrome Differentiation Questionnaire Based on Real-world Survey Data Using Data Mining (데이터 마이닝을 이용한 한의비만변증 설문지 재평가: 실제 임상에서 수집한 설문응답 기반으로)

  • Oh, Jihong;Wang, Jing-Hua;Choi, Sun-Mi;Kim, Hojun
    • Journal of Korean Medicine for Obesity Research / v.21 no.2 / pp.80-94 / 2021
  • Objectives: The purpose of this study is to re-evaluate the importance of the questions in the obesity syndrome differentiation (OSD) questionnaire based on a real-world survey and to explore the possibility of simplifying OSD types. Methods: The OSD frequency was identified, and variance threshold feature selection was performed to filter the questions. Filtered questions were clustered by K-means clustering and hierarchical clustering. After principal component analysis (PCA), the distribution patterns of the subjects were identified and the differences in the syndrome distribution were compared. Results: The frequency of OSD in spleen deficiency, phlegm (PH), and blood stasis (BS) was lower than in food retention (FR), liver qi stagnation (LS), and yang deficiency. We excluded 13 questions with low variance, 7 of which were related to BS. Filtered questions were clustered into 3 groups by K-means clustering: Cluster 1 (17 questions) was mainly related to PH and BS syndromes; Cluster 2 (11 questions) was related to swelling and indigestion; Cluster 3 (11 questions) was related to overeating or emotional symptoms. After PCA, significantly different patterns of subjects were observed in the FR, LS, and other obesity syndromes. The questions that mainly affected the FR distribution were digestive symptoms, while emotional symptoms mainly affected the distribution of LS subjects. The other obesity syndromes were partially affected by both digestive and emotional symptoms, and also by symptoms related to poor circulation. Conclusions: In-depth data mining analysis identified questions of relatively low importance and the potential to simplify OSD types.
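
The filtering step described in the Methods can be illustrated with a minimal sketch of variance threshold feature selection: questions whose answers barely vary across respondents are dropped before clustering. The function name `variance_threshold`, the cutoff of 0.5, and the response matrix are hypothetical; the abstract does not report the threshold or software used.

```python
from statistics import pvariance

def variance_threshold(rows, threshold):
    """Keep only the columns (questions) whose population variance across
    rows (respondents) exceeds `threshold`. Returns the kept column
    indices and the filtered rows."""
    columns = list(zip(*rows))
    keep = [j for j, col in enumerate(columns) if pvariance(col) > threshold]
    filtered = [[row[j] for j in keep] for row in rows]
    return keep, filtered

# Hypothetical responses: 4 respondents x 3 questions.
# Question 0 is answered identically by everyone, so it carries no information.
responses = [
    [1, 3, 0],
    [1, 1, 5],
    [1, 3, 0],
    [1, 1, 5],
]
kept, filtered = variance_threshold(responses, threshold=0.5)
```

Since the study clusters questions rather than respondents, the filtered matrix would be transposed (one row per question) before being passed to a clustering routine.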