• Title/Abstract/Keywords: Correlation clustering

272 results (processing time: 0.024 s)

선박운항 시뮬레이터 실험조건 축소화 연구 (Reduction of Simulation Number for Ship Handling Safety Assessment)

  • 권세혁;오현승
    • 산업경영시스템학회지 / Vol. 35, No. 1 / pp. 101-106 / 2012
  • A ship handling simulator is a virtual ship navigation system with a three-dimensional screen system and simulation programs. FTS simulation can in theory produce an unlimited number of experimental runs without time constraints, but it yields only deterministic observations. RTS simulation can collect statistical observations but has the disadvantage of requiring at least 30 minutes for a single experiment. Previous studies suggested that the number of experiment conditions to be tested with RTS simulation could be reduced, while still obtaining random data, by focusing on the conditions that are most difficult for ship handling. That approach has the limitation that it does not estimate the distribution of ship handling difficulty along the route. In this paper, similarity and clustering analysis are proposed as a methodology for reducing the number of experiment conditions. Similarity between experiment conditions is measured in two ways: the Euclidean distance of the ship handling difficulty index and the correlation matrix of the distance differences from the designed route. Clustering analysis and multi-dimensional scaling are applied to classify experiment conditions by the measured similarity and thereby reduce the number of RTS simulation conditions. An empirical result for Dangin harbor is shown and discussed.
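The two similarity measures named in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the data layout (one vector of difficulty-index values and one series of route deviations per experiment condition), the function names, and the use of `1 - r` to turn a correlation into a dissimilarity are all hypothetical choices.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length difficulty-index vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def dissimilarity_matrix(rows, measure):
    """Pairwise matrix of measure(row_i, row_j) over all conditions."""
    return [[measure(r, s) for s in rows] for r in rows]

# Hypothetical data: three experiment conditions, four samples each.
difficulty = [[0.2, 0.4, 0.6, 0.8], [0.2, 0.5, 0.6, 0.9], [0.9, 0.7, 0.3, 0.1]]
deviation = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]

# Euclidean distance on the difficulty index (assumed layout).
d_euclid = dissimilarity_matrix(difficulty, euclidean)
# Correlation-based dissimilarity on route deviations (1 - r is an assumption).
d_corr = dissimilarity_matrix(deviation, lambda a, b: 1.0 - pearson(a, b))
```

Either matrix could then be passed to a clustering or multi-dimensional scaling routine; conditions whose deviation series move together end up close to each other under `d_corr`.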

대학도서관의 종합목록 기여 활동 및 이용 정도에 대한 탐사적 연구 (Exploratory Study on the Activity about Utilization and Contribution to the Union Catalog)

  • 조재인
    • 한국비블리아학회지 / Vol. 26, No. 1 / pp. 35-50 / 2015
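  • A sense of community and a cooperative spirit among the libraries participating in a bibliographic network matter most for revitalizing a union catalog, but appropriate compensation for contributions can also motivate participation. This study therefore reviews the contribution-reward schemes of overseas union catalogs and conducts an exploratory analysis of the contribution activities and usage levels of libraries participating in the Korean university library union catalog. Specifically, first, descriptive statistics are used to establish the overall status of contribution activity and usage among participating libraries. Second, Pearson correlation analysis is used to examine what correlation exists between contribution activity and degree of use. Third, hierarchical clustering is used to classify the participating institutions into types and to analyze the size of the contributing groups and whether groups of special contributors exist.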

소프트웨어 불법복제에 영향을 미치는 환경 요인에 기반한 국가 분류 (Country Clustering Based on Environmental Factors Influencing on Software Piracy)

  • 서보밀;심준호
    • 한국정보시스템학회지:정보시스템연구 / Vol. 26, No. 4 / pp. 227-246 / 2017
  • Purpose: As the importance of software has grown, the software market has continued to expand, but its development is adversely affected by software piracy. In this study, we classify countries around the world based on the macro-environmental factors that influence software piracy, and we identify the differences in software piracy across the resulting types. Design/methodology/approach: A data-driven approach is used. From the BSA, the World Bank, and the OECD, we collect data from 1990 to 2015 for 127 environmental variables of 225 countries. Cronbach's $\alpha$ analysis, item-to-total correlation analysis, and exploratory factor analysis derive 15 constructs from the data. We apply a two-step approach to cluster analysis: the number of clusters is determined to be 5 by hierarchical cluster analysis in the first step, and the countries are classified by K-means clustering in the second step. We conduct ANOVA and MANOVA to verify the differences in environmental factors and software piracy among the derived clusters. Findings: The five clusters are identified as underdeveloped countries, developing countries, developed countries, world powers, and developing countries with large markets. There are statistically significant differences in the environmental factors among the clusters. In addition, there are statistically significant differences in software piracy rate, pirated value, and legal software sales among the clusters.
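A two-step cluster analysis of the kind described above (hierarchical clustering to fix the number of clusters, then K-means to assign members) might look like the following sketch. The abstract does not say which linkage or which rule was used to pick the number of clusters, so the centroid-distance merging, the "largest jump in merge distance" rule, the toy data, and all function names here are assumptions.

```python
import math

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def hierarchical(points):
    """Step 1: merge the two clusters with the closest centroids, recording
    the merge distance and the partition at every level (n clusters down to 1)."""
    clusters = [[p] for p in points]
    levels = [(0.0, [c[:] for c in clusters])]
    while len(clusters) > 1:
        d, i, j = min(
            (math.dist(mean(clusters[i]), mean(clusters[j])), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        levels.append((d, [c[:] for c in clusters]))
    return levels

def choose_partition(levels):
    """Assumed rule: cut just before the largest jump in merge distance."""
    jumps = [levels[t + 1][0] - levels[t][0] for t in range(len(levels) - 1)]
    t = max(range(len(jumps)), key=jumps.__getitem__)
    return levels[t][1]

def kmeans(points, centers, iterations=20):
    """Step 2: Lloyd's K-means started from the given centers."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
    return groups, centers

# Hypothetical two-factor scores for six countries.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
partition = choose_partition(hierarchical(points))
groups, centers = kmeans(points, [mean(c) for c in partition])
```

Seeding K-means with the centroids of the hierarchical partition is one common way to link the two steps; other initializations would fit the abstract's description equally well.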

군집분석을 이용한 국지해일모델 지역확장 (Regional Extension of the Neural Network Model for Storm Surge Prediction Using Cluster Analysis)

  • 이다운;서장원;윤용훈
    • 대기 / Vol. 16, No. 4 / pp. 259-267 / 2006
  • In the present study, a neural network (NN) model combined with cluster analysis was developed to predict storm surge across all Korean coastal regions, with a special focus on regional extension. The model used in this study is an NN model built for each cluster (CL-NN) obtained from the cluster analysis. To find the optimal clustering of the stations, the agglomerative method, one of the hierarchical clustering methods, was used. Stations were clustered according to the centroid-linkage criterion, and the cluster analysis stops when the distance between the groups to be merged exceeds a given criterion. The CL-NN can then be constructed to predict storm surge in each cluster region. To validate the model results, the sea level values predicted by the CL-NN model were compared with those of conventional harmonic analysis (HA) and of the NN model in each region. The forecast values from the NN and CL-NN models agree more closely with the observed data than those of HA. In particular, statistics such as RMSE and the correlation coefficient show little difference between the CL-NN and NN model results. These results show that cluster analysis and the CL-NN model can be applied to regional storm surge prediction and to the development of a forecast system.
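The stopping rule described in this abstract (agglomerative clustering with centroid linkage that halts once the closest pair of groups is farther apart than a criterion) can be sketched as below. The station coordinates, the threshold value, and the function names are hypothetical; the abstract does not state which features or criterion value were used.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def centroid_linkage(points, threshold):
    """Repeatedly merge the two clusters whose centroids are closest, and
    stop when that closest distance exceeds `threshold` (or one cluster is left)."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # remaining groups are too far apart to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical station features (for example, two surge statistics per station).
stations = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 20.0)]
groups = centroid_linkage(stations, threshold=3.0)
```

With these toy values the two nearby pairs merge and the distant station stays on its own; in a CL-NN style setup, one model would then be trained per resulting group.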

도로제설 이력자료 기반 제설 인프라 분석 (Analysis of Road Snow-removal Infrastructure using Road Snow-removal Historical Data)

  • 김진국;김승범;양충헌
    • 한국도로학회논문집 / Vol. 19, No. 3 / pp. 83-90 / 2017
  • PURPOSES: In this study, systematic road snow-removal capabilities were estimated based on historical data for road snow-removal works. The final results can be used to aid decision-making strategies for cost-effective snow-removal works by regional offices. METHODS: First, road snow-removal historical data from the road snow-removal management system (RSMS), operated by the Ministry of Land, Infrastructure and Transport, were employed to determine specific characteristics of the snow-removal capabilities by region. The amounts of infrastructure actually owned and actually used were analyzed for the past three years. Second, the regional offices were classified using K-means clustering into groups "close" to one another. The snow-removal infrastructure actually used was determined from the number of snow-removal working days. Finally, the correlation between the de-icing materials used and infrastructure was analyzed. Significant differences were found among the amounts of used infrastructure depending on snowfall intensity for each regional office during the past three years. RESULTS: The results showed that the amount of snow-removal infrastructure used for low heavy-snowfall intensity did not appear to depend on the amount of heavy snowfall, and therefore high variation is observed in each area. CONCLUSIONS: This implies that the final analysis results will be useful when making decisions on snow-removal works.

Study of Data Placement Schemes for SNS Services in Cloud Environment

  • Chen, Yen-Wen;Lin, Meng-Hsien;Wu, Min-Yan
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 9, No. 8 / pp. 3203-3215 / 2015
  • Due to the rapid growth of the SNS population, service scalability is one of the critical issues to be addressed. The cloud environment provides flexible computing and storage resources for service deployment, which fits the characteristics of scalable SNS deployment. However, if SNS-related information is not properly placed, it will cause unbalanced load and heavy transmission cost on the storage virtual machines (VMs) and the cloud data center (CDC) network. In this paper, we characterize the SNS as a graph model based on the users' associations and interest correlations. The node weight represents the degree of association, which can be indexed by the number of friends or data sources, and the link weight denotes the correlation between users/data sources. Then, based on the SNS graph, a two-step algorithm is proposed to determine the placement of SNS-related data among VMs. Two k-means based clustering schemes are proposed to allocate social data to the proper VMs and physical servers for the pre-configured VM and dynamic VM environments, respectively. An experimental example was conducted to illustrate and compare the performance of the proposed schemes.

Empirical Comparison of Word Similarity Measures Based on Co-Occurrence, Context, and a Vector Space Model

  • Kadowaki, Natsuki;Kishida, Kazuaki
    • Journal of Information Science Theory and Practice / Vol. 8, No. 2 / pp. 6-17 / 2020
  • Word similarity is often measured to enhance system performance in the information retrieval field and other related areas. This paper reports on an experimental comparison of values for word similarity measures that were computed based on 50 intentionally selected words from a Reuters corpus. There were three targets: (1) co-occurrence-based similarity measures (for which a co-occurrence frequency is counted as the number of documents or sentences), (2) context-based distributional similarity measures obtained from latent Dirichlet allocation (LDA), nonnegative matrix factorization (NMF), and the Word2Vec algorithm, and (3) similarity measures computed from the tf-idf weights of each word according to a vector space model (VSM). The Pearson correlation coefficient was highest for the pair consisting of the VSM-based similarity measures and the co-occurrence-based similarity measures counted by number of documents. Group-average agglomerative hierarchical clustering was also applied to similarity matrices computed by individual measures. An evaluation of the cluster sets according to an answer set revealed that VSM- and LDA-based similarity measures performed best.
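As a rough illustration of the VSM-based measure in item (3), each word can be represented by its vector of tf-idf weights across documents, and two words compared by cosine similarity. The paper's exact weighting and similarity formulas are not given in the abstract, so the `tf * log(N / df)` weighting, the cosine measure, the toy corpus, and the function names below are assumptions.

```python
import math
from collections import Counter

def word_vectors(docs):
    """Map each word to its vector of tf-idf weights, one entry per document."""
    n = len(docs)
    counts = [Counter(doc) for doc in docs]           # term frequencies per document
    df = Counter(word for c in counts for word in c)  # document frequency per word
    return {
        word: [c[word] * math.log(n / df[word]) for c in counts]
        for word in df
    }

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["oil", "price", "opec"],
    ["oil", "opec", "barrel"],
    ["bank", "rate", "loan"],
    ["bank", "loan", "price"],
]
vecs = word_vectors(docs)
sim_oil_opec = cosine(vecs["oil"], vecs["opec"])  # words sharing the same documents
sim_oil_bank = cosine(vecs["oil"], vecs["bank"])  # words with no document in common
```

A full word-by-word matrix of such values is the kind of similarity matrix to which group-average agglomerative clustering could then be applied.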

한국산 포유동물 24종(13과 6목)의 형태적 형질의 분석 (Morphometric Analyses on 24 Species (13 Families of Six Orders) of Korean Mammals)

  • 고홍선
    • 한국동물학회지 / Vol. 32, No. 1 / pp. 14-21 / 1989
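  • Four external characters and 22 cranial characters were measured in 279 specimens of 24 species in six orders of Korean mammals, and the measurements were analyzed by ordination and clustering methods. The morphometric analyses placed the weasel (족제비), the squirrel (청서), and Thomas's shrew (토마스 땃쥐) differently from their positions in the current classification system. The morphological difference among Korean mammals at the order level was larger than that of other mammals, but below the order level there was little difference. The values of average taxonomic distance and morphological difference increased with rank in the taxonomic hierarchy, and the correlation coefficient between the two matrices was 0.59.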

Toward Successful Management of Vocational Rehabilitation Services for People with Disabilities: A Data Mining Approach

  • Kim, Yong Seog
    • Industrial Engineering and Management Systems / Vol. 11, No. 4 / pp. 371-384 / 2012
  • This study proposes a multi-level data analysis approach to identify both superficial and latent relationships among variables in a data set obtained from a vocational rehabilitation (VR) services program for people with significant disabilities. At the first layer, data mining and statistical predictive models are used to extract the superficial relationships between dependent and independent variables. To supplement the findings and relationships from the first-layer analysis, association rule mining algorithms are employed at the second layer to extract additional sets of interesting associative relationships among variables. Finally, nonlinear nonparametric canonical correlation analysis (NLCCA) is employed together with a clustering algorithm to identify latent nonlinear relationships. Experimental outputs validate the usefulness of the proposed approach. In particular, the identified latent relationship indicates that disability types (i.e., physical and mental) and severity (i.e., severe, most severe, not severe) have a significant impact on the levels of self-esteem and self-confidence of people with disabilities. The identified superficial and latent relationships can be used to train education program designers and policy developers to maximize the outcomes of VR training programs.

Decision support system for underground coal pillar stability using unsupervised and supervised machine learning approaches

  • Kamran, Muhammad;Shahani, Niaz Muhammad;Armaghani, Danial Jahed
    • Geomechanics and Engineering / Vol. 30, No. 2 / pp. 107-121 / 2022
  • Coal pillar assessment is of broad importance to underground engineering structures, as pillar failure can lead to enormous disasters. Because of the highly non-linear correlation between pillar failure and its influential attributes, conventional forecasting techniques cannot generate accurate outcomes. To approximate the complex behavior of coal pillars, this paper presents a new approach to forecasting underground coal pillar stability using combined unsupervised-supervised learning. To build the database for the study, a total of 90 pillar cases were collected from real engineering structures. A state-of-the-art dimensionality reduction method, t-distributed stochastic neighbor embedding (t-SNE), was employed to reduce the dimensionality of the original data features. K-means clustering, an unsupervised machine learning technique, was then applied to the t-SNE-reduced data to assign each coal pillar case to a relative class. Following that, the reassigned dataset was divided into two parts: 70 percent for training and 30 percent for testing. The accuracy of the predicted data was then examined for the support vector classifier (SVC) model using performance measures such as precision, recall, and f1-score. As a result, the proposed model can be employed for properly predicting the pillar failure class in a variety of underground rock engineering projects.
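The evaluation step in this abstract (a 70/30 split followed by precision, recall, and f1-score) can be sketched with the standard definitions below. This covers only the split and the metrics, not t-SNE, K-means, or the SVC itself; the class labels, the seed, the example predictions, and the function names are hypothetical.

```python
import random

def split_70_30(items, seed=0):
    """Shuffle a copy of `items` and split it 70 percent / 30 percent."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# 90 cases, as in the abstract; the case identifiers are placeholders.
train, test = split_70_30(range(90))

# Hypothetical true and predicted classes for six test cases.
y_true = ["stable", "failed", "failed", "stable", "failed", "stable"]
y_pred = ["stable", "failed", "stable", "stable", "failed", "failed"]
p, r, f1 = precision_recall_f1(y_true, y_pred, positive="failed")
```

With 90 cases the split yields 63 training and 27 testing cases; per-class metrics like these would be computed once for each cluster-derived class.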