• Title/Summary/Keyword: hierarchical clustering (계층적 군집화)

Hierarchical grouping recommendation system based on the attributes of contents: a case study of 'The Movie Dataset' (콘텐츠 속성에 따른 계층적 그룹화 추천시스템: 'The Movie Dataset' 분석사례연구)

  • Kim, Yoon Kyoung;Yeo, In-Kwon
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.833-842 / 2020
  • Global platforms such as Netflix, Amazon, and YouTube have developed precise recommendation systems based on diverse information from large sets of customers, and many of the items they recommend lead to actual purchases. In this paper, a cluster analysis was conducted according to the attributes of the content, on the expectation that user preferences would differ according to the attributes of the recommended content. Gower distance was used because it can be applied regardless of the variable type. Using data from the movie rating site 'The Movie Dataset', users were grouped hierarchically and movies were recommended based on genre, director, and actor variables. To evaluate the proposed recommendation systems, the user group was divided into a training set and a test set and precision was examined. The results showed that the proposed algorithms have far higher precision than user-based collaborative filtering (UBCF).
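
The abstract above names Gower distance without giving its form. As a minimal sketch, assuming the standard definition (range-normalized absolute difference for numeric variables, simple mismatch for categorical ones, averaged with equal weights), it could be computed as follows. The function names, the sample records, and the numeric `runtime` column are hypothetical and are not taken from the paper.

```python
def feature_ranges(rows):
    """Return max - min for each numeric column and None for categorical ones."""
    ranges = []
    for column in zip(*rows):
        if all(isinstance(v, (int, float)) for v in column):
            ranges.append(float(max(column) - min(column)))
        else:
            ranges.append(None)
    return ranges


def gower_distance(a, b, ranges):
    """Equal-weight Gower distance between two mixed-type records."""
    total = 0.0
    for x, y, rng in zip(a, b, ranges):
        if rng is None:
            # Categorical variable: 0 if the values match, 1 otherwise.
            total += 0.0 if x == y else 1.0
        elif rng > 0:
            # Numeric variable: absolute difference scaled by the column range.
            total += abs(x - y) / rng
        # A constant numeric column (range 0) contributes nothing.
    return total / len(ranges)


# Hypothetical records: (genre, director, runtime in minutes).
rows = [
    ("Drama", "Director A", 120.0),
    ("Drama", "Director B", 90.0),
    ("Comedy", "Director A", 150.0),
]
ranges = feature_ranges(rows)
d01 = gower_distance(rows[0], rows[1], ranges)
d12 = gower_distance(rows[1], rows[2], ranges)
```

Because every per-variable term lies in [0, 1], the resulting distance is also in [0, 1], which is what lets categorical and numeric attributes share one distance matrix for hierarchical clustering.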

An Efficiency Analysis of Industry-University-Public Research Institute Collaborative Research: Employing the Input-Output Itemization Model (투입 및 산출 분해모형을 활용한 산학연 협력연구의 효율성 분석)

  • Kim, Hong-Young;Chung, Sunyang
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.12 / pp.473-484 / 2017
  • This study analyzed collaborative R&D projects funded by the Korean government from 2013 to 2015. For this analysis, the input and output variables of the projects were considered, and combinations of those variables were itemized. The output-oriented variable returns to scale (VRS) model, an extension of the DEA methodology, was adopted to evaluate the cooperation efficiency of the types of R&D collaboration, which were classified according to the project leader's organization. In addition, hierarchical cluster analysis was conducted using the efficiency results of the scientific, technical, and economic outcome models. The results showed that cooperation efficiency between large companies and public research institutions was relatively high. Conversely, cooperation among medium-sized companies, small businesses, and universities was particularly inefficient. The clustering results demonstrated the various strengths and weaknesses of the collaboration types in terms of publications, patents, technical royalties, and the number of commercializations. In conclusion, this study suggests differentiated investment portfolios and strategies based on the efficiency results of diverse cooperation types among industries, universities, and public research institutions.

A Study on Market Expansion Strategy via Two-Stage Customer Pre-segmentation Based on Customer Innovativeness and Value Orientation (고객혁신성과 가치지향성 기반의 2단계 사전 고객세분화를 통한 시장 확산 전략)

  • Heo, Tae-Young;Yoo, Young-Sang;Kim, Young-Myoung
    • Journal of Korea Technology Innovation Society / v.10 no.1 / pp.73-97 / 2007
  • R&D into future technologies should be conducted in conjunction with technological innovation strategies that are linked to corporate survival within a framework of information and knowledge-based competitiveness. As such, future technology strategies should be secured through open R&D organizations. The development of future technologies should not be based simply on forecasts, but should take customer needs into account in advance and reflect them in the development of future technologies or services. This research selects as segmentation variables the customers' attitude towards accepting future telecommunication technologies and their value orientation in everyday life, as these factors will have the greatest effect on the demand for future telecommunication services, and uses them to segment the future telecom service market. It likewise seeks to segment the market from the stage of technology R&D activities and to employ the results in formulating technology development strategies. Based on customer attitude towards accepting new technologies, two groups were derived, and a hierarchical customer segmentation model was provided to conduct a secondary segmentation of the two groups on the basis of their respective customer value orientation. A survey was conducted in June 2006 on 800 consumers aged 15 to 69, residing in Seoul and five other major South Korean cities, through one-on-one interviews. The sample was divided into two sub-groups according to the level of acceptance of new technology: a sub-group with a high level of technology acceptance (39.4%) and a sub-group with a comparatively lower level of technology acceptance (60.6%). Each of these two sub-groups was further divided into 5 smaller sub-groups (10 in total) through the two rounds of segmentation.
The ten sub-groups were then analyzed in detail, including their general demographic characteristics, their usage patterns for existing telecom services such as mobile service, broadband internet, and wireless internet, their ownership of computing or information devices, and their desire or intention to purchase one. Through these steps, we were able to show statistically that each of the 10 sub-groups responds to telecom services as an independent market. Through correspondence analysis, the target segments were positioned in such a way as to facilitate the market entry of future telecommunication services, as well as their diffusion and transferability.

Visual Exploration based Approach for Extracting the Interesting Association Rules (유용한 연관 규칙 추출을 위한 시각적 탐색 기반 접근법)

  • Kim, Jun-Woo;Kang, Hyun-Kyung
    • Journal of the Korea Society of Computer and Information / v.18 no.9 / pp.177-187 / 2013
  • Association rule mining is a popular data mining technique with a wide range of application domains, and aims to extract the cause-and-effect relations between the discrete items included in transaction data. However, analysts sometimes have trouble interpreting and using the plethora of association rules extracted from a large amount of data. To address this problem, this paper proposes a novel approach called HTM for extracting the interesting association rules from given transaction data. The HTM approach consists of three main steps (hierarchical clustering, table-view, and mosaic plot), and each step provides analysts with an appropriate visual representation. For illustration, we applied our approach to mass health examination data, and the result of this experiment shows that the HTM approach helps analysts find interesting association rules more effectively.

Decomposition of a Text Block into Words Using Projection Profiles, Gaps and Special Symbols (투영 프로파일, GaP 및 특수 기호를 이용한 텍스트 영역의 어절 단위 분할)

  • Jeong Chang Bu;Kim Soo Hyung
    • Journal of KIISE:Software and Applications / v.31 no.9 / pp.1121-1130 / 2004
  • This paper proposes a method for line and word segmentation of machine-printed text blocks. To separate a text region into lines, it analyzes the horizontal projection profile and applies a recursive projection profile cut method. For word segmentation, gaps in each text line are first found by connected component analysis, and between-word gaps are then identified by a hierarchical clustering method. In addition, a special symbol detection technique is applied to find two types of special symbols lying between words, using their morphological features. An experiment with 84 text regions from English and Korean documents shows that the proposed method achieves 99.92% word segmentation accuracy, while the commercial OCR software Armi 6.0 Pro™ achieves 97.58%.
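
The gap-classification step described above can be illustrated with a small sketch. The abstract does not say which linkage or stopping rule the authors used, so this example assumes average linkage and a cut at two clusters (within-word gaps and between-word gaps); the function names and the pixel gap widths are hypothetical.

```python
def cluster_gaps(gaps, n_clusters=2):
    """Agglomerative clustering of 1-D gap widths with average linkage."""
    clusters = [[g] for g in gaps]  # every gap starts as its own cluster

    def average_link(a, b):
        # Mean absolute difference over all cross-cluster pairs.
        return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_link(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


def between_word_threshold(gaps):
    """Midpoint between the widest 'small' gap and the narrowest 'large' gap."""
    small, large = sorted(cluster_gaps(gaps), key=lambda c: sum(c) / len(c))
    return (max(small) + min(large)) / 2


# Hypothetical gap widths (pixels) between connected components in one text line.
gap_widths = [2, 3, 2, 11, 3, 12, 2, 10]
threshold = between_word_threshold(gap_widths)
word_breaks = [i for i, g in enumerate(gap_widths) if g > threshold]
```

Gaps wider than the threshold are treated as word boundaries. The sketch presumes that the line contains both kinds of gaps; a line with a single word would need a separate check.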

Property-based Hierarchical Clustering of Peers using Mobile Agent for Unstructured P2P Systems (비구조화 P2P 시스템에서 이동에이전트를 이용한 Peer의 속성기반 계층적 클러스터링)

  • Salvo, Michael Angel G.;Mateo, Romeo Mark A.;Lee, Jae-Wan
    • Journal of Internet Computing and Services / v.10 no.4 / pp.189-198 / 2009
  • Unstructured peer-to-peer systems are the most commonly used on today's internet. However, file placement in these systems is random, and no correlation exists between peers and their contents, so there is no guarantee that flooding queries will find the desired data. In this paper, we propose clustering the nodes of unstructured P2P systems with the agglomerative hierarchical clustering algorithm to improve the search method. We compared the delay time of clustering the nodes between our proposed algorithm and the k-means clustering algorithm. We also simulated the delay time of locating data in a network topology and recorded the system overhead with our proposed algorithm, with k-means clustering, and without clustering. Simulation results show that the delay time of our proposed algorithm is shorter than that of the other methods and that resource overhead is also reduced.

Fuzzy Reasoning based Selection Operator for Genetic Algorithm (퍼지 추론 기반의 유전알고리즘 선택 연산자)

  • Seo, Gi-Seong;Hyeon, Su-Hwan
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.11a / pp.112-115 / 2007
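  • This paper proposes a selection operator for genetic algorithms that uses fuzzy reasoning to evaluate the similarity and fitness of individuals jointly. A single population is virtually divided into an arbitrary number n of subpopulations, and fuzzy reasoning based on the fitness and similarity of individuals is used to build an efficient hierarchy. The operator is also combined with a steady-state evolution scheme so that individuals within the hierarchical clusters are prevented from converging prematurely, and it is implemented so that efficient evolution is possible with a small number of individuals. The results were compared with those of other selection operators on two deceptive problems, and satisfactory performance was obtained.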

Korean Onomatopoeia Clustering for Sound Database (음향 DB 구축을 위한 한국어 의성어 군집화)

  • Kim, Myung-Gwan;Shin, Young-Suk;Kim, Young-Rye
    • Journal of Korea Multimedia Society / v.11 no.9 / pp.1195-1203 / 2008
  • Onomatopoeia in Korean documents represents natural or artificial sounds in human language. It can express a sound in the words closest to the object and can also serve as a standard for clustering multimedia data. In this study, we measured the frequency of onomatopoeia for the experimental subjects and selected 100 onomatopoeic words for use in our study. To cluster the relations among onomatopoeia, we extracted similarity features and a distance metric and then represented the relations in a vector space using PCA. Finally, we clustered the onomatopoeia using the k-means algorithm.

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly. As video data grows, so do the requirements for analyzing and utilizing it. Because many industries lack skilled manpower to analyze videos, machine learning and artificial intelligence are actively used to assist. In this situation, the demand for computer vision technologies such as object detection and tracking, action detection, emotion detection, and re-identification (Re-ID) has also increased rapidly. However, object detection and tracking faces many difficulties that degrade performance, such as occlusion and an object's re-appearance after leaving the recording location. Accordingly, action and emotion detection models built on object detection and tracking models also have difficulty extracting data for each object. In addition, deep learning architectures composed of multiple models suffer from performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and AWS Rekognition, an emotion recognition service. The proposed model uses Re-ID based on single-linkage hierarchical clustering together with processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, delivers near real-time processing performance, and prevents tracking failures caused by object departure and re-emergence, occlusion, and similar events. By continuously linking the action and facial emotion detection results of each object to the same object, it makes efficient video analysis possible.
The re-identification model extracts a feature vector from the bounding box of each object image detected by the object tracking model in each frame, and applies single-linkage hierarchical clustering to the feature vectors extracted from past frames to identify an object whose tracking failed. Through this process, an object that was lost through occlusion, or through re-appearance after leaving the recording location, can be re-tracked. As a result, the action and facial emotion detection results of an object that was newly recognized because of a tracking failure can be linked to those of the object that appeared in the past. To improve processing performance, we introduce the Bounding Box Queue by Object and Feature Queue methods, which reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which allows facial emotions recognized through AWS Rekognition to be linked with object tracking information. The academic significance of this study is that, thanks to these processing techniques, the two-stage re-identification model can run in real time even in a high-cost environment that performs action and facial emotion detection, without sacrificing accuracy by resorting to simple metrics to achieve real-time performance. The practical implication of this study is that industrial fields that require action and facial emotion detection but struggle with object tracking failures can analyze videos effectively with the proposed model. With its high re-tracking accuracy and processing performance, the proposed model can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where integrating tracking information with extracted metadata creates great industrial and business value.
In the future, to measure object tracking performance more precisely, an experiment should be conducted using the MOT Challenge dataset, which is used by many international conferences. We will investigate the problems that the IoF algorithm cannot solve in order to develop an additional complementary algorithm. In addition, we plan to conduct further research on applying this model to datasets from various fields related to intelligent video analysis.
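
The single-linkage re-identification step in this abstract can be sketched in a few lines. This is not the authors' implementation: it assumes cosine distance between feature vectors and a fixed distance threshold, and it relies on the fact that cutting a single-linkage dendrogram at height t gives the connected components of the graph that links every pair of items within distance t. The function names, the threshold value, and the toy 2-D feature vectors are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def single_linkage_labels(features, threshold):
    """Assign one identity label per feature vector via single linkage."""
    parent = list(range(len(features)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if cosine_distance(features[i], features[j]) <= threshold:
                parent[find(j)] = find(i)  # link the two groups

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(features))]


# Hypothetical appearance features from three tracklets; the third is the
# first object re-appearing after the tracker lost it.
tracklet_features = [(1.0, 0.0), (0.0, 1.0), (0.98, 0.05)]
identities = single_linkage_labels(tracklet_features, threshold=0.05)
```

Tracklets that receive the same label can then share one identity, so the action and emotion results attached to each tracklet end up linked to the same object.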

An Application of FCM(Fuzzy C-Means) for Clustering of Asian Ports Competitiveness Level and Status of Busan Port (FCM법을 이용한 아시아 항만의 경쟁력 수준 분류와 부산항의 위상)

  • 류형근;이홍걸;여기태
    • Journal of Korean Society of Transportation / v.21 no.5 / pp.7-18 / 2003
  • Due to changes in the shipping and logistics environment, Asian ports today face severe competition. To become mega-hub ports, Asian ports have undertaken large-scale development. For these reasons, analyzing and evaluating the characteristics of Asian ports is widely recognized as important from the standpoint of Korea, where Busan Port is located. Although previous studies exist, most went beyond Asian ports to analyze the world's major ports; moreover, the ports studied were those already well known from earlier research and reports. Consequently, most of these studies are unlikely to serve as substantial indicators from the perspective of Busan Port. In addition, most existing studies have used hierarchical evaluation algorithms for port ranking, such as the analytic hierarchy process (AHP) and cluster analysis. However, these two methods have fundamental weaknesses from an algorithmic perspective. The aim of this study is to classify major Asian ports by competitiveness level. In particular, to overcome this serious problem of the existing studies, major Asian ports were analyzed using objective indicators and the Fuzzy C-Means algorithm, which alleviates the weakness of the clustering method. It was found that 10 of the 16 major Asian ports have their own phases and were classified into 4 port groups. This result implies that some ports have higher potential to lead particular zones in Asia. Based on these results, the present status and future direction of Busan Port were also discussed.
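
For readers unfamiliar with Fuzzy C-Means, the sketch below shows the standard alternating update of memberships and cluster centers, assuming Euclidean distance and a fuzzifier m = 2. It is not the procedure or data of the study: the function names, the two-dimensional indicator scores, and the initial centers are hypothetical.

```python
import math


def _memberships(points, centers, m):
    """Membership of each point in each cluster; every row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clamp distances away from zero so a point on a center is handled.
        d = [max(math.dist(p, c), 1e-12) for c in centers]
        rows.append([
            1.0 / sum((d[i] / d[j]) ** power for j in range(len(d)))
            for i in range(len(d))
        ])
    return rows


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Run a fixed number of Fuzzy C-Means iterations from given centers."""
    centers = [tuple(c) for c in initial_centers]
    dims = len(points[0])
    for _ in range(iterations):
        u = _memberships(points, centers, m)
        new_centers = []
        for i in range(len(centers)):
            weights = [row[i] ** m for row in u]  # u_ik ** m
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[k] for w, p in zip(weights, points)) / total
                for k in range(dims)
            ))
        centers = new_centers
    return centers, _memberships(points, centers, m)


# Hypothetical indicator scores for six ports (two indicators each).
port_scores = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
               (5.0, 5.0), (5.1, 4.8), (3.0, 3.0)]
centers, memberships = fuzzy_c_means(port_scores, [(0.0, 0.0), (6.0, 6.0)])
```

Unlike a hard clustering, the last port, which lies between the two groups, receives a substantial membership in both clusters instead of being forced into one of them.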