• Title/Summary/Keyword: 공간군집화 (spatial clustering)

Search Result 227, Processing Time 0.027 seconds

Centroid Neural Network with Bhattacharyya Kernel (Bhattacharyya 커널을 적용한 Centroid Neural Network)

  • Lee, Song-Jae;Park, Dong-Chul
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.9C / pp.861-866 / 2007
  • This paper proposes a clustering algorithm for Gaussian Probability Distribution Function (GPDF) data called the Centroid Neural Network with a Bhattacharyya Kernel (BK-CNN). The proposed BK-CNN is based on the unsupervised competitive Centroid Neural Network (CNN) and employs a kernel method for data projection. The kernel method projects data from the low-dimensional input feature space into a higher-dimensional feature space so that nonlinear problems in the input space can be solved linearly in the feature space. To cluster GPDF data, the Bhattacharyya kernel is used to measure the distance between two probability distributions for data projection. With the kernel method incorporated, BK-CNN can handle nonlinear separation boundaries and can allocate more code vectors to regions where GPDF data are densely distributed. When applied to GPDF data in an image classification problem, the experimental results show that BK-CNN improves average classification accuracy by 1.7%-4.3% over conventional algorithms such as k-means, Self-Organizing Map (SOM), and CNN combined with a Bhattacharyya distance, referred to as Bk-Means, B-SOM, and B-CNN, respectively.
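
The abstract above does not give the exact kernel used in BK-CNN, so the following is only a minimal sketch of the underlying idea: the closed-form Bhattacharyya distance between two Gaussians, under the assumption of diagonal covariances, and a similarity obtained as exp(-distance), which is the Bhattacharyya coefficient. The function names and the diagonal-covariance restriction are illustrative choices, not taken from the paper.

```python
import math

def bhattacharyya_distance(mean_p, var_p, mean_q, var_q):
    """Bhattacharyya distance between two Gaussians with diagonal covariance.

    Each argument is a sequence with one entry per dimension
    (assumption: covariances are diagonal, so dimensions simply add up).
    """
    distance = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        var_sum = vp + vq
        # Mean term: grows as the two means move apart.
        distance += 0.25 * (mp - mq) ** 2 / var_sum
        # Variance term: zero only when the two variances are equal.
        distance += 0.5 * math.log(var_sum / (2.0 * math.sqrt(vp * vq)))
    return distance

def bhattacharyya_kernel(mean_p, var_p, mean_q, var_q):
    """Similarity in (0, 1]: the Bhattacharyya coefficient exp(-distance)."""
    return math.exp(-bhattacharyya_distance(mean_p, var_p, mean_q, var_q))
```

Identical distributions give a distance of 0 and a kernel value of 1; the value decays toward 0 as the means or variances diverge.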

Design and Development of Clustering Algorithm Considering Influences of Spatial Objects (공간객체의 영향력을 고려한 클러스터링 알고리즘의 설계와 구현)

  • Kim, Byung-Cheol
    • The Journal of the Korea Contents Association / v.6 no.12 / pp.113-120 / 2006
  • This paper proposes DBSCAN-SI, a clustering algorithm that takes the influence of spatial objects into account. DBSCAN-SI extends the existing DBSCAN and DBSCAN-W algorithms and converts non-spatial properties into influences of spatial objects during spatial clustering. The higher the influence derived from the properties used in clustering, the more likely an object is to be included in a cluster, so clustering reflects not only spatial distance but also the magnitude of influence. From a property-centered perspective, the proposed technique can make up for a disadvantage of existing algorithms, which exclude objects from a cluster despite their high influence because few objects lie close to them around the cluster.
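
The abstract does not specify how DBSCAN-SI computes or applies influence, so the sketch below is not the paper's algorithm. It shows one simple way to let a per-object weight affect density clustering: a point counts as a core point when the summed weight of its neighborhood reaches a threshold, rather than when the neighbor count does. The names `weighted_dbscan`, `weights`, and `min_weight` are hypothetical.

```python
import math

def weighted_dbscan(points, weights, eps, min_weight):
    """Label points with cluster ids (0, 1, ...) or -1 for noise.

    A point is a core point when the summed weight of all points within
    `eps` of it (itself included) reaches `min_weight`.
    """
    n = len(points)

    def neighbors(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    def is_core(nbrs):
        return sum(weights[j] for j in nbrs) >= min_weight

    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if not is_core(nbrs):
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if is_core(j_nbrs):
                queue.extend(j_nbrs)  # only core points expand the cluster
    return labels
```

With points `[(0, 0), (0.5, 0), (10, 10)]`, `eps=1`, and `min_weight=4`, weights `[5, 1, 1]` put the first two points in one cluster, while uniform weights `[1, 1, 1]` leave every point as noise, which illustrates how a high-influence object can form a cluster even with few neighbors.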

Clustering of sediment characteristics in South Korean rivers and its expanded application strategy to H-ADCP based suspended sediment concentration monitoring technique (한국 하천의 지역별 유사특성의 군집화와 H-ADCP 기반 부유사 농도 관측 기법에의 활용 방안)

  • Noh, Hyoseob;Son, GeunSoo;Kim, Dongsu;Park, Yong Sung
    • Journal of Korea Water Resources Association / v.55 no.1 / pp.43-57 / 2022
  • Advances in measurement techniques have reduced measurement costs and enhanced safety, resulting in less uncertainty. For example, an acoustic Doppler current profiler (ADCP) based suspended sediment concentration (SSC) measurement technique is being accepted as an alternative to the conventional data collection method. In Korean rivers, horizontal ADCPs (H-ADCPs) are mounted at automatic discharge monitoring stations, where SSC can be measured using the ADCP backscatter. However, automatic discharge monitoring stations and sediment monitoring stations do not always coincide, which hinders the application of the new technique because it is not feasible at some stations. This work presents and analyzes H-ADCP-SSC models for 9 discharge monitoring stations in Korean rivers. Applying a Gaussian mixture model (GMM) to sediment-related variables (catchment area, particle size distributions of suspended sediment and bed material, water discharge-sediment discharge curves) from 44 sediment monitoring stations reveals that these characteristics can distinguish sediment monitoring stations regionally. Linking the two results, we propose a protocol for determining the H-ADCP-SSC model at stations where no H-ADCP-SSC model is available.

Face Recognition using Vector Quantizer in Eigenspace (아이겐공간에서 벡터 양자기를 이용한 얼굴인식)

  • 임동철;이행세;최태영
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.5 / pp.185-192 / 2004
  • This paper presents face recognition using vector quantization in the eigenspace of faces. The existing eigenface method does not adequately represent the variations of faces. To make up for this defect, the proposed method clusters feature vectors by vector quantization in the eigenspace of the faces. In the training stage, the face images are transformed into points in the eigenspace by the eigenfaces (eigenvectors), and the set of points for each person is represented by the centroids of a vector quantizer. In the recognition stage, the vector quantizer finds the centroid with the minimum quantization error between the feature vector of the input image and the centroids in the database. The experiments use 600 faces from the Faces94 database. The existing eigenface method yields a minimum of 64 misrecognitions, whereas the proposed method yields a minimum of 20 misrecognitions when 4 codevectors are used. In conclusion, the proposed method is an effective method that improves the recognition rate by overcoming the variation of faces.
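
The recognition stage described above can be sketched as a nearest-codevector search. This is a minimal illustration that assumes the feature vectors have already been projected into the eigenspace and that squared Euclidean distance is the quantization error; the names `recognize` and `codebooks`, and the toy data in the usage note, are hypothetical and not from the paper.

```python
def squared_error(a, b):
    """Squared Euclidean distance, used here as the quantization error."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def recognize(feature, codebooks):
    """Return (person, error) for the codevector closest to `feature`.

    `codebooks` maps each person to a list of codevectors (centroids)
    learned for that person in the training stage.
    """
    best_person, best_error = None, float("inf")
    for person, codevectors in codebooks.items():
        for codevector in codevectors:
            error = squared_error(feature, codevector)
            if error < best_error:
                best_person, best_error = person, error
    return best_person, best_error
```

For example, with `codebooks = {"alice": [(0.0, 0.0), (1.0, 1.0)], "bob": [(5.0, 5.0), (6.0, 4.0)]}`, calling `recognize((0.9, 1.2), codebooks)` selects "alice" because her second codevector is the nearest. Keeping several codevectors per person is what lets the codebook cover variations that a single mean vector would blur.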

A study on cluster and positioning of domestic electronic commerce based on purchasing motivation (국내 전자상거래 구매동기에 대한 군집 및 포지셔닝 연구)

  • Jeong, Dong Bin
    • Journal of the Korean Data and Information Science Society / v.26 no.4 / pp.841-856 / 2015
  • Thirteen types of business and sixteen administrative districts in Korea are categorized and segmented based on their similarities and visually plotted in a multidimensional space. The similarities are determined by five quantitative evaluation characteristics (simplified trading process, reduced price, direct contact with supplier, faster trading process, et cetera). Hence, domestic business types and administrative districts can be grouped into clusters. Also, the forms and characteristics of business types and administrative districts can be evaluated between and within the clusters.

The Comparison of Community Characteristics of Ground-dwelling Invertebrates According Agroecosystem Types in the Eastern Region of the Korean Peninsula (한반도 동부 농업생태계에 따른 지표배회성 무척추동물의 군집 특성 비교)

  • Ahn, Chi-Hyun;Oh, Young-Ju;Ock, Suk-Mi;Lee, Wook-Jae;Sohn, Soo-In;Kim, Myung-Hyun;Na, Young-Eun;Kim, Chang-Seok
    • Korean journal of applied entomology / v.56 no.1 / pp.29-39 / 2017
  • To compare the features of ground-dwelling invertebrates according to agroecosystem type, we selected paddy fields, dry fields, and orchards in the eastern region of Korea. The surveys were performed using pitfall traps twice per year from 2013 to 2015. A total of 6,420 individuals of 172 species belonging to 13 orders and 58 families were recorded in the eastern region; the species of Hymenoptera (38.26%) and Orthoptera (16.28%) accounted for a large portion of the communities. Geographically, 2,983 individuals were caught in Gyeongsangnam-do, the diversity index of the Gyeongsangbuk-do community was higher than that of the others, and the abundance and species richness of paddy fields were higher than those of dry fields or orchards. To understand the relation between taxonomic groups and environmental factors, we carried out canonical correspondence analysis and hierarchical clustering. As a result, Homoptera, Blattaria, Isoptera, and Coleoptera were positively related to soil pH, soil temperature, and moisture content, and negatively related to the others. Invertebrate communities were also patterned according to the type of ecosystem. These results show that the distribution of invertebrates is somewhat influenced by the relationship between the space inhabited by the invertebrates and environmental factors.

Parallel Spatial Join Method Using Efficient Spatial Relation Partition In Distributed Spatial Database Systems (분산 공간 DBMS에서의 효율적인 공간 릴레이션 분할 기법을 이용한 병렬 공간 죠인 기법)

  • Ko, Ju-Il;Lee, Hwan-Jae;Bae, Hae-Young
    • Journal of Korea Spatial Information System Society / v.4 no.1 s.7 / pp.39-46 / 2002
  • In distributed spatial database systems, users may issue a query that joins two relations stored at different sites. The sheer volume and complexity of spatial data incur expensive CPU and I/O costs during spatial join processing. This paper presents a new spatial join method that joins two spatial relations in parallel. First, the initial join operation is divided into two distinct ones by partitioning one of the two participating relations by region. These two join operations are assigned to the respective sites and executed simultaneously. Finally, the intermediate result sets from the two join operations are merged into a final result set. This method reduces the number of spatial objects participating in the spatial operations. It also reduces the scope and number of spatial index scans, and it avoids materializing temporary results by implementing the join algebra operators as iterators. The performance test shows that this join method uses buffer and disk efficiently by narrowing down the joining region and decreasing the number of spatial objects.

Spatial clustering of pedestrian traffic accidents in Daegu (대구광역시 교통약자 보행자 교통사고 공간 군집 분석)

  • Hwang, Yeongeun;Park, Seonghee;Choi, Hwabeen;Yoon, Sanghoo
    • Journal of Digital Convergence / v.20 no.3 / pp.75-83 / 2022
  • Korea, which has the highest pedestrian fatality rate among OECD countries, is making efforts to create a safer walking environment by enacting pedestrian-focused laws. Spatial clustering was conducted with scan statistics after examining social network data related to traffic accidents involving children and seniors. A word cloud was used to examine public perception: campaigns were the main concern for children, and literature surveys for seniors. Naedang and Yongsan are the regions with the highest relative risk for vulnerable pedestrians (children and seniors). In contrast, Bongmu and Beomeo have the lowest relative risk. Naedang-dong and Yongsan-dong of Daegu Metropolitan City were identified as vulnerable areas for pedestrian safety due to the high risk of pedestrian accidents involving children and the elderly. This indicates that scan statistics are effective in searching for areas at risk of traffic accidents.

Strategy for Store Management Using SOM Based on RFM (RFM 기반 SOM을 이용한 매장관리 전략 도출)

  • Jeong, Yoon Jeong;Choi, Il Young;Kim, Jae Kyeong;Choi, Ju Choel
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.93-112 / 2015
  • As consumption patterns change, existing retail shops have evolved into hypermarkets or convenience stores that mostly offer groceries and daily products. It is therefore important to maintain proper inventory levels and product configuration in order to use the limited space in a retail store effectively and increase sales. Accordingly, this study proposes a product configuration and inventory level strategy based on the RFM (Recency, Frequency, Monetary) model and SOM (self-organizing map) for managing retail shops effectively. The RFM model is an analytic model that analyzes customer behavior based on customers' past buying activities, and it can differentiate important customers in large data sets using three variables. R represents recency, which refers to the last purchase of commodities; the most recent purchaser has the largest R. F represents frequency, which refers to the number of transactions in a particular period, and M represents monetary, which refers to the amount of money spent in a particular period. RFM is thus known to be a very effective model for customer segmentation. In this study, SOM cluster analysis was performed using normalized values of the RFM variables. SOM is regarded as one of the most distinguished artificial neural network models among unsupervised learning tools. It is a popular tool for clustering and visualizing high-dimensional data in such a way that similar items are grouped spatially close to one another, and it has been successfully applied in various technical fields for finding patterns. In our research, the procedure tries to find sales patterns by analyzing product sales records with Recency, Frequency, and Monetary values. To suggest a business strategy, we build a decision tree based on the SOM results. To validate the proposed procedure, we used M-mart data collected between 2014.01.01 and 2014.12.31. Each product is given R, F, and M values, and the products are clustered into 9 clusters using SOM. We also performed three tests using the weekday data, the weekend data, and the whole data in order to analyze changes in the sales pattern. To propose a strategy for each cluster, we examine the criteria of product clustering. The clusters obtained through SOM can be explained by the characteristics identified by the decision trees. As a result, we can suggest an inventory management strategy for each of the 9 clusters through the proposed procedure. Products in the cluster with the highest values of all three variables (R, F, M) need a high inventory level and should be placed where they can lengthen the customer's path through the store. In contrast, products in the cluster with the lowest values of all three variables (R, F, M) need a low inventory level and can be placed where visibility is low. Products in the cluster with the highest R value are usually new releases and need to be placed at the front of the store. Managers should gradually decrease inventory levels for products in the cluster with the highest F value that were purchased in the past, because that cluster is assumed to have lower R and M values than the product average; it can be deduced that these products have sold poorly in recent days and that total sales will be low relative to the frequency. The procedure presented in this study is expected to contribute to raising the profitability of retail stores. The paper is organized as follows. The second chapter briefly reviews the literature related to this study. The third chapter presents the proposed research procedure, and the fourth chapter applies it to actual product sales data. Finally, the fifth chapter presents the conclusion of the study and further research.
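
The per-product R, F, and M values described above could be derived from sales records along the following lines. This is a minimal sketch under stated assumptions: the record layout `(product, sale_date, amount)` is hypothetical, R is taken as the ordinal day number of the last sale so that a more recent sale gives a larger R, and min-max scaling stands in for the normalization step, which the abstract does not specify. The SOM clustering itself is not shown.

```python
from datetime import date

def rfm_table(sales):
    """Build {product: (R, F, M)} from (product, sale_date, amount) records."""
    raw = {}
    for product, sale_date, amount in sales:
        last, count, total = raw.get(product, (date.min, 0, 0.0))
        raw[product] = (max(last, sale_date), count + 1, total + amount)
    # Use the ordinal day number so that a more recent last sale gives a larger R.
    return {p: (last.toordinal(), count, total) for p, (last, count, total) in raw.items()}

def min_max_normalize(table):
    """Scale each of R, F, M to [0, 1] across products (constant columns map to 0)."""
    columns = list(zip(*table.values()))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return {
        p: tuple((v - lo) / span if span else 0.0 for v, lo, span in zip(values, lows, spans))
        for p, values in table.items()
    }
```

The normalized triples are what a clustering step would consume; scaling matters because raw M values (money) would otherwise dominate R and F in any distance computation.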

On Characteristics of Word Embeddings by the Word2vec Model (Word2vec 모델의 단어 임베딩 특성 연구)

  • Kang, Hyungsuc;Yang, Janghoon
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.263-266 / 2019
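  • Among word embedding models, the widely used word2vec model is known to reflect semantic similarity in language well. This paper aims to check how well word vectors trained with the word2vec model actually reflect semantic similarity. That is, it checks whether words of similar categories are embedded close together in the vector space and whether words of distinct categories are embedded clearly apart. Verification with a simple clustering algorithm showed that, contrary to common-sense linguistic knowledge, words of certain categories are not clearly separated in the embedded vector space. In conclusion, the similarity of word vectors does not always imply the semantic similarity of the corresponding words. Future research that applies word2vec results should take this limitation into account.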