• Title/Summary/Keyword: 공간군집화 (spatial clustering)


Modified multi-sense skip-gram using weighted context and x-means (가중 문맥벡터와 X-means 방법을 이용한 변형 다의어스킵그램)

  • Jeong, Hyunwoo;Lee, Eun Ryung
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.389-399 / 2021
  • In recent years, word embedding has been a popular topic in natural language processing research, and skip-gram has become one of the most successful word embedding methods. It assigns an embedding vector to each word based on its contexts, which provides an effective way to analyze text data. However, owing to a limitation of the vector space model, standard word embedding methods assume that every word has a single meaning. Because multi-sense words, that is, words with more than one meaning, occur in practice, Neelakantan et al. (2014) proposed the multi-sense skip-gram (MSSG), which uses a clustering method to find an embedding vector for each sense of a multi-sense word. In this paper, we propose a modification of the MSSG to improve statistical accuracy. Moreover, we propose a data-adaptive choice of the number of clusters, that is, the number of meanings of a multi-sense word. Numerical evidence is provided through simulations based on real data.
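
The clustering step described in this abstract can be illustrated with a minimal sketch. It assumes that a sense is chosen by comparing a weighted average of the context-word vectors with one cluster center per sense, using cosine similarity. The weights, vectors, and function names below are hypothetical and are not taken from the paper.

```python
import math

def weighted_context(vectors, weights):
    """Weighted average of context-word vectors (the weights are illustrative)."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total
            for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def assign_sense(context, centers):
    """Index of the sense cluster center most similar to the context vector."""
    return max(range(len(centers)), key=lambda k: cosine(context, centers[k]))

# Toy example: two hypothetical sense centers for one multi-sense word.
centers = [[1.0, 0.0], [0.0, 1.0]]
context_vectors = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7]]
weights = [1.0, 0.5, 0.25]  # assumption: nearer context words weigh more

ctx = weighted_context(context_vectors, weights)
sense = assign_sense(ctx, centers)
print(sense)
```

With these toy values the weighted context leans toward the first center, so the occurrence is assigned to the first sense. A data-adaptive method such as x-means would additionally decide how many centers to keep, which this sketch does not attempt.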

Grouping-based 3D Animation Data Compression Method (군집화 기반 3차원 애니메이션 데이터 압축 기법)

  • Choi, Young-Jin;Yeo, Du-Hwan;Kim, Hyung-Seok;Kim, Jee-In
    • 한국HCI학회:학술대회논문집 / 2008.02a / pp.461-468 / 2008
  • The need to visualize interactive multimedia content with realistic three-dimensional shapes on portable devices is increasing as new ubiquitous services become a reality. Digital fashion applications in particular, which use virtual reality technologies to show clothes of various forms on different avatars, must deliver very high-quality visual models over mobile networks. Because portable devices have limited network bandwidth and memory, it is very difficult to transmit visual data effectively and to render three-dimensional images realistically. In this paper, we propose a compression method that reduces three-dimensional data for digital fashion applications. The three-dimensional models include avatar animations, which require very large amounts of data over time. The proposed method exploits the temporal and spatial coherence of animation data to reduce that amount. By grouping vertices of the three-dimensional models, the entire animation is represented by the movement paths of a few representative vertices. Existing three-dimensional model compression approaches can also benefit from the proposed method, because grouping reduces the amount of source data to be compressed. We expect the proposed method to be applicable not only to three-dimensional garment animations but also to generic deformable objects.


Multidimensional scaling of categorical data using the partition method (분할법을 활용한 범주형자료의 다차원척도법)

  • Shin, Sang Min;Chun, Sun-Kyung;Choi, Yong-Seok
    • The Korean Journal of Applied Statistics / v.31 no.1 / pp.67-75 / 2018
  • Multidimensional scaling (MDS) is an exploratory method for multivariate data that represents the dissimilarity among objects in a low-dimensional geometric space. However, a general MDS map shows only information about the objects, without any information about the variables. In this study, we used MDS based on the algorithm of Torgerson (Theory and Methods of Scaling, Wiley, 1958) to visualize clusters of objects in categorical data. To do so, we convert the given data into a multiple indicator matrix. Additionally, we add the levels of each categorical variable to the MDS map by applying the partition method of Shin et al. (Korean Journal of Applied Statistics, 28, 1171-1180, 2015). The proposed MDS map therefore shows the similarity among objects as well as the associations among categorical variables.
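
The conversion to a multiple indicator matrix mentioned in this abstract can be sketched as follows. This is only an illustration under assumed toy data: each categorical variable is expanded into one 0/1 column per level, and squared Euclidean distances between the resulting rows give one possible dissimilarity input for MDS. The function names and data are hypothetical, and the partition method itself is not reproduced.

```python
def indicator_matrix(rows):
    """One-hot encode categorical rows into a multiple indicator matrix."""
    n_vars = len(rows[0])
    # Sorted levels per variable give a stable column order.
    levels = [sorted({row[j] for row in rows}) for j in range(n_vars)]
    columns = [(j, level) for j in range(n_vars) for level in levels[j]]
    matrix = [[1 if row[j] == level else 0 for j, level in columns]
              for row in rows]
    return matrix, columns

def squared_distance(u, v):
    """Squared Euclidean distance between two indicator rows."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Toy data: three objects described by two categorical variables.
data = [("red", "small"), ("red", "large"), ("blue", "large")]
Z, cols = indicator_matrix(data)
D = [[squared_distance(u, v) for v in Z] for u in Z]

for row in Z:
    print(row)
print(D)
```

Objects that share more category levels end up closer: the first and third objects differ on both variables and are the farthest apart. Classical (Torgerson) scaling would then double-center such a squared-distance matrix and use its leading eigenvectors as map coordinates.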

Road network data matching using the network division technique (네트워크 분할 기법을 이용한 도로 네트워크 데이터 정합)

  • Huh, Yong;Son, Whamin;Lee, Jeabin
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.31 no.4 / pp.285-292 / 2013
  • This study proposes a network matching method based on a network division technique. The proposed method generates the polygons enclosed by the links of each original network dataset and detects corresponding polygon group pairs using intersection-based graph clustering. Corresponding sub-network pairs are then obtained from the polygon group pairs. To perform geometric correction between them, the Iterative Closest Point algorithm is applied to the nodes of each corresponding sub-network pair. Finally, Hausdorff distance analysis is applied to find the link pairs between the networks. To assess the feasibility of the algorithm, we apply it to networks from the KTDB center and a commercial CNS company. In the experiments, Hausdorff distance thresholds from 3 m to 18 m in 3 m intervals are tested, and an F-measure of 0.99 is obtained with a threshold of 15 m.
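
The final Hausdorff distance test can be sketched in a few lines. This is a minimal illustration, assuming each link is approximated by a list of sampled points in metres, so it computes the discrete Hausdorff distance rather than the exact distance between polylines. The coordinates are hypothetical; the 15 m threshold is the value reported above.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest point of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric discrete Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Two hypothetical road links sampled at three points each (metres).
link_a = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
link_b = [(0.0, 3.0), (10.0, 4.0), (20.0, 3.0)]

d = hausdorff(link_a, link_b)
is_match = d <= 15.0  # threshold reported in the abstract
print(d, is_match)
```

A smaller threshold rejects more candidate pairs and a larger one accepts more, which is why the abstract reports testing several thresholds before settling on one.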

Moving object segmentation and tracking using feature based motion flow (특징 기반 움직임 플로우를 이용한 이동 물체의 검출 및 추적)

  • 이규원;김학수;전준근;박규태
    • The Journal of Korean Institute of Communications and Information Sciences / v.23 no.8 / pp.1998-2009 / 1998
  • An effective algorithm for tracking rigid or non-rigid moving objects is proposed. It segments locally moving parts from an image sequence in the presence of background motion caused by camera movement, predicts the direction of motion, and tracks the object. It requires no camera calibration and no knowledge of the camera's installed position. To segment a moving object, feature points that describe its shape are first selected, a feature flow field composed of the motion vectors of the feature points is computed, and the moving objects are segmented by clustering the feature flow field in a multi-dimensional feature space. We also propose IRMAS, an efficient algorithm that finds the convex hull in order to construct the shape of the moving objects from the clustered feature points. For robust tracking of objects whose movement causes abrupt changes in trajectory, an improved order-adaptive lattice-structured linear predictor is used.


Analysis of deep learning-based deep clustering method (딥러닝 기반의 딥 클러스터링 방법에 대한 분석)

  • Hyun Kwon;Jun Lee
    • Convergence Security Journal / v.23 no.4 / pp.61-70 / 2023
  • Clustering is an unsupervised learning method that groups data based on features such as distance metrics, using data without known labels or ground-truth values. It has the advantage of being applicable to various types of data, including images, text, and audio, without the need for labeling. Traditional clustering techniques apply dimensionality reduction or extract specific features before clustering. With the advancement of deep learning models, however, research has emerged on deep clustering techniques that use models such as autoencoders and generative adversarial networks to represent input data as latent vectors. In this study, we propose a deep clustering technique based on deep learning. In this approach, we use an autoencoder to transform the input data into latent vectors, construct a vector space according to the cluster structure, and then perform k-means clustering. We conducted experiments on the MNIST and Fashion-MNIST datasets using the PyTorch machine learning library. The model is a convolutional neural network-based autoencoder. The experimental results show an accuracy of 89.42% for MNIST and 56.64% for Fashion-MNIST when k is set to 10.
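
The k-means stage of this pipeline can be sketched on its own. The example below assumes the autoencoder has already produced latent vectors, so the encoder is omitted and the latent values are hypothetical. It runs plain Lloyd iterations with given initial centers and is not the authors' implementation.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, initial_centers, iterations=10):
    """Plain Lloyd iterations: assign points, then recompute the centers."""
    centers = list(initial_centers)
    labels = []
    for _ in range(iterations):
        labels = [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
                  for p in points]
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

# Hypothetical 2-D latent vectors, standing in for autoencoder outputs.
latent = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels, centers = kmeans(latent, initial_centers=[latent[0], latent[2]])
print(labels, centers)
```

For a setting like the one in the abstract, the same loop would run with k = 10 on latent vectors of higher dimension, and an accuracy figure would additionally require mapping cluster indices to class labels, which is not shown here.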

Study on the Size of Plant Community in Fragmented Habitats (서식처 분획화에 따른 식물군집의 크기에 관한 연구)

  • 신현탁;김용식
    • Korean Journal of Environment and Ecology / v.12 no.2 / pp.147-155 / 1998
  • This study was conducted from March to August 1997 to determine the size of plant communities in fragmented habitats. Thirty-one sites and 118 plots were established in Yangpyong, Yoju, Pyongtaek and Ansong in Kyonggi-do, Chomchon and Sangju in Kyongsangbuk-do, Nonsan in Chungchongnam-do, and Iksan in Chollapuk-do. In the correlation analysis, the correlation between area and the number of woody species was the highest, at 0.716. To apply the theory of island biogeography to fragmented habitats in Korea, four variables were analyzed with a regression model. The four variables, namely the number of woody species, the number of woody individuals, the number of herbaceous species, and the number of herbaceous individuals, were significantly related to area at the 0.05 level, and R-squared was 0.71. One function relating the number of species and the number of individuals was selected by canonical correlation analysis, with a squared canonical correlation of 0.8876. Both the canonical function and the squared canonical correlation were significant at the 0.01 level. The numbers of species and individuals did not increase beyond a plant community size of 400 $m^2$, at which there were 30 species and 4,000 individuals. The results of this study can serve as basic information for conservation management, especially of fragmented ecosystems, and for biotope creation in landscaping.


I-vector similarity based speech segmentation for interested speaker to speaker diarization system (화자 구분 시스템의 관심 화자 추출을 위한 i-vector 유사도 기반의 음성 분할 기법)

  • Bae, Ara;Yoon, Ki-mu;Jung, Jaehee;Chung, Bokyung;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.461-467 / 2020
  • In noisy and multi-speaker environments, speech recognition performance is unavoidably lower than in a clean environment. To improve speech recognition, this paper extracts the signal of the speaker of interest from mixed speech signals containing multiple speakers. The VoiceFilter model is used to separate overlapped speech signals effectively. Clustering by Probabilistic Linear Discriminant Analysis (PLDA) similarity score is employed to detect the speech of the speaker of interest, which is then used as the reference speaker for VoiceFilter-based separation. By using the speaker feature extracted from the speech detected by the proposed clustering method, this paper proposes a speaker diarization system that uses only the mixed speech, without an explicit reference speaker signal. We use a phone dataset consisting of two speakers to evaluate the performance of the speaker diarization system. The Source to Distortion Ratio (SDR) values of the operator (Rx) speech and the customer (Tx) speech are 5.22 dB and -5.22 dB, respectively, before separation, and 11.26 dB and 8.53 dB, respectively, after the proposed separation system is applied.
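
The SDR figures can be made concrete with a small sketch. It assumes the basic definition, SDR = 10 log10(||s||^2 / ||s - s_hat||^2), which may differ from the evaluation toolkit used in the paper, and the sample values are hypothetical.

```python
import math

def sdr_db(reference, estimate):
    """Source to Distortion Ratio in dB under the basic energy-ratio definition."""
    signal = sum(s * s for s in reference)
    distortion = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    return 10.0 * math.log10(signal / distortion)

# Two hypothetical speaker signals and their unprocessed mixture.
speaker_1 = [2.0, -2.0, 2.0, -2.0]
speaker_2 = [1.0, 1.0, -1.0, -1.0]
mixture = [a + b for a, b in zip(speaker_1, speaker_2)]

sdr_1 = sdr_db(speaker_1, mixture)  # distortion is speaker_2's energy
sdr_2 = sdr_db(speaker_2, mixture)  # distortion is speaker_1's energy
print(sdr_1, sdr_2)
```

Under this definition, scoring the unprocessed two-speaker mixture against each speaker gives values of equal magnitude and opposite sign, which is consistent with the 5.22 dB and -5.22 dB reported before separation.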

The Content-based Image Retrieval using the Histogram Area Calculation and Color and Texture using Object Segmentation (색상과 질감을 이용한 객체 분할과 히스토그램 영역 계산을 이용한 내용기반 영상 검색)

  • Jang, Se-Young;Han, Deuk-Su;Yoo, Gi-Hyoung;Yoo, Kang-Soo;Kwak, Hoon-Sung
    • Annual Conference of KIPS / 2005.11a / pp.229-232 / 2005
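  • This paper introduces a new Histogram Area Calculation (HAC) method and an object segmentation method for images. Histogram-based image comparison is highly sensitive to illumination because of the characteristics of the color space, so similarity can degrade as light intensity changes. Histograms also carry no spatial information, so images with entirely different shapes can be treated as the same image when their color distributions match. The proposed method divides the histogram into arbitrary regions and matches the regions by similarity. As a second-stage retrieval step, regions of the original image that share the same color and texture information are clustered, and the resulting segmented objects are used for retrieval. In experiments, the proposed method achieved better retrieval performance than the traditional histogram method.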


SOM-based Spatio-Temporal Data Mining System (SOM 기반 시공간 데이터 마이닝 시스템)

  • Kang Juyoung;Lee Bongjae;Song Jaeju;Shin Jinho;Yong Hwanseung
    • Annual Conference of KIPS / 2004.11a / pp.105-108 / 2004
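  • As data volumes have grown rapidly, data mining, which aims to extract meaningful knowledge from accumulated data, has been actively studied. Recently in particular, as computing environments have become mobile and distributed, applications such as surveillance and monitoring systems, weather observation systems, and GPS systems have generated vast amounts of spatio-temporal data, and interest in spatio-temporal data mining for analyzing such data efficiently has grown. Existing data mining techniques are optimized for text or numeric data, so they are limited when analyzing data that has both spatial and temporal attributes. In this paper, we develop a spatio-temporal clustering module based on the Self-Organizing Map (SOM) and compare its performance and clustering accuracy with those of three other clustering algorithms. We also develop a visualization module so that the characteristics of the input data and the results can be analyzed more accurately.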
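
A self-organizing map can be sketched briefly to show the kind of clustering this abstract refers to. The example below is a minimal one-dimensional SOM trained on hypothetical (x, y, t) records. Treating time as a third coordinate on the same scale as space is an assumption made here for illustration, and the parameter values and function names are not taken from the paper.

```python
import math
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def best_matching_unit(nodes, x):
    """Index of the node whose weight vector is closest to x."""
    return min(range(len(nodes)), key=lambda k: sq_dist(nodes[k], x))

def train_som(data, n_nodes, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM; learning rate and neighborhood width shrink over time."""
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = [list(rng.choice(data)) for _ in range(n_nodes)]
    sigma0 = n_nodes / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.5)
        for x in rng.sample(data, len(data)):
            bmu = best_matching_unit(nodes, x)
            for k, w in enumerate(nodes):
                # Gaussian neighborhood on the 1-D grid of node indices.
                h = math.exp(-((k - bmu) ** 2) / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return nodes

# Hypothetical spatio-temporal records (x, y, t) from two separate episodes.
group_a = [(0.0, 0.0, 0.0), (1.0, 0.5, 1.0), (0.5, 1.0, 2.0)]
group_b = [(10.0, 10.0, 10.0), (11.0, 10.5, 11.0), (10.5, 11.0, 12.0)]
data = group_a + group_b

nodes = train_som(data, n_nodes=4)
assignments = [best_matching_unit(nodes, x) for x in data]
print(assignments)
```

Each record is assigned to its best matching unit, and records that map to the same or neighboring units are read as one cluster. In practice the spatial and temporal attributes would usually be rescaled before training so that neither dominates the distance.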
