• Title/Summary/Keyword: clustering modeling (군집화 모델링)

Modeling and Classification of MPEG VBR Video Data using Gradient-based Fuzzy c_means with Divergence Measure (분산 기반의 Gradient Based Fuzzy c-means 에 의한 MPEG VBR 비디오 데이터의 모델링과 분류)

  • 박동철;김봉주
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.7C / pp.931-936 / 2004
  • This paper proposes GBFCM(DM), Gradient-based Fuzzy c-means with Divergence Measure, for efficient clustering of GPDF (Gaussian Probability Density Function) data in MPEG VBR video data modeling. The proposed GBFCM(DM) is based on GBFCM (Gradient-based Fuzzy c-means) and uses the divergence as its distance measure. Sets of real-time MPEG VBR video traffic data are considered. Each group of 12 frames of MPEG VBR video data is first transformed into 12-dimensional data for modeling, and the transformed 12-dimensional data are then passed through the proposed GBFCM(DM) for classification. GBFCM(DM) is compared with the conventional FCM and GBFCM algorithms. The results show that GBFCM(DM) gives a 5 to 15% improvement in false alarm rate over conventional algorithms such as FCM and GBFCM.

A Study on the Application Modeling of SNS Big-data for a Micro-Targeting using K-Means Clustering (K-평균 군집을 이용한 마이크로타겟팅을 위한 SNS 빅데이터 활용 모델링에 관한 연구)

  • Song, Jeo;Lee, Sang Moon
    • Proceedings of the Korean Society of Computer Information Conference / 2015.01a / pp.321-324 / 2015
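  • This paper proposes a model that collects consumer views found on SNS about specific products, brands, or companies (evaluations, opinions, impressions, and usage reviews) and analyzes this SNS big data so that companies can use it in business activities such as new product development and market entry or expansion. Building on that analysis, the model provides Micro-Targeting analysis information that supports marketing centered on Micro-Trends, which are becoming increasingly segmented into small groups and personalized. The study covers the collection, storage, and analysis of SNS data. In particular, the information for micro-targeting is implemented using the Euclidean-distance-based similarity and the K-means clustering algorithm of Mahout.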
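The K-means step with Euclidean distance mentioned in this abstract can be sketched as follows. This is a minimal pure-Python illustration, not the Mahout implementation the authors used; the function names (`euclidean`, `assign`, `kmeans`), the explicit `init_centroids` argument, and the sample feature vectors are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
        for p in points
    ]


def kmeans(points, init_centroids, max_iters=100):
    """Plain K-means: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iters):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid is the per-dimension mean of the members.
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical 2-D feature vectors (for example, two scores per user).
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(sample, init_centroids=[(0, 0), (10, 10)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Passing the starting centroids explicitly keeps the run deterministic; a production system would normally pick them randomly or with a seeding heuristic.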

HAPS Network MBS placement with EM Clustering Algorithm (HAPS 기반 네트워크에서의 실시간 이동 기지국 위치 문제 해결 정책)

  • Woong-Hee Jung;Ha Yoon Song;Kwan Sik Cho
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.1307-1310 / 2008
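  • EM (Expectation Maximization) is a well-known clustering algorithm that models a distribution from uncertain data. In the EM algorithm, the normal distributions take shape as the Expectation and Maximization steps are repeated. The outcome of this process differs depending on how the probabilities of the EM algorithm are initialized. In this paper, we adjust these initial probability values: we determine them so that they reflect several conditions of the problem of deciding, in real time, the positions of mobile base stations in a HAPS (High Altitude Platform Station) based network, and we present the results. In addition, we present a method that takes into account the service radius of a mobile base station as limited by the ITU.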
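The alternation of Expectation and Maximization steps, and the role of the initial values, can be illustrated with a small sketch. It fits a one-dimensional Gaussian mixture in pure Python; it is not the authors' method, and the function names, the `min_var` floor, and the sample data are assumptions introduced for illustration. The initial means, variances, and weights are passed in explicitly because, as the abstract notes, the result depends on them.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, iters=50, min_var=1e-6):
    """EM for a 1-D Gaussian mixture, started from caller-supplied values."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iters):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint) or 1e-300  # guard against underflow to zero
            resp.append([p / total for p in joint])
        # M-step: re-estimate parameters from the weighted data.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, min_var)  # floor avoids a collapsed component
            weights[j] = nj / len(data)
    return means, variances, weights


# Hypothetical 1-D positions forming two groups.
data = [0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1]
means, variances, weights = em_gmm_1d(
    data, means=[1.0, 4.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
print(means, weights)
```

Starting both means inside the same group, or with very unequal weights, can lead to a different local optimum, which is why a problem-specific initialization is worth designing.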

Subword Modeling of Vocabulary Independent Speech Recognition Using Phoneme Clustering (음소 군집화 기법을 이용한 어휘독립음성인식의 음소모델링)

  • Koo Dong-Ook;Choi Joon Ki;Yun Young-Sun;Oh Yung-Hwan
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.33-36 / 2000
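  • Vocabulary-independent isolated word recognition uses pre-trained sub-word acoustic models to recognize a target vocabulary that changes frequently. In this paper, we build a vocabulary-independent speech recognition system using a small speech database. We run recognition experiments while varying the threshold of a phoneme clustering method based on a back-off technique, which is effective for handling unseen context-dependent sub-words in a small speech database. In addition, a lack of training data causes the context-dependent sub-word models to become biased toward the training database; we resolve this by merging the context-dependent and context-independent sub-word models using deleted interpolation. As a result, speech recognition performance improved.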

A Study on the Regional Frequency Analysis Using the Artificial Neural Network Method - the Nakdong River Basin (인공신경망 군집분석을 이용한 지역빈도해석에 관한 연구 - 낙동강 유역을 중심으로)

  • Ahn, Hyunjun;Kim, Sunghun;Jung, Jinseok;Heo, Jun-Haeng
    • Proceedings of the Korea Water Resources Association Conference / 2017.05a / pp.404-404 / 2017
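  • As extreme hydrological events occur more frequently because of abnormal climate phenomena, interest is growing in the analysis of extreme hydrological events that correspond to relatively long return periods. In Korea, however, the number of samples available for estimating such extreme events is insufficient. When a site has few samples, or is an ungauged site where hydrological data cannot be collected, regional frequency analysis estimates quantiles using data from surrounding sites that are considered hydrologically homogeneous with that site. Its advantage is that it yields more robust estimates than at-site frequency analysis. For this reason, interest in regional frequency analysis as a quantile estimation technique has recently been increasing. Regionalization is the main feature that distinguishes regional frequency analysis from at-site frequency analysis, and because the regionalization result determines the sample size of each region, the method used to delineate hydrologically homogeneous regions is very important. An artificial neural network is a statistical modeling technique that imitates the way the human brain learns. That is, it turns into an algorithm the process by which the human brain, through repeated learning, infers or predicts the solution to a problem or recognizes a pattern, and it uses this to find the solution of an objective function. In particular, it is widely used to extract features from given data and learn those features in order to classify or cluster the whole data set. In this study, we performed cluster analysis using an artificial neural network for the Nakdong River basin and carried out regional frequency analysis using the resulting regions.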

Data Analysis of Facebook Insights (페이스북 인사이트 데이터 분석)

  • Cha, Young Jun;Lee, Hak Jun;Jung, Yong Gyu
    • The Journal of the Convergence on Culture Technology / v.2 no.1 / pp.93-98 / 2016
  • With the recent rapid development of information technologies, social networking services (SNS) accessed through a variety of mobile devices and smart screens are becoming popular. An SNS is a social-network-based service that moves existing offline relationships online. SNS can also be used differently from online communities, with which they are often confused. Modeling algorithms cover a variety of techniques, such as association, clustering, neural networks, and decision trees. Research is needed on using these techniques to make effective use of large volumes of data. In this paper, we evaluate the performance of the EM algorithm, which is regarded as performing well in clustering, based on clustering results obtained from Facebook Insights data. The analysis is based on applying the algorithm to experimental data from the South Australian state library and on the changes observed in the performance of the EM algorithm.

A Study on the Optimization of State Tying Acoustic Models using Mixture Gaussian Clustering (혼합 가우시안 군집화를 이용한 상태공유 음향모델 최적화)

  • Ann, Tae-Ock
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.6 / pp.167-176 / 2005
  • This paper describes how a decision-tree-based state tying model, one of the acoustic models used for speech recognition, can be optimized by reducing the number of mixture Gaussians in its output probability distributions. State tying modeling uses a finite set of questions, which can incorporate phonological knowledge, together with likelihood-based decision criteria. The recognition rate can be improved by increasing the number of mixture Gaussians in the output probability distribution. In this paper, we take the model at its highest recognition rate and reduce the number of mixture Gaussians by clustering the Gaussians. The Bhattacharyya and Euclidean methods are used as the distance measures needed for clustering. After the mean and variance are calculated for the pair with the lowest distance, a new Gaussian is created, and its parameters are derived from the parameters of the Gaussians from which it is formed. Experiments were performed using the STOCKNAME (1,680) databases. The test results show that the proposed method using the Bhattacharyya distance measure maintains the recognition rate at 97.2% and reduces the ratio of the number of mixture Gaussians by 1.0%. The method using the Euclidean distance measure maintains the recognition rate at 96.9% and reduces the ratio of the number of mixture Gaussians by 1.0%. The methods can therefore optimize the state tying model.
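The idea of repeatedly merging the closest pair of Gaussians can be sketched as below for univariate components. The Bhattacharyya distance formula is the standard closed form for two normal distributions. The merge rule (weighted moment matching), the tuple layout `(weight, mean, variance)`, and all function names are assumptions made for this illustration, since the abstract does not give the exact formulas the author used.

```python
import math
from itertools import combinations


def bhattacharyya(g1, g2):
    """Bhattacharyya distance between two univariate Gaussians (weights ignored)."""
    (_, m1, v1), (_, m2, v2) = g1, g2
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2)))
    return mean_term + var_term


def merge(g1, g2):
    """Merge two weighted Gaussians, preserving the pair's mean and variance."""
    (w1, m1, v1), (w2, m2, v2) = g1, g2
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    v = (w1 * (v1 + m1 ** 2) + w2 * (v2 + m2 ** 2)) / w - m ** 2
    return (w, m, v)


def reduce_mixture(gaussians, target):
    """Merge the closest pair repeatedly until only `target` components remain."""
    gaussians = list(gaussians)
    while len(gaussians) > target:
        i, j = min(
            combinations(range(len(gaussians)), 2),
            key=lambda ij: bhattacharyya(gaussians[ij[0]], gaussians[ij[1]]),
        )
        merged = merge(gaussians[i], gaussians[j])
        gaussians = [g for idx, g in enumerate(gaussians) if idx not in (i, j)]
        gaussians.append(merged)
    return gaussians


# Hypothetical mixture: two nearly identical components and one distant one.
mixture = [(0.25, 0.0, 1.0), (0.25, 0.1, 1.0), (0.5, 5.0, 1.0)]
print(reduce_mixture(mixture, target=2))
```

Swapping `bhattacharyya` for a Euclidean distance between the mean values would give the second variant the abstract compares against.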

Key Point Extraction from LiDAR Data for 3D Modeling (3차원 모델링을 위한 라이다 데이터로부터 특징점 추출 방법)

  • Lee, Dae Geon;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.34 no.5 / pp.479-493 / 2016
  • LiDAR (Light Detection and Ranging) data acquired from ALS (Airborne Laser Scanner) has been used intensively to reconstruct object models. In particular, research on 3D modeling from LiDAR data has been carried out to efficiently establish high-quality spatial information such as precise 3D city models and true orthoimages. To reconstruct object models from irregularly distributed LiDAR point clouds, pre-processing is required: sensor calibration, noise removal, and filtering to separate objects from ground surfaces. The modeling procedure then involves classification and segmentation based on the geometric homogeneity of the features, grouping and representation of the segmented surfaces, topological analysis of the surface patches, and accuracy assessment. While many modeling methods are based on the segmentation process, this paper proposes extracting key points directly for building modeling, without segmentation. The method was applied to simulated and real data sets with various roof shapes. The accuracy analysis of the results demonstrates the feasibility of the proposed method.

Components Clustering for Modular Product Design Using Network Flow Model (네트워크 흐름 모델을 활용한 모듈러 제품 설계를 위한 컴포넌트 군집화)

  • Son, Jiyang;Yoo, Jaewook
    • Journal of the Korea Academia-Industrial cooperation Society / v.17 no.7 / pp.263-272 / 2016
  • Modular product design has contributed to flexible product modification and development, production lead time reduction, and increasing product diversity. Modular product design aims to develop a product architecture that is composed of detachable modules. These modules are constructed by maximizing the similarity of components based on physical and functional interaction analysis among components. Accordingly, a systematic procedure for clustering the components, which is a main activity in modular product design, is proposed in this paper. The first phase in this procedure is to build a component-to-component correlation matrix by analyzing physical and functional interaction relations among the components. In the second phase, network flow modeling is applied to find clusters of components, maximizing their correlations. In the last phase, a network flow model formulated with linear programming is solved to find the clusters and to make them modular. Finally, the proposed procedure in this research and its application are illustrated with an example of modularization for a vacuum cleaner.

Analysis of Research Trends Related to drug Repositioning Based on Machine Learning (머신러닝 기반의 신약 재창출 관련 연구 동향 분석)

  • So Yeon Yoo;Gyoo Gun Lim
    • Information Systems Review / v.24 no.1 / pp.21-37 / 2022
  • Drug repositioning, one of the methods of developing new drugs, is a useful way to discover new indications by allowing drugs that have already been approved for use in people to be used for other purposes. Recently, with the development of machine learning technology, cases of analyzing vast amounts of biological information and using it to develop new drugs have been increasing. Applying machine learning technology to drug repositioning will help find effective treatments quickly. The world is currently having a difficult time because of COVID-19, a new disease caused by a coronavirus that produces severe acute respiratory syndrome. Drug repositioning, which repurposes drugs that have already been clinically approved, could be an alternative route to therapeutics for COVID-19 patients. This study examines research trends in the field of drug repositioning using machine learning techniques. A total of 4,821 papers were collected from PubMed with the keyword 'Drug Repositioning' using a web scraping technique. After data preprocessing, frequency analysis, LDA-based topic modeling, random forest classification analysis, and prediction performance evaluation were performed on 4,419 papers. Associated words were analyzed based on the Word2vec model; after dimensionality reduction with PCA, K-Means clustering was used to generate labels, and the structure of the literature was then visualized using the t-SNE algorithm. Hierarchical clustering was applied to the LDA results and visualized as a heat map. This study identified the research topics related to drug repositioning and presented a method to derive and visualize meaningful topics from a large amount of literature using machine learning algorithms. The results are expected to serve as basic data for establishing research or development strategies in the field of drug repositioning in the future.