• Title/Summary/Keyword: 군집 기법 (clustering technique)

Search Result 1,014, Processing Time 0.028 seconds

Electric Power Consumption Forecasting Method using Data Clustering (데이터 군집화를 이용한 전력 사용량 예측 기법)

  • Park, Jinwoong;Moon, Jihoon;Kim, Yongsung;Hwang, Eenjun
    • Proceedings of the Korea Information Processing Society Conference / 2016.04a / pp.571-574 / 2016
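  • Smart grid systems, next-generation intelligent power grids that optimize energy efficiency, have recently been widely deployed in Korea and abroad. As a result, Energy Management System (EMS) technology, which is applied to operate grid systems efficiently, is becoming increasingly important. An EMS requires highly accurate forecasts of energy usage, and the more accurate the forecast, the better the energy can be utilized. This paper proposes a new method for improving the accuracy of electric power consumption forecasting. Specifically, we first analyze the environmental factors that affect consumption. Using these factors, we pre-cluster the power consumption data so that records with similar environments are grouped together. We then forecast power consumption from the data in the cluster whose environment is most similar to the environmental information for the forecast day. To evaluate the proposed method, we forecast daily power consumption in a variety of experiments and measured the accuracy. Compared with existing methods, the proposed method improved forecasting accuracy by up to 52.88%.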
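The cluster-then-forecast idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature layout (temperature, holiday flag), the use of plain k-means, and forecasting by the cluster's mean usage are all assumptions made for illustration, and the names `kmeans` and `forecast` are hypothetical.

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)

    def assign(cs):
        # Label each point with the index of its nearest centroid.
        return [min(range(k), key=lambda c: math.dist(p, cs[c])) for p in points]

    for _ in range(iters):
        labels = assign(centroids)
        updated = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(sum(x) / len(members) for x in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(centroids)

def forecast(env_history, usage_history, env_target, k=2):
    """Forecast usage as the mean usage of the most similar environment cluster.

    Assumption: the cluster mean stands in for whatever predictor the paper uses.
    """
    centroids, labels = kmeans(env_history, k)
    non_empty = sorted(set(labels))
    nearest = min(non_empty, key=lambda c: math.dist(env_target, centroids[c]))
    usages = [u for u, l in zip(usage_history, labels) if l == nearest]
    return sum(usages) / len(usages)

# Hypothetical toy data: (temperature in C, holiday flag) and daily usage.
env = [(30.0, 0), (31.0, 0), (29.0, 0), (5.0, 1), (4.0, 1), (6.0, 1)]
usage = [100.0, 110.0, 90.0, 50.0, 40.0, 60.0]
hot_day = forecast(env, usage, (28.0, 0))
cold_day = forecast(env, usage, (5.5, 1))
```

In practice the environmental features would need scaling before distances are computed, since raw temperature would otherwise dominate a 0/1 flag.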

Text Clustering Algorithm Based on Ontology Concepts Combination (온톨로지 개념 합병 기반 문서 군집화 기법)

  • Guan, XiangDong;Kim, Woosaeng
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.10a / pp.722-724 / 2012
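  • Document clustering makes it possible to organize, manage, and retrieve documents efficiently. Because documents generally contain many words and concepts, clustering is performed in a high-dimensional vector space model. This paper proposes a method that uses an ontology corresponding to the document collection to reduce the dimensionality of the document vector space so that clustering can be performed efficiently, and shows through experiments that it outperforms existing methods.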
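One way to picture ontology-based dimensionality reduction is to fold terms that share a parent concept into a single vector dimension. The sketch below assumes a hypothetical toy term-to-concept mapping (`ONTOLOGY`) and a hypothetical helper `concept_vector`; the paper's actual merging rules are not given in the abstract.

```python
from collections import Counter

# Hypothetical toy ontology: each term maps to its parent concept.
ONTOLOGY = {
    "poodle": "dog",
    "beagle": "dog",
    "siamese": "cat",
    "persian": "cat",
}

def concept_vector(tokens, ontology):
    """Count concepts instead of raw terms; unmapped terms keep their own dimension."""
    counts = Counter()
    for token in tokens:
        counts[ontology.get(token, token)] += 1
    return counts

doc = "poodle beagle siamese poodle".split()
term_vec = Counter(doc)                    # 3 dimensions: poodle, beagle, siamese
concept_vec = concept_vector(doc, ONTOLOGY)  # 2 dimensions: dog, cat
```

Here the term vector has three dimensions while the concept vector has two, which is the kind of reduction the abstract describes, applied before any clustering step.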

A Clustering-based Undersampling Method to Prevent Information Loss from Text Data (텍스트 데이터의 정보 손실을 방지하기 위한 군집화 기반 언더샘플링 기법)

  • Jong-Hwi Kim;Saim Shin;Jin Yea Jang
    • Annual Conference on Human and Language Technology / 2022.10a / pp.251-256 / 2022
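  • Class imbalance causes a classification model to be trained with a bias toward the majority class, which degrades classification performance on the minority class. Undersampling, which reduces the amount of majority-class data to balance it against the minority class, is a representative remedy for imbalance, but existing undersampling studies in the text domain have applied only relatively simple techniques such as word embeddings and random sampling. This paper proposes an undersampling method that minimizes information loss in text data by using transformer-based sentence embeddings and clustering-based sampling. To validate the proposed method, we trained models on training sets drawn with the proposed method and with random sampling in a sentiment analysis experiment and compared their performance. Models using the proposed method achieved classification accuracy between 0.2% and 2.0% higher than models using random sampling, confirming the effectiveness of the proposed clustering-based undersampling technique.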
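The sampling step of clustering-based undersampling might look like the sketch below. It assumes the majority-class sentences have already been embedded and clustered (that step is not shown), so each item arrives with a cluster id. Drawing from every cluster in proportion to its size is an assumption for illustration, not necessarily the authors' sampling rule, and `cluster_undersample` is a hypothetical name.

```python
import random
from collections import defaultdict

def cluster_undersample(items, cluster_ids, n_keep, seed=0):
    """Keep about n_keep items, drawing from every cluster in proportion to its size.

    Each cluster contributes at least one item, so rounding can make the
    result slightly larger or smaller than n_keep.
    """
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for item, cid in zip(items, cluster_ids):
        clusters[cid].append(item)

    kept = []
    for cid in sorted(clusters):
        members = clusters[cid]
        quota = max(1, round(n_keep * len(members) / len(items)))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept

# Hypothetical majority-class sentences with precomputed cluster ids.
majority = [f"sentence_{i}" for i in range(10)]
ids = [0] * 6 + [1] * 4
sampled = cluster_undersample(majority, ids, n_keep=5)
```

Compared with uniform random sampling, this keeps every cluster represented in the reduced set, which is the intuition behind limiting information loss.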

How to determine the number of clusters (군집수 결정 문제)

  • Yun, Bok-Sik
    • Proceedings of the Korean Operations and Management Science Society Conference / 2004.05a / pp.689-693 / 2004
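  • When given data are partitioned into several clusters according to some criterion, clustering is in most cases carried out without prior information about the number of clusters. Determining an appropriate number of clusters is a very important problem, since the validity of the clustering result depends on it, but because of its inherent complexity it is hard to find a method that is simple to apply in practice, and harder still to find one that suits diverse types of data universally. This study introduces the ideas behind previously proposed methods for determining the number of clusters and presents a new method that can be applied generally regardless of the type of data. Because the number of clusters is in most cases determined at the same time as the clustering itself, a general-purpose method that handles both together is also introduced. The validity of the methods is verified through application examples.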

Industrial Clusters and Their Boundaries: A Case Study for Plants in the Cincinnati metropolitan Area (씬씨내티 대도시지역의 산업군집과 경계설정)

  • Lee, Bo-Young
    • Journal of the Korean association of regional geographers / v.6 no.3 / pp.169-184 / 2000
  • Industrial clusters and their boundaries are identified by factor and hot spot analyses for the greater Cincinnati metropolitan area in the USA. While the traditional input-output approach identifies aspatial industrial clusters, this study combines the traditional approach with GIS techniques to identify their boundaries. Combining the input-output industrial clusters with the leading industry groups, we identified five leading industry clusters: food (20), chemicals (28), metal manufacturing (32), metal products (33), and machinery (35). We also used hot spot analysis to visualize each industry cluster in the research area using ArcView software. Determining the degree to which such industries are associated spatially, and their spatial delimitation, may be an additional approach to measuring the efficiency of the spatial organization of an economy. It is hoped that the industrial cluster and industrial spatial cluster approaches may also provide the basis for the development of new models of the spatial arrangement of industry at a level more aggregated than that of the single plant or firm.

Studies on the Structure of the Forest Community in Mt. Sokri(II) -Analysis on the Plant Community by the Classification and Ordination Techniques- (속리산 삼림군집구조에 관한 연구(II) Classification 및 Ordination 방법에 의한 식생분석 -)

  • 이경재;박인협;조재창;오충현
    • Korean Journal of Environment and Ecology / v.4 no.1 / pp.33-43 / 1990
  • A survey of the Popju Temple district was conducted using 70 sample plots of 500 $m^2$ each. Classification by TWINSPAN and DCA ordination were applied to the study area in order to classify the plots into several groups based on woody plants and environmental variables. By both techniques, the plant community was divided into six groups according to altitude and soil moisture. The successional trends of tree species seem to be from Pinus densiflora, Sorbus alnifolia through Quercus serrata to Carpinus laxiflora and from P. densiflora, Fraxinus sieboldiana through Q. mongolica in the canopy layer, and from Lespedeza cyrtobotrya, Rhus trichocarpa, Zanthoxylum schinifolium through Rhododendron mucronulatum, Corylus sieboldiana, Lindera obtusiloba, Magnolia sieboldii to Euonymus sieboldianus in the understory and shrub layer. The species diversity of the plant community in the burnt plot was decreased by the forest fire, but the importance values of Quercus species increased in that plot.

A Case Study on Job Analysis Utilizing Cluster Analysis and Community Analysis (군집분석 및 커뮤니티 분석 기법을 활용한 직무분석 사례 연구)

  • Jo, Il-Hyun
    • The Journal of Korean Association of Computer Education / v.7 no.1 / pp.151-165 / 2004
  • The purpose of the study was to explore the potential of cluster analysis and community analysis, the latter from the social network analysis family, in job-task analysis for curriculum design. These two multivariate analysis techniques were expected to provide relevant, scientific information as well as inspiration for investigating the structure and nature of the job system, which are critical in developing a relevant curriculum. To this end, qualitative and quantitative data were collected from "S" Corporation, a large high-tech manufacturing company, and analyzed using the relevant analytic procedures. The results indicate that there are discrepancies between formal job structures and actual ones. Subsequent community analysis showed the presence of a center-marginal structure along with a clustering structure in the current job formation. Interpretations of the results are provided in light of past research and additional data collected in the study. Implications of the study are also discussed, along with suggestions for future research.

Evolutionary Computation-based Hybrid Clustering Technique for Manufacturing Time Series Data (제조 시계열 데이터를 위한 진화 연산 기반의 하이브리드 클러스터링 기법)

  • Oh, Sanghoun;Ahn, Chang Wook
    • Smart Media Journal / v.10 no.3 / pp.23-30 / 2021
  • Clustering of manufacturing time series data is an important grouping technique for detecting and correcting equipment and process defects from large-scale manufacturing data, but accuracy is low when existing clustering techniques designed for static data are applied to time series data. In this paper, an evolutionary computation-based time series cluster analysis approach is presented to improve the coherence of existing clustering techniques. To this end, the image shape resulting from the manufacturing process is first converted into one-dimensional time series data using linear scanning, and the optimal sub-clusters for hierarchical cluster analysis and split cluster analysis are derived from the transformed data based on the Pearson distance metric. Finally, a genetic algorithm is used to derive an optimal cluster combination with minimal similarity from the two cluster analysis results. The superiority of the proposed clustering is verified by comparing its performance with that of an existing clustering technique on actual manufacturing process images.
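The first two steps named in this abstract, linear scanning and the Pearson distance, can be sketched as follows. Two assumptions are made for illustration: "linear scanning" is taken to mean a row-by-row flattening of the image, and the Pearson distance is taken in its common form, one minus the Pearson correlation coefficient. The abstract does not confirm either detail, and the function names are hypothetical.

```python
import math

def linear_scan(image):
    """Flatten a 2-D grid row by row into a 1-D series (assumed scan order)."""
    return [pixel for row in image for pixel in row]

def pearson_distance(x, y):
    """Return 1 - r, where r is the Pearson correlation of two equal-length series.

    0 means perfectly correlated shapes, 2 means perfectly anti-correlated.
    Raises ZeroDivisionError if either series is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (norm_x * norm_y)

# Hypothetical 2x2 "images": b is a scaled copy of a, c is a reversed copy.
a = linear_scan([[1, 2], [3, 4]])
b = linear_scan([[2, 4], [6, 8]])
c = linear_scan([[4, 3], [2, 1]])
same_shape = pearson_distance(a, b)
opposite_shape = pearson_distance(a, c)
```

Because the distance depends only on correlation, two series with the same shape but different amplitude are treated as close, which suits shape-based grouping of process images.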

Analysis on the Forest Community of Daewon Valley in Mt. Chiri by the Classification and Ordination Techniques (Classification 및 Ordination 방법에 의한 지리산 대원계곡의 삼림군집구조 분석)

  • 이경재;구관효;최재식;조현서
    • Korean Journal of Environment and Ecology / v.5 no.1 / pp.54-67 / 1991
  • To investigate the structure of the plant community of the Daewon valley forest in Mt. Chiri, eighty-nine plots were set up by the clumped sampling method. Classification by TWINSPAN and DCA ordination were applied to the study area in order to classify the plots into several groups based on woody plants and environmental variables. The classification was successfully overlaid on an ordination of the same data using DCA. The plots can be classified into five groups by TWINSPAN and DCA: the Pinus densiflora community, the Quercus variabilis-Q. serrata community, the Carpinus laxiflora community, the Q. mongolica community, and the Cornus controversa-Q. mongolica community. The successional trends of tree species by both techniques seem to be from P. densiflora through Q. variabilis, Q. serrata to C. laxiflora at low altitude and from Q. mongolica to C. controversa at high altitude in the canopy layer. Analysis of the relationship between the DCA stand scores and environmental variables showed that soil moisture, the amount of soil humus, and soil nutrients tended to increase significantly from the P. densiflora community to the C. laxiflora community.

Fuzzy-based Threshold Controlling Method for ART1 Clustering in GPCR Classification (GPCR 분류에서 ART1 군집화를 위한 퍼지기반 임계값 제어 기법)

  • Cho, Kyu-Cheol;Ma, Yong-Beom;Lee, Jong-Sik
    • Journal of the Korea Society of Computer and Information / v.12 no.6 / pp.167-175 / 2007
  • Fuzzy logic is used to represent qualitative knowledge and provides interpretability to a controlling system model in bioinformatics. This paper focuses on bioinformatics data classification, which is an important bioinformatics application. This paper reviews two traditional controlling system models. The sequence-based threshold controller has problems with deciding the optimal range for threshold readjustment and with long processing time for optimal threshold induction. The binary-based threshold controller does not guarantee early system stability in GPCR data classification for optimal threshold induction. To solve these problems, we propose a fuzzy-based threshold controller for ART1 clustering in GPCR classification. We implement the proposed method and measure processing time while varying the induction recognition success rate and the classification threshold value. We also compare the proposed method with the sequence-based threshold controller and the binary-based threshold controller. The fuzzy-based threshold controller continuously readjusts threshold values using a membership function of the previous recognition success rate. The fuzzy-based threshold controller maintains system stability and improves classification system efficiency in GPCR classification.
