• Title/Abstract/Keywords: cluster goodness of fit

Drivers Driving Habits Data and Risk Group Cluster Analysis (운전자 행동자료 및 고위험군 군집 분석)

  • Kim, Yong-Chul
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.9 no.2 / pp.243-247 / 2016
  • Driving event data such as rapid acceleration, rapid deceleration, sudden braking, sudden departure, and overspeeding provide important information for predicting or analyzing a driver's driving habits and accident risk. Most of the data representing a driver's driving habits generally fit a parametric distribution, whereas the extreme parts of the data, which are used to estimate accident risk, may not. This paper presents an empirical distribution for a driver's driving habits that is divided into two regions: one follows the normal distribution and the other follows the generalized Pareto distribution.
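
The two-region distribution described in this abstract can be illustrated with a spliced CDF: a normal body up to a threshold and a generalized Pareto tail above it. This is only a minimal sketch. The function name `spliced_cdf`, the threshold `u`, and all parameter values are assumptions made for illustration and are not taken from the paper.

```python
import math
from statistics import NormalDist


def spliced_cdf(x, mu, sd, u, xi, beta):
    """CDF that follows Normal(mu, sd) up to the threshold u and a
    generalized Pareto tail (shape xi, scale beta) above it.

    All parameters are hypothetical; the paper's fitted values are unknown.
    """
    body = NormalDist(mu, sd)
    p_u = body.cdf(u)  # probability mass below the threshold
    if x <= u:
        return body.cdf(x)
    y = x - u  # exceedance over the threshold
    if xi == 0.0:
        tail = 1.0 - math.exp(-y / beta)  # exponential limit of the GPD
    else:
        base = 1.0 + xi * y / beta
        # For xi < 0 the tail has a finite upper end point.
        tail = 1.0 if base <= 0.0 else 1.0 - base ** (-1.0 / xi)
    # Rescale the tail so the CDF is continuous at u and reaches 1.
    return p_u + (1.0 - p_u) * tail
```

Scaling the tail by `1 - p_u` keeps the combined function a valid CDF, so probabilities of extreme events can be read from the tail while ordinary behaviour is read from the normal body.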

Morphological Classification of Unit Basin on Soil & Geo-morphological Characteristics in Yeongsan River Basin (토양 및 지형학적 특성에 따른 영산강유역의 소유역 분류)

  • Sonn, Yeon-Kyu;Hyun, Byung-Keun;Jung, Suk-Jae;Hur, Seong-Oh;Jung, Kang-Ho;Seo, Myung-Chul;Ha, Sang-Keun
    • Proceedings of the Korea Water Resources Association Conference / 2007.05a / pp.246-252 / 2007
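  • Water-quality conservation against agricultural nonpoint pollution sources and water-resource management are both carried out on a watershed basis. For agricultural management, as well as for water-quality and water-resource management, watersheds, and sub-basins in particular, therefore need to be classified systematically into units that reflect their soil characteristics. For 50 sub-basins of the Yeongsan River basin in southwestern Korea, cluster analysis was performed using soil, topographic, stream, and watershed maps on four pedologically important characteristics: sinuosity, the proportion of forest, the proportion of flat land, and whether there is inflow from other sub-basins. The sub-basins could be divided into five groups, and a Mantel test carried out to assess the goodness of fit of this grouping gave r = 0.81825, from which the grouping was judged appropriate. Classifying and grouping sub-basins by similarity in a way that covers both soil and topographic characteristics should be useful for optimal farm management, pollutant-specific water-quality management, wider applicability of hydrological models, and water-resource management, and should provide a basis for systematic management.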
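
The Mantel test mentioned in this abstract measures the correlation between two distance matrices and assesses it by permutation. The sketch below is a generic version of that test, not the authors' procedure: which two matrices the paper compared is not stated, so the inputs `d1` and `d2`, the permutation count, and the helper names are all assumptions.

```python
import math
import random


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def mantel(d1, d2, permutations=999, seed=0):
    """Mantel statistic r and a one-sided permutation p-value.

    d1 and d2 are symmetric distance matrices over the same objects.
    """
    n = len(d1)
    flat1 = upper_triangle(d1)
    observed = pearson(flat1, upper_triangle(d2))
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of d2 together to break the pairing.
        permuted = [[d2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(flat1, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)
```

In a cluster-validation setting, one plausible choice (again an assumption) is to pass the attribute-based distance matrix as `d1` and a matrix derived from the clustering result as `d2`.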

Regionalization using cluster probability model and copula based drought frequency analysis (클러스터 확률 모형에 의한 지역화와 코풀라에 의한 가뭄빈도분석)

  • Azam, Muhammad;Choi, Hyun Su;Kim, Hyeong San;Hwang, Ju Ha;Maeng, Seungjin
    • Proceedings of the Korea Water Resources Association Conference / 2017.05a / pp.46-46 / 2017
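  • The reliability of quantile estimates in regional drought frequency analysis depends on the long-term historical data and the analysis procedure used to delineate hydrologically homogeneous regions. However, extreme droughts occur very rarely, and the record length is often insufficient for reliable regional frequency analysis. In addition, Korea's complex topographic and climatic characteristics call for a statistical method for delineating homogeneous regions. The regional frequency analysis applied in this study identifies homogeneous regions by analyzing hydrometeorological characteristics, that is, multiple variables across many sites, and estimates the quantiles of each homogeneous region by jointly considering the main drought variables (duration and severity). A model-based clustering method built on the Gaussian mixture model was applied to delineate the optimal homogeneous regions, and the result was checked with the likelihood ratio test and other validity indices. To represent the parameters estimated by the Gaussian mixture model in a reduced-dimension space, the Gaussian mixture model dimension reduction (GMMDR) method was applied. For the drought frequency analysis, these variables were estimated and compared using a range of distributions and copula goodness-of-fit measures. As a result, Korea was divided into four homogeneous regions. The Pearson type III (PE3) distribution combined with the Gaussian and Frank copulas was found to be suitable for estimating the joint distribution of drought duration and severity in Korea.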
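
The abstract names the Frank copula as one of the dependence models for drought duration and severity. As a rough illustration, the standard bivariate Frank copula CDF can be coded as below. The function name and the example value of the dependence parameter `theta` are illustrative assumptions; the paper's fitted parameters are not given in the abstract.

```python
import math


def frank_copula(u, v, theta):
    """Bivariate Frank copula CDF C(u, v) for theta != 0.

    u and v are marginal probabilities in [0, 1], for example the CDF
    values of drought duration and severity under their fitted marginals.
    """
    if theta == 0.0:
        raise ValueError("theta must be non-zero; theta -> 0 gives independence")
    # expm1/log1p keep the formula accurate when theta is small.
    numerator = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(numerator / math.expm1(-theta)) / theta


# Hypothetical usage: joint probability that both variables stay below
# their 90th percentiles under positive dependence (theta = 5 is made up).
joint = frank_copula(0.9, 0.9, 5.0)
```

Positive `theta` models positive dependence, so `joint` lies between the independence value `0.9 * 0.9` and the upper bound `0.9`.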

A New Similarity Measure for Categorical Attribute-Based Clustering (범주형 속성 기반 군집화를 위한 새로운 유사 측도)

  • Kim, Min;Jeon, Joo-Hyuk;Woo, Kyung-Gu;Kim, Myoung-Ho
    • Journal of KIISE:Databases / v.37 no.2 / pp.71-81 / 2010
  • Clustering is widely used in numerous applications, such as pattern recognition, image analysis, and market analysis. The important factors that determine cluster quality are the similarity measure and the number of attributes. Similarity measures should be defined with respect to the data types. Existing similarity measures work well for numerical attribute values. However, they do not work well when the data are described by categorical attributes, that is, when there is no inherent similarity measure between values. In high-dimensional spaces, conventional clustering algorithms tend to break down because of the sparsity of data points. To overcome this difficulty, a subspace clustering approach has been proposed, based on the observation that different clusters may exist in different subspaces. In this paper, we propose a new similarity measure for clustering high-dimensional categorical data. The measure is based on the idea that in a good clustering each cluster carries information that distinguishes it from the other clusters. We also try to capture attribute dependencies. This study is meaningful because no previous method uses both. Experimental results on real datasets show that the clusters obtained with the proposed similarity measure are good with respect to clustering accuracy.

Analysis of Characteristics of Clusters of Middle School Students Using K-Means Cluster Analysis (K-평균 군집분석을 활용한 중학생의 군집화 및 특성 분석)

  • Jaebong, Lee
    • Journal of The Korean Association For Science Education / v.42 no.6 / pp.611-619 / 2022
  • The purpose of this study is to explore the possibility of applying big data analysis to provide appropriate feedback to students using evaluation data in science education, at a time when interest in educational data mining has been growing. We use the evaluation data of 2,576 students who answered 24 questions of the national assessment of educational achievement, and we apply K-means cluster analysis, an unsupervised machine learning method, for clustering. As a result, students were divided into six clusters. Middle-ranking students were divided into more clusters than upper- or lower-ranking students. According to the results of the cluster analysis, the most important factor influencing the clustering is academic achievement, and each cluster shows different characteristics in terms of content domains, subject competencies, and affective characteristics. Among the affective domains, learning motivation is important in the lower-achieving cluster; among the subject competencies, scientific inquiry and problem-solving competency and scientific communication competency have a major influence. In the content domain, achievement in motion and energy and in matter are important factors that distinguish the clusters. As a result, we can provide students with customized learning feedback based on the characteristics of each cluster. We discuss the implications of these results for science education, such as possible uses of the results, balanced learning across content domains, enhancement of subject competencies, and improvement of scientific attitude.
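
K-means cluster analysis, as used in this study, alternates between assigning each point to its nearest centre and recomputing each centre as the mean of its members. The following is a minimal sketch of that loop (Lloyd's algorithm) on toy data. The function names, the random initialisation, and the sample points are assumptions for illustration; the study's actual software, preprocessing, and settings are not described in the abstract.

```python
import random


def nearest(point, centers):
    """Index of the centre closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centers[c])),
    )


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # random initial centres
    labels = None
    for _ in range(iterations):
        new_labels = [nearest(p, centers) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy data: two well-separated groups of hypothetical score vectors.
scores = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(scores, k=2)
```

With real assessment data each point would be a student's response or score vector, and the number of clusters would be chosen with a validity index rather than fixed in advance.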

Design of SAR Satellite Constellation Configuration for ISR Mission (ISR 임무를 위한 SAR 위성의 군집궤도 배치형상 설계)

  • Kim, Hongrae;Song, Sua;Chang, Young-Keun
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.45 no.1 / pp.54-62 / 2017
  • For Earth observation satellites on an ISR mission, a satellite constellation can be used to observe a specific area periodically and ultimately increase the effectiveness of the mission. The Walker-Delta method was applied to design constellation orbits with four satellites that can detect abnormal activities in the area of interest (AoI). To evaluate the effectiveness of the mission, revisit time was selected as the key requirement. This paper presents the mission analysis process for a constellation of four SAR satellites, as well as the resulting constellation configuration design that meets the requirements. A figure-of-merit analysis was performed based on the developed algorithm. Finally, it was confirmed that a constellation with four different orbital planes is likely to be appropriate for the ISR mission.
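
A Walker-Delta constellation is commonly described by the notation t/p/f: t satellites spread evenly over p orbital planes, with a phasing parameter f that sets the offset between adjacent planes. The sketch below lays out the right ascension of the ascending node (RAAN) and the in-plane phase angle for each satellite under that convention. The phasing value f = 1 in the usage line is an assumption; the abstract only states that four satellites in four planes were found suitable.

```python
def walker_delta(t, p, f):
    """Return (RAAN, in-plane phase angle) in degrees for each satellite
    of a Walker-Delta t/p/f pattern. Inclination and altitude are shared
    by all satellites and are not modelled here.
    """
    if t % p != 0:
        raise ValueError("t must be a multiple of p")
    per_plane = t // p
    satellites = []
    for plane in range(p):
        raan = 360.0 * plane / p  # planes evenly spaced around the equator
        for slot in range(per_plane):
            # Even spacing within the plane plus the inter-plane phase
            # offset of f * 360 / t degrees per plane.
            phase = (360.0 * slot / per_plane + 360.0 * f * plane / t) % 360.0
            satellites.append((raan, phase))
    return satellites


# Hypothetical 4/4/1 pattern: one satellite in each of four planes.
layout = walker_delta(4, 4, 1)
```

A revisit-time study would propagate each of these orbits and record the gaps between passes over the area of interest; that step is outside this sketch.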

Adaptive Chain Robots for Effectively Exploring Maze (효과적으로 미로를 찾기 위한 적응적 체인로봇)

  • Cho, Chang-Kwon;Woo, Gyun
    • Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.275-278 / 2011
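  • Chain robots are more robust than a single robot and are therefore well suited to unmanned exploration environments. Chain robots are managed by selecting a leader robot and then managing the swarm. This paper proposes the adaptive chain robot algorithm, a method that adaptively reassigns the ranks within a chain robot group to suit the maze environment so that swarm robots can explore a maze effectively. Experiments exploring a two-dimensional map were carried out for two cases, one applying the chain algorithm and one applying the adaptive chain algorithm. While the swarm moves through the map, the leader robot may become inoperable; in that case the ranks of the chain robots are adaptively reassigned. The proposed method was tested in a simulation environment, where it achieved a success rate of 100%.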

Magnifying Block Diagonal Structure for Spectral Clustering (스펙트럼 군집화에서 블록 대각 형태의 유사도 행렬 구성)

  • Heo, Gyeong-Yong;Kim, Kwang-Baek;Woo, Young-Woon
    • Journal of Korea Multimedia Society / v.11 no.9 / pp.1302-1309 / 2008
  • Traditional clustering methods, like k-means or fuzzy clustering, are prototype-based methods that are applicable only to convex clusters. Spectral clustering, on the other hand, tries to find clusters using only local similarity information. Its ability to handle concave clusters has made it popular in recent years, together with the support vector machine (SVM), a kernel-based classification method. However, as in SVM, the kernel width plays an important role and has a great impact on the result. Although several methods have been proposed to decide it automatically, it is still determined by heuristics. In this paper, we propose an adaptive method that decides the kernel width based on a distance histogram. The proposed method is motivated by the fact that the affinity matrix should form a block diagonal matrix to generate the best result. We use the traditional Euclidean distance together with the random walk distance, which makes it possible to form a more apparent block diagonal affinity matrix. Experimental results show that the proposed method generates a more clearly block-structured affinity matrix than the existing method.
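
To see why the kernel width matters for the block diagonal structure, it helps to build a Gaussian-kernel affinity matrix for two small groups of points at two different widths. This sketch uses the standard Gaussian kernel with a fixed, hand-picked `sigma`; it does not implement the paper's adaptive, histogram-based width selection or its random walk distance, and the sample points are made up.

```python
import math


def gaussian_affinity(points, sigma):
    """Affinity matrix A[i][j] = exp(-||xi - xj||^2 / (2 * sigma^2)),
    with a zero diagonal as is common in spectral clustering."""
    n = len(points)
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                affinity[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return affinity


# Two hypothetical groups: points 0-1 are close together, as are 2-3.
pts = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
sharp = gaussian_affinity(pts, sigma=1.0)     # near block diagonal
blurred = gaussian_affinity(pts, sigma=10.0)  # blocks wash out
```

With `sigma=1.0` the cross-group entries are close to zero, so the matrix is nearly block diagonal; with `sigma=10.0` the cross-group entries become comparable to the within-group ones, which is the sensitivity the abstract describes.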

A Study on Visitors' Differences of Satisfaction Level Due to Lifestyle in Bukhansan National Park, Korea (북한산 국립공원 탐방객의 라이프 스타일에 따른 만족도 차이에 관한 연구)

  • Yoo, Ki Joon
    • Korean Journal of Environment and Ecology / v.30 no.5 / pp.915-921 / 2016
  • This study analyzes differences in visitors' satisfaction levels according to personal lifestyle in a Korean national park. Lifestyles were classified, and differences in satisfaction level among the groups were tested, using a questionnaire survey conducted in Bukhansan National Park, Korea. Factor analysis of the lifestyle items yielded 5 factors, and cluster analysis identified 4 clusters as suitable. The differences in satisfaction level among the lifestyle clusters were statistically significant. Satisfaction was highest in the active-tendency cluster among the 4 cluster types (Social > Individual > Passive cluster). These differences in satisfaction by lifestyle suggest that a segmented approach, rather than an integrated one, is needed for managing facilities and programs in the Korean national park system.

Comparison between Planned and Actual Data of Block Assembly Process using Process Mining in Shipyards (조선 산업에서 프로세스 마이닝을 이용한 블록 조립 프로세스의 계획 및 실적 비교 분석)

  • Lee, Dongha;Park, Jae Hun;Bae, Hyerim
    • The Journal of Society for e-Business Studies / v.18 no.4 / pp.145-167 / 2013
  • This paper proposes a method to compare planned processes with actual processes of block assembly operations in the shipbuilding industry. Process models can be discovered with process mining techniques from both planned and actual log data. This paper focuses on the comparison between planned and actual processes. The analysis procedure consists of five steps: 1) data pre-processing, 2) definition of the analysis level, 3) clustering of assembly blocks, 4) discovery of a process model per cluster, and 5) comparison between planned and actual processes per cluster. In step 5, the processes are compared from several perspectives: process model, task, process instance, and fitness. For each perspective, we also define comparison factors. In the fitness perspective in particular, cross fitness is proposed and analyzed as the fitness between a process model discovered from one data set and the other data set (for example, the fitness of the planned model to actual data, and the fitness of the actual model to planned data). The effectiveness of the proposed methods was verified in a case study using planned data from the block assembly planning system (BAPS) and actual data generated by the block assembly monitoring system (BAMS) of a top-ranked shipbuilding company in Korea.