• Title/Summary/Keyword: cluster method


Environmental Dependence of High-redshift Galaxies in CFHTLS W2 Field

  • Paek, Insu;Im, Myungshin;Kim, Jae-Woo
    • The Bulletin of The Korean Astronomical Society
    • /
    • v.43 no.2
    • /
    • pp.36.1-36.1
    • /
    • 2018
  • The star formation activity of galaxies, along with their color and morphology, shows significant environmental dependence in the local universe, where galaxies in dense environments tend to be more quiescent and redder. However, many studies show that this environmental dependence does not continue at higher redshifts beyond z~1. The question of how the environmental dependence of galactic properties has developed over time is crucial to understanding cosmic galaxy evolution. By combining data from the Canada-France-Hawaii Telescope Legacy Survey (CFHTLS), the Infrared Medium-Deep Survey (IMS), and other surveys, the photometric redshifts of galaxies in the CFHTLS W2 field were estimated by fitting their spectral energy distributions. The distribution of galaxies was mapped in redshift bins of width 0.05 from 0.6 to 1.4, and the number density was mapped for each bin. The galaxies in high-density regions were grouped into clusters using the friends-of-friends method. The colors of galaxies were analyzed to study their correlation with redshift as well as the environmental difference between field galaxies and cluster member galaxies.
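The friends-of-friends grouping step mentioned in this abstract can be illustrated with a minimal sketch. It assumes 2-D positions and a single fixed linking length; the function name `friends_of_friends`, the parameter `linking_length`, and the sample coordinates are hypothetical and are not taken from the paper, which does not state its linking criteria here.

```python
from math import hypot


def friends_of_friends(points, linking_length):
    """Group 2-D points with a friends-of-friends rule.

    Two points are "friends" if their separation is at most
    linking_length; a group is a connected component of the
    resulting friendship graph (friends of friends are linked).
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per group

    def find(i):
        # Follow parent links to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force pair search; fine for a sketch, O(n^2) in general.
    for i in range(n):
        for j in range(i + 1, n):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if hypot(dx, dy) <= linking_length:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Illustrative positions only (arbitrary units).
sample = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.4, 5.0)]
groups = friends_of_friends(sample, linking_length=0.6)
```

Points 0 and 2 end up in the same group even though they are farther apart than the linking length, because both are friends of point 1. A real analysis would typically use a spatial index instead of the pairwise loop and would link in redshift as well as on the sky.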


Community Discovery in Weighted Networks Based on the Similarity of Common Neighbors

  • Liu, Miaomiao;Guo, Jingfeng;Chen, Jing
    • Journal of Information Processing Systems
    • /
    • v.15 no.5
    • /
    • pp.1055-1067
    • /
    • 2019
  • In view of the deficiencies of existing weighted similarity indexes, a hierarchical clustering method, initialize-expand-merge (IEM), is proposed based on the similarity of common neighbors for community discovery in weighted networks. First, the similarity of a node pair is defined based on the attributes of their common neighbors. Second, the most closely related nodes are quickly clustered according to their similarity to form initial communities, which are then expanded. Finally, communities are merged by maximizing the modularity to optimize the division results. Experiments on many weighted networks verify the effectiveness of the proposed algorithm. The results show that IEM is superior to weighted common neighbor (CN), weighted Adamic-Adar (AA), and weighted resource allocation (RA) when weighted modularity is used as the evaluation index. Moreover, the proposed algorithm achieves a more reasonable community division for weighted networks than the cluster-recluster-merge algorithm (CRMA).
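For orientation, the weighted common neighbor (CN) baseline named in this abstract is commonly defined as the sum, over shared neighbors, of the weights of the two edges that connect each shared neighbor to the node pair. The sketch below shows that common definition only; it is not the IEM similarity, which the abstract does not specify. The adjacency layout (a dict of dicts) and the function name are assumptions made for illustration.

```python
def weighted_common_neighbors(adj, x, y):
    """Weighted common-neighbor score for nodes x and y.

    adj maps each node to a dict {neighbor: edge_weight}.
    For every neighbor z shared by x and y, add w(x, z) + w(y, z).
    """
    common = set(adj[x]) & set(adj[y])
    return sum(adj[x][z] + adj[y][z] for z in common)


# Small undirected weighted graph (illustrative weights only).
graph = {
    "a": {"c": 2.0, "d": 1.0},
    "b": {"c": 3.0, "d": 0.5, "e": 1.0},
    "c": {"a": 2.0, "b": 3.0},
    "d": {"a": 1.0, "b": 0.5},
    "e": {"b": 1.0},
}
score_ab = weighted_common_neighbors(graph, "a", "b")
```

Here nodes `a` and `b` share neighbors `c` and `d`, so the score is (2.0 + 3.0) + (1.0 + 0.5). Node `e` is adjacent only to `b` and does not contribute.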

A Study of Stable Intrusion Detection for MANET (MANET에서 안정된 침입탐지에 관한 연구)

  • Yang, Hwan-Seok;Yang, Jeong-Mo
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.22 no.1
    • /
    • pp.93-98
    • /
    • 2012
  • A MANET, which is composed only of mobile nodes, is regarded as a core technology for building ubiquitous computing environments. However, it lacks security because it has no intermediate infrastructure, so an intrusion detection system that can track malicious attacks is necessary. In this study, clusters were used to achieve stable intrusion detection, and rules for various attacks were defined to accurately detect attacks that resemble ordinary network problems. Experiments confirmed that the proposed method maintains a stable detection rate even as the number of nodes increases.

Enhancing Document Clustering Method using Synonym of Cluster Topic and Similarity (군집 주제의 유의어와 유사도를 이용한 문서군집 향상 방법)

  • Park, Sun;Kim, Chul-Won
    • Annual Conference of KIPS
    • /
    • 2011.04a
    • /
    • pp.1538-1541
    • /
    • 2011
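  • This paper proposes a method that improves document clustering performance by using synonyms of cluster topics and similarity. The proposed method selects the terms of each cluster topic using the semantic features of non-negative matrix factorization, which represents the internal structure of the document clusters well, and expands those terms with WordNet synonyms, which addresses the problem of representing documents as a bag of words. In addition, it applies cosine similarity between the expanded cluster topic terms and the document set so that documents are assigned to the cluster whose topic fits them best, which improves performance. Experimental results show that document clustering with the proposed method performs better than other document clustering methods.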
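The final step described in this abstract, matching documents to cluster topics by cosine similarity, can be sketched as follows. This is a simplified illustration using raw term counts; the topic term lists are hypothetical stand-ins for terms that the paper obtains from non-negative matrix factorization and WordNet expansion, neither of which is reproduced here.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_to_topic(doc_tokens, topic_terms):
    """Return the name of the topic whose term set is most similar
    to the document's bag of words."""
    doc_vec = Counter(doc_tokens)
    scores = {
        name: cosine(doc_vec, Counter(terms))
        for name, terms in topic_terms.items()
    }
    return max(scores, key=scores.get)


# Hypothetical expanded topic terms (already including synonyms).
topics = {
    "astronomy": ["galaxy", "star", "cosmos"],
    "networks": ["node", "network", "graph"],
}
label = assign_to_topic(["star", "galaxy", "galaxy", "node"], topics)
```

Expanding each topic with synonyms enlarges the set of terms that can match a document, which is how the abstract's approach tries to compensate for documents that use different words for the same concept.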

The use of support vector machines in semi-supervised classification

  • Bae, Hyunjoo;Kim, Hyungwoo;Shin, Seung Jun
    • Communications for Statistical Applications and Methods
    • /
    • v.29 no.2
    • /
    • pp.193-202
    • /
    • 2022
  • Semi-supervised learning has gained significant attention in recent applications. In this article, we provide a selective overview of popular semi-supervised methods and then propose a simple but effective algorithm for semi-supervised classification using support vector machines (SVM), one of the most popular binary classifiers in the machine learning community. The idea is simple. First, we apply dimension reduction to the unlabeled observations and cluster them in the reduced space to assign labels. SVM is then applied to the combined set of labeled and unlabeled observations to construct a classification rule. The use of SVM enables us to extend the method to its nonlinear counterpart via the kernel trick. Our numerical experiments under various scenarios demonstrate that the proposed method is promising for semi-supervised classification.

COUNTING OF FLOWERS BASED ON K-MEANS CLUSTERING AND WATERSHED SEGMENTATION

  • PAN ZHAO;BYEONG-CHUN SHIN
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • v.27 no.2
    • /
    • pp.146-159
    • /
    • 2023
  • This paper proposes a hybrid algorithm combining K-means clustering and watershed algorithms for flower segmentation and counting. We use the K-means clustering algorithm to obtain the main colors in a complex background from the cluster centers and then apply a color space transformation to extract the hue, saturation, and value of the flower color. Next, we apply a threshold segmentation technique to segment the flowers precisely and obtain a binary image of them. Based on this, we apply the Euclidean distance transformation to obtain a distance map and use it to find the local maxima of the connected components. Afterward, the proposed algorithm adaptively determines a minimum distance between peaks and applies it to label connected components using watershed segmentation with eight-connectivity. On a dataset of 30 images, the test results reveal that the proposed method is more efficient and precise for counting overlapped flowers, regardless of the degree of overlap, the number of overlaps, and relatively irregular shapes.

Study on Hardware Performance Counter Data Collection Method and Overhead in Cluster System (클러스터 시스템에서 하드웨어 퍼포먼스 카운터 데이터 수집 방법 및 오버헤드 연구)

  • Park, Guenchul;Park, Chan-Yeol;Rho, Seungwoo;Choi, Ji Eun
    • Annual Conference of KIPS
    • /
    • 2020.11a
    • /
    • pp.106-108
    • /
    • 2020
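  • Hardware performance counters, which are available on most modern microprocessors, are widely used for monitoring, analyzing, and optimizing the state of systems and applications. Because they can collect the most basic system information with low overhead, they can be used in many fields. These performance counters can be collected through the perf events built into Linux, but on a cluster system events must be collected with a method different from the one used on a single node. This study examines methods for collecting hardware performance counters on cluster systems and the associated overhead, with the aim of supporting the use of these counters.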

Variance estimation of a double expanded estimator for two-phase sampling

  • Mingue Park
    • Communications for Statistical Applications and Methods
    • /
    • v.30 no.4
    • /
    • pp.403-410
    • /
    • 2023
  • Two-phase sampling, which was first introduced by Neyman (1938), has various applications in different forms. Variance estimation for two-phase sampling has been an important research topic because the conventional variance estimators used in most software do not work for it. In this paper, we consider variance estimation for two-phase sampling in which stratified two-stage cluster sampling designs are used in both phases. By defining a conditionally unbiased estimator of an approximate variance estimator, which is calculable when all elements in the first-phase sample are observed, we propose an explicit form of the variance estimator of the double expanded estimator for a two-phase sample. A small simulation study shows that the proposed variance estimator has negligible bias and small variance. The suggested variance estimator is also applicable to other linear estimators of the population total or mean if appropriate residuals are defined.
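For readers unfamiliar with the term, the double expansion estimator of a population total is usually written in the following textbook form. The notation here is an assumption chosen for illustration and is not taken from the paper: $A_2$ is the second-phase sample, $\pi_{1i}$ is the first-phase inclusion probability of element $i$, and $\pi_{2i|1}$ is its second-phase inclusion probability conditional on the realized first-phase sample.

```latex
\hat{Y}_{\mathrm{DEE}}
  = \sum_{i \in A_2} \frac{y_i}{\pi_{1i}\,\pi_{2i|1}}
  = \sum_{i \in A_2} w_{1i}\, w_{2i|1}\, y_i ,
\qquad
w_{1i} = \pi_{1i}^{-1}, \quad w_{2i|1} = \pi_{2i|1}^{-1} .
```

Each observed value is expanded twice, once by the first-phase weight and once by the conditional second-phase weight, which is where the name comes from.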

Spatiotemporal Impact Assessments of Highway Construction: Autonomous SWAT Modeling

  • Choi, Kunhee;Bae, Junseo
    • International conference on construction engineering and project management
    • /
    • 2015.10a
    • /
    • pp.294-298
    • /
    • 2015
  • In the United States, Construction Work Zone (CWZ) impact assessments are mandated for all federally funded highway infrastructure improvement projects, yet they are regarded as a daunting task for state transportation agencies because of a lack of standardized analytical methods for developing sounder Transportation Management Plans (TMPs). To address these issues, this study aims to create a spatiotemporal modeling framework, dubbed "SWAT" (Spatiotemporal Work zone Assessment for TMPs). The study drew on a total of 43,795 traffic sensor readings collected from heavily trafficked highways in U.S. metropolitan areas. A multilevel-cluster-driven analysis characterized traffic patterns and was verified using a measurement system analysis. An artificial neural network model was created to predict potential 24/7 traffic demand automatically, and its predictive power was statistically validated. It is proposed that the predicted traffic patterns will then be incorporated into a what-if scenario analysis that evaluates the impact of numerous alternative construction plans. This study will yield a breakthrough in automating CWZ impact assessments, offering a first view of a systematic estimation method.


Emerging Machine Learning in Wearable Healthcare Sensors

  • Gandha Satria Adi;Inkyu Park
    • Journal of Sensor Science and Technology
    • /
    • v.32 no.6
    • /
    • pp.378-385
    • /
    • 2023
  • Human biosignals provide essential information for diagnosing diseases such as dementia and Parkinson's disease. Owing to the shortcomings of current clinical assessments, noninvasive solutions are required. Machine learning (ML) on wearable sensor data is a promising method for the real-time monitoring and early detection of abnormalities. ML facilitates disease identification, severity measurement, and remote rehabilitation by providing continuous feedback. In the context of wearable sensor technology, ML involves training on observed data for tasks such as classification and regression with applications in clinical metrics. Although supervised ML presents challenges in clinical settings, unsupervised learning, which focuses on tasks such as cluster identification and anomaly detection, has emerged as a useful alternative. This review examines and discusses a variety of ML algorithms such as Support Vector Machines (SVM), Random Forests (RF), Decision Trees (DT), Neural Networks (NN), and Deep Learning for the analysis of complex clinical data.