• Title/Summary/Keyword: Cluster Computing

Semantic Cloud Resource Recommendation Using Cluster Analysis in Hybrid Cloud Computing Environment (군집분석을 이용한 하이브리드 클라우드 컴퓨팅 환경에서의 시맨틱 클라우드 자원 추천 서비스 기법)

  • Ahn, Younsun;Kim, Yoonhee
    • KIPS Transactions on Computer and Communication Systems / v.4 no.9 / pp.283-288 / 2015
  • Scientists benefit from on-demand, scalable resource provisioning and a variety of computing environments when they run their applications on cloud computing resources. However, cloud computing service providers offer resources according to their own policies, and resource specifications are described differently from vendor to vendor. As a result, it is difficult to find cloud resources that suit the characteristics of an application. Because they have a limited understanding of resource availability, scientists tend to choose resources used in previous experiments, or resources with more performance than needed, without considering the characteristics of their applications. The need for a standardized notation for diverse cloud resources, free of the complicated specifications given by providers, has led to active studies on intercloud to support interoperability in hybrid cloud environments. However, intercloud projects are limited because they lack expertise in application characteristics. We define an intercloud resource classification and propose semantic resource recommendation based on statistical analysis to provide semantic cloud resource services for an application in hybrid cloud computing environments. By using cluster analysis to choose semantically similar cloud resources while considering application characteristics, the scheme shows benefits in resource availability and cost-efficiency.

A Cloud-based Big Data System for Performance Comparison of Edge Computing (Edge Computing 성능 비교를 위한 Cloud 기반 빅데이터 시스템 구축 방안)

  • Lim, Hwan-Hee;Lee, Tae-Ho;Lee, Byung-Jun;Kim, Kyung-Tae;Youn, Hee-Yong
    • Proceedings of the Korean Society of Computer Information Conference / 2019.01a / pp.5-6 / 2019
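  • Evaluating and verifying the performance of algorithms that analyze data generated in edge computing is essential. Such evaluation and verification require comparable data. In this paper, we build a cloud-based big data analysis system to evaluate both the analysis results for data generated in edge computing and the performance of the computing resources. The edge computing comparative-analysis big data system builds, in the cloud, an environment similar to the one in which edge computing runs on actual IoT nodes, and loads the edge computing algorithm under study into a Data Analysis Cluster Container to carry out the analysis. The goal of this paper is to derive improvements by comparing the analysis results and computing resource utilization data with existing IoT node edge computing data.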

Efficient Processing of Huge Airborne Laser Scanned Data Utilizing Parallel Computing and Virtual Grid (병렬처리와 가상격자를 이용한 대용량 항공 레이저 스캔 자료의 효율적인 처리)

  • Han, Soo-Hee;Heo, Joon;Lkhagva, Enkhbaatar
    • Journal of Korea Spatial Information System Society / v.10 no.4 / pp.21-26 / 2008
  • A method for processing huge airborne laser scanned data using parallel computing and a virtual grid is proposed, and the method is tested by generating a raster DSM (Digital Surface Model) with IDW (Inverse Distance Weighting). Parallelism is used for fast interpolation of huge point data, and the virtual grid is adopted to make searching irregularly distributed point data more efficient. Processing time was measured on a cluster consisting of one master node and six slave nodes; the method achieved efficiency close to 1 and showed load scalability. In addition, large data that could not be processed on a single system was processed on the cluster system.
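
The combination of IDW and a virtual grid described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `cell_size`, `power`, and `ring` parameters, and the choice to search only the cells adjacent to the query are assumptions made for illustration.

```python
import math
from collections import defaultdict


def build_virtual_grid(points, cell_size):
    """Bucket (x, y, z) points by the grid cell that contains them."""
    grid = defaultdict(list)
    for x, y, z in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        grid[cell].append((x, y, z))
    return grid


def idw_at(grid, cell_size, qx, qy, power=2.0, ring=1):
    """IDW estimate at (qx, qy) using only points in the surrounding cells.

    Returns None when no point lies within `ring` cells of the query.
    """
    cx, cy = math.floor(qx / cell_size), math.floor(qy / cell_size)
    num = den = 0.0
    for i in range(cx - ring, cx + ring + 1):
        for j in range(cy - ring, cy + ring + 1):
            for x, y, z in grid.get((i, j), ()):
                d = math.hypot(x - qx, y - qy)
                if d == 0.0:
                    return z  # query coincides with a sample point
                w = 1.0 / d ** power  # closer points get larger weights
                num += w * z
                den += w
    return num / den if den else None
```

Because each raster cell is interpolated independently of the others, rows or tiles of the output raster could be assigned to different worker nodes; how the paper actually divides the work between the master and slave nodes is not stated in the abstract.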

A Cluster Validity Index Using Overlap and Separation Measures Between Fuzzy Clusters (클러스터간 중첩성과 분리성을 이용한 퍼지 분할의 평가 기법)

  • Kim, Dae-Won;Lee, Kwang-H.
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.4 / pp.455-460 / 2003
  • A new cluster validity index is proposed that determines the optimal partition and optimal number of clusters for fuzzy partitions obtained from the fuzzy c-means algorithm. The proposed validity index exploits an overlap measure and a separation measure between clusters. The overlap measure is obtained by computing an inter-cluster overlap. The separation measure is obtained by computing a distance between fuzzy clusters. A good fuzzy partition is expected to have a low degree of overlap and a large separation distance. Tests of the proposed index and nine previously formulated indexes on well-known data sets showed that the proposed index is more effective and reliable than the others.
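
The fuzzy partitions that such an index evaluates come from the fuzzy c-means algorithm. As background only, the standard membership update can be sketched in Python as follows. The function name `fcm_memberships` and the default fuzzifier `m=2.0` are assumptions for illustration, and the paper's own overlap and separation measures are not reproduced here.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u[i][j], the membership of point j in cluster i.

    Standard fuzzy c-means update:
        u_ij = 1 / sum_k (d_ij / d_kj) ** (2 / (m - 1))
    where d_ij is the distance from point j to center i.
    """
    exponent = 2.0 / (m - 1.0)
    u = [[0.0] * len(points) for _ in centers]
    for j, p in enumerate(points):
        dists = [math.dist(p, c) for c in centers]
        hits = [i for i, d in enumerate(dists) if d == 0.0]
        if hits:
            # Point lies exactly on one or more centers: share membership
            # among those centers and give the others zero.
            for i in hits:
                u[i][j] = 1.0 / len(hits)
            continue
        for i, d_i in enumerate(dists):
            u[i][j] = 1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
    return u
```

Each column of `u` sums to 1. A point whose memberships are split nearly evenly between two clusters is the kind of case in which clusters overlap.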

Application of Parallel PSO Algorithm based on PC Cluster System for Solving Optimal Power Flow Problem (PC 클러스터 시스템 기반 병렬 PSO 알고리즘의 최적조류계산 적용)

  • Kim, Jong-Yul;Moon, Kyoung-Jun;Lee, Haw-Seok;Park, June-Ho
    • The Transactions of The Korean Institute of Electrical Engineers / v.56 no.10 / pp.1699-1708 / 2007
  • The optimal power flow (OPF) problem was introduced by Carpentier in 1962 as a network-constrained economic dispatch problem. Since then, the OPF problem has been intensively studied and widely used in power system operation and planning. OPF is becoming increasingly important in the deregulated power pool environment, and there is an urgent need for faster solution techniques for on-line application. To solve the OPF problem, many heuristic optimization methods have been developed, such as Genetic Algorithm (GA), Evolutionary Programming (EP), Evolution Strategies (ES), and Particle Swarm Optimization (PSO). In particular, PSO is a newly proposed population-based heuristic optimization algorithm inspired by the social behavior of animals. However, population-based heuristic optimization methods require long computing times to find the optimal point. This shortcoming is overcome by a straightforward parallel processing of the PSO algorithm. The developed parallel PSO algorithm is implemented on a PC cluster system with 6 Intel Pentium IV 2GHz processors. The proposed approach has been tested on the IEEE 30-bus system. The results showed that parallel processing reduces the computing time of the PSO algorithm without losing solution quality.
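
To make the PSO update rule concrete, here is a minimal single-process Python sketch that minimizes a generic objective. It is not the authors' OPF formulation or their cluster code: the inertia weight `w`, the acceleration constants `c1` and `c2`, the swarm size, and the test objective are all assumed values chosen for illustration.

```python
import random


def pso(objective, dim, bounds, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        # Move every particle toward its own best and the swarm's best.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
        # Fitness evaluations are independent of each other, so this is the
        # step a cluster could distribute (a batch of particles per node).
        vals = [objective(p) for p in pos]
        for i, val in enumerate(vals):
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Usage: minimize the sphere function in two dimensions.
best, best_val = pso(lambda p: sum(x * x for x in p), dim=2, bounds=(-5.0, 5.0))
```

In an OPF setting each fitness evaluation would involve a power flow computation, which is why distributing the evaluation step is the natural target for parallelization. The abstract does not say how the authors partition the swarm across the six processors.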

A Clustering Approach for Feature Selection in Microarray Data Classification Using Random Forest

  • Aydadenta, Husna;Adiwijaya, Adiwijaya
    • Journal of Information Processing Systems / v.14 no.5 / pp.1167-1175 / 2018
  • Microarray data plays an essential role in diagnosing and detecting cancer. Microarray analysis allows the examination of levels of gene expression in specific cell samples, where thousands of genes can be analyzed simultaneously. However, microarray data have very few samples and high dimensionality. Therefore, to classify microarray data, a dimensional reduction process is required. Dimensional reduction can eliminate redundancy in the data, so that only features highly correlated with their class are used in classification. There are two types of dimensional reduction, namely feature selection and feature extraction. In this paper, we used the k-means algorithm as the clustering approach for feature selection. The proposed approach groups features that have the same characteristics into one cluster, so that redundancy in microarray data is removed. The features in each cluster are ranked using the Relief algorithm, and the best-scoring feature of each cluster is obtained. The best features of all clusters are selected and used as features in the classification process, which is performed with the Random Forest algorithm. In simulation, the proposed approach achieved accuracies of 85.87%, 98.9%, and 89% on the Colon, Lung Cancer, and Prostate Tumor datasets, respectively. The accuracy of the proposed approach is therefore higher than that of Random Forest without clustering.
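
The selection step in this abstract (keep the best-scoring feature of each cluster) can be sketched in a few lines of Python. The sketch assumes the cluster labels (for example from k-means over features) and the per-feature scores (for example from Relief) have already been computed; the function name and the sample values are hypothetical.

```python
def select_best_per_cluster(cluster_labels, scores):
    """Return the index of the highest-scoring feature in each cluster.

    cluster_labels[i] is the cluster assigned to feature i, and
    scores[i] is its relevance score (higher is better).
    """
    best = {}
    for idx, (label, score) in enumerate(zip(cluster_labels, scores)):
        if label not in best or score > scores[best[label]]:
            best[label] = idx
    return sorted(best.values())


# Hypothetical input: five features grouped into three clusters.
labels = [0, 0, 1, 1, 2]
relief_scores = [0.1, 0.4, 0.3, 0.2, 0.9]
selected = select_best_per_cluster(labels, relief_scores)  # [1, 2, 4]
```

The selected column indices would then be used to subset the expression matrix before training the classifier, so the number of clusters directly sets the number of features kept.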

A Methodology for Performance Modeling and Prediction of Large-Scale Cluster Servers (대규모 클러스터 서버의 성능 모델링 및 예측 방법론)

  • Jang, Hye-Churn;Jin, Hyun-Wook;Kim, Hag-Young
    • Journal of KIISE:Computing Practices and Letters / v.16 no.11 / pp.1041-1045 / 2010
  • Clusters can provide scalable and flexible architectures for parallel computing servers and data centers. Predicting their performance has been a very challenging issue. Existing performance measurement methodologies can only measure the performance of servers that have already been built. They therefore cannot predict the overall system performance in advance, either when designing the system in the initial phase or when adding nodes for more capacity. A performance modeling and prediction methodology for large-scale clusters is therefore needed. In this paper, we suggest a methodology to predict the performance of large-scale clusters, which consists of measurement, modeling, and prediction steps. We apply the methodology to a real cluster server and show its usefulness.

Implementation of AIoT Edge Cluster System via Distributed Deep Learning Pipeline

  • Jeon, Sung-Ho;Lee, Cheol-Gyu;Lee, Jae-Deok;Kim, Bo-Seok;Kim, Joo-Man
    • International journal of advanced smart convergence / v.10 no.4 / pp.278-288 / 2021
  • Recently, IoT systems have become cloud-based, so that large amounts of data continuously collected from sensor nodes are processed in a data server through the cloud. However, in the centralized configuration of large-scale cloud computing, computational processing must be performed at a physical location where data collection and processing take place, and the need for edge computers to reduce the network load of the cloud system is gradually growing. In this paper, a cluster system consisting of 6 inexpensive Raspberry Pi boards was constructed to perform fast data processing. We propose a "Kubernetes cluster system (KCS)" for large-scale data collection and analysis by means of model distribution and a data pipeline. To evaluate performance, an ensemble deep learning model was built, and the accuracy, processing performance, and processing time of the proposed KCS and of model distribution were compared and analyzed. As a result, the ensemble model was excellent in accuracy, but the KCS implemented as a data pipeline proved to be superior in processing speed.

Multi-communication layered HPL model and its application to GPU clusters

  • Kim, Young Woo;Oh, Myeong-Hoon;Park, Chan Yeol
    • ETRI Journal / v.43 no.3 / pp.524-537 / 2021
  • High-performance Linpack (HPL) is among the most popular benchmarks for evaluating the capabilities of computing systems and has been used as a standard to compare the performance of computing systems since the early 1980s. In the initial system-design stage, it is critical to estimate the capabilities of a system quickly and accurately. However, the original HPL mathematical model based on a single core and single communication layer yields varying accuracy for modern processors and accelerators comprising large numbers of cores. To reduce the performance-estimation gap between the HPL model and an actual system, we propose a mathematical model for multi-communication layered HPL. The effectiveness of the proposed model is evaluated by applying it to a GPU cluster and well-known systems. The results reveal performance differences of 1.1% on a single GPU. The GPU cluster and well-known large system show 5.5% and 4.1% differences on average, respectively. Compared to the original HPL model, the proposed multi-communication layered HPL model provides performance estimates within a few seconds and a smaller error range from the processor/accelerator level to the large system level.

A Container Orchestration System for Process Workloads

  • Jong-Sub Lee;Seok-Jae Moon
    • International Journal of Internet, Broadcasting and Communication / v.15 no.4 / pp.270-278 / 2023
  • We propose a container orchestration system for process workloads that combines the potential of big data and machine learning technologies to integrate enterprise process-centric workloads. The proposed system analyzes big data generated from industrial automation to identify hidden patterns and build a machine learning prediction model. For each machine learning case, training data is loaded into a data store and preprocessed for model training. In the next step, the training data is used to select and apply an appropriate model, and the model is then evaluated using test data. This step is called model construction and can be performed in a deployment framework. Additionally, a visual hierarchy is constructed to display prediction results and facilitate big data analysis. For evaluation and analysis, the cluster required for parallel computation of PCA was built by creating multiple virtual machines in a big data cluster. The proposed system is modeled as layers of individual components that can be connected together. The advantage of this design is that components can be added, replaced, or reused without affecting the rest of the system.