Search | Korea Science

Shared Distributed Big-Data Processing Platform Model: a Study (대용량 분산처리 플랫폼 공유 모델 연구)

Jeong, Hwanjin;Kang, Taeho;Kim, GyuSeok;Shin, YoungHo;Jeong, Jinkyu
- KIISE Transactions on Computing Practices / v.22 no.11 / pp.601-613 / 2016
With the increasing need for big data processing, building a shared big data processing platform is important to minimize time and monetary costs. In shared big data processing, multi-tenancy is a major requirement: each user should be given an isolated personal big data platform, while the underlying hardware is shared among users to increase hardware utilization. In this paper, we explore two well-known shared big data processing platform models. One is to use a native Hadoop cluster, and the other is to build a virtual Hadoop cluster for each user. For each model, we verified whether it is sufficient to support multi-tenancy. We also present a method to complement unsupported multi-tenancy features in the native Hadoop cluster model. Lastly, we built prototype platforms and compared the performance of both models.
https://doi.org/10.5626/KTCP.2016.22.11.601

Load Balancing of Heterogeneous Workstation Cluster based on Relative Load Index (상대적 부하 색인을 기반으로 한 이기종 워크스테이션 클러스터의 부하 균형)

Ji, Byoung-Jun;Lee, Kwang-Mo
- Journal of KIISE: Computing Practices and Letters / v.8 no.2 / pp.183-194 / 2002
A clustering environment with heterogeneous workstations provides cost effectiveness and usability for executing applications in parallel. Load balancing is considered a necessary feature for a cluster of heterogeneous workstations to minimize turnaround time. Previously proposed approaches include static load balancing, which assigns a predetermined weight to the processing capability of each workstation, and dynamic approaches, which execute a benchmark program to obtain the relative processing capability of each workstation. Executing the benchmark program, which has nothing to do with the application being executed, consumes computation time and delays the overall turnaround time. In this paper, we present efficient methods for task distribution and task migration based on the relative load index. We designed and implemented a load balancing system for a clustering environment with heterogeneous workstations. We compared the turnaround times of our methods with those of the round-robin approach and of the load balancing method using a benchmark program. The experimental results show that our methods outperform all the other methods that we compared.

Parallel Genetic Algorithm-Tabu Search Using PC Cluster System for Optimal Reconfiguration of Distribution Systems (배전계통 최적 재구성 문제에 PC 클러스터 시스템을 이용한 병렬 유전 알고리즘-타부 탐색법 구현)

Mun Kyeong-Jun;Song Myoung-Kee;Kim Hyung-Su;Kim Chul-Hong;Park June Ho;Lee Hwa-Seok
- The Transactions of the Korean Institute of Electrical Engineers A / v.53 no.10 / pp.556-564 / 2004
This paper presents an application of the parallel Genetic Algorithm-Tabu Search (GA-TS) algorithm to search for an optimal solution of a reconfiguration in distribution systems. The aim of the reconfiguration of distribution systems is to determine the switch position to be opened for loss minimization in radial distribution systems, which is a discrete optimization problem. This problem has many constraints, and it is very difficult to find the optimal switch position because it has many local minima. This paper develops a parallel GA-TS algorithm for the reconfiguration of distribution systems. In parallel GA-TS, GA operators are executed for each processor. To prevent solutions of low fitness from appearing in the next generation, strings below the average fitness are saved in the tabu list. If the best fitness of the GA is not changed for several generations, TS operators are executed for the upper 10% of the population to enhance the local searching capabilities. With the migration operation, the best string of each node is transferred to the neighboring node after predetermined iterations are executed. For parallel computing, we developed a PC-cluster system consisting of 8 PCs. Each PC employs a 2 GHz Pentium IV CPU and is connected to the others through switch-based fast Ethernet. To show the usefulness of the proposed method, the developed algorithm was tested and compared on a distribution system from the reference paper. From the simulation results, we find that the proposed algorithm is efficient and robust for the reconfiguration of distribution systems in terms of solution quality, speedup, efficiency, and computation time.
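
The two mechanisms named in this abstract, collecting below-average strings for the tabu list and migrating each node's best string to a neighbor, can be illustrated with a minimal sketch. The abstract does not give the topology, the replacement policy, or the encoding, so the ring topology, the replace-the-worst rule, the tuple encoding, and the function names `below_average` and `ring_migrate` are all assumptions made for illustration, not the authors' implementation. The sketch treats higher fitness as better.

```python
def below_average(population, fitness):
    """Return the strings whose fitness is below the population average.

    These are the candidates that would be stored in the tabu list.
    """
    scores = [fitness(s) for s in population]
    avg = sum(scores) / len(scores)
    return [s for s, f in zip(population, scores) if f < avg]


def ring_migrate(islands, fitness):
    """Send each island's best string to its right-hand neighbor.

    Assumption: islands form a ring, and the migrant overwrites the
    neighbor's worst string. Bests are collected before any overwrite,
    so one migration round does not feed into itself.
    """
    bests = [max(pop, key=fitness) for pop in islands]
    for i, best in enumerate(bests):
        neighbor = islands[(i + 1) % len(islands)]
        worst_idx = min(range(len(neighbor)), key=lambda j: fitness(neighbor[j]))
        neighbor[worst_idx] = best
    return islands
```

With a toy fitness such as `sum` over 0/1 tuples, `below_average` picks out the weakest strings and `ring_migrate` moves the strongest string of each island one step around the ring.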

An Outlier Cluster Detection Technique for Real-time Network Intrusion Detection Systems (실시간 네트워크 침입탐지 시스템을 위한 아웃라이어 클러스터 검출 기법)

Chang, Jae-Young;Park, Jong-Myoung;Kim, Han-Joon
- Journal of Internet Computing and Services / v.8 no.6 / pp.43-53 / 2007
Intrusion detection systems (IDS) have recently evolved by combining the signature-based detection approach with the anomaly detection approach. Although signature-based IDS tools have been commonly used by utilizing machine learning algorithms, they only detect network intrusions with already known patterns. Ideal IDS tools should always keep the signature database of the detection system up to date. The system needs to generate signatures to detect new possible attacks while monitoring and analyzing incoming network data. In this paper, we propose a new outlier cluster detection algorithm with a density (or influence) function. Our method assumes that, in the context of network intrusion, an outlier is a kind of cluster with similar instances instead of a single object. Through extensive experiments using the KDD Cup 1999 intrusion detection dataset, we show that the proposed method outperforms the conventional outlier detection method using the Euclidean distance function, especially when attacks occur frequently.

PC Cluster based Parallel Adaptive Evolutionary Algorithm for Service Restoration of Distribution Systems

Mun, Kyeong-Jun;Lee, Hwa-Seok;Park, June-Ho;Kim, Hyung-Su;Hwang, Gi-Hyun
- Journal of Electrical Engineering and Technology / v.1 no.4 / pp.435-447 / 2006
This paper presents an application of the parallel Adaptive Evolutionary Algorithm (AEA) to search for an optimal solution of the service restoration in electric power distribution systems, which is a discrete optimization problem. The main objective of service restoration is, when a fault or overload occurs, to restore as much load as possible by transferring the de-energized load in the out-of-service area via network reconfiguration to the appropriate adjacent feeders at minimum operational cost without violating operating constraints. This problem has many constraints and it is very difficult to find the optimal solution because of its numerous local minima. In this investigation, a parallel AEA was developed for the service restoration of distribution systems. In parallel AEA, a genetic algorithm (GA) and an evolution strategy (ES) are used in an adaptive manner in order to combine the merits of two different evolutionary algorithms: the global search capability of the GA and the local search capability of the ES. In the reproduction procedure, the proportions of the population produced by GA and ES are adaptively modulated according to the fitness. After AEA operations, the best solutions of AEA processors are transferred to the neighboring processors. For parallel computing, a PC cluster system consisting of 8 PCs was developed. Each PC employs a 2 GHz Pentium IV CPU and is connected to the others through switch-based fast Ethernet. To show the validity of the proposed method, the developed algorithm was tested with a practical distribution system in Korea. From the simulation results, the proposed method found the optimal service restoration strategy. The obtained results were the same as those of the explicit exhaustive search method. Also, it is found that the proposed algorithm is efficient and robust for service restoration of distribution systems in terms of solution quality, speedup, efficiency, and computation time.
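
The abstract says only that the GA and ES proportions of the population are modulated according to fitness and does not state the update rule. The sketch below shows one plausible rule under stated assumptions: the GA share moves by a fixed step toward whichever operator produced the higher mean fitness and is clamped so that neither operator disappears. The names `update_ga_share` and `split_population` and the parameters `step`, `lo`, and `hi` are hypothetical.

```python
def update_ga_share(ga_share, ga_fitness, es_fitness, step=0.05, lo=0.1, hi=0.9):
    """Shift the GA share toward whichever operator did better (assumed rule).

    ga_fitness / es_fitness: fitness values of the individuals produced by
    each operator in the current generation (higher is better).
    """
    ga_mean = sum(ga_fitness) / len(ga_fitness)
    es_mean = sum(es_fitness) / len(es_fitness)
    if ga_mean > es_mean:
        ga_share += step
    elif es_mean > ga_mean:
        ga_share -= step
    # Clamp so both operators always keep part of the population.
    return min(hi, max(lo, ga_share))


def split_population(size, ga_share):
    """Return (individuals reproduced by GA, individuals reproduced by ES)."""
    n_ga = round(size * ga_share)
    return n_ga, size - n_ga
```

Calling `update_ga_share` once per generation and passing its result to `split_population` gives the number of individuals each operator reproduces in the next generation.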
https://doi.org/10.5370/JEET.2006.1.4.435

A Dynamic Co-scheduling Scheme for MPI-based Parallel Programs on Linux Clusters (리눅스 클러스터에서 MPI 기반 병렬 프로그램의 동적 동시 스케줄링 기법)

Kim, Hyuk;Rhee, Yun-Seok
- Journal of the Korea Society of Computer and Information / v.13 no.1 / pp.29-35 / 2008
For efficient message passing in parallel programs, the two processes involved, which execute on different nodes, must be scheduled at the same time; this is called 'co-scheduling'. However, each node of a cluster system is built on top of a general-purpose multitasking OS, which autonomously manages local processes. Thus, it is not easy to co-schedule two (or more) processes in such a computing environment. Our work proposes a co-scheduling scheme for MPI-based parallel programs that exploits message exchange information between the two parties. We implement the scheme on a Linux cluster, which requires slight kernel hacking and MPI library modification. Experiments with the NPB parallel suite show that our scheme yields a 33-56% reduction in execution time compared to the typical scheduling case, with especially better performance in more communication-bound applications.

Parallel Genetic Algorithm-Tabu Search Using PC Cluster System for Optimal Reconfiguration of Distribution Systems

Mun Kyeong-Jun;Lee Hwa-Seok;Park June-Ho
- KIEE International Transactions on Power Engineering / v.5A no.2 / pp.116-124 / 2005
This paper presents an application of the parallel Genetic Algorithm-Tabu Search (GA-TS) algorithm to search for an optimal solution of a reconfiguration in distribution systems. The aim of the reconfiguration of distribution systems is to determine the appropriate switch position to be opened for loss minimization in radial distribution systems, which is a discrete optimization problem. This problem has many constraints and it is very difficult to find the optimal switch position because of its numerous local minima. This paper develops a parallel GA-TS algorithm for the reconfiguration of distribution systems. In parallel GA-TS, GA operators are executed for each processor. To prevent solutions of low fitness from appearing in the next generation, strings below the average fitness are saved in the tabu list. If the best fitness of the GA is not changed for several generations, TS operators are executed for the upper 10% of the population to enhance the local searching capabilities. With the migration operation, the best string of each node is transferred to the neighboring node after predetermined iterations are executed. For parallel computing, we developed a PC-cluster system consisting of 8 PCs. Each PC employs a 2 GHz Pentium IV CPU and is connected to the others through switch-based fast Ethernet. To demonstrate the usefulness of the proposed method, the developed algorithm was tested and compared on a distribution system from the reference paper. From the simulation results, we find that the proposed algorithm is efficient and robust for the reconfiguration of distribution systems in terms of solution quality, speedup, efficiency, and computation time.

Distribution System Reconfiguration Using the PC Cluster based Parallel Adaptive Evolutionary Algorithm

Mun Kyeong-Jun;Lee Hwa-Seok;Park June Ho;Hwang Gi-Hyun;Yoon Yoo-Soo
- KIEE International Transactions on Power Engineering / v.5A no.3 / pp.269-279 / 2005
This paper presents an application of the parallel Adaptive Evolutionary Algorithm (AEA) to search for an optimal solution of a reconfiguration in distribution systems. The aim of the reconfiguration is to determine the appropriate switch position to be opened for loss minimization in radial distribution systems, which is a discrete optimization problem. This problem has many constraints and it is very difficult to find the optimal switch position because of its numerous local minima. In this investigation, a parallel AEA was developed for the reconfiguration of the distribution system. In parallel AEA, a genetic algorithm (GA) and an evolution strategy (ES) are used in an adaptive manner in order to combine the merits of two different evolutionary algorithms: the global search capability of GA and the local search capability of ES. In the reproduction procedure, the proportions of the population produced by GA and ES are adaptively modulated according to the fitness. After AEA operations, the best solutions of AEA processors are transferred to the neighboring processors. For parallel computing, a PC-cluster system consisting of 8 PCs was developed. Each PC employs a 2 GHz Pentium IV CPU and is connected to the others through switch-based fast Ethernet. To verify the usefulness of the proposed method, the newly developed algorithm was tested and compared on distribution systems from the reference paper. From the simulation results, it is found that the proposed algorithm is efficient and robust for distribution system reconfiguration in terms of solution quality, speedup, efficiency, and computation time.

Deployment and Performance Analysis of Data Transfer Node Cluster for HPC Environment (HPC 환경을 위한 데이터 전송 노드 클러스터 구축 및 성능분석)

Hong, Wontaek;An, Dosik;Lee, Jaekook;Moon, Jeonghoon;Seok, Woojin
- KIPS Transactions on Computer and Communication Systems / v.9 no.9 / pp.197-206 / 2020
Collaborative research in science applications based on HPC services needs rapid transfers of massive data between research colleagues over wide area networks. To address this requirement, research on enhancing data transfer performance between major superfacilities in the U.S. has been conducted recently. In this paper, we deploy multiple data transfer nodes (DTNs) over high-speed science networks in order to rapidly move large amounts of data in the parallel filesystem of KISTI's Nurion supercomputer, and perform transfer experiments between endpoints with approximately 130 ms round-trip time. We present and compare the transfer throughput results for file sets of different sizes. In addition, we confirm that a DTN cluster with three nodes can provide about 1.8 and 2.7 times higher transfer throughput than a single node in two types of concurrency and parallelism settings.
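
As a quick reading aid for the reported speedups, the standard parallel-efficiency ratio (speedup divided by node count) can be computed directly. The function name `parallel_efficiency` is illustrative, and the calculation uses only the figures quoted in the abstract.

```python
def parallel_efficiency(speedup, nodes):
    """Fraction of ideal linear scaling achieved: speedup / nodes."""
    return speedup / nodes


# Speedups reported for the three-node DTN cluster in the two settings.
for speedup in (1.8, 2.7):
    eff = parallel_efficiency(speedup, 3)
    print(f"{speedup}x on 3 nodes -> {eff:.0%} of linear scaling")
```

Under this measure, the two settings correspond to roughly 60% and 90% of ideal three-node scaling.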
https://doi.org/10.3745/KTCCS.2020.9.9.197

A Study on the Effect of the Name Node and Data Node on the Big Data Processing Performance in a Hadoop Cluster (Hadoop 클러스터에서 네임 노드와 데이터 노드가 빅 데이터처리 성능에 미치는 영향에 관한 연구)

Lee, Younghun;Kim, Yongil
- Smart Media Journal / v.6 no.3 / pp.68-74 / 2017
Big data processing handles various types of data such as files, images, and video to solve problems and provide useful insights. Currently, various platforms are used for big data processing, but many organizations and enterprises use Hadoop because of its simplicity, productivity, scalability, and fault tolerance. In addition, Hadoop can build clusters on various hardware platforms and handles big data by dividing the cluster into a name node (master) and data nodes (slaves). In this paper, we use the fully distributed mode, which is the operation mode used by actual institutions and companies. We constructed a Hadoop cluster using low-power, low-cost single-board computers to facilitate the experiments. To analyze name node performance, we compare the same data processing workload using a single-board computer and a laptop as the name node. To analyze the influence of the number of data nodes, we double the number of data nodes relative to the existing cluster. The effects observed in these experiments were analyzed.

