• Title/Summary/Keyword: Parallel program

Search Result 584, Processing Time 0.028 seconds

Load Balancing Strategies for Network-based Cluster System

  • Jung, Hoon-Jin;Choung Shik park;Park, Sang-Bang
    • Proceedings of the IEEK Conference
    • /
    • 2000.07a
    • /
    • pp.314-317
    • /
    • 2000
  • Cluster system provides attractive scalability in terms of computation power and memory size. With the advances in high speed computer network technology, cluster systems are becoming increasingly competitive compared to expensive parallel machines. In parallel processing program, each task load is difficult to predict before running the program and each task is interdependent each other in many ways. Load imbalancing induces an obstacle to system performance. Most of researches in load balancing were concerned with distributed system but researches in cluster system are few. In cluster system, the dynamic load balancing algorithm which evaluates each processor's load in runtime is purpose that the load of each node are evenly distributed. But, if communication cost or node complexity becomes high, it is not effective method for all nodes to attend load balancing process. In that circumstances, it is good to reduce the number of node which attend to load balancing process. We have modeled cluster systems and proposed marginal dynamic load balancing algorithms suitable for that circumstances.

  • PDF

Design and Implementation of a Grid System META for Executing CFD Analysis Programs on Distributed Environment (분산 환경에서 CFD 분석 프로그램 수행을 위한 그리드 시스템 META 설계 및 구현)

  • Kang, Kyung-Woo;Woo, Gyun
    • The KIPS Transactions:PartA
    • /
    • v.13A no.6 s.103
    • /
    • pp.533-540
    • /
    • 2006
  • This paper describes the design and implementation of a grid system META (Metacomputing Environment using Test-run of Application) which facilitates the execution of a CFD (Computational Fluid Dynamics) analysis program on distributed environment. The grid system META allows the CFD program developers can access the computing resources distributed over the network just like one computer system. The research issues involved in the grid computing include fault-tolerance, computing resource selection, and user-interface design. In this paper, we exploits an automatic resource selection scheme for executing the parallel SPMD (Single Program Multiple Data) application written in MPI (Message Passing Interface). The proposed resource selection scheme is informed from the network latency time and the elapsed time of the kernel loop attained from test-run. The network latency time highly influences the executional performance when a parallel program is distributed and executed over several systems. The elapsed time of the kernel loop can be used as an estimator of the whole execution time of the CFD Program due to a common characteristic of CFD programs. The kernel loop consumes over 90% of the whole execution time of a CFD program.

Reducing False Alarms in Schizophrenic Parallel Synchronizer Detection for Esterel (Esterel에서 동기장치 중복사용 문제 검출시 과잉 경보 줄이기)

  • Yun, Jeong-Han;Kim, Chul-Joo;Kim, Seong-Gun;Han, Tai-Sook
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.8
    • /
    • pp.647-652
    • /
    • 2010
  • Esterel is an imperative synchronous language well-adapted to control-intensive systems. When an Esterel program is translated to a circuit, the synchronizer of a parallel statement may be executed more than once in a clock; the synchronizer is called schizophrenic. Existing compilers cure the problems of schizophrenic parallel synchronizers using logic duplications. This paper proposes the conditions under which a synchronizer causes no problem in circuits when it is executed more than once in a clock. In addition we design a detection algorithm based on those conditions. Our algorithm detects schizophrenic parallel synchronizers that have to be duplicated in Esterel source codes so that compilers can save the size of synthesized circuits

Study on Real-time Parallel Processing Simulator for Performance Analysis of Missiles (유도탄 성능분석을 위한 실시간 병렬처리 시뮬레이터 연구)

  • Kim Byeong-Moon;Jung Soon-Key
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.11 no.1
    • /
    • pp.84-91
    • /
    • 2005
  • In this paper, we describe the real-time parallel processing simulator developed for the use of performance analysis of rolling missiles. The real-time parallel processing simulator developed here consists of seeker emulator generating infrared image signal on aircraft, real-time computer, host computer, system unit, and actual equipments such as auto-pilot processor and seeker processor. Software is developed from mathematic models, 6 degree-of-freedom module, aerodynamic module which are resided in real-time computer, and graphic user interface program resided in host computer. The real-time computer consists of six TIC-40 processors connected in parallel. The seeker emulator is designed by using analog circuits coupled with mechanical equipments. The system unit provides interface function to match impedance between the components and processes very small electrical signals. Also real launch unit of missiles is interfaced to simulator through system unit. In order to apply the real-time parallel processing simulator to performance analysis equipment of rolling missiles it is essential to perform the performance verification test of simulator.

Design of a Dingle-chip Multiprocessor with On-chip Learning for Large Scale Neural Network Simulation (대규모 신경망 시뮬레이션을 위한 칩상 학습가능한 단일칩 다중 프로세서의 구현)

  • 김종문;송윤선;김명원
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.33B no.2
    • /
    • pp.149-158
    • /
    • 1996
  • In this paper we describe designing and implementing a digital neural chip and a parallel neural machine for simulating large scale neural netsorks. The chip is a single-chip multiprocessor which has four digiral neural processors (DNP-II) of the same architecture. Each DNP-II has program memory and data memory, and the chip operates in MIMD (multi-instruction, multi-data) parallel processor. The DNP-II has the instruction set tailored to neural computation. Which can be sed to effectively simulate various neural network models including on-chip learning. The DNP-II facilitates four-way data-driven communication supporting the extensibility of parallel systems. The parallel neural machine consists of a host computer, processor boards, a buffer board and an interface board. Each processor board consists of 8*8 array of DNP-II(equivalently 2*2 neural chips). Each processor board acn be built including linear array, 2-D mesh and 2-D torus. This flexibility supports efficiency of mapping from neural network models into parallel strucgure. The neural system accomplishes the performance of maximum 40 GCPS(giga connection per second) with 16 processor boards.

  • PDF

Exploiting Implicit Parallelism for Single Loops in Java Programming Language (JAVA 프로그래밍 언어에서 단일루프구조의 무시적 병렬성 검출)

  • Kwon, Oh-Jin
    • Journal of Information Management
    • /
    • v.29 no.3
    • /
    • pp.1-26
    • /
    • 1998
  • The loop is a fundamental for the parallelism exploiting as it has a large portion of execution time for sequential Java program on the parallel machine. This paper proposes the method of exploiting the implicit parallelism through the analysis of data dependence in the existing Java programming language having a single loop structure. The parallel code generation method through the restructuring compiler and the translation method of Java source program into multithread statement, which is supported in the level of the Java programming language, are also proposed here. The performance test of the program translated into the thread statement is conducted using the trip count of loop and the thread count as parameters. The restructuring compiler makes it possible for users to reduce overhead and exploit parallelism efficiently in the Java programming.

  • PDF

Development of Optimum Parameters Sampling Program for Mica Capacitor Design (마이카 커패시터 설계를 위한 최적 파라미터 추출 프로그램 개발)

  • Kim, Jae-Wook;Ryu, Chang-Keun
    • Journal of IKEEE
    • /
    • v.13 no.2
    • /
    • pp.194-199
    • /
    • 2009
  • In this study, ultra high-voltage (170kV AC), reliable 80pF mica capacitors for partial discharge system application were investigated. For capacitors design, Program was developed to sampling of series and parallel parameters. Mica was used as the dielectric of the capacitors. Using the conservative design rule, over 3 individual 50$\mu$m thick mica sheets with a size of 30mm$\times$35mm were used with lead foils to form a parallel capacitor element and 20 mica sheets were interleaved with lead foils to form a series stack of parallel capacitor element to meet the requirements of the capacitors. The dimension of the fabricated 80pF capacitor for 17kV AC were 90mm$\times$90mm. The high-frequency characteristics of the capacitance (C) and dissipation factor (D) of the developed capacitors were measured using a capacitance meter. The developed capacitor exhibited C of 79.5pF, had D of 0.001% over the frequency ranges of 150kHz to 50MHz, had a self-resonant frequency of 65MHz.

  • PDF

Web-Based Distributed Visualization System for Large Scale Geographic Data (대용량 지형 데이터를 위한 웹 기반 분산 가시화 시스템)

  • Hwang, Gyu-Hyun;Yun, Seong-Min;Park, Sang-Hun
    • Journal of Korea Multimedia Society
    • /
    • v.14 no.6
    • /
    • pp.835-848
    • /
    • 2011
  • In this paper, we propose a client server based distributed/parallel system to effectively visualize huge geographic data. The system consists of a web-based client GUI program and a distributed/parallel server program which runs on multiple PC clusters. To make the client program run on mobile devices as well as PCs, the graphical user interface has been designed by using JOGL, the java-based OpenGL graphics library, and sending the information about current available memory space and maximum display resolution the server can minimize the amount of tasks. PC clusters used to play the role of the server access requested geographic data from distributed disks, and properly re-sample them, then send the results back to the client. To minimize the latency happened in repeatedly access the distributed stored geography data, cache data structures have been maintained in both every nodes of the server and the client.

An Implementation of Fault-Tolerant Message Passing Interface on Parallel Computers (병렬 컴퓨터에서의 결함 허용 메시지 전달 인터페이스 구현)

  • Song, Dae-Ki;Lee, Cheol-Hoon
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.6 no.3
    • /
    • pp.319-328
    • /
    • 2000
  • The Message-Passing Interface(MPI) is a standard interface for parallel programming environment, based on that application programs run on the processors of a parallel computer. Processor nodes execute processes consisting the program by passing messages to one another. During executing, however, if a fault occurs on a processor node or a process, this will result an inconsistent state, and consequently, the whole program will have to be stopped. To solve this problem, in this paper, we propose a fault-tolerant message passing interface(FT-MPI) by adding a fault manager module to MPI. The proposed FT-MPI does not need any hardware support, and each application program based on MPI can run on the FT-MPI without any modification. The proposed fault tolerance scheme uses the so-called hot-spare process duplication method, and verified by simulations that application programs run despite of any fault with less than 5% overhead on execution time.

  • PDF

A Processor Allocation Policy using Program Characteristics on Shared Bus (공유 버스상에서 프로그램 특성을 사용한 프로세서 할당 정책)

  • Jeong, In-Beom;Lee, Jun-Won
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.26 no.9
    • /
    • pp.1073-1082
    • /
    • 1999
  • 본 논문에서는 시스템 내의 프로세서들을 효과적으로 사용하기 위한 적응적 프로세서 할당 정책을 제안한다. 프로그램의 병렬성을 향상시키기 위하여 일반적으로 병렬 처리에 사용될 프로세서 개수를 증가시킨다. 그러나 증가된 프로세서들은 그레인 크기에 변화를 일으키며 이는 캐쉬 성능에 영향을 미친다. 특히 대역이 제한된 공유 버스를 사용하는 시스템에서는 프로세서 개수의 증가는 공유 버스에 대한 접근 경쟁을 크게 증가하므로 버스에서 대기하는 시간이 프로세서 증가에 의한 계산 능력 이득을 상쇄시키는 주요한 원인이 되고 있다. 본 논문에서 제안한 적응적 프로세서 할당 정책은 프로그램이 수행되는 도중에 임의의 기간동안 공유버스에 대기중인 프로세서 분포에 관한 정보를 얻는다. 그리고 이 정보를 바탕으로 프로세서 개수를 변경하는 방법이다. 모의 시험에서 적응적 프로세서 할당 정책은 프로그램들의 버스 트래픽 특성에 따른 최적의 적합한 프로세서 개수를 발견함을 보인다. 그리고 적응적 프로세서 할당 정책은 고정된 프로세서 개수를 사용한 가장 좋은 성능보다는 다소 떨어진 성능을 나타내었으나 시스템의 프로세서 활용성을 높여 효과적 시스템 사용에 기여함을 보인다. Abstract In this paper, the adaptive processor allocation policy is suggested to make effective use of processors in system. To enhance the parallelism, the number of processors used in the parallel computing may be increased. However, increasing the number of processors affects the grain size of the parallel program. Therefore, it affects the cache performance. In particular, when the shared bus is employed, since increasing the number of processors can result in a significant amount of contention to achieve the shared-bus, the increased computing power is offset by the bus waiting time due to these contentions. The adaptive processor allocation policy acquires the information about the distribution of waiting processors on shared bus for any execution period of programs. And it changes the number of processors working in parallel processing during the program's run. Our simulation results show that the adaptive processor allocation policy finds the optimum feasible number of processors based on the bus traffic characteristic of programs. Thus, it contributes to effective system utilization, even though it performs slightly less efficiently than using a fixed number of processors with the best performance.