• Title/Summary/Keyword: Parallel computation

Search Result 594, Processing Time 0.032 seconds

Asynchronous Multiplier with Parallel Array Structure (병렬배열구조를 사용한 비동기 곱셈기)

  • Park, Chan-Ho;Choe, Byeong-Su;Lee, Dong-Ik
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.39 no.5
    • /
    • pp.87-94
    • /
    • 2002
  • In this paper an asynchronous away multiplier with a parallel array structure is introduced. This parallel array structure is used to make the computation time faster with a lower Power consumption. Asymmetric parallel away structure is used to minimize the average computation time in an asynchronous multiplier. Simulation shows that this structure reduces the time needed for computation by 55% as compared to conventional booth encoding array structures and that the multiplier with the proposed away structure shows a reduction of 40% in the computational time with a relatively lower power consumption.

The development of parallel computation method for the fire-driven-flow in the subway station (도시철도역사에서 화재유동에 대한 병렬계산방법연구)

  • Jang, Yong-Jun;Lee, Chang-Hyun;Kim, Hag-Beom;Park, Won-Hee
    • Proceedings of the KSR Conference
    • /
    • 2008.06a
    • /
    • pp.1809-1815
    • /
    • 2008
  • This experiment simulated the fire driven flow of an underground station through parallel processing method. Fire analysis program FDS(Fire Dynamics Simulation), using LES(Large Eddy Simulation), has been used and a 6-node parallel cluster, each node with 3.0Ghz_2set installed, has been used for parallel computation. Simulation model was based on the Kwangju-geumnan subway station. Underground station, and the total time for simulation was set at 600s. First, the whole underground passage was divided to 1-Mesh and 8-Mesh in order to compare the parallel computation of a single CPU and Multi-CPU. With matrix numbers($15{\times}10^6$) more than what a single CPU can handle, fire driven flow from the center of the platform and the subway itself was analyzed. As a result, there seemed to be almost no difference between the single CPU's result and the Multi-CPU's ones. $3{\times}10^6$ grid point one employed to test the computing time with 2CPU and 7CPU computation were computable two times and fire times faster than 1CPU respectively. In this study it was confirmed that CPU could be overcome by using parallel computation.

  • PDF

A study on the genetic algorithms for the scheduling of parallel computation (병렬계산의 스케쥴링에 있어서 유전자알고리즘에 관한 연구)

  • 성기석;박지혁
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 1997.10a
    • /
    • pp.166-169
    • /
    • 1997
  • For parallel processing, the compiler partitions a loaded program into a set of tasks and makes a schedule for the tasks that will minimize parallel processing time for the loaded program. Building an optimal schedule for a given set of partitioned tasks of a program has known to be NP-complete. In this paper we introduce a GA(Genetic Algorithm)-based scheduling method in which a chromosome consists of two parts of a string which decide the number and order of tasks on each processor. An additional computation is used for feasibility constraint in the chromosome. By granularity theory, a partitioned program is categorized into coarse-grain or fine-grain types. There exist good heuristic algorithms for coarse-grain type partitioning. We suggested another GA adaptive to the coarse-grain type partitioning. The infeasibility of chromosome is overcome by the encoding and operators. The number of processors are decided while the GA find the minimum parallel processing time.

  • PDF

A Study on the Efficient m-step Parallel Generalization

  • Kim, Sun-Kyung
    • Proceedings of the Korea Society of Information Technology Applications Conference
    • /
    • 2005.11a
    • /
    • pp.13-16
    • /
    • 2005
  • It would be desirable to have methods for specific problems, which have low communication costs compared to the computation costs, and in specific applications, algorithms need to be developed and mapped onto parallel computer architectures. Main memory access for shared memory system or global communication in message passing system deteriorate the computation speed. In this paper, it is found that the m-step generalization of the block Lanczos method enhances parallel properties by forming m simultaneous search direction vector blocks. QR factorization, which lowers the speed on parallel computers, is not necessary in the m-step block Lanczos method. The m-step method has the minimized synchronization points, which resulted in the minimized global communications compared to the standard methods.

  • PDF

Realtime Tide and Storm-Surge Computations for the Yellow Sea Using the Parallel Finite Element Model (병렬 유한요소 모형을 이용한 황해의 실시간 조석 및 태풍해일 산정)

  • Byun, Sang-Shin;Choi, Byung-Ho;Kim, Kyeong-Ok
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.12 no.1
    • /
    • pp.29-36
    • /
    • 2009
  • Realtime tide and storm-surge computations for the Yellow Sea were conducted using the Parallel Finite Element Model. For these computations a high resolution grid system was constructed with a minimum node interval of loom in Gyeonggi Bay. In the modeling, eight main tidal constituents were analyzed and their results agreed well with the observed data. The realtime tide computation with the eight main tidal constituents and the storm-surge simulation for Typhoon Sarah(1959) were also conducted using parallel computing system of MPI-based LINUX clusters. The result showed a good performance in simulating Typhoon Sarah and reducing the computation time.

Parallel Computations for Boundary Element Analysis of Magnetostatic Fields (정자계의 경계요소 해석을 위한 병렬계산)

  • Kim, Keun-Hwan;Choi, Kyung;Jung, Hyun-Kyo;Lee, Ki-Sik;Hahn, Song-Yop
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • v.41 no.5
    • /
    • pp.468-473
    • /
    • 1992
  • A boundary element analysis using parallel algorithm on transputers is described for three-dimensional magnetostatic field computations. The parallel algorithm are applied to assembling the system matrix and solving the matrix equation. Through the numerical results, it is shown that the computation time is ideally inverse proportional to the number of transputers, and the computational efficiency increases as the size of the system matrix becomes large. The easiness and simplicity in configuring the system hardware and making programs and computation times are compared in three kinds of topologies.

  • PDF

Performance Enhancement of Parallel Prime Sieving with Hybrid Programming and Pipeline Scheduling (혼합형 병렬처리 및 파이프라이닝을 활용한 소수 연산 알고리즘)

  • Ryu, Seung-yo;Kim, Dongseung
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.4 no.10
    • /
    • pp.337-342
    • /
    • 2015
  • We develop a new parallelization method for Sieve of Eratosthenes algorithm, which enhances both computation speed and energy efficiency. A pipeline scheduling is included for better load balancing after proper workload partitioning. They run on multicore CPUs with hybrid parallel programming model which uses both message passing and multithreading computation. Experimental results performed on both small scale clusters and a PC with a mobile processor show significant improvement in execution time and energy consumptions.

An Iterative Algorithm for the Bottom Up Computation of the Data Cube using MapReduce (맵리듀스를 이용한 데이터 큐브의 상향식 계산을 위한 반복적 알고리즘)

  • Lee, Suan;Jo, Sunhwa;Kim, Jinho
    • Journal of Information Technology and Architecture
    • /
    • v.9 no.4
    • /
    • pp.455-464
    • /
    • 2012
  • Due to the recent data explosion, methods which can meet the requirement of large data analysis has been studying. This paper proposes MRIterativeBUC algorithm which enables efficient computation of large data cube by distributed parallel processing with MapReduce framework. MRIterativeBUC algorithm is developed for efficient iterative operation of the BUC method with MapReduce, and overcomes the limitations about the storage size and processing ability caused by large data cube computation. It employs the idea from the iceberg cube which computes only the interesting aspect of analysts and the distributed parallel process of cube computation by partitioning and sorting. Thus, it reduces data emission so that it can reduce network overload, processing amount on each node, and eventually the cube computation cost. The bottom-up cube computation and iterative algorithm using MapReduce, proposed in this paper, can be expanded in various way, and will make full use of many applications.

Parallelization of A Load balancing Algorithm for Parallel Computations (병렬계산을 위한 부하분산 알고리즘의 병렬화)

  • In-Jae Hwang
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.5 no.3
    • /
    • pp.236-242
    • /
    • 2004
  • In this paper, we propose an approach to parallelize a load balancing algorithm that was shown to be very effective in distributing workload for parallel computations. Load balancing algorithms are required in executing parallel program efficiently As a parallel computation model, we used dynamically growing tree structure that can be found in many application problems. The load balancing algorithm tries to balance the workload among processors while keeping the communication cost under certain limit. We show how the load balancing algorithm is effectively parallelized on mesh and hypercube interconnection networks, and analyzed the time complexity for each case to show that parallel algorithm actually reduced the various overhead.

  • PDF