• Title/Abstract/Keywords: parallel tasks

185 search results (processing time: 0.027 s)

Task Creation and Assignment based on Object Caching for Parallel Spatial Join

  • 서영덕;김진덕;홍봉희
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 26, No. 10 / pp. 1178-1178 / 1999
  • The execution time of a spatial join increases exponentially with the number of spatial objects. Recently, there have been many attempts to improve the performance of spatial joins by using parallel processing schemes. When a parallel spatial join is executed on a parallel machine with a shared-disk architecture, the disk bottleneck is worse than in a sequential spatial join. This paper presents task creation and assignment algorithms that reduce the disk bottleneck caused by simultaneous access to the shared disk and minimize message passing between processors. It proposes object caching, which is a higher level of abstraction than page caching, and uses it to create and assign tasks according to temporal and spatial locality so as to minimize disk access time. Object caching yields a performance improvement of 50%. Task creation and assignment using locality yield gains of 30% and 20%, respectively. Overall performance evaluation of the proposed algorithms shows a 7.2-fold speedup over sequential execution of spatial joins.

Correct Implementation of Sub-warp Parallel Prefix Operations based on GPU Hardware Architecture

  • 박태정
    • 디지털콘텐츠학회 논문지 / Vol. 18, No. 3 / pp. 613-619 / 2017
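  • This paper presents CUDA (Compute Unified Device Architecture) code that partitions large-scale data into local segments shorter than 32 elements and outputs correct GPU parallel prefix results within each local segment. Mark Harris and Michael Garland have already published CUDA code for this purpose; this paper shows that the existing code produces incorrect results when the local segment length is less than 32, discusses the cause, and then proposes code that outputs correct results. The local-segment parallel prefix operation addressed here can serve as a basic building block for a variety of large-scale parallel algorithms, including k-nearest neighbor search.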
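To make the expected result of a per-segment prefix operation concrete, the following is a minimal CPU sketch in Python, not the paper's CUDA code. It assumes an inclusive prefix sum over fixed-length segments; the function names `segmented_inclusive_scan` and `stepwise_scan` are hypothetical. The second function emulates a lock-step doubling scan in which each element only reads neighbors inside its own segment, which is one common way such scans are structured; it is not claimed to be the approach of either the earlier code or the paper.

```python
from typing import List


def segmented_inclusive_scan(values: List[int], segment_len: int) -> List[int]:
    """Sequential reference: inclusive prefix sum restarted at each segment."""
    if not 0 < segment_len < 32:
        raise ValueError("segment_len must be between 1 and 31")
    out = []
    running = 0
    for i, v in enumerate(values):
        if i % segment_len == 0:  # a new local segment starts here
            running = 0
        running += v
        out.append(running)
    return out


def stepwise_scan(values: List[int], segment_len: int) -> List[int]:
    """Emulates a lock-step doubling scan restricted to each segment."""
    if not 0 < segment_len < 32:
        raise ValueError("segment_len must be between 1 and 31")
    cur = list(values)
    offset = 1
    while offset < segment_len:
        prev = cur[:]  # snapshot: all reads happen before any write
        for i in range(len(cur)):
            # Only combine with a neighbor that lies in the same segment.
            if i % segment_len >= offset:
                cur[i] = prev[i] + prev[i - offset]
        offset *= 2
    return cur
```

For example, with `values = [1, 2, 3, 4, 5, 6, 7]` and `segment_len = 5`, both functions return `[1, 3, 6, 10, 15, 6, 13]`.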

An Improved Hybrid Approach to Parallel Connected Component Labeling using CUDA

  • Soh, Young-Sung;Ashraf, Hadi;Kim, In-Taek
    • 융합신호처리학회논문지 / Vol. 16, No. 1 / pp. 1-8 / 2015
  • In many image processing tasks, connected component labeling (CCL) is performed to extract regions of interest. CCL was usually done sequentially when image resolution was relatively low and there were few input channels. As image resolution rises to HD or Full HD and the number of input channels increases, sequential CCL becomes too time-consuming for real-time applications. To cope with this situation, parallel CCL frameworks were introduced in which multiple cores are used simultaneously. Several parallel CCL methods have been proposed in the literature. Among them are the NSZ label equivalence (NSZ-LE) method [1], the modified 8 directional label selection (M8DLS) method [2], and the HYBRID1 method [3]. Soh [3] showed that HYBRID1 outperforms NSZ-LE and M8DLS, and argued that HYBRID1 is by far the best. In this paper we propose an improved hybrid parallel CCL algorithm, termed HYBRID2, that hybridizes M8DLS with label backtracking (LB), and show that it runs around 20% faster than HYBRID1 for various kinds of images.

Dynamic Load Balancing Algorithm using Execution Time Prediction on Cluster Systems

  • Yoon, Wan-Oh;Jung, Jin-Ha;Park, Sang-Bang
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2002년도 ITC-CSCC -1 / pp. 176-179 / 2002
  • In recent years, an increasing amount of computer network research has focused on cluster systems in order to achieve higher performance at lower cost. Load imbalance is the major defect that reduces the performance of a cluster system running parallel programs in SPMD (Single Program Multiple Data) form. Load imbalance is also a problem for MPPs (massively parallel processors) and distributed systems. A cluster system is a loosely coupled distributed system and therefore has higher communication overhead than an MPP. Dynamic load balancing can solve the load imbalance problem of a cluster system and reduce its communication cost. The cluster systems considered in this paper consist of P heterogeneous nodes connected by a switch-based network. The master node predicts the average execution time of tasks on each slave node based on information from that slave node. The master node then redistributes the remaining tasks to each node, considering the predicted execution time and the communication overhead of task migration. The proposed dynamic load balancing uses execution time prediction to optimize the task redistribution. Various performance factors, such as the number of nodes, the number of tasks, and the communication cost, are considered to improve the performance of the cluster system. Simulation results verify the effectiveness of the proposed dynamic load balancing algorithm.
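The redistribution step described in the abstract above can be illustrated with a minimal Python sketch. It assumes the master predicts each node's average task time as the mean of reported task times and then splits the remaining tasks in proportion to predicted speed. The names `predict_avg_time` and `redistribute` are hypothetical, and the communication overhead of task migration, which the paper does consider, is deliberately left out of this sketch.

```python
from typing import Dict, List


def predict_avg_time(history: List[float]) -> float:
    """Predicted time per task: mean of the times a node has reported so far."""
    if not history:
        raise ValueError("history must not be empty")
    return sum(history) / len(history)


def redistribute(remaining: int, avg_time: Dict[str, float]) -> Dict[str, int]:
    """Split remaining tasks in proportion to each node's predicted speed."""
    speeds = {node: 1.0 / t for node, t in avg_time.items()}
    total_speed = sum(speeds.values())
    ideal = {node: remaining * s / total_speed for node, s in speeds.items()}
    shares = {node: int(x) for node, x in ideal.items()}  # round down first
    leftover = remaining - sum(shares.values())
    # Hand the leftover tasks to the nodes with the largest fractional parts.
    by_fraction = sorted(ideal, key=lambda n: ideal[n] - shares[n], reverse=True)
    for node in by_fraction[:leftover]:
        shares[node] += 1
    return shares
```

With predicted times of 1.0, 2.0, and 4.0 seconds per task on three nodes and 70 remaining tasks, the split is 40, 20, and 10 tasks, so all nodes are expected to finish at about the same time.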

Implementation of Scheduling Strategies on PC Clusters

  • 강오한;송희헌;정중수
    • 정보처리학회논문지A / Vol. 11A, No. 7 / pp. 521-528 / 2004
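  • This paper introduces a new task scheduling technique suited to bus-based cluster architectures and analyzes its performance through an implementation on a PC cluster. The implemented technique takes a task graph as input and schedules it onto the PC cluster, using a heuristic to selectively duplicate tasks and thereby shorten parallel execution time. The PC cluster consists of six PCs running Linux, connected by Gigabit Ethernet. TCP/IP is used for communication, and MPI, a standardized parallel programming tool, is used for message exchange. Experiments confirm that the scheduling technique introduced in this paper outperforms the compared techniques in terms of parallel execution time.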

Multi-Objective Pareto Optimization of Parallel Synthesis of Embedded Computer Systems

  • Drabowski, Mieczyslaw
    • International Journal of Computer Science & Network Security / Vol. 21, No. 3 / pp. 304-310 / 2021
  • The paper presents problems of optimizing the synthesis of embedded systems, in particular Pareto optimization. The system model, intended for design at a high level of abstraction, is based on the classic approach known from the theory of task scheduling, but it is significantly extended by, among other things, the characteristics of tasks and resources as well as additional optimality criteria concerning system structure and operation. The metaheuristic algorithm operating according to this model introduces a new approach to system synthesis in which task scheduling and resource partitioning are performed in parallel. An algorithm based on a genetic approach with simulated annealing and Boltzmann tournaments avoids local minima and generates optimized solutions. Such a synthesis is based on task scheduling, resource identification and partitioning, allocation of tasks and resources, and ultimately on optimization of the designed system according to criteria regarding cost of implementation, execution speed of processes, and energy consumption by the system during operation. This paper presents examples and results for multi-criteria optimization, based on calculations that identify non-dominated solutions and indicate a subset of Pareto solutions in the space of all solutions.
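The notion of non-dominated solutions mentioned above can be sketched in a few lines of Python. This is only an illustration of Pareto filtering, not the paper's metaheuristic. It assumes each candidate design is a hypothetical `(cost, time, energy)` tuple in which every criterion is to be minimized; `dominates` and `pareto_front` are illustrative names.

```python
from typing import List, Tuple

# Assumed encoding: (cost, execution time, energy), all to be minimized.
Solution = Tuple[float, float, float]


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep only the solutions that no other solution dominates."""
    return [
        s for s in solutions
        if not any(dominates(other, s) for other in solutions)
    ]
```

For the candidates `(1, 5, 3)`, `(2, 4, 3)`, `(2, 6, 4)`, and `(3, 3, 5)`, only `(2, 6, 4)` is removed, because `(1, 5, 3)` is at least as good on every criterion and strictly better on at least one.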

A Genetic Algorithm for Scheduling Sequence-Dependant Jobs on Parallel Identical Machines

  • 이문규;이승주
    • 대한산업공학회지 / Vol. 25, No. 3 / pp. 360-368 / 1999
  • We consider the problem of scheduling n jobs with sequence-dependent processing times on a set of parallel-identical machines. The processing time of each job consists of a pure processing time and a sequence-dependent setup time. The objective is to maximize the total remaining machine available time which can be used for other tasks. For the problem, a hybrid genetic algorithm is proposed. The algorithm combines a genetic algorithm for global search and a heuristic for local optimization to improve the speed of evolution convergence. The genetic operators are developed such that parallel machines can be handled in an efficient and effective way. For local optimization, the adjacent pairwise interchange method is used. The proposed hybrid genetic algorithm is compared with two heuristics, the nearest setup time method and the maximum penalty method. Computational results for a series of randomly generated problems demonstrate that the proposed algorithm outperforms the two heuristics.
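The adjacent pairwise interchange step named in the abstract above can be sketched as follows. This is a simplified illustration in Python under stated assumptions: a single machine rather than parallel machines, and total busy time (processing plus sequence-dependent setup) as the quantity to reduce. The function names and the `proc`/`setup` data layout are hypothetical; the paper's genetic algorithm is not reproduced.

```python
from typing import List, Sequence


def busy_time(seq: Sequence[int], proc: Sequence[float],
              setup: Sequence[Sequence[float]]) -> float:
    """Total processing time plus setup time between consecutive jobs."""
    total = sum(proc[j] for j in seq)
    total += sum(setup[a][b] for a, b in zip(seq, seq[1:]))
    return total


def adjacent_pairwise_interchange(seq: Sequence[int], proc: Sequence[float],
                                  setup: Sequence[Sequence[float]]) -> List[int]:
    """Swap neighboring jobs while a swap strictly reduces busy time."""
    best = list(seq)
    best_cost = busy_time(best, proc, setup)
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            cand = best[:]
            cand[i], cand[i + 1] = cand[i + 1], cand[i]
            cost = busy_time(cand, proc, setup)
            if cost < best_cost:  # accept only strict improvements
                best, best_cost = cand, cost
                improved = True
    return best
```

Because only strictly improving swaps are accepted, the loop terminates, but the result is a local optimum with respect to adjacent swaps and need not be globally optimal.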

A Synchronous/Asynchronous Hybrid Parallel Power Iteration for Large Eigenvalue Problems by the MPMD Methodology

  • 박필성
    • 정보처리학회논문지A / Vol. 11A, No. 1 / pp. 67-74 / 2004
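  • Most parallel algorithms are synchronous: for the computation to be correct, fast processors that finish their work early must wait at synchronization points for slower processors, so when processor performance differs, the overall computation speed is determined by the slowest processor. For the problem of finding the dominant eigenpair of a large eigenvalue problem, this paper devises a hybrid synchronous/asynchronous algorithm that reduces the idle time of fast processors and can thereby accelerate convergence, and implements it using the MPMD programming methodology.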
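For orientation, the basic power iteration that the hybrid method builds on can be sketched as follows. This is a plain sequential baseline in Python, not the paper's synchronous/asynchronous MPMD algorithm. The function names, the all-ones starting vector, and the Rayleigh-quotient stopping rule are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def mat_vec(A: Matrix, x: Vector) -> Vector:
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def power_iteration(A: Matrix, iters: int = 200,
                    tol: float = 1e-12) -> Tuple[float, Vector]:
    """Approximate the dominant eigenpair by repeated multiplication."""
    x = [1.0] * len(A)  # assumed start vector
    lam = 0.0
    for _ in range(iters):
        y = mat_vec(A, x)
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            raise ValueError("iterate collapsed to the zero vector")
        x_new = [v / norm for v in y]
        # Rayleigh quotient of the normalized iterate estimates the eigenvalue.
        lam_new = sum(a * b for a, b in zip(x_new, mat_vec(A, x_new)))
        if abs(lam_new - lam) < tol:
            return lam_new, x_new
        x, lam = x_new, lam_new
    return lam, x
```

In a synchronous parallel version, every processor would have to finish its share of the matrix-vector product before the normalization step, which is the waiting that the abstract describes.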

Novel Calibration Method for the Multi-Camera Measurement System

  • Wang, Xinlei
    • Journal of the Optical Society of Korea / Vol. 18, No. 6 / pp. 746-752 / 2014
  • In a multi-camera measurement system, determining the external parameters, referred to as the calibration of the system, is one of the vital tasks. In this paper, a new geometrical calibration method based on the theory of the vanishing line is proposed. Using a planar target with three equally spaced parallel lines, the normal vector of the target plane can be determined easily in every camera coordinate system of the measurement system. By moving the target into more than two different positions, the rotation matrix can be determined from the expression of the same vector in different coordinate systems. Moreover, the translation matrix can be derived from the known distance between adjacent parallel lines. The main factors affecting the calibration are analyzed. Simulations show that the proposed method achieves robustness and accuracy. Experimental results show that the calibration can reach 1.25 mm within a range of about 0.5 m. Furthermore, this calibration method can also be used for auto-calibration of the multi-camera measurement system, since parallel-line features are widely available.

Development of Two Types of Novel Planar Translational Parallel Manipulators by Using Parallelogram Mechanism

  • 김한성
    • 한국정밀공학회지 / Vol. 24, No. 8 (Serial No. 197) / pp. 50-57 / 2007
  • In this paper, two types of novel planar Translational Parallel Manipulators (TPMs) using a parallelogram mechanism are conceived. One is made up of two Pa-P (Parallelogram-Prismatic) legs connecting the base to the moving platform. The other consists of two P-Pa legs and is the kinematic inversion of the former. Since the connecting links in a parallelogram mechanism are subject only to tensile/compressive loads and all the heavy actuators are mounted at the base, the proposed manipulators can be applied to planar positioning/assembly tasks requiring high stiffness and high speed. The position, velocity, and statics are analyzed, and a design methodology using a prescribed workspace and velocity transmission capability is presented. Finally, two types of prototype manipulators have been developed.