• Title/Summary/Keyword: parallel tasks

Novel Parallel Approach for SIFT Algorithm Implementation

  • Le, Tran Su;Lee, Jong-Soo
    • Journal of information and communication convergence engineering / v.11 no.4 / pp.298-306 / 2013
  • The scale invariant feature transform (SIFT) is an effective algorithm used in object recognition, panorama stitching, and image matching. However, due to its complexity, real-time processing is difficult to achieve with current software approaches. The increasing availability of parallel computers makes parallelizing these tasks an attractive approach. This paper proposes a novel parallel approach for SIFT algorithm implementation using a block filtering technique in a Gaussian convolution process on the SIMD Pixel Processor. This implementation fully exposes the available parallelism of the SIFT algorithm process and exploits the processing and input/output capabilities of the processor, which results in a system that can perform real-time image and video compression. We apply this implementation to images and measure the effectiveness of such an approach. Experimental simulation results indicate that the proposed method is capable of real-time applications, and the result of our parallel approach is outstanding in terms of the processing performance.

TBBench: A Micro-Benchmark Suite for Intel Threading Building Blocks

  • Marowka, Ami
    • Journal of Information Processing Systems / v.8 no.2 / pp.331-346 / 2012
  • Task-based programming is becoming the state-of-the-art method of choice for extracting the desired performance from multi-core chips. It expresses a program in terms of lightweight logical tasks rather than heavyweight threads. Intel Threading Building Blocks (TBB) is a task-based parallel programming paradigm for multi-core processors. The performance gain of this paradigm depends to a great extent on the efficiency of its parallel constructs. The overheads incurred by these constructs determine whether large-scale parallel programs can be built, especially in the case of fine-grain parallelism. This paper presents a study of TBB parallelization overheads. For this purpose, a TBB micro-benchmark suite called TBBench has been developed. We use TBBench to evaluate the parallelization overheads of TBB on different multi-core machines and with different compilers. We report the relative overheads in detail and analyze the results.

Parallelism point selection in nested parallelism situations with focus on the bandwidth selection problem (평활량 선택문제 측면에서 본 중첩병렬화 상황에서 병렬처리 포인트선택)

  • Cho, Gayoung;Noh, Hohsuk
    • The Korean Journal of Applied Statistics / v.31 no.3 / pp.383-396 / 2018
  • Various parallel processing R packages are used for fast processing and the analysis of big data. Parallel processing is used when the work can be decomposed into tasks that are non-interdependent. In some cases, each task decomposed for parallel processing can also be decomposed into non-interdependent subtasks. We have to choose whether to parallelize the decomposed tasks in the first step or to parallelize the subtasks in the second step when facing nested parallelism situations. This choice has a significant impact on the speed of computation; consequently, it is important to understand the nature of the work and decide where to do the parallel processing. In this paper, we provide an idea of how to apply parallel computing effectively to problems by illustrating how to select a parallelism point for the bandwidth selection of nonparametric regression.
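
The choice between parallelizing at the first step or at the second step can be sketched as follows. This is a minimal illustration, not the authors' R implementation: it assumes a Nadaraya-Watson estimator with a Gaussian kernel and leave-one-out cross-validation as the bandwidth criterion, and every function name (`select_outer`, `select_inner`, and the helpers) is hypothetical. Threads are used only to show the structure; in CPython they would not speed up this CPU-bound work.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def nw_predict_without(i, h, x, y):
    """Nadaraya-Watson estimate at x[i] (Gaussian kernel), leaving point i out."""
    num = den = 0.0
    for j in range(len(x)):
        if j == i:
            continue
        w = math.exp(-0.5 * ((x[i] - x[j]) / h) ** 2)
        num += w * y[j]
        den += w
    return num / den if den > 0 else 0.0


def squared_error(i, h, x, y):
    """Leave-one-out squared prediction error at point i (the subtask)."""
    return (y[i] - nw_predict_without(i, h, x, y)) ** 2


def cv_score(h, x, y):
    """Cross-validation score for one bandwidth (the task), computed serially."""
    return sum(squared_error(i, h, x, y) for i in range(len(x))) / len(x)


def select_outer(bandwidths, x, y, workers=4):
    """First-step parallelism: one parallel task per candidate bandwidth."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda h: cv_score(h, x, y), bandwidths))
    return min(zip(scores, bandwidths))[1]


def select_inner(bandwidths, x, y, workers=4):
    """Second-step parallelism: bandwidths run serially, subtasks run in parallel."""
    best = None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for h in bandwidths:
            errs = list(pool.map(lambda i: squared_error(i, h, x, y), range(len(x))))
            score = sum(errs) / len(x)
            if best is None or score < best[0]:
                best = (score, h)
    return best[1]
```

Both functions select the same bandwidth; they differ only in where the pool is applied, which is the decision the abstract describes. With few candidate bandwidths and many data points the inner level exposes more parallel work, and with many bandwidths the outer level does.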

Scheduler for parallel processing with finely grained tasks

  • Hosoi, Takafumi;Kondoh, Hitoshi;Hara, Shinji
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 1991.10b / pp.1817-1822 / 1991
  • A method of reducing the overhead caused by processor synchronization and common memory accesses in finely grained tasks is described. We propose a scheduler that accounts for preparation time during the search in order to minimize redundant accesses to shared memory. The suggested hardware (synchronizer) determines the access order of processors and performs bus arbitration simultaneously by folding the synchronization process into the bus arbitration process, so the synchronization time vanishes. This synchronizer therefore incurs no overhead from processor synchronization [1]. The proposed scheduler algorithm runs in parallel. The processes share the upper bound derived by each search, and the lower bound function is built with the preparation time in mind in order to eliminate as many searches as possible. Applying the proposed method to a multi-DSP system that calculates inverse dynamics for robot arms showed that the sampling time can be half that of the conventional method.

Comparison of Genetic Algorithms and Simulated Annealing for Multiprocessor Task Allocation (멀티프로세서 태스크 할당을 위한 GA과 SA의 비교)

  • Park, Gyeong-Mo
    • The Transactions of the Korea Information Processing Society / v.6 no.9 / pp.2311-2319 / 1999
  • We present two heuristic algorithms for the task allocation problem (NP-complete problem) in parallel computing. The problem is to find an optimal mapping of multiple communicating tasks of a parallel program onto the multiple processing nodes of a distributed-memory multicomputer. The purpose of mapping these tasks into the nodes of the target architecture is the minimization of parallel execution time without sacrificing solution quality. Many heuristic approaches have been employed to obtain satisfactory mapping. Our heuristics are based on genetic algorithms and simulated annealing. We formulate an objective function as a total computational cost for a mapping configuration, and evaluate the performance of our heuristic algorithms. We compare the quality of solutions and times derived by the random, greedy, genetic, and annealing algorithms. Our experimental findings from a simulation study of the allocation algorithms are presented.

Workspace and Force-Moment Transmission of a Parallel Manipulator with Variable Platform (가변형 병렬기구에 대한 작업공간과 힘/모멘트 전달 특성 해석)

  • Kim Byoung-Chang;Lee Se-Han
    • Journal of Institute of Control, Robotics and Systems / v.12 no.2 / pp.138-144 / 2006
  • The kinematic and dynamic characteristics of Stewart platform-based parallel manipulators are fixed once they are constructed. Thus parallel manipulators with various configurations are required to meet a variety of applications. In this research a parallel manipulator with variable platform (PMVP) has been developed, in which the length of the arm linking the platform center to the platform-leg contact point can be varied by an actuator. The workspace of the PMVP is larger than that of a traditional Stewart platform, and in particular the range in which the maximum orientation angles can be maintained is significantly expanded. Furthermore, the characteristics of force and moment transmission between the legs and platform can be adjusted to meet the requirements of various tasks. Kinematic and dynamic analysis was performed to verify the usefulness of the PMVP, and the actual hardware was built to demonstrate its feasibility.

Development of real time versatile software for automation of chemical processes (화학공정 자동화를 위한 실시간대 다기능 소프트웨어의 개발)

  • 서인식;김상우;남성우;백운화;엄태원;김원철;김태윤;김흥식;이광순
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 1988.10a / pp.488-491 / 1988
  • In this work, we developed versatile real-time software for advanced control and supervision on a personal computer. The software has background and foreground tasks that are performed in parallel in real time. Background tasks consist of various kinds of control, reporting, and signal input-output, and are performed at every sampling time. Foreground tasks consist of observing operating conditions, searching data, tuning controllers, and graphically designing and displaying processes, and are performed at the user's request. Additionally, the software can transport data and compose distributed control systems, and all background tasks are built from combinations of unit function blocks.

Check of Concurrency in Parallel Programs using Image Information (영상정보를 이용한 병렬 프로그램내의 병행성 판별)

  • Park, Myeong-Chul;Ha, Seok-Wun
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.12 / pp.2132-2139 / 2006
  • A parallel program that includes nested parallelism has complex execution aspects, and its tasks are executed concurrently. This concurrency is the main cause of most errors. In this paper, a new method for checking concurrency between two tasks is proposed. Existing techniques for checking concurrency are limited in their ability to represent a global structure. A new labeling technique that is appropriate for image visualization is proposed; it shows the global structure by imaging the execution aspects through region partitioning on a 2D plane. On this basis, each task whose ordering relation can be distinguished creates an independent image. The resulting image information simplifies semantic analysis of the related tasks and gives the user an effective outline of the program's global execution structure.

Efficient Task Distribution for Pig Monitoring Applications Using OpenCL (OpenCL을 이용한 돈사 감시 응용의 효율적인 태스크 분배)

  • Kim, Jinseong;Choi, Younchang;Kim, Jaehak;Chung, Yeonwoo;Chung, Yongwha;Park, Daihee;Kim, Hakjae
    • KIPS Transactions on Computer and Communication Systems / v.6 no.10 / pp.407-414 / 2017
  • Pig monitoring applications consisting of many tasks can take advantage of inherent data parallelism and enable parallel processing using performance accelerators. In this paper, we propose a method for distributing the tasks of pig monitoring applications across a heterogeneous computing platform consisting of a multicore CPU and a manycore GPU. That is, a parallel program written in OpenCL is developed, and then the most suitable processor is determined based on the measured execution time of each task. The proposed method is simple but very effective, and can be applied to parallelize other applications consisting of many tasks on a heterogeneous computing platform consisting of a CPU and a GPU. Experimental results show that, on three different heterogeneous computing platforms, the proposed task distribution method improves on the performance of the typical GPU-only method, in which every task is executed on the GPU device, by factors of 1.5, 8.7, and 2.7, respectively.
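
The selection rule described in this abstract (measure each task on each processor, then assign it to the fastest one) can be sketched as below. This is an assumption-laden illustration rather than the paper's OpenCL code: plain Python callables stand in for per-device kernels, and the names `measure`, `assign_tasks`, and the `tasks` dictionary layout are hypothetical.

```python
import time


def measure(fn, payload, repeats=3):
    """Best-of-N wall-clock time of fn(payload), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(payload)
        best = min(best, time.perf_counter() - start)
    return best


def assign_tasks(tasks, payload, timer=measure):
    """Map each task to the device with the smallest measured time.

    tasks: {task_name: {device_name: callable}} (hypothetical layout).
    timer: function (callable, payload) -> cost; defaults to wall-clock timing.
    """
    assignment = {}
    for name, impls in tasks.items():
        timings = {device: timer(fn, payload) for device, fn in impls.items()}
        assignment[name] = min(timings, key=timings.get)
    return assignment
```

The `timer` parameter is a design convenience of this sketch: passing a deterministic cost function in place of `measure` makes the assignment reproducible, for example when replaying previously recorded timings.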

Dynamics and Control of 2 DOF 5-bar Parallel Manipulator with Closed Chain

  • Chung, Young-Hoog;Lee, Jae-Won;Sung, Yoon-Gyeoung;Joo, Hae-Hoo
    • International Journal of Precision Engineering and Manufacturing / v.2 no.1 / pp.5-10 / 2001
  • A method is proposed to obtain the Jacobian matrix of the 5-bar parallel manipulator by employing the orthogonality between the position and velocity vectors of a rigid body rotating around a fixed point. The dynamics of the 5-bar parallel manipulator are analyzed and used to design a computed-torque controller by developing a transformation matrix of the passive joints with respect to the active ones. Experimental demonstration shows that the proposed computed-torque control performs high-speed, high-accuracy tasks.