• Title/Summary/Keyword: parallel programming

Search results: 295 items (processing time: 0.02 s)

Parallel Deblocking Filter Based on Modified Order of Accessing the Coding Tree Units for HEVC on Multicore Processor

  • Lei, Haiwei;Liu, Wenyi;Wang, Anhong
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 11, No. 3
    • /
    • pp.1684-1699
    • /
    • 2017
  • The deblocking filter (DF) reduces blocking artifacts in encoded video sequences, and thereby significantly improves the subjective and objective quality of videos. Statistics show that the DF accounts for 5-18% of the total decoding time in high-efficiency video coding. Therefore, speeding up the DF will improve codec performance, especially for the decoder. In view of the rapid development of multicore technology, we propose a parallel DF scheme based on a modified order of accessing the coding tree units (CTUs) by analyzing the data dependencies between adjacent CTUs. This enables the DF to run in parallel, providing accelerated performance and more flexibility in the degree of parallelism, as well as finer parallel granularity. We additionally solve the problems of variable privatization and thread synchronization in the parallelization of the DF. Finally, the DF module is parallelized based on the HM16.1 reference software using OpenMP technology. The acceleration performance is experimentally tested under various numbers of cores, and the results show that the proposed scheme is very effective at speeding up the DF.

Parallel Computation for Extended Edit Distances Using the Shared Memory on GPU

  • 김영호;나중채;심정섭
    • 정보처리학회논문지:컴퓨터 및 통신 시스템
    • /
    • Vol. 4, No. 7
    • /
    • pp.213-218
    • /
    • 2015
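  • Given two strings X and Y of lengths m and n, respectively, over an alphabet $\Sigma$, the extended edit distance between X and Y can be computed by dynamic programming in O(mn) time and space. Recently, a parallel algorithm was proposed that uses m threads to compute the extended edit distance of X and Y in O(m+n) time and O(mn) space. In this paper, we present a parallel algorithm that improves the running time by exploiting the shared memory of the GPU. In our experiments, the improved parallel algorithm ran roughly 19 to 25 times faster, or more, than the existing parallel algorithm.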
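The O(m+n) bound with m threads quoted above is what one gets when the dynamic-programming table is filled one anti-diagonal at a time. The following is a minimal sequential sketch of that traversal order, assuming the plain Levenshtein distance (insert, delete, substitute at unit cost) rather than the extended edit distance of the paper, whose exact operation set the abstract does not give. The function name `edit_distance_wavefront` is hypothetical, and no GPU or shared-memory code is shown.

```python
def edit_distance_wavefront(x: str, y: str) -> int:
    """Plain Levenshtein distance, computed one anti-diagonal at a time.

    Assumption: unit-cost insert/delete/substitute only; this is NOT the
    extended edit distance studied in the paper.
    """
    m, n = len(x), len(y)
    # d[i][j] holds the distance between the prefixes x[:i] and y[:j].
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    # Cells with i + j == k read only from anti-diagonals k-1 and k-2,
    # so every cell on one anti-diagonal is independent of the others.
    # A parallel version would give each cell of the inner loop to a thread.
    for k in range(2, m + n + 1):
        for i in range(max(1, k - n), min(m, k - 1) + 1):
            j = k - i
            cost = 0 if x[i - 1] == y[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete x[i-1]
                d[i][j - 1] + 1,         # insert y[j-1]
                d[i - 1][j - 1] + cost,  # match or substitute
            )
    return d[m][n]


if __name__ == "__main__":
    print(edit_distance_wavefront("kitten", "sitting"))
```

The outer loop runs m+n-1 times, which is where the O(m+n) parallel time comes from when the inner loop is executed concurrently; the table itself still takes O(mn) space.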

Performance Characterization of Tachyon Supercomputer using Hybrid Multi-zone NAS Parallel Benchmarks

  • 박남규;정윤수;이홍석
    • 한국정보통신학회논문지
    • /
    • Vol. 14, No. 1
    • /
    • pp.138-144
    • /
    • 2010
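  • The recently deployed Tachyon phase-1 system is a high-performance supercomputer built from quad-core AMD Barcelona nodes. In this paper, we verify the performance and parallel scalability of Tachyon using the multi-zone NAS Parallel Benchmarks (NPB), one of the programs that adopt hybrid parallelization. For the hybrid parallel performance tests we used classes B and C of BT-MZ from NPB version 3.3, and tested parallel scalability on up to 1024 processes of the Tachyon system. These are the first hybrid parallel computing results in Korea obtained with 1024 or more processors. We show that this hybrid parallelization approach can serve as a very efficient and useful parallel performance benchmark on high-performance computing systems that, like Tachyon, are based on multicore technology.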

A Synchronous/Asynchronous Hybrid Parallel Power Iteration for Large Eigenvalue Problems by the MPMD Methodology

  • 박필성
    • 정보처리학회논문지A
    • /
    • Vol. 11A, No. 1
    • /
    • pp.67-74
    • /
    • 2004
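  • Most parallel algorithms are synchronous: for the computation to be correct, fast processors that finish their work early must wait at synchronization points for slower processors, so when the processors differ in performance the overall speed is determined by the slowest one. In this paper, for the problem of finding the dominant eigenpair of a large eigenvalue problem, we devise a hybrid synchronous/asynchronous algorithm that can accelerate convergence by reducing the idle time of the fast processors, and we implement it using the MPMD programming model.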
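For orientation, the basic power iteration that the paper builds on can be sketched as below. This is a plain sequential (and therefore fully synchronous) version in pure Python, not the hybrid synchronous/asynchronous MPMD scheme of the paper; the function name `power_iteration`, the starting vector, and the stopping rule are illustrative assumptions.

```python
import math


def power_iteration(a, iters=1000, tol=1e-12):
    """Estimate the dominant eigenpair of a square matrix `a` (list of rows).

    Assumptions: a single dominant eigenvalue exists and the all-ones start
    vector is not orthogonal to its eigenvector.
    """
    n = len(a)
    v = [1.0 / math.sqrt(n)] * n  # unit-length start vector
    lam = 0.0
    for _ in range(iters):
        # Matrix-vector product w = A v. In a parallel setting, each
        # processor would compute a block of rows of this product.
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        # Rayleigh quotient v.w (v has unit length) as eigenvalue estimate.
        lam_new = sum(v[i] * w[i] for i in range(n))
        norm = math.sqrt(sum(t * t for t in w))
        if norm == 0.0:
            return 0.0, v
        v = [t / norm for t in w]  # normalise for the next step
        if abs(lam_new - lam) < tol:
            lam = lam_new
            break
        lam = lam_new
    return lam, v


if __name__ == "__main__":
    value, vector = power_iteration([[2.0, 0.0], [0.0, 1.0]])
    print(value, vector)
```

In a synchronous parallel version, every processor would have to finish its share of the matrix-vector product before the normalisation step; that waiting point is the idle time the abstract refers to.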

Comparison of Go and C++ TBB on Parallel Processing

  • 박동하;문봉교
    • 한국정보처리학회:학술대회논문집
    • /
    • 한국정보처리학회 2017년도 춘계학술발표대회
    • /
    • pp.64-67
    • /
    • 2017
  • Applying concurrent structures and parallel processing is a common concern in today's programs. In this research, dynamic programming is used to compare the parallel performance of the Go language and Intel C++ Threading Building Blocks (TBB). The experiment was performed on a 4-core machine, and the results include execution times under a simultaneous multithreading (SMT) environment. The static optimal binary search tree problem was used as the example. In the results, the speed-up of Go was higher than the number of cores, and that of TBB was close to it. TBB performed better in general, but at larger scales Go was faster in some cases.
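The static optimal binary search tree problem used as the benchmark has a classic interval dynamic program. The sketch below is a sequential Python version under the simplifying assumption that only successful-search frequencies are given (no dummy keys); the function name `optimal_bst_cost` is hypothetical, and the abstract does not say how the Go and TBB versions were actually parallelized.

```python
def optimal_bst_cost(freq):
    """Minimum total search cost of a BST over keys with access counts `freq`.

    Keys are assumed sorted; freq[k] is the access count of key k. The cost
    of a tree is the sum of freq[k] * depth(k), with the root at depth 1.
    """
    n = len(freq)
    # prefix[j] - prefix[i] is the total frequency of keys i..j-1.
    prefix = [0] * (n + 1)
    for k, f in enumerate(freq):
        prefix[k + 1] = prefix[k] + f
    # cost[i][j] is the optimal cost for the half-open key range [i, j).
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        # All ranges of the same length depend only on shorter ranges, so
        # this inner loop is the natural unit of parallel work.
        for i in range(0, n - length + 1):
            j = i + length
            weight = prefix[j] - prefix[i]
            cost[i][j] = weight + min(
                cost[i][r] + cost[r + 1][j] for r in range(i, j)
            )
    return cost[0][n]


if __name__ == "__main__":
    print(optimal_bst_cost([34, 8, 50]))
```

This straightforward form takes O(n^3) time; the independence of same-length ranges is what makes the table a common test case for task-parallel runtimes.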

Scheduling Jobs with different Due-Date on Nonidentical Parallel Machines

  • 강용혁;이홍철;김성식
    • 대한산업공학회지
    • /
    • Vol. 24, No. 1
    • /
    • pp.37-50
    • /
    • 1998
  • This paper considers the nonidentical parallel machine scheduling problem in which n jobs having different due dates are to be scheduled on m nonidentical parallel machines. For the make-to-order manufacturing environment, the objective is to minimize the number of tardy jobs. A 0-1 nonlinear programming model is formulated and a heuristic algorithm that allocates and sequences jobs to machines is developed. The proposed algorithm makes use of the concept of assignment problem based on the suitability measure as the cost coefficient. Computational experiments show that the proposed algorithm is superior to the existing one in some performance measures such as number of tardy jobs. In addition, this algorithm is appropriate for solving real industrial problems efficiently.

Fast Road Edge Detection with Cellular Analogic Parallel Processing Networks

  • 홍승완;김형석;김봉수
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2002년도 하계종합학술대회 논문집(3)
    • /
    • pp.143-146
    • /
    • 2002
  • The aim of this work is real-time road edge detection using the fast processing of Cellular Analogic Parallel Processing Networks (CAPPN). The CAPPN is composed of a 2D array of analog cells. If dynamic programming is implemented with the CAPPN, the optimal path can be detected in a parallel manner. Provided that fragments of the road edge are used as the inverse of cost (benefit) in the CAPPN-based optimal path algorithm, the CAPPN determines the most plausible path as the road edge line. The benefits of the proposed algorithm are fast processing and the use of an optimization technique to determine the road edge lines.

Heuristics for Non-Identical Parallel Machine Scheduling with Sequence Dependent Setup Times

  • 고시근
    • 대한산업공학회지
    • /
    • Vol. 40, No. 3
    • /
    • pp.305-312
    • /
    • 2014
  • This research deals with the problem of minimizing makespan in a non-identical parallel machine system with sequence- and machine-dependent setup times and machine-dependent processing times. We first present a new mixed integer programming formulation for the problem, with which optimal solutions can easily be found for small instances. However, since the problem is NP-hard and real instances are large, we propose four heuristic algorithms, including genetic-algorithm-based heuristics, to solve large practical problems in reasonable computational time. To assess the performance of the algorithms, we conduct a computational experiment. The results show that the relative performance of the heuristics changes with the problem characteristics, and that the simple heuristics outperform the genetic-algorithm-based heuristics when the number of jobs and/or machines is large.

Unrelated Parallel Processing Problems with Weighted Jobs and Setup Times in Single Stage

  • 구제현;정종윤
    • 대한산업공학회지
    • /
    • Vol. 19, No. 4
    • /
    • pp.125-135
    • /
    • 1993
  • An unrelated parallel processing scheduling problem with weighted jobs and setup times is studied. We consider a parallel processing system in which a group of processors (machines) performs a single operation on jobs of a number of different job types. The processing time of each job depends on both the job and the machine, and each job has a weight. In addition, each machine requires significant setup time between processing jobs of different job types. The performance measure is total weighted flow time, which is minimized in order to reflect job importance and to reduce in-process inventory. We present a 0-1 mixed integer programming model as an optimizing algorithm. We also present a simple heuristic algorithm. Computational results for the optimal and the heuristic algorithms are reported, and they show that the simple heuristic is quite effective and efficient.
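To make the objective concrete, the sketch below evaluates the total weighted flow time of one given schedule. It is only an evaluator, not the paper's 0-1 model or heuristic. The assumptions are ours: all jobs are available at time 0 (so flow time equals completion time), each machine has one constant setup time that is charged whenever consecutive jobs differ in type, and no setup is charged before the first job. The function and parameter names are hypothetical.

```python
def total_weighted_flow_time(schedule, proc_time, weight, job_type, setup):
    """Sum of weight * completion time over all jobs in `schedule`.

    schedule:  {machine: [job, ...]} in processing order (assumed given)
    proc_time: {job: {machine: time}}, machine-dependent processing times
    weight:    {job: weight}
    job_type:  {job: type label}
    setup:     {machine: setup time between jobs of different types}
               (a per-machine constant is an assumption of this sketch)
    """
    total = 0.0
    for machine, jobs in schedule.items():
        clock = 0.0
        prev_type = None
        for job in jobs:
            # Charge a setup only when the job type changes on this machine.
            if prev_type is not None and job_type[job] != prev_type:
                clock += setup[machine]
            clock += proc_time[job][machine]
            total += weight[job] * clock  # jobs released at time 0
            prev_type = job_type[job]
    return total


if __name__ == "__main__":
    proc_time = {
        "a": {"m1": 2, "m2": 3},
        "b": {"m1": 4, "m2": 1},
        "c": {"m1": 1, "m2": 2},
    }
    weight = {"a": 3, "b": 1, "c": 2}
    job_type = {"a": "x", "b": "y", "c": "y"}
    setup = {"m1": 1, "m2": 2}
    schedule = {"m1": ["a", "c"], "m2": ["b"]}
    print(total_weighted_flow_time(schedule, proc_time, weight, job_type, setup))
```

A heuristic or an integer programming solver would search over `schedule`; an evaluator like this is the piece both approaches share.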

Kinestatic Control using Six-axis Parallel-type Compliant Device

  • 김한성
    • 한국생산제조학회지
    • /
    • Vol. 23, No. 5
    • /
    • pp.421-427
    • /
    • 2014
  • In this paper, the kinestatic control algorithm using a six-axis compliant device is presented. Unlike the traditional control methods using a force/torque sensor with very limited compliance, this method employs a compliant device to provide sufficient compliance between an industrial robot and a rigid environment. This kinestatic control method is used to simply control the position of an industrial robot with twists of compensation, which can be decomposed into twists of compliance and twists of freedom. A simple design method of a six-axis parallel-type compliant device with a diagonal stiffness matrix is presented. A compliant device prototype and kinestatic control hardware system and programming were developed. The effectiveness of the kinestatic control algorithm was verified through two kinds of kinestatic control experiments.