Search | Korea Science

Tile-based Parallelizing for a Fast HEVC Encoder (HEVC 부호화기 고속화를 위한 타일 기반 병렬화)

Kim, Younhee;Jun, DongSan;Jung, Soon-Heung;Seok, Jinwuk;Choi, Jin Soo
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2012.07a / pp.290-293 / 2012
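This paper proposes an architecture that efficiently parallelizes the encoder using tiles, the picture-partitioning tool in HEVC, as a way to speed up encoders for HEVC, the next-generation standard being developed with the goal of a 50% improvement in compression performance over the existing AVC. The rate-distortion-based mode decision, a high-complexity part of the encoder, is implemented with multi-core parallel programming, and the speed-up obtained from parallel processing is reported. Tiles are a structure adopted by HEVC to support parallel processing; because a picture can be split into several parts that are encoded and decoded separately, tiles are well suited as a unit of parallel processing, and the change in compression performance caused by picture partitioning has been reported several times in standardization contributions. According to the results of this paper, when as many threads as tiles are created and rate-distortion-based mode decision is performed in parallel per tile, a 6-fold speed-up over the existing reference software is obtained with 12 threads. Extending the set of modules that can be processed in parallel is expected to increase the speed-up gained from adding threads, bringing the development of a real-time encoder for practical use one step closer.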
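The thread-per-tile structure described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: `split_into_tiles`, `toy_mode_decision`, and `encode_frame` are hypothetical names, and the "mode decision" is a toy cost comparison standing in for the real rate-distortion search, not the HEVC reference software.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(frame, rows, cols):
    """Split a 2-D list of pixels into rows x cols rectangular tiles (row-major)."""
    h, w = len(frame), len(frame[0])
    row_edges = [h * r // rows for r in range(rows + 1)]
    col_edges = [w * c // cols for c in range(cols + 1)]
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tiles.append([row[col_edges[c]:col_edges[c + 1]]
                          for row in frame[row_edges[r]:row_edges[r + 1]]])
    return tiles


def toy_mode_decision(tile):
    """Hypothetical stand-in for RD mode decision: pick the cheaper of two toy predictors."""
    pixels = [p for row in tile for p in row]
    mean = sum(pixels) // len(pixels)
    cost_dc = sum(abs(p - mean) for p in pixels)   # cost of predicting the tile mean
    cost_zero = sum(abs(p) for p in pixels)        # cost of predicting zero
    return ("dc", cost_dc) if cost_dc <= cost_zero else ("zero", cost_zero)


def encode_frame(frame, rows, cols):
    """Create one worker thread per tile and run the mode decision on each tile."""
    tiles = split_into_tiles(frame, rows, cols)
    with ThreadPoolExecutor(max_workers=len(tiles)) as pool:
        return list(pool.map(toy_mode_decision, tiles))


if __name__ == "__main__":
    frame = [[i * 6 + j for j in range(6)] for i in range(4)]
    print(encode_frame(frame, rows=2, cols=3))
```

Because the tiles share no state, the per-tile calls need no locking. In standard CPython the global interpreter lock keeps pure-Python threads from running in parallel, so this sketch shows the decomposition rather than the reported speed-up.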

Data Sampling-based Angular Space Partitioning for Parallel Skyline Query Processing (데이터 샘플링을 통한 각 기반 공간 분할 병렬 스카이라인 질의처리 기법)

Chung, Jaehwa
- The Journal of Korean Association of Computer Education / v.18 no.5 / pp.63-70 / 2015
Skyline queries have been applied in various fields where several complex conditions must be satisfied at once. Several techniques have been suggested for processing skyline queries in a centralized scheme, and map/reduce platform-based approaches that divide the data space into multiple partitions have recently been proposed for large volumes of multidimensional data. However, the performance of these approaches fluctuates because of uneven data loads across servers and redundant tasks. Motivated by these issues, this paper proposes a novel technique called MR-DEAP, which resolves the uneven data load by using random sampling. Experimental results show that the proposed MR-DEAP outperforms the MR-Angular and MR-BNL schemes.
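To make the idea concrete, here is a minimal 2-D sketch of sampling-based angular partitioning followed by local and global skyline computation. It is not the MR-DEAP algorithm itself: the function names, the quantile-based choice of boundaries, and the "smaller is better" convention are all assumptions made for illustration, and the map and reduce phases are simulated in a single process.

```python
import math
import random
from bisect import bisect_right


def dominates(a, b):
    """a dominates b if it is no worse in every dimension and better in one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def bnl_skyline(points):
    """Block-nested-loop skyline: keep a window of points not dominated so far."""
    window = []
    for p in points:
        if any(dominates(w, p) for w in window):
            continue
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


def angular_boundaries(points, parts, sample_size, rng):
    """Pick angle boundaries from quantiles of a random sample (assumes points is non-empty)."""
    sample = rng.sample(points, min(sample_size, len(points)))
    angles = sorted(math.atan2(y, x) for x, y in sample)
    return [angles[len(angles) * k // parts] for k in range(1, parts)]


def partitioned_skyline(points, parts=4, sample_size=100, seed=0):
    rng = random.Random(seed)
    bounds = angular_boundaries(points, parts, sample_size, rng)
    buckets = [[] for _ in range(parts)]
    for p in points:
        buckets[bisect_right(bounds, math.atan2(p[1], p[0]))].append(p)
    local = [s for b in buckets for s in bnl_skyline(b)]  # "map": local skylines
    return bnl_skyline(local)                              # "reduce": merge


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [(rng.random(), rng.random()) for _ in range(500)]
    print(sorted(partitioned_skyline(pts)))
```

Taking the boundaries from sample quantiles instead of equal-width angle ranges is what evens out the bucket sizes when the data are skewed; the final merge is still needed because a local skyline point may be dominated by a point from another bucket.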

Design and Evaluation of Cache Structure for Semi-packed Instruction (부분 압축 명령어를 위한 캐쉬 구조의 설계 및 평가)

Hong, Won-Gi;Lee, Seung-Yeop;Kim, Sin-Deok
- Journal of KIISE:Computer Systems and Theory / v.28 no.5 / pp.245-258 / 2001
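In VLIW, all parallelization of program code is done by the compiler alone. The operations to be executed in parallel must therefore be indicated explicitly, and two instruction encoding schemes have been used for this purpose: unpacked (expanded) encoding and packed encoding. Each scheme requires a different cache structure for loading and fetching instructions: an uncompressed cache for unpacked encoding and a compressed cache for packed encoding. However, these suffer respectively from reduced memory utilization caused by no-op operations and from increased instruction fetch overhead caused by the decompression step. This paper proposes a split cache structure that uses semi-packed encoding, which keeps the instruction length partially fixed, to raise memory utilization while reducing instruction fetch overhead. By calculating the chip area required to implement each cache structure, we confirmed that the split cache is a comparatively cost-effective cache structure. Memory utilization measured by simulation showed that, even when the increased hardware cost is taken into account, the split cache achieves up to about 3 times the memory utilization of the uncompressed cache. Performance measurements of VLIW systems using each cache structure as the first-level cache showed that the system using TCSC (block-centralized split cache) was the best in terms of performance per cost.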

Parallel Evaluation of Linearly Recursive Rules using a Shared-Nothing Parallel Architecture (비공유 병렬구조를 이용한 선형적 재귀규칙의 병렬평가)

Cho, Woo-Hyun;Kim, Hang-Joon
- The Transactions of the Korea Information Processing Society / v.4 no.12 / pp.3069-3077 / 1997
This paper presents a new paradigm for the parallel evaluation of linearly recursive rules that contain transitive dependencies on a shared-nothing parallel architecture. For the parallel evaluation of rules, we consider a shared-nothing parallel architecture that consists of a set of nodes and a message-passing network connecting them. An evaluation of normalized rules is a computation of the proof-theoretic meaning of a collection of rules. We define normalized recursive rules that contain transitive dependencies, present an equivalent expression for such a rule, propose a paradigm for the parallel evaluation of normalized rules based on that equivalent expression using join, partition, and transitive closure operations, and analyze its response-time complexity.
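A standard example of a linearly recursive rule is transitive closure: `path(x, y) :- edge(x, y)` and `path(x, y) :- edge(x, z), path(z, y)`. The sketch below evaluates it with a semi-naive join loop and partitions the work by the target value `y`, which the recursive rule leaves unchanged. The partitioning choice and all function names are assumptions for illustration, not the paper's paradigm, and the "nodes" are simulated by a sequential loop.

```python
def index_by_target(edges):
    """Index edge(x, z) by z so that joining on z is a dictionary lookup."""
    preds = {}
    for x, z in edges:
        preds.setdefault(z, set()).add(x)
    return preds


def semi_naive(seed, preds):
    """Semi-naive evaluation: join only newly derived tuples with edge each round."""
    path = set(seed)
    delta = set(seed)
    while delta:
        derived = {(x, y) for (z, y) in delta for x in preds.get(z, ())}
        delta = derived - path   # keep only tuples not seen before
        path |= delta
    return path


def partitioned_closure(edges, parts):
    """Partition seed tuples by hash of y; each partition is evaluated independently."""
    edges = set(edges)
    preds = index_by_target(edges)   # edge relation replicated to every partition
    buckets = [set() for _ in range(parts)]
    for x, y in edges:
        buckets[hash(y) % parts].add((x, y))
    result = set()
    for bucket in buckets:           # each iteration could run on a separate node
        result |= semi_naive(bucket, preds)
    return result


if __name__ == "__main__":
    print(sorted(partitioned_closure({(1, 2), (2, 3), (3, 4)}, parts=2)))
```

Every derived tuple keeps the `y` of the tuple it was derived from, so partitions never need to exchange tuples in this formulation; the cost is that each one needs the full `edge` relation.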

Parallel Rendering of High Quality Animation based on a Dynamic Workload Allocation Scheme (작업영역의 동적 할당을 통한 고화질 애니메이션의 병렬 렌더링)

Rhee, Yun-Seok
- Journal of the Korea Society of Computer and Information / v.13 no.1 / pp.109-116 / 2008
Although many studies on PC-cluster-based parallel rendering have been carried out, most of them did not cope with non-uniform scenes, in which the locations of 3D models are biased. In this work, we built a PC cluster system with POV-Ray, a free, publicly available rendering program, and developed an adaptive load-balancing scheme to optimize parallel efficiency. In particular, we observed that each frame of a 3D animation is closely coherent with its adjacent frames, so the distribution of computation can be estimated from the computation time of the previous frame. Experimental results with two real animation data sets show that the proposed scheme reduces execution time by 40% compared with a simple static partitioning scheme.
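The frame-coherence idea can be illustrated with a small repartitioning routine: given the horizontal strips rendered for the previous frame and the time each one took, choose new strip boundaries with roughly equal estimated cost. This is a sketch under assumptions not stated in the abstract (horizontal strips, cost spread evenly over the rows of each old strip, more rows than workers); `rebalance` is a hypothetical name.

```python
def rebalance(boundaries, times):
    """Return new strip boundaries with roughly equal estimated cost per strip.

    boundaries: row indices [0, b1, ..., height] of the previous frame's strips
    times:      times[i] is the render time of strip i in the previous frame
    """
    n = len(times)
    height = boundaries[-1]
    # Estimate a per-row cost, assuming cost is uniform within each old strip.
    row_cost = []
    for i in range(n):
        rows = boundaries[i + 1] - boundaries[i]
        row_cost.extend([times[i] / rows] * rows)
    target = sum(row_cost) / n          # cost each worker should receive
    new_bounds = [0]
    acc = 0.0
    for r, cost in enumerate(row_cost):
        acc += cost
        # Close the current strip once its cumulative share is reached.
        if len(new_bounds) < n and acc >= target * len(new_bounds):
            new_bounds.append(r + 1)
    new_bounds.append(height)
    return new_bounds


if __name__ == "__main__":
    # The lower half took three times as long, so it gets fewer rows next frame.
    print(rebalance([0, 100, 200], [10, 30]))
```

In this example the first worker's strip grows from 100 to 134 rows, because the upper rows were cheap in the previous frame.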

A Task Decomposition Scheme for Parallel Rendering of Continuous Images (연속 영상의 효과적 병렬 렌더링을 위한 작업분할 기법)

Choi, Young-Woon;Rhee, Yun-Seok
- Proceedings of the Korean Information Science Society Conference / 2005.11a / pp.1042-1044 / 2005
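Various parallelization techniques using PC clusters have been proposed for the effective playback of high-quality 3D images, but they have not delivered sufficient performance when the objects that make up a scene are unevenly distributed. In this work, we built a PC-cluster-based parallel rendering system that uses the Maya renderer and developed an effective load-balancing technique to improve parallel performance. In particular, based on the fact that coherence between frames is high in the consecutive frames of an animation, we propose a technique that predicts the load distribution of the next frame from the amount of computation spent on each partitioned region of a given frame and readjusts each processor's work region accordingly.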

Parallel Learning System Optimization using ADMM (ADMM을 이용한 병렬 학습 시스템 최적화)

Kim, Min-Woo;Lim, Hwan-Hee;Lee, Byung-Jun;Kim, Kyung-Tae;Youn, Hee-Yong
- Proceedings of the Korean Society of Computer Information Conference / 2018.07a / pp.49-50 / 2018
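The rapid advance of artificial intelligence has increased the use of big data, but the problems this causes in machine learning also remain to be solved. To prevent the problems that arise when the amount of training data grows, this paper proposes a parallel-processing-based system instead of modifying the algorithm. It introduces the Alternating Direction Method of Multipliers (ADMM) algorithm and applies an ADMM-based optimization technique to optimize the parallel learning system.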
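As a rough illustration of how ADMM splits learning across workers, the sketch below runs consensus ADMM (scaled form) on a one-parameter least-squares fit `y ≈ w * x`, with the training data divided into shards. The problem, the shard layout, the penalty `rho`, and the function name are all assumptions chosen to keep the example small; the abstract does not specify the paper's model or system.

```python
def consensus_admm(shards, rho=1.0, iters=300):
    """Fit y ~ w * x by consensus ADMM; each shard is a list of (x, y) pairs."""
    n = len(shards)
    sxx = [sum(x * x for x, _ in s) for s in shards]  # per-shard sum of x^2
    sxy = [sum(x * y for x, y in s) for s in shards]  # per-shard sum of x*y
    w = [0.0] * n   # local estimates, one per worker
    u = [0.0] * n   # scaled dual variables, one per worker
    z = 0.0         # shared consensus value
    for _ in range(iters):
        # Local step (parallelizable): minimize local loss + (rho/2)(w - z + u)^2.
        w = [(sxy[i] + rho * (z - u[i])) / (sxx[i] + rho) for i in range(n)]
        # Consensus step: average the local estimates plus duals.
        z = sum(w[i] + u[i] for i in range(n)) / n
        # Dual step: accumulate each worker's disagreement with the consensus.
        u = [u[i] + w[i] - z for i in range(n)]
    return z


if __name__ == "__main__":
    shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.3), (4.0, 7.9)]]
    print(consensus_admm(shards))
```

Only the scalar `z` and each worker's `w[i] + u[i]` cross worker boundaries; the raw data in each shard stay local, which is what makes this formulation attractive when the training set is too large for one machine.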

A Synchronization Method for Parallelizing Nested Do Loop (중첩 루프의 병렬화를 위한 동기화 기법)

Park, Hyun-Ho;Kim, Yong-Man;Bae, Eun-Ho;Youn, Sung-Dae
- Proceedings of the Korea Information Processing Society Conference / 2001.04a / pp.239-242 / 2001
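In typical application programs, loops are the structures with the most parallelism, and synchronization is required to execute loops in parallel. To parallelize loops over one-dimensional arrays with multiple subscripts, this paper generates a dependence function from multiple identical dependence values and uses it to find the non-dependence part, the region in which no dependence relation holds. It then presents a technique in which synchronization among multiple dependence values that share the same value is first simplified using outer-loop partitioning, after which synchronization is performed on loops with a single subscript.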

A Virtual Microscope System for Educational Applications (교육 분야 응용을 위한 가상 현미경 시스템)

Cho, Seung-Ho;Beynon, Mike;Saltz, Joel
- The KIPS Transactions:PartD / v.10D no.1 / pp.117-124 / 2003
The system implemented in this paper partitions specimen data captured by a light microscope and stores it on distributed or parallel systems. Users can observe the images on a computer as they would with a physical microscope. Based on the client-server computing model, the system consists of a client, a coordinator, and a data manager, and these three components communicate by exchanging messages. For image retrieval, we implemented the client program with the functions needed for educational applications, such as image marking and text annotation, and defined the communication protocol. We also ran an experiment that introduced tape storage for holding large volumes of data. The experimental results showed a performance improvement from the data partitioning and indexing techniques.
https://doi.org/10.3745/KIPSTD.2003.10D.1.117

A Device of Parallelism Control in POSIX Based Parallelization of Recursive Algorithms (POSIX스레드에 의한 재귀적 알고리즘의 병렬화에서 병렬성 제어 방안)

Lee, Hyung-Bong;Baek, Chung-Ho
- The KIPS Transactions:PartA / v.9A no.2 / pp.249-258 / 2002
One of the major purposes of a multiprocessor system is to achieve highly efficient performance improvement. In most cases, however, making full use of a multiprocessor system unavoidably requires special programming languages or tools. In general, loops and recursive calls are regarded as the typical parts of an algorithm to parallelize. Recursive calls in particular are conceptually easy to parallelize without the support of any special language or tool. However, it is difficult to control the degree of parallelism, which grows with the depth of recursion and can lead to an execution crash. This paper proposes a device for controlling parallelism in POSIX-thread-based parallelization of recursive algorithms. To this end, we define the concepts of thread and process in UNIX systems and analyze the results of applying the device experimentally to the quicksort algorithm.
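One simple way to bound the parallelism of a recursive algorithm is a depth cut-off: spawn a thread for a recursive call only near the top of the recursion tree and fall back to plain recursion below it. The sketch below applies this to quicksort using Python's `threading` module rather than POSIX threads directly. The cut-off rule and the names are assumptions for illustration and are not necessarily the device the paper proposes.

```python
import threading


def _partition(a, lo, hi):
    """Lomuto partition around a[hi]; returns the pivot's final index."""
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i


def _sort(a, lo, hi, depth, max_depth):
    if lo >= hi:
        return
    p = _partition(a, lo, hi)
    if depth < max_depth:
        # Near the top of the recursion tree: sort the left half in a new thread.
        t = threading.Thread(target=_sort, args=(a, lo, p - 1, depth + 1, max_depth))
        t.start()
        _sort(a, p + 1, hi, depth + 1, max_depth)
        t.join()
    else:
        # Below the cut-off: ordinary sequential recursion, no new threads.
        _sort(a, lo, p - 1, depth + 1, max_depth)
        _sort(a, p + 1, hi, depth + 1, max_depth)


def parallel_quicksort(data, max_depth=2):
    """Return a sorted copy; at most 2**max_depth - 1 extra threads are created."""
    items = list(data)
    _sort(items, 0, len(items) - 1, 0, max_depth)
    return items


if __name__ == "__main__":
    print(parallel_quicksort([5, 3, 8, 1, 9, 2, 7]))
```

The two halves cover disjoint index ranges, so the threads need no locking. Without the cut-off, every recursive call would create a thread and the thread count would grow with the input size, which is the kind of uncontrolled parallelism the abstract warns about.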
https://doi.org/10.3745/KIPSTA.2002.9A.2.249
