Search | Korea Science

Design of Parallel Inverse Quantization and Inverse Transform Architecture for High Performance H.264/AVC Decoder (고성능 H.264/AVC 복호기를 위한 병렬 역양자화 및 역변환 구조 설계)

Jung, Hong-Kyun;Ryoo, Kwang-Ki
- Proceedings of the KAIS Fall Conference / 2011.12b / pp.434-437 / 2011
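This paper proposes parallel inverse quantization and inverse transform architectures to improve the performance of an H.264/AVC decoder. The proposed inverse quantization architecture uses a common operation unit to reduce computational complexity, and uses four common operation units to reduce the inverse quantization execution cycle count to one cycle. The proposed inverse transform architecture uses four transform operation units and takes two cycles to perform the inverse transform. In addition, the proposed architecture adopts a parallel structure that performs inverse quantization and the horizontal inverse transform concurrently, reducing the combined execution cycle count of inverse quantization and inverse transform to two cycles. Synthesized with the Magnachip 0.18um CMOS process library, the proposed architecture has a gate count of 14,173 at an operating frequency of 1.5MHz. Performance measured with data extracted from the standard reference software JM 9.4 shows that the execution cycle count of the proposed architecture is improved by 38.74% compared with existing architectures.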
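As a rough illustration of the two-pass structure the abstract refers to (a horizontal inverse transform followed by a vertical one), the following Python sketch applies the standard H.264/AVC 4x4 inverse integer transform butterfly to a block of already dequantized coefficients. The function names are hypothetical, and this is a software model under those assumptions, not the hardware architecture proposed in the paper.

```python
def inverse_butterfly(a):
    """1-D 4-point inverse integer transform butterfly (H.264/AVC core transform)."""
    e0 = a[0] + a[2]
    e1 = a[0] - a[2]
    e2 = (a[1] >> 1) - a[3]
    e3 = a[1] + (a[3] >> 1)
    return [e0 + e3, e1 + e2, e1 - e2, e0 - e3]


def inverse_transform_4x4(coeffs):
    """Horizontal pass over rows, then vertical pass over columns."""
    rows = [inverse_butterfly(row) for row in coeffs]
    cols = [inverse_butterfly([rows[r][c] for r in range(4)]) for c in range(4)]
    # cols[c][r] is the value at row r, column c; round and scale by 1/64.
    return [[(cols[c][r] + 32) >> 6 for c in range(4)] for r in range(4)]


# A DC-only block: every residual sample receives the same value.
dc_only = [[64, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
residual = inverse_transform_4x4(dc_only)
```

In a hardware design such as the one described, the four row butterflies of the horizontal pass would plausibly map onto parallel transform units, but that mapping is an assumption here.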

A Constant Time Parallel Algorithm for Finding a Vertex Sequence of the Directed Cycle Graph from the Individual Neighborhood Information (각 정점별 이웃 정보로부터 유향 사이클 그래프의 정점 순서를 찾는 상수 시간 병렬 알고리즘)

Kim, Soo-Hwan;Choi, Jinoh
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2013.10a / pp.773-775 / 2013
In this paper, we consider the problem of finding a vertex sequence of a directed cycle graph from the individual neighborhood information on a reconfigurable mesh (RMESH for short). This problem can be solved in linear time using a sequential algorithm. However, it is difficult to develop a sublinear-time parallel algorithm for the problem because of its sequential nature. All kinds of polygons can be represented by directed cycles, hence a solution to the problem may be used to solve problems in which a polygon must be constructed from the adjacency information for each vertex. We present a constant-time $n \times n^2$ RMESH algorithm for the problem with $n$ vertices.
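The linear-time sequential baseline mentioned above can be sketched as follows. This assumes the neighborhood information is given as a successor map (each vertex knows the vertex it points to); the function name and input format are hypothetical, and the sketch does not model the RMESH algorithm itself.

```python
def cycle_vertex_sequence(successor, start):
    """Follow successor pointers from `start` once around the cycle (O(n) steps)."""
    sequence = []
    v = start
    for _ in range(len(successor)):
        sequence.append(v)
        v = successor[v]
    # A single directed cycle visits every vertex exactly once and returns to start.
    if v != start or len(set(sequence)) != len(sequence):
        raise ValueError("successor map is not a single directed cycle")
    return sequence


successor = {"a": "c", "b": "a", "c": "d", "d": "b"}
order = cycle_vertex_sequence(successor, "a")
```

Each step depends on the result of the previous one, which is the sequential nature the abstract points to.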

An Architecture for Two's Complement Serial-Parallel Multiplication (2의 보수 직병렬 승산을 위한 논리구조)

Mo, Sang-Man;Yoon, Yong-Ho
- ETRI Journal / v.13 no.2 / pp.9-14 / 1991
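A serial-parallel multiplier has a structure in which one of the multiplicand and the multiplier is input in parallel while the other is input serially, and it is widely used in digital signal processing, on-line applications, and special-purpose computing systems. This paper proposes a logic architecture for a two's complement serial-parallel multiplier. Thanks to an efficient two's complement serial-parallel multiplication algorithm, all data signals in the proposed multiplier use only local connections, and the multiplier can be designed easily from simple, modular hardware. Like unsigned multiplication, the multiplier requires only 2n+1 cycles, and each cycle time exceeds that of unsigned serial-parallel multiplication only by the delay of the XOR gate added for two's complement multiplication. The proposed two's complement serial-parallel multiplier is also well suited to VLSI implementation.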
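To make the serial-parallel idea concrete, here is a minimal behavioral sketch in Python: the multiplicand is available in full, the multiplier arrives one bit per step (LSB first), and the sign bit of the multiplier is given negative weight as two's complement requires. The function names are hypothetical, and the sketch models only the arithmetic; it does not reproduce the 2n+1 cycle timing or the XOR-based logic of the proposed architecture.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def serial_parallel_multiply(x, y, n):
    """Multiply two n-bit two's complement integers, consuming y one bit per step."""
    acc = 0
    for i in range(n):
        y_bit = (y >> i) & 1              # multiplier bit arriving serially, LSB first
        partial = (x << i) if y_bit else 0
        # The most significant (sign) bit of the multiplier has weight -2^(n-1).
        acc += -partial if i == n - 1 else partial
    return to_signed(acc, 2 * n)


product = serial_parallel_multiply(5, -3, 4)
```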

Simulation of the Characteristics of High-Performance Absorption Cycles (고성능 흡수냉동 사이클의 특성 시뮬레이션)

윤정인;오후규;이용화
- Transactions of the Korean Society of Mechanical Engineers / v.19 no.1 / pp.231-239 / 1995
This paper describes a computer simulation of triple-effect water-lithium bromide absorption cooling cycles. The performance of the absorption systems is investigated through cycle simulation to obtain the system characteristics as a function of the cooling water inlet temperature, the working solution concentrations, the ratio of the amount of weak solution distributed to the high, middle and low temperature generators, and the temperature difference of each solution heat exchanger. The efficiency of different cycles has been studied, and the simulation results show that a higher coefficient of performance could be obtained for the parallel cycle with a constant solution distribution rate. As a result of this analysis, the optimum designs and operating conditions were determined based on the operating conditions and coefficient of performance.
https://doi.org/10.22634/KSME.1995.19.1.231

Design of High-speed H.264/AVC Parallel Decoder Using ASIP Approach (ASIP 기술을 활용한 H.264/AVC 고속 병렬 복호화기 설계)

Ji, Bong-Il;Sim, Dong-Gyu;Kim, Kyung-Su;Park, Seong-Mo
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.11a / pp.251-254 / 2009
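In this paper, a high-speed parallel H.264/AVC decoder is designed using Application Specific Instruction-set Processor (ASIP) technology for real-time decoding of high-resolution video. First, the decoder was designed with a hardware-optimized structure, and multimedia-specific instructions described in LISA were added to the instruction set. Measured on a cycle-based simulator, the resulting high-speed H.264/AVC decoder showed a reduction of about 35% in decoding cycles compared with the existing design. For further performance improvement, a parallel H.264/AVC decoder was designed using several of these high-speed decoders. The parallel decoder greatly improves performance by decoding multiple macroblocks simultaneously, reducing decoding cycles by about 75% compared with the high-speed decoder. The paper thus presents a design method for a high-speed parallel H.264/AVC decoder for real-time decoding of high-resolution video.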
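The two reported reductions are stated relative to different baselines, so they do not simply add. A small arithmetic sketch, assuming both percentages were measured on the same workload and compose multiplicatively (an assumption not stated in the abstract; the function name is hypothetical):

```python
def remaining_fraction(reductions):
    """Fraction of the original cycles left after applying successive reductions."""
    remaining = 1.0
    for r in reductions:
        remaining *= 1.0 - r
    return remaining


# About 35% fewer cycles than the existing design, then about 75% fewer than that.
remaining = remaining_fraction([0.35, 0.75])
overall_reduction = 1.0 - remaining
```

Under that assumption, the parallel decoder would need roughly 16% of the original decoding cycles, an overall reduction of roughly 84%.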

Methodology and its Hardware Architecture for High-speed Parallel Computation of Computer Generated Hologram (컴퓨터 생성 홀로그램의 고속 병렬 연산을 위한 연산방식 및 하드웨어 구조)

Yang, Wol-Sung;Choi, Hyun-Jun;Seo, Young-Ho;Yoo, Ji-Sang;Kim, Dong-Wook
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2010.11a / pp.30-33 / 2010
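This paper modifies the computation formula and implements it in hardware to solve the problem that generating a digital hologram by computation (computer-generated hologram, CGH) is slowed by the large amount of calculation. Compared with previously proposed CGH algorithms, the proposed algorithm enables fully parallel processing of the digital hologram and thereby resolves the speed problem. According to the implementation results, if the hardware is available, all hologram components from one light source can be computed in at most 3 cycles, and with pipelining, the hologram result for one light source is obtained every cycle after a latency of two cycles.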

A Zero-latency Cycle Detection Scheme for Enhanced Parallelism in Multiprocessing Systems (다중처리 시스템의 병렬성 증대를 위한 사이클의 비 지연 발견 기법)

Kim Ju Gyun
- Journal of KIISE:Computer Systems and Theory / v.32 no.2 / pp.49-54 / 2005
This paper presents a non-blocking deadlock detection scheme with immediate cycle detection in multiprocessing systems. We assume an expedient state and a special case where each type of resource has one unit and each request is limited to one resource unit at a time. Unlike previous deadlock detection schemes, this new method takes O(1) time to detect a cycle and O(n+m) time to handle blocking or resource release, where n and m are the numbers of processes and resources in the system. The deadlock detection latency is thus minimized and is constant regardless of n and m. Moreover, in a multiprocessing system, the operating system can handle the blocking or release on the fly on a separate processor, and thus does not interfere with user process execution. For applications where deadlock is a concern, a predictable and zero-latency deadlock detection scheme could be very useful.
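For the single-unit, single-request case assumed in the abstract, a deadlock corresponds to a cycle in the wait-for chain. The sketch below shows the straightforward way to check for such a cycle by walking the chain at request time. It is explicitly not the paper's O(1) scheme: this walk takes time proportional to the chain length, and the function name and the two dictionaries are hypothetical.

```python
def would_deadlock(requester, resource, holder, waiting_for):
    """Return True if blocking `requester` on `resource` would close a cycle.

    holder:      resource -> process currently holding it (one unit per resource)
    waiting_for: process  -> the single resource it is blocked on
    """
    seen = set()
    current = resource
    while current in holder:
        owner = holder[current]
        if owner == requester:
            return True                   # the chain leads back to the requester
        if owner in seen or owner not in waiting_for:
            return False                  # chain ends at a running process
        seen.add(owner)
        current = waiting_for[owner]
    return False


holder = {"R1": "P1", "R2": "P2"}
waiting_for = {"P1": "R2"}                # P1 holds R1 and is blocked on R2
p2_deadlocks = would_deadlock("P2", "R1", holder, waiting_for)
p3_deadlocks = would_deadlock("P3", "R1", holder, waiting_for)
```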

An analysis on the characteristics of superheater organization of ORC system for marine waste heat recovery system(WHRS) (선박폐열회수(WHRS) ORC 시스템의 과열기 구성에 따른 특성 해석)

Kim, Jong-Kwon;Kim, You-Taek;Kang, Ho-Keun
- Journal of Advanced Marine Engineering and Technology / v.38 no.1 / pp.8-14 / 2014
This research designed a 250kW Waste Heat Recovery System (WHRS) power generation system whose working fluid is R-245fa and studied the cycle characteristics for different superheater configurations. Two conditions were simulated: series connection and parallel connection between the superheater and the evaporator. In the series-connection simulation, output could be improved by 4.7% because of the increase in enthalpy from superheating the working fluid. With the target output set to 250kW, the cycle flow rate could be reduced by 4.1%. In the parallel-connection simulation, with the target cycle output set to 250kW, the cycle flow rate decreased as the flow rate of the heat source fluid to the superheater increased. As a result, the electric power of the working fluid pump was reduced by up to 7.9%, and cycle efficiency and net efficiency did not change much with the flow rate ratio.
https://doi.org/10.5916/jkosme.2014.38.1.8

An Optimized Hardware Design for High Performance Residual Data Decoder (고성능 잔여 데이터 복호기를 위한 최적화된 하드웨어 설계)

Jung, Hong-Kyun;Ryoo, Kwang-Ki
- Journal of the Korea Academia-Industrial cooperation Society / v.13 no.11 / pp.5389-5396 / 2012
In this paper, an optimized residual data decoder architecture is proposed to improve performance in H.264/AVC. The proposed architecture is an integrated architecture that combines a parallel inverse transform architecture with a parallel inverse quantization architecture whose common operation units apply new inverse quantization equations. The equations, which contain no division operation, reduce the execution time and the number of operations in the inverse quantization process. The common operation unit uses a multiplier and a left shifter to evaluate the equations. The inverse quantization architecture, with four common operation units, reduces the execution time of inverse quantization to one cycle. The inverse transform architecture consists of eight inverse transform operation units and therefore reduces the execution time of the inverse transform to one cycle. Because the inverse quantization and inverse transform operations are performed concurrently, inverse quantization and inverse transform for one $4 \times 4$ block together take one cycle. The proposed architecture is synthesized using Magnachip 0.18um CMOS technology. The gate count and the critical path delay of the architecture are 21.9k and 5.5ns, respectively. The architecture achieves a throughput of 2.89Gpixels/sec at the maximum clock frequency of 181MHz. Performance measured with data extracted from JM 9.4 shows that the execution cycle count of the proposed architecture is about 88.5% less than that of existing designs.
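The reported figures can be cross-checked with simple arithmetic. The sketch below uses only numbers from the abstract (one $4 \times 4$ block per cycle, 181MHz, 5.5ns critical path); the variable names are illustrative.

```python
PIXELS_PER_BLOCK = 4 * 4        # one 4x4 block
CYCLES_PER_BLOCK = 1            # inverse quantization + inverse transform per block
CLOCK_HZ = 181e6                # reported maximum clock frequency
CRITICAL_PATH_S = 5.5e-9        # reported critical path delay

# Pixels processed per second when one block completes every cycle.
throughput = PIXELS_PER_BLOCK / CYCLES_PER_BLOCK * CLOCK_HZ

# Upper bound on the clock frequency implied by the critical path delay.
max_clock_from_delay = 1.0 / CRITICAL_PATH_S
```

Sixteen pixels per cycle at 181MHz gives about 2.896 Gpixels/sec, consistent with the stated 2.89Gpixels/sec, and a 5.5ns critical path corresponds to a clock just under 182MHz.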
https://doi.org/10.5762/KAIS.2012.13.11.5389

Parallel Architecture Design of H.264/AVC CAVLC for UD Video Realtime Processing (UD(Ultra Definition) 동영상 실시간 처리를 위한 H.264/AVC CAVLC 병렬 아키텍처 설계)

Ko, Byung Soo;Kong, Jin-Hyeung
- Journal of the Institute of Electronics and Information Engineers / v.50 no.5 / pp.112-120 / 2013
In this paper, we propose a high-performance H.264/AVC CAVLC encoder for real-time processing of UD video. Statistical values are obtained in one cycle through parallel arithmetic and logical operations on a non-zero bit stream, which marks each coefficient as zero or non-zero. To encode one codeword per cycle, we remove the recursive operation in level encoding through parallel comparison of the coefficient and the escape value. In order to implement a high-speed circuit, the proposed CAVLC encoder is designed as a two-stage {statistical scan, codeword encoding} pipeline. To reduce the encoding table, an arithmetic unit is used to encode non-coefficient and to calculate the codeword. The proposed architecture was simulated with a 0.13um standard cell library. The gate count is 33.4Kgates. The architecture can support Ultra Definition video ($3840 \times 2160$) at 100 frames per second when running at 100MHz.
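The statistical values a CAVLC encoder needs for a coefficient block are the total number of non-zero coefficients, the number of trailing ±1 coefficients (at most three), and the number of zeros before the last non-zero coefficient. The Python sketch below computes them sequentially from zigzag-ordered coefficients. The function name is hypothetical, and the sequential loop stands in for the one-cycle parallel logic the abstract describes rather than reproducing it.

```python
def cavlc_statistics(coeffs):
    """Return (total_coeff, trailing_ones, total_zeros) for zigzag-ordered coefficients."""
    # Equivalent of the non-zero bit stream: positions whose coefficient is non-zero.
    nonzero_positions = [i for i, c in enumerate(coeffs) if c != 0]
    total_coeff = len(nonzero_positions)

    # Count up to three consecutive +/-1 values, scanning from the highest frequency.
    trailing_ones = 0
    for i in reversed(nonzero_positions):
        if abs(coeffs[i]) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break

    # Zeros located before the last non-zero coefficient.
    total_zeros = nonzero_positions[-1] + 1 - total_coeff if nonzero_positions else 0
    return total_coeff, trailing_ones, total_zeros


block = [0, 3, 0, 1, -1, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
stats = cavlc_statistics(block)
```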
https://doi.org/10.5573/ieek.2013.50.5.112

