Search | Korea Science

Design of DCT/IDCT Core Processor using Module Generator Technique (모듈생성 기법을 이용한 DCT/IDCT 코어 프로세서의 설계)

황준하;한택돈
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.18 no.10
- /
- pp.1433-1443
- /
- 1993
DCT(Discrete Cosine Transform) / IDCT(Inverse DCT) is widely used in various image compression and decompression systems as well as in DSP(Digital Signal Processing) applications. Since DCT/ IDCT is one of the most complicated part of the compression system, the performance of the system can be greatly enchanced by improving the speed of DCT/IDCT operation. In this thesis, we designed a DCT/IDCT core processor using module generator technique. By utilizing the partial sum and DA(Distributed Arithmetic) techniques, the DCT/ IDCT core processor is designed within small area. It is also designed to perform the IDCT(Inverse DCT) operation with little additional circuitry. The pipeline structure of the core processor enables the high performance, and the high accuracy of the DCT/IDCT operation is obtained by having fewer rounding stages. The proposed design is independent of design rules, and the number of the input bits and the accuracy of the internal calculation coa be easily adjusted due to the module generator technique. The accuracy of the processor satisfies the specifications in CCITT recommendation H, 261.
PDF

Architecture of 2-D DCT processor adopting accuracy comensator (정확도 보상기를 적용한 2차원 이산 코사인 변환 프로세서의 구조)

김견수;장순화;김재호;손경식
- Journal of the Korean Institute of Telematics and Electronics A
- /
- v.33A no.10
- /
- pp.168-176
- /
- 1996
This paper presetns a 2-D DCT architecture adopting accurac y compensator for reducing the hardware complexity and increasing processing speed in VL\ulcornerSI implementation. In the application fields such as moving pictures experts group (MPEG) and joint photographic experts group (JPEG), 2-D DCT processor must be implemented precisely enough to meet the accuracy specifications of the ITU-T H.261. Almost all of 2-D DCT processors have been implemented using many multiplications and accumulations of matrices and vectors. The number of multiplications and accumulations seriously influence on comlexity and speed of 20D DCT processor. In 2-D DCT with fixed-point calculations, the computation bit width must be sufficiently large for the above accuracy specifications. It makes the reduction of hardware complexity hard. This paper proposes the accuracy compensator which compensates the accuracy of the finite word length calculation. 2-D DCT processor with the proposed accuracy compensator shows fairly reduced hardware complexity and improved processing speed.
PDF

Optimization Design Method for Inner Product Using CSHM Algorithm and its Application to 1-D DCT Processor (연산공유 승산 알고리즘을 이용한 내적의 최적화 및 이를 이용한 1차원 DCT 프로세서 설계)

이태욱;조상복
- The Transactions of the Korean Institute of Electrical Engineers D
- /
- v.53 no.2
- /
- pp.86-93
- /
- 2004
The DCT algorithm needs an efficient hardware architecture to compute inner product. The conventional design method, like ROM-based DA(Distributed Arithmetic), has large hardware complexity. Because of this reason, a CSHM(Computation Sharing Multiplication) was proposed for implementing inner product by Park. However, the Park's CSHM has inefficient hardware architecture in the precomputer and select units. Therefore it degrades the performance of the multiplier. In this paper, we presents the optimization design method for inner product using CSHM algorithm and applied it to implementation of 1-D DCT processor. The experimental results show that the proposed multiplier is more efficient than Park's when hardware architectures and logic synthesis results were compared. The designed 1-D DCT processor by using proposed design method is more high performance than typical methods.
PDF KSCI

Design of 1-D DCT processor using a new efficient computation sharing multiplier (새로운 연산 공유 승산기를 이용한 1차원 DCT 프로세서의 설계)

Lee, Tae-Wook;Cho, Sang-Bock
- The KIPS Transactions:PartA
- /
- v.10A no.4
- /
- pp.347-356
- /
- 2003
The OCT algorithm needs efficient hardware architecture to compute inner product. The conventional methods have large hardware complexity. Because of this reason. a computation sharing multiplier was proposed for implementing inner product. However, the existing multiplier has inefficient hardware architecture in precomputer and select units. Therefore it degrades the performance of the multiplier. In this paper, we proposed a new efficient computation sharing multiplier and applied it to implementation of 1-D DCT processor. The comparison results show that the new multiplier is more efficient than an old one when hardware architectures and logic synthesis results were compared. The designed 1-D DCT processor by using the proposed multiplier is more high performance than typical design methods.
https://doi.org/10.3745/KIPSTA.2003.10A.4.347 인용 PDF KSCI

A study on application of DCT algorithm with MVP(Multimedia Video Processor) (MVP(Multimedia Video Processor)를 이용한 DCT알고리즘 구현에 관한 연구)

김상기;정진현
- 제어로봇시스템학회:학술대회논문집
- /
- 1997.10a
- /
- pp.1383-1386
- /
- 1997
Discrete cosine transform(DCT) is the most popular block transform coding in lossy mode. DCT is close to statistically optimal transform-the Karhunen Loeve transform. In this paper, a module for DCT encoder is made with TMS320C80 based on JPEG and MPEG, which are intermational standards for image compression. the DCT encoder consists of three parts-a transformer, a vector quantizer and an entropy encoder.
PDF

A Development of a high speed DCT parallel processor (고속 DCT 병렬처리기의 개발)

박종원;유기현
- Journal of the Korean Institute of Telematics and Electronics B
- /
- v.32B no.8
- /
- pp.1085-1090
- /
- 1995
The Discrete Cosine Transform(DCT) is effective technique for image compression, which is widely used in the area of digital signal processing. In this paper, an efficient DCT processor is proposed and simulated by using Verilog HDL. This algorithm is improved 60% in processing speed, but it's somewhat complicate compared with Y. Arai's algorithm. This algorithm will be used efficiently for real time image processing.
PDF

The Design of SoC for DCT/DWT Processor (DCT/DWT 프로세서를 위한 SoC 설계)

Kim, Young-Jin;Lee, Hyon-Soo
- Proceedings of the IEEK Conference
- /
- 2006.06a
- /
- pp.527-528
- /
- 2006
In this paper, we propose an IP design and implementation of System on a chip(SoC) for Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT) processor using adder-based DA(Adder-based Distributed Arithmetic). To reduced hardware cost and to improve operating speed, the combined DCT/ DWT processor used the bit-serial method and DA module. The transform of coefficient equation result in reduction in hardware cost and has a regularity in implementation. We use Verilog-HDL and Xilinx ISE for simulation and implement FPGA on SoCMaster-3.
PDF

A Wavefront Array Processor Utilizing a Recursion Equation for ME/MC in the frequency Domain (주파수 영역에서의 움직임 예측 및 보상을 위한 재귀 방정식을 이용한 웨이브프런트 어레이 프로세서)

Lee, Joo-Heung;Ryu, Chul
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.31 no.10C
- /
- pp.1000-1010
- /
- 2006
This paper proposes a new architecture for DCT-based motion estimation and compensation. Previous methods do riot take sufficient advantage of the sparseness of 2-D DCT coefficients to reduce execution time. We first derive a recursion equation to perform DCT domain motion estimation more efficiently; we then use it to develop a wavefront array processor (WAP) consisting of processing elements. In addition, we show that the recursion equation enables motion predicted images with different frequency bands, for example, from the images with low frequency components to the images with low and high frequency components. The wavefront way Processor can reconfigure to different motion estimation algorithms, such as logarithmic search and three step search, without architectural modifications. These properties can be effectively used to reduce the energy required for video encoding and decoding. The proposed WAP architecture achieves a significant reduction in computational complexity and processing time. It is also shown that the motion estimation algorithm in the transform domain using SAD (Sum of Absolute Differences) matching criterion maximizes PSNR and the compression ratio for the practical video coding applications when compared to tile motion estimation algorithm in the spatial domain using either SAD or SSD.
PDF KSCI

Implementation of DCT using Bit Slice Signal Processor (BIT SLICE SIGNAL PROCESSOR를 이용한 DCT의 구현)

Kim, Dong-L.;Go, Seok-B.;Paek, Seung-K.;Lee, Tae-S.;Min, Byong-G.
- Proceedings of the KIEE Conference
- /
- 1987.07b
- /
- pp.1449-1453
- /
- 1987
A microprogrammable Bit Slice Sinal Processor for image processing is implemented. Processing speed is increased by the parallelism in horizontal microprogram using 120bits microcode, pipelined architecture, 2 bank memory switching that interfaces with the Host through DMA, a variable clock control, overflow checking H/W,look-up table method and cache memory. With this processor, a DCT algorithm which uses 2-D FFT is performed. The execution time for $512{\times}512{\times}8$ image is 12 sec when 16 bit operation is runned, and the recovered image has acceptable quality with MSE 0.276%.
PDF

2-D DCT/IDCT Processor Design Reducing Adders in DA Architecture (DA구조 이용 가산기 수를 감소한 2-D DCT/IDCT 프로세서 설계)

Jeong Dong-Yun;Seo Hae-Jun;Bae Hyeon-Deok;Cho Tae-Won
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.43 no.3 s.345
- /
- pp.48-58
- /
- 2006
This paper presents 8x8 two dimensional DCT/IDCT processor of adder-based distributed arithmetic architecture without applying ROM units in conventional memories. To reduce hardware cost in the coefficient matrix of DCT and IDCT, an odd part of the coefficient matrix was shared. The proposed architecture uses only 29 adders to compute coefficient operation in the 2-D DCT/IDCT processor, while 1-D DCT processor consists of 18 adders to compute coefficient operation. This architecture reduced 48.6% more than the number of adders in 8x8 1-D DCT NEDA architecture. Also, this paper proposed a form of new transpose network which is different from the conventional transpose memory block. The proposed transpose network block uses 64 registers with reduction of 18% more than the number of transistors in conventional memory architecture. Also, to improve throughput, eight input data receive eight pixels in every clock cycle and accordingly eight pixels are produced at the outputs.
PDF KSCI

Search Result 31, Processing Time 0.026 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)