Search | Korea Science

(Design of Systolic Away for High-Speed Fractal Image Compression by Data Reusing) (데이터 재사용에 의한 고속 프랙탈 영상압축을 위한 시스토릭 어레이의 설계)

U, Jong-Ho;Lee, Hui-Jin;Lee, Su-Jin;Seong, Gil-Yeong
- Journal of the Institute of Electronics Engineers of Korea SC
- /
- v.39 no.3
- /
- pp.220-227
- /
- 2002
An one-dimensional VLSI array for high speed processing of Fractal image compression was designed. Using again the overlapped input data of adjacent domain blocks in the existing one-dimensional VLSI array, we can save the number of total input for the operations, and so we can save the total computation time. In the design procedure, we considered the data dependences between the input data, reordered the input data to the array, and designed the processing elements. Registers and multiplexors are added for the storing and routing of the input data in some processing elements. Consequently as adding a little hardware, this design shows (N-4B)/4(N-B) times of speed-up compared with the existing array, where N is image size and B is block size.
PDF KSCI

VLSI Array Architecture for High Speed Fractal Image Compression (고속 프랙탈 영상압축을 위한 VLSI 어레이 구조)

성길영;이수진;우종호
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.25 no.4B
- /
- pp.708-714
- /
- 2000
In this paper, an one-dimensional VLSI array for high speed processing of fractal image compression algorithm based the quad-tree partitioning method is proposed. First of all, the single assignment code algorithm is derived from the sequential Fisher's algorithm, and then the data dependence graph(DG) is obtained. The two-dimension array is designed by projecting this DG along the optimal direction and the one-dimensional VLSI array is designed by transforming the obtained two-dimensional array. The number of Input/Output pins in the designed one-dimensional array can be reduced and the architecture of process elements(PEs) can he simplified by sharing the input pins of range and domain blocks and internal arithmetic units of PEs. Also, the utilization of PEs can be increased by reusing PEs for operations to the each block-size. For fractal image compression of 512X512gray-scale image, the proposed array can be processed fastly about 67 times more than sequential algorithm. The operations of the proposed one-dimensional VLSI array are verified by the computer simulation.
PDF

Design of VLSI Array Architecture with Optimal Pipeline Period for Fast Fractal Image Compression (고속 프랙탈 영상압축을 위한 최적의 파이프라인 주기를 갖는 VLSI 어레이 구조 설계)

성길영;우종호
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.25 no.5A
- /
- pp.702-708
- /
- 2000
In this paper, we designed one-dimensional VLSI array with optimal pipeline period for high speed processing fractal image compression. The algorithm is derived which is suitable for VLSI array from axed block partition algorithm. Also the algorithm satisfies high quality of image and high compression-ratio. The designed VLSI array has optimal pipeline relied because the required processing time of PEs is distributed as same as possible. As this result, we can improve the processing speed up to about 3 times. The number of input/output pins can be reduced by sharing the input/output and arithmetic unit of the domain blocks and the range blocks.
PDF

A Study on the VLSI Systolic Array Implementation of 2-Dimensional FIR Digital Filter (2-Dimensional FIR 디지털 필터의 VLSI 시스토릭 어레이 구조 실험에 관한 연구)

김수현;문대철
- The Journal of the Acoustical Society of Korea
- /
- v.12 no.4
- /
- pp.32-38
- /
- 1993
2-D FIR 필터를 시스토릭 어레이 구조로 실현하는 방법을 제시하였다. 시스토릭 어레이는 1-D FIR 필터로 부분 실현한 후 병렬연겨랗여 구현하였다. 부분 실현한 시스토릭 어레이의 마지막 입력신호를 다음 단의 입력에 직접연결시킴으로써 입력 지연에 사용되는저장요소를 절약 시킨다. 1-D 시스ㅏ토릭 어레이는 지역통신 접근에 의해 DG를 설계한 후 SFG로으ㅟ 사상을 통해 유도하였다. 유도된 SFG는 DG의 노드가 보다 적은수의 PE에 사상됨으로써 PE의 이용률을 개선할 수 잇다. 유도된 구조는 매우 간단하며, 입력 샘플이 공급되어지면 매 샘플링 기간마다 새로운 출력을 얻는 매우 SHB은 데이터 비율(data rate)을 갖는다. 시스토릭 어레이는 규칙적이고, 모듈성이며, local interconnection, highly synchronized multiprocessing 의 특징을 갖기 때문에 VLSI 실현에 매우 적합하다. PE 셀 구조는 높은 처리율, 최소 계산시간과 최소 파이프라인 주기를 갖도록 설계하였다.
PDF

A VLSI Array Processor Architecture for High-Speed Processing of Full Search Block Matching Algorithm (완전탐색 블럭정합 알고리즘의 고속 처리를 위한 VLSI 어레이 프로세서의 구조)

이수진;우종호
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.27 no.4A
- /
- pp.364-370
- /
- 2002
In this paper, we propose a VLSI array architecture for high speed processing of FBMA. First of all, the sequential FBMA is transformed into a single assignment code by using the index space expansion, and then the dependance graph is obtained from it. The two dimensional VLSI array is derived by projecting the dependance graph along the optimal direction. Since the candidate blocks in the search range are overlapped with columns as well as rows, the processing elements of the VLSI array are designed to reuse the overlapped data. As the results, the number of data inputs is reduced so that the processing performance is improved. The proposed VLSI array has (N$^2$+1)${\times}$(2p+1) processing elements and (N+2p) input ports where N is the block size and p is the maximum search range. The computation time of the rat reference block is (N$^2$+2(p+1)N+6p), and the block pipeline period is (3N+4p-1).
PDF KSCI

Efficient One-dimensional VLSI array using the Data reuse for Fractal Image Compression (데이터 재사용을 이용한 프랙탈 영상압축을 위한 효율적인 일차원 VLSI 어레이)

이희진;이수진;우종호
- Proceedings of the Korean Institute of Information and Commucation Sciences Conference
- /
- 2001.05a
- /
- pp.265-268
- /
- 2001
In this paper, we designed one-dimensional VLSI array with high speed processing in Fractal image compression. fractal image compression algorithm partitions the original image into domain blocks and range blocks then compresses data using the self similarity of blocks. The image is partitioned into domain block with 50% overlapping. Domain block is reduced by averaging the original image to size of range block. VLSI array is trying to search the best matching between a range block and a large amount of domain blocks. Adjacent domain blocks are overlapped, so we can improve of each block's processing speed using the reuse of the overlapped data. In our experiment, proposed VLSI array has about 25% speed up by adding the least register, MUX, and DEMUX to the PE.
PDF

Reduction of Input Pins in VLSI Array for High Speed Fractal Image Compression (고속 프랙탈 영상압축을 위한 VLSI 어레이의 입력핀의 감소)

성길영;전상현;이수진;우종호
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.26 no.12A
- /
- pp.2059-2066
- /
- 2001
In this paper, we proposed a method to reduce the number of input pins in one-dimensional VLSI array for fractal image compression. We use quad-tree partition scheme and can reduce the number of the input pins up to 50% by sharing the domain\`s and the range\`s data input pins in the proposed VLSI array architecture. Also, we can reduce the input pins and simplify the internal operation circuit of the processing elements by eliminating a few number of bits of the least significant bits of the input data. We simulated using the 256$\times$256 and 512$\times$512 Lena images to verify performance of the proposed method. As the result of simulation, we can decompress the original image with about 32dB(PSNR) in spite of elimination of the least significant 2-bit in the original input data, and additionally reduce the number of input pins up to 25% compared to VLSI array sharing input pins of range and domain.
PDF

Design of an efficient VLSI architecture of SADCT based on systolic array (시스톨릭 어레이에 기반한 SADCT의 효율적 VLSI 구조 설계)

Gang, Tae Jun;Jeong, Ui Yun;Ha, Yeong Ho
- Journal of the Institute of Electronics Engineers of Korea SP
- /
- v.38 no.3
- /
- pp.46-46
- /
- 2001
본 논문에서는 시스톨릭 어레이에 기반한 모양 적응적 이산 여현 변환(SADCT)의 효율적 VLSI 구조를 제안한다. 모양 적응적 이산 여현 변환은 이산 여현 변환과 달리 변환 크기가 각 블록에서의 객체의 모양에 따라 가변적이므로 기존의 시간 순환구조에서는 각 처리소자의 이용도와 처리속도가 모두 저하된다. 본 논문에서는 이러한 단점을 극복하기 위해 메모리를 필요로 하지 않는 시스톨릭 어레이에 기반한 구조를 제안한다. 제안된 구조에서는 1차원 SADCT를 연속적으로 수행함으로 처리속도를 향상시키고 첫 번째 열의 처리소자들을 마지막 열의 처리소자들과 연결하고, 입력 데이터는 각각의 재배열된 블록에서의 최대 데이터 크기에 따라 각 열에 병렬로 입력하여 처리소자의 이용도를 향상시켰다. 제안된 구조는 VHDL로 기술하고 MentorTM를 이용하여 기능검증을 수행하였다. 검증결과, 하드웨어 복잡도가 다소 증가하나, 처리속도는 기존의 방법에 비해 두 배정도 향상되었다.

An Architecture of One-Dimensional Systolic Array for Full-Search Block Matching Algorithm (완전탐색 블럭정합 알고리즘을 위한 일차원 시스톨릭 어레이의 구조)

Lee, Su-Jin;Woo, Chong-Ho
- Journal of the Institute of Electronics Engineers of Korea SC
- /
- v.39 no.5
- /
- pp.34-42
- /
- 2002
In this paper, we designed the VLSI array architecture for the high speed processing of the motion estimation used by block matching algorithm. We derived the one dimensional systolic array from the full search block matching algorithm. The data and control signals of the proposed systolic array are passed through adjacent processing element. So proposed architecture has temporal and spatial locality. The I/O ports exists only in the first and last processing elements of the array. This architecture has low pin counts and modular expandability. So the proposed array architecture can be cascaded for different block size and search range.
PDF KSCI

An Efficient Array Algorithm for VLSI Implementation of Vector-radix 2-D Fast Discrete Cosine Transform (Vector-radix 2차원 고속 DCT의 VLSI 구현을 위한 효율적인 어레이 알고리듬)

신경욱;전흥우;강용섬
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.18 no.12
- /
- pp.1970-1982
- /
- 1993
This paper describes an efficient array algorithm for parallel computation of vector-radix two-dimensional (2-D) fast discrete cosine transform (VR-FCT), and its VLSI implementation. By mapping the 2-D VR-FCT onto a 2-D array of processing elements (PEs), the butterfly structure of the VR-FCT can be efficiently importanted with high concurrency and local communication geometry. The proposed array algorithm features architectural modularity, regularity and locality, so that it is very suitable for VLSI realization. Also, no transposition memory is required, which is invitable in the conventional row-column decomposition approach. It has the time complexity of O(N+Nnzp-log2N) for (N*N) 2-D DCT, where Nnzd is the number of non-zero digits in canonic-signed digit(CSD) code, By adopting the CSD arithmetic in circuit desine, the number of addition is reduced by about 30%, as compared to the 2`s complement arithmetic. The computational accuracy analysis for finite wordlength processing is presented. From simulation result, it is estimated that (8*8) 2-D DCT (with Nnzp=4) can be computed in about 0.88 sec at 50 MHz clock frequency, resulting in the throughput rate of about 72 Mega pixels per second.
PDF

Search Result 51, Processing Time 0.016 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)