• Title/Summary/Keyword: clock cycle

Search Result 181, Processing Time 0.034 seconds

A Multi-Level Accumulation-Based Rectification Method and Its Circuit Implementation

  • Son, Hyeon-Sik;Moon, Byungin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.6
    • /
    • pp.3208-3229
    • /
    • 2017
  • Rectification is an essential procedure for simplifying the disparity extraction of stereo matching algorithms by removing vertical mismatches between left and right images. To support real-time stereo matching, studies have introduced several look-up table (LUT)- and computational logic (CL)-based rectification approaches. However, to support high-resolution images, the LUT-based approach requires considerable memory resources, and the CL-based approach requires numerous hardware resources for its circuit implementation. Thus, this paper proposes a multi-level accumulation-based rectification method as a simple CL-based method and its circuit implementation. The proposed method, which includes distortion correction, reduces addition operations by 29%, and removes multiplication operations by replacing the complex matrix computations and high-degree polynomial calculations of the conventional rectification with simple multi-level accumulations. The proposed rectification circuit can rectify $1,280{\times}720$ stereo images at a frame rate of 135 fps at a clock frequency of 125 MHz. Because the circuit is fully pipelined, it continuously generates a pair of left and right rectified pixels every cycle after 13-cycle latency plus initial image buffering time. Experimental results show that the proposed method requires significantly fewer hardware resources than the conventional method while the differences between the results of the proposed and conventional full rectifications are negligible.

An Optimized Hardware Design for High Performance Residual Data Decoder (고성능 잔여 데이터 복호기를 위한 최적화된 하드웨어 설계)

  • Jung, Hong-Kyun;Ryoo, Kwang-Ki
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.13 no.11
    • /
    • pp.5389-5396
    • /
    • 2012
  • In this paper, an optimized residual data decoder architecture is proposed to improve the performance in H.264/AVC. The proposed architecture is an integrated architecture that combined parallel inverse transform architecture and parallel inverse quantization architecture with common operation units applied new inverse quantization equations. The equations without division operation can reduce execution time and quantity of operation for inverse quantization process. The common operation unit uses multiplier and left shifter for the equations. The inverse quantization architecture with four common operation units can reduce execution cycle of inverse quantization to one cycle. The inverse transform architecture consists of eight inverse transform operation units. Therefore, the architecture can reduce the execution cycle of inverse transform to one cycle. Because inverse quantization operation and inverse transform operation are concurrency, the execution cycle of inverse transform and inverse quantization operation for one $4{\times}4$ block is one cycle. The proposed architecture is synthesized using Magnachip 0.18um CMOS technology. The gate count and the critical path delay of the architecture are 21.9k and 5.5ns, respectively. The throughput of the architecture can achieve 2.89Gpixels/sec at the maximum clock frequency of 181MHz. As the result of measuring the performance of the proposed architecture using the extracted data from JM 9.4, the execution cycle of the proposed architecture is about 88.5% less than that of the existing designs.

Efficient Implementation of a Pseudorandom Sequence Generator for High-Speed Data Communications

  • Hwang, Soo-Yun;Park, Gi-Yoon;Kim, Dae-Ho;Jhang, Kyoung-Son
    • ETRI Journal
    • /
    • v.32 no.2
    • /
    • pp.222-229
    • /
    • 2010
  • A conventional pseudorandom sequence generator creates only 1 bit of data per clock cycle. Therefore, it may cause a delay in data communications. In this paper, we propose an efficient implementation method for a pseudorandom sequence generator with parallel outputs. By virtue of the simple matrix multiplications, we derive a well-organized recursive formula and realize a pseudorandom sequence generator with multiple outputs. Experimental results show that, although the total area of the proposed scheme is 3% to 13% larger than that of the existing scheme, our parallel architecture improves the throughput by 2, 4, and 6 times compared with the existing scheme based on a single output. In addition, we apply our approach to a $2{\times}2$ multiple input/multiple output (MIMO) detector targeting the 3rd Generation Partnership Project Long Term Evolution (3GPP LTE) system. Therefore, the throughput of the MIMO detector is significantly enhanced by parallel processing of data communications.

Design of A Low-Voltage and High-Speed Pipelined A/D Converter Using Current-Mode Signals (저전압 고속 전류형 Pipelined A/D 변환기의 설계)

  • 박승균;이희덕;한철희
    • Journal of the Korean Institute of Telematics and Electronics A
    • /
    • v.31A no.3
    • /
    • pp.18-27
    • /
    • 1994
  • An 8-bit 2-stage pipelined current mode A/D converter is designed with a new architecture, where the wideband track-and-hold amplifiers which have 2 integrators in parallel sample input signal twice per clock cycle. The conversion speed of the A-D converter is two times faster than that of conventional pipelined method. The converter is designed to be operated at the power supply voltage of 3.3V with the input dynamic range of 0-256$\mu$A. HSPICE simulation results show the performance of up to 55Msamples/s and power consumption of 150mW with the parameters of ISRC $1.5\mu$m BICMOS process. The chip area is 3${\times}4mm^{2}$.

  • PDF

A design of Discrete Wavelet Transform Encoder for Image Signal Processing (영상신호 처리를 위한 이산 웨이브렛 변환용 부호화기 설계)

  • 김윤홍;김정화양원일이강현
    • Proceedings of the IEEK Conference
    • /
    • 1998.10a
    • /
    • pp.1101-1104
    • /
    • 1998
  • The modern multimedia applications which are video processor, video conference or video phone and so forth require real time processing. Because of a large amount of image data, those require high compression performance. In this paper, the prosposed image processing encoder was designed by using wavelet transform encoding. The proposed filter block can process imae data on the high speed because of composing individual function blocks by parallel and compute both highpass and lowpass coefficient in the same clock cycle. When image data is decomposed into multiresolution, the proposed scheme needs external memory and controller to save intermediate results and it can operate within 33MHz.

  • PDF

An Efficient Clock Cycle Reducing Architecture in Full-Search Block Matching Motion Estimation VLSI (전탐색 블럭정합 움직임추정 VLSI 에서 클럭사이클수를 줄이는 효율적 구조)

  • 윤종성;장순화
    • Proceedings of the IEEK Conference
    • /
    • 2000.09a
    • /
    • pp.259-262
    • /
    • 2000
  • 본 논문은 전탐색 블럭매칭 움직임추정 VLSI 구조에서 클럭당 두연산(하나는 클럭의 상향에지, 하나는 하향에지에서 동작)을 수행하는 PE(Processing Element)를 교번적으로 결선, 클럭의 상향에지는 물론 하향에지에서도 동작하도록 하는 방식으로 클럭 사이클수를 줄이는 VLSI 구조를 제안한다 기존 구조에 그대로 적용되는 본 방법은 공급 데이타폭이 2 배, PE 의 HW 복잡도가 1.5 배 절대차 합 연산의 복잡도가 2 배로 늘어나 전체 하드웨어가 복잡해지나, PE수를 2배로 하여 클럭사이클수를 줄이는 방법에 비해서는 매우 효율적이다. 본 제안 구조는 계층적 움직임 추정 알고리듬을 사용한 MPEG-2 움직임 추정기 개발의 설계에 적용하여 기능과 HW 복잡도를 확인하였다.

  • PDF

An Efficient Design of CCMP for Robust Security Network (효율적인 CCMP 코어 설계)

  • Sung Yun-Jong;Kwon Sung-Gu;Bae Du-Hyun;Park Se-Hyun;Song Oh-Young
    • Proceedings of the Korea Institutes of Information Security and Cryptology Conference
    • /
    • 2006.06a
    • /
    • pp.390-393
    • /
    • 2006
  • IEEE 802.11e 과 IEEE 802.11n에서 data의 높은 전송률을 구현하기 위해 Block Ack와 frame agegation 과 같은 새로운 mechanism이 논의 되고 있다. 이러한 mechanism은 각각의 MPDU processing 마다 짧은 응답시간을 요구한다. 본 논문에서는 위의 새로운 MAC을 지원하는 IEEE 802.11i를 위한 효율적인 CCMP 설계를 제안한다. 제안된 설계에서는 한 AES-CCM core에서 MIC calculation 과 정보 암호화가 128bit씩 순차적으로 수행되어지는 mode toggling 접근을 채택했다. 본 설계에서는 응답시간이 44 clock cycle의 짧은 짧은 시간으로 줄었다. 또한 하나의 AES-CCM core를 사용하고 낮은 주파수에서 수용할만한 data throughput과 응답시간을 얻었기 때문에 하드웨어적인 복잡성과 전력 소모를 줄일수 있었다.

  • PDF

Increase the reliability of the gate driver for amorphous TFT displays

  • Wu, Bo-Cang;Shiau, Miin-Shyue;Wu, Hong-Chong;Liu, Don-Gey
    • 한국정보디스플레이학회:학술대회논문집
    • /
    • 2008.10a
    • /
    • pp.1301-1304
    • /
    • 2008
  • In this study, we used a multiple phase scheme for the clock in the dual-pull-down driver for TFT display panels. In this scheme, the turn-on time for the transistors in the dual-pull-down structure was reduced from 1/2 to 1/4 or 1/8 of the period cycle time. While keeping proper operation of the transistor size of circuit was fine tuned to achieve an optimal performance. The relation between the active time and the transistor dimensions was obtained for the optimal design.

  • PDF

A Study on the Implementation of Format Converter using Median Filter (메디안 필터를 이용한 포맷 변환기 구현에 관한 연구)

  • 김현기;하기종;최영규;류기한;이천희
    • Proceedings of the IEEK Conference
    • /
    • 2003.07b
    • /
    • pp.1137-1140
    • /
    • 2003
  • The area of the prototype device is less than 80mm$^2$. Operating with a 60ns clock cycle, the device typically dissipates only 300mW. The full functionality was proven by using the methodical test programs based on typical image processing operations. Also, we realized the whole process from conventional gray image to color image. Format converters, implemented using multidimensional access memories, transfer the data between the processing element array and conventional bit-parallel components in real time. The completed system is fully functional and performs typical low-level image processing tasks at speed exceeding 30 frames of traditional TV system per second.

  • PDF

An Experimental 0.8 V 256-kbit SRAM Macro with Boosted Cell Array Scheme

  • Chung, Yeon-Bae;Shim, Sang-Won
    • ETRI Journal
    • /
    • v.29 no.4
    • /
    • pp.457-462
    • /
    • 2007
  • This work presents a low-voltage static random access memory (SRAM) technique based on a dual-boosted cell array. For each read/write cycle, the wordline and cell power node of selected SRAM cells are boosted into two different voltage levels. This technique enhances the read static noise margin to a sufficient level without an increase in cell size. It also improves the SRAM circuit speed due to an increase in the cell read-out current. A 0.18 ${\mu}m$ CMOS 256-kbit SRAM macro is fabricated with the proposed technique, which demonstrates 0.8 V operation with 50 MHz while consuming 65 ${\mu}W$/MHz. It also demonstrates an 87% bit error rate reduction while operating with a 43% higher clock frequency compared with that of conventional SRAM.

  • PDF