• Title/Summary/Keyword: Input buffer

Search Result 290, Processing Time 0.024 seconds

VLSI Design of DWT-based Image Processor for Real-Time Image Compression and Reconstruction System (실시간 영상압축과 복원시스템을 위한 DWT기반의 영상처리 프로세서의 VLSI 설계)

  • Seo, Young-Ho;Kim, Dong-Wook
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.1C
    • /
    • pp.102-110
    • /
    • 2004
  • In this paper, we propose a VLSI structure of real-time image compression and reconstruction processor using 2-D discrete wavelet transform and implement into a hardware which use minimal hardware resource using ASIC library. In the implemented hardware, Data path part consists of the DWT kernel for the wavelet transform and inverse transform, quantizer/dequantizer, the huffman encoder/huffman decoder, the adder/buffer for the inverse wavelet transform, and the interface modules for input/output. Control part consists of the programming register, the controller which decodes the instructions and generates the control signals, and the status register for indicating the internal state into the external of circuit. According to the programming condition, the designed circuit has the various selective output formats which are wavelet coefficient, quantization coefficient or index, and Huffman code in image compression mode, and Huffman decoding result, reconstructed quantization coefficient, and reconstructed wavelet coefficient in image reconstructed mode. The programming register has 16 stages and one instruction can be used for a horizontal(or vertical) filtering in a level. Since each register automatically operated in the right order, 4-level discrete wavelet transform can be executed by a programming. We synthesized the designed circuit with synthesis library of Hynix 0.35um CMOS fabrication using the synthesis tool, Synopsys and extracted the gate-level netlist. From the netlist, timing information was extracted using Vela tool. We executed the timing simulation with the extracted netlist and timing information using NC-Verilog tool. Also PNR and layout process was executed using Apollo tool. The Implemented hardware has about 50,000 gate sizes and stably operates in 80MHz clock frequency.

A Design of DLL-based Low-Power CDR for 2nd-Generation AiPi+ Application (2세대 AiPi+ 용 DLL 기반 저전력 클록-데이터 복원 회로의 설계)

  • Park, Joon-Sung;Park, Hyung-Gu;Kim, Seong-Geun;Pu, Young-Gun;Lee, Kang-Yoon
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.48 no.4
    • /
    • pp.39-50
    • /
    • 2011
  • In this paper, we presents a CDR circuit for $2^{nd}$-generation AiPi+, one of the Intra-panel Interface. The speed of the proposed clock and data recovery is increased to 1.25 Gbps compared with that of AiPi+. The DLL-based CDR architecture is used to generate the multi-phase clocks. We propose the simple scheme for frequency detector (FD) to mitigate the harmonic-locking and reduce the complexity. In addition, the duty cycle corrector that limits the maximum pulse width is used to avoid the problem of missing clock edges due to the mismatch between rising and falling time of VCDL's delay cells. The proposed CDR is implemented in 0.18 um technology with the supply voltage of 1.8 V. The active die area is $660\;{\mu}m\;{\times}\;250\;{\mu}m$, and supply voltage is 1.8 V. Peak-to-Peak jitter is less than 15 ps and the power consumption of the CDR except input buffer, equalizer, and de-serializer is 5.94 mW.

Optimized Hardware Design of Deblocking Filter for H.264/AVC (H.264/AVC를 위한 디블록킹 필터의 최적화된 하드웨어 설계)

  • Jung, Youn-Jin;Ryoo, Kwang-Ki
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.47 no.1
    • /
    • pp.20-27
    • /
    • 2010
  • This paper describes a design of 5-stage pipelined de-blocking filter with power reduction scheme and proposes a efficient memory architecture and filter order for high performance H.264/AVC Decoder. Generally the de-blocking filter removes block boundary artifacts and enhances image quality. Nevertheless filter has a few disadvantage that it requires a number of memory access and iterated operations because of filter operation for 4 time to one edge. So this paper proposes a optimized filter ordering and efficient hardware architecture for the reduction of memory access and total filter cycles. In proposed filter parallel processing is available because of structured 5-stage pipeline consisted of memory read, threshold decider, pre-calculation, filter operation and write back. Also it can reduce power consumption because it uses a clock gating scheme which disable unnecessary clock switching. Besides total number of filtering cycle is decreased by new filter order. The proposed filter is designed with Verilog-HDL and functionally verified with the whole H.264/AVC decoder using the Modelsim 6.2g simulator. Input vectors are QCIF images generated by JM9.4 standard encoder software. As a result of experiment, it shows that the filter can make about 20% total filter cycles reduction and it requires small transposition buffer size.

Performance of Energy Efficient Optical Ethernet Systems with a Dynamic Lane Control Scheme (동적 레인 제어방식을 적용한 에너지 절감형 광 이더넷 시스템의 성능분석)

  • Seo, Insoo;Yang, Choong-Reol;Yoon, Chongho
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.49 no.11
    • /
    • pp.24-35
    • /
    • 2012
  • In this paper, we propose a dynamic lane control scheme with a traffic predictor module and a rate controller for reconciling with commercial optical PHY modules in energy efficient optical Ethernet systems. The commercial high speed optical Ethernet system capable of 40/100Gbps employs 4 or 10 multiple optical transceivers over WDM or multiple optical links. Each of the transceivers is always turned on even if the link is idle. To save energy, we propose the dynamic lane control scheme. It allows that several links may be entirely turned off in a low traffic load and frames are handled on the remaining active links. To preserve the byte order even if the number of active links may be changed, we propose a rate controller to be sat on the reconciliation sublayer. The main role of the controller is to insert null byte streams into the xGMII of inactive lanes. For the PHY module, the null input streams corresponding to inactive lanes will be disregarded on inactive PMDs. It is very handy to implement the rate controller module with MAC in FPGA without any modification of commercial PHYs. It is very crucial to determine the number of active links based on the fluctuated traffic load, we provide a simple traffic predictor based on both the current transmission buffer size and the past one with different weighting factors for adapting to the traffic load fluctuation. Using the OMNET++ simulation framework, we provide several performance results in terms of the energy consumption.

A New Hardware Design for Generating Digital Holographic Video based on Natural Scene (실사기반 디지털 홀로그래픽 비디오의 실시간 생성을 위한 하드웨어의 설계)

  • Lee, Yoon-Hyuk;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.49 no.11
    • /
    • pp.86-94
    • /
    • 2012
  • In this paper we propose a hardware architecture of high-speed CGH (computer generated hologram) generation processor, which particularly reduces the number of memory access times to avoid the bottle-neck in the memory access operation. For this, we use three main schemes. The first is pixel-by-pixel calculation rather than light source-by-source calculation. The second is parallel calculation scheme extracted by modifying the previous recursive calculation scheme. The last one is a fully pipelined calculation scheme and exactly structured timing scheduling by adjusting the hardware. The proposed hardware is structured to calculate a row of a CGH in parallel and each hologram pixel in a row is calculated independently. It consists of input interface, initial parameter calculator, hologram pixel calculators, line buffer, and memory controller. The implemented hardware to calculate a row of a $1,920{\times}1,080$ CGH in parallel uses 168,960 LUTs, 153,944 registers, and 19,212 DSP blocks in an Altera FPGA environment. It can stably operate at 198MHz. Because of the three schemes, the time to access the external memory is reduced to about 1/20,000 of the previous ones at the same calculation speed.

A 2.4-GHz Low-Power Direct-Conversion Transmitter Based on Current-Mode Operation (전류 모드 동작에 기반한 2.4GHz 저전력 직접 변환 송신기)

  • Choi, Joon-Woo;Lee, Hyung-Su;Choi, Chi-Hoon;Park, Sung-Kyung;Nam, Il-Ku
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.48 no.12
    • /
    • pp.91-96
    • /
    • 2011
  • In this paper, a low-power direct-conversion transmitter based on current-mode operation, which satisfies the IEEE 802.15.4 standard, is proposed and implemented in a $0.13{\mu}m$ CMOS technology. The proposed transmitter consists of DACs, LPFs, variable gain I/Q up-conversion mixer, a divide-by-two circuit with LO buffer, and a drive amplifier. By combining DAC, LPF, and variable gain I/Q up-conversion mixer with a simple current mirror configuration, the transmitter's power consumption is reduced and its linearity is improved. The drive amplifier is a cascode amplifier with gain controls and the 2.4GHz I/Q differential LO signals are generated by a divide-by-two current-mode-logic (CML) circuit with an external 4.8GHz input signal. The implemented transmitter has 30dB of gain control range, 0dBm of maximum transmit output power, 33dBc of local oscillator leakage, and 40dBc of the transmit third harmonic component. The transmitter dissipates 10.2mW from a 1.2V supply and the die area of the transmitter is $1.76mm{\times}1.26mm$.

Design of a High Throughput Parallel Turbo Decoder (고 처리율 병렬 터보 복호기 설계)

  • Lee, Won-Ho;Park, Heemin;Rim, Chong S.
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.11
    • /
    • pp.50-57
    • /
    • 2013
  • This paper provides a design of high-throughput parallel turbo decoder that is able to decode several packets of various length simultaneously. For high-speed communications, designing of Turbo decoder as parallel structures reduces the long decoding time caused by iterative turbo decode way. Also, by employing the double buffer structure for input and output packets improves the decoder throughput by enabling continuous decoding. Because parallel turbo decoder is designed to be able to decode the packet of the longest length, there exist idle PE's(Processing Element) in the case of decoding packets of short length. The main idea of this paper is to increase the utilization of PE's in parallel Turbo decoder and to improve the decoder throughput by using the idle PE's immediately for the subsequent packets decoding. For this, the control is necessary to enable the concurrent decoding of several short packets and we propose the method of this control. Applying the proposed method, we implemented Turbo Decoder with 32 PE's that can decode packets of 6144 bits maximum. Compared to the conventional Turbo decoder, although the area was increased about 16%, the decoder throughput was improved 28 times for short packets.

FPGA Implementation of Real-time 2-D Wavelet Image Compressor (실시간 2차원 웨이블릿 영상압축기의 FPGA 구현)

  • 서영호;김왕현;김종현;김동욱
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.27 no.7A
    • /
    • pp.683-694
    • /
    • 2002
  • In this paper, a digital image compression codec using 2D DWT(Discrete Wavelet Transform) is designed using the FPGA technology for real time operation The implemented image compression codec using wavelet decomposition consists of a wavelet kernel part for wavelet filtering process, a quantizer/huffman coder for quantization and huffman encoding of wavelet coefficients, a memory controller for interface with external memories, a input interface to process image pixels from A/D converter, a output interface for reconstructing huffman codes, which has irregular bit size, into 32-bit data having regular size data, a memory-kernel buffer to arrage data for real time process, a PCI interface part, and some modules for setting timing between each modules. Since the memory mapping method which converts read process of column-direction into read process of the row-direction is used, the read process in the vertical-direction wavelet decomposition is very efficiently processed. Global operation of wavelet codec is synchronized with the field signal of A/D converter. The global hardware process pipeline operation as the unit of field and each field and each field operation is classified as decomposition levels of wavelet transform. The implemented hardware used FPGA hardware resource of 11119(45%) LAB and 28352(9%) ESB in FPGA device of APEX20KC EP20k600CB652-7 and mapped into one FPGA without additional external logic. Also it can process 33 frames(66 fields) per second, so real-time image compression is possible.

Linear Resource Sharing Method for Query Optimization of Sliding Window Aggregates in Multiple Continuous Queries (다중 연속질의에서 슬라이딩 윈도우 집계질의 최적화를 위한 선형 자원공유 기법)

  • Baek, Seong-Ha;You, Byeong-Seob;Cho, Sook-Kyoung;Bae, Hae-Young
    • Journal of KIISE:Databases
    • /
    • v.33 no.6
    • /
    • pp.563-577
    • /
    • 2006
  • A stream processor uses resource sharing method for efficient of limited resource in multiple continuous queries. The previous methods process aggregate queries to consist the level structure. So insert operation needs to reconstruct cost of the level structure. Also a search operation needs to search cost of aggregation information in each size of sliding windows. Therefore this paper uses linear structure for optimization of sliding window aggregations. The method comprises of making decision, generation and deletion of panes in sequence. The decision phase determines optimum pane size for holding accurate aggregate information. The generation phase stores aggregate information of data per pane from stream buffer. At the deletion phase, panes are deleted that are no longer used. The proposed method uses resources less than the method where level structures were used as data structures as it uses linear data format. The input cost of aggregate information is saved by calculating only pane size of data though numerous stream data is arrived, and the search cost of aggregate information is also saved by linear searching though those sliding window size is different each other. In experiment, the proposed method has low usage of memory and the speed of query processing is increased.

Study on the Consequence Effect Analysis & Process Hazard Review at Gas Release from Hydrogen Fluoride Storage Tank (최근 불산 저장탱크에서의 가스 누출시 공정위험 및 결과영향 분석)

  • Ko, JaeSun
    • Journal of the Society of Disaster Information
    • /
    • v.9 no.4
    • /
    • pp.449-461
    • /
    • 2013
  • As the hydrofluoric acid leak in Gumi-si, Gyeongsangbuk-do or hydrochloric acid leak in Ulsan, Gyeongsangnam-do demonstrated, chemical related accidents are mostly caused by large amounts of volatile toxic substances leaking due to the damages of storage tank or pipe lines of transporter. Safety assessment is the most important concern because such toxic material accidents cause human and material damages to the environment and atmosphere of the surrounding area. Therefore, in this study, a hydrofluoric acid leaked from a storage tank was selected as the study example to simulate the leaked substance diffusing into the atmosphere and result analysis was performed through the numerical Analysis and diffusion simulation of ALOHA(Areal Location of Hazardous Atmospheres). the results of a qualitative evaluation of HAZOP (Hazard Operability)was looked at to find that the flange leak, operation delay due to leakage of the valve and the hose, and toxic gas leak were danger factors. Possibility of fire from temperature, pressure and corrosion, nitrogen supply overpressure and toxic leak from internal corrosion of tank or pipe joints were also found to be high. ALOHA resulting effects were a little different depending on the input data of Dense Gas Model, however, the wind direction and speed, rather than atmospheric stability, played bigger role. Higher wind speed affected the diffusion of contaminant. In term of the diffusion concentration, both liquid and gas leaks resulted in almost the same $LC_{50}$ and ALOHA AEGL-3(Acute Exposure Guidline Level) values. Each scenarios showed almost identical results in ALOHA model. Therefore, a buffer distance of toxic gas can be determined by comparing the numerical analysis and the diffusion concentration to the IDLH(Immediately Dangerous to Life and Health). Such study will help perform the risk assessment of toxic leak more efficiently and be utilized in establishing community emergency response system properly.