Search | Korea Science

Design of AMBA AXI Slave Unit for Pipelined Arithmetic Unit (파이프라인 구조 연산회로를 위한 AMBA AXI Slave 설계)

Choi, Byeong-Yoon
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2011.05a / pp.712-713 / 2011
This paper proposes an AMBA AXI slave unit that can be used to verify a pipelined arithmetic unit, and introduces a 2-stage 16-bit pipelined multiplier as a design example. The proposed AXI slave unit consists of an input buffer block memory, control registers, the pipelined arithmetic unit, a control unit, an output buffer block memory, and an AXI slave interface unit. Operation proceeds in four main steps: burst-mode loading of input data into the input buffer memory, programming of the control registers, arithmetic operations on the block data in the input buffer memory, and burst-mode unloading of output data from the output buffer memory to the host processor. Because the proposed AXI slave unit has a general structure, it can be applied efficiently to AMBA AXI and AHB slave units that contain a pipelined arithmetic unit.
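The four operational steps in this abstract can be sketched as a small behavioural model. This is only an illustration under stated assumptions: the class name, method names, buffer depth, and the single "length" control register are hypothetical and do not come from the paper, and the AXI handshake signals themselves are not modelled.

```python
class AxiSlaveModel:
    """Hypothetical behavioural model of the buffer/compute flow (not the paper's RTL)."""

    def __init__(self, depth=16):
        self.depth = depth
        self.in_a = [0] * depth   # input buffer memory, operand A
        self.in_b = [0] * depth   # input buffer memory, operand B
        self.out = [0] * depth    # output buffer memory
        self.length = 0           # assumed control register: block length

    def burst_write(self, a_vals, b_vals):
        # Step 1: burst-mode loading of 16-bit operands into the input buffers.
        for i, (a, b) in enumerate(zip(a_vals, b_vals)):
            self.in_a[i] = a & 0xFFFF
            self.in_b[i] = b & 0xFFFF

    def program(self, length):
        # Step 2: program the control register(s).
        if not 0 <= length <= self.depth:
            raise ValueError("length exceeds buffer depth")
        self.length = length

    def run(self):
        # Step 3: stream the block through a 2-stage pipelined multiplier.
        # Stage 1 latches the operands; stage 2 produces the 32-bit product.
        stage1 = None
        results = []
        for i in range(self.length + 1):      # one extra cycle drains the pipeline
            if stage1 is not None:
                a, b = stage1
                results.append((a * b) & 0xFFFFFFFF)
            stage1 = (self.in_a[i], self.in_b[i]) if i < self.length else None
        self.out[:self.length] = results

    def burst_read(self):
        # Step 4: burst-mode unloading of results to the host.
        return self.out[:self.length]
```

A host-side sequence would call `burst_write`, `program`, `run`, and `burst_read` in that order, mirroring the four steps listed in the abstract.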

A design of floating-point arithmetic unit for superscalar microprocessor (수퍼스칼라 마이크로프로세서용 부동 소수점 연산회로의 설계)

최병윤;손승일;이문기
- The Journal of Korean Institute of Communications and Information Sciences / v.21 no.5 / pp.1345-1359 / 1996
This paper presents a floating-point arithmetic unit (FPAU) for a superscalar microprocessor that executes fifteen operations, such as addition, subtraction, data format conversion, and comparison, using two pipelined arithmetic paths and a new rounding and normalization scheme. With two pipelined arithmetic paths, each arithmetic operation can be assigned to the appropriate path, which makes high-speed operation possible. The proposed normalization and rounding scheme enables the FPAU to perform rounding in parallel with normalization and to reduce the timing delay of post-normalization. In addition, by predicting the leading-one position of the result from the input operands, the leading-one detection (LOD) operation that a conventional arithmetic unit needs to normalize results can be eliminated. Because the FPAU can execute fifteen single-precision or double-precision floating-point arithmetic operations through a three-stage pipelined datapath and supports IEEE Standard 754, its structure is suitable for integration into a superscalar microprocessor.

Design of Pipelined Floating-Point Arithmetic Unit for Mobile 3D Graphics Applications

Choi, Byeong-Yoon;Ha, Chang-Soo;Lee, Jong-Hyoung;Salcic, Zoran;Lee, Duck-Myung
- Journal of Korea Multimedia Society / v.11 no.6 / pp.816-827 / 2008
In this paper, a two-stage pipelined floating-point arithmetic unit (FP-AU) is designed. The FP-AU supports seventeen operations for use in a 3D graphics processor and has an area-efficient, low-latency architecture that uses a modified dual-path computation scheme, a new normalization circuit, and a modified compound adder based on a flagged prefix adder. Under logic synthesis with a $0.18{\mu}m$ CMOS standard cell library, the FP-AU has a delay of about 4 ns and consists of about 5,930 gates. Because it has an execution rate of 250 MFLOPS and supports saturated arithmetic, including a number of graphics-oriented operations, it is well suited to mobile 3D graphics accelerators.

Study of the Superconductive Pipelined Multi-Bit ALU (초전도 Pipelined Multi-Bit ALU에 대한 연구)

Kim, Jin-Young;Ko, Ji-Hoon;Kang, Joon-Hee
- Progress in Superconductivity / v.7 no.2 / pp.109-113 / 2006
The Arithmetic Logic Unit (ALU) is a core element of a computer processor that performs arithmetic and logic operations on the operands in computer instruction words. We have developed and tested an RSFQ multi-bit ALU constructed with half adder unit cells, which we used to reduce the complexity of the ALU. Each unit cell consisted of one half adder and three DC switches. Timing in complex circuits has been a very important issue. We calculated the delay time of all components in the circuit using the Josephson circuit simulation tools XIC, WRspice™, and Julia. To make the circuit work faster, we used a forward clocking scheme, which required careful design of the timing between clock and data pulses in the ALU. The designed ALU had a pipelined structure and was limited to the OR, AND, XOR, and ADD operations. The fabricated 1-bit, 2-bit, and 4-bit ALU circuits were tested at clock frequencies of a few kilohertz as well as a few tens of gigahertz. For high-speed tests, we used an eye-diagram technique. Our 4-bit ALU operated correctly at clock frequencies up to 5 GHz.

Hardware Design of High Performance Arithmetic Unit with Processing of Complex Data for Multimedia Processor (복소수 데이터 처리가 가능한 멀티미디어 프로세서용 고성능 연산회로의 하드웨어 설계)

Choi, Byeong-yoon
- Journal of the Korea Institute of Information and Communication Engineering / v.20 no.1 / pp.123-130 / 2016
In this paper, a high-performance arithmetic unit that can efficiently accelerate a number of algorithms for multimedia applications is designed. The 3-stage pipelined arithmetic unit can execute 38 operations on complex and fixed-point data by using an efficient configuration of four 16-bit by 16-bit multipliers, a new sign-extension method for carry-save data, and a correction-constant scheme that eliminates sign extension when compressing multiple partial multiplication results. The arithmetic unit has an operating frequency of about 300 MHz and about 37,000 gates in 45 nm CMOS technology, and its estimated performance is 300 MCOPS (Million Complex Operations Per Second). Because the arithmetic unit has a high processing rate and supports a number of operations dedicated to various applications, it can be applied efficiently to multimedia processors.
https://doi.org/10.6109/jkiice.2016.20.1.123
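To illustrate why four 16-bit by 16-bit multipliers are enough for one complex multiplication, the textbook identity (a + bi)(c + di) = (ac - bd) + (ad + bc)i can be sketched as follows. This is an assumption-laden sketch of the arithmetic only: the function names are hypothetical, and the carry-save sign extension and correction-constant scheme mentioned in the abstract are not modelled.

```python
def to_int16(v):
    """Interpret the low 16 bits of v as a signed two's-complement value."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def complex_mul(x, y):
    """Multiply two complex numbers given as (real, imag) pairs of 16-bit integers."""
    a, b = to_int16(x[0]), to_int16(x[1])
    c, d = to_int16(y[0]), to_int16(y[1])
    # Four independent 16x16 products, one per multiplier in this sketch.
    ac, bd, ad, bc = a * c, b * d, a * d, b * c
    # Two final add/subtract steps combine the partial products.
    return ac - bd, ad + bc
```

If one such complex product completes per clock cycle, a clock of about 300 MHz corresponds to the 300 MCOPS figure quoted in the abstract.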

Hardware Design of Pipelined Special Function Arithmetic Unit for Mobile Graphics Application (모바일 그래픽 응용을 위한 파이프라인 구조 특수 목적 연산회로의 하드웨어 설계)

Choi, Byeong-Yoon
- Journal of the Korea Institute of Information and Communication Engineering / v.17 no.8 / pp.1891-1898 / 2013
To efficiently execute 3D graphics APIs such as OpenGL and Direct3D, a special-purpose arithmetic unit (SFU) that supports floating-point sine, cosine, reciprocal, inverse square root, base-two exponential, and logarithmic operations is designed. The SFU uses a second-order minimax approximation method together with a lookup table method to achieve both an error below 2 ulp (units in the last place) and high-speed operation. The designed circuit has a delay of about 2.3 ns with a 65 nm CMOS standard cell library and consists of about 23,300 gates. With a maximum performance of 400 MFLOPS and high accuracy, it can be applied efficiently to mobile 3D graphics applications.
https://doi.org/10.6109/jkiice.2013.17.8.1891
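The combination of a lookup table with a second-order polynomial can be sketched for the reciprocal on [1, 2). Note the substitution: true minimax coefficients require a Remez-style fit, so this sketch instead interpolates each segment at three Chebyshev nodes, which is close to minimax but not identical. The function names, segment count, and double-precision arithmetic are assumptions and do not reflect the paper's table size or fixed-point datapath.

```python
import math


def build_table(f, segments=128, lo=1.0, hi=2.0):
    """Per-segment quadratic coefficients in Newton form (Chebyshev-node interpolation)."""
    width = (hi - lo) / segments
    table = []
    for s in range(segments):
        mid = lo + s * width + width / 2
        half = width / 2
        # Three Chebyshev nodes mapped onto this segment.
        xs = [mid + half * math.cos((2 * k + 1) * math.pi / 6) for k in range(3)]
        ys = [f(x) for x in xs]
        # Newton divided differences for the quadratic through the three nodes.
        d01 = (ys[1] - ys[0]) / (xs[1] - xs[0])
        d12 = (ys[2] - ys[1]) / (xs[2] - xs[1])
        d012 = (d12 - d01) / (xs[2] - xs[0])
        table.append((xs[0], xs[1], ys[0], d01, d012))
    return table, lo, width


def approx(table_info, x):
    """Look up the segment from the leading bits of x, then evaluate the quadratic."""
    table, lo, width = table_info
    idx = min(int((x - lo) / width), len(table) - 1)
    x0, x1, y0, d01, d012 = table[idx]
    return y0 + (x - x0) * (d01 + (x - x1) * d012)
```

For example, `approx(build_table(lambda x: 1.0 / x), 1.5)` evaluates the reciprocal approximation at 1.5. In hardware the segment index would come from the upper mantissa bits and the polynomial from the lower bits; that mapping is implied here by the integer division into segments.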

A New Pipelined Divider with a Small Lookup Table (작은 룩업테이블을 가지는 새로운 파이프라인 나눗셈기)

Jeong, Woong;Park, Woo-Chan;Kwak, Sung-Ho;Yang, Hoon-Mo;Jeong, Cheol-Ho;Han, Tack-Don;Lee, Moon-Key
- Journal of the Institute of Electronics Engineers of Korea SD / v.40 no.9 / pp.724-733 / 2003
Dividers have generally been designed to use iteration, but research on pipelined dividers is now underway. A known difficulty of existing pipelined division units is that they require a large lookup table. This paper proposes a cost-effective pipelined divider that needs a smaller lookup table than other pipelined dividers. The latency of the proposed divider is 3 cycles, and its area is 30% smaller than that of the divider by P. Hung.

Development of Superconductive Arithmetic and Logic Devices (초전도 논리연산자의 개발)

Kang J. H
- Progress in Superconductivity / v.6 no.1 / pp.7-12 / 2004
Because Josephson junctions switch very fast, superconductive digital circuits have been a very good candidate for future electronic devices. High-speed, low-power microprocessors can be developed with Josephson junctions. As part of an effort to develop a superconductive microprocessor, we have designed an RSFQ 4-bit ALU (Arithmetic Logic Unit) with a pipelined structure. To make the circuit work faster, we used a forward clocking scheme, which required careful design of the timing between clock and data pulses in the ALU. The RSFQ 1-bit block of the ALU used in this work consisted of three DC-current-driven SFQ switches and a half adder. We successfully tested the half adder cell at clock frequencies up to 20 GHz. The switches commutated the output ports of the half adder to produce the AND, OR, XOR, or ADD function. For a high-speed test, we attached switches at the input ports so that low-frequency pattern generators could control the high-speed input data. The output in this measurement was an eye diagram. Using this setup, the 1-bit block of the ALU was successfully tested up to 40 GHz. An RSFQ 4-bit ALU was fabricated and tested, and the circuit worked at 5 GHz. The circuit size of the 4-bit ALU was 3 mm $\times$ 1.5 mm, fitting in a 5 mm $\times$ 5 mm chip.
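The idea of deriving AND, OR, XOR, and ADD from a half adder can be illustrated at the logic level. This is a sketch under assumptions: the function names are hypothetical, selecting the OR function by merging the sum and carry outputs is one possible reading of "commutating output ports", and the repeated carry-propagation loop is a generic way to add with half adders only, not a description of the fabricated RSFQ pipeline or its timing.

```python
def half_adder(a, b):
    """Return (sum, carry) for two bits: sum is XOR, carry is AND."""
    return a ^ b, a & b


def alu_bit(op, a, b):
    """Select a logic function from the half adder outputs (1-bit block)."""
    s, c = half_adder(a, b)
    if op == "XOR":
        return s          # sum output only
    if op == "AND":
        return c          # carry output only
    if op == "OR":
        return s | c      # sum and carry merged: XOR or AND equals OR
    raise ValueError(f"unsupported op: {op}")


def add_with_half_adders(a, b, width=4):
    """Multi-bit ADD using only bitwise half adders, one carry step per iteration."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    for _ in range(width):
        s, c = a ^ b, a & b           # a row of half adders
        a, b = s, (c << 1) & mask     # carries move one bit position left
    return a                          # result modulo 2**width
```

After `width` iterations every carry has either been absorbed or shifted out, so the returned value is the sum modulo 2 to the power `width`.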

Timing analysis of RSFQ ALU circuit for the development of superconductive microprocessor (초전도 마이크로 프로세서개발을 위한 RSFQ ALU 회로의 타이밍 분석)

Kim J. Y;Baek S. H.;Kim S. H.;Kang J. H.
- Progress in Superconductivity and Cryogenics / v.7 no.1 / pp.9-12 / 2005
We have constructed an RSFQ 4-bit Arithmetic Logic Unit (ALU) with a pipelined structure. An ALU is a core element of a computer processor that performs arithmetic and logic operations on the operands in computer instruction words. We simulated the circuit using the Josephson circuit simulation tools XIC, WRspice™, and Julia. To make the circuit work faster, we used a forward clocking scheme, which required careful design of the timing between clock and data pulses in the ALU. The RSFQ 1-bit block used to construct the 4-bit ALU consisted of three DC-current-driven SFQ switches and a half adder. By commutating the output ports of the half adder, we could produce the AND, OR, XOR, or ADD function. The circuit size of the fabricated 4-bit ALU was 3 mm x 1.5 mm, fitting in a 5 mm x 5 mm chip. The fabricated 4-bit ALU operated correctly at a 5 GHz clock frequency. The chip was tested at liquid-helium temperature.

A Study on the Interframe Image Coding Using Motion Compensated and Classified Vector Quantizer (Ⅱ : Hardware Implementation) (이동 보상과 분류 벡터 양자화기를 이용한 영상 부호화에 관한 연구 (Ⅱ: 하드웨어 실현))

Jeon, Joong-Nam;Shin, Tae-Min;Choi, Sung-Nam;Park, Kyu-Tae
- Journal of the Korean Institute of Telematics and Electronics / v.27 no.3 / pp.21-30 / 1990
This paper describes a hardware implementation of an interframe monochrome video CODEC using the MC-CVQ (Motion Compensated and Classified Vector Quantization) algorithm. The specifications of this CODEC are (1) an image resolution of $128{\times}128$ pixels and (2) a transmission rate of about 10 frames/sec over a 64 kbps channel. To meet these conditions, the CODEC is implemented as a multiprocessor system composed of an MC unit, a CVQ unit, and a decoder unit, which are controlled by a microprogramming technique. A 3-stage pipelined ALU (Arithmetic and Logic Unit) is adopted to calculate the minimum error distance in the MC unit and the CVQ unit. The realized system achieves transmission rates of 6-15 frames/sec, depending on the relative motion in the video signal.
