Search | Korea Science

A Variable Latency K'th Order Newton-Raphson's Floating Point Number Divider (가변 시간 K차 뉴톤-랍손 부동소수점 나눗셈)

Cho, Gyeong-Yeon
- IEMEK Journal of Embedded Systems and Applications, v.9 no.5, pp.285-292, 2014
The commonly used Newton-Raphson floating-point division algorithm performs two multiplications per iteration. This paper proposes a tentative K'th-order Newton-Raphson floating-point division algorithm that performs K multiplications per iteration. Because the number of multiplications performed by the proposed algorithm depends on the input values, the average number of multiplications per operation for single-precision and double-precision dividers is derived from many reciprocal tables of varying sizes. In addition, an error correction algorithm, consisting of one multiplication and a decision, is proposed to obtain the exact result. Because the proposed algorithm performs multiplications only until the error becomes smaller than a given value, it can be used to improve the performance of a floating-point divider unit. It can also be used to construct optimized approximate reciprocal tables.
https://doi.org/10.14372/IEMEK.2014.9.5.285
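As a rough illustration of the conventional scheme this abstract starts from, the following minimal sketch runs the two-multiplication Newton-Raphson reciprocal iteration and stops as soon as the residual falls below a tolerance. It is not the paper's K'th-order method. The function name, the `x0` seed (standing in for a reciprocal-table lookup), the `tol` default, and the `max_iter` guard are all assumptions made for this example.

```python
def newton_raphson_divide(n, f, x0, tol=2.0 ** -24, max_iter=20):
    """Approximate n / f with the classic Newton-Raphson reciprocal iteration.

    x0 is an assumed initial estimate of 1/f (hypothetical stand-in for a
    reciprocal table); it must satisfy |1 - f*x0| < 1 for convergence.
    Returns (quotient, number of multiplications performed).
    """
    x = x0
    mults = 0
    for _ in range(max_iter):
        e = 1.0 - f * x          # first multiplication: residual of the estimate
        mults += 1
        if abs(e) < tol:         # variable latency: stop once the error is small
            break
        x = x + x * e            # second multiplication: x * (2 - f*x)
        mults += 1
    return n * x, mults + 1      # final multiplication forms the quotient


if __name__ == "__main__":
    q, m = newton_raphson_divide(1.5, 1.25, x0=0.75)
    print(q, m)
```

A better seed shortens the loop, which is why the multiplication count depends on both the input and the quality of the initial estimate.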

Algebraic Accuracy Verification for Division-by-Convergence based 24-bit Floating-point Divider Complying with OpenGL (Division-by-Convergence 방식을 사용하는 24-비트 부동소수점 제산기에 대한 OpenGL 정확도의 대수적 검증)

Yoo, Sehoon;Lee, Jungwoo;Kim, Kichul
- Journal of IKEEE, v.17 no.3, pp.346-351, 2013
Low cost and low power are important requirements in mobile systems. Thus, when a floating-point arithmetic unit is needed, a 24-bit floating-point format can be more useful than a 32-bit floating-point format. However, a 24-bit floating-point arithmetic unit can be risky because it usually has lower accuracy than a 32-bit unit. 3D graphics processors perform consecutive floating-point operations, so verifying the accuracy of those operations is important. Among 3D graphics arithmetic operations, floating-point division is one of the hardest for which to satisfy the accuracy of $10^{-5}$ required by OpenGL ES 3.0. No 24-bit floating-point divider whose accuracy has been algebraically verified has been reported. In this paper, a 24-bit floating-point divider is analyzed, and it is algebraically verified that its accuracy satisfies the OpenGL requirement.
https://doi.org/10.7471/ikeee.2013.17.3.346

Error Corrected K'th order Goldschmidt's Floating Point Number Division (오차 교정 K차 골드스미트 부동소수점 나눗셈)

Cho, Gyeong-Yeon
- Journal of the Korea Institute of Information and Communication Engineering, v.19 no.10, pp.2341-2349, 2015
The commonly used Goldschmidt floating-point division algorithm performs two multiplications per iteration. This paper proposes a tentative error-corrected K'th-order Goldschmidt floating-point division algorithm that performs K multiplications per iteration. Because the number of multiplications performed by the proposed algorithm depends on the input values, the average number of multiplications per operation for single-precision and double-precision dividers is derived from many reciprocal tables of varying sizes. In addition, an error correction algorithm, consisting of one multiplication and a decision, is proposed to obtain the exact result. Because the proposed algorithm performs multiplications only until the error becomes smaller than a given value, it can be used to improve the performance of a divider unit. It can also be used to construct optimized approximate reciprocal tables.
https://doi.org/10.6109/jkiice.2015.19.10.2341

IEEE-754 Floating-Point Divider for Embedded Processors (내장형 프로세서를 위한 IEEE-754 고성능 부동소수점 나눗셈기의 설계)

Jeong, Jae-Won;Hong, In-Pyo;Jeong, Woo-Kyong;Lee, Yong-Surk
- Proceedings of the IEEK Conference, 2000.11b, pp.353-356, 2000
In this paper, a high-performance, small-area floating-point divider that is suitable for embedded processors and supports all rounding modes defined by the IEEE 754 standard is designed using the series expansion algorithm. The divider shares and fully utilizes the two MAC units for quadratic convergence to the correct quotient. The area increase of the two MAC units due to division is minimized, so the design is suitable for embedded processors. The tested HDL code is synthesized and optimized with 0.35 $\mu\textrm{m}$ CMOS standard cell libraries. The results show that the latency of the synthesized divider is 17.43 ns under worst-case conditions. However, the divider calculates the correctly rounded quotient in only 6 cycles.

A Design of Radix-2 SRT Floating-Point Divider Unit using Redundant Binary Number System (Redundant Binary 수치계를 이용한 radix-2 SRT부동 소수점 제산기 유닛 설계)

이종남;신경욱
- Journal of the Korea Institute of Information and Communication Engineering, v.5 no.3, pp.517-524, 2001
This paper describes the design of a radix-2 SRT divider unit that supports the IEEE-754 floating-point standard, using a redundant binary number system (RBNS). With the RBNS, the partial quotient decision logic operates about 20% faster and can be implemented with simpler hardware than conventional methods based on two's complement arithmetic. Using a new redundant binary adder proposed in this paper, the mantissa divider is implemented efficiently, resulting in about 20% smaller area than other works. The divider unit supports the double-precision format, five exceptions, and four rounding modes. It was verified with Verilog HDL and Verilog-XL.

IEEE-754 Floating-Point Divider for Embedded Processors (내장형 프로세서를 위한 IEEE-754 고성능 부동소수점 나눗셈기의 설계)

Jeong, Jae-Won;Hong, In-Pyo;Jeong, Woo-Kyong;Lee, Yong-Surk
- Journal of the Institute of Electronics Engineers of Korea SD, v.39 no.7, pp.66-73, 2002
As floating-point operations become widely used in applications such as computer graphics and high-definition DSP, the need for fast division has increased. However, conventional floating-point dividers occupy a large hardware area and create bottlenecks for floating-point operations as a whole. In this paper, a high-performance, small-area floating-point divider suitable for embedded processors is designed using the series expansion algorithm. The algorithm is selected to utilize two MAC (Multiply-ACcumulate) units for quadratic convergence to the correct quotient. The two MAC units provided for SIMD-DSP features are shared, so the additional area needed for division alone is very small. The proposed divider supports all rounding modes defined by the IEEE 754 standard, and error estimations are performed to ensure appropriate precision.

A Study on High Performances Floating Point Unit (고성능 부동 소수점 연산기에 대한 연구)

Park, Woo-Chan;Han, Tack-Don
- The Transactions of the Korea Information Processing Society, v.4 no.11, pp.2861-2873, 1997
An FPU (floating-point unit) is a principal component of high-performance computers and is now placed on the same chip as the main processing unit. As the processing speed of the FPU increases, the rounding stage, which is one of the processing steps of floating-point operations, has a considerable effect on overall floating-point performance. In this paper, efficient rounding mechanisms are presented by studying and analyzing the processing flows of the conventional floating-point adder/subtractor, multiplier, and divider, which are the main components of the FPU. The proposed mechanisms require neither additional execution time nor a high-speed adder for the rounding operation. Thus, performance improvement and a cost-effective design can be achieved by this approach.

Design and MPW Implementation of 3D Graphics Floating Point IPs (3차원 그래픽용 부동 소수점 연산기 IP 설계 및 MPW 구현)

Lee, Jung-Woo;Kim, Ki-Chul
- Proceedings of the IEEK Conference, 2006.06a, pp.987-988, 2006
This paper presents the design and MPW implementation of 3D graphics floating-point IPs. The designed IPs include an adder, subtractor, multiplier, divider, and reciprocal unit. The IPs have pipelined structures and meet the accuracy required by OpenGL ES. The operating frequency of the IPs is 100 MHz. The IPs can be used efficiently in 3D graphics accelerators.

A New Pipelined Divider with a Small Lookup Table (작은 룩업테이블을 가지는 새로운 파이프라인 나눗셈기)

Jeong, Woong;Park, Woo-Chan;Kwak, Sung-Ho;Yang, Hoon-Mo;Jeong, Cheol-Ho;Han, Tack-Don;Lee, Moon-Key
- Journal of the Institute of Electronics Engineers of Korea SD, v.40 no.9, pp.724-733, 2003
Dividers have generally been designed to use iteration, but research on pipelined dividers is now underway. A drawback of known pipelined division units is that they require a large lookup table. This paper proposes a cost-effective pipelined divider that needs a smaller lookup table than other pipelined dividers. The latency of the proposed divider is 3 cycles. Its area is 30% smaller than that of the divider by P. Hung.

A Variable Latency Goldschmidt's Floating Point Number Divider (가변 시간 골드스미트 부동소수점 나눗셈기)

Kim Sung-Gi;Song Hong-Bok;Cho Gyeong-Yeon
- Journal of the Korea Institute of Information and Communication Engineering, v.9 no.2, pp.380-389, 2005
The Goldschmidt iterative algorithm for floating-point division computes the quotient by performing a fixed number of multiplications. This paper proposes a variable-latency Goldschmidt division algorithm that performs multiplications a variable number of times, until the error becomes smaller than a given value. To calculate the floating-point quotient $\frac{N}{F}$, multiply the numerator and the denominator by $T=\frac{1}{F}+e_t$, which gives $\frac{TN}{TF}=\frac{N_0}{F_0}$. The algorithm then repeats the following operations: $R_i=(2-e_r-F_i),\;N_{i+1}=N_i{\ast}R_i,\;F_{i+1}=F_i{\ast}R_i$, $i\in\{0,1,\ldots,n-1\}$. The bits to the right of the p fractional bits in intermediate multiplication results are truncated, and this truncation error is less than $e_r=2^{-p}$. The value of p is 29 for single-precision floating point and 59 for double-precision floating point. Letting $F_i=1+e_i$ gives $F_{i+1}=1-e_{i+1}$. If $|F_i-1|<2^{\frac{-p+3}{2}}$ holds, then $e_{i+1}<16e_r$, which is less than the smallest number representable in the floating-point format, so $N_{i+1}$ approximates $\frac{N}{F}$. Because the number of multiplications performed by the proposed algorithm depends on the input values, the average number of multiplications per operation is derived from many reciprocal tables ($T=\frac{1}{F}+e_t$) of varying sizes. The superiority of this algorithm is shown by comparing this average with the fixed number of multiplications of the conventional algorithm. Because the proposed algorithm performs multiplications only until the error becomes smaller than a given value, it can be used to improve the performance of a divider. It can also be used to construct optimized approximate reciprocal tables. The results of this paper can be applied to many areas that use floating-point numbers, such as digital signal processing, computer graphics, multimedia, and scientific computing.
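The iteration described in this abstract can be sketched in fixed-point integer arithmetic as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the `table_bits` parameter, and the midpoint-based reciprocal table are hypothetical choices, and the stopping test is written in squared form so that it works for any p.

```python
def goldschmidt_divide(n, f, p=29, table_bits=8):
    """Variable-latency Goldschmidt division of mantissas n, f in [1, 2).

    Values are held as integers scaled by 2**p, and products are truncated
    to p fractional bits. The reciprocal table (midpoint of each interval)
    is an assumption made for this sketch.
    Returns (approximate quotient, number of multiplications).
    """
    assert 1.0 <= n < 2.0 and 1.0 <= f < 2.0
    one = 1 << p
    n_fix = int(n * one)
    f_fix = int(f * one)

    # Assumed table lookup: T ~ 1/F from the top table_bits fraction bits of F.
    idx = (f_fix - one) >> (p - table_bits)
    t = (one << (table_bits + 1)) // ((1 << (table_bits + 1)) + 2 * idx + 1)

    n_i = (t * n_fix) >> p       # N_0 = T * N (truncated)
    f_i = (t * f_fix) >> p       # F_0 = T * F (truncated)
    mults = 2

    while True:
        r_i = 2 * one - 1 - f_i  # R_i = 2 - e_r - F_i, with e_r = 2**-p
        d = f_i - one            # F_i - 1 in fixed point
        n_i = (n_i * r_i) >> p   # N_{i+1} = N_i * R_i
        mults += 1
        # |F_i - 1| < 2**((-p+3)/2), compared in squared form
        if d * d < (1 << (p + 3)):
            break
        f_i = (f_i * r_i) >> p   # F_{i+1} = F_i * R_i
        mults += 1
    return n_i / one, mults


if __name__ == "__main__":
    print(goldschmidt_divide(1.5, 1.25))        # single-precision setting
    print(goldschmidt_divide(1.5, 1.25, p=59))  # double-precision setting
```

Enlarging `table_bits` makes $F_0$ closer to 1, so the stopping condition is met after fewer multiplications; this is the trade-off between table size and average multiplication count that the abstract refers to.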

