• Title/Summary/Keyword: 부동소수점 나눗셈

Search Result 15, Processing Time 0.02 seconds

A Variable Latency Newton-Raphson's Floating Point Number Reciprocal Computation (가변 시간 뉴톤-랍손 부동소수점 역수 계산기)

  • Kim Sung-Gi;Cho Gyeong-Yeon
    • The KIPS Transactions:PartA
    • /
    • v.12A no.2 s.92
    • /
    • pp.95-102
    • /
    • 2005
  • The Newton-Raphson iterative algorithm for finding a floating point reciprocal which is widely used for a floating point division, calculates the reciprocal by performing a fixed number of multiplications. In this paper, a variable latency Newton-Raphson's reciprocal algorithm is proposed that performs multiplications a variable number of times until the error becomes smaller than a given value. To find the reciprocal of a floating point number F, the algorithm repeats the following operations: '$'X_{i+1}=X=X_i*(2-e_r-F*X_i),\;i\in\{0,\;1,\;2,...n-1\}'$ with the initial value $'X_0=\frac{1}{F}{\pm}e_0'$. The bits to the right of p fractional bits in intermediate multiplication results are truncated, and this truncation error is less than $'e_r=2^{-p}'$. The value of p is 27 for the single precision floating point, and 57 for the double precision floating point. Let $'X_i=\frac{1}{F}+e_i{'}$, these is $'X_{i+1}=\frac{1}{F}-e_{i+1},\;where\;{'}e_{i+1}, is less than the smallest number which is representable by floating point number. So, $X_{i+1}$ is approximate to $'\frac{1}{F}{'}$. Since the number of multiplications performed by the proposed algorithm is dependent on the input values, the average number of multiplications per an operation is derived from many reciprocal tables $(X_0=\frac{1}{F}{\pm}e_0)$ with varying sizes. The superiority of this algorithm is proved by comparing this average number with the fixed number of multiplications of the conventional algorithm. Since the proposed algorithm only performs the multiplications until the error gets smaller than a given value, it can be used to improve the performance of a reciprocal unit. Also, it can be used to construct optimized approximate reciprocal tables. The results of this paper can be applied to many areas that utilize floating point numbers, such as digital signal processing, computer graphics, multimedia scientific computing, etc.

A Variable Latency K'th Order Newton-Raphson's Floating Point Number Divider (가변 시간 K차 뉴톤-랍손 부동소수점 나눗셈)

  • Cho, Gyeong-Yeon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.9 no.5
    • /
    • pp.285-292
    • /
    • 2014
  • The commonly used Newton-Raphson's floating-point number divider algorithm performs two multiplications in one iteration. In this paper, a tentative K'th Newton-Raphson's floating-point number divider algorithm which performs K times multiplications in one iteration is proposed. Since the number of multiplications performed by the proposed algorithm is dependent on the input values, the average number of multiplications per an operation in single precision and double precision divider is derived from many reciprocal tables with varying sizes. In addition, an error correction algorithm, which consists of one multiplication and a decision, to get exact result in divider is proposed. Since the proposed algorithm only performs the multiplications until the error gets smaller than a given value, it can be used to improve the performance of a floating point number divider unit. Also, it can be used to construct optimized approximate reciprocal tables.

IEEE-754 Floating-Point Divider for Embedded Processors (내장형 프로세서를 위한 IEEE-754 고성능 부동소수점 나눗셈기의 설계)

  • 정재원;홍인표;정우경;이용석
    • Proceedings of the IEEK Conference
    • /
    • 2000.11b
    • /
    • pp.353-356
    • /
    • 2000
  • In this paper, a high-performance and small-area floating-point divider, which is suitable for embedded processors and supports all rounding modes defined by IEEE 754 standard, is designed using the series expansion algorithm. This divider shares and fully utilizes the two MAC units for quadratical convergence to the correct quotient. The area increase of two MAC units due to the division is minimized in this design, so that it can be suitable for embedded processors. The tested HDL codes are synthesized and optimized with 0.35$\mu\textrm{m}$ CMOS standard celt libraries. The results show that the latency of the synthesized divider is 17.43 ㎱ in worst condition. But, the divider calculates the correct rounded quotient through only 6 cycles.

  • PDF

Development of C Compiler for 16-bit CPU (16-bit CPU용 C 컴파일러 개발)

  • Jeong, Sam-Jin
    • Proceedings of the KAIS Fall Conference
    • /
    • 2009.05a
    • /
    • pp.439-442
    • /
    • 2009
  • 본 연구는 16 비트 CPU를 위한 새로운 C 컴파일러를 개발하고자 한다. 새로운 ASIC 프로세서가 특정 용도로 설계되었을 때 그 CPU를 위한 새로운 컴파일러의 개발이 필요하다. 공개 소프트웨어인 GNU C 컴파일러를 사용하여 기계 의존 원시 파일들을 수정함으로서 새로운 컴파일러를 개발할 수 있다. 개발된 컴파일러는 단지 기계어에 의해 처리될 수 있는 기능들만 지원할 수 있기 때문에 곱 셈, 나눗셈, 부동소수점 처리등과 같은 기능들을 지원하기 위해서는 더 많은 연구가 필요하다. 완전한 컴파일러가 개발된 후에는 새로운 CPU에서 실행될 수 있는 응용 프로그램의 개발이 필요하다. 본 연구에 의해서 앞으로 개발될 여러 가지 다른 용도의 CPU를 위한 컴파일러들이 쉽게 개발될 수 있을 것이다.

  • PDF

Design of a Floating-Point Divider for IEEE 754-1985 Single-Precision Operations (IEEE 754-1985 단정도 부동 소수점 연산용 나눗셈기 설계)

  • Park, Ann-Soo;Chung, Tea-Sang
    • Proceedings of the KIEE Conference
    • /
    • 2001.11c
    • /
    • pp.165-168
    • /
    • 2001
  • This paper presents a design of a divide unit supporting IEEE-754 floating point standard single-precision with 32-bit word length. Its functions have been verified with ALTERA MAX PLUS II tool. For a high-speed division operation, the radix-4 non-restoring algorithm has been applied and CLA(carry-look -ahead) adders has been used in order to improve the area efficiency and the speed of performance for the fraction division part. The prevention of the speed decrement of operations due to clocking has been achieved by taking advantage of combinational logic. A quotient select block which is very complicated and significant in the high-radix part was designed by using P-D plot in order to select the fast and accurate quotient. Also, we designed all division steps with Gate-level which visualize the operations and delay time.

  • PDF