• Title/Summary/Keyword: Floating point number

Search Result 83, Processing Time 0.024 seconds

Goldschmidt's Double Precision Floating Point Reciprocal Computation using 32 bit multiplier (32 비트 곱셈기를 사용한 골드스미트 배정도실수 역수 계산기)

  • Cho, Gyeong-Yeon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.15 no.5
    • /
    • pp.3093-3099
    • /
    • 2014
  • Modern graphic processors, multimedia processors and audio processors mostly use floating-point number. Meanwhile, high-level language such as C and Java uses both single-precision and double precision floating-point number. In this paper, an algorithm which computes the reciprocal of double precision floating-point number using a 32 bit multiplier is proposed. It divides the mantissa of double precision floating-point number to upper part and lower part, and calculates the reciprocal of the upper part with Goldschmidt's algorithm, and computes the reciprocal of double precision floating-point number with calculated upper part reciprocal as the initial value is proposed. Since the number of multiplications performed by the proposed algorithm is dependent on the mantissa of floating-point number, the average number of multiplications per an operation is derived from some reciprocal tables with varying sizes.

Newton-Raphson's Double Precision Reciprocal Using 32 bit multiplier (32 비트 곱셈기를 사용한 뉴톤-랍손 배정도실수 역수 계산기)

  • Cho, Gyeong-Yeon
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.18 no.6
    • /
    • pp.31-37
    • /
    • 2013
  • Modern graphic processors, multimedia processors and audio processors mostly use floating-point number. High-level language such as C and Java use both single precision and double precision floating-point number. In this paper, an algorithm which computes the reciprocal of double precision floating-point number using a 32 bit multiplier is proposed. It divides the mantissa of double precision floating-point number to upper part and lower part, and calculates the reciprocal of the upper part with Newton-Raphson algorithm. And it computes the reciprocal of double precision floating-point number with calculated upper part reciprocal as the initial value. Since the number of multiplications performed by the proposed algorithm is dependent on the mantissa of floating-point number, the average number of multiplications per an operation is derived from some reciprocal tables with varying sizes.

A Variable Latency K'th Order Newton-Raphson's Floating Point Number Divider (가변 시간 K차 뉴톤-랍손 부동소수점 나눗셈)

  • Cho, Gyeong-Yeon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.9 no.5
    • /
    • pp.285-292
    • /
    • 2014
  • The commonly used Newton-Raphson's floating-point number divider algorithm performs two multiplications in one iteration. In this paper, a tentative K'th Newton-Raphson's floating-point number divider algorithm which performs K times multiplications in one iteration is proposed. Since the number of multiplications performed by the proposed algorithm is dependent on the input values, the average number of multiplications per an operation in single precision and double precision divider is derived from many reciprocal tables with varying sizes. In addition, an error correction algorithm, which consists of one multiplication and a decision, to get exact result in divider is proposed. Since the proposed algorithm only performs the multiplications until the error gets smaller than a given value, it can be used to improve the performance of a floating point number divider unit. Also, it can be used to construct optimized approximate reciprocal tables.

Analysis of Some Strange Behaviors of Floating Point Arithmetic using MATLAB Programs (MATLAB을 이용한 부동소수점 연산의 특이사항 분석)

  • Chung, Tae-Sang
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.56 no.2
    • /
    • pp.428-431
    • /
    • 2007
  • A floating-point number system is used to represent a wide range of real numbers using finite number of bits. The standard the IEEE adopted in 1987 divides the range of real numbers into intervals of [$2^E,2^{E+1}$), where E is an Integer represented with finite bits, and defines equally spaced equal counts of discrete numbers in each interval. Since the numbers are defined discretely, not only the number representation itself includes errors but in floating-point arithmetic some strange behaviors are observed which cannot be agreed with the real world arithmetic. In this paper errors with floating-point number representation, those with arithmetic operations, and those due to order of arithmetic operations are analyzed theoretically with help of and verification with the results of some MATLAB program executions.

Kth order Newton-Raphson's Floating Point Number Nth Root (K차 뉴톤-랍손 부동소수점수 N차 제곱근)

  • Cho, Gyeong-Yeon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.13 no.1
    • /
    • pp.45-51
    • /
    • 2018
  • In this paper, a tentative Kth order Newton-Raphson's floating point number Nth root algorithm for K order convergence rate in one iteration is proposed by applying Taylor series to the Newton-Raphson root algorithm. Using the proposed algorithm, $F^{-1/N}$ and $F^{-(N-1)/N}$ can be computed from iterative multiplications without division. It also predicts the error of the algorithm iteration and iterates only until the predicted error becomes smaller than the specified value. Since the proposed algorithm only performs the multiplications until the error gets smaller than a given value, it can be used to improve the performance of a floating point number Nth root unit.

Floating Point Number N'th Root K'th Order Goldschmidt Algorithm (부동소수점수 N차 제곱근 K차 골드스미스 알고리즘)

  • Cho, Gyeong Yeon
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.9
    • /
    • pp.1029-1035
    • /
    • 2019
  • In this paper, a tentative Kth order Goldschmidt floating point number Nth root algorithm for K order convergence rate in one iteration is proposed by applying Taylor series to the Goldschmidt square root algorithm. Using the proposed algorithm, Nth root and Nth inverse root can be computed from iterative multiplications without division. It also predicts the error of the algorithm iteration. It iterates until the predicted error becomes smaller than the specified value. Since the proposed algorithm only performs the multiplications until the error gets smaller than a given value, it can be used to improve the performance of a floating point number Nth root unit.

A Fixed-point Digital Signal Processor Development System Employing an Automatic Scaling (자동 스케일링 기능이 지원되는 고정 소수집 디지털 시그날 프로세서 개발 시스템)

  • 김시현;성원용
    • Journal of the Korean Institute of Telematics and Electronics A
    • /
    • v.29A no.3
    • /
    • pp.96-105
    • /
    • 1992
  • The use of fixed-point digital signal processors, such as the TMS 320C25, requires scaling of data at each arithmetic step to prevent overflows while keeping the accuracy. A software which automatizes this process is developed for TMS 320C25. The programmers use a model of a hypothetical floating-point digital signal processor and a floating-point format for data representation. However, the program and data are automatically translated to a fixed-point version by this software. Thus, the execution speed is not sacrificed. A fixed-point variable has a unique binary-point location, which is dependent on the range of the variable. The range is estimated from the floating-point simulation. The number of shifts needed for arithmetic or data transfer step is determined by the binary-points of the variables associated with the operation. A fixed-point code generator is also developed by using the proposed automatic scaling software. This code generator produces floating-point assembly programs from the specifiations of FIR, IIR, and adaptive transversal filters, then floating-point programs are transformed to fixed-point versions by the automatic scaling software.

  • PDF

Floating Point Converter Design Supporting Double/Single Precision of IEEE754 (IEEE754 단정도 배정도를 지원하는 부동 소수점 변환기 설계)

  • Park, Sang-Su;Kim, Hyun-Pil;Lee, Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.48 no.10
    • /
    • pp.72-81
    • /
    • 2011
  • In this paper, we proposed and designed a novel floating point converter which supports single and double precisions of IEEE754 standard. The proposed convertor supports conversions between floating point number single/double precision and signed fixed point number(32bits/64bits) as well as conversions between signed integer(32bits/64bits) and floating point number single/double precision and conversions between floating point number single and double precisions. We defined a new internal format to convert various input types into one type so that overflow checking could be conducted easily according to range of output types. The internal format is similar to the extended format of floating point double precision defined in IEEE754 2008 standard. This standard specifies that minimum exponent bit-width of the extended format of floating point double precision is 15bits, but 11bits are enough to implement the proposed converting unit. Also, we optimized rounding stage of the convertor unit so that we could make it possible to operate rounding and represent correct negative numbers using an incrementer instead an adder. We designed single cycle data path and 5 cycles data path. After describing the HDL model for two data paths of the convertor, we synthesized them with TSMC 180nm technology library using Synopsys design compiler. Cell area of synthesis result occupies 12,886 gates(2 input NAND gate), and maximum operating frequency is 411MHz.

An exact floating point square root calculator using multiplier (곱셈기를 이용한 정확한 부동소수점 제곱근 계산기)

  • Cho, Gyeong-Yeon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.13 no.8
    • /
    • pp.1593-1600
    • /
    • 2009
  • There are two major algorithms to find a square root of floating point number, one is the Newton_Raphson algorithm and GoldSchmidt algorithm which calculate it approximately by iterating multiplications and the other is SRT algorithm which calculates it exactly by iterating subtractions. This paper proposes an exact floating point square root algorithm using only multiplication. At first an approximate inverse square root is calculated by Newton_Raphson algorithm, and then an exact square root algorithm by reducing an error in it and a compensation algorithm of it are proposed. The proposed algorithm is verified to calculate all of numbers in a single precision floating point number and 1 billion random numbers in a double precision floating point number. The proposed algorithm requires only the multipliers without another hardware, so it can be widely used in an embedded system and mobile production which requires an efact square root of floating point number.

Implementation of Efficient Exponential Function Approximation Algorithm Using Format Converter Based on Floating Point Operation in FPGA (부동소수점 기반의 포맷 컨버터를 이용한 효율적인 지수 함수 근사화 알고리즘의 FPGA 구현)

  • Kim, Jeong-Seob;Jung, Seul
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.15 no.11
    • /
    • pp.1137-1143
    • /
    • 2009
  • This paper presents the FPGA implementation of efficient algorithms for approximating exponential function based on floating point format data. The Taylor-Maclaurin expansion as a conventional approximation method becomes inefficient since high order expansion is required for the large number to satisfy the approximation error. A format converter is designed to convert fixed data format to floating data format, and then the real number is separated into two fields, an integer field and an exponent field to separately perform mathematic operations. A new assembly command is designed and added to previously developed command set to refer the math table. To test the proposed algorithm, assembly program has been developed. The program is downloaded into the Altera DSP KIT W/STRATIX II EP2S180N Board. Performances of the proposed method are compared with those of the Taylor-Maclaurin expansion.