• Title/Summary/Keyword: Finite Field Multiplication

Search Result 118, Processing Time 0.022 seconds

VLSI Architecture for High Speed Implementation of Elliptic Curve Cryptographic Systems (타원곡선 암호 시스템의 고속 구현을 위한 VLSI 구조)

  • Kim, Chang-Hoon
    • The KIPS Transactions:PartC
    • /
    • v.15C no.2
    • /
    • pp.133-140
    • /
    • 2008
  • In this paper, we propose a high performance elliptic curve cryptographic processor over $GF(2^{163})$. The proposed architecture is based on a modified Lopez-Dahab elliptic curve point multiplication algorithm and uses Gaussian normal basis for $GF(2^{163})$ field arithmetic. To achieve a high throughput rates, we design two new word-level arithmetic units over $GF(2^{163})$ and derive a parallelized elliptic curve point doubling and point addition algorithm with uniform addressing based on the Lopez-Dahab method. We implement our design using Xilinx XC4VLX80 FPGA device which uses 24,263 slices and has a maximum frequency of 143MHz. Our design is roughly 4.8 times faster with 2 times increased hardware complexity compared with the previous hardware implementation proposed by Shu. et. al. Therefore, the proposed elliptic curve cryptographic processor is well suited to elliptic curve cryptosystems requiring high throughput rates such as network processors and web servers.

A New Parallel Multiplier for Type II Optimal Normal Basis (타입 II 최적 정규기저를 갖는 유한체의 새로운 병렬곱셈 연산기)

  • Kim Chang-Han;Jang Sang-Woon;Lim Jong-In;Ji Sung-Yeon
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.16 no.4
    • /
    • pp.83-89
    • /
    • 2006
  • In H/W implementation for the finite field, the use of normal basis has several advantages, especially, the optimal normal basis is the most efficient to H/W implementation in GF($2^m$). In this paper, we propose a new, simpler, parallel multiplier over GF($2^m$) having a type II optimal normal basis, which performs multiplication over GF($2^m$) in the extension field GF($2^{2m}$). The time and area complexity of the proposed multiplier is same as the best of known type II optimal normal basis parallel multiplier.

Design of Systolic Multipliers in GF(2$^{m}$ ) Using an Irreducible All One Polynomial (기약 All One Polynomial을 이용한 유한체 GF(2$^{m}$ )상의 시스톨릭 곱셈기 설계)

  • Gwon, Sun Hak;Kim, Chang Hun;Hong, Chun Pyo
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.8C
    • /
    • pp.1047-1054
    • /
    • 2004
  • In this paper, we present two systolic arrays for computing multiplications in CF(2$\^$m/) generated by an irreducible all one polynomial (AOP). The proposed two systolic mays have parallel-in parallel-out structure. The first systolic multiplier has area complexity of O(㎡) and time complexity of O(1). In other words, the multiplier consists of m(m+1)/2 identical cells and produces multiplication results at a rate of one every 1 clock cycle, after an initial delay of m/2+1 cycles. Compared with the previously proposed related multiplier using AOP, our design has 12 percent reduced hardware complexity and 50 percent reduced computation delay time. The other systolic multiplier, designed for cryptographic applications, has area complexity of O(m) and time complexity of O(m), i.e., it is composed of m+1 identical cells and produces multiplication results at a rate of one every m/2+1 clock cycles. Compared with other linear systolic multipliers, we find that our design has at least 43 percent reduced hardware complexity, 83 percent reduced computation delay time, and has twice higher throughput rate Furthermore, since the proposed two architectures have a high regularity and modularity, they are well suited to VLSI implementations. Therefore, when the proposed architectures are used for GF(2$\^$m/) applications, one can achieve maximum throughput performance with least hardware requirements.

Efficiently Hybrid $MSK_k$ Method for Multiplication in $GF(2^n)$ ($GF(2^n)$ 곱셈을 위한 효율적인 $MSK_k$ 혼합 방법)

  • Ji, Sung-Yeon;Chang, Nam-Su;Kim, Chang-Han;Lim, Jong-In
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.9
    • /
    • pp.1-9
    • /
    • 2007
  • For an efficient implementation of cryptosystems based on arithmetic in a finite field $GF(2^n)$, their hardware implementation is an important research topic. To construct a multiplier with low area complexity, the divide-and-conquer technique such as the original Karatsuba-Ofman method and multi-segment Karatsuba methods is a useful method. Leone proposed an efficient parallel multiplier with low area complexity, and Ernst at al. proposed a multiplier of a multi-segment Karatsuba method. In [1], the authors proposed new $MSK_5$ and $MSK_7$ methods with low area complexity to improve Ernst's method. In [3], the authors proposed a method which combines $MSK_2$ and $MSK_3$. In this paper we propose an efficient multiplication method by combining $MSK_2,\;MSK_3\;and\;MSK_5$ together. The proposed method reduces $116{\cdot}3^l$ gates and $2T_X$ time delay compared with Gather's method at the degree $25{\cdot}2^l-2^l with l>0.

Design of an Efficient Bit-Parallel Multiplier using Trinomials (삼항 다항식을 이용한 효율적인 비트-병렬 구조의 곱셈기)

  • 정석원;이선옥;김창한
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.13 no.5
    • /
    • pp.179-187
    • /
    • 2003
  • Recently efficient implementation of finite field operation has received a lot of attention. Among the GF($2^m$) arithmetic operations, multiplication process is the most basic and a critical operation that determines speed-up hardware. We propose a hardware architecture using Mastrovito method to reduce processing time. Existing Mastrovito multipliers using the special generating trinomial p($\chi$)=$x^m$+$x^n$+1 require $m^2$-1 XOR gates and $m^2$ AND gates. The proposed multiplier needs $m^2$ AND gates and $m^2$+($n^2$-3n)/2 XOR gates that depend on the intermediate term xn. Time complexity of existing multipliers is $T_A$+( (m-2)/(m-n) +1+ log$_2$(m) ) $T_X$ and that of proposed method is $T_X$+(1+ log$_2$(m-1)+ n/2 ) )$T_X$. The proposed architecture is efficient for the extension degree m suggested as standards: SEC2, ANSI X9.63. In average, XOR space complexity is increased to 1.18% but time complexity is reduced 9.036%.

A Fast Algorithm for Computing Multiplicative Inverses in GF(2$^{m}$) using Factorization Formula and Normal Basis (인수분해 공식과 정규기저를 이용한 GF(2$^{m}$ ) 상의 고속 곱셈 역원 연산 알고리즘)

  • 장용희;권용진
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.30 no.5_6
    • /
    • pp.324-329
    • /
    • 2003
  • The public-key cryptosystems such as Diffie-Hellman Key Distribution and Elliptical Curve Cryptosystems are built on the basis of the operations defined in GF(2$^{m}$ ):addition, subtraction, multiplication and multiplicative inversion. It is important that these operations should be computed at high speed in order to implement these cryptosystems efficiently. Among those operations, as being the most time-consuming, multiplicative inversion has become the object of lots of investigation Formant's theorem says $\beta$$^{-1}$ =$\beta$$^{2}$sup m/-2/, where $\beta$$^{-1}$ is the multiplicative inverse of $\beta$$\in$GF(2$^{m}$ ). Therefore, to compute the multiplicative inverse of arbitrary elements of GF(2$^{m}$ ), it is most important to reduce the number of times of multiplication by decomposing 2$^{m}$ -2 efficiently. Among many algorithms relevant to the subject, the algorithm proposed by Itoh and Tsujii[2] has reduced the required number of times of multiplication to O(log m) by using normal basis. Furthermore, a few papers have presented algorithms improving the Itoh and Tsujii's. However they have some demerits such as complicated decomposition processes[3,5]. In this paper, in the case of 2$^{m}$ -2, which is mainly used in practical applications, an efficient algorithm is proposed for computing the multiplicative inverse at high speed by using both the factorization formula x$^3$-y$^3$=(x-y)(x$^2$+xy+y$^2$) and normal basis. The number of times of multiplication of the algorithm is smaller than that of the algorithm proposed by Itoh and Tsujii. Also the algorithm decomposes 2$^{m}$ -2 more simply than other proposed algorithms.

A Lightweight Hardware Accelerator for Public-Key Cryptography (공개키 암호 구현을 위한 경량 하드웨어 가속기)

  • Sung, Byung-Yoon;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.12
    • /
    • pp.1609-1617
    • /
    • 2019
  • Described in this paper is a design of hardware accelerator for implementing public-key cryptographic protocols (PKCPs) based on Elliptic Curve Cryptography (ECC) and RSA. It supports five elliptic curves (ECs) over GF(p) and three key lengths of RSA that are defined by NIST standard. It was designed to support four point operations over ECs and six modular arithmetic operations, making it suitable for hardware implementation of ECC- and RSA-based PKCPs. In order to achieve small-area implementation, a finite field arithmetic circuit was designed with 32-bit data-path, and it adopted word-based Montgomery multiplication algorithm, the Jacobian coordinate system for EC point operations, and the Fermat's little theorem for modular multiplicative inverse. The hardware operation was verified with FPGA device by implementing EC-DH key exchange protocol and RSA operations. It occupied 20,800 gate equivalents and 28 kbits of RAM at 50 MHz clock frequency with 180-nm CMOS cell library, and 1,503 slices and 2 BRAMs in Virtex-5 FPGA device.

A Scalable ECC Processor for Elliptic Curve based Public-Key Cryptosystem (타원곡선 기반 공개키 암호 시스템 구현을 위한 Scalable ECC 프로세서)

  • Choi, Jun-Baek;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.8
    • /
    • pp.1095-1102
    • /
    • 2021
  • A scalable ECC architecture with high scalability and flexibility between performance and hardware complexity is proposed. For architectural scalability, a modular arithmetic unit based on a one-dimensional array of processing element (PE) that performs finite field operations on 32-bit words in parallel was implemented, and the number of PEs used can be determined in the range of 1 to 8 for circuit synthesis. A scalable algorithms for word-based Montgomery multiplication and Montgomery inversion were adopted. As a result of implementing scalable ECC processor (sECCP) using 180-nm CMOS technology, it was implemented with 100 kGEs and 8.8 kbits of RAM when NPE=1, and with 203 kGEs and 12.8 kbits of RAM when NPE=8. The performance of sECCP with NPE=1 and NPE=8 was analyzed to be 110 PSMs/sec and 610 PSMs/sec, respectively, on P256R elliptic curve when operating at 100 MHz clock.