• Title/Summary/Keyword: Montgomery Multiplier

Search Result 48, Processing Time 0.025 seconds

A Public-Key Cryptography Processor Supporting GF(p) 224-bit ECC and 2048-bit RSA (GF(p) 224-비트 ECC와 2048-비트 RSA를 지원하는 공개키 암호 프로세서)

  • Sung, Byung-Yoon;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2018.05a
    • /
    • pp.163-165
    • /
    • 2018
  • GF(p)상 타원곡선 암호(ECC)와 RSA를 단일 하드웨어로 통합하여 구현한 공개키 암호 프로세서를 설계하였다. 설계된 EC-RSA 공개키 암호 프로세서는 NIST 표준에 정의된 소수체 상의 224-비트 타원 곡선 P-224와 2048-비트 키 길이의 RSA를 지원한다. ECC와 RSA가 갖는 연산의 공통점을 기반으로 워드기반 몽고메리 곱셈기와 메모리 블록을 효율적으로 결합하여 최적화된 데이터 패스 구조를 적용하였다. EC-RSA 공개키 암호 프로세서는 Modelsim을 이용한 기능검증을 통하여 정상동작을 확인하였으며, $0.18{\mu}m$ CMOS 셀 라이브러리로 합성한 결과 11,779 GEs와 14-Kbit RAM의 경량 하드웨어로 구현되었다. EC-RSA 공개키 암호 프로세서는 최대 동작주파수 133 MHz이며, ECC 연산에는 867,746 클록주기가 소요되며, RSA 복호화 연산에는 26,149,013 클록주기가 소요된다.

  • PDF

FPGA Implementation of RSA Public-Key Cryptographic Coprocessor for Restricted System

  • Kim, Mooseop;Park, Yongje;Kim, Howon
    • Proceedings of the IEEK Conference
    • /
    • 2002.07c
    • /
    • pp.1551-1554
    • /
    • 2002
  • In this paper, the hardware implementation of the RSA public-key cryptographic algorithm is presented. The RSA cryptographic algorithm is depends on the computation of repeated modular exponentials. The Montgomery algorithm is used and modified to reduce hardware resources and to achieve reasonable operating speed for smart card. An efficient architecture for modular multiplications based on the array multiplier is proposed. We have implemented a 10240it RSA cryptographic processor based on proposed scheme in IESA system developed for smart card emulating system. As a result, it is shown that proposed architecture contributes to small area and reasonable speed for smart cards.

  • PDF

Efficient Architecture of an n-bit Radix-4 Modular Multiplier in Systolic Array Structure (시스톨릭 어레이 구조를 갖는 효율적인 n-비트 Radix-4 모듈러 곱셈기 구조)

  • Park, Tae-geun;Cho, Kwang-won
    • The KIPS Transactions:PartA
    • /
    • v.10A no.4
    • /
    • pp.279-284
    • /
    • 2003
  • In this paper, we propose an efficient architecture for radix-4 modular multiplication in systolic array structure based on the Montgomery's algorithm. We propose a radix-4 modular multiplication algorithm to reduce the number of iterations, so that it takes (3/2)n+2 clock cycles to complete an n-bit modular multiplication. Since we can interleave two consecutive modular multiplications for 100% hardware utilization and can start the next multiplication at the earliest possible moment, it takes about only n/2 clock cycles to complete one modular multiplication in the average. The proposed architecture is quite regular and scalable due to the systolic array structure so that it fits in a VLSI implementation. Compared to conventional approaches, the proposed architecture shows shorter period to complete a modular multiplication while requiring relatively less hardware resources.

A Lightweight Hardware Implementation of ECC Processor Supporting NIST Elliptic Curves over GF(2m) (GF(2m) 상의 NIST 타원곡선을 지원하는 ECC 프로세서의 경량 하드웨어 구현)

  • Lee, Sang-Hyun;Shin, Kyung-Wook
    • Journal of IKEEE
    • /
    • v.23 no.1
    • /
    • pp.58-67
    • /
    • 2019
  • A design of an elliptic curve cryptography (ECC) processor that supports both pseudo-random curves and Koblitz curves over $GF(2^m)$ defined by the NIST standard is described in this paper. A finite field arithmetic circuit based on a word-based Montgomery multiplier was designed to support five key lengths using a datapath of fixed size, as well as to achieve a lightweight hardware implementation. In addition, Lopez-Dahab's coordinate system was adopted to remove the finite field division operation. The ECC processor was implemented in the FPGA verification platform and the hardware operation was verified by Elliptic Curve Diffie-Hellman (ECDH) key exchange protocol operation. The ECC processor that was synthesized with a 180-nm CMOS cell library occupied 10,674 gate equivalents (GEs) and a dual-port RAM of 9 kbits, and the maximum clock frequency was estimated at 154 MHz. The scalar multiplication operation over the 223-bit pseudo-random elliptic curve takes 1,112,221 clock cycles and has a throughput of 32.3 kbps.

A Public-Key Crypto-Core supporting Edwards Curves of Edwards25519 and Edwards448 (에드워즈 곡선 Edwards25519와 Edwards448을 지원하는 공개키 암호 코어)

  • Yang, Hyeon-Jun;Shin, Kyung-Wook
    • Journal of IKEEE
    • /
    • v.25 no.1
    • /
    • pp.174-179
    • /
    • 2021
  • An Edwards curve cryptography (EdCC) core supporting point scalar multiplication (PSM) on Edwards curves of Edwards25519 and Edwards448 was designed. For area-efficient implementation, finite field multiplier based on word-based Montgomery multiplication algorithm was designed, and the extended twisted Edwards coordinates system was adopted to implement point operations without division operation. As a result of synthesizing the EdCC core with 100 MHz clock, it was implemented with 24,073 equivalent gates and 11 kbits RAM, and the maximum operating frequency was estimated to be 285 MHz. The evaluation results show that the EdCC core can compute 299 and 66 PSMs per second on Edwards25519 and Edwards448 curves, respectively. Compared to the ECC core with similar structure, the number of clock cycles required for 256-bit PSM was reduced by about 60%, resulting in 7.3 times improvement in computational performance.

A Security SoC supporting ECC based Public-Key Security Protocols (ECC 기반의 공개키 보안 프로토콜을 지원하는 보안 SoC)

  • Kim, Dong-Seong;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.11
    • /
    • pp.1470-1476
    • /
    • 2020
  • This paper describes a design of a lightweight security system-on-chip (SoC) suitable for the implementation of security protocols for IoT and mobile devices. The security SoC using Cortex-M0 as a CPU integrates hardware crypto engines including an elliptic curve cryptography (ECC) core, a SHA3 hash core, an ARIA-AES block cipher core and a true random number generator (TRNG) core. The ECC core was designed to support twenty elliptic curves over both prime field and binary field defined in the SEC2, and was based on a word-based Montgomery multiplier in which the partial product generations/additions and modular reductions are processed in a sub-pipelining manner. The H/W-S/W co-operation for elliptic curve digital signature algorithm (EC-DSA) protocol was demonstrated by implementing the security SoC on a Cyclone-5 FPGA device. The security SoC, synthesized with a 65-nm CMOS cell library, occupies 193,312 gate equivalents (GEs) and 84 kbytes of RAM.

Design of high-speed RSA processor based on radix-4 Montgomery multiplier (래딕스-4 몽고메리 곱셈기 기반의 고속 RSA 연산기 설계)

  • Koo, Bon-Seok;Ryu, Gwon-Ho;Chang, Tae-Joo;Lee, Sang-Jin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.17 no.6
    • /
    • pp.29-39
    • /
    • 2007
  • RSA is one of the most popular public-key crypto-system in various applications. This paper addresses a high-speed RSA crypto-processor with modified radix-4 modular multiplication algorithm and Chinese Remainder Theorem(CRT) using Carry Save Adder(CSA). Our design takes 0.84M clock cycles for a 1024-bit modular exponentiation and 0.25M cycles for a 512-bit exponentiations. With 0.18um standard cell library, the processor achieves 365Kbps for a 1024-bit exponentiation and 1,233Kbps for two 512-bit exponentiations at a 300MHz clock rate.

Implementation of RSA modular exponentiator using Division Chain (나눗셈 체인을 이용한 RSA 모듈로 멱승기의 구현)

  • 김성두;정용진
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.12 no.2
    • /
    • pp.21-34
    • /
    • 2002
  • In this paper we propos a new hardware architecture of modular exponentiation using a division chain method which has been proposed in (2). Modular exponentiation using the division chain is performed by receding an exponent E as a mixed form of multiplication and addition with divisors d=2 or $d=2^I +1$ and respective remainders r. This calculates the modular exponentiation in about $1.4log_2$E multiplications on average which is much less iterations than $2log_2$E of conventional Binary Method. We designed a linear systolic array multiplier with pipelining and used a horizontal projection on its data dependence graph. So, for k-bit key, two k-bit data frames can be inputted simultaneously and two modular multipliers, each consisting of k/2+3 PE(Processing Element)s, can operate in parallel to accomplish 100% throughput. We propose a new encoding scheme to represent divisors and remainders of the division chain to keep regularity of the data path. When it is synthesized to ASIC using Samsung 0.5 um CMOS standard cell library, the critical path delay is 4.24ns, and resulting performance is estimated to be abort 140 Kbps for a 1024-bit data frame at 200Mhz clock In decryption process, the speed can be enhanced to 560kbps by using CRT(Chinese Remainder Theorem). Futhermore, to satisfy real time requirements we can choose small public exponent E, such as 3,17 or $2^{16} +1$, in encryption and verification process. in which case the performance can reach 7.3Mbps.