• Title/Summary/Keyword: modular arithmetic

Search Result 53, Processing Time 0.027 seconds

Design of a ECC arithmetic engine for Digital Transmission Contents Protection (DTCP) (컨텐츠 보호를 위한 DTCP용 타원곡선 암호(ECC) 연산기의 구현)

  • Kim Eui seek;Jeong Yong jin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.3C
    • /
    • pp.176-184
    • /
    • 2005
  • In this paper, we implemented an Elliptic Curve Cryptography(ECC) processor for Digital Transmission Contents Protection (DTCP), which is a standard for protecting various digital contents in the network. Unlikely to other applications, DTCP uses ECC algorithm which is defined over GF(p), where p is a 160-bit prime integer. The core arithmetic operation of ECC is a scalar multiplication, and it involves large amount of very long integer modular multiplications and additions. In this paper, the modular multiplier was designed using the well-known Montgomery algorithm which was implemented with CSA(Carry-save Adder) and 4-level CLA(Carry-lookahead Adder). Our new ECC processor has been synthesized using Samsung 0.18 m CMOS standard cell library, and the maximum operation frequency was estimated 98 MHz, with the size about 65,000 gates. The resulting performance was 29.6 kbps, that is, it took 5.4 msec to process a 160-bit data frame. We assure that this performance is enough to be used for digital signature, encryption and decryption, and key exchanges in real time environments.

Parallel Modular Multiplication Algorithm to Improve Time and Space Complexity in Residue Number System (RNS상에서 시간 및 공간 복잡도 향상을 위한 병렬 모듈러 곱셈 알고리즘)

  • 박희주;김현성
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.30 no.9
    • /
    • pp.454-460
    • /
    • 2003
  • In this paper, we present a novel method of parallelization of the modular multiplication algorithm to improve time and space complexity on RNS (Residue Number System). The parallel algorithm executes modular reduction using new table lookup based reduction method. MRS (Mixed Radix number System) is used because algebraic comparison is difficult in RNS which has a non-weighted number representation. Conversion from residue number system to certain MRS is relatively fast in residue computer. Therefore magnitude comparison is easily Performed on MRS. By the analysis of the algorithm, it is known that it requires only 1/2 table size than previous approach. And it requires 0(ι) arithmetic operations using 2ㅣ processors.

Study of Modular Multiplication Methods for Embedded Processors

  • Seo, Hwajeong;Kim, Howon
    • Journal of information and communication convergence engineering
    • /
    • v.12 no.3
    • /
    • pp.145-153
    • /
    • 2014
  • The improvements of embedded processors make future technologies including wireless sensor network and internet of things feasible. These applications firstly gather information from target field through wireless network. However, this networking process is highly vulnerable to malicious attacks including eavesdropping and forgery. In order to ensure secure and robust networking, information should be kept in secret with cryptography. Well known approach is public key cryptography and this algorithm consists of finite field arithmetic. There are many works considering high speed finite field arithmetic. One of the famous approach is Montgomery multiplication. In this study, we investigated Montgomery multiplication for public key cryptography on embedded microprocessors. This paper includes helpful information on Montgomery multiplication implementation methods and techniques for various target devices including 8-bit and 16-bit microprocessors. Further, we expect that the results reported in this paper will become part of a reference book for advanced Montgomery multiplication methods for future researchers.

DIVISIBILITY AND ARITHMETIC PROPERTIES OF CERTAIN ℓ-REGULAR OVERPARTITION PAIRS

  • ANUSREE ANAND;S.N. FATHIMA;M.A. SRIRAJ;P. SIVA KOTA REDDY
    • Journal of applied mathematics & informatics
    • /
    • v.42 no.4
    • /
    • pp.969-983
    • /
    • 2024
  • For an integer ℓ ≥ 1, let ${\bar{B}}_{\ell}(n)$ denotes the number of ℓ-regular over partition pairs of n. For certain conditions of ℓ, we study the divisibility of ${\bar{B}}_{\ell}(n)$ and arithmetic properties for ${\bar{B}}_{\ell}(n)$. We further obtain infinite family of congruences modulo 2t satisfied by ${\bar{B}}_3(n)$ employing a result of Ono and Taguchi (2005) on nilpotency of Hecke operators.

Design and Implementation of Arbitrary Precision Class for Public Key Crypto API based on Java Card (자바카드 기반 공개키 암호 API를 위한 임의의 정수 클래스 설계 및 구현)

  • Kim, Sung-Jun;Lee, Hei-Gyu;Cho, Han-Jin;Lee, Jae-Kwang
    • The KIPS Transactions:PartC
    • /
    • v.9C no.2
    • /
    • pp.163-172
    • /
    • 2002
  • Java Card API porvide benifit for development program based on smart card using limmited resource. This APIs does not support arithmetic operations such as modular arithmetic, greatest common divisor calculation, and generation and certification of prime number, which is necessary arithmetic in PKI algorithm implementation. In this paper, we implement class BigInteger acted in the Java Card platform because that Java Card APIs does not support class BigInteger necessary in implementation of PKI algorithm.

Design of Low-Latency Architecture for AB2 Multiplication over Finite Fields GF(2m) (유한체 GF(2m)상의 낮은 지연시간의 AB2 곱셈 구조 설계)

  • Kim, Kee-Won;Lee, Won-Jin;Kim, HyunSung
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.7 no.2
    • /
    • pp.79-84
    • /
    • 2012
  • Efficient arithmetic design is essential to implement error correcting codes and cryptographic applications over finite fields. This article presents an efficient $AB^2$ multiplier in GF($2^m$) using a polynomial representation. The proposed multiplier produces the result in m clock cycles with a propagation delay of two AND gates and two XOR gates using O($2^m$) area-time complexity. The proposed multiplier is highly modular, and consists of regular blocks of AND and XOR logic gates. Especially, exponentiation, inversion, and division are more efficiently implemented by applying $AB^2$ multiplication repeatedly rather than AB multiplication. As compared to related works, the proposed multiplier has lower area-time complexity, computational delay, and execution time and is well suited to VLSI implementation.

2,048 bits RSA public-key cryptography processor based on 32-bit Montgomery modular multiplier (32-비트 몽고메리 모듈러 곱셈기 기반의 2,048 비트 RSA 공개키 암호 프로세서)

  • Cho, Wook-Lae;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.8
    • /
    • pp.1471-1479
    • /
    • 2017
  • This paper describes a design of RSA public-key cryptography processor supporting key length of 2,048 bits. A modular multiplier that is core arithmetic function in RSA cryptography was designed using word-based Montgomery multiplication algorithm, and a modular exponentiation was implemented by using Left-to-Right (LR) binary exponentiation algorithm. A computation of a modular multiplication takes 8,386 clock cycles, and RSA encryption and decryption requires 185,724 and 25,561,076 clock cycles, respectively. The RSA processor was verified by FPGA implementation using Virtex5 device. The RSA cryptographic processor synthesized with 100 MHz clock frequency using a 0.18 um CMOS cell library occupies 12,540 gate equivalents (GEs) and 12 kbits memory. It was estimated that the RSA processor can operate up to 165 MHz, and the estimated time for RSA encryption and decryption operations are 1.12 ms and 154.91 ms, respectively.

Implementation of High-radix Modular Exponentiator for RSA using CRT (CRT를 이용한 하이래딕스 RSA 모듈로 멱승 처리기의 구현)

  • 이석용;김성두;정용진
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.10 no.4
    • /
    • pp.81-93
    • /
    • 2000
  • In a methodological approach to improve the processing performance of modulo exponentiation which is the primary arithmetic in RSA crypto algorithm, we present a new RSA hardware architecture based on high-radix modulo multiplication and CRT(Chinese Remainder Theorem). By implementing the modulo multiplier using radix-16 arithmetic, we reduced the number of PE(Processing Element)s by quarter comparing to the binary arithmetic scheme. This leads to having the number of clock cycles and the delay of pipelining flip-flops be reduced by quarter respectively. Because the receiver knows p and q, factors of N, it is possible to apply the CRT to the decryption process. To use CRT, we made two s/2-bit multipliers operating in parallel at decryption, which accomplished 4 times faster performance than when not using the CRT. In encryption phase, the two s/2-bit multipliers can be connected to make a s-bit linear multiplier for the s-bit arithmetic operation. We limited the encryption exponent size up to 17-bit to maintain high speed, We implemented a linear array modulo multiplier by projecting horizontally the DG of Montgomery algorithm. The H/W proposed here performs encryption with 15Mbps bit-rate and decryption with 1.22Mbps, when estimated with reference to Samsung 0.5um CMOS Standard Cell Library, which is the fastest among the publications at present.

A Study on Implementation of Multiple-Valued Arithmetic Processor using Current Mode CMOS (전류모드 CMOS에 의한 다치 연산기 구현에 관한 연구)

  • Seong, Hyeon-Kyeong;Yoon, Kwang-Sub
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.36C no.8
    • /
    • pp.35-45
    • /
    • 1999
  • In this paper, the addition and the multiplicative algorithm of two polynomials over finite field $GF(p^m)$ are presented. The 4-valued arithmetic processor of the serial input-parallel output modular structure on $GF(4^3)$ to be performed the presented algorithm is implemented by current mode CMOS. This 4-valued arithmetic processor using current mode CMOS is implemented one addition/multiplication selection circuit and three operation circuits; mod(4) multiplicative operation circuit, MOD operation circuit made by two mod(4) addition operation circuits, and primitive irreducible polynomial operation circuit to be performing same operation as mod(4) multiplicative operation circuit. These operation circuits are simulated under $2{\mu}m$ CMOS standard technology, $15{\mu}A$ unit current, and 3.3V VDD voltage using PSpice. The simulation results have shown the satisfying current characteristics. The presented 4-valued arithmetic processor using current mode CMOS is simple and regular for wire routing and possesses the property of modularity. Also, it is expansible for the addition and the multiplication of two polynomials on finite field increasing the degree m and suitable for VLSI implementation.

  • PDF

A Lightweight Hardware Accelerator for Public-Key Cryptography (공개키 암호 구현을 위한 경량 하드웨어 가속기)

  • Sung, Byung-Yoon;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.12
    • /
    • pp.1609-1617
    • /
    • 2019
  • Described in this paper is a design of hardware accelerator for implementing public-key cryptographic protocols (PKCPs) based on Elliptic Curve Cryptography (ECC) and RSA. It supports five elliptic curves (ECs) over GF(p) and three key lengths of RSA that are defined by NIST standard. It was designed to support four point operations over ECs and six modular arithmetic operations, making it suitable for hardware implementation of ECC- and RSA-based PKCPs. In order to achieve small-area implementation, a finite field arithmetic circuit was designed with 32-bit data-path, and it adopted word-based Montgomery multiplication algorithm, the Jacobian coordinate system for EC point operations, and the Fermat's little theorem for modular multiplicative inverse. The hardware operation was verified with FPGA device by implementing EC-DH key exchange protocol and RSA operations. It occupied 20,800 gate equivalents and 28 kbits of RAM at 50 MHz clock frequency with 180-nm CMOS cell library, and 1,503 slices and 2 BRAMs in Virtex-5 FPGA device.