Integrated Search | Korea Science

A Finite Field Multiplying Unit Using Mastrovito's Architecture

Moon, San-Gook
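- Korea Institute of Information and Communication Engineering: Conference Proceedings / Korea Institute of Maritime Information and Communication Sciences 2005 Spring Conference / pp.925-927 / 2005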
This study presents a finite field multiplying unit that performs a calculation t times as fast as Mastrovito's multiplier architecture, building on a proposed 2-times-faster multiplier architecture. Previous studies on finite field multiplication include the serial multiplication architecture, the array multiplication architecture, and the hybrid finite field multiplication architecture. Mastrovito's serial multiplication architecture has been regarded as the basic architecture for finite field multiplication; to exploit parallelism, finite field array multipliers spend proportionally more resources to gain proportionally more speed. The array multiplication architecture is weak in terms of area/performance ratio. In 1999, Paar proposed the hybrid multiplication architecture, which adopts benefits from both architectures. In the hybrid multiplication architecture, the main hardware frame is based on Mastrovito's serial multiplication architecture, with smaller 2-dimensional array multipliers as processing elements, so that its calculation speed is fairly fast at an intermediate resource cost. However, the order of the finite field must be a composite integer rather than a prime integer, which means it cannot be used in high-security applications. In this paper, we propose a different approach to devising a finite field multiplication architecture using Mastrovito's concepts.
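To make the serial baseline concrete, here is a minimal software sketch of bit-serial multiplication in GF(2^m) on a polynomial basis, consuming one bit of the second operand per iteration. It is a generic shift-and-add model, not the hardware described in the paper; the function name `gf2m_mul_serial` and the example field GF(2^4) with reduction polynomial x^4 + x + 1 are assumptions chosen for illustration.

```python
def gf2m_mul_serial(a: int, b: int, poly: int, m: int) -> int:
    """Multiply a and b in GF(2^m), one bit of b per iteration (MSB first).

    Field elements are integers whose bits are polynomial coefficients.
    `poly` is the irreducible polynomial including its x^m term,
    e.g. 0b10011 for x^4 + x + 1 (an illustrative choice).
    """
    acc = 0
    for i in reversed(range(m)):      # m iterations, like m clock cycles
        acc <<= 1                     # multiply the running result by x
        if (acc >> m) & 1:            # degree reached m: reduce once
            acc ^= poly
        if (b >> i) & 1:              # add (XOR) a when this bit of b is set
            acc ^= a
    return acc
```

For example, `gf2m_mul_serial(0b0010, 0b1000, 0b10011, 4)` multiplies x by x^3; the result x^4 reduces to x + 1, which is `0b0011`. The loop runs m times, which is the per-bit cost that faster architectures try to cut.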

Efficient Algorithm and Architecture for Elliptic Curve Cryptographic Processor

Nguyen, Tuy Tan;Lee, Hanho
- JSTS: Journal of Semiconductor Technology and Science / Vol. 16, No. 1 / pp.118-125 / 2016
This paper presents a new, highly efficient algorithm and architecture for an elliptic curve cryptographic processor. To reduce the computational complexity, novel modified Lopez-Dahab scalar point multiplication and left-to-right algorithms are proposed for the point multiplication operation. Moreover, bit-serial Galois-field multiplication is used to decrease hardware complexity. The field multiplication operations are performed in parallel to improve system latency. As a result, our approach can reduce hardware costs while keeping the total time required for point multiplication reasonable. Results on Xilinx Virtex-5 and Virtex-7 FPGAs and a VLSI implementation show that the proposed architecture has lower hardware complexity, fewer clock cycles, and higher efficiency than previous works.
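The left-to-right scan mentioned in the abstract can be sketched in its plain textbook form: walk the scalar from its most significant bit, doubling at every bit and adding the base point when the bit is 1. This is not the paper's modified Lopez-Dahab algorithm. The group operations are passed in as functions, and the names `scalar_mul_left_to_right`, `add`, `double`, and `identity` are all hypothetical.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def scalar_mul_left_to_right(
    k: int,
    point: T,
    add: Callable[[T, T], T],
    double: Callable[[T], T],
    identity: T,
) -> T:
    """Compute k * point by scanning the bits of k from MSB to LSB.

    `add`, `double`, and `identity` describe the group; for a real
    elliptic curve they would be point addition, point doubling, and the
    point at infinity. Here they are left abstract (an assumption).
    """
    result = identity
    for bit in bin(k)[2:]:            # binary digits, most significant first
        result = double(result)       # one doubling per scalar bit
        if bit == "1":
            result = add(result, point)   # one addition per set bit
    return result
```

A quick way to exercise it without curve arithmetic is the additive group of integers modulo 97 as a stand-in: with `add = lambda x, y: (x + y) % 97`, `double = lambda x: (2 * x) % 97`, and `identity = 0`, the call `scalar_mul_left_to_right(k, 5, add, double, 0)` equals `(k * 5) % 97`.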
https://doi.org/10.5573/JSTS.2016.16.1.118

Fast GF($2^m$) Multiplier Architecture Based on Common Factor Post-Processing Method

문상국
- Journal of the Korea Institute of Information and Communication Engineering / Vol. 8, No. 6 / pp.1188-1193 / 2004
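Finite field multipliers studied for high-security cryptography have broadly been classified into serial, array, and hybrid finite field multipliers. The serial finite field multiplier was proposed by Mastrovito (1) and has become established as the most basic finite field multiplier structure. The two-dimensional array finite field multiplier processes it in parallel, investing m times the resources to obtain m times the speed (2). The hybrid multiplier, proposed by Paar in 1999, takes only the advantages of these existing approaches (3). However, the hybrid multiplier restricts the usable finite fields: the degree of the field must be a composite number. In this paper, we take the structure of Mastrovito's multiplier as the basis and derive and apply a technique that algebraically factors out a common factor and post-processes it. The new finite field multiplier designed with the proposed method was implemented in HDL, and its function and performance were verified in both software and hardware terms. With the proposed method, when the serial polynomial is divided into t parts (t being a positive integer greater than 1), the multiplier can achieve a t-fold speedup.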
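The t-fold speedup can be illustrated with a generic digit-serial multiplier that consumes t coefficient bits of one operand per iteration, so the loop runs ceil(m/t) times instead of m. This is only a software model of the iteration count; it does not reproduce the paper's common-factor post-processing derivation. The name `gf2m_mul_digit_serial` and the example field GF(2^4) with x^4 + x + 1 are assumptions.

```python
def gf2m_mul_digit_serial(a: int, b: int, poly: int, m: int, t: int):
    """Multiply a and b in GF(2^m), consuming t bits of b per iteration.

    Returns (product, iterations). `poly` includes the x^m term.
    In hardware the inner loops would be combinational logic evaluated
    within one clock cycle; here they are ordinary Python loops.
    """
    steps = -(-m // t)                       # ceil(m / t) iterations
    acc = 0
    for s in reversed(range(steps)):         # most significant digit first
        digit = (b >> (s * t)) & ((1 << t) - 1)
        for _ in range(t):                   # acc <- acc * x^t mod poly
            acc <<= 1
            if (acc >> m) & 1:
                acc ^= poly
        shifted = a                          # a * x^j mod poly, j = 0..t-1
        for j in range(t):
            if (digit >> j) & 1:
                acc ^= shifted               # add the partial product
            shifted <<= 1
            if (shifted >> m) & 1:
                shifted ^= poly
    return acc, steps
```

With m = 4 and t = 2, `gf2m_mul_digit_serial(0b0010, 0b1000, 0b10011, 4, 2)` returns `(0b0011, 2)`: the same product as the bit-serial case in half the iterations, at the cost of extra partial-product logic per step.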

An Analytical Evaluation of 2D Mesh-connected SIMD Architecture for Parallel Matrix Multiplication

김정길
- Journal of the Korea Institute of Information and Communication Facilities Engineering / Vol. 10, No. 1 / pp.7-13 / 2011
Matrix multiplication is a fundamental operation of linear algebra and arises in many areas of science and engineering. This paper introduces an efficient parallel matrix multiplication scheme on an $N \times N$ mesh-connected SIMD array processor, called multiple hierarchical SIMD architecture (HMSA). The architectural characteristic of HMSA is its hierarchically structured control units, which consist of a global control unit, N local control units configured diagonally, and $N^2$ processing elements (PEs) arranged in an $N \times N$ array. PEs communicate through local buses connecting four adjacent neighbor PEs in mesh-torus networks and through global buses running across the rows and columns, called horizontal buses and vertical buses, respectively. This architecture gives HMSA diagonally indexed concurrent broadcast and alternating access to either the rows (row control mode) or the columns (column control mode) of the 2D PE array. An algorithmic mapping method is used for performance evaluation by mapping matrix multiplication onto the proposed architecture. The asymptotic time complexities of the mapped algorithms are evaluated, and the results show that parallel matrix multiplication on HMSA can provide a significant performance improvement.
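One way to picture how row and column buses help is a broadcast-based (outer-product) schedule: at step k, each row bus carries one element of column k of A, each column bus carries one element of row k of B, and every PE multiplies what it sees on its two buses and accumulates. The sketch below simulates that schedule sequentially. It is a generic formulation under that assumption, not the HMSA mapping or its diagonally indexed broadcast, and `mesh_matmul` is a hypothetical name.

```python
def mesh_matmul(a, b):
    """Simulate an N x N PE array multiplying two N x N matrices.

    PE(i, j) owns accumulator c[i][j]. Each of the N steps models one
    broadcast on every horizontal and vertical bus, followed by a
    simultaneous multiply-accumulate in all PEs.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for k in range(n):                            # N broadcast steps in total
        row_bus = [a[i][k] for i in range(n)]     # value seen along row i
        col_bus = [b[k][j] for j in range(n)]     # value seen along column j
        for i in range(n):                        # in hardware: all PEs at once
            for j in range(n):
                c[i][j] += row_bus[i] * col_bus[j]
    return c
```

Calling `mesh_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`. In this model the parallel machine needs N steps, whereas a single processor performs N^3 multiply-accumulates.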

Efficient Architecture of an n-bit Radix-4 Modular Multiplier in Systolic Array Structure

박태근;조광원
- The KIPS Transactions: Part A / Vol. 10A, No. 4 / pp.279-284 / 2003
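This paper proposes an efficient radix-4 modular multiplier architecture that uses a systolic array structure and is based on the Montgomery algorithm. The proposed algorithm reduces the number of iterations needed for modular multiplication, so an n-bit modular multiplication takes (3/2)n+2 clock cycles. Taking hardware utilization into account, however, two multiplications can be interleaved, and if each new multiplication starts at the earliest possible time, one modular multiplication requires n/2 clock cycles on average. Because the systolic array structure gives the proposed architecture regularity and scalability, it is easy to design as an efficient VLSI architecture. Compared with other existing architectures, the proposed architecture achieved high execution speed with relatively little hardware.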
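The reduction in iterations comes from processing two multiplier bits at a time. A minimal software sketch of textbook radix-4 Montgomery multiplication is shown below; it models only the arithmetic, not the systolic array or its clock schedule, and the function name `montgomery_mul_radix4` and its parameters are assumptions. It needs Python 3.8 or later for `pow` with a negative exponent.

```python
def montgomery_mul_radix4(a: int, b: int, modulus: int, n_bits: int) -> int:
    """Return a * b * 4^(-k) mod modulus, where k = ceil(n_bits / 2).

    Assumes an odd modulus below 2^n_bits and 0 <= a, b < modulus.
    Two bits of `a` are consumed per iteration, so the loop runs k times
    instead of n_bits times.
    """
    if modulus % 2 == 0:
        raise ValueError("Montgomery multiplication needs an odd modulus")
    k = (n_bits + 1) // 2
    m_prime = (-pow(modulus, -1, 4)) % 4      # -modulus^(-1) mod 4
    s = 0
    for i in range(k):
        digit = (a >> (2 * i)) & 3            # next radix-4 digit of a
        s += digit * b
        q = (s * m_prime) & 3                 # makes s + q*modulus divisible by 4
        s = (s + q * modulus) >> 2            # exact division by 4
    return s - modulus if s >= modulus else s  # s < 2*modulus at this point
```

The result carries the Montgomery factor 4^(-k): multiplying it by 4^k modulo the modulus recovers a * b mod modulus. As a worked reading of the abstract's clock counts, n = 1024 gives (3/2)n + 2 = 1538 cycles for a single multiplication and n/2 = 512 cycles on average when two multiplications are interleaved.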
https://doi.org/10.3745/KIPSTA.2003.10A.4.279

VLSI Design for Folded Wavelet Transform Processor using Multiple Constant Multiplication

김지원;손창훈;김송주;이배호;김영민
- Journal of Korea Multimedia Society / Vol. 15, No. 1 / pp.81-86 / 2012
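This paper proposes a VLSI architecture for a lifting-based 9/7 wavelet filter with optimized hardware multiplication. Unlike conventional lifting schemes that use general-purpose multipliers, the proposed architecture applies the Lefèvre algorithm, a pattern-search technique, to the wavelet coefficients, and applies MCM (multiple constant multiplication) and folding to the 9/7 DWT filter so that the hardware can be designed efficiently. This architecture has the advantage of 100% hardware utilization and, compared with previous designs, is efficient in terms of hardware simplicity, speed, area, and power consumption without degrading image quality. For comparison, the design was implemented in Verilog HDL and synthesized with standard cells in a 0.18 μm CMOS process. Compared with existing architectures at a synthesis target clock frequency of 200 MHz, area and power consumption were reduced by 60.1% and 44.1%, respectively, which shows that the architecture is better optimized for hardware implementation than previous lifting schemes.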
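Replacing a general-purpose multiplier with shifts and additions is the core idea behind constant multiplication. The sketch below uses canonical signed digit (CSD) recoding for a single constant, which is a simpler stand-in and not the Lefèvre pattern-search algorithm used in the paper; true MCM additionally shares intermediate terms across several constants, which is not shown. The names `csd_digits` and `mul_by_constant` are hypothetical.

```python
from typing import List


def csd_digits(c: int) -> List[int]:
    """Recode a non-negative constant into digits in {-1, 0, 1}, LSB first.

    The canonical signed digit form has no two adjacent nonzero digits,
    which keeps the number of adders/subtractors low.
    """
    digits = []
    while c:
        if c & 1:
            d = 2 - (c & 3)       # +1 if c % 4 == 1, -1 if c % 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def mul_by_constant(x: int, c: int) -> int:
    """Multiply x by the constant c using only shifts, adds, and subtracts."""
    total = 0
    for shift, d in enumerate(csd_digits(c)):
        if d == 1:
            total += x << shift
        elif d == -1:
            total -= x << shift
    return total
```

For instance, `csd_digits(7)` is `[-1, 0, 0, 1]`, so multiplying by 7 becomes `(x << 3) - x`: one subtraction instead of the two additions needed by the plain binary form 4x + 2x + x.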
https://doi.org/10.9717/kmms.2012.15.1.081

3X Serial GF($2^m$) Multiplier Architecture on Polynomial Basis Finite Field

문상국
- Journal of the Korea Institute of Information and Communication Engineering / Vol. 10, No. 2 / pp.328-332 / 2006
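For the ECC public-key cryptographic algorithm, which is becoming a new issue in information security applications, efficient arithmetic at the finite field level is important. Serial finite field multipliers originate from Mastrovito's serial multiplier. In this paper, we apply the polynomial basis and derive the equations to propose a finite field multiplier with three times the performance of Mastrovito's serial finite field multiplication, describe it in HDL, verify its function, and evaluate its performance. The designed 3x serial finite field multiplier achieved three times the performance of the conventional serial multiplier merely by adding a circuit that generates partial sums. Finite field multipliers studied for high-security cryptography have broadly been classified into serial, array, and hybrid finite field multipliers. In this paper, we take the structure of Mastrovito's multiplier as the basis and derive and apply a technique that algebraically factors out a common factor and post-processes it. The new finite field multiplier designed with the proposed method was implemented in HDL, and its function and performance were verified in both software and hardware terms.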

A Parallel-Architecture Processor Design for the Fast Multiplication of Homogeneous Transformation Matrices

권두올;정태상
- The Transactions of the Korean Institute of Electrical Engineers: Systems and Control Sector D / Vol. 54, No. 12 / pp.723-731 / 2005
The $4 \times 4$ homogeneous transformation matrix is a compact representation of the orientation and position of an object in robotics and computer graphics. A coordinate transformation is accomplished through successive multiplications of homogeneous matrices, each of which represents the orientation and position of the corresponding link. Thus, real-time control applications in robotics and animation in computer graphics demand fast multiplication of homogeneous matrices. In this paper, a parallel-architecture vector processor is designed for this purpose. The processor has several key features. For computational accuracy in real applications, the operands of the processor are floating-point numbers based on IEEE Standard 754. For parallelism and reduced hardware redundancy, the processor takes column vectors of homogeneous matrices as its unit of multiplication. To further improve throughput, the processor structure and its control are based on a pipelined structure. Since the designed processor can be used as a special-purpose coprocessor in robotics and computer graphics, several other useful instructions for various transformation algorithms are included, in addition to special matrix/matrix and matrix/vector multiplication, to widen the applicability of the new design. The suggested instruction set will serve as a standard in future processor design for robotics and computer graphics. The design is verified using an FPGA implementation. The performance improvement of the proposed design over a uniprocessor approach is also studied to assess its suitability for real-time applications.
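Taking column vectors as the unit of multiplication means that a matrix product A·B is computed as four independent matrix-vector products, one per column of B, which is what makes the work easy to parallelize. The following sketch shows that decomposition in plain Python; it says nothing about the processor's pipeline or instruction set, and the helper names `mat_vec`, `mat_mul_by_columns`, and `translation` are hypothetical.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-element column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def mat_mul_by_columns(a, b):
    """Compute a @ b one column of b at a time.

    Each of the four mat_vec calls is independent of the others, so a
    vector processor could issue them in parallel or pipeline them.
    """
    cols = [mat_vec(a, [b[r][j] for r in range(4)]) for j in range(4)]
    return [[cols[j][r] for j in range(4)] for r in range(4)]


def translation(tx, ty, tz):
    """Homogeneous matrix for a pure translation."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]
```

Chaining two translations, `mat_mul_by_columns(translation(1, 2, 3), translation(4, 5, 6))`, gives `translation(5, 7, 9)`, the same composition a robot arm model performs link by link.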

An Optimized Hybrid Radix MAC Design

정진우;김승철;이용주;이용석
- Institute of Electronics Engineers of Korea: Conference Proceedings / IEEK 2002 Summer Conference Proceedings (2) / pp.173-176 / 2002
This paper presents a high-speed MAC (multiplier and accumulator) design that applies the radix-4 and radix-8 Booth algorithms at the same time. The optimized hybrid-radix design takes advantage of both the radix-4 and radix-8 architectures. A radix-4 architecture achieves high speed, but it takes much more power and chip area than a radix-8 architecture. A radix-8 architecture needs less power and chip area, but it has the bottleneck of generating three times the multiplicand. The optimized hybrid architecture performs the radix-4 multiplication partially in parallel with the generation of three times the multiplicand for use in the radix-8 multiplication. This reduces the bit width of the multiplier that the radix-8 multiplication has to handle.
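The trade-off between the two radices shows up directly in Booth recoding: radix-4 digits lie in [-2, 2] and need only shifts and negation of the multiplicand, while radix-8 digits lie in [-4, 4], produce fewer partial products, but include ±3, which requires a real addition to form three times the multiplicand. The sketch below is a generic software model of the recoding, not the hybrid MAC itself; `booth_digits` and `booth_multiply` are hypothetical names.

```python
def booth_digits(y: int, n_bits: int, radix_bits: int):
    """Booth-recode an n_bits two's-complement multiplier, LSB digit first.

    radix_bits=2 gives radix-4 digits in [-2, 2];
    radix_bits=3 gives radix-8 digits in [-4, 4].
    """
    count = -(-n_bits // radix_bits)             # number of digits (ceil)
    total_bits = count * radix_bits
    low_mask = (1 << n_bits) - 1
    y &= low_mask                                # two's-complement bit pattern
    if (y >> (n_bits - 1)) & 1:                  # sign-extend to total_bits
        y |= ((1 << total_bits) - 1) & ~low_mask
    digits = []
    prev = 0                                     # implicit bit below the LSB
    for i in range(count):
        group = (y >> (i * radix_bits)) & ((1 << radix_bits) - 1)
        msb = group >> (radix_bits - 1)
        digits.append(group - (msb << radix_bits) + prev)
        prev = msb                               # overlaps into the next group
    return digits


def booth_multiply(x: int, y: int, n_bits: int, radix_bits: int) -> int:
    """Sum the partial products digit * x * 2^(radix_bits * i)."""
    digits = booth_digits(y, n_bits, radix_bits)
    return sum(d * (x << (radix_bits * i)) for i, d in enumerate(digits))
```

For an 8-bit multiplier, radix-4 recoding yields 4 partial products and radix-8 yields 3; for example, `booth_digits(7, 8, 2)` is `[-1, 2, 0, 0]`, that is, 7 = -1 + 2·4.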

An Optimized Hybrid Radix MAC Design

정진우;김승철;이용주;이용석
- Institute of Electronics Engineers of Korea: Conference Proceedings / IEEK 2002 Summer Conference Proceedings (1) / pp.125-128 / 2002
This paper presents a high-speed MAC (multiplier and accumulator) design that applies the radix-4 and radix-8 Booth algorithms at the same time. The optimized hybrid-radix design takes advantage of both the radix-4 and radix-8 architectures. A radix-4 architecture achieves high speed, but it takes much more power and chip area than a radix-8 architecture. A radix-8 architecture needs less power and chip area, but it has the bottleneck of generating three times the multiplicand. The optimized hybrid architecture performs the radix-4 multiplication partially in parallel with the generation of three times the multiplicand for use in the radix-8 multiplication. This reduces the bit width of the multiplier that the radix-8 multiplication has to handle.

167 search results; processing time 0.023 s