Search | Korea Science

Low-area Bit-parallel Systolic Array for Multiplication and Square over Finite Fields

Kim, Keewon
- Journal of the Korea Society of Computer and Information, v.25 no.2, pp.41-48, 2020
In this paper, we derive a common computational part in an algorithm that can simultaneously perform multiplication and squaring over finite fields, and propose a low-area bit-parallel systolic array that reduces hardware through sequential processing. The proposed systolic array has lower space and area-time (AT) complexity than existing related arrays. In detail, compared with the Choi-Lee and Kim-Kim systolic arrays, it reduces area complexity by about 48% and 44%, and AT complexity by about 74% and 44%, respectively. Therefore, the proposed systolic array is suitable for VLSI implementation and can serve as a basic component in hardware-constrained environments such as IoT.
https://doi.org/10.9708/jksci.2020.25.02.041
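
The idea of computing a product and a square in one pass can be illustrated in software. The following is a minimal sketch, not the paper's systolic array: it assumes a polynomial-basis representation of GF(2^m), and the names `times_x` and `mul_and_square` are hypothetical. Both results share the same MSB-first loop and the same multiply-by-x reduction step.

```python
def times_x(v: int, poly: int, m: int) -> int:
    """Multiply v by x in GF(2^m) and reduce by poly (poly includes the x^m term)."""
    v <<= 1
    if (v >> m) & 1:
        v ^= poly
    return v


def mul_and_square(a: int, b: int, poly: int, m: int) -> tuple[int, int]:
    """Return (a*b, a*a) in GF(2^m) from one shared MSB-first loop.

    Assumes a and b are already reduced (both below 2**m).
    """
    prod = sq = 0
    for i in range(m - 1, -1, -1):
        # Shared step: both accumulators are multiplied by x and reduced.
        prod = times_x(prod, poly, m)
        sq = times_x(sq, poly, m)
        # Both accumulators add the same operand a; only the control bit differs.
        if (b >> i) & 1:
            prod ^= a
        if (a >> i) & 1:
            sq ^= a
    return prod, sq
```

For example, with the AES field polynomial `0x11B` and `m = 8`, `mul_and_square(0x57, 0x83, 0x11B, 8)` returns `0xC1` as the product, which matches the worked multiplication example in FIPS 197.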

A 32-bit Microprocessor with enhanced digital signal process functionality

Moon, Sang-ook
- Proceedings of the Korean Institute of Information and Communication Sciences Conference, v.9 no.2, pp.820-822, 2005
We have designed a 32-bit microprocessor with fixed-point digital signal processing functionality. This processor combines general-purpose microprocessor and digital signal processor functionality using reduced instruction set computer (RISC) design principles. It has functional units for arithmetic operations, digital signal processing, and memory access. They operate in parallel to remove stall cycles after DSP or load/store instructions, which usually need one or more issue latency cycles in addition to the first issue cycle. High performance was achieved with these parallel functional units while adopting a sophisticated five-stage pipeline structure.

The Design of 10-bit 200MS/s CMOS Parallel Pipeline A/D Converter

Chung, Kang-Min
- The KIPS Transactions:PartA, v.11A no.2, pp.195-202, 2004
This paper introduces the design of a parallel pipeline high-speed analog-to-digital converter (ADC) for high-resolution video applications that require very precise sampling. The overall architecture of the ADC consists of a 4-channel parallel time-interleaved 10-bit pipeline ADC structure allowing a 200 MSample/s sampling speed, which corresponds to a fourfold improvement over the per-channel sampling speed. Key building blocks are the front-end sample-and-hold amplifier (SHA), the dynamic comparator, and the 2-stage fully differential operational amplifier. The 1-bit DAC, comparator, and gain-2 amplifier are used internally in each stage, and they were integrated into a single switched-capacitor architecture allowing high-speed operation as well as low power consumption. In this work, the gain of the operational amplifier was enhanced significantly using a negative-resistance element. In the ADC, a delay line is designed for each stage using D flip-flops to align the bit signals and minimize the timing error in the conversion. The converter dissipates 280 mW at a 3.3 V power supply. Measured performance includes a DNL of +0.7/-0.6 LSB and an INL of +0.9/-0.3 LSB.
https://doi.org/10.3745/KIPSTA.2004.11A.2.195

The 64-Bit Scrambler Design of the OFDM Modulation for Vehicles Communications Technology

Lee, Dae-Sik
- Journal of Internet Computing and Services, v.14 no.1, pp.15-22, 2013
WAVE (Wireless Access for Vehicular Environment) is a new vehicular communication technology, standardized as IEEE 802.11p, that is used for ITS (Intelligent Transportation Systems) services. It also increases the efficiency and safety of road traffic. However, the scrambler bit computation algorithm of OFDM modulation in WAVE systems is inefficient because it cannot be processed in parallel in either hardware or software. This paper proposes an algorithm to configure a 64-bit matrix table for scrambler bit computation, as well as an algorithm to compute the 64-bit matrix table and input data in parallel. The proposed algorithm is executed using the 64-bit matrix table. As a result, compared with the bit-operation scrambler, the processing speed for 1 and 1000 runs improves by about 40.08% to 40.27%, and the processing rate per second is higher by more than 468.35. Compared with the 32-bit operation scrambler, the processing speed for 1 and 1000 runs improves by about 7.53% to 7.84%, and the processing rate per second is higher by more than 91.44. Therefore, if a 64-bit CPU is used for the 64-bit scramble algorithm, performance improves by more than 40% compared with the 32-bit scrambler.
https://doi.org/10.7472/jksii.2013.14.15
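
The contrast between bit-serial and word-wide scrambling can be sketched as follows. This is not the paper's matrix-table algorithm: it is a simpler stand-in that precomputes the keystream and XORs it with the data 64 bits at a time. It assumes the IEEE 802.11 scrambler generator x^7 + x^4 + 1; the function names and the MSB-first packing of bits into words are assumptions made for illustration.

```python
def scrambler_sequence(seed: int, nbits: int) -> list[int]:
    """Bit-serial keystream of the x^7 + x^4 + 1 scrambler (bit 0 = x1, bit 6 = x7)."""
    state = seed & 0x7F
    out = []
    for _ in range(nbits):
        fb = ((state >> 6) ^ (state >> 3)) & 1  # x7 XOR x4
        out.append(fb)
        state = ((state << 1) | fb) & 0x7F      # shift the feedback bit into x1
    return out


def keystream_words(seed: int, nwords: int) -> list[int]:
    """Pack the keystream into 64-bit words, first keystream bit in the MSB."""
    bits = scrambler_sequence(seed, 64 * nwords)
    words = []
    for w in range(nwords):
        word = 0
        for bit in bits[64 * w: 64 * (w + 1)]:
            word = (word << 1) | bit
        words.append(word)
    return words


def scramble_words(data_words: list[int], seed: int) -> list[int]:
    """Scramble 64 bits per XOR instead of one bit per step."""
    ks = keystream_words(seed, len(data_words))
    return [d ^ k for d, k in zip(data_words, ks)]
```

Because scrambling is an XOR with a fixed keystream, applying `scramble_words` twice with the same seed returns the original data. The keystream repeats every 127 bits for any nonzero seed, so in practice the packed words could be computed once and reused.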

Fault Detection Architecture of the Field Multiplication Using Gaussian Normal Bases in GF(2ⁿ)

Kim, Chang Han;Chang, Nam Su;Park, Young Ho
- Journal of the Korea Institute of Information Security & Cryptology, v.24 no.1, pp.41-50, 2014
In this paper, we propose an error detection scheme for Gaussian normal basis multipliers over $GF(2^n)$. We show that, by using parity prediction, error detection can be constructed very simply in hardware. The hardware overhead is only one AND gate, n+1 XOR gates, and one 1-bit register in serial multipliers, and n AND gates and 2n-1 XOR gates in parallel multipliers. This method detects any odd number of bit faults in C = AB.
https://doi.org/10.13089/JKIISC.2014.24.1.41
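
Why a single parity bit catches exactly the odd-weight faults can be shown with a small sketch. This is not the paper's parity prediction circuit: here the reference parity is simply taken from a known-good value as a stand-in for the predicted parity, and the names `parity` and `fault_detected` are hypothetical.

```python
def parity(x: int) -> int:
    """Return the XOR of all bits of x (0 for even weight, 1 for odd)."""
    return bin(x).count("1") & 1


def fault_detected(predicted_parity: int, observed: int) -> bool:
    """Flag a fault when the observed word disagrees with the predicted parity."""
    return parity(observed) != predicted_parity


# Stand-in for a correct product C = AB; the value itself is arbitrary.
correct = 0b10110010
predicted = parity(correct)

one_fault = correct ^ 0b00000100     # one flipped bit
two_faults = correct ^ 0b00010100    # two flipped bits
three_faults = correct ^ 0b01010100  # three flipped bits
```

Each flipped bit toggles the parity once, so `one_fault` and `three_faults` are flagged while `two_faults` passes undetected, which is the limitation the abstract states.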

Bit Error Rate measurement of an RSFQ switch by using an automatic error counter

Kim Se Hoon;Kim Jin Young;Baek Seung Hun;Jung Ku Rak;Hahn Taek Sang;Kang Joon Hee
- Progress in Superconductivity and Cryogenics, v.7 no.1, pp.21-24, 2005
The problem of fluctuation-induced digital errors in rapid single flux quantum (RSFQ) circuits has been a very important issue. In this experiment, we calculated the error rate of an RSFQ switch in a superconductive ALU. The RSFQ switch should have a very low error rate at the optimal bias. We prepared two circuits placed in parallel. One was 10 Josephson transmission lines (JTLs) connected in series, and the other was the same circuit but with an RSFQ switch placed in the middle of the 10 JTLs. We used a splitter to feed the same input signal to both circuits. The outputs of the two circuits were compared with an RSFQ XOR to measure the error rate of the RSFQ switch. Using a computerized bit error rate test setup, we measured a bit error rate of $2.18\times10^{12}$ when the bias to the RSFQ switch was 0.398 mA, which was quite far from the optimum bias of 0.6 mA.
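
The comparison scheme described above, in which a reference path and a path under test receive the same input and an XOR flags disagreements, can be mirrored in software. This is only a sketch of the counting step; the function name `bit_error_rate` and the example bit streams are hypothetical and do not reproduce the measured data.

```python
def bit_error_rate(reference: list[int], under_test: list[int]) -> float:
    """Count positions where the two bit streams differ and divide by the length."""
    if not reference or len(reference) != len(under_test):
        raise ValueError("streams must be non-empty and of equal length")
    # XOR is 1 exactly where the two outputs disagree, as in the XOR comparator.
    errors = sum(r ^ t for r, t in zip(reference, under_test))
    return errors / len(reference)


# Hypothetical streams: the path under test flips one bit out of eight.
reference_bits = [0, 1, 1, 0, 1, 0, 0, 1]
switch_bits = [0, 1, 1, 0, 0, 0, 0, 1]
ber = bit_error_rate(reference_bits, switch_bits)
```

Measuring very low error rates this way requires a correspondingly large number of compared bits, which is why an automated counter is useful.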

A Memory-Efficient Two-Stage String Matching Engine Using both Content-Addressable Memory and Bit-split String Matchers for Deep Packet Inspection

Kim, HyunJin;Choi, Kang-Il
- The Journal of Korean Institute of Communications and Information Sciences, v.39B no.7, pp.433-439, 2014
This paper proposes an architecture for a two-stage string matching engine with content-addressable memory (CAM) and parallel bit-split string matchers for deep packet inspection (DPI). Each long signature is divided into subpatterns of equal length, which are mapped onto the CAM in the first stage. The long pattern is matched in the second stage using the sequence of matching indexes from the CAM. By adopting CAM and bit-split string matchers, the memory requirements can be greatly reduced in heterogeneous string matching environments.
https://doi.org/10.7840/kics.2014.39B.7.433
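
The two-stage flow can be sketched in software under simplifying assumptions. The CAM is modeled as a dictionary from subpattern to index, every signature length is assumed to be a multiple of the subpattern length, and the second stage compares index sequences directly rather than using bit-split string matchers. The names `build_cam` and `find_signatures` are hypothetical.

```python
def build_cam(signatures: list[bytes], sub_len: int):
    """Split each signature into equal-length subpatterns and assign CAM indexes."""
    cam: dict[bytes, int] = {}
    sequences: dict[bytes, tuple[int, ...]] = {}
    for sig in signatures:
        if len(sig) == 0 or len(sig) % sub_len:
            raise ValueError("signature length must be a positive multiple of sub_len")
        seq = []
        for i in range(0, len(sig), sub_len):
            sub = sig[i:i + sub_len]
            # Reuse the index if this subpattern is already stored.
            seq.append(cam.setdefault(sub, len(cam)))
        sequences[sig] = tuple(seq)
    return cam, sequences


def find_signatures(payload: bytes, cam, sequences, sub_len: int):
    """Return sorted (offset, signature) pairs found in the payload."""
    # Stage 1: look up every sub_len-byte window; None means no CAM hit.
    hits = [cam.get(payload[p:p + sub_len])
            for p in range(len(payload) - sub_len + 1)]
    # Stage 2: a signature matches where its index sequence appears at stride sub_len.
    found = []
    for sig, seq in sequences.items():
        span = len(seq) * sub_len
        for start in range(len(payload) - span + 1):
            if all(hits[start + k * sub_len] == idx for k, idx in enumerate(seq)):
                found.append((start, sig))
    return sorted(found)
```

The second stage works only on small indexes rather than on the raw bytes, and subpatterns shared between signatures are stored once, which hints at where the memory savings come from.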

Design of Bit Manipulation Accelerator for Communication DSP

Jeong Sug H.;Sunwoo Myung H.
- Journal of the Institute of Electronics Engineers of Korea TC, v.42 no.8 s.338, pp.11-16, 2005
This paper proposes a bit manipulation accelerator (BMA) with application-specific instructions, which efficiently supports scrambling, convolutional encoding, puncturing, and interleaving. Conventional DSPs cannot effectively perform bit manipulation functions since they have multiply-accumulate (MAC) oriented data paths and word-based functions. However, the proposed accelerator can efficiently process bit manipulation functions using parallel shift and exclusive-OR (XOR) operations and bit insertion/extraction operations on multiple data. The proposed BMA has been modeled in VHDL and synthesized using the SEC $0.18\,\mu m$ standard cell library, and the gate count of the BMA is only about 1,700 gates. Performance comparisons show that the number of clock cycles can be reduced by about $40\%\sim80\%$ for scrambling, convolutional encoding, and interleaving compared with existing DSPs.
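
Convolutional encoding is a typical bit manipulation task built from shifts and XORs. The sketch below is a plain software encoder, not the accelerator's instruction set. The abstract does not name the code, so the rate-1/2, constraint-length-7 generators 133 and 171 (octal) are an assumption, and `conv_encode` is a hypothetical name.

```python
G0 = 0o133  # assumed generator: taps 1 + D^2 + D^3 + D^5 + D^6
G1 = 0o171  # assumed generator: taps 1 + D + D^2 + D^3 + D^6


def parity(x: int) -> int:
    """XOR of all bits of x."""
    return bin(x).count("1") & 1


def conv_encode(bits: list[int]) -> list[int]:
    """Rate-1/2 encoder: emits two output bits per input bit."""
    reg = 0  # 7-bit window; bit 6 holds the newest input, bit 0 the oldest
    out = []
    for b in bits:
        reg = (reg >> 1) | ((b & 1) << 6)   # shift in the new bit at the top
        out.append(parity(reg & G0))        # XOR of the taps selected by G0
        out.append(parity(reg & G1))        # XOR of the taps selected by G1
    return out
```

Each output bit is the XOR of several selected register bits, which a word-oriented MAC data path handles poorly and a dedicated shift-and-XOR unit handles in a single step.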

High-speed Design of 8-bit Architecture of AES Encryption

Lee, Je-Hoon;Lim, Duk-Gyu
- Convergence Security Journal, v.17 no.2, pp.15-22, 2017
This paper presents a new 8-bit implementation of AES. Most typical 8-bit AES designs reduce the circuit area by sacrificing throughput. The presented AES architecture employs two separate S-boxes to perform the round operation and key generation in parallel. According to the simulation results of the proposed AES-128, the maximum critical path delay is 13.0 ns. It can operate at 77 MHz, and the throughput is 15.2 Mbps. Consequently, the throughput of the proposed AES is 1.54 times higher than that of its counterpart, while the area increase is limited to 1.17 times. The proposed AES design enables a very low-area design without sacrificing performance. It is therefore suitable for various IoT applications that need high-speed communication.

Parallelized Architecture of Serial Finite Field Multipliers for Fast Computation

Cho, Yong-Suk
- Journal of the Korea Institute of Information Security & Cryptology, v.17 no.1, pp.33-39, 2007
Finite field multipliers are the basic building blocks in many applications such as error-control coding, cryptography, and digital signal processing. Hence, the design of efficient dedicated finite field multiplier architectures can lead to dramatic improvements in overall system performance. In this paper, a new bit-serial structure for a low-latency multiplier in a Galois field is presented. To speed up multiplication, we divide the product polynomial into several parts and then process them in parallel. The proposed multiplier operates in the standard basis of $GF(2^m)$ and is faster than bit-serial multipliers but has lower area complexity than bit-parallel ones. The most significant feature of the proposed architecture is that a trade-off between hardware complexity and delay time can be achieved.
https://doi.org/10.13089/JKIISC.2007.17.1.33
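
The general idea of breaking one field multiplication into independent parts can be sketched as follows. This is not the paper's architecture: as a simpler stand-in, the sketch splits the operand b into chunks, forms each partial product separately, and XORs the reduced results. It assumes a standard-basis GF(2^m) with a reduction polynomial that includes the x^m term; all function names are hypothetical.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less (polynomial) multiplication over GF(2), without reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def reduce_poly(value: int, poly: int, m: int) -> int:
    """Reduce value modulo poly, where poly has degree m."""
    for deg in range(value.bit_length() - 1, m - 1, -1):
        if (value >> deg) & 1:
            value ^= poly << (deg - m)
    return value


def mul_split(a: int, b: int, poly: int, m: int, parts: int) -> int:
    """Multiply in GF(2^m) by combining `parts` independent partial products."""
    width = -(-m // parts)  # ceiling division: bits of b handled per part
    mask = (1 << width) - 1
    acc = 0
    for k in range(parts):
        chunk = (b >> (k * width)) & mask
        # Each partial product depends only on a and its own chunk of b,
        # so in hardware the parts could be evaluated side by side.
        partial = clmul(a, chunk) << (k * width)
        acc ^= reduce_poly(partial, poly, m)
    return acc
```

Choosing `parts` between 1 and m moves the design between a fully serial and a fully parallel evaluation, which is the kind of area versus delay trade-off the abstract describes.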

Search Result 406, Processing Time 0.035 seconds
