• Title/Summary/Keyword: Bit-Parallel

Search Result 406, Processing Time 0.033 seconds

Optimized Implementation of Block Cipher PIPO in Parallel-Way on 64-bit ARM Processors (64-bit ARM 프로세서 상에서의 블록암호 PIPO 병렬 최적 구현)

  • Eum, Si Woo;Kwon, Hyeok Dong;Kim, Hyun Jun;Jang, Kyoung Bae;Kim, Hyun Ji;Park, Jae Hoon;Song, Gyeung Ju;Sim, Min Joo;Seo, Hwa Jeong
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.10 no.8
    • /
    • pp.223-230
    • /
    • 2021
  • The lightweight block cipher PIPO announced at ICISC'20 has been effectively implemented by applying the bit slice technique. In this paper, we propose a parallel optimal implementation of PIPO for ARM processors. The proposed implementation enables parallel encryption of 8-plaintexts and 16-plaintexts. The implementation targets the A10x fusion processor. On the target processor, the existing reference PIPO code has performance of 34.6 cpb and 44.7 cpb in 64/128 and 64/256 standards. Among the proposed methods, the general implementation has a performance of 12.0 cpb and 15.6 cpb in the 8-plaintexts 64/128 and 64/256 standards, and 6.3 cpb and 8.1 cpb in the 16-plaintexts 64/128 and 64/256 standards. Compared to the existing reference code implementation, the 8-plaintexts parallel implementation for each standard has about 65.3%, 66.4%, and the 16-plaintexts parallel implementation, about 81.8%, and 82.1% better performance. The register minimum alignment implementation shows performance of 8.2 cpb and 10.2 cpb in the 8-plaintexts 64/128 and 64/256 specifications, and 3.9 cpb and 4.8 cpb in the 16-plaintexts 64/128 and 64/256 specifications. Compared to the existing reference code implementation, the 8-plaintexts parallel implementation has improved performance by about 76.3% and 77.2%, and the 16-plaintext parallel implementation is about 88.7% and 89.3% higher for each standard.

3-Way 32 bit VLIW Multimedia Signal Processor

  • Park, Jaebok;Jaehee You
    • Proceedings of the IEEK Conference
    • /
    • 2001.06b
    • /
    • pp.97-100
    • /
    • 2001
  • A 3-way VLIW multimedia signal processor capable of efficient repeated operations as well as both load/store and type transformations for various data types is presented. It is composed of a 32-bit execution unit that can execute two instructions in parallel, an independent load/store unit and a control unit. The processor is implemented with 0.6${\mu}{\textrm}{m}$ gate array and the results are discussed.

  • PDF

Parallel Writing and Detection for Two Dimensional Magnetic Recording Channel

  • Zhang, Yong;Lee, Jaejin
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.37A no.10
    • /
    • pp.821-826
    • /
    • 2012
  • Two-dimensional magnetic recording (TDMR) is treated as the next generation magnetic recording method, but because of its high channel bit error rate, it is difficult to use in practices. In this paper, we introduce a new writing method that can decrease the nonlinear media error effectively, and it can also achieve 10 Tb/$in^2$ of user bit density on a magnetic recording medium with 20 Teragrains/$in^2$.

Digit-serial $AB^2$ Systolic Architecture in GF$(2^m)$ (GF$(2^m)$상에서 디지트 시리얼 $AB^2$시스톨릭 구조 설계)

  • 김남연;유기영
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2003.10a
    • /
    • pp.415-417
    • /
    • 2003
  • 본 논문에서는 유한 필드 GF(2$^{m}$ ) 상에서 A$B^2$연산을 수행하는 디지트 시리얼(digit-serial) 시스톨릭 구조를 제안하였다. 제안한 구조는 디지트 크기를 적당히 선택했을 때, 비트-패러럴(bit-parallel) 구조에 비해 적은 하드웨어를 사용하고 비트-시리얼(bit-serial) 구조에 비해 빠르다 또한, 제안한 디지트 시리얼 구조에 파이프라인 기법을 적용하면 그렇지 않은 구조에 비해 m=160, L=2 일 때 공간-시간 복잡도가 10.9% 적다.

  • PDF

On a PS-WFSR and a Parallel-Structured Word-Based Stream Cipher (PS-WFSR 및 워드기반 스트림암호의 병렬구조 제안)

  • Sung, SangMin;Lee, HoonJae;Lee, SangGon;Lim, HyoTaek
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2009.10a
    • /
    • pp.383-386
    • /
    • 2009
  • In this paper, we propose some parallel structures of the word-based nonlinear combine functions in word-based stream cipher, high-speed versions of general (bit-based) nonlinear combine functions. Especially, we propose the high-speed structures of popular three kinds in word-based nonlinear combiners using by PS-WFSR (Parallel-Shifting or Parallel-Structured Word-based FSR): m-parallel word-based nonlinear combiner without memory, m-parallel word-based nonlinear combiner with memories, and m-parallel word-based nonlinear filter function. Finally, we analyze its cryptographic security and performance.

  • PDF

R-lambda Model based Rate Control for GOP Parallel Coding in A Real-Time HEVC Software Encoder (HEVC 실시간 소프트웨어 인코더에서 GOP 병렬 부호화를 지원하는 R-lambda 모델 기반의 율 제어 방법)

  • Kim, Dae-Eun;Chang, Yongjun;Kim, Munchurl;Lim, Woong;Kim, Hui Yong;Seok, Jin Wook
    • Journal of Broadcast Engineering
    • /
    • v.22 no.2
    • /
    • pp.193-206
    • /
    • 2017
  • In this paper, we propose a rate control method based on the $R-{\lambda}$ model that supports a parallel encoding structure in GOP levels or IDR period levels for 4K UHD input video in real-time. For this, a slice-level bit allocation method is proposed for parallel encoding instead of sequential encoding. When a rate control algorithm is applied in the GOP level or IDR period level parallelism, the information of how many bits are consumed cannot be shared among the frames belonging to a same frame level except the lowest frame level of the hierarchical B structure. Therefore, it is impossible to manage the bit budget with the existing bit allocation method. In order to solve this problem, we improve the bit allocation procedure of the conventional ones that allocate target bits sequentially according to the encoding order. That is, the proposed bit allocation strategy is to assign the target bits in GOPs first, then to distribute the assigned target bits from the lowest depth level to the highest depth level of the HEVC hierarchical B structure within each GOP. In addition, we proposed a processing method that is used to improve subjective image qualities by allocating the bits according to the coding complexities of the frames. Experimental results show that the proposed bit allocation method works well for frame-level parallel HEVC software encoders and it is confirmed that the performance of our rate controller can be improved with a more elaborate bit allocation strategy by using the preprocessing results.

Study of the Switching Errors in an RSFQ Switch by Using a Computerized Test Setup (자동측정장치를 사용한 RSFQ switch의 Switching error에 관한 연구)

  • Kim, Se-Hoon;Baek, Seung-Hun;Yang, Jung-Kuk;Kim, Jun-Ho;Kang, Joon-Hee
    • Progress in Superconductivity
    • /
    • v.7 no.1
    • /
    • pp.36-40
    • /
    • 2005
  • The problem of fluctuation-induced digital errors in a rapid single flux quantum (RSFQ) circuit has been a very important issue. In this work, we calculated the bit error rate of an RSFQ switch used in superconductive arithmetic logic unit (ALU). RSFQ switch should have a very low error rate in the optimal bias. Theoretical estimates of the RSFQ error rate are on the order of $10^{-50}$ per bit operation. In this experiment, we prepared two identical circuits placed in parallel. Each circuit was composed of 10 Josephson transmission lines (JTLs) connected in series with an RSFQ switch placed in the middle of the 10 JTLs. We used a splitter to feed the same input signal to both circuits. The outputs of the two circuits were compared with an RSFQ exclusive OR (XOR) to measure the bit error rate of the RSFQ switch. By using a computerized bit-error-rate test setup, we measured the bit error rate of $2.18{\times}10^{-12}$ when the bias to the RSFQ switch was 0.398 mA that was quite off from the optimum bias of 0.6 mA.

  • PDF

Analysis of Morton Code Conversion for 32 Bit IEEE 754 Floating Point Variables (IEEE 754 부동 소수점 32비트 float 변수의 Morton Code 변환 분석)

  • Park, Taejung
    • Journal of Digital Contents Society
    • /
    • v.17 no.3
    • /
    • pp.165-172
    • /
    • 2016
  • Morton codes play important roles in many parallel GPU applications for the nearest neighbor (NN) search in huge data and queries with its applications growing. This paper discusses and analyzes the meaning of Tero Karras's 32-bit 'unsigned int' Morton code algorithm for three-dimensional spatial information in $[0,1]^3$ and its geometric implications. Based on this, this paper proposes 64-bit 'unsigned long long' version of Morton code and compares the results in both CPU vs. GPU and 32-bit vs. 64-bit versions. The proposed GPU algorithm runs around 1000 times faster than the CPU version.

Design of Pipeline Analog-to-Digital Converter Using a Parallel S/H (병렬 S/H를 이용한 파이프라인 ADC설계)

  • 이승우;이해길;나유찬;신홍규
    • Proceedings of the IEEK Conference
    • /
    • 2003.07b
    • /
    • pp.1229-1232
    • /
    • 2003
  • In this paper, The High-speed Low-power Analog-to-Digital Convener Archecture is proposed using the parallel S/H for High-speed operation. This technique can significantly reduce the sampling frequency per S/H channel. The Analog-to-Digital Converter is designed using 0.35${\mu}{\textrm}{m}$ CMOS technology. The simulation result show that the proposed Analog-to-Digital Converter can be operated at 40Ms/s with 8-bit resolution and INL/DNL errors are +0.4LSB~-0.6LSB / +0.9LSB~-1.4LSB , respectively.

  • PDF

The Parallel High Speed CODEC Design of (88.64)SBEC-DBED code ((88,64)SBEC-DBED부호의 고속병렬 CODEC설계)

  • Woo, Hyeong-Cheol;Kim, Jae-Moon;Rhee, Man-Young
    • Proceedings of the KIEE Conference
    • /
    • 1988.07a
    • /
    • pp.176-178
    • /
    • 1988
  • In this paper, techniques of constructing parity check matrix of SBEC-DBED codes will be presented to improve reliability of muliti-bit-per-chip type memory systems. And the high speed parallel CODEC of (88.64)SBEC-DBED code which is applicable to real system will be designed.

  • PDF