• Title/Summary/Keyword: Operation Processor

Search Result 616, Processing Time 0.504 seconds

Design of an Efficient MAC Unit for RSA Cryptoprocessors (RSA 암호화 프로세서에 적용 가능한 효율적인 누적곱셈 연산기 설계)

  • Moon, Sang-Gook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.1
    • /
    • pp.65-70
    • /
    • 2008
  • RSA crypto-processors equipped with more than 1024 bits of key space handle the entire key stream in units of blocks. The RSA processor which will be the target design in this paper defines the length of the basic word as 128 bits, and uses an 256-bits register as the accumulator. For efficient execution of 128-bit multiplication, 32b${\times}$32b multiplier was designed and adopted and the results are stored in 8 separate 128-bit registers according to the status flag. In this paper, an efficient method to execute 128-bit MAC (multiplication and accumulation) operation is proposed. The suggested method pre-analyze the all possible cases so that the MAC unit can remove unnecessary calculations to speed up the execution. The proposed architecture prototype of the MAC unit was automatically synthesized, and successfully operated at 20MHz, which will be the operation frequency in the target RSA processor.

A Design of the High-Speed Cipher VLSI Using IDEA Algorithm (IDEA 알고리즘을 이용한 고속 암호 VLSI 설계)

  • 이행우;최광진
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.11 no.1
    • /
    • pp.64-72
    • /
    • 2001
  • This paper is on a design of the high-speed cipher IC using IDEA algorithm. The chip is consists of six functional blocks. The principal blocks are encryption and decryption key generator, input data circuit, encryption processor, output data circuit, operation mode controller. In subkey generator, the design goal is rather decrease of its area than increase of its computation speed. On the other hand, the design of encryption processor is focused on rather increase of its computation speed than decrease of its area. Therefore, the pipeline architecture for repeated processing and the modular multiplier for improving computation speed are adopted. Specially, there are used the carry select adder and modified Booth algorithm to increase its computation speed at modular multiplier. To input the data by 8-bit, 16-bit, 32-bit according to the operation mode, it is designed so that buffer shifts by 8-bit, 16-bit, 32-bit. As a result of simulation by 0.25 $\mu\textrm{m}$ process, this IC has achieved the throughput of 1Gbps in addition to its small area, and used 12,000gates in implementing the algorithm.

FPGA Implementation and Performance Analysis of High Speed Architecture for RC4 Stream Cipher Algorithm (RC4 스트림 암호 알고리즘을 위한 고속 연산 구조의 FPGA 구현 및 성능 분석)

  • 최병윤;이종형;조현숙
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.14 no.4
    • /
    • pp.123-134
    • /
    • 2004
  • In this paper a high speed architecture of the RC4 stream cipher is proposed and its FPGA implementation is presented. Compared to the conventional RC4 designs which have long initialization operation or use double or triple S-arrays to reduce latency delay due to S-array initialization phase, the proposed architecture for RC4 stream cipher eliminates the S-array initialization operation using 256-bit valid entry scheme and supports 40/128-bit key lengths with efficient modular arithmetic hardware. The proposed RC4 stream cipher is implemented using Xilinx XCV1000E-6H240C FPGA device. The designed RC4 stream cipher has about a throughput of 106 Mbits/sec at 40 MHz clock and thus can be applicable to WEP processor and RC4 key search processor.

Low-Power Discrete-Event SoC for 3DTV Active Shutter Glasses (3DTV 엑티브 셔터 안경을 위한 저전력 이산-사건 SoC)

  • Park, Dae-Jin;Kwak, Sung-Ho;Kim, Chang-Min;Kim, Tag-Gon
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.48 no.6
    • /
    • pp.18-26
    • /
    • 2011
  • Debates concerning the competitive edge of leading 3DTV technology of the shutter glasses (SG) 3D and the film-type patterned retarder (FPR) are flaring up. Although SG technology enables Full-HD 3D vision, it requires complex systems including the sync transmitter (emitter), the sync processor chip, and the LCD lens in the active shutter glasses. In addition, the transferred sync-signal is easily affected by the external noise and a 3DTV viewer may feel flicker-effect caused by cross-talk of the left and right image. The operating current of the sync processor in the 3DTV active shutter glasses is gradually increasing to compensate the sync reconstruction error. The proposed chip is a low-power hardware sync processor based discrete-event SoC(system on a chip) designed specifically for the 3DTV active shutter glasses. This processor implements the newly designed power-saving techniques targeted for low-power operation in a noisy environment between 3DTV and the active shutter glasses. This design includes a hardware pre-processor based on a universal edge tracer and provides a perfect sync reconstruction based on a floating-point timer to advance the prior commercial 3DTV shutter glasses in terms of their power consumption. These two techniques enable an accurate sync reconstruction in the slow clock frequency of the synchronization timer and reduce the power consumption to less than about a maximum of 20% compared with other major commercial processors. This article describes the system's architecture and the details of the proposed techniques, also identifying the key concepts and functions.

Adaptive Pipeline Architecture for an Asynchronous Embedded Processor (비동기식 임베디드 프로세서를 위한 적응형 파이프라인 구조)

  • Lee, Seung-Sook;Lee, Je-Hoon;Lim, Young-Il;Cho, Kyoung-Rok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.1
    • /
    • pp.51-58
    • /
    • 2007
  • This paper presented an adaptive pipeline architecture for a high-performance and low-power asynchronous processor. The proposed pipeline architecture employed a stage-skipping and a stage-combining scheme. The stage-skipping scheme can skip the operation of a bubble stage that is not used pipeline stage in an instruction execution. In the stage-combining scheme, two consecutive stages can be joined to form one stage if the latter stage is empty. The proposed pipeline architecture could reduce the processing time and power consumption. The proposed architecture supports multi-processing in the EX stage that executes parallel 4 instructions. We designed an asynchronous microprocessor to estimate the efficiency of the proposed pipeline architecture that was synthesized to a gate level design using a $0.35-{\mu}m$ CMOS standard cell library. We evaluated the performance of the target processor using SPEC2000 benchmark programs. The proposed architecture showed about 2.3 times higher speed than the asynchronous counterpart, AMULET3i. As a result, the proposed pipeline schemes and architecture can be used for asynchronous high-speed processor design

Experiments on An Network Processor-based Intrusion Detection (네트워크 프로세서 기반의 침입탐지 시스템 구현)

  • Kim, Hyeong-Ju;Kim, Ik-Kyun;Park, Dae-Chul
    • The KIPS Transactions:PartC
    • /
    • v.11C no.3
    • /
    • pp.319-326
    • /
    • 2004
  • To help network intrusion detection systems(NIDSs) keep up with the demands of today's networks, that we the increasing network throughput and amount of attacks, a radical new approach in hardware and software system architecture is required. In this paper, we propose a Network Processor(NP) based In-Line mode NIDS that supports the packet payload inspection detecting the malicious behaviors, as well as the packet filtering and the traffic metering. In particular, we separate the filtering and metering functions from the deep packet inspection function using two-level searching scheme, thus the complicated and time-consuming operation of the deep packet inspection function does not hinder or flop the basic operations of the In-line mode system. From a proto-type NP-based NIDS implemented at a PC platform with an x86 processor running Linux, two Gigabit Ethernet ports, and 2.5Gbps Agere PayloadPlus(APP) NP solution, the experiment results show that our proposed scheme can reliably filter and meter the full traffic of two gigabit ports at the first level even though it can inspect the packet payload up to 320 Mbps in real-time at the second level, which can be compared to the performance of general-purpose processor based Inspection. However, the simulation results show that the deep packet searching is also possible up to 2Gbps in wire speed when we adopt 10Gbps APP solution.

A Design of AES-based WiBro Security Processor (AES 기반 와이브로 보안 프로세서 설계)

  • Kim, Jong-Hwan;Shin, Kyung-Wook
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.7 s.361
    • /
    • pp.71-80
    • /
    • 2007
  • This paper describes an efficient hardware design of WiBro security processor (WBSec) supporting for the security sub-layer of WiBro wireless internet system. The WBSec processor, which is based on AES (Advanced Encryption Standard) block cipher algorithm, performs data oncryption/decryption, authentication/integrity, and key encryption/decryption for packet data protection of wireless network. It carries out the modes of ECB, CTR, CBC, CCM and key wrap/unwrap with two AES cores working in parallel. In order to achieve an area-efficient implementation, two design techniques are considered; First, round transformation block within AES core is designed using a shared structure for encryption/decryption. Secondly, SubByte/InvSubByte blocks that require the largest hardware in AES core are implemented using field transformation technique. It results that the gate count of WBSec is reduced by about 25% compared with conventional LUT (Look-Up Table)-based design. The WBSec processor designed in Verilog-HDL has about 22,350 gates, and the estimated throughput is about 16-Mbps at key wrap mode and maximum 213-Mbps at CCM mode, thus it can be used for hardware design of WiBro security system.

Real-Time Implementation of the Relative Position Estimation Algorithm Using the Aerial Image Sequence (항공영상에서 상대 위치 추정 알고리듬의 실시간 구현)

  • Park, Jae-Hong;Kim, Gwan-Seok;Kim, In-Cheol;Park, Rae-Hong;Lee, Sang-Uk
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.39 no.3
    • /
    • pp.66-77
    • /
    • 2002
  • This paper deals with an implementation of the navigation parameter extraction technique using the TMS320C80 multimedia video processor (MVP). Especially, this Paper focuses on the relative position estimation algorithm which plays an important role in real-time operation of the overall system. Based on the relative position estimation algorithm using the images obtained at two locations, we develop a fast algorithm that can reduce large amount of computation time and fit into fixed-point processors. Then, the algorithm is reconfigured for parallel processing using the 4 parallel processors in the MVP. As a result, we shall demonstrate that the navigation parameter extraction system employing the MVP can operate at full-frame rate, satisfying real-time requirement of the overall system.

Accelerometer Signal Processing for a Helicopter Active Vibration Control System (헬리콥터 능동진동제어시스템 가속도 신호 처리)

  • Kim, Do-Hyung
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.45 no.10
    • /
    • pp.863-871
    • /
    • 2017
  • LMS (least mean square) algorithm widely used in the AVCS (active vibration control system) of helicopters calculates control input using the forward path transfer function and error signal. If the error signal is sinusoidal, it can be represented as the combination of cosine and sine functions with frequency and phase synchronized with the reference signal. The control input also has the same frequency, therefore control algorithm can be simply implemented if the cosine and the sine amplitudes of the control input are calculated and the frequency and phase of the reference signal are used. Calculation of the control input is implemented as simple matrix operation and the change of the control command is slower than the frequency of the error signal, consequently control algorithm can be operated at lower frequency. The signal processing algorithm extracting cosine and sine components of the error signals are modeled using Simulink and PIL (processor-in-the-loop) mode simulation was executed for real-time performance evaluation.

Efficient SAD Processor for Motion Estimation of H.264 (H.264 움직임 추정을 위한 효율적인 SAD 프로세서)

  • Jang, Young-Beom;Oh, Se-Man;Kim, Bee-Chul;Yoo, Hyeon-Joong
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.44 no.2 s.314
    • /
    • pp.74-81
    • /
    • 2007
  • In this paper, an efficient SAD(Sum of Absolute Differences) processor structure for motion estimation of H.264 is proposed. SAD processors are commonly used both in full search methods for motion estimation and in fast search methods for motion estimation. Proposed structure consists of SAD calculator block, combinator block, and minimum value calculator block. Especially, proposed structure is simplified by using Distributed Arithmetic for addition operation. The Verilog-HDL(Hard Description Language) coding and FPGA(Field Programmable Gate Array) implementation results for the proposed structure show 39% and 32% gate count reduction in comparison with those of the conventional structure, respectively. Due to its efficient processing scheme, the proposed SAD processor structure can be widely used in size dominant H.264 chip.