• Title/Summary/Keyword: Bit-Parallel

Search Result 406, Processing Time 0.03 seconds

Design of High Speed Pipelined ADC for System-on-Panel Applications (System-on-Panel 응용을 위한 고속 Pipelined ADC 설계)

  • Hong, Moon-Pyo;Jeong, Ju-Young
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.2
    • /
    • pp.1-8
    • /
    • 2009
  • We designed an ADC that operated upto 500Msamples/sec based on proposed R-string folding block as well as second folding block. The upper four bits are processed in parallel by the R-string folding block while the lower four bits are processed in pipeline structured second folding block to supply digital output. To verify the circuit performance, we conducted HSPICE simulation and the average power consumption was only 1.34mW even when the circuit was running at its maximum sampling frequency. We further measured noise immunity by applying linear ramp signal to the input. The DNL was between -0.56*LSB and 0.49*LSB and the INL was between -0.93*LSB and 0.72*LSB. We used 0.35 microns MOSIS device parameters for this work.

A Cryptographic Processor Supporting ARIA/AES-based GCM Authenticated Encryption (ARIA/AES 기반 GCM 인증암호를 지원하는 암호 프로세서)

  • Sung, Byung-Yoon;Kim, Ki-Bbeum;Shin, Kyung-Wook
    • Journal of IKEEE
    • /
    • v.22 no.2
    • /
    • pp.233-241
    • /
    • 2018
  • This paper describes a lightweight implementation of a cryptographic processor supporting GCM (Galois/Counter Mode) authenticated encryption (AE) that is based on the two block cipher algorithms of ARIA and AES. It also provides five modes of operation (ECB, CBC, OFB, CFB, CTR) for confidentiality as well as the key lengths of 128-bit and 256-bit. The ARIA and AES are integrated into a single hardware structure, which is based on their algorithm characteristics, and a $128{\times}12-b$ partially parallel GF (Galois field) multiplier is adopted to efficiently perform concurrent processing of CTR encryption and GHASH operation to achieve overall performance optimization. The hardware operation of the ARIA/AES-GCM AE processor was verified by FPGA implementation, and it occupied 60,800 gate equivalents (GEs) with a 180 nm CMOS cell library. The estimated throughput with the maximum clock frequency of 95 MHz are 1,105 Mbps and 810 Mbps in AES mode, 935 Mbps and 715 Mbps in ARIA mode, and 138~184 Mbps in GCM AE mode according to the key length.

Design of Asynchronous 16-Bit Divider Using NST Algorithm (NST알고리즘을 이용한 비동기식 16비트 제산기 설계)

  • 이우석;박석재;최호용
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.40 no.3
    • /
    • pp.33-42
    • /
    • 2003
  • This paper describes an efficient design of an asynchronous 16-bit divider using the NST (new Svoboda-Tung) algorithm. The divider is designed to reduce power consumption by using the asynchronous design scheme in which the division operation is performed only when it is requested. The divider consists of three blocks, i.e. pre-scale block, iteration step block, and on-the-fly converter block using asynchronous pipeline structure. The pre-scale block is designed using a new subtracter to have small area and high performance. The iteration step block consists of an asynchronous ring structure with 4 division steps for area reduction. In other to reduce hardware overhead, the part related to critical path is designed by a dual-rail circuit, and the other part is done by a single-rail circuit in the ring structure. The on-the-fly converter block is designed for high performance using the on-the-fly algorithm that enables parallel operation with iteration step block. The design results with 0.6${\mu}{\textrm}{m}$ CMOS process show that the divider consists of 12,956 transistors with 1,480 $\times$1,200${\mu}{\textrm}{m}$$^2$area and average-case delay is 41.7㎱.

A Efficient Architecture of MBA-based Parallel MAC for High-Speed Digital Signal Processing (고속 디지털 신호처리를 위한 MBA기반 병렬 MAC의 효율적인 구조)

  • 서영호;김동욱
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.41 no.7
    • /
    • pp.53-61
    • /
    • 2004
  • In this paper, we proposed a new architecture of MAC(Multiplier-Accumulator) to operate high-speed multiplication-accumulation. We used the MBA(Modified radix-4 Booth Algorithm) which is based on the 1's complement number system, and CSA(Carry Save Adder) for addition of the partial products. During the addition of the partial product, the signed numbers with the 1's complement type after Booth encoding are converted in the 2's complement signed number in the CSA tree. Since 2-bit CLA(Carry Look-ahead Adder) was used in adding the lower bits of the partial product, the input bit width of the final adder and whole delay of the critical path were reduced. The proposed MAC was applied into the DWT(Discrete Wavelet Transform) filtering operation for JPEG2000, and it showed the possibility for the practical application. Finally we identified the improved performance according to the comparison with the previous architecture in the aspect of hardware resource and delay.

A Study on the Design of a RISC core with DSP Support (DSP기능을 강화한 RISC 프로세서 core의 ASIC 설계 연구)

  • 김문경;정우경;이용석;이광엽
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.26 no.11C
    • /
    • pp.148-156
    • /
    • 2001
  • This paper proposed embedded application-specific microprocessor(YS-RDSP) whose structure has an additional DSP processor on chip. The YS-RDSP can execute maximum four instructions in parallel. To make program size shorter, 16-bit and 32-bit instruction lengths are supported in YS-RDSP. The YS-RDSP provides programmability. controllability, DSP processing ability, and includes eight-kilobyte on-chip ROM and eight-kilobyte RAM. System controller on the chip gives three power-down modes for low-power operation, and SLEEP instruction changes operation statue of CPU core and peripherals. YS-RDSP processor was implemented with Verilog HDL on top-down methodology, and it was improved and verified by cycle-based simulator written in C-language. The verified model was synthesized with 0.7um, 3.3V CMOS standard cell library, and the layout size was 10.7mm78.4mm which was implemented by using automatic P&R software.

  • PDF

Implementation of a G,723.1 Annex A Using a High Performance DSP (고성능 DSP를 이용한 G.723.1 Annex A 구현)

  • 최용수;강태익
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.7
    • /
    • pp.648-655
    • /
    • 2002
  • This paper describes implementation of a multi-channel G.723.1 Annex A (G.723.1A) focused on code optimization using a high performance general purpose Digital Signal Processor (DSP), To implement a multi-channel G.723.1A functional complexities of the ITU-T G.723.1A fixed-point C-code are measures an analyzed. Then we sort and optimize C functions in complexity order. In parallel with optimization, we verify the bit-exactness of the optimized code using the ITU-T test vectors. Using only internal memory, the optimized code can perform full-duplex 17 channel processing. In addition, we further increase the number of available channels per DSP into 22 using fast codebook search algorithms, referred to as bit -compatible optimization.

An R-lambda Model based Rate Control Scheme to Support Parallel GOP Coding for Real-Time HEVC Software Encoders (HEVC 실시간 소프트웨어 인코더를 위한 GOP 병렬 부호화를 지원하는 R-lambda 모델 기반의 율 제어 방법)

  • Kim, Dae Eun;Chang, Yongjun;Kim, Munchurl;Lim, Woong;Kim, Huiyong;Seok, Jinwuk
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2016.11a
    • /
    • pp.107-109
    • /
    • 2016
  • 본 논문에서는 4K UHD 입력 영상을 실시간으로 부호화하기 위해 적용되는 GOP 단위 또는 IDR 주기 단위의 병렬 부호화 구조를 지원하도록 R-${\lambda}$ 모델 기반의 율 제어 방법을 개선하는 비트 분배(bit allocation) 방법을 제안한다. GOP 단위 또는 IDR 주기 단위의 병렬 부호화기 내에서 율 제어기를 작동시키는 경우, 계층적 B 구조에서 같은 계층에 있는 프레임 간에는 상호간에 얼마만큼의 비트를 소모 하였는지에 대한 정보를 공유 할 수 없기 때문에 기존의 비트 분배 방식으로는 비트 예산(bit budget) 관리가 불가능하다. 이를 해결하기 위해 본 논문에서는, 기존의 R-${\lambda}$ 모델 기반 율 제어 방법을 개선하여 부호화 순서에 의한 시간 순서 방향의 비트 예산 갱신 기반 비트 분배하던 방식으로부터, GOP 마다 비트를 할당한 후 계층적 B 구조에서의 계층이 깊어지는 방향으로 비트 예산을 갱신하여 비트를 분배하는 방식으로 율 배분 방식을 개선하였다. 실험 결과를 통해 R-${\lambda}$ 모델 기반 율 제어의 기존 비트 분배 방식보다 제안 방법에 의한 목표 비트 율 달성 오차가 감소함을 확인하였다.

  • PDF

Implementation of a 3D Graphics Hardwired T&L Accelerator based on a SoC Platform for a Mobile System (SoC 플랫폼 기반 모바일용 3차원 그래픽 Hardwired T&L Accelerator 구현)

  • Lee, Kwang-Yeob;Koo, Yong-Seo
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.9
    • /
    • pp.59-70
    • /
    • 2007
  • In this paper, we proposed an effective T&L(Transform & Lighting) Processor architecture for a real time 3D graphics acceleration SoC(System on a Chip) in a mobile system. We designed Floating point arithmetic IPs for a T&L processor. And we verified IPs using a SoC Platform. Designed T&L Processor consists of 24 bit floating point data format and 16 bit fixed point data format, and supports the pipeline keeping the balance between Transform process and Lighting process using a parallel computation of 3D graphics. The delay of pipeline processing only Transform operation is almost same as the delay processing both Transform operation and Lighting operation. Designed T&L Processor is implemented and verified using a SoC Platform. The T&L Processor operates at 80MHz frequency in Xilinx-Virtex4 FPGA. The processing speed is measured at the rate of 20M Vertexes/sec.

Implementation of Adaptive Multi Rate (AMR) Vocoder for the Asynchronous IMT-2000 Mobile ASIC (IMT-2000 비동기식 단말기용 ASIC을 위한 적응형 다중 비트율 (AMR) 보코더의 구현)

  • 변경진;최민석;한민수;김경수
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.1
    • /
    • pp.56-61
    • /
    • 2001
  • This paper presents the real-time implementation of an AMR (Adaptive Multi Rate) vocoder which is included in the asynchronous International Mobile Telecommunication (IMT)-2000 mobile ASIC. The implemented AMR vocoder is a multi-rate coder with 8 modes operating at bit rates from 12.2kbps down to 4.75kbps. Not only the encoder and the decoder as basic functions of the vocoder are implemented, but VAD (Voice Activity Detection), SCR (Source Controlled Rate) operation and frame structuring blocks for the system interface are also implemented in this vocoder. The DSP for AMR vocoder implementation is a 16bit fixed-point DSP which is based on the TeakLite core and consists of memory block, serial interface block, register files for the parallel interface with CPU, and interrupt control logic. Through the implementation, we reduce the maximum operating complexity to 24MIPS by efficiently managing the memory structure. The AMR vocoder is verified throughout all the test vectors provided by 3GPP, and stable operation in the real-time testing board is also proved.

  • PDF

An Efficient Hardware Implementation of Lightweight Block Cipher LEA-128/192/256 for IoT Security Applications (IoT 보안 응용을 위한 경량 블록암호 LEA-128/192/256의 효율적인 하드웨어 구현)

  • Sung, Mi-Ji;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.19 no.7
    • /
    • pp.1608-1616
    • /
    • 2015
  • This paper describes an efficient hardware implementation of lightweight encryption algorithm LEA-128/192/256 which supports for three master key lengths of 128/192/256-bit. To achieve area-efficient and low-power implementation of LEA crypto- processor, the key scheduler block is optimized to share hardware resources for encryption/decryption key scheduling of three master key lengths. In addition, a parallel register structure and novel operating scheme for key scheduler is devised to reduce clock cycles required for key scheduling, which results in an increase of encryption/decryption speed by 20~30%. The designed LEA crypto-processor has been verified by FPGA implementation. The estimated performances according to master key lengths of 128/192/256-bit are 181/162/109 Mbps, respectively, at 113 MHz clock frequency.