• Title/Summary/Keyword: Operation Processor

Search Result 614, Processing Time 0.028 seconds

Design of Low-Power and Low-Complexity MIMO-OFDM Baseband Processor for High Speed WLAN Systems (고속 무선 LAN 시스템을 위한 저전력/저면적 MIMO-OFDM 기저대역 프로세서 설계)

  • Im, Jun-Ha;Cho, Mi-Suk;Jung, Yun-Ho;Kim, Jae-Seok
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.33 no.11C
    • /
    • pp.940-948
    • /
    • 2008
  • This paper presents a low-power, low-complexity design and implementation results of a high speed multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) wireless LAN (WLAN) baseband processor. The proposed processor is composed of the physical layer convergence procedure (PLCP) processor and physical medium dependent (PMD) processor, which have been optimized to have low-power and reduced-complexity architecture. It was designed in a hardware description language (HDL) and synthesized to gate-level circuits using 0.18um CMOS standard cell library. As a result, the proposed TX-PLCP processor reduced the power consumption by as much as 81% over the bit-level operation architecture. Also, the proposed MIMO symbol detector reduced the hardware complexity by 18% over the conventional SQRD-based architecture with division circuits and square root operations.

Novel IME Instructions and their Hardware Architecture for Fast Search Algorithm (고속 탐색 알고리즘에 적합한 움직임 추정 전용 명령어 및 구조 설계)

  • Bang, Ho-Il;SunWoo, Myung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.48 no.12
    • /
    • pp.58-65
    • /
    • 2011
  • This paper presents an ASIP (Application-specific Instruction Processor) for motion estimation that employs specific IME instructions and its programmable and reconfigurable hardware architecture for various video codecs, such as H.264/AVC, MPEG4, etc. With the proposed specific instructions and variable point 2D SAD hardware accelerator, it can handle the real-time processing requirement of High Definition (HD) video. With the SAD unit and its parallel operations using pattern information, the proposed IME instructions support not only full search algorithms but also other fast search algorithms. The hardware size is 25.5K gates for each Processing Element Group (PEG) which has 128 SAD Processor Elements (PEs). The proposed ASIP has been verified by the Synopsys Processor Designer and implemented by the Design Compiler using the IBM 90nm process technology. The hardware size is 453K gates for the IME unit and the operating frequency is 188MHz for 1080p@30 frame in real time. The proposed ASIP can reduce the hardware size about 26% and the number of operation cycles about 18%.

Implementation of a High Performance SEED Processor for Smart Card Applications (스마트카드용 고성능 SEED 프로세서의 구현)

  • 최홍묵;최명렬
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.14 no.5
    • /
    • pp.37-47
    • /
    • 2004
  • The security of personal informations has been an important issue since the field of smart card applications has been expanded explosively. The security of smart card is based on cryptographic algorithms, which are highly required to be implemented into hardware for higher speed and stronger security. In this paper, a SEED cryptographic processor is designed by employing one round key generation block which generates 16 round keys without key registers and one round function block which is used iteratively. Both the round key generation block and the F function are using only one G function block with one 5${\times}$l MUX sequentially instead of 5 G function blocks. The proposed SEED processor has been implemented such that each round operation is divided into seven sub-rounds and each sub-round is executed per clock. Functional simulation of the proposed cryptographic processor has been executed using the test vectors which are offered by Korea Information Security Agency. In addition, we have evaluated the proposed SEED processor by executing VHDL synthesis and FPGA board test. The die area of the proposed SEED processor decreases up to approximately 40% compared with the conventional processor.

Design and Evaluation of 32-Bit RISC-V Processor Using FPGA (FPGA를 이용한 32-Bit RISC-V 프로세서 설계 및 평가)

  • Jang, Sungyeong;Park, Sangwoo;Kwon, Guyun;Suh, Taeweon
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.1
    • /
    • pp.1-8
    • /
    • 2022
  • RISC-V is an open-source instruction set architecture which has a simple base structure and can be extensible depending on the purpose. In this paper, we designed a small and low-power 32-bit RISC-V processor to establish the base for research on RISC-V embedded systems. We designed a 2-stage pipelined processor which supports RISC-V base integer instruction set except for FENCE and EBREAK instructions. The processor also supports privileged ISA for trap handling. It used 1895 LUTs and 1195 flip-flops, and consumed 0.001W on Xilinx Zynq-7000 FPGA when synthesized using Vivado Design Suite. GPIO, UART, and timer peripherals are additionally used to compose the system. We verified the operation of the processor on FPGA with FreeRTOS at 16MHz. We used Dhrystone and Coremark benchmarks to measure the performance of the processor. This study aims to provide a low-power, high-efficiency microprocessor for future extension.

SIMD MAC Unit Design for Multimedia Data Processing (멀티미디어 데이터 처리에 적합한 SIMD MAC 연산기의 설계)

  • Hong, In-Pyo;Jeong, Woo-Kyong;Jeong Jae-Won;Lee Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.38 no.12
    • /
    • pp.44-55
    • /
    • 2001
  • MAC(Multiply and ACcumulate) is the core operation of multimedia data processing. Because MAC units implemented on traditional DSP units or embedded processors have latency of three cycles and cannot operate on multiple data simultaneously, then, performances are seriously limited. Many high end general purpose microprocessors have SIMD MAC unit as a functional unit. But these high end MAC units must support pipeline structure for various operation modes and high clock frequency, which makes control logic complex and increases chip area. In this paper, a 64bit SIMD MAC unit for embedded processors is designed. It is implemented to have a latency of one clock cycle to remove pipeline control logics and a minimal area overhead for SIMD support is added to existing Booth multipliers.

  • PDF

An ASIP Design for Deblocking Filter of H.264/AVC (H.264/AVC 표준의 디블록킹 필터를 가속하기 위한 ASIP 설계)

  • Lee, Hyoung-Pyo;Lee, Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.45 no.3
    • /
    • pp.142-148
    • /
    • 2008
  • Though a deblocking filter of H.264/AVC provides enhanced image quality by removing blocking artifact on block boundary, the complex filtering operation on this process is a dominant factor of the whole decoding time. In this paper, we designed an ASIP to accelerate deblocking filter operation with the proposed instruction set. We designed a processor based on a MIPS structure with LISA, simulated a deblocking later model, and compared the execution time on the proposed instruction set. In addition, we generated HDL model of the processor through CoWare's Processor Designer and synthesized with TSMC 0.25um CMOS cell library by Synopsys Design Compiler. As the result of the synthesis, the area and delay time increased 7.5% and 3.2%, respectively. However, due to the proposed instruction set, total execution performance is improved by 18.18% on average.

An Efficient Bit Stream Instruction-set for Network Packet Processing Applications (네트워크 패킷 처리를 위한 효율적인 비트 스트림 명령어 세트)

  • Yoon, Yeo-Phil;Lee, Yong-Surk;Lee, Jung-Hee
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.10
    • /
    • pp.53-58
    • /
    • 2008
  • This paper proposes a new set of instructions to improve the packet processing capacity of a network processor. The proposed set of instructions is able to achieve more efficient packet processing by accelerating integration of packet headers. Furthermore, a hardware configuration dedicated to processing overlay instructions was designed to reduce additional hardware cost. For this purpose, the basic architecture for the network processor was designed using LISA and the overlay block was optimized based on the barrel shifter. The block was synthesized to compare the area and the operation delay, and allocated to a C-level macro function using the compiler known function (CKF). The improvement in performance was confirmed by comparing the execution cycle and the execution time of an application program. Experiments were conducted using the processor designer and the compiler designer from Coware. The result of synthesis with the TSMC ($0.25{\mu}m$) from Synopsys indicated a reduction in operation delay by 20.7% and an improvement in performance of 30.8% with the proposed set of instructions for the entire execution cycle.

Design and Implementation of the Digital Neuron Processor for the real time object recognition in the making Automatic system (생산자동화 시스템에서 실시간 물체인식을 위한 디지털 뉴런프로세서의 설계 및 구현)

  • Hong, Bong-Wha;Joo, Hae-Jong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.12 no.3
    • /
    • pp.37-50
    • /
    • 2007
  • In this paper, we designed and implementation of the high speed neuron processor for real time object recognition in the making automatic system. and we designed of the PE(Processing Element) used residue number system without carry propagation for the high speed operation. Consisting of MAC(Multiplication and Accumulation) operator using residue number system and sigmoid function operator unit using MAC(Mixed Radix conversion) is designed. The designed circuits are descript by C language and VHDL(Very High Speed Integrated Circuit Hardware Description Language) and synthesized by compass tools and finally, the designed processor is fabricated in $0.8{\mu}m$ CMOS process. we designed of MAC operation unit and sigmoid proceeding unit are proved that it could run time 0.6nsec on the simulation and improved to the speed of the three times and decreased to hardware size about 50%, each order. The designed neuron processor can be implemented of the object recognition in making automatic system with desired real time processing.

  • PDF

Four-valued Hybrid FFT processor design using current mode CMOS (전류 모드 CMOS를 이용한 4치 Hybrid FFT 연산기 설계)

  • 서명웅;송홍복
    • Journal of the Korea Computer Industry Society
    • /
    • v.3 no.1
    • /
    • pp.57-66
    • /
    • 2002
  • In this study, Multi-Values Logic processor was designed using the basic circuit of the electric current mode CMOS. First of all, binary FFT(Fast Fourier Transform) was extended and high-speed Multi-Valued Logic processor was constructed using a multi-valued logic circuit. Compared with the existing two-valued FFT, the FFT operation can reduce the number of transistors significantly and show the simplicity of the circuit. Moreover, for the construction of amount was used inside the FFT circuit with the set of redundant numbers like [0,1,2,3]. As a result, the defects in lines were reduced and it turned out to be effective in the aspect of normality an regularity when it was used designing VLSI(Very Large Scale Integration). To multiply FFT, the time and size of the operation was used as LUT(Look Up Table) Finally, for the compatibility with the binary system, multiple-valued hybrid-type FFT processor was proposed and designed using binary-four valued encoder, four-binary valued decoder, and the electric current mode CMOS circuit.

  • PDF

A Design of PRESENT Crypto-Processor Supporting ECB/CBC/OFB/CTR Modes of Operation and Key Lengths of 80/128-bit (ECB/CBC/OFB/CTR 운영모드와 80/128-비트 키 길이를 지원하는 PRESENT 암호 프로세서 설계)

  • Kim, Ki-Bbeum;Cho, Wook-Lae;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.6
    • /
    • pp.1163-1170
    • /
    • 2016
  • A hardware implementation of ultra-lightweight block cipher algorithm PRESENT which was specified as a standard for lightweight cryptography ISO/IEC 29192-2 is described. The PRESENT crypto-processor supports two key lengths of 80 and 128 bits, as well as four modes of operation including ECB, CBC, OFB, and CTR. The PRESENT crypto-processor has on-the-fly key scheduler with master key register, and it can process consecutive blocks of plaintext/ciphertext without reloading master key. In order to achieve a lightweight implementation, the key scheduler was optimized to share circuits for key lengths of 80 bits and 128 bits. The round block was designed with a data-path of 64 bits, so that one round transformation for encryption/decryption is processed in a clock cycle. The PRESENT crypto-processor was verified using Virtex5 FPGA device. The crypto-processor that was synthesized using a $0.18{\mu}m$ CMOS cell library has 8,100 gate equivalents(GE), and the estimated throughput is about 908 Mbps with a maximum operating clock frequency of 454 MHz.