• Title/Summary/Keyword: configurable processor

Search Result 11, Processing Time 0.026 seconds

Application Specific Processor Design for H.264 Decoder with a Configurable Embedded Processor

  • Han, Jin-Ho;Lee, Mi-Young;Bae, Young-Hwan;Cho, Han-Jin
    • ETRI Journal
    • /
    • v.27 no.5
    • /
    • pp.491-496
    • /
    • 2005
  • An application specific processor for an H.264 decoder with a configurable embedded processor is designed in this research. The motion compensation, inverse integer transform, inverse quantization, and entropy decoding algorithm of H.264 decoder software are optimized. We improved the performance of the processor with instruction-level hardware optimization, which is tailored to configurable embedded processor architecture. The optimized instructions for video processing can be used in other video compression standards such as MPEG 1, 2, and 4. A significant performance improvement is achieved with high flexibility. Experimental results show that we could achieve 300% performance for the H.264 baseline profile level 2 decoder.

  • PDF

Embedded System for Video Coding with Logic-Enhanced DRAM and Configurable Processor

  • Kaya, Toshiyuki;Miyamoto, Ryusuke;Onoye, Takao;Shirakawa, Isao
    • Proceedings of the IEEK Conference
    • /
    • 2002.07a
    • /
    • pp.216-219
    • /
    • 2002
  • A novel approach of embedded systems for video coding is introduced with the main theme focused on logic-enhanced DRAM and configurable processor. This approach is aiming at reducing high computational costs and frequent memory accessing, which embedded systems are suffering with in the execution of video coding. According Co the software execution analysis, large size functions with intensive memory accesses are tuned to be executed by the logic-enhanced DRAM while small size functions repeatedly called are to be executed by dedicated instructions, which are newly introduced in the configurable processor. The proposed system can speed up H.263 video coding algorithm 7.4 times in comparison with the conventional embedded processor based system.

  • PDF

Design of Embedded Processor Architecture Applicable to Mobile Multimedia (Mobile Multimedia 지원을 위한 Embedded Processor 구조 설계)

  • 이호석;한진호;배영환;조한진
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.41 no.5
    • /
    • pp.71-80
    • /
    • 2004
  • This paper describes embedded processor architecture design which is applicable to multimedia in mobile platform The main description is based on basic processor architecture and consideration about energy efficiency when used in mobile platform To design processor data path architecture (pipeline, branch prediction, multiple issue superscalar, function unit number) which is optimal to multimedia application and cache hierarchy and its structure, we have nut the simulation with variant architecture using MPEG4 test bench as multimedia application. We analyzed energy efficiency of architecture to check if it is applicable to mobile platform and decide basic processor architecture based on analysis result. The suggested basic processor architecture not only can be applied to mobile platform but also can be applied to basic processor architecture of configurable processor which is designed through automatic design environment.

A Design of Instruction-Set Based Simulator of Processor for Embedded Application System (내장형 제어용 프로세서를 위한 명령어 기반 범용 시뮬레이터 개발)

  • 양훈모;정종철;김도집;이문기
    • Proceedings of the IEEK Conference
    • /
    • 2001.06b
    • /
    • pp.357-360
    • /
    • 2001
  • As SOC design methodology becomes popular, processors, the essential core in embedded system are required to be designed fast and supported to customers with expansive behavior description. This paper presents new methodology to meet such goals with designer configurable instruction set simulator for processors. This paper proposes new language called PML(Processor Modeling Language), which is based on microprogramming scheme and is also successful in most behavior of processors. By using this, we can describe scalar processor very efficiently with by-far faster simulation speed in compared with HDL model.

  • PDF

Performance Evaluation of Pipeline Genetic Algorithm Processor (Pipeline 유전자 알고리즘 프로세서(GAP)의)

  • 김태훈;이동욱;이홍기;심귀보
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2002.12a
    • /
    • pp.379-382
    • /
    • 2002
  • GA(Genetic Algorithm)는 자연계 진화를 모방한 계산 알고리즘으로서 단순하고 응용이 쉽기 때문에 여러 분야에 사용되고 있다. 하지만 GA의 단점은 일반적인 소프트웨어로 동작시켰을 때는 실행속도가 느리다는 것이다. 특히 chromosome이 길 경우 연속적인 교차, 돌연변이를 수행해야한다. GA Processor(GAP)는 GA를 수행하기위한 전용 Processor로서 GA의 동작을 빨리 수행할 수 있게 한다. 본 논문에서는 pipeline 구조의 GAP를 설계하여 GA를 수행함에 있어 소프트웨어와 하드웨어의 성능을 비교한다.

EmXJ : A Framework of Configurable XML Processor for Flexible Embedding (EmXJ : 유연한 임베딩을 위한 XML 처리기 구성 프레임워크)

  • Chung, Won-Ho;Kang, Mi-Yeon
    • The KIPS Transactions:PartA
    • /
    • v.9A no.4
    • /
    • pp.467-478
    • /
    • 2002
  • With the rapid development of wired or wireless Internet, various kinds of resource constrained mobile devices, such as cellular phone, PDA, homepad, smart phone, handhold PC, and so on, have been emerging into personal or commercial usages. Most software to be embedded into those devices has been forced to have the characteristic of flexibility rather than the fixedness which was an inherent property of embedded system. It means that recent technologies require the flexible embedding into the variety of resource constrained mobile devices. A document processor for XML which has been positioned as a standard mark-up language for information representation on the Web, is one of the essential software to be embedded into those devices for browsing the information. In this paper, a framework for configurable XML processor called EmXJ is designed and implemented for flexible embedding into various types of resource constrained mobile devices, and its advantages are compared to conventional XML processors.

Design of an Optimal RSA Crypto-processor for Embedded Systems (내장형 시스템을 위한 최적화된 RSA 암호화 프로세서 설계)

  • 허석원;김문경;이용석
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.4A
    • /
    • pp.447-457
    • /
    • 2004
  • This paper proposes a RSA crypto-processor for embedded systems. The architecture of the RSA crypto-processor should be used relying on Big Montgomery algorithm, and is supported by configurable bit size. The RSA crypto-processor includes a RSA control signal generator, an optimal Big Montgomery processor(adder, multiplier). We use diverse arithmetic unit (adder, multiplier) algorithm. After we compared the various results, we selected the optimal arithmetic unit which can be connected with ARM core-processor. The RSA crypto-processor was implemented with Verilog HDL with top-down methodology, and it was verified by C language and Cadence Verilog-XL. The verified models were synthesized with a Hynix 0.25${\mu}{\textrm}{m}$, CMOS standard cell library while using Synopsys Design Compiler. The RSA crypto-processor can operate at a clock speed of 51 MHz in this worst case conditions of 2.7V, 10$0^{\circ}C$ and has about 36,639 gates.

A Configurable Software-based Approach for Detecting CFEs Caused by Transient Faults

  • Liu, Wei;Ci, LinLin;Liu, LiPing
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.5
    • /
    • pp.1829-1846
    • /
    • 2021
  • Transient faults occur in computation units of a processor, which can cause control flow errors (CFEs) and compromise system reliability. The software-based methods perform illegal control flow detection by inserting redundant instructions and monitoring signature. However, the existing methods not only have drawbacks in terms of performance overhead, but also lack of configurability. We propose a configurable approach CCFCA for detecting CFEs. The configurability of CCFCA is implemented by analyzing the criticality of each region and tuning the detecting granularity. For critical regions, program blocks are divided according to space-time overhead and reliability constraints, so that protection intensity can be configured flexibly. For other regions, signature detection algorithms are only used in the first basic block and last basic block. This helps to improve the fault-tolerant efficiency of the CCFCA. At the same time, CCFCA also has the function of solving confusion and instruction self-detection. Our experimental results show that CCFCA incurs only 10.61% performance overhead on average for several C benchmark program and the average undetected error rate is only 9.29%. CCFCA has high error coverage and low overhead compared with similar algorithms. This helps to meet different cost requirements and reliability requirements.

Design of Programmable and Configurable Elliptic Curve Cryptosystem Coprocessor (재구성 가능한 타원 곡선 암호화 프로세서 설계)

  • Lee Jee-Myong;Lee Chanho;Kwon Woo-Suk
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.6 s.336
    • /
    • pp.67-74
    • /
    • 2005
  • Crypto-systems have difficulties in designing hardware due to the various standards. We propose a programmable and configurable architecture for cryptography coprocessors to accommodate various crypto-systems. The proposed architecture has a 32 bit I/O interface and internal bus width, and consists of a programmable finite field arithmetic unit, an input/output unit, a register file, and a control unit. The crypto-system is determined by the micro-codes in memory of the control unit, and is configured by programming the micro-codes. The coprocessor has a modular structure so that the arithmetic unit can be replaced if a substitute has an appropriate 32 bit I/O interface. It can be used in many crypto-systems by re-programming the micro-codes for corresponding crypto-system or by replacing operation units. We implement an elliptic curve crypto-processor using the proposed architecture and compare it with other crypto-processors

A 3-D Vision Sensor Implementation on Multiple DSPs TMS320C31 (다중 TMS320C31 DSP를 사용한 3-D 비젼센서 Implementation)

  • Oksenhendler, V.;Bensrhair, Abdelaziz;Miche, Pierre;Lee, Sang-Goog
    • Journal of Sensor Science and Technology
    • /
    • v.7 no.2
    • /
    • pp.124-130
    • /
    • 1998
  • High-speed 3D vision systems are essential for autonomous robot or vehicle control applications. In our study, a stereo vision process has been developed. It consists of three steps : extraction of edges in right and left images, matching corresponding edges and calculation of the 3D map. This process is implemented in a VME 150/40 Imaging Technology vision system. It is a modular system composed by a display, an acquisition, a four Mbytes image frame memory, and three computational cards. Programmable accelerator computational modules are running at 40 MHz and are based on TMS320C31 DSP with a $64{\times}32$ bit instruction cache and two $1024{\times}32$ bit internal RAMs. Each is equipped with 512 Kbytes static RAM, 4 Mbytes image memory, 1 Mbytes flash EEPROM and a serial port. Data transfers and communications between modules are provided by three 8 bit global video bus, and three local configurable pipeline 8 bit video bus. The VME bus is dedicated to system management. Tasks between DSPs are distributed as follows: two DSPs are used to edges detection, one for the right image and the other for the left one. The last processor computes the matching process and the 3D calculation. With $512{\times}512$ pixels images, this sensor generates dense 3D maps at a rate of about 1 Hz depending of the scene complexity. Results can surely be improved by using a special suited multiprocessors cards.

  • PDF