• Title/Summary/Keyword: Processor Core

Search Result 396, Processing Time 0.024 seconds

Design of Delayed Triple-Core Lock-Step Processor with Memory Rollback for Automotive Applications (메모리 롤백 기능을 가진 차량 어플리케이션용 삼중 코어 지연 락스텝 프로세서 설계)

  • Seonghyun, Yang;Ji-Woong, Choi;Seongsoo, Lee
    • Journal of IKEEE
    • /
    • v.26 no.4
    • /
    • pp.628-632
    • /
    • 2022
  • In this paper, a triple-core delayed lock-step processor is proposed for automotive applications. It performs same operations in three different cores independently, and votes their results to get final values. Therefore its operations are safe even if errors occur in one core. Its three cores operate in a delayed manner to prevent simultaneous errors in multiple cores due to radiative ray or electromagnetic wave. When an error occurs in main core connected to the memory, wrong values can be stored in the memory, so the proposed processor performs memory rollback to restore correct values. Simulation results show that the proposed processor successfully compensates various errors.

New Hypervisor Improving Network Performance for Multi-core CE Devices

  • Hong, Cheol-Ho;Park, Miri;Yoo, Seehwan;Yoo, Chuck
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.6 no.4
    • /
    • pp.231-241
    • /
    • 2011
  • Recently, system virtualization has been applied to consumer electronics (CE) such as smart mobile phones. Although multi-core processors have become a viable solution for complex applications of consumer electronics, the issue of utilizing multi-core resources in the virtualization layer has not been researched sufficiently. In this paper, we present a new hypervisor design and implementation for multi-core CE devices. We concretely describe virtualization methods for a multi-core processor and multi-core-related subsystems. We also analyze bottlenecks of network performance in a virtualization environment that supports multimedia applications and propose an efficient virtual interrupt distributor. Our new multi-core hypervisor improves network performance by 5.5 times as compared to a hypervisor without the virtual interrupt distributor.

A Study On Statistical Simulation for Asymmetric Multi-Core Processor Architectures (비대칭적 멀티코어 프로세서의 통계적 모의실험에 관한 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.16 no.2
    • /
    • pp.157-163
    • /
    • 2016
  • If trace-driven or execution-driven simulation is used for the performance analysis of asymmetric multi-core processors, excessive time and much disk space are necessary. In this paper, statistical simulations are performed for asymmetric multi-core processors with various hardware configurations. For the experiment, SPEC 2000 benchmark programs are used for profiling and synthesis, which is supplied as input for the simulation of asymmetric multi-core processors. As a result, the performance of asymmetric multi-core processor obtained by statistical simulation is comparable to that of the trace-driven simulation with a tremendous reduction in the simulation time.

Cost-Aware Scheduling of Computation-Intensive Tasks on Multi-Core Server

  • Ding, Youwei;Liu, Liang;Hu, Kongfa;Dai, Caiyan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.11
    • /
    • pp.5465-5480
    • /
    • 2018
  • Energy-efficient task scheduling on multi-core server is a fundamental issue in green cloud computing. Multi-core processors are widely used in mobile devices, personal computers, and servers. Existing energy efficient task scheduling methods chiefly focus on reducing the energy consumption of the processor itself, and assume that the cores of the processor are controlled independently. However, the cores of some processors in the market are divided into several voltage islands, in each of which the cores must operate on the same status, and the cost of the server includes not only energy cost of the processor but also the energy of other components of the server and the cost of user waiting time. In this paper, we propose a cost-aware scheduling algorithm ICAS for computation intensive tasks on multi-core server. Tasks are first allocated to cores, and optimal frequency of each core is computed, and the frequency of each voltage island is finally determined. The experiments' results show the cost of ICAS is much lower than the existing method.

Implementation of SIMD-based Many-Core Processor for Efficient Image Data Processing (효율적인 영상데이터 처리를 위한 SIMD기반 매니코어 프로세서 구현)

  • Choi, Byong-Kook;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.1
    • /
    • pp.1-9
    • /
    • 2011
  • Recently, as mobile multimedia devices are used more and more, the needs for high-performance and low-energy multimedia processors are increasing. Application-specific integrated circuits (ASIC) can meet the needed high performance for mobile multimedia, but they provide limited, if any, generality needed for various application requirements. DSP based systems can used for various types of applications due to their generality, but they require higher cost and energy consumption as well as less performance than ASICs. To solve this problem, this paper proposes a single instruction multiple data (SIMD) based many-core processor which supports high-performance and low-power image data processing while keeping generality. The proposed SIMD based many-core processor composed of 16 processing elements (PEs) exploits large data parallelism inherent in image data processing. Experimental results indicate that the proposed SIMD-based many-core processor higher performance (22 times better), energy efficiency (7 times better), and area efficiency (3 times better) than conversional commercial high-performance processors.

Design of a Delayed Dual-Core Lock-Step Processor with Automatic Recovery in Soft Errors (소프트 에러 발생 시 자동 복구하는 이중 코어 지연 락스텝 프로세서의 설계)

  • Juho Kim;Seonghyun Yang;Seongsoo Lee
    • Journal of IKEEE
    • /
    • v.27 no.4
    • /
    • pp.683-686
    • /
    • 2023
  • In this paper, we designed a Delayed Dual Core Lock-Step (D-DCLS) processor where two cores operate same instructions with delay and the result is compared to mitigate soft errors and common mode failures in automotive electronic systems. Because D-DCLS does not know which core an error occurred in, each core must be recovered to the point before the error occurred, but complex hardware modifications are required to return all intermediate values on the pipeline stage. In this paper, in order for easy hardware implementation, all register values are saved to a buffer whenever a branch instruction is executed. When an error is detected, the saved register values are automatically restored, and then 'BX LR' instruction is executed to return to the last branch point. The proposed D-DCLS processor was designed using Verilog HDL and was confirmed to continue normal operation after automatically recovering error.

AB9: A neural processor for inference acceleration

  • Cho, Yong Cheol Peter;Chung, Jaehoon;Yang, Jeongmin;Lyuh, Chun-Gi;Kim, HyunMi;Kim, Chan;Ham, Je-seok;Choi, Minseok;Shin, Kyoungseon;Han, Jinho;Kwon, Youngsu
    • ETRI Journal
    • /
    • v.42 no.4
    • /
    • pp.491-504
    • /
    • 2020
  • We present AB9, a neural processor for inference acceleration. AB9 consists of a systolic tensor core (STC) neural network accelerator designed to accelerate artificial intelligence applications by exploiting the data reuse and parallelism characteristics inherent in neural networks while providing fast access to large on-chip memory. Complementing the hardware is an intuitive and user-friendly development environment that includes a simulator and an implementation flow that provides a high degree of programmability with a short development time. Along with a 40-TFLOP STC that includes 32k arithmetic units and over 36 MB of on-chip SRAM, our baseline implementation of AB9 consists of a 1-GHz quad-core setup with other various industry-standard peripheral intellectual properties. The acceleration performance and power efficiency were evaluated using YOLOv2, and the results show that AB9 has superior performance and power efficiency to that of a general-purpose graphics processing unit implementation. AB9 has been taped out in the TSMC 28-nm process with a chip size of 17 × 23 ㎟. Delivery is expected later this year.

A Performance Study of Asymmetric Embedded Multi-Core Processors (비대칭적 임베디드 멀티코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.16 no.1
    • /
    • pp.233-238
    • /
    • 2016
  • Recently, the multi-core processor architecture is widely adopted in the embedded processors for enhancing its performance. Multi-core processors are classified either as symmetric or asymmetric. Asymmetric multicore processors are known to score higher performance and more efficient than symmetric multi-core processors. In order to study the performance enhancement of asymmetric multi-core embedded processors over the symmetric ones, the trace-driven simulation has been executed for various asymmetric embedded dual-core, quad-core, octa-core and hexadeca-core processors and compared with the symmetric ones of similar hardware budget using MiBench benchmarks as input.

Implementation of an Optimal Many-core Processor for Beamforming Algorithm of Mobile Ultrasound Image Signals (모바일 초음파 영상신호의 빔포밍 기법을 위한 최적의 매니코어 프로세서 구현)

  • Choi, Byong-Kook;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.8
    • /
    • pp.119-128
    • /
    • 2011
  • This paper introduces design space exploration of many-core processors that meet high performance and low power required by the beamforming algorithm of image signals of mobile ultrasound. For the design space exploration of the many-core processor, we mapped different number of ultrasound image data to each processing element of many-core, and then determined an optimal many-core processor architecture in terms of execution time, energy efficiency and area efficiency. Experimental results indicate that PE=4096 and 1024 provide the highest energy efficiency and area efficiency, respectively. In addition, PE=4096 achieves 46x and 10x better than TI DSP C6416, which is widely used for ultrasound image devices, in terms of energy efficiency and area efficiency, respectively.

An Implementation of ECC(Elliptic Curve Cryptographic)Processor with Bus-splitting method for Embedded SoC(System on a Chip) (임베디드 SoC를 위한 Bus-splitting 기법 적용 ECC 보안 프로세서의 구현)

  • Choi, Seon-Jun;Chang, Woo-Youg;Kim, Young-Chul
    • Proceedings of the IEEK Conference
    • /
    • 2005.11a
    • /
    • pp.651-654
    • /
    • 2005
  • In this paper, we designed ECC(Elliptic Curve Cryptographic) Processor with Bus-splitting mothod for embedded SoC. ECC SIP is designed by VHDL RTL modeling, and implemented reusably through the procedure of logic synthesis, simulation and FPGA verification. To communicate with ARM9 core and SIP, we designed SIP bus functional model according to AMBA AHB specification. The design of ECC Processor for platform-based SoC is implemented using the design kit which is composed of many devices such as ARM9 RISC core, memory, UART, interrupt controller, FPGA and so on. We performed software design on the ARM9 core for SIP and peripherals control, memory address mapping and so on.

  • PDF