• Title/Summary/Keyword: Processor Core

Search Result 396, Processing Time 0.022 seconds

A design of 16-bit adiabatic Microprocessor core

  • Youngjoon Shin;Lee, Hanseung;Yong Moon;Lee, Chanho
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.3 no.4
    • /
    • pp.194-198
    • /
    • 2003
  • A 16-bit adiabatic low-power Micro-processor core is designed. The processor consists of control block, multi-port register file and ALU. A simplified four-phase clock generator is designed to provide supply clocks for adiabatic processor. All the clock line charge on the capacitive interconnections is recovered to recycle the energy. Adiabatic circuits are designed based on ECRL(efficient charge recovery logic) and $0.35\mu\textrm$ CMOS technology is used. Simulation results show that the power consumption of the adiabatic Microprocessor core is reduced by a factor of 2.9~3.1 compared to that of conventional CMOS Microprocessor

New Thermal-Aware Voltage Island Formation for 3D Many-Core Processors

  • Hong, Hyejeong;Lim, Jaeil;Lim, Hyunyul;Kang, Sungho
    • ETRI Journal
    • /
    • v.37 no.1
    • /
    • pp.118-127
    • /
    • 2015
  • The power consumption of 3D many-core processors can be reduced, and the power delivery of such processors can be improved by introducing voltage island (VI) design using on-chip voltage regulators. With the dramatic growth in the number of cores that are integrated in a processor, however, it is infeasible to adopt per-core VI design. We propose a 3D many-core processor architecture that consists of multiple voltage clusters, where each has a set of cores that share an on-chip voltage regulator. Based on the architecture, the steady state temperature is analyzed so that the thermal characteristic of each voltage cluster is known. In the voltage scaling and task scheduling stages, the thermal characteristics and communication between cores is considered. The consideration of the thermal characteristics enables the proposed VI formation to reduce the total energy consumption, peak temperature, and temperature gradients in 3D many-core processors.

Variable latency L1 data cache architecture design in multi-core processor under process variation

  • Kong, Joonho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.20 no.9
    • /
    • pp.1-10
    • /
    • 2015
  • In this paper, we propose a new variable latency L1 data cache architecture for multi-core processors. Our proposed architecture extends the traditional variable latency cache to be geared toward the multi-core processors. We added a specialized data structure for recording the latency of the L1 data cache. Depending on the added latency to the L1 data cache, the value stored to the data structure is determined. It also tracks the remaining cycles of the L1 data cache which notifies data arrival to the reservation station in the core. As in the variable latency cache of the single-core architecture, our proposed architecture flexibly extends the cache access cycles considering process variation. The proposed cache architecture can reduce yield losses incurred by L1 cache access time failures to nearly 0%. Moreover, we quantitatively evaluate performance, power, energy consumption, power-delay product, and energy-delay product when increasing the number of cache access cycles.

An Optimization Tool for Determining Processor Affinity of Networking Processes (통신 프로세스의 프로세서 친화도 결정을 위한 최적화 도구)

  • Cho, Joong-Yeon;Jin, Hyun-Wook
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.2
    • /
    • pp.131-136
    • /
    • 2013
  • Multi-core processors can improve parallelism of application processes and thus can enhance the system throughput. Researchers have recently revealed that the processor affinity is an important factor to determine network I/O performance due to architectural characteristics of multi-core processors; thus, many researchers are trying to suggest a scheme to decide an optimal processor affinity. Existing schemes to dynamically decide the processor affinity are able to transparently adapt for system changes, such as modifications of application and upgrades of hardware, but these have limited access to characteristics of application behavior and run-time information that can be collected heuristically. Thus, these can provide only sub-optimal processor affinity. In this paper, we define meaningful system variables for determining optimal processor affinity and suggest a tool to gather such information. We show that the implemented tool can overcome limitations of existing schemes and can improve network bandwidth.

Mileage-based Asymmetric Multi-core Scheduling for Mobile Devices (모바일 디바이스를 위한 마일리지 기반 비대칭 멀티코어 스케줄링)

  • Lee, Se Won;Lee, Byoung-Hoon;Lim, Sung-Hwa
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.26 no.5
    • /
    • pp.11-19
    • /
    • 2021
  • In this paper, we proposed an asymmetric multi-core processor scheduling scheme which is based on the mileage of each core. We considered a big-LITTLE multi-core processor structure, which consists of low power consuming LITTLE cores with general performance and high power consuming big cores with high performance. If a task needs to be processed, the processor decides a core type (big or LITTLE) to handle the task, and then investigate the core with the shortest mileage among unoccupied cores. Then assigns the task to the core. We developed a mileage-based balancing algorithm for asymmetric multi-core assignment and showed that the proposed scheduling scheme is more cost-effective compared to the traditional scheme from a management perspective. Simulation is also conducted for the purpose of performance evaluation of our proposed algorithm.

Enhancing the Performance of Multiple Parallel Applications using Heterogeneous Memory on the Intel's Next-Generation Many-core Processor (인텔 차세대 매니코어 프로세서에서의 다중 병렬 프로그램 성능 향상기법 연구)

  • Rho, Seungwoo;Kim, Seoyoung;Nam, Dukyun;Park, Geunchul;Kim, Jik-Soo
    • Journal of KIISE
    • /
    • v.44 no.9
    • /
    • pp.878-886
    • /
    • 2017
  • This paper discusses performance bottlenecks that may occur when executing high-performance computing MPI applications in the Intel's next generation many-core processor called Knights Landing(KNL), as well as effective resource allocation techniques to solve this problem. KNL is composed of a host processor to enable self-booting in addition to an existing accelerator consisting of a many-core processor, and it was released with a new type of on-package memory with improved bandwidth on top of existing DDR4 based memory. We empirically verified an improvement of the execution performance of multiple MPI applications and the overall system utilization ratio by studying a resource allocation method optimized for such new many-core processor architectures.

Dynamic Scheduling of Network Processes for Multi-Core Systems (멀티 코어 시스템에서 통신 프로세스의 동적 스케줄링)

  • Jang, Hye-Churn;Jin, Hyun-Wook;Kim, Hag-Young
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.12
    • /
    • pp.968-972
    • /
    • 2009
  • The multi-core processors are being widely exploited by many high-end systems. With significant advances in processor architecture, the network band-width required on the high-end systems is increasing drastically. It is therefore highly desirable to manage multiple cores efficiently to achieve high network band-width with minimum resource requirements. Modern operating systems, however, still have significant design and optimization space to leverage the network performance over multi-core systems. In this paper, we suggest a novel networking process scheduling scheme, which decides the best processor affinity of networking processes based on the processor cache layout, communication intensiveness, and processor loads. The experimental results show that the scheduling scheme implemented in the Linux kernel can improve the network bandwidth and the effectiveness of processor utilization by 20% and 59%, respectively.

Design of DCT/IDCT Core Processor using Module Generator Technique (모듈생성 기법을 이용한 DCT/IDCT 코어 프로세서의 설계)

  • 황준하;한택돈
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.10
    • /
    • pp.1433-1443
    • /
    • 1993
  • DCT(Discrete Cosine Transform) / IDCT(Inverse DCT) is widely used in various image compression and decompression systems as well as in DSP(Digital Signal Processing) applications. Since DCT/ IDCT is one of the most complicated part of the compression system, the performance of the system can be greatly enchanced by improving the speed of DCT/IDCT operation. In this thesis, we designed a DCT/IDCT core processor using module generator technique. By utilizing the partial sum and DA(Distributed Arithmetic) techniques, the DCT/ IDCT core processor is designed within small area. It is also designed to perform the IDCT(Inverse DCT) operation with little additional circuitry. The pipeline structure of the core processor enables the high performance, and the high accuracy of the DCT/IDCT operation is obtained by having fewer rounding stages. The proposed design is independent of design rules, and the number of the input bits and the accuracy of the internal calculation coa be easily adjusted due to the module generator technique. The accuracy of the processor satisfies the specifications in CCITT recommendation H, 261.

  • PDF

Construction of an Automatic Generation System of Embedded Processor Cores (임베디드 프로세서 코어 자동생성 시스템의 구축)

  • Cho Jae-Bum;You Yong-Ho;Hwang Sun-Young
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.6A
    • /
    • pp.526-534
    • /
    • 2005
  • This paper presents the structure and function of the system which automatically generates embedded processor cores using the SMDL. Accepting processor description in the SDML, the proposed system generates the processor core, consisting of the pipelined datapath and memory modules together with their control unit. The generated cores support muti-cycle instructions for proper handling of memory accesses, and resolve pipeline hazards encountered in the pipelined processors. Experimental results show the functional accuracy of the generated cores.

3-way SuperScalar Decoder Design for ARMv7 Core (ARMv7 Core를 위한 3-way SuperScalar Decoder 설계)

  • Kim, Hyo-Won;Kim, In-Soo;Baek, Chul-Ki;Min, Hyoung-Bok
    • Proceedings of the KIEE Conference
    • /
    • 2008.10c
    • /
    • pp.246-247
    • /
    • 2008
  • Further evolutions of technologies and needs of users will make mobile equipments improved. To make this happen, processor's good performance is essential. Hence, This paper propose a reform of Instruction Execute and Instruction Decode of contemporary ARMv7 which needs low-power and has the high performance for a faster processor. The first chapter explains why the performance of a processor has to be upgraded, the second chapter shows current technologies. The third chapter explains about the proposal and illustrates the structure. Finally, in the forth chapter, the conclusion will be made. 3-way Superscalar, that is proposed in this paper, will make designing a faster processor possible. And it will contribute for the advanced performance of mobile equipments.

  • PDF