• Title/Summary/Keyword: Multi-core Processors

Search Result 84, Processing Time 0.023 seconds

A Performance Study of Asymmetric Embedded Multi-Core Processors (비대칭적 임베디드 멀티코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.16 no.1
    • /
    • pp.233-238
    • /
    • 2016
  • Recently, the multi-core processor architecture is widely adopted in the embedded processors for enhancing its performance. Multi-core processors are classified either as symmetric or asymmetric. Asymmetric multicore processors are known to score higher performance and more efficient than symmetric multi-core processors. In order to study the performance enhancement of asymmetric multi-core embedded processors over the symmetric ones, the trace-driven simulation has been executed for various asymmetric embedded dual-core, quad-core, octa-core and hexadeca-core processors and compared with the symmetric ones of similar hardware budget using MiBench benchmarks as input.

A Performance Study of Asymmetric Multi-core Digital Signal Processor Architectures (비대칭적 멀티코어 디지털 신호처리 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.15 no.5
    • /
    • pp.219-224
    • /
    • 2015
  • Recently, the multi-core processor architecture is widely used in the digital signal processors for enhancing its performance. Multi-core processors are classified either as symmetric or asymmetric. Asymmetric multi-core processors are known to have higher performance and more efficient than symmetric multi-core processors. In order to study the performance enhancement of asymmetric multi-core digital signal processors over the symmetric ones, the trace-driven simulation has been executed for various asymmetric quad-core, octa-core and hexadeca-core digital signal processors and compared with the symmetric ones of similar hardware budget using UTDSP benchmarks as input.

Quantifying Architectural Impact of Liquid Cooling for 3D Multi-Core Processors

  • Jang, Hyung-Beom;Yoon, Ik-Roh;Kim, Cheol-Hong;Shin, Seung-Won;Chung, Sung-Woo
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.12 no.3
    • /
    • pp.297-312
    • /
    • 2012
  • For future multi-core processors, 3D integration is regarded as one of the most promising techniques since it improves performance and reduces power consumption by decreasing global wire length. However, 3D integration causes serious thermal problems since the closer proximity of heat generating dies makes existing thermal hotspots more severe. Conventional air cooling schemes are not enough for 3D multi-core processors due to the limit of the heat dissipation capability. Without more efficient cooling methods such as liquid cooling, the performance of 3D multi-core processors should be degraded by dynamic thermal management. In this paper, we examine the architectural impact of cooling methods on the 3D multi-core processor to find potential benefits of liquid cooling. We first investigate the thermal behavior and compare the performance of two different cooling schemes. We also evaluate the leakage power consumption and lifetime reliability depending on the temperature in the 3D multi-core processor.

Bounding Worst-Case Performance for Multi-Core Processors with Shared L2 Instruction Caches

  • Yan, Jun;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • v.5 no.1
    • /
    • pp.1-18
    • /
    • 2011
  • As the first step toward real-time multi-core computing, this paper presents a novel approach to bounding the worst-case performance for threads running on multi-core processors with shared L2 instruction caches. The idea of our approach is to compute the worst-case instruction access interferences between different threads based on the program control flow information of each thread, which can be statically analyzed. Our experiments indicate that the proposed approach can reasonably estimate the worst-case shared L2 instruction cache misses by considering the inter-thread instruction conflicts. Also, the worst-case execution time (WCET) of applications running on multi-core processors estimated by our approach is much better than the estimation by simply assuming all L2 instruction accesses are misses.

Analysis of Performance, Energy-efficiency and Temperature for 3D Multi-core Processors according to Floorplan Methods (플로어플랜 기법에 따른 3차원 멀티코어 프로세서의 성능, 전력효율성, 온도 분석)

  • Choi, Hong-Jun;Son, Dong-Oh;Kim, Jong-Myon;Kim, Cheol-Hong
    • The KIPS Transactions:PartA
    • /
    • v.17A no.6
    • /
    • pp.265-274
    • /
    • 2010
  • As the process technology scales down and integration densities continue to increase, interconnection has become one of the most important factors in performance of recent multi-core processors. Recently, to reduce the delay due to interconnection, 3D architecture has been adopted in designing multi-core processors. In 3D multi-core processors, multiple cores are stacked vertically and each core on different layers are connected by direct vertical TSVs(through-silicon vias). Compared to 2D multi-core architecture, 3D multi-core architecture reduces wire length significantly, leading to decreased interconnection delay and lower power consumption. Despite the benefits mentioned above, 3D design technique cannot be practical without proper solutions for hotspots due to high temperature. In this paper, we propose three floorplan schemes for reducing the peak temperature in 3D multi-core processors. According to our simulation results, the proposed floorplan schemes are expected to mitigate the thermal problems of 3D multi-core processors efficiently, resulting in improved reliability. Moreover, processor performance improves by reducing the performance degradation due to DTM techniques. Power consumption also can be reduced by decreased temperature and reduced execution time.

Thermal Analysis of 3D Multi-core Processors with Dynamic Frequency Scaling (동적 주파수 조절 기법을 적용한 3D 구조 멀티코어 프로세서의 온도 분석)

  • Zeng, Min;Park, Young-Jin;Lee, Byeong-Seok;Lee, Jeong-A;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.11
    • /
    • pp.1-9
    • /
    • 2010
  • As the process technology scales down, an interconnection has became a major performance constraint for multi-core processors. Recently, in order to mitigate the performance bottleneck of the interconnection for multi-core processors, a 3D integration technique has drawn quite attention. The 3D integrated multi-core processor has advantage for reducing global wire length, resulting in a performance improvement. However, it causes serious thermal problems due to increased power density. For this reason, to design efficient 3D multi-core processors, thermal-aware design techniques should be considered. In this paper, we analyze the temperature on the 3D multi-core processors in function unit level through various experiments. We also present temperature characteristics by varying application features, cooling characteristics, and frequency levels on 3D multi-core processors. According to our experimental results, following two rules should be obeyed for thermal-aware 3D processor design. First, to optimize the thermal profile of cores, the core with higher cooling efficiency should be clocked at a higher frequency. Second, to lower the temperature of cores, a workload with higher thermal impact should be assigned to the core with higher cooling efficiency.

A Study On Statistical Simulation for Asymmetric Multi-Core Processor Architectures (비대칭적 멀티코어 프로세서의 통계적 모의실험에 관한 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.16 no.2
    • /
    • pp.157-163
    • /
    • 2016
  • If trace-driven or execution-driven simulation is used for the performance analysis of asymmetric multi-core processors, excessive time and much disk space are necessary. In this paper, statistical simulations are performed for asymmetric multi-core processors with various hardware configurations. For the experiment, SPEC 2000 benchmark programs are used for profiling and synthesis, which is supplied as input for the simulation of asymmetric multi-core processors. As a result, the performance of asymmetric multi-core processor obtained by statistical simulation is comparable to that of the trace-driven simulation with a tremendous reduction in the simulation time.

Analysis on the Performance and Temperature of the 3D Quad-core Processor according to Cache Organization (캐쉬 구성에 따른 3차원 쿼드코어 프로세서의 성능 및 온도 분석)

  • Son, Dong-Oh;Ahn, Jin-Woo;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.6
    • /
    • pp.1-11
    • /
    • 2012
  • As the process technology scales down, multi-core processors cause serious problems such as increased interconnection delay, high power consumption and thermal problems. To solve the problems in 2D multi-core processors, researchers have focused on the 3D multi-core processor architecture. Compared to the 2D multi-core processor, the 3D multi-core processor decreases interconnection delay by reducing wire length significantly, since each core on different layers is connected using vertical through-silicon via(TSV). However, the power density in the 3D multi-core processor is increased dramatically compared to that in the 2D multi-core processor, because multiple cores are stacked vertically. Unfortunately, increased power density causes thermal problems, resulting in high cooling cost, negative impact on the reliability. Therefore, temperature should be considered together with performance in designing 3D multi-core processors. In this work, we analyze the temperature of the cache in quad-core processors varying cache organization. Then, we propose the low-temperature cache organization to overcome the thermal problems. Our evaluation shows that peak temperature of the instruction cache is lower than threshold. The peak temperature of the data cache is higher than threshold when the cache is composed of many ways. According to the results, our proposed cache organization not only efficiently reduces the peak temperature but also reduces the performance degradation for 3D quad-core processors.

ETS: Efficient Task Scheduler for Per-Core DVFS Enabled Multicore Processors

  • Hong, Jeongkyu
    • Journal of information and communication convergence engineering
    • /
    • v.18 no.4
    • /
    • pp.222-229
    • /
    • 2020
  • Recent multi-core processors for smart devices use per-core dynamic voltage and frequency scaling (DVFS) that enables independent voltage and frequency control of cores. However, because the conventional task scheduler was originally designed for per-core DVFS disabled processors, it cannot effectively utilize the per-core DVFS and simply allocates tasks evenly across all cores to core utilization with the same CPU frequency. Hence, we propose a novel task scheduler to effectively utilize percore DVFS, which enables each core to have the appropriate frequency, thereby improving performance and decreasing energy consumption. The proposed scheduler classifies applications into two types, based on performance-sensitivity and allows a performance-sensitive application to have a dedicated core, which maximizes core utilization. The experimental evaluations with a real off-the-shelf smart device showed that the proposed task scheduler reduced 13.6% of CPU energy (up to 28.3%) and 3.4% of execution time (up to 24.5%) on average, as compared to the conventional task scheduler.

Variable latency L1 data cache architecture design in multi-core processor under process variation

  • Kong, Joonho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.20 no.9
    • /
    • pp.1-10
    • /
    • 2015
  • In this paper, we propose a new variable latency L1 data cache architecture for multi-core processors. Our proposed architecture extends the traditional variable latency cache to be geared toward the multi-core processors. We added a specialized data structure for recording the latency of the L1 data cache. Depending on the added latency to the L1 data cache, the value stored to the data structure is determined. It also tracks the remaining cycles of the L1 data cache which notifies data arrival to the reservation station in the core. As in the variable latency cache of the single-core architecture, our proposed architecture flexibly extends the cache access cycles considering process variation. The proposed cache architecture can reduce yield losses incurred by L1 cache access time failures to nearly 0%. Moreover, we quantitatively evaluate performance, power, energy consumption, power-delay product, and energy-delay product when increasing the number of cache access cycles.