• Title/Summary/Keyword: multi-core architecture


Analysis of Performance, Energy-efficiency and Temperature for 3D Multi-core Processors according to Floorplan Methods (플로어플랜 기법에 따른 3차원 멀티코어 프로세서의 성능, 전력효율성, 온도 분석)

  • Choi, Hong-Jun;Son, Dong-Oh;Kim, Jong-Myon;Kim, Cheol-Hong
    • The KIPS Transactions:PartA / v.17A no.6 / pp.265-274 / 2010
  • As process technology scales down and integration densities continue to increase, interconnection has become one of the most important factors in the performance of recent multi-core processors. To reduce interconnection delay, 3D architectures have recently been adopted in multi-core processor design. In 3D multi-core processors, multiple cores are stacked vertically, and cores on different layers are connected by direct vertical TSVs (through-silicon vias). Compared to a 2D multi-core architecture, a 3D multi-core architecture reduces wire length significantly, leading to decreased interconnection delay and lower power consumption. Despite these benefits, 3D design cannot be practical without proper solutions for the hotspots caused by high temperature. In this paper, we propose three floorplan schemes for reducing the peak temperature in 3D multi-core processors. According to our simulation results, the proposed floorplan schemes are expected to mitigate the thermal problems of 3D multi-core processors efficiently, resulting in improved reliability. Moreover, processor performance improves because the performance degradation caused by DTM (dynamic thermal management) techniques is reduced. Power consumption can also be reduced thanks to the lower temperature and shorter execution time.

Traffic Engineering Based on Local States in Internet Protocol-Based Radio Access Networks

  • Barlow David A.;Vassiliou Vasos;Krasser Sven;Owen Henry L.;Grimminger Jochen;Huth Hans-Peter;Sokol Joachim
    • Journal of Communications and Networks / v.7 no.3 / pp.377-384 / 2005
  • The purpose of this research is to develop and evaluate a traffic engineering architecture that uses local state information. This architecture is applied to an Internet protocol radio access network (RAN) that uses multi-protocol label switching (MPLS) and differentiated services to support mobile hosts. We assume mobility support is provided by a protocol such as the hierarchical mobile Internet protocol. The traffic engineering architecture is router-based, meaning that routers on the edges of the network decide which paths admitted traffic is placed onto. We propose an algorithm that supports the architecture and uses local network state in order to function. The goal of the architecture is to provide an inexpensive and fast method to reduce network congestion while increasing the quality of service (QoS) level compared to traditional routing and traffic engineering techniques. We use a number of different mobility scenarios and a mix of different types of traffic to evaluate our architecture and algorithm. We use the network simulator ns-2 as the core of our simulation environment. Around this core we built a system of pre-simulation, in-simulation, and post-processing software that enabled us to simulate our traffic engineering architecture with only minimal changes to the core ns-2 software.

Real-Time Implementation of Doppler Beam Sharpening in a SMP Multi-Core Kernel (대칭형 멀티코어 커널에서 DBS(Doppler Beam Sharpening) 알고리즘 실시간 구현)

  • Kong, Young-Joo;Woo, Seon-Keol
    • IEMEK Journal of Embedded Systems and Applications / v.11 no.4 / pp.251-257 / 2016
  • Multi-core technology has become pervasive in embedded systems. The Doppler Beam Sharpening (DBS) algorithm, which improves azimuth resolution by using the Doppler frequency shift, can be implemented only in a multi-core environment because of the amount of calculation it requires. In this paper, we design a multi-core architecture for a real-time implementation of the DBS algorithm. Based on the designed structure, we produce a DBS image on a P4080 board.

Lane Detection using Embedded Multi-core Platform (임베디드 멀티코어 플랫폼을 이용한 차선검출)

  • Lee, Kwang-Yeob;Kim, Dong-Han;Park, Tae-Ryoung
    • Journal of IKEEE / v.15 no.3 / pp.255-260 / 2011
  • In this paper, we propose a parallelization technique for lane detection using the Hough transform. A weakness of the Hough transform is its heavy computational load, because it must compute the $\rho$ value for every candidate $\Theta$ to be detected in an image. We propose a parallel processing architecture for this transform in a multi-core environment. The parallel processing is applied to the Hough transform as well as to noise reduction and edge detection. The proposed architecture achieves a 5.17-fold performance improvement compared to the existing algorithm.
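The per-angle computation described in this abstract can be sketched as follows, assuming the standard line parameterization $\rho = x\cos\theta + y\sin\theta$ and a simple split of the candidate angles across workers. The function names (`vote_for_angles`, `hough_parallel`, `strongest_line`), the chunking strategy, and the one-degree angle step are illustrative assumptions, not the authors' architecture.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def vote_for_angles(edge_points, thetas_deg):
    """Accumulate Hough votes for one slice of candidate angles (degrees)."""
    votes = {}
    for theta_deg in thetas_deg:
        theta = math.radians(theta_deg)
        c, s = math.cos(theta), math.sin(theta)
        for x, y in edge_points:
            rho = round(x * c + y * s)  # quantize rho to integer bins
            key = (rho, theta_deg)
            votes[key] = votes.get(key, 0) + 1
    return votes


def hough_parallel(edge_points, n_workers=4, theta_step=1):
    """Split the candidate angles across workers and merge their votes."""
    thetas = list(range(0, 180, theta_step))
    # Each worker owns a disjoint set of angles, so partial results never collide.
    chunks = [thetas[i::n_workers] for i in range(n_workers)]
    accumulator = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(lambda ch: vote_for_angles(edge_points, ch), chunks):
            accumulator.update(partial)
    return accumulator


def strongest_line(accumulator):
    """Return ((rho, theta_deg), votes) for the bin with the most votes."""
    return max(accumulator.items(), key=lambda kv: kv[1])
```

Because each angle is processed independently, the work divides cleanly by angle. In CPython, threads only illustrate the decomposition; an actual speedup would need processes or native code on separate cores.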

Efficient Process Network Implementation of Ray-Tracing Application on Heterogeneous Multi-Core Systems

  • Jung, Hyeonseok;Yang, Hoeseok
    • IEIE Transactions on Smart Processing and Computing / v.5 no.4 / pp.289-293 / 2016
  • As more mobile devices are equipped with multi-core CPUs and are required to execute many compute-intensive multimedia applications, it is important to optimize these systems with the underlying parallel hardware architecture in mind. In this paper, we implement and optimize a ray-tracing application tailored to a given mobile computing platform with multiple heterogeneous processing elements. A lightweight ray-tracing application is specified and implemented in the Kahn process network (KPN) model of computation, which is known to be suitable for describing real-time applications. We take an open-source C/C++ implementation of ray tracing and adapt it to a KPN description in the Distributed Application Layer framework. Then, several possible configurations are evaluated on the target mobile computing platform (Exynos 5422), which integrates eight heterogeneous ARM cores. We derive the optimal degree of parallelism and a suitable distribution of the replicated tasks tailored to the target architecture.
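A minimal sketch of a process network with replicated tasks, in the spirit of the KPN model named in this abstract: processes communicate only through FIFO channels and block on reads. It uses Python threads and `queue.Queue` rather than the Distributed Application Layer framework, and the `shade` callback is a hypothetical stand-in for the per-row ray-tracing work. All names here (`splitter`, `worker`, `joiner`, `run_network`, `DONE`) are assumptions made for illustration.

```python
import threading
from queue import Queue

DONE = object()  # end-of-stream token (an assumption of this sketch)


def splitter(rows, outs):
    """Distribute input rows round-robin over the worker channels."""
    for i, row in enumerate(rows):
        outs[i % len(outs)].put(row)
    for ch in outs:
        ch.put(DONE)


def worker(inp, out, shade):
    """Replicated task: blocking read, compute, write."""
    while True:
        row = inp.get()  # blocks until a token is available
        if row is DONE:
            out.put(DONE)
            return
        out.put(shade(row))


def joiner(ins, result):
    """Collect results round-robin so the output order matches the input order."""
    active = [True] * len(ins)
    remaining = len(ins)
    i = 0
    while remaining:
        if active[i]:
            item = ins[i].get()
            if item is DONE:
                active[i] = False
                remaining -= 1
            else:
                result.append(item)
        i = (i + 1) % len(ins)


def run_network(rows, shade, n_workers=2):
    """Build splitter -> n_workers replicated workers -> joiner and run it."""
    ins = [Queue() for _ in range(n_workers)]
    outs = [Queue() for _ in range(n_workers)]
    result = []
    threads = [threading.Thread(target=splitter, args=(rows, ins))]
    threads += [
        threading.Thread(target=worker, args=(ins[k], outs[k], shade))
        for k in range(n_workers)
    ]
    threads.append(threading.Thread(target=joiner, args=(outs, result)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Here `n_workers` plays the role of the degree of parallelism. Giving each worker its own input and output channel, rather than one shared queue, keeps the result independent of thread scheduling.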

The Study of Distributed Processing for Graphics Rendering Engine Based on ARINC 653 Multi-Core System (ARINC 653 멀티코어 기반 그래픽스 렌더링 엔진 분산처리방안 연구)

  • Jung, Mukyoung
    • Journal of Aerospace System Engineering / v.13 no.5 / pp.1-8 / 2019
  • Recently, avionics has been migrating from a federated architecture to an integrated modular architecture based on multi-core processors to reduce the number of systems, weight, power consumption, and platform redundancy. The volume of data that must be provided to the pilot through the display device has increased, because a single integrated device performs multiple functions. For this reason, the volume of data processed by the graphics processor within a fixed operation period has also increased. In this paper, we present a multi-core-based rendering engine that performs more graphics processing within a fixed operation period. We assume the proposed method runs on a multi-core-based partitioning operating system that uses the AMP (Asymmetric Multi-Processing) architecture.

An implementation of a unified ALU in multi-core GPGPU based on SIMT architecture (SIMT 구조 기반 멀티코어 GPGPU의 통합 ALU 설계)

  • Kyung, Gyu-taek;Kwak, Jae-Chang;Lee, Kwang-yeob
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2013.10a / pp.540-543 / 2013
  • This paper describes an implementation of a unified ALU for a multi-core GPGPU based on the SIMT architecture. Our unified ALU can execute conditional branch instructions, data movement instructions, integer arithmetic instructions, and floating-point arithmetic instructions. Since a multi-core GPGPU contains many ALUs for parallel processing of various types, the main point of this paper is to design a minimum-size ALU by unifying the similar processing of each operation at the circuit level. All instructions were tested with a test program, and we compared the results with those of CPU operations to verify our ALU. Our unified ALU's gate count is approximately 20,000, and its maximum operating frequency is 430 MHz.


Real-time Scheduling on Heterogeneous Multi-core Architecture for Energy Conservation of Smart Mobile Devices (스마트 모바일 장치의 에너지 보존성을 높이기 위한 비대칭 멀티 코어 기반 실시간 태스크 스케쥴링)

  • Lim, Sung-Hwa
    • Journal of Digital Contents Society / v.19 no.6 / pp.1219-1224 / 2018
  • Nowadays, smart mobile devices in the Internet of Things are required to process and deliver large amounts of data in real time. Therefore, heterogeneous multi-core architectures such as the big.LITTLE architecture, which offers high energy conservation while guaranteeing high performance, are widely employed in up-to-date smart mobile devices. The LITTLE cores should be highly utilized to conserve more energy, because LITTLE cores have much higher energy efficiency than big cores. In this paper, we propose a core selection algorithm that first tries to assign a real-time task to a LITTLE core rather than a big core, as long as the task can be finished within its deadline. We also perform simulations to show that our proposed algorithm achieves higher energy conservation while guaranteeing the required performance.
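The LITTLE-first policy described in this abstract can be sketched roughly as below. The abstract gives no details of the task or core model, so everything here is an assumption: the `Core` and `Task` fields, the linear `work / speed` execution-time estimate, the non-preemptive one-task-at-a-time cores, and the tie-break by earliest finish time.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str
    kind: str              # "LITTLE" or "big"
    speed: float           # work units per ms (hypothetical model)
    free_at: float = 0.0   # time (ms) at which the core becomes idle


@dataclass
class Task:
    name: str
    work: float            # work units to execute
    release: float         # release time, ms
    deadline: float        # absolute deadline, ms


def finish_time(core, task):
    """Estimated completion time if the task runs next on this core."""
    start = max(core.free_at, task.release)
    return start + task.work / core.speed


def select_core(task, cores):
    """Prefer a LITTLE core that meets the deadline; otherwise try big cores."""
    for kind in ("LITTLE", "big"):
        candidates = [
            c for c in cores
            if c.kind == kind and finish_time(c, task) <= task.deadline
        ]
        if candidates:
            chosen = min(candidates, key=lambda c: finish_time(c, task))
            chosen.free_at = finish_time(chosen, task)  # reserve the core
            return chosen
    return None  # no core can meet the deadline under this model
```

With one LITTLE core at speed 1.0 and one big core at speed 2.0, a task with 4 units of work and a 10 ms deadline goes to the LITTLE core. A following task with 8 units and the same deadline falls back to the big core, because the LITTLE core could no longer finish it in time.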

SVM-based Energy-Efficient scheduling on Heterogeneous Multi-Core Mobile Devices (비대칭 멀티코어 모바일 단말에서 SVM 기반 저전력 스케줄링 기법)

  • Min-Ho, Han;Young-Bae, Ko;Sung-Hwa, Lim
    • Journal of Korea Society of Industrial Information Systems / v.27 no.6 / pp.69-75 / 2022
  • We propose an energy-efficient scheduling method that considers real-time constraints and energy efficiency in smart mobile devices with a heterogeneous multi-core structure. Recently, high-performance applications such as VR, AR, and 3D games have come to require real-time, high-level processing. The big.LITTLE architecture is applied to smart mobile devices for high performance and high energy efficiency. However, the energy-saving effect is reduced because the LITTLE cores are not properly utilized. This paper proposes a heterogeneous multi-core assignment technique that improves both real-time performance and energy efficiency on the big.LITTLE architecture. Our proposed method optimizes energy consumption and execution time by predicting the actual task execution time using an SVM (Support Vector Machine). Experiments on an off-the-shelf smartphone show that the proposed method reduces energy consumption while keeping execution time similar to that of legacy schemes.

Architecture and Characteristics of Multi-Ring based Optical Network with Single-Hop between Edge Nodes (Edge Node간 단일 홉을 갖는 다중링 기반의 광네트워크 구성 및 특성)

  • Lee, Sang-Hwa;Lee, Heesang;Han, Chimoon
    • Journal of the Institute of Electronics Engineers of Korea TC / v.41 no.6 s.324 / pp.69-78 / 2004
  • This paper proposes the architecture and characteristics of a multi-ring based optical network with a single hop between edge nodes. It uses the concepts of circuit switching and multi-wavelength label switching to solve the delay problem caused by using cross-connectors as transit nodes in a wavelength division multiplexing (WDM) network. We suggest multi-ring based architectures composed of single and multiple wavelength bands with multi-wavelength labels, and analyze the characteristics of the two models. To avoid packet collisions at the output ports of edge nodes due to output contention, we provide static and dynamic allocation schemes that assign packets to time slots. Our analysis shows that, in the proposed architecture, delay occurs only in edge nodes, not in core nodes. In addition, we evaluate the probabilities of delay, packet loss, and call blocking in the proposed optical packet network.