Title/Summary/Keyword: Multicore Platforms

Performance Evaluation of Real-time Linux for an Industrial Real-time Platform

  • Jo, Yong Hwan; Choi, Byoung Wook
    • International Journal of Advanced Smart Convergence / v.11 no.1 / pp.28-35 / 2022
  • This paper presents a performance evaluation of real-time Linux for industrial real-time platforms. On industrial platforms, multicore processors are popular because of their work-distribution efficiency and cost-effectiveness. Multicore processors, however, are not designed for applications with real-time constraints, and their performance depends on the core configuration. To assess the feasibility of a multicore processor for real-time applications, we evaluate both a general-purpose processor and a low-power processor, providing an experimental environment of real-time Linux with both Xenomai and RT-Preempt while considering the multicore configuration. Real-time performance is evaluated through scheduling latency, measured under CPU, memory, and network load to reflect actual operating conditions. The results show differences between the low-power and general-purpose processors, but from a developer's point of view the low-power processor is a suitable solution for low-power applications.
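
Scheduling latency of the kind measured above is typically obtained with a periodic, high-priority thread that compares its intended and actual wake-up times (the approach taken by tools such as cyclictest). The C++ sketch below illustrates the idea on an RT-Preempt-style Linux system; the 1 ms period, priority 80, and sample count are illustrative assumptions, not values from the paper, and the program needs real-time privileges to behave as intended.

```cpp
// Minimal scheduling-latency probe for real-time Linux (illustrative sketch).
// Period, priority, and sample count are assumptions, not the paper's setup.
#include <cstdint>
#include <cstdio>
#include <ctime>
#include <sched.h>
#include <sys/mman.h>

static int64_t to_ns(const timespec& t) {
    return static_cast<int64_t>(t.tv_sec) * 1000000000LL + t.tv_nsec;
}

int main() {
    // Lock memory and request an RT FIFO priority so page faults and
    // lower-priority tasks do not distort the measurement.
    mlockall(MCL_CURRENT | MCL_FUTURE);
    sched_param sp{};
    sp.sched_priority = 80;                      // assumed RT priority
    sched_setscheduler(0, SCHED_FIFO, &sp);

    const long period_ns = 1000000;              // 1 ms period (assumption)
    const int samples = 10000;

    timespec next{};
    clock_gettime(CLOCK_MONOTONIC, &next);
    int64_t worst = 0, sum = 0;

    for (int i = 0; i < samples; ++i) {
        // Advance the absolute wake-up time by one period.
        next.tv_nsec += period_ns;
        while (next.tv_nsec >= 1000000000L) { next.tv_nsec -= 1000000000L; ++next.tv_sec; }

        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, nullptr);

        timespec now{};
        clock_gettime(CLOCK_MONOTONIC, &now);
        int64_t latency = to_ns(now) - to_ns(next);  // wake-up delay = scheduling latency
        if (latency > worst) worst = latency;
        sum += latency;
    }
    std::printf("avg latency: %lld ns, worst: %lld ns\n",
                static_cast<long long>(sum / samples), static_cast<long long>(worst));
    return 0;
}
```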

Scratchpad Memory Architectures and Allocation Algorithms for Hard Real-Time Multicore Processors

  • Liu, Yu; Zhang, Wei
    • Journal of Computing Science and Engineering / v.9 no.2 / pp.51-72 / 2015
  • Time predictability is crucial in hard real-time and safety-critical systems. Cache memories, while useful for improving average-case memory performance, are not time predictable, especially when they are shared in multicore processors. To achieve time predictability while minimizing the impact on performance, this paper explores several time-predictable scratchpad memory (SPM) based architectures for multicore processors. To support these architectures, we propose three L2 SPM strategies: a partition with dynamic memory-object allocation, a partition with static allocation, and a priority scheme with static allocation, all of which retain time predictability while attempting to maximize performance and energy efficiency. The SPM-based multicore architectural design and the related allocation methods thus form a comprehensive solution for hard real-time multicore computing. Our experimental results indicate the strengths and weaknesses of each proposed architecture and allocation method, offering interesting on-chip memory design options for enabling multicore platforms in hard real-time systems.
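
As a rough illustration of what a static SPM allocation decision looks like, the sketch below greedily pins the memory objects with the highest access count per byte into a fixed-size scratchpad. This is a hypothetical simplification for intuition only; the paper's partitioned and priority-based L2 SPM strategies are considerably more involved.

```cpp
// Greedy static SPM allocation sketch: pin the objects with the best
// accesses-per-byte ratio until the scratchpad is full. Hypothetical
// simplification; not the partition/priority algorithms from the paper.
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>

struct MemObject {
    const char* name;
    std::size_t size_bytes;
    std::uint64_t accesses;   // e.g., from profiling or static analysis
};

std::vector<MemObject> allocate_spm(std::vector<MemObject> objs, std::size_t spm_capacity) {
    // Most profitable objects (accesses per byte) first.
    std::sort(objs.begin(), objs.end(), [](const MemObject& a, const MemObject& b) {
        return double(a.accesses) / a.size_bytes > double(b.accesses) / b.size_bytes;
    });
    std::vector<MemObject> placed;
    std::size_t used = 0;
    for (const auto& o : objs) {
        if (used + o.size_bytes <= spm_capacity) {
            placed.push_back(o);
            used += o.size_bytes;
        }
    }
    return placed;   // everything else stays in the slower but predictable memory
}

int main() {
    // Hypothetical profile data and an 8 KB scratchpad.
    std::vector<MemObject> objs = {
        {"lut", 4096, 90000}, {"state", 512, 120000}, {"frame", 16384, 30000}};
    for (const auto& o : allocate_spm(objs, 8192))
        std::printf("place %s (%zu bytes) in SPM\n", o.name, o.size_bytes);
    return 0;
}
```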

Multicore-Aware Code Co-Positioning to Reduce WCET on Dual-Core Processors with Shared Instruction Caches

  • Ding, Yiqiang; Zhang, Wei
    • Journal of Computing Science and Engineering / v.6 no.1 / pp.12-25 / 2012
  • For real-time systems it is important to obtain an accurate worst-case execution time (WCET). Moreover, improving the WCET of applications that run on multicore processors is both significant and challenging, because the WCET can be largely affected by inter-core interference in shared resources such as the shared L2 cache. To address this problem, we propose an approach that applies code positioning to reduce inter-core L2 cache interference between the different real-time threads running on a multicore processor, adaptively using different strategies. The worst-case-oriented strategy is designed to reduce the largest WCET among these threads as much as possible. The other two strategies aim at reducing the WCET of each thread by an almost equal percentage or amount, respectively. Our experiments indicate that the proposed multicore-aware code positioning approaches not only improve the worst-case performance of the real-time threads but also make good trade-offs between efficiency and fairness for threads that run on multicore platforms.
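
The core idea of code co-positioning is that the linker can choose function start addresses so that the hot code of the two threads maps to disjoint L2 cache sets and therefore cannot evict each other. The toy C++ sketch below gives each thread its own half of the cache sets; the line size, set count, addresses, and function sizes are made-up values, and the actual WCET-driven placement strategies in the paper are far more sophisticated.

```cpp
// Toy illustration of code co-positioning on a shared L2 cache: lay out each
// thread's hot functions so they start in a disjoint range of cache sets.
// All sizes and addresses are hypothetical; not the paper's algorithm.
#include <cstdint>
#include <cstdio>
#include <vector>

constexpr std::uint32_t kLineSize = 64;    // bytes per cache line (assumed)
constexpr std::uint32_t kNumSets  = 1024;  // number of L2 cache sets (assumed)

struct Function { const char* name; std::uint32_t size; };

// Lay out the functions contiguously from 'base', padding the start so the
// first line maps to cache set 'first_set'. Returns the start addresses.
std::vector<std::uint32_t> layout(const std::vector<Function>& funcs,
                                  std::uint32_t base, std::uint32_t first_set) {
    std::uint32_t addr = base;
    while (((addr / kLineSize) % kNumSets) != first_set) addr += kLineSize;
    std::vector<std::uint32_t> starts;
    for (const auto& f : funcs) {
        starts.push_back(addr);
        addr += (f.size + kLineSize - 1) / kLineSize * kLineSize;  // line-align next
    }
    return starts;
}

int main() {
    std::vector<Function> threadA = {{"a_hot_loop", 4096}, {"a_filter", 8192}};
    std::vector<Function> threadB = {{"b_hot_loop", 2048}, {"b_decode", 6144}};
    // Thread A's code starts in set 0, thread B's in set 512, so their hot
    // code cannot conflict in the shared cache.
    auto a = layout(threadA, 0x100000, 0);
    auto b = layout(threadB, 0x200000, kNumSets / 2);
    for (std::size_t i = 0; i < threadA.size(); ++i)
        std::printf("%-10s -> 0x%06x (set %4u)\n", threadA[i].name, a[i],
                    (a[i] / kLineSize) % kNumSets);
    for (std::size_t i = 0; i < threadB.size(); ++i)
        std::printf("%-10s -> 0x%06x (set %4u)\n", threadB[i].name, b[i],
                    (b[i] / kLineSize) % kNumSets);
    return 0;
}
```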

Secure and Energy-Efficient MPEG Encoding using Multicore Platforms (멀티코어를 이용한 안전하고 에너지 효율적인 MPEG 인코딩)

  • Lee, Sung-Ju; Lee, Eun-Ji; Hong, Seung-Woo; Choi, Han-Na; Chung, Yong-Wha
    • Journal of the Korea Institute of Information Security & Cryptology / v.20 no.3 / pp.113-120 / 2010
  • Content security and privacy protection are important issues in emerging network-based video surveillance applications. In particular, satisfying both the real-time constraint and energy efficiency with embedded video sensors is challenging, since the battery-operated sensors need to compress and protect video content in real time. In this paper, we propose a multicore-based solution to compress and protect video surveillance data, and we evaluate its effectiveness in terms of both the real-time constraint and energy efficiency. Based on experimental results with MPEG2/AES software, we confirm that the multicore-based solution can improve the energy efficiency of a single-core-based solution by a factor of 30 while meeting the real-time constraint.
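
A minimal sketch of the parallel structure, assuming frame-level data parallelism: each worker thread compresses and then encrypts its own subset of frames, so all cores stay busy and can potentially run at a lower, more energy-efficient clock. The compress() and encrypt() bodies are placeholders standing in for the MPEG2 and AES implementations used in the paper.

```cpp
// Frame-level parallel "compress then protect" sketch for a multicore CPU.
// compress()/encrypt() are placeholders standing in for MPEG2 and AES.
#include <algorithm>
#include <cstdio>
#include <thread>
#include <vector>

using Frame = std::vector<unsigned char>;

Frame compress(const Frame& f) { return Frame(f.size() / 10 + 1, 0); }  // placeholder
Frame encrypt(const Frame& f)  { return f; }                            // placeholder

int main() {
    const unsigned cores = std::max(1u, std::thread::hardware_concurrency());
    std::vector<Frame> frames(64, Frame(64 * 1024, 0));     // dummy video frames
    std::vector<Frame> protectedFrames(frames.size());

    // Each worker handles an interleaved subset of the frames.
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < cores; ++w) {
        workers.emplace_back([&, w] {
            for (std::size_t i = w; i < frames.size(); i += cores)
                protectedFrames[i] = encrypt(compress(frames[i]));
        });
    }
    for (auto& t : workers) t.join();
    std::printf("processed %zu frames on %u cores\n", protectedFrames.size(), cores);
    return 0;
}
```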

Performance Comparison of Parallel Programming Frameworks in Digital Image Transformation

  • Shin, Woochang
    • International Journal of Internet, Broadcasting and Communication / v.11 no.3 / pp.1-7 / 2019
  • Previously, parallel computing was mainly used in areas requiring high computing performance, but nowadays multicore CPUs and GPUs have become widespread, and the advantages of parallel programming can be obtained even in a PC environment. Various parallel programming frameworks using multicore CPUs, such as OpenMP and PPL, have been announced, and Nvidia and AMD have developed parallel programming platforms and APIs that let developers take advantage of the multicore GPUs on their graphics cards. In this paper, we develop digital image transformation programs that run on each of the major parallel programming frameworks and measure their execution times. We analyze the characteristics of each framework through this comparison. We also present a constant K indicating the ratio of program execution times between different parallel computing environments; using it, a rough execution time can be predicted without implementing the parallel program.
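
For intuition, a per-pixel image transformation is embarrassingly parallel, which is why frameworks such as OpenMP can accelerate it with a single directive. The C++/OpenMP sketch below (image size and the transform itself are placeholders) shows the pattern; predicting the run time in another environment then amounts to multiplying a measured time by the ratio constant K described above.

```cpp
// OpenMP sketch of a data-parallel image transformation (pixel inversion).
// Compile with -fopenmp; image size and the transform itself are placeholders.
#include <cstdio>
#include <vector>

int main() {
    const int width = 1920, height = 1080;
    std::vector<unsigned char> img(static_cast<std::size_t>(width) * height, 128);

    // Rows are independent, so the outer loop is distributed across CPU cores.
    #pragma omp parallel for
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            img[static_cast<std::size_t>(y) * width + x] =
                255 - img[static_cast<std::size_t>(y) * width + x];

    std::printf("first pixel after transform: %d\n", img[0]);
    return 0;
}
```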

Implementation of SDR-based LTE-A PDSCH Decoder for Supporting Multi-Antenna Using Multi-Core DSP (멀티코어 DSP를 이용한 다중 안테나를 지원하는 SDR 기반 LTE-A PDSCH 디코더 구현)

  • Na, Yong; Ahn, Heungseop; Choi, Seungwon
    • Journal of Korea Society of Digital Industry and Information Management / v.15 no.4 / pp.85-92 / 2019
  • This paper presents an SDR-based Long Term Evolution Advanced (LTE-A) Physical Downlink Shared Channel (PDSCH) decoder implemented on a multicore Digital Signal Processor (DSP). The decoder uses the multicore DSP TMS320C6670, which provides various hardware accelerators such as a turbo decoder, a fast Fourier transformer, and bit-rate coprocessors. The TMS320C6670 is specialized for base-station platforms and is not optimized for implementing a mobile terminal. Accordingly, we adapted the hardware accelerators to the terminal side to implement an LTE-A PDSCH decoder supporting multiple antennas, and functions not provided by the hardware accelerators were implemented through core programming. A pipeline across the cores was also implemented to meet the transmission time interval. To confirm the feasibility of the proposed implementation, we verified the real-time decoding capability of the PDSCH decoder using LTE-A Reference Measurement Channel (RMC) waveforms for transmission modes 2 and 3.
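
The pipelining idea can be sketched in portable C++ as two stages on separate cores with a queue in between: while one core processes the front end of subframe n+1, the other decodes subframe n, so each stage only needs to finish within one transmission time interval. The stage bodies, queue, and timings below are placeholders and are unrelated to the TMS320C6670 accelerators themselves.

```cpp
// Two-stage decode pipeline sketch: stage 1 of subframe n+1 overlaps stage 2
// of subframe n on another core. Stage bodies and timings are placeholders.
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>

struct Subframe { int id; };

std::queue<Subframe> stage1_out;
std::mutex m;
std::condition_variable cv;
bool producer_done = false;

void stage1_frontend() {                      // e.g., demapping/equalization
    for (int n = 0; n < 10; ++n) {
        std::this_thread::sleep_for(std::chrono::microseconds(400));  // fake work
        {
            std::lock_guard<std::mutex> lk(m);
            stage1_out.push({n});
        }
        cv.notify_one();
    }
    { std::lock_guard<std::mutex> lk(m); producer_done = true; }
    cv.notify_one();
}

void stage2_decoder() {                       // e.g., turbo decoding
    while (true) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [] { return !stage1_out.empty() || producer_done; });
        if (stage1_out.empty()) break;        // producer finished and queue drained
        Subframe sf = stage1_out.front();
        stage1_out.pop();
        lk.unlock();
        std::this_thread::sleep_for(std::chrono::microseconds(400));  // fake work
        std::printf("subframe %d decoded\n", sf.id);
    }
}

int main() {
    std::thread front(stage1_frontend), dec(stage2_decoder);
    front.join();
    dec.join();
    return 0;
}
```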

Parallelizing H.264 and AES Collectively

  • Kim, Heegon; Lee, Sungju; Chung, Yongwha; Pan, Sung Bum
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.9 / pp.2326-2337 / 2013
  • Many applications can be parallelized on multicore platforms. We propose a load-balancing technique for parallelizing a whole application whose first module (H.264) is data-independent and whose second module (AES) is data-dependent. Instead of distributing the first module symmetrically over the multicore platform, we distribute the data-independent workload asymmetrically in order to start the data-dependent workload as early as possible. Based on experimental results with a compression/encryption application, we confirm that the asymmetric load balancing provides better performance than typical symmetric load balancing.
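
A minimal sketch of the asymmetric idea, with placeholder encode()/chain_encrypt() functions standing in for H.264 and a chained (CBC-like) AES: the thread that will run the sequential encryption chain takes only a small share of the data-independent encoding work, starts the chain early, and then consumes the other workers' results in order. The split ratio and the use of std::async instead of explicit core pinning are assumptions made for brevity.

```cpp
// Asymmetric load-balancing sketch: the thread running the data-dependent
// chain gets a smaller share of the data-independent work so the chain can
// start as early as possible. encode()/chain_encrypt() are placeholders.
#include <chrono>
#include <cstdio>
#include <future>
#include <thread>
#include <vector>

int encode(int frame) {                   // placeholder for H.264-style encoding
    std::this_thread::sleep_for(std::chrono::milliseconds(2));
    return frame * 31;
}

int chain_encrypt(int prev, int block) {  // placeholder for chained (CBC-like) AES
    return prev ^ block;
}

int main() {
    const int frames = 32;
    const int smallShare = frames / 8;    // assumed asymmetric split ratio

    // Helper tasks encode the later frames in the background.
    std::vector<std::future<int>> encoded(frames);
    for (int i = smallShare; i < frames; ++i)
        encoded[i] = std::async(std::launch::async, encode, i);

    // Main thread: encode its small share, then start the dependent chain
    // immediately, consuming the background results in order as they finish.
    int chained = 0x5a;                   // stand-in for an IV
    for (int i = 0; i < smallShare; ++i)
        chained = chain_encrypt(chained, encode(i));
    for (int i = smallShare; i < frames; ++i)
        chained = chain_encrypt(chained, encoded[i].get());

    std::printf("final chained value: %d\n", chained);
    return 0;
}
```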

Energy-aware EDZL Real-Time Scheduling on Multicore Platforms (멀티코어 플랫폼에서 에너지 효율적 EDZL 실시간 스케줄링)

  • Han, Sangchul
    • Journal of KIISE / v.43 no.3 / pp.296-303 / 2016
  • Mobile real-time systems with limited system resources and a limited power source need to fully utilize the system resources when the workload is heavy and to reduce energy consumption when the workload is light. EDZL (Earliest Deadline until Zero Laxity), a multiprocessor real-time scheduling algorithm, can provide high system utilization, but little work has aimed at reducing its energy consumption. This paper tackles the problem of DVFS (Dynamic Voltage/Frequency Scaling) in EDZL scheduling. It proposes a technique to compute a uniform speed on full-chip DVFS platforms and individual task speeds on per-core DVFS platforms. The technique, which is based on the EDZL schedulability test, is simple but effective for determining task speeds offline. We also show through simulation that the proposed technique is useful in reducing energy consumption.
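
To make the offline speed computation concrete, the sketch below picks the lowest uniform speed at which a simple utilization-based feasibility condition still holds. This is a deliberately simplified stand-in: the paper derives its speeds from the EDZL schedulability test, not from the utilization bound used here, and the example task set is hypothetical.

```cpp
// Offline uniform-speed selection sketch for full-chip DVFS.
// Uses a simple utilization-based check as a stand-in for the EDZL
// schedulability test that the paper actually relies on.
#include <algorithm>
#include <cstdio>
#include <vector>

struct Task { double wcet; double period; };   // WCET measured at maximum speed

// Lowest normalized speed s in (0, 1] such that, after scaling execution
// times by 1/s, total utilization <= m and every task's utilization <= 1.
double uniform_speed(const std::vector<Task>& tasks, int m) {
    double total = 0.0, umax = 0.0;
    for (const auto& t : tasks) {
        double u = t.wcet / t.period;
        total += u;
        umax = std::max(umax, u);
    }
    // Need total/s <= m and umax/s <= 1, hence s >= max(total/m, umax).
    return std::min(1.0, std::max(total / m, umax));
}

int main() {
    std::vector<Task> ts = {{2, 10}, {3, 15}, {4, 20}, {5, 40}};  // example task set
    std::printf("run at %.3f of maximum frequency on 2 cores\n", uniform_speed(ts, 2));
    return 0;
}
```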

Improving Multi-DNN Computational Performance of Embedded Multicore Processors through a Global Queue (글로벌 큐를 통한 임베디드 멀티코어 프로세서의 멀티 DNN 연산 성능 향상)

  • Cho, Ho-jin; Kim, Myung-sun
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.6 / pp.714-721 / 2020
  • The use of DNNs is expanding in embedded systems such as robots and autonomous vehicles. To achieve high recognition accuracy, computational complexity has greatly increased, and multiple DNNs may run aperiodically; the ability to process multiple DNNs in embedded environments is therefore a crucial issue, and multicore-based platforms are being released to address it. However, most DNN models are operated as batch processes, and when multiple DNNs run together on a multicore processor, the deviation in execution time between the DNNs may be large and the end-to-end execution time of all DNNs may be long, depending on how they are allocated to the cores. In this paper, we solve these problems with a framework that decomposes each DNN into individual layers and distributes the layers to the cores through a global queue. In our experiments, the total DNN execution time was reduced by 31%, and when running multiple identical DNNs, the deviation in execution time was reduced by up to 95.1%.
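
A compact sketch of the global-queue idea, assuming a fixed number of layers per DNN and placeholder "work": every DNN contributes its next runnable layer to one shared queue, worker threads (one per core) pop whatever layer is at the front, and a layer's successor is enqueued only after the layer finishes, which preserves the ordering inside each DNN while letting all cores stay busy.

```cpp
// Global-queue sketch: DNNs are decomposed into layers, all runnable layers
// share one queue, and one worker per core pops from it. A layer's successor
// is enqueued only after the layer finishes (intra-DNN order is preserved).
// DNN/layer counts and the printf "work" are placeholders.
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct LayerTask { int dnn_id; int layer; };

int main() {
    const int dnns = 3, layers_per_dnn = 5, cores = 4;
    std::queue<LayerTask> globalQueue;
    std::mutex m;
    std::condition_variable cv;
    int remaining = dnns * layers_per_dnn;      // layers not yet executed

    for (int d = 0; d < dnns; ++d) globalQueue.push({d, 0});  // first layers

    auto worker = [&](int core) {
        std::unique_lock<std::mutex> lk(m);
        while (remaining > 0) {
            if (globalQueue.empty()) { cv.wait(lk); continue; }
            LayerTask t = globalQueue.front();
            globalQueue.pop();
            lk.unlock();
            std::printf("core %d runs DNN %d layer %d\n", core, t.dnn_id, t.layer);
            lk.lock();
            --remaining;
            if (t.layer + 1 < layers_per_dnn) {
                globalQueue.push({t.dnn_id, t.layer + 1});
                cv.notify_one();
            }
        }
        cv.notify_all();   // let any waiting worker observe remaining == 0
    };

    std::vector<std::thread> workers;
    for (int c = 0; c < cores; ++c) workers.emplace_back(worker, c);
    for (auto& w : workers) w.join();
    return 0;
}
```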

On-line Trace Based Automatic Parallelization of Java Programs on Multicore Platforms

  • Sun, Yu; Zhang, Wei
    • Journal of Computing Science and Engineering / v.6 no.2 / pp.105-118 / 2012
  • We propose two new approaches that automatically parallelize Java programs at runtime. These approaches, which rely on trace information collected during program execution, dynamically recompile Java bytecode so that it can be executed in parallel. One approach uses trace information to improve traditional loop parallelization, and the other parallelizes traces instead of loop iterations. We also describe a cost/benefit model that makes intelligent parallelization decisions, as well as a parallel execution environment for running the parallelized programs. These techniques are implemented on top of the Jikes RVM. We evaluate our approach by parallelizing sequential Java programs and comparing its performance to that of manually parallelized code. According to the experimental results, our approach has low overhead and achieves competitive speedups compared to the manually parallelized code. Moreover, trace parallelization can exploit parallelism beyond loop iterations.
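
The cost/benefit decision at the heart of such a runtime system can be reduced to a simple comparison: recompile a trace or loop for parallel execution only when the estimated parallel time (work divided across threads plus recompilation and synchronization overhead) beats the estimated sequential time. The numbers in the C++ sketch below are purely hypothetical and only illustrate the shape of that decision, not the paper's actual model.

```cpp
// Toy cost/benefit check of the kind a runtime parallelizer might make:
// parallelize only when the estimated parallel time (work split across
// threads plus overhead) beats the estimated sequential time.
// All numbers are hypothetical.
#include <cstdio>

bool worth_parallelizing(double seq_time_us, int threads, double overhead_us) {
    double par_time_us = seq_time_us / threads + overhead_us;
    return par_time_us < seq_time_us;
}

int main() {
    // A small trace: 50 us of work does not amortize 200 us of overhead.
    std::printf("small trace: %s\n",
                worth_parallelizing(50.0, 4, 200.0) ? "parallelize" : "keep sequential");
    // A large trace: 10 ms of work across 4 threads easily wins.
    std::printf("large trace: %s\n",
                worth_parallelizing(10000.0, 4, 200.0) ? "parallelize" : "keep sequential");
    return 0;
}
```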