• Title/Summary/Keyword: embedded processors

Search Result 162, Processing Time 0.028 seconds

Embedded Multithreading Processor Architecture for Personal Information Devices (개인용 정보 단말장치를 위한 내장형 멀티스레딩 프로세서 구조)

  • Jeong, Ha-Young;Chung, Won-Young;Lee, Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.47 no.9
    • /
    • pp.7-13
    • /
    • 2010
  • In this paper, we proposed a processor architecture that is suitable for next generation embedded applications, especially for personal information devices such as smart phones, tablet PC. Latest high performance embedded processors are developed to achieve high clock speed. Because increasing performance makes design more difficult and induces large overhead, architectural evolution in embedded processor field is necessary. Among more enhanced processor types, out-of-order superscalar cannot be a candidate for embedded applications due to its excessive complexity and relatively low performance gain compared to its overhead. Therefore, new architecture with moderate complexity must be designed. In this paper, we developed a low-cost SMT architecture model and compared its performance to other architectures including scalar, superscalar and multiprocessor. Because current personal information devices have a tendency to execute multiple tasks simultaneously, SMT or CMP can be a good choice. And our simulation result shows that the efficiency of SMT is the best among the architectures considered.

Time-Efficient Voltage Scheduling Algorithms for Embedded Real-Time Systems with Task Synchronization (태스크 동기화가 필요한 임베디드 실기간 시스템에서 시간-효율적인 전압 스케쥴링 알고리즘)

  • Lee, Jae-Dong;Kim, Jung-Jong
    • Journal of Korea Multimedia Society
    • /
    • v.13 no.1
    • /
    • pp.30-37
    • /
    • 2010
  • Many embedded real - lime systems have adopted processors supported with dynamic voltage scal-ing(DVS) recently. Power is one of the important metrics for Optimization in the design and operation of embedded real-time systems. We can save considerable energy by using slowdown of processor sup-ported with DVS. In this paper, we improved the previous algorithm at a point of view of time complexity to calculate task slowdown factors for an efficient energy consumption in embedded real-time systems with task synchronization. We grasped the properties of the previous algorithm having $O(n^{2})$ time complexity through mathematical analysis and s simulation. Using its properties we proposed the improved algorithms with O(nlogn) and O(n) time complexity which have the same performance as the previous algorithm has.

An Efficient Voltage Scheduling for Embedded Real-Time Systems with Task Synchronization (태스크 동기화가 필요한 임베디드 실시간 시스템에 대한 효율적인 전압 스케쥴링)

  • Lee, Jae-Dong;Hur, Jung-Youn
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.35 no.6
    • /
    • pp.273-283
    • /
    • 2008
  • Many embedded real-time systems have adopted processors supported with dynamic voltage scaling(DVS) recently. Power is one of the important metrics for optimization in the design and operation of embedded real-time systems. We can save considerable energy by using slowdown of processor supported with DVS. In this paper, we propose heuristic algorithms to calculate task slowdown factors for an efficient energy consumption in embedded real-time systems with task synchronization. The previous algorithm has a following constraint : given the tasks are ordered in a nondecreasing order of their relative deadline, the task slowdown factors computed are in a nonincreasing order. In this paper, we relax the constraint and propose heuristic algorithms which have the same time complexity that previous algorithm has and can save more energy. Experimental results show that the proposed algorithms are energy efficient.

A Study on the Triple Module Redundancy ARM processor for the Avionic Embedded System (항공용 임베디드 시스템을 위한 Triple Module Redundancy 구조의 임베디드 하드웨어 신뢰성 평가)

  • Lee, Dong-Woo;Kim, Byeong-Young;Ko, Wan-Jin;Na, Jong-Whoa
    • Journal of Advanced Navigation Technology
    • /
    • v.14 no.1
    • /
    • pp.87-92
    • /
    • 2010
  • The design of avionic embedded systems requires high-dependability. In this paper, we studied the dependability of the triple modular redundancy (TMR) hardware for highly reliable aviation embedded system. In order to evaluate the dependability of the base ARM processor and the TMR ARM processor, we developed the simulation model of the reduced ARM and TMR ARM processors and performed the simulation fault injection for the analysis of the dependability of the two targets. In the fault injection experiments, we calculated the error recovery rate of the two the processor models. From the experimental results, we could confirm that the reliability of the TMR ARM processor was greater than the single ARM processor by ten times in some cases.

Block-wise Skipping for Embedded Database System (임베디드 데이터베이스 시스템을 위한 블록 단위 스키핑 기법)

  • Chong, Jae-Hyok;Park, Hyoung-Min;Hong, Seok-Jin;Shim, Kyu-Seok
    • The KIPS Transactions:PartD
    • /
    • v.16D no.6
    • /
    • pp.835-844
    • /
    • 2009
  • Today, most of all the query processors in the world generally use the 'Pipelining' method to acquire fast response time (first record latency) and less memory usage. Each of the operator nodes in the Query Execution Plan (QEP) provides Open(), Next(), and Close() functions for their interface to facilitate the iterator mechanism. However, the embedded database systems for the mobile devices, based on the FLASH memory, usually require a function like Previous(), which returns the previous records from current position. It is because that, in the embedded environment, the mobile devices cannot fully provide it main memory to store all the query results. So, whenever needed the previously read records the user (program) should re-fetch the previous records using the Previous() function: the BACKWARD data fetch. In this paper, I introduce the 'Direction Switching Problem' caused by the Previous() function and suggest 'Block-wise Skipping' method to fully utilize the benefits of the block-based data transfer mechanism, which is widely accepted by most of the today's relational database management systems.

Improving Multi-DNN Computational Performance of Embedded Multicore Processors through a Global Queue (글로벌 큐를 통한 임베디드 멀티코어 프로세서의 멀티 DNN 연산 성능 향상)

  • Cho, Ho-jin;Kim, Myung-sun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.6
    • /
    • pp.714-721
    • /
    • 2020
  • DNN is expanding its use in embedded systems such as robots and autonomous vehicles. For high recognition accuracy, computational complexity is greatly increased, and multiple DNNs are running aperiodically. Therefore, the ability processing multiple DNNs in embedded environments is a crucial issue. Accordingly, multicore based platforms are being released. However, most DNN models are operated in a batch process, and when multiple DNNs are operated in multicore together, the execution time deviation between each DNN may be large and the end-to-end execution time of the whole DNNs could be long depending on how they are allocated to the cores. In this paper, we solve these problems by providing a framework that decompose each DNN into individual layers and then distribute to multicores through a global queue. As a result of the experiment, the total DNN execution time was reduced by 31%, and when operating multiple identical DNNs, the deviation in execution time was reduced by up to 95.1%.

Gamma/neutron classification with SiPM CLYC detectors using frequency-domain analysis for embedded real-time applications

  • Ivan Rene Morales;Maria Liz Crespo;Mladen Bogovac;Andres Cicuttin;Kalliopi Kanaki;Sergio Carrato
    • Nuclear Engineering and Technology
    • /
    • v.56 no.2
    • /
    • pp.745-752
    • /
    • 2024
  • A method for gamma/neutron event classification based on frequency-domain analysis for mixed radiation environments is proposed. In contrast to the traditional charge comparison method for pulse-shape discrimination, which requires baseline removal and pulse alignment, our method does not need any preprocessing of the digitized data, apart from removing saturated traces in sporadic pile-up scenarios. It also features the identification of neutron events in the detector's full energy range with a single device, from thermal neutrons to fast neutrons, including low-energy pulses, and still provides a superior figure-of-merit for classification. The proposed frequency-domain analysis consists of computing the fast Fourier transform of a triggered trace and integrating it through a simplified version of the transform magnitude components that distinguish the neutron features from those of the gamma photons. Owing to this simplification, the proposed method may be easily ported to a real-time embedded deployment based on Field-Programmable Gate Arrays or Digital Signal Processors. We target an off-the-shelf detector based on a small CLYC (Cs2LiYCl6:Ce) crystal coupled to a silicon photomultiplier with an integrated bias and preamplifier, aiming at lightweight embedded mixed radiation monitors and dosimeter applications.

Advanced Architecture using DIAM for Improved Performance of Embedded Processor (임베디드 프로세서의 성능 향상을 위한 DIAM의 진보한 아키텍처)

  • Youn, Jong-Hee;Shin, Se-Chul;Baek, You-Heung;Cho, Jeong-hun
    • The KIPS Transactions:PartA
    • /
    • v.16A no.6
    • /
    • pp.443-452
    • /
    • 2009
  • Although 32-bit architectures are becoming the norm for modern microprocessors, 16-bit ones are still employed by many low-end processors, for which small size and low power consumption are of high priority. However, 16-bit architectures have a critical disadvantage for embedded processors that they do not provide enough encoding space to add special instructions coined for certain applications. To overcome this, many existing architectures adopt non-orthogonal, irregular instruction sets to accommodate a variety of unusual addressing modes. In general, these non-orthogonal architectures are regarded compiler-unfriendly as they tend to requires extremely sophisticated compiler techniques for optimal code generation. To address this issue, we proposed a compiler-friendly processor with a new addressing mode, called the dynamic implied addressing mode(DIAM). In this paper, we will demonstrate that the DIAM provides more encoding space for our 16-bit processor so that we are able to support more instructions specially customized for our applications. And we will explain the advanced architecture which has improved performance. In our experiment, the proposed architecture shows 11.6% performance increase on average, as compared to the basic architecture.

Compact Field Remapping for Dynamically Allocated Structures (동적으로 할당된 구조체를 위한 압축된 필드 재배치)

  • Kim, Jeong-Eun;Han, Hwan-Soo
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.10
    • /
    • pp.1003-1012
    • /
    • 2005
  • The most significant difference of embedded systems from general purpose systems is that embedded systems are allowed to use only limited resources including battery and memory. Especially, the number of applications increases which deal with multimedia data. In those systems with high data computations, the delay of memory access is one of the major bottlenecks hurting the system performance. As a result, many researchers have investigated various techniques to reduce the memory access cost. Most programs generally have locality in memory references. Temporal locality of references means that a resource accessed at one point will be used again in the near future. Spatial locality of references is that likelihood of using a resource gets higher if resources near it were just accessed. The latest embedded processors usually adapt cache memory to exploit these two types of localities. Processors access faster cache memory than off-chip memory, reducing the latency. In this paper we will propose the enhanced dynamic allocation technique for structure-type data in order to eliminate unused memory space and to reduce both the cache miss rate and the application execution time. The proposed approach aggregates fields from multiple records dynamically allocated and consecutively remaps them on the memory space. Experiments on Olden benchmarks show $13.9\%$ L1 cache miss rate drop and $15.9\%$ L2 cache miss drop on average, compared to the previously proposed techniques. We also find execution time reduced by $10.9\%$ on average, compared to the previous work.

A Power-aware Branch Predictor for Embedded Processors (내장형 프로세서를 위한 저전력 분기 예측기 설계 기법)

  • Kim, Cheol-Hong;Song, Sung-Gun
    • The KIPS Transactions:PartA
    • /
    • v.14A no.6
    • /
    • pp.347-356
    • /
    • 2007
  • In designing a branch predictor, in addition to accuracy, microarchitects should consider power consumption, especially for embedded processors. This paper proposes a power-aware branch predictor, which is based on the gshare predictor, by accessing the BTB (Branch Target Buffer) only when the prediction from the PHT (Pattern History Table) is taken. To enable the selective access to the BTB, the PHT in the proposed branch predictor is accessed one cycle earlier than the traditional PHT to prevent the additional delay. As a side effect, two predictions from the PHT are obtained through one access to the PHT, which leads to more power savings. The proposed branch predictor reduces the power consumption, not requiring any additional storage arrays, not incurring additional delay (except just one MUX delay) and never harming accuracy. Simulation results show that the proposed predictor reduces the power consumption by $35{\sim}48%$ compared to the traditional predictor.