• Title/Summary/Keyword: Algorithm Execution time


Quick Semi-Buddy Scheme for Dynamic Storage Allocation in Real-Time Systems (실시간 시스템에서의 동적 스토리지 할당을 위한 빠른 수정 이진 버디 기법)

  • 이영재;추현승;윤희용
    • Journal of the Korea Society for Simulation
    • /
    • v.11 no.3
    • /
    • pp.23-34
    • /
    • 2002
  • Dynamic storage allocation (DSA) has long been studied as a basic problem in system software. Because of DSA's memory fragmentation and unpredictable worst-case execution time, real-time system designers have considered it unsuitable for real-time applications. Recently, however, the need for an efficient DSA algorithm for real-time systems has been widely discussed. This paper proposes an efficient DSA algorithm called QSB (quick semi-buddy), designed for real-time environments. QSB maintains one set of free lists based on the quick-fit approach to serve small, frequent memory requests quickly, and another set of free lists adapted from the typical binary buddy mechanism to serve larger requests; the two work together for improved performance. Comprehensive simulation results show that the proposed scheme outperforms QHF, which is known to be effective, by up to about 16% in terms of memory fragmentation. Furthermore, the memory allocation failure ratio is significantly decreased and the worst-case execution time is predictable.

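The split between quick-fit lists for small requests and buddy-style lists for larger ones can be illustrated with a minimal sketch. This is not the QSB algorithm itself: the threshold `QUICK_LIMIT`, the word size `WORD`, and the function `size_class` are hypothetical names chosen for illustration, and the large-request branch uses plain power-of-two (binary buddy) rounding rather than the adapted scheme the abstract mentions.

```python
QUICK_LIMIT = 64  # assumed cutoff (bytes) between quick-fit and buddy lists
WORD = 8          # assumed alignment for quick-fit size classes


def size_class(request: int) -> tuple[str, int]:
    """Return which kind of free list serves a request, and the block size."""
    if request <= 0:
        raise ValueError("request size must be positive")
    if request <= QUICK_LIMIT:
        # Quick-fit: one free list per word-aligned size, so lookup is O(1).
        rounded = -(-request // WORD) * WORD  # round up to a multiple of WORD
        return ("quick", rounded)
    # Binary buddy: round up to the next power of two.
    block = 1 << (request - 1).bit_length()
    return ("buddy", block)
```

Because both branches compute the target list with a constant number of operations, the lookup cost does not depend on the heap state, which is the property that makes a bounded worst-case execution time plausible.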

PMU (Performance Monitoring Unit)-Based Dynamic XIP (eXecute In Place) Technique for Embedded Systems (내장형 시스템을 위한 PMU (Performance Monitoring Unit) 기반 동적 XIP (eXecute In Place) 기법)

  • Kim, Dohun;Park, Chanik
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.3 no.3
    • /
    • pp.158-166
    • /
    • 2008
  • Mobile embedded systems increasingly adopt flash memory with the XIP feature because it can reduce memory usage, power consumption, and software load time. XIP gives processors direct access to ROM and flash memory. However, XIP degrades application performance because direct access to ROM and flash memory is slower than access to main memory. In this paper, we propose a memory management framework, dynamic XIP, that addresses this performance degradation. Using a constrained RAM cache, dynamic XIP changes the XIP region dynamically according to the page access pattern, reducing the execution-time and energy penalties of native XIP. The proposed framework consists of a page profiler, which gathers applications' memory access patterns using the PMU, and an XIP manager, which decides whether a page is accessed in main memory or in flash memory. The framework is implemented and evaluated in the Linux kernel. Our evaluation shows that the framework reduces execution time by up to 25% and energy consumption by up to 22% compared with the XIP-only configuration used in typical mobile embedded systems. Moreover, the evaluation shows that our modified LRU algorithm with code page filters reduces execution time and energy consumption by up to 90% and 80%, respectively, compared with applying the existing LRU algorithm to dynamic XIP.


High Throughput Parallel KMP Algorithm Considering CPU-GPU Memory Hierarchy (CPU-GPU 메모리 계층을 고려한 고처리율 병렬 KMP 알고리즘)

  • Park, Soeun;Kim, Daehee;Lee, Myungho;Park, Neungsoo
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.67 no.5
    • /
    • pp.656-662
    • /
    • 2018
  • Pattern matching algorithms are widely used in application fields such as bioinformatics and intrusion detection. Among the many string matching algorithms, the KMP (Knuth-Morris-Pratt) algorithm is commonly used because of its fast execution time on large texts. However, the processing speed of the KMP algorithm is still limited when the text size increases significantly. In this paper, we propose a high-throughput parallel KMP algorithm that considers the CPU-GPU memory hierarchy, based on OpenCL for GPGPU (General-Purpose computing on Graphics Processing Units). We focus on optimizing the allocation of work-items and work-groups, copying the pattern data and the failure table to local memory, and overlapping data transfer with the string matching operations. The experimental results show that the optimized parallel KMP algorithm runs about 3.6 times faster than the non-optimized parallel KMP algorithm.
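
For reference, the pattern data and failure table mentioned in the abstract are the two inputs every KMP matcher needs. The following is a minimal sequential sketch of textbook KMP in Python, not the paper's OpenCL implementation; the function names `failure_table` and `kmp_search` are chosen here for illustration.

```python
def failure_table(pattern: str) -> list[int]:
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to a shorter border
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text: str, pattern: str) -> list[int]:
    """Return the start indices of all (possibly overlapping) matches."""
    if not pattern:
        return []
    fail = failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return matches
```

The failure table is small and read by every matching step, which is one plausible reason a GPU version would keep it in fast local memory, as the abstract describes.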

A code optimization algorithm by the loop fusion on RISC compilers (RISC 컴파일러 상에서의 루프 합치기에 의한 코드 최적화 알고리즘)

  • 이철원;임인칠
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.33B no.4
    • /
    • pp.148-155
    • /
    • 1996
  • A loop structure optimization algorithm is proposed for generating efficient code for loop structures in order to optimize RISC compiler output. Since programs contain many loop structures, most of the execution time is spent processing loop code. Thus, reducing loop instructions is more effective than optimizing code outside loops. The proposed algorithm presents a method to combine several different loops into a single loop. Rather than executing each loop independently, the loops in the program are searched and analyzed, and relevant information such as dependencies and ranges is generated. In doing so, the loops in the program can be efficiently recombined and restructured. As a result, the overall execution time of a program written in a sequential programming language is reduced.

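Loop fusion, as described in the abstract above, merges loops that share the same range and have no dependency preventing the merge. The sketch below is a hand-written Python illustration of the transformation's effect, not the paper's compiler algorithm; both function names are hypothetical.

```python
def separate_loops(a: list[int]) -> tuple[list[int], list[int]]:
    """Two loops over the same range, each with its own loop overhead."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = a[i] * 2
    for i in range(n):
        c[i] = b[i] + 1
    return b, c


def fused_loop(a: list[int]) -> tuple[list[int], list[int]]:
    """Fused version: legal because c[i] depends only on b[i] from the
    same iteration, so no value is read before it is written."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = a[i] * 2
        c[i] = b[i] + 1
    return b, c
```

If the second loop instead read `b[i + 1]`, fusing would read a value before the first loop body had produced it, which is the kind of dependency a compiler must rule out before combining loops.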

Evaluation of fault coverage of digital circuits using initializability of flipflops (플립플롭의 초기화 가능성을 고려한 디지탈 회로에 대한 고장 검출율의 평가 기법)

  • 민형복;김신택;이재훈
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.35C no.4
    • /
    • pp.11-20
    • /
    • 1998
  • Fault simulators have been used to compute exact fault coverage of test vectors for digital circuits. However, fault simulation is time consuming because its execution time is proportional to the square of the circuit size. Recently, several testability analysis algorithms have been published to cope with this problem. COP is very fast and accurate but cannot be used for sequential circuits, while STAFAN can be used for sequential circuits but needs a vast amount of execution time because it relies on good-circuit simulation. We previously proposed EXTASEC, which gives fast and accurate fault coverage but shows noticeable errors for a few sequential circuits. In this paper, we show that the inaccuracy is due to uninitializable flipflops, and we propose ITEM to improve the EXTASEC algorithm. ITEM is an improved method for evaluating fault coverage that analyzes backward lines and uninitializable flipflops. It is expected to perform efficiently for very large circuits where execution time is critical.


GPU-Based Acceleration of Quantum-Inspired Evolutionary Algorithm (GPU를 이용한 Quantum-Inspired Evolutionary Algorithm 가속)

  • Ryoo, Ji-Hyun;Park, Han-Min;Choi, Ki-Young
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.49 no.8
    • /
    • pp.1-9
    • /
    • 2012
  • The Quantum-Inspired Evolutionary Algorithm (QEA) contains sufficient data-level parallelism to be naturally accelerated on GPUs. To reduce execution time efficiently, however, tasks must be mapped carefully to reflect the characteristics of the CPU and GPU. Furthermore, when deciding which part of the application should run on the GPU, we need to consider the data transfer between CPU and GPU memory spaces as well as the data-level parallelism. In addition, the use of zero-copy host memory, a proper choice of execution configuration, and a thread organization that considers memory coalescing are important for further reducing execution time. With all these techniques, we could run QEA 3.69 times faster on average than a multi-threaded CPU implementation for the 0-1 knapsack problem with 30,000 items.

Methods to Reduce Execution Time of Ontology Reasoners based on Tableaux Algorithm (태블로 알고리즘 기반 온톨로지 추론 엔진의 속도 향상을 위한 방법)

  • Kim, Je-Min;Park, Young-Tack
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.2
    • /
    • pp.153-160
    • /
    • 2009
  • As ontologies grow in size, their descriptions become more complicated. Finding and modifying unsatisfiable concepts is therefore hard work in the ontology construction process. Minerva is an ontology reasoner that detects unsatisfiable concepts automatically and infers subsumption relations between concepts in an ontology. Most description-logic-based ontology reasoners, including Minerva, use the tableaux algorithm. Because the tableaux algorithm is very costly, ontology reasoners need various optimization methods. In this paper, we propose optimization methods to reduce the execution time of tableaux-based ontology reasoners. The proposed methods were applied to Minerva, which was developed in a preceding study. As a result, the new version of Minerva shows high performance.

Sensor Node Deployment in Wireless Sensor Networks Based on Tabu Search Algorithm (타부 서치 알고리즘 기반의 무선 센서 네트워크에서 센서 노드 배치)

  • Jang, Kil-woong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.19 no.5
    • /
    • pp.1084-1090
    • /
    • 2015
  • In this paper, we propose a Tabu search algorithm that deploys sensor nodes efficiently to maximize the network sensing coverage in wireless sensor networks. As the number of sensor nodes in a wireless sensor network increases, the amount of computation needed to search for a solution grows excessively. The proposed algorithm aims to obtain the best solution within a reasonable execution time in a high-density network. To search effectively, we propose efficient neighborhood generating operations for the Tabu search algorithm. We evaluate the proposed algorithm experimentally in terms of the maximum network sensing coverage and the execution time. The comparison results show that the proposed algorithm outperforms other existing algorithms.

Improvement of Iterative Algorithm for Live Variable Analysis based on Computation Reordering (사용할 변수의 예측에 사용되는 반복적 알고리즘의 계산순서 재정렬을 통한 수행 속도 개선)

  • Yun Jeong-Han;Han Taisook
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.8
    • /
    • pp.795-807
    • /
    • 2005
  • The classical approaches to Live Variable Analysis (LVA) use iterative algorithms across entire programs, based on the Data Flow Analysis (DFA) framework. In the Zephyr compiler, LVA takes 7% of the compilation time on average for the benchmark programs. The classical LVA algorithm leaves much room for improvement: the iterative algorithm scans useless basic blocks and repeatedly calculates large sets of variables. We propose an improved iterative algorithm for LVA based on the upward movement of used variables. Our algorithm produces the same result as the previous iterative algorithm. It is based on use-def chains. Reordering the application of the flow equations in DFA reduces the number of basic block visits and redundant flow equation executions, which improves overall processing time. Experimental results show that our algorithm can reduce LVA execution time by 36.4% and overall computation time by 2.6% in the Zephyr compiler with the benchmark programs.
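
The classical iterative algorithm that the abstract takes as its baseline can be sketched as follows. This is the textbook fixed-point formulation, not the paper's reordered algorithm; the function name `live_variables` and the dictionary-based control-flow graph representation are assumptions made for illustration.

```python
def live_variables(succ, use, defs):
    """Classical iterative LVA.

    succ: block -> list of successor blocks
    use:  block -> variables read before any write in the block
    defs: block -> variables written in the block
    Returns (live_in, live_out) dictionaries of sets.
    """
    live_in = {b: set() for b in succ}
    live_out = {b: set() for b in succ}
    changed = True
    while changed:  # repeat until a fixed point is reached
        changed = False
        for b in succ:
            # live_out[b] = union of live_in over all successors
            out = set()
            for s in succ[b]:
                out |= live_in[s]
            # live_in[b] = use[b] U (live_out[b] - defs[b])
            inn = use[b] | (out - defs[b])
            if out != live_out[b] or inn != live_in[b]:
                live_out[b], live_in[b] = out, inn
                changed = True
    return live_in, live_out


# B1: x = 1   ->   B2: y = x + 1   ->   B3: return y
succ = {"B1": ["B2"], "B2": ["B3"], "B3": []}
use = {"B1": set(), "B2": {"x"}, "B3": {"y"}}
defs = {"B1": {"x"}, "B2": {"y"}, "B3": set()}
live_in, live_out = live_variables(succ, use, defs)
```

Visiting the blocks in forward order, as this sketch does, needs several passes even on a straight-line graph because liveness flows backward; that sensitivity to visiting order is the kind of cost that computation reordering targets.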

Feasibility Test and Scheduling Algorithm for Dynamically Created Preemptable Real-Time Tasks

  • Kim, Yong-Seok
    • Journal of Electrical Engineering and Information Science
    • /
    • v.3 no.3
    • /
    • pp.396-401
    • /
    • 1998
  • An optimal algorithm is presented for feasibility test and scheduling of real-time tasks where tasks are preemptable and created dynamically. Each task has an arbitrary creation time, ready time, maximum execution time, and deadline. Feasibility test and scheduling are conducted via the same algorithm. The time complexity of the algorithm is O(n) for each newly created task, where n is the number of tasks. This improves on the previous result of O(n log n). It is shown that the algorithm can be used for scheduling tasks with different levels of importance. The time complexity of the algorithm for this problem is O(n^2), which improves on the previous result of O(n^2 log n).
