• Title/Summary/Keyword: processor allocation


A Processor Allocation Policy using Program Characteristics on Shared Bus (공유 버스상에서 프로그램 특성을 사용한 프로세서 할당 정책)

  • Jeong, In-Beom;Lee, Jun-Won
    • Journal of KIISE: Computer Systems and Theory / v.26 no.9 / pp.1073-1082 / 1999
  • In this paper, we propose an adaptive processor allocation policy that makes effective use of the processors in a system. To enhance parallelism, the number of processors used for parallel computation is generally increased. However, adding processors changes the grain size of the parallel program, which in turn affects cache performance. In particular, on a system with a bandwidth-limited shared bus, increasing the number of processors greatly increases contention for the bus, so the time spent waiting for the bus becomes a major factor offsetting the computing power gained from the additional processors. The proposed policy collects information about the distribution of processors waiting on the shared bus over an arbitrary period of program execution, and it changes the number of processors working in parallel during the program's run based on that information. Simulation results show that the adaptive policy finds a suitable number of processors according to the bus traffic characteristics of each program. Although it performs slightly worse than the best-performing fixed number of processors, it raises processor utilization and thus contributes to effective use of the system.

Enhancing the Performance of Multiple Parallel Applications using Heterogeneous Memory on the Intel's Next-Generation Many-core Processor (인텔 차세대 매니코어 프로세서에서의 다중 병렬 프로그램 성능 향상기법 연구)

  • Rho, Seungwoo;Kim, Seoyoung;Nam, Dukyun;Park, Geunchul;Kim, Jik-Soo
    • Journal of KIISE / v.44 no.9 / pp.878-886 / 2017
  • This paper discusses performance bottlenecks that may occur when executing high-performance computing MPI applications on Intel's next-generation many-core processor, Knights Landing (KNL), and effective resource allocation techniques for addressing them. KNL combines a host processor, which enables self-booting, with the existing many-core accelerator, and it was released with a new type of on-package memory that offers higher bandwidth in addition to the existing DDR4-based memory. By studying a resource allocation method optimized for this new many-core architecture, we empirically verified improvements in both the execution performance of multiple MPI applications and the overall system utilization ratio.

A Dynamic Resource Allocation scheme with a GPS algorithm in Cellular-based Hybrid and Distributed Wireless Multi-hop Systems (셀룰라 기반의 하이브리드 분산식 멀티홉 시스템에서의 GPS 알고리즘을 이용한 동적 자원할당 기법)

  • Bae, Byung-Joo;Kim, Dong-Kun;Shin, Bong-Jhin;Kang, Byoung-Ik;Hong, Dae-Hyoung;Choe, Jin-Woo
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.11A / pp.1120-1127 / 2007
  • In this paper, we propose a generalized processor sharing - dynamic resource allocation (GPS-DRA) scheme that dynamically allocates the required amount of resources to each hop in cellular-based multi-hop systems. In the hybrid-distributed system considered in this paper, a central controller such as a base station (BS) should allocate resources properly to each hop. However, because channel conditions change over time, it is difficult to allocate exactly as many resources as each hop needs for transmission. The GPS-DRA scheme allocates resources to each hop dynamically, based on the amount of resources that hop used in previous frames. The control overhead generated by GPS-DRA can be very small because the central controller does not need to collect all link information for resource allocation. Our simulation results show that channel utilization increased by about 16% and cell capacity by about 65% compared with a fixed resource allocation (FRA) scheme.
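
The idea of sizing each hop's share from its usage in previous frames can be illustrated with a small sketch. This is not the GPS-DRA scheme from the paper, whose details the abstract does not give. It assumes a frame of `total_slots` discrete slots, uses each hop's previous-frame usage as a GPS-style weight, and rounds with the largest-remainder method. The function name `share_slots` and both parameters are hypothetical.

```python
def share_slots(total_slots, prev_usage):
    """Split total_slots among hops in proportion to their previous usage.

    Hypothetical sketch: weights are the slots each hop used in the
    previous frame; rounding uses the largest-remainder method.
    """
    n = len(prev_usage)
    if n == 0:
        return []
    total = sum(prev_usage)
    # With no usage history, fall back to equal weights.
    weights = list(prev_usage) if total > 0 else [1] * n
    total = sum(weights)
    exact = [total_slots * w / total for w in weights]
    alloc = [int(x) for x in exact]  # floor of each ideal share
    remaining = total_slots - sum(alloc)
    # Hand leftover slots to the hops with the largest fractional parts.
    order = sorted(range(n), key=lambda i: exact[i] - alloc[i], reverse=True)
    for i in order[:remaining]:
        alloc[i] += 1
    return alloc


print(share_slots(10, [1, 1, 2]))
```

Because the weights come only from what each hop already used, the controller in this sketch needs no per-link channel reports, which matches the low-overhead argument in the abstract.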

TCP/IP Using Minimal Resources in IoT Systems

  • Lee, Seung-Chul;Shin, Dongha
    • Journal of the Korea Society of Computer and Information / v.25 no.10 / pp.125-133 / 2020
  • In this paper, we design a 4-layer TCP/IP stack that uses minimal memory and processor resources in Internet of Things (IoT) systems. The TCP/IP stack designed in this paper has the following characteristics. First, memory usage is minimized by using minimal memory allocation. Second, processor usage is minimized by using minimal memory copying. Third, the stack completes execution in deterministic time. Fourth, there is no memory leak problem. The minimal-resource standard for memory and processor derived in this paper can be used to check whether the network subsystems of already implemented IoT systems are efficient. We measured the amount of memory allocation and copying in the network subsystem of Zephyr, an open source IoT kernel recently released by the Linux Foundation, and found that it was larger than the minimal-resource standard derived in this paper. After the network subsystem of Zephyr was improved according to the proposed design, the amounts of memory allocation and copying decreased by about 39% and 67%, respectively, and the execution time was reduced by about 28%.

Algorithm for Block Packing of Main Memory Allocation Problem (주기억장치 할당 문제의 블록 채우기 알고리즘)

  • Lee, Sang-Un
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.6 / pp.99-105 / 2022
  • This paper deals with the problem of appropriately allocating multiple processes arriving at the ready queue to blocks when the user space of main memory is divided into blocks of variable size at compilation time. The existing allocation methods, first fit (FF), best fit (BF), worst fit (WF), and next fit (NF), have the disadvantage that some processes must wait because the methods fail to allocate all processes arriving at the ready queue. The proposed algorithm is a simple block packing algorithm that sorts the sizes of the partitioned blocks (holes) and of the processes in the ready queue in descending order and allocates as many processes as possible to the largest block. Applied to nine benchmark data sets, the proposed algorithm allocated all processes with minimal internal fragmentation (IF) on eight of them; in the remaining data set, a waiting process occurs due to partition errors.
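
A minimal sketch of descending-order block packing, as one reading of the abstract above rather than the author's exact algorithm: blocks and ready-queue items are both sorted by size in descending order, and each block, largest first, takes every remaining item that still fits. The function name `pack_blocks`, its return values, and the sample sizes are assumptions made for illustration.

```python
def pack_blocks(block_sizes, process_sizes):
    """Greedy packing with both lists sorted in descending order.

    Returns (assignment, waiting, leftover):
      assignment: process index -> block index
      waiting:    indices of processes that fit nowhere
      leftover:   block index -> unused space (internal fragmentation)
    """
    blocks = sorted(enumerate(block_sizes), key=lambda b: b[1], reverse=True)
    pending = sorted(enumerate(process_sizes), key=lambda p: p[1], reverse=True)
    assignment = {}
    leftover = {}
    for block_id, capacity in blocks:
        free = capacity
        still_pending = []
        for proc_id, size in pending:
            if size <= free:
                # Place the process and shrink the block's free space.
                assignment[proc_id] = block_id
                free -= size
            else:
                still_pending.append((proc_id, size))
        pending = still_pending
        leftover[block_id] = free
    waiting = [proc_id for proc_id, _ in pending]
    return assignment, waiting, leftover


assignment, waiting, leftover = pack_blocks([300, 600], [200, 100, 350, 250])
print(assignment, waiting, leftover)
```

In this example the 600-unit block takes the 350 and 250 items and the 300-unit block takes the 200 and 100 items, so nothing waits and no space is left over.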

A Scheduling Method on Parallel Computation Models with Limited Number of Processors Using Genetic Algorithms (프로세서의 수가 한정되어있는 병렬계산모델에서 유전알고리즘을 이용한 스케쥴링해법)

  • 성기석;박지혁
    • Journal of the Korean Operations Research and Management Science Society / v.23 no.2 / pp.15-27 / 1998
  • In parallel processing systems, a compiler partitions a loaded program into tasks, allocates the tasks to multiple processors, and schedules the tasks on each allocated processor. In this paper we suggest a Genetic Algorithm (GA) based scheduling method to find an optimal allocation and sequence of tasks on each processor. The suggested method uses a chromosome consisting of a task sequence and a binary string that represent the number and order of tasks on each processor, respectively. Two correction algorithms are used to maintain the precedence constraints of the tasks in the chromosome. This scheduling method determines the optimal number of processors within the given limit, and then finds the optimal schedule for each processor. Results from a computational experiment with the suggested method are given.


A Flexible Processor Allocation Strategy for 2D Meshes (2차원 메쉬에서의 유연성 있는 프로세서 할당기법)

  • 서경희
    • Proceedings of the Korean Information Science Society Conference / 2000.10c / pp.656-658 / 2000
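  • Existing processor allocation schemes proposed for massively parallel systems that use a mesh as the interconnection network have been limited to allocating rectangular submeshes. As a result, they suffer from severe system fragmentation. To reduce both external fragmentation and job response time, this paper proposes an extended LSSA (L-Shaped Submesh Allocation) scheme that allocates not only rectangular submeshes but also modified L-shaped submeshes, so that it can be applied even to fragmented mesh systems. All submesh shape transformations performed by the LSSA scheme are transparent to the application programmer. Simulation results show that the LSSA scheme outperforms other schemes in terms of job response time and system utilization.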


SS-DRM: Semi-Partitioned Scheduling Based on Delayed Rate Monotonic on Multiprocessor Platforms

  • Senobary, Saeed;Naghibzadeh, Mahmoud
    • Journal of Computing Science and Engineering / v.8 no.1 / pp.43-56 / 2014
  • Semi-partitioned scheduling is a new approach for allocating tasks on multiprocessor platforms. By splitting some tasks between processors, semi-partitioned scheduling is used to improve processor utilization. In this paper, a new semi-partitioned scheduling algorithm called SS-DRM is proposed for multiprocessor platforms. The scheduling policy used in SS-DRM is based on the delayed rate monotonic algorithm, which is a modified version of the rate monotonic algorithm that can achieve higher processor utilization. This algorithm can safely schedule any system composed of two tasks with total utilization less than or equal to that on a single processor. First, it is formally proven that any task which is feasible under the rate monotonic algorithm will be feasible under the delayed rate monotonic algorithm as well. Then, the existing allocation method is extended to the delayed rate monotonic algorithm. After that, two improvements are proposed to achieve more processor utilization with the SS-DRM algorithm than with the rate monotonic algorithm. According to the simulation results, SS-DRM improves the scheduling performance compared with previous work in terms of processor utilization, the number of required processors, and the number of created subtasks.
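
For context on the baseline the abstract compares against, the classic sufficient schedulability test for the plain rate monotonic algorithm on one processor (the Liu and Layland utilization bound) can be sketched as follows. This is the standard rate monotonic test, not SS-DRM or the delayed rate monotonic algorithm. The function names and the `(execution_time, period)` task representation are choices made for this illustration.

```python
def rm_utilization_bound(n):
    """Liu and Layland bound n * (2^(1/n) - 1) for n periodic tasks."""
    if n < 1:
        raise ValueError("need at least one task")
    return n * (2 ** (1 / n) - 1)


def total_utilization(tasks):
    """Sum of execution_time / period over (execution_time, period) pairs."""
    return sum(c / t for c, t in tasks)


def passes_rm_bound(tasks):
    """Sufficient (not necessary) test: True means RM schedulable."""
    return total_utilization(tasks) <= rm_utilization_bound(len(tasks))


print(round(rm_utilization_bound(2), 4))
print(passes_rm_bound([(1, 4), (2, 8)]))
print(passes_rm_bound([(2, 4), (4, 8)]))
```

For two tasks the bound is about 0.828, so the second task set, with total utilization 1.0, fails this test. The abstract's claim about two-task systems under the delayed rate monotonic algorithm concerns exactly this gap between the bound and full utilization.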

Performance analysis of call admission control in ATM networks considering bulk arrivals services (벌크 입력과 서비스를 고려한 ATM망에서 호 수락 제어에 관한 성능 분석)

  • 서순석;박광채
    • The Journal of Korean Institute of Communications and Information Sciences / v.21 no.3 / pp.675-683 / 1996
  • CAC, UPC, NPC, cell-level QoS, and congestion control are required to assign channel bandwidth (BW) efficiently and to prevent network congestion. In the CAC algorithm, each user declares the characteristics of its input traffic when a channel is set up, and the network decides, based on these parameters, whether to accept or reject the requested BW. CAC mechanisms are classified into centralized and distributed BW allocation mechanisms according to the function and position of the CAC processor that allocates BW. In this paper, in contrast with existing distributed BW allocation mechanisms, which assume that the required BW of input traffic is constant, we model input traffic and services with bulk probability distributions in order to analyze performance more precisely.


An Efficient Processor Allocation Scheme for Hypercube (하이퍼큐브에서의 효과적인 프로세서할당 기법)

  • Son, Yoo-Ek;Nam, Jae-Yeal
    • The Transactions of the Korea Information Processing Society / v.3 no.4 / pp.781-790 / 1996
  • Processors must be allocated to incoming tasks in a way that maximizes processor utilization and minimizes system fragmentation. Thus, an efficient method of allocating processors in a hypercube is key to system performance. To achieve this goal, it is necessary to detect the availability of a subcube of the required size and to merge released small cubes into larger ones. This paper presents the tree-exchange algorithm, which determines the levels and partners of the binary tree representation of a hypercube, and an efficient allocation strategy using the algorithm. The search-time complexity of the algorithm is $O(\ulcorner n/2 \lrcorner \times 2^{n})$, and it shows good performance in comparison with other strategies.
