• Title/Summary/Keyword: Parallel-Machine Scheduling


A Study on Machine Learning Compiler and Modulo Scheduler (머신러닝 컴파일러와 모듈로 스케쥴러에 관한 연구)

  • Doosan Cho
    • Journal of the Korean Society of Industry Convergence, v.27 no.1, pp.87-95, 2024
  • This study examines modulo scheduling algorithms for multicore processors in machine learning applications. Machine learning algorithms perform large numbers of vector and matrix operations in order to process large data streams quickly. To support this volume of computation, processor architectures for applications such as artificial intelligence, neural networks, and machine learning are designed for parallel processing, for example as multicore designs. Various compiler techniques are being used and studied to utilize these multicore hardware resources effectively. Among these techniques, this study analyzes the modulo scheduler, which is especially important in the computation pipeline of a single core. The paper compares the iterative modulo scheduler and the swing modulo scheduler, the two most widely used and studied. Both schedulers delivered similar performance, and when register pressure was used as the indicator, the swing modulo scheduler performed slightly better. The study also proposes a technique that divides recurrence edges to improve the minimum initiation interval of modulo schedulers.
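
For readers unfamiliar with the term, the minimum initiation interval (MII) mentioned in the abstract is commonly taken as the larger of a resource bound and a recurrence bound. The sketch below illustrates that textbook lower bound only; it is not the edge-dividing technique proposed in the paper. The function names, the resource classes, and the numbers are hypothetical, and the code assumes fully pipelined units where each operation occupies one unit for one cycle.

```python
from math import ceil

def res_mii(op_counts, unit_counts):
    """Resource bound: the operations of one iteration must fit on the
    available units of each resource class within II cycles."""
    return max((ceil(op_counts[r] / unit_counts[r]) for r in op_counts), default=1)

def rec_mii(cycles):
    """Recurrence bound: a dependence cycle with total latency L and total
    iteration distance D requires II >= ceil(L / D)."""
    return max((ceil(latency / distance) for latency, distance in cycles), default=1)

def min_initiation_interval(op_counts, unit_counts, cycles):
    """Lower bound on the initiation interval of a modulo schedule."""
    return max(res_mii(op_counts, unit_counts), rec_mii(cycles))

# Hypothetical loop body: 6 ALU ops on 2 ALUs, 2 memory ops on 1 load/store
# unit, and one recurrence with total latency 7 spanning 2 iterations.
ops = {"alu": 6, "mem": 2}
units = {"alu": 2, "mem": 1}
recurrences = [(7, 2)]

print(min_initiation_interval(ops, units, recurrences))  # prints 4
```

Here the resource bound is 3 and the recurrence bound is 4, so the recurrence limits the loop. This is the situation in which shortening or restructuring a recurrence can lower the bound.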

A Genetic Algorithm for Production Scheduling of Biopharmaceutical Contract Manufacturing Products (바이오의약품 위탁생산 일정계획 수립을 위한 유전자 알고리즘)

  • Ji-Hoon Kim;Jeong-Hyun Kim;Jae-Gon Kim
    • The Journal of Bigdata, v.9 no.1, pp.141-152, 2024
  • In the biopharmaceutical contract manufacturing organization (CMO) business, establishing a production schedule that meets the due dates of various customer orders is crucial for competitiveness. In a CMO process, each order consists of multiple batches that can be allocated to multiple production lines in small batch units for parallel production. This study proposes a meta-heuristic algorithm that builds a schedule minimizing the total delivery delay of orders in a CMO process with identical parallel machines. Inspired by biological evolution, the proposed algorithm generates random chromosome-like data structures that encode candidate solutions and explores the solution space through operations such as crossover and mutation. Computer experiments based on real-world data provided by a domestic CMO company verified that the proposed algorithm produces better schedules than the expert algorithms used by the company and commercial optimization packages, within a reasonable computation time.
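
As a rough illustration of the general approach the abstract describes, the following is a minimal genetic algorithm sketch for total tardiness on identical parallel machines. It is not the authors' algorithm. The permutation encoding, the earliest-free-machine decoder, order crossover, swap mutation, truncation selection, and all parameter values are assumptions chosen for brevity. The sketch also treats each job as a single unit with one due date, rather than as an order made of several batches.

```python
import heapq
import random

def total_tardiness(order, proc, due, machines):
    """Decode a job permutation: each job, in order, goes to the machine
    that becomes free first. Returns the summed tardiness over all jobs."""
    free_at = [0] * machines              # min-heap of machine release times
    tardiness = 0
    for j in order:
        start = heapq.heappop(free_at)
        finish = start + proc[j]
        heapq.heappush(free_at, finish)
        tardiness += max(0, finish - due[j])
    return tardiness

def order_crossover(p1, p2, rng):
    """Copy a random slice from p1 and fill the rest in p2's relative order."""
    a, b = sorted(rng.sample(range(len(p1)), 2))
    middle = p1[a:b + 1]
    rest = [g for g in p2 if g not in middle]
    return rest[:a] + middle + rest[a:]

def genetic_schedule(proc, due, machines, pop_size=30, generations=100, seed=0):
    rng = random.Random(seed)             # seeded for repeatable runs
    n = len(proc)
    cost = lambda c: total_tardiness(c, proc, due, machines)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[:pop_size // 2]   # truncation selection keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = order_crossover(p1, p2, rng)
            if rng.random() < 0.2:             # swap mutation
                i, k = rng.sample(range(n), 2)
                child[i], child[k] = child[k], child[i]
            children.append(child)
        population = parents + children
    best = min(population, key=cost)
    return best, cost(best)

# Hypothetical instance: six jobs on two identical machines.
proc = [4, 2, 6, 3, 5, 1]
due = [5, 3, 12, 6, 9, 2]
best_order, best_cost = genetic_schedule(proc, due, machines=2)
print(best_order, best_cost)
```

Because the better half of each generation survives unchanged, the best cost never gets worse from one generation to the next. A production implementation would likely need a richer encoding to represent the batch-to-line allocation described in the abstract.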

Scheduling and Load Balancing Methods of Multithread Parallel Linear Solver of Finite Element Structural Analysis (유한요소 구조해석 다중쓰레드 병렬 선형해법의 스케쥴링 및 부하 조절 기법 연구)

  • Kim, Min Ki;Kim, Seung Jo
    • Journal of the Korean Society for Aeronautical & Space Sciences, v.42 no.5, pp.361-367, 2014
  • In this paper, task scheduling and load balancing methods for multifrontal solution methods of finite element structural analysis on a modern multicore machine are introduced. Many structural analysis problems have irregular grids and many kinds of properties and materials. These irregularities and heterogeneities create parallelization bottlenecks and cause idle time during analysis. Task scheduling and load balancing are therefore needed to reduce this inefficiency. Several multithreaded parallelization methods are presented, and static and dynamic task scheduling are compared. To reduce the idle time caused by irregularly partitioned subdomains, two computational load balancing methods are devised: balancing all tasks and minmax task pairing balancing. Theoretical and actual elapsed times are shown, and the reasons for the gap between them are discussed.

A Genetic Algorithm and Discrete-Event Simulation Approach to the Dynamic Scheduling (유전 알고리즘과 시뮬레이션을 통한 동적 스케줄링)

  • Yoon, Sanghan;Lee, Jonghwan;Jung, Gwan-Young;Lee, Hyunsoo;Wie, Doyeong;Jeong, Jiyong;Seo, Yeongbok
    • Journal of Korean Society of Industrial and Systems Engineering, v.36 no.4, pp.116-122, 2013
  • This study develops a dynamic scheduling model for the parallel machine scheduling problem based on a genetic algorithm (GA). The GA is combined with discrete-event simulation to minimize the makespan, and the effectiveness of the developed model is verified. The research consists of two stages. In the first stage, a work sequence is generated using the GA. In the second stage, the resulting work schedule is applied to a real work area to verify that it can be executed in a real work environment and to remove overlapping work, which causes bottlenecks and long lead times. If the schedule cannot be executed, the procedure returns to the first stage and develops another schedule until a satisfactory one is found. A small problem was tested, and the model suggested a reasonable schedule within limited resources. As a result, work efficiency increased, cycle time decreased, and due dates were met with existing resources.

A Heuristic for parallel Machine Scheduling Depending on Job Characteristics (작업의 특성에 종속되는 병렬기계의 일정계획을 위한 발견적 기법)

  • 이동현;이경근;김재균;박창권;장길상
    • Journal of the Korean Operations Research and Management Science Society, v.17 no.1, pp.41-41, 1992
  • In real-world situations, some jobs can be processed only on certain machines. This occurs frequently because of machine capacity restrictions involving tools, fixtures, or material handling equipment. In this paper we consider an n-job, non-preemptive scheduling problem on m parallel machines divided into two machine groups. The objective is to minimize the sum of earliness and tardiness with different release times and due dates. The problem is formulated as a mixed integer programming problem and is proved to be NP-complete, so a heuristic is developed to solve it. To illustrate its suitability and efficiency, the proposed heuristic is compared with a genetic algorithm and tabu search on a large number of randomly generated test problems in a ship engine assembly shop. The experimental results show that the proposed algorithm yields good solutions efficiently.
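
The earliness and tardiness objective stated in the abstract can be written out as follows. This is a generic formulation, not the paper's mixed integer program. The symbols are assumptions introduced here: job $j$ has release time $r_j$, processing time $p_j$, due date $d_j$, start time $S_j$, and completion time $C_j$.

```latex
\min \sum_{j=1}^{n} \left( E_j + T_j \right),
\qquad
E_j = \max(0,\; d_j - C_j),
\qquad
T_j = \max(0,\; C_j - d_j),
\]
\[
\text{subject to}\quad S_j \ge r_j, \qquad C_j = S_j + p_j, \qquad j = 1, \dots, n.
```

For any job at most one of $E_j$ and $T_j$ is nonzero, so the objective equals $\sum_j |C_j - d_j|$. The machine assignment and non-overlap constraints of the full model, including the restriction of some jobs to one machine group, are omitted from this sketch.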

Improved Dispatching Algorithm for Satisfying both Quality and Due Date (품질과 납기를 동시에 만족하는 작업투입 개선에 관한 연구)

  • Yoon, Ji-Myoung;Ko, Hyo-Heon;Baek, Jong-Kwan;Kim, Sung-Shick
    • Journal of the Korea Academia-Industrial cooperation Society, v.9 no.6, pp.1838-1855, 2008
  • The manufacturing industry seeks improvements in the efficiency of the manufacturing process. This paper presents a method for effective real-time dispatching on parallel machines with multiple products that minimizes mean tardiness and maximizes product quality. It shows that using the Rolling Horizon Tabu search method in real-time dispatching can minimize mean tardiness. The effectiveness of the method was examined in simulation and compared with other dispatching methods. Using this method, manufacturing companies can increase profits and improve customer satisfaction.

GPGPU Task Management Technique to Mitigate Performance Degradation of Virtual Machines due to GPU Operation in Cloud Environments (클라우드 환경에서 GPU 연산으로 인한 가상머신의 성능 저하를 완화하는 GPGPU 작업 관리 기법)

  • Kang, Jihun;Gil, Joon-Min
    • KIPS Transactions on Computer and Communication Systems, v.9 no.9, pp.189-196, 2020
  • Recently, GPU cloud computing, which assigns GPU (Graphics Processing Unit) devices to virtual machines, has become widely used in cloud environments. GPU devices assigned to virtual machines can perform operations faster than CPUs through massively parallel processing, which benefits high-performance computing services in a variety of fields. However, although a GPU device can improve the performance of a virtual machine, the virtual machine scheduler is based on the CPU usage time of a virtual machine and does not take GPU usage time into account, which affects the performance of other virtual machines. In this paper, we test and analyze the performance degradation that a virtual machine performing GPGPU (General-Purpose computing on Graphics Processing Units) tasks causes to other virtual machines in a direct-path-based GPU virtualization environment, which is often used when assigning GPUs to virtual machines in cloud environments. To solve this problem, we propose a GPGPU task management method for virtual machines.

Machine Scheduling Models Based on Reinforcement Learning for Minimizing Due Date Violation and Setup Change (납기 위반 및 셋업 최소화를 위한 강화학습 기반의 설비 일정계획 모델)

  • Yoo, Woosik;Seo, Juhyeok;Kim, Dahee;Kim, Kwanho
    • The Journal of Society for e-Business Studies, v.24 no.3, pp.19-33, 2019
  • Recently, manufacturers have been struggling to use production equipment efficiently as their production methods become more sophisticated and complex. A typical factor hindering the efficiency of the manufacturing process is the setup cost incurred by job changes. Efficient use of equipment is especially important in processes that use expensive production equipment, such as semiconductor and LCD processes. Balancing the tradeoff between meeting deadlines and minimizing the setup cost incurred by changes of work type is a crucial planning task. In this study, we developed a scheduling model that uses reinforcement learning to minimize due date violations and setup costs on parallel machines with due dates and setup costs. The proposed model is a reinforcement learning-based Deep Q-Network (DQN) scheduling model. To validate its effectiveness, we compared it against a heuristic model and a DNN (deep neural network) based model. The results confirmed that the proposed DQN method incurs fewer due date violations and lower setup costs than the benchmark methods.

Design and Implementation of An I/O System for Irregular Application under Parallel System Environments (병렬 시스템 환경하에서 비정형 응용 프로그램을 위한 입출력 시스템의 설계 및 구현)

  • No, Jae-Chun;Park, Seong-Sun;Gwon, O-Yeong
    • Journal of KIISE:Computer Systems and Theory, v.26 no.11, pp.1318-1332, 1999
  • In this paper we present the design, implementation, and evaluation of a runtime system based on collective I/O techniques for irregular applications. We present two designs, "Collective I/O" and "Pipelined Collective I/O". In the first scheme, all processors participate in the I/O simultaneously, which makes scheduling of I/O requests simpler but creates a possibility of contention at the I/O nodes. In the second approach, processors are divided into several groups so that only one group performs I/O at a time while the next group performs communication to rearrange data, and the entire process is pipelined to reduce I/O node contention dynamically. In other words, the design provides support for dynamic contention management. We then present a software caching method using collective I/O that reduces I/O cost by reusing data already present in the memory of other nodes. Finally, chunking and on-line compression mechanisms are included in both models. We demonstrate that significantly higher I/O performance can be obtained than has been possible so far. Performance results are presented on an Intel Paragon and on the ASCI/Red teraflops machine. Application-level I/O bandwidth of up to 55% of the peak is observed.