• Title/Summary/Keyword: GPU Sharing

Search Result 10, Processing Time 0.024 seconds

GPU Resource Contention Management Technique for Simultaneous GPU Tasks in the Container Environments with Share the GPU (GPU를 공유하는 컨테이너 환경에서 GPU 작업의 동시 실행을 위한 GPU 자원 경쟁 관리기법)

  • Kang, Jihun
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.10
    • /
    • pp.333-344
    • /
    • 2022
  • In a container-based cloud environment, multiple containers can share a graphical processing unit (GPU), and GPU sharing can minimize idle time of GPU resources and improve resource utilization. However, in a cloud environment, GPUs, unlike CPU or memory, cannot logically multiplex computing resources to provide users with some of the resources in an isolated form. In addition, containers occupy GPU resources only when performing GPU operations, and resource usage is also unknown because the timing or size of each container's GPU operations is not known in advance. Containers unrestricted use of GPU resources at any given point in time makes managing resource contention very difficult owing to where multiple containers run GPU tasks simultaneously, and GPU tasks are handled in black box form inside the GPU. In this paper, we propose a container management technique to prevent performance degradation caused by resource competition when multiple containers execute GPU tasks simultaneously. Also, this paper demonstrates the efficiency of container management techniques that analyze and propose the problem of degradation due to resource competition when multiple containers execute GPU tasks simultaneously through experiments.

KAWS: Coordinate Kernel-Aware Warp Scheduling and Warp Sharing Mechanism for Advanced GPUs

  • Vo, Viet Tan;Kim, Cheol Hong
    • Journal of Information Processing Systems
    • /
    • v.17 no.6
    • /
    • pp.1157-1169
    • /
    • 2021
  • Modern graphics processor unit (GPU) architectures offer significant hardware resource enhancements for parallel computing. However, without software optimization, GPUs continuously exhibit hardware resource underutilization. In this paper, we indicate the need to alter different warp scheduler schemes during different kernel execution periods to improve resource utilization. Existing warp schedulers cannot be aware of the kernel progress to provide an effective scheduling policy. In addition, we identified the potential for improving resource utilization for multiple-warp-scheduler GPUs by sharing stalling warps with selected warp schedulers. To address the efficiency issue of the present GPU, we coordinated the kernel-aware warp scheduler and warp sharing mechanism (KAWS). The proposed warp scheduler acknowledges the execution progress of the running kernel to adapt to a more effective scheduling policy when the kernel progress attains a point of resource underutilization. Meanwhile, the warp-sharing mechanism distributes stalling warps to different warp schedulers wherein the execution pipeline unit is ready. Our design achieves performance that is on an average higher than that of the traditional warp scheduler by 7.97% and employs marginal additional hardware overhead.

Direct Pass-Through based GPU Virtualization for Biologic Applications (바이오 응용을 위한 직접 통로 기반의 GPU 가상화)

  • Choi, Dong Hoon;Jo, Heeseung;Lee, Myungho
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.2
    • /
    • pp.113-118
    • /
    • 2013
  • The current GPU virtualization techniques incur large overheads when executing application programs mainly due to the fine-grain time-sharing scheduling of the GPU among multiple Virtual Machines (VMs). Besides, the current techniques lack of portability, because they include the APIs for the GPU computations in the VM monitor. In this paper, we propose a low overhead and high performance GPU virtualization approach on a heterogeneous HPC system based on the open-source Xen. Our proposed techniques are tailored to the bio applications. In our virtualization framework, we allow a VM to solely occupy a GPU once the VM is assigned a GPU instead of relying on the time-sharing the GPU. This improves the performance of the applications and the utilization of the GPUs. Our techniques also allow a direct pass-through to the GPU by using the IOMMU virtualization features embedded in the hardware for the high portability. Experimental studies using microbiology genome analysis applications show that our proposed techniques based on the direct pass-through significantly reduce the overheads compared with the previous Domain0 based approaches. Furthermore, our approach closely matches the performance for the applications to the bare machine or rather improves the performance.

GPU Memory Management Technique to Improve the Performance of GPGPU Task of Virtual Machines in RPC-Based GPU Virtualization Environments (RPC 기반 GPU 가상화 환경에서 가상머신의 GPGPU 작업 성능 향상을 위한 GPU 메모리 관리 기법)

  • Kang, Jihun
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.10 no.5
    • /
    • pp.123-136
    • /
    • 2021
  • RPC (Remote Procedure Call)-based Graphics Processing Unit (GPU) virtualization technology is one of the technologies for sharing GPUs with multiple user virtual machines. However, in a cloud environment, unlike CPU or memory, general GPUs do not provide a resource isolation technology that can limit the resource usage of virtual machines. In particular, in an RPC-based virtualization environment, since GPU tasks executed in each virtual machine are performed in the form of multi-process, the lack of resource isolation technology causes performance degradation due to resource competition. In addition, the GPU memory competition accelerates the performance degradation as the resource demand of the virtual machines increases, and the fairness decreases because it cannot guarantee equal performance between virtual machines. This paper, in the RPC-based GPU virtualization environment, analyzes the performance degradation problem caused by resource contention when the GPU memory requirement of virtual machines exceeds the available GPU memory capacity and proposes a GPU memory management technique to solve this problem. Also, experiments show that the GPU memory management technique proposed in this paper can improve the performance of GPGPU tasks.

A design of GPU container co-execution framework measuring interference among applications (GPU 컨테이너 동시 실행에 따른 응용의 간섭 측정 프레임워크 설계)

  • Kim, Sejin;Kim, Yoonhee
    • KNOM Review
    • /
    • v.23 no.1
    • /
    • pp.43-50
    • /
    • 2020
  • As General Purpose Graphics Processing Unit (GPGPU) recently plays an essential role in high-performance computing, several cloud service providers offer GPU service. Most cluster orchestration platforms in a cloud environment using containers allocate the integer number of GPU to jobs and do not allow a node shared with other jobs. In this case, resource utilization of a GPU node might be low if a job does not intensively require either many cores or large size of memory in GPU. GPU virtualization brings opportunities to realize kernel concurrency and share resources. However, performance may vary depending on characteristics of applications running concurrently and interference among them due to resource contention on a node. This paper proposes GPU container co-execution framework with multiple server creation and execution based on Kubernetes, container orchestration platform for measuring interference which may be occurred by sharing GPU resources. Performance changes according to scheduling policies were investigated by executing several jobs on GPU. The result shows that optimal scheduling is not possible only considering GPU memory and computing resource usage. Interference caused by co-execution among applications is measured using the framework.

Performance Management Technique of Remote VR Service for Multiple Users in Container-Based Cloud Environments Sharing GPU (GPU를 공유하는 컨테이너 기반 클라우드 환경에서 다수의 사용자를 위한 원격 VR 서비스의 성능 관리 기법)

  • Kang, Jihun
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.1
    • /
    • pp.9-22
    • /
    • 2022
  • Virtual Reality(VR) technology is an interface technology that is actively used in various audio-visual-based applications by showing users a virtual world composed of computer graphics. Since VR-based applications are graphic processing-based applications, expensive computing devices equipped with Graphics Processing Unit(GPU) are essential for graphic processing. This incurs a cost burden on VR application users for maintaining and managing computing devices, and as one of the solutions to this, a method of operating services in cloud environments is being used. This paper proposes a performance management technique to address the problem of performance interference between containers owing to GPU resource competition in container-based high-performance cloud environments in which multiple containers share a single GPU. The proposed technique reduces performance deviation due to performance interference, helping provide uniform performance-based remote VR services for users. In addition, this paper verifies the efficiency of the proposed technique through experiments.

A Execution Performance Analysis of Applications using Multi-Process Service over GPU (다중 프로세스 서비스를 이용한 GPU 응용 동시 실행 성능 분석)

  • Kim, Se-Jin;Oh, Ji-Sun;Kim, Yoonhee
    • KNOM Review
    • /
    • v.22 no.1
    • /
    • pp.60-67
    • /
    • 2019
  • Graphical Processing Units(GPUs) achieve high performance undertaking from relatively uniformed computation in parallel. The technology related to General Purpose GPU(GPGPU) has been enhanced, which provides concurrent kernel execution of multi and diverse applications at the same time, but it is still limited to support resource sharing or planning. NVIDIA recently introduces Multi-Process Service(MPS), which allows kernels from different applications can be execute concurrently. However, the strength of MPS comes along with the characteristics of applications and the order of their execution. This paper shows the performance analysis of diverse scientific applications in real world. Based on the analysis, we prove that it is important to the identify characteristics of co-run applications, and to schedule multiple applications via profiling to maximize MPS functionality.

An Analysis of Existing Studies on Parallel and Distributed Processing of the Rete Algorithm (Rete 알고리즘의 병렬 및 분산 처리에 관한 기존 연구 분석)

  • Kim, Jaehoon
    • The Journal of Korean Institute of Information Technology
    • /
    • v.17 no.7
    • /
    • pp.31-45
    • /
    • 2019
  • The core technologies for intelligent services today are deep learning, that is neural networks, and parallel and distributed processing technologies such as GPU parallel computing and big data. However, for intelligent services and knowledge sharing services through globally shared ontologies in the future, there is a technology that is better than the neural networks for representing and reasoning knowledge. It is a knowledge representation of IF-THEN in RIF or SWRL, which is the standard rule language of the Semantic Web, and can be inferred efficiently using the rete algorithm. However, when the number of rules processed by the rete algorithm running on a single computer is 100,000, its performance becomes very poor with several tens of minutes, and there is an obvious limitation. Therefore, in this paper, we analyze the past and current studies on parallel and distributed processing of rete algorithm, and examine what aspects should be considered to implement an efficient rete algorithm.

CANVAS: A Cloud-based Research Data Analytics Environment and System

  • Kim, Seongchan;Song, Sa-kwang
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.10
    • /
    • pp.117-124
    • /
    • 2021
  • In this paper, we propose CANVAS (Creative ANalytics enVironment And System), an analytics system of the National Research Data Platform (DataON). CANVAS is a personalized analytics cloud service for researchers who need computing resources and tools for research data analysis. CANVAS is designed in consideration of scalability based on micro-services architecture and was built on top of open-source software such as eGovernment Standard framework (Spring framework), Kubernetes, and JupyterLab. The built system provides personalized analytics environments to multiple users, enabling high-speed and large-capacity analysis by utilizing high-performance cloud infrastructure (CPU/GPU). More specifically, modeling and processing data is possible in JupyterLab or GUI workflow environment. Since CANVAS shares data with DataON, the research data registered by users or downloaded data can be directly processed in the CANVAS. As a result, CANVAS enhances the convenience of data analysis for users in DataON and contributes to the sharing and utilization of research data.

Deriving adoption strategies of deep learning open source framework through case studies (딥러닝 오픈소스 프레임워크의 사례연구를 통한 도입 전략 도출)

  • Choi, Eunjoo;Lee, Junyeong;Han, Ingoo
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.4
    • /
    • pp.27-65
    • /
    • 2020
  • Many companies on information and communication technology make public their own developed AI technology, for example, Google's TensorFlow, Facebook's PyTorch, Microsoft's CNTK. By releasing deep learning open source software to the public, the relationship with the developer community and the artificial intelligence (AI) ecosystem can be strengthened, and users can perform experiment, implementation and improvement of it. Accordingly, the field of machine learning is growing rapidly, and developers are using and reproducing various learning algorithms in each field. Although various analysis of open source software has been made, there is a lack of studies to help develop or use deep learning open source software in the industry. This study thus attempts to derive a strategy for adopting the framework through case studies of a deep learning open source framework. Based on the technology-organization-environment (TOE) framework and literature review related to the adoption of open source software, we employed the case study framework that includes technological factors as perceived relative advantage, perceived compatibility, perceived complexity, and perceived trialability, organizational factors as management support and knowledge & expertise, and environmental factors as availability of technology skills and services, and platform long term viability. We conducted a case study analysis of three companies' adoption cases (two cases of success and one case of failure) and revealed that seven out of eight TOE factors and several factors regarding company, team and resource are significant for the adoption of deep learning open source framework. By organizing the case study analysis results, we provided five important success factors for adopting deep learning framework: the knowledge and expertise of developers in the team, hardware (GPU) environment, data enterprise cooperation system, deep learning framework platform, deep learning framework work tool service. In order for an organization to successfully adopt a deep learning open source framework, at the stage of using the framework, first, the hardware (GPU) environment for AI R&D group must support the knowledge and expertise of the developers in the team. Second, it is necessary to support the use of deep learning frameworks by research developers through collecting and managing data inside and outside the company with a data enterprise cooperation system. Third, deep learning research expertise must be supplemented through cooperation with researchers from academic institutions such as universities and research institutes. Satisfying three procedures in the stage of using the deep learning framework, companies will increase the number of deep learning research developers, the ability to use the deep learning framework, and the support of GPU resource. In the proliferation stage of the deep learning framework, fourth, a company makes the deep learning framework platform that improves the research efficiency and effectiveness of the developers, for example, the optimization of the hardware (GPU) environment automatically. Fifth, the deep learning framework tool service team complements the developers' expertise through sharing the information of the external deep learning open source framework community to the in-house community and activating developer retraining and seminars. To implement the identified five success factors, a step-by-step enterprise procedure for adoption of the deep learning framework was proposed: defining the project problem, confirming whether the deep learning methodology is the right method, confirming whether the deep learning framework is the right tool, using the deep learning framework by the enterprise, spreading the framework of the enterprise. The first three steps (i.e. defining the project problem, confirming whether the deep learning methodology is the right method, and confirming whether the deep learning framework is the right tool) are pre-considerations to adopt a deep learning open source framework. After the three pre-considerations steps are clear, next two steps (i.e. using the deep learning framework by the enterprise and spreading the framework of the enterprise) can be processed. In the fourth step, the knowledge and expertise of developers in the team are important in addition to hardware (GPU) environment and data enterprise cooperation system. In final step, five important factors are realized for a successful adoption of the deep learning open source framework. This study provides strategic implications for companies adopting or using deep learning framework according to the needs of each industry and business.