• Title/Summary/Keyword: vCPU

Improving Performance of I/O Virtualization Framework based on Multi-queue SSD (다중 큐 SSD 기반 I/O 가상화 프레임워크의 성능 향상 기법)

  • Kim, Tae Yong;Kang, Dong Hyun;Eom, Young Ik
    • Journal of KIISE / v.43 no.1 / pp.27-33 / 2016
  • Virtualization has become one of the most helpful techniques in computing systems, and today it is prevalent in several computing environments including desktops, data-centers, and enterprises. However, since I/O layers are implemented to be oblivious to the I/O behaviors on virtual machines (VM), there still exists an I/O scalability issue in virtualized systems. In particular, when a multi-queue solid state drive (SSD) is used as a secondary storage, each system reveals a semantic gap that degrades the overall performance of the VM. This is due to two key problems, accelerated lock contentions and the I/O parallelism issue. In this paper, we propose a novel approach, including the design of virtual CPU (vCPU)-dedicated queues and I/O threads, which efficiently distributes the lock contentions and addresses the parallelism issue of Virtio-blk-data-plane in virtualized environments. Our approach is based on the above principle, which allocates a dedicated queue and an I/O thread for each vCPU to reduce the semantic gap. Our experimental results with various I/O traces clearly show that our design improves the I/O operations per second (IOPS) in virtualized environments by up to 155% over existing QEMU-based systems.
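The per-vCPU idea in the abstract above can be illustrated with a minimal sketch. It is not QEMU or Virtio-blk-data-plane code: `NUM_VCPUS`, `io_worker`, and `run` are hypothetical names, and appending to a list stands in for submitting a real I/O request. Each vCPU gets its own queue and its own I/O thread, so no two threads contend on the same queue.

```python
import queue
import threading

NUM_VCPUS = 4  # hypothetical vCPU count for this sketch


def io_worker(q, completed):
    """Drain one vCPU's private queue; no other thread touches q or completed."""
    while True:
        request = q.get()
        if request is None:  # sentinel: no more requests for this vCPU
            break
        completed.append(request)  # stand-in for submitting the I/O


def run(requests_per_vcpu):
    """Give every vCPU a dedicated queue and a dedicated I/O thread."""
    queues = [queue.Queue() for _ in range(NUM_VCPUS)]
    results = [[] for _ in range(NUM_VCPUS)]
    threads = [
        threading.Thread(target=io_worker, args=(queues[i], results[i]))
        for i in range(NUM_VCPUS)
    ]
    for t in threads:
        t.start()
    for vcpu in range(NUM_VCPUS):
        for n in range(requests_per_vcpu):
            queues[vcpu].put((vcpu, n))  # request stays on its vCPU's queue
        queues[vcpu].put(None)
    for t in threads:
        t.join()
    return results
```

With a single shared queue, every thread would take the same lock on each request; here each lock is only ever taken by one producer and one consumer.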

Implementation of the Shared Memory in the Dual Core System (Dual Core 시스템에서 Shared Memory 기능 구현)

  • Jang, Seung-Ju
    • The Journal of the Korea Contents Association / v.8 no.9 / pp.27-33 / 2008
  • This paper designs shared memory for a dual-core system so that general System V IPC operates on the Linux OS. Shared memory is a technique that lets multiple processes access the same memory area. We treat SVR shared memory at the kernel level and design a shared memory facility for the Linux operating system on a dual-core system. The proposed design aims to improve on the performance of an existing uniprocessor system by making practical use of the dual core. We attempt to improve performance on each CPU for each process that uses shared memory.

The Design of the Shared Memory in the Dual Core System (Dual Core 시스템에서 Shared Memory 기능 설계)

  • Jang, Seung-Ju;Lee, Gwang-Yong;Kim, Jae-Myeong
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.8 / pp.1448-1455 / 2008
  • This paper designs shared memory for a dual-core system so that general System V IPC operates on the Linux OS. Shared memory is a technique that lets multiple processes access the same memory area. Of the two main branches of shared memory, this paper treats the SVR form at the kernel level. We design a shared memory facility for the Linux operating system on a dual-core system. The proposed design aims to improve on the performance of an existing uniprocessor system by making practical use of the dual core. We attempt to improve performance on each CPU for each process that uses shared memory.
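As a user-space illustration of the basic idea (several parties attaching to one memory area), here is a minimal sketch. It is an assumption-laden stand-in: it uses Python's `multiprocessing.shared_memory`, which is POSIX-style shared memory rather than the System V IPC the two papers above design at the kernel level, and `share_bytes` is a hypothetical helper name. Both "sides" run in one process to keep the example short.

```python
from multiprocessing import shared_memory


def share_bytes(payload: bytes) -> bytes:
    """Write payload into a named segment, then read it back through a second
    attachment to the same segment (payload must be non-empty)."""
    # Creator side: allocate a named shared memory segment.
    segment = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        segment.buf[:len(payload)] = payload
        # Second side: attach to the same segment by its name.
        view = shared_memory.SharedMemory(name=segment.name)
        try:
            return bytes(view.buf[:len(payload)])
        finally:
            view.close()
    finally:
        segment.close()
        segment.unlink()  # release the segment once both sides are done
```

In a real multi-process setting the second attachment would happen in another process that receives `segment.name`.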

Low-Complexity Deeply Embedded CPU and SoC Implementation (낮은 복잡도의 Deeply Embedded 중앙처리장치 및 시스템온칩 구현)

  • Park, Chester Sungchung;Park, Sungkyung
    • Journal of the Korea Academia-Industrial cooperation Society / v.17 no.3 / pp.699-707 / 2016
  • This paper proposes a low-complexity central processing unit (CPU) that is suitable for deeply embedded systems, including Internet of things (IoT) applications. The core features a 16-bit instruction set architecture (ISA) that leads to high code density, as well as a multicycle architecture with a counter-based control unit and adder sharing that lead to a small hardware area. A co-processor, instruction cache, AMBA bus, internal SRAM, external memory, on-chip debugger (OCD), and peripheral I/Os are placed around the core to make a system-on-a-chip (SoC) platform. This platform is based on a modified Harvard architecture to facilitate memory access by reducing the number of access clock cycles. The SoC platform and CPU were simulated and verified at the C and the assembly levels, and FPGA prototyping with integrated logic analysis was carried out. The CPU was synthesized at the ASIC front-end gate netlist level using a $0.18{\mu}m$ digital CMOS technology with 1.8V supply, resulting in a gate count of merely 7700 at a 50MHz clock speed. The SoC platform was embedded in an FPGA on a miniature board and applied to deeply embedded IoT applications.

Efficient Convolutional Neural Network with low Complexity (저연산량의 효율적인 콘볼루션 신경망)

  • Lee, Chanho;Lee, Joongkyung;Ho, Cong Ahn
    • Journal of IKEEE / v.24 no.3 / pp.685-690 / 2020
  • We propose an efficient convolutional neural network with much lower computational complexity and higher accuracy based on MobileNet V2 for mobile or edge devices. The proposed network consists of bottleneck layers with larger expansion factors and an adjusted number of channels, and it excludes a few layers; as a result, the computational complexity is reduced by half. The performance of the proposed network is verified by measuring the accuracy and the execution times on CPU and GPU using the ImageNet100 dataset. In addition, the execution time on GPU depends on the CNN architecture.

Design and Implementation of Green Coastal Lighting System for Entrance to Coastal Pier

  • Jae-Kyung Lee;Jae-Hong Yim
    • Journal of Navigation and Port Research / v.47 no.2 / pp.85-92 / 2023
  • The hardware of an LED lighting control system for coastal lighting at a coastal pier entrance consists of a power supply unit, an AVR control unit, a CLCD output unit, an LED control unit, a scenario selection switch unit, and an operation speed display unit. It is built as an 8-channel system. An ATmega128 was used as the CPU, and FETs were used to control the current signal. To operate the CPU, DC 12V was converted to DC 5V using a 7805 regulator. A heat sink was used to remove heat generated in the FETs. By connecting the load LED module to the manufactured 8-channel LED lighting control system, the operation was confirmed through various production scenarios. In addition, a control system was designed to show the most suitable color for the atmosphere of the coastal pier according to the input values of temperature and illumination using a fuzzy control system. Computer simulation was then conducted. Results confirmed that fuzzy control did not need to store many data inputs due to characteristics of artificial intelligence and that it could efficiently represent many output values with simple fuzzy rules.

A 2D GPU-Accelerated High Resolution Numerical Scheme for Solving Diffusive Wave Equation (고해상도 수치기법을 이용한 GPU 기반 2D 확산파 모형)

  • Park, Seonryang;Kim, Dae-Hong
    • Proceedings of the Korea Water Resources Association Conference / 2019.05a / pp.109-109 / 2019
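  • In this study, we developed a GPU-based diffusive wave model for simulating rainfall-runoff processes. The diffusive wave equation was solved with the finite volume method, and the physical properties at each cell interface were reconstructed with the MUSCL scheme using the van Leer TVD limiter. The Horton infiltration model was used to account for infiltration. The model was used to simulate rainfall-runoff on a 1D single overland plane and a 2D V-shaped overland, and its accuracy was verified by comparison with an analytical solution and with numerical results computed by a dynamic wave model, respectively. The model was also applied to 1D and 2D terrain with strong relief to verify that it yields physically plausible rainfall-runoff results. Finally, it was applied to complex real terrain, and comparison with measurements verified the adequacy of the diffusive wave model for a real watershed. The GPU-based model was developed using NVIDIA's Geforce GTX 1050 GPU and NVIDIA's CUDA-Fortran, which can exploit the parallel processing capability of the GPU. Comparing computation speed against a CPU-based model (Intel i7, 4.70 GHz) on a Windows PC, the computational efficiency of the GPU-based model relative to the CPU-based model increased as the grid size increased, reaching a maximum of about 150 times at a grid of $3200{\times}3200$.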
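Two of the ingredients named in this abstract have compact textbook forms, sketched below under stated assumptions. The code is not the authors' CUDA-Fortran model: it is a plain 1D Python illustration, the function names are hypothetical, and it shows only the left-side MUSCL reconstruction with the van Leer limiter $\phi(r) = (r + |r|)/(1 + |r|)$ and the Horton infiltration capacity $f(t) = f_c + (f_0 - f_c)e^{-kt}$.

```python
import math


def van_leer(r):
    """van Leer flux limiter: phi(r) = (r + |r|) / (1 + |r|)."""
    return (r + abs(r)) / (1.0 + abs(r))


def muscl_left_states(u):
    """Left-side reconstructed values at interface i+1/2 for interior cells.

    u_L = u[i] + 0.5 * phi(r_i) * (u[i+1] - u[i]),
    where r_i = (u[i] - u[i-1]) / (u[i+1] - u[i]).
    """
    states = []
    for i in range(1, len(u) - 1):
        forward = u[i + 1] - u[i]
        backward = u[i] - u[i - 1]
        if forward == 0.0:
            states.append(u[i])  # flat forward difference: no correction
            continue
        r = backward / forward
        states.append(u[i] + 0.5 * van_leer(r) * forward)
    return states


def horton_rate(t, f0, fc, k):
    """Horton infiltration capacity: f(t) = fc + (f0 - fc) * exp(-k * t)."""
    return fc + (f0 - fc) * math.exp(-k * t)
```

On smooth data the limiter returns 1 and the reconstruction is second order; at a local extremum the slope ratio is negative, the limiter returns 0, and the scheme falls back to the cell average, which is what keeps the reconstruction free of new oscillations.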

A Study of n Multigrid Finite-Volume Method for Radiation (다중격자 유한체적법에 의한 복사열전달 해석)

  • Kim, Man-Young;Do, Young-Byun;Baek, Seung-Wook
    • Transactions of the Korean Society of Mechanical Engineers B / v.27 no.1 / pp.135-140 / 2003
  • The convergence of the finite volume method (FVM) or discrete ordinate method (DOM) is known to degrade for optical thickness greater than unity and large scattering albedo. The present article presents a convergence acceleration procedure for the FVM based on a full approximation storage (FAS) multigrid method. Among a variety of multigrid cycles, the V-cycle is used and the full multigrid algorithm (FMG) is applied to an analysis of radiation in irregular two-dimensional geometry. Solution convergence is discussed for several cases with various optical thicknesses and scattering albedos. At small scattering albedo and optical thickness, the multigrid method offers no advantage in CPU time. For scattering albedo greater than 0.5 and optical thickness greater than unity, however, the multigrid method improves the convergence and the solution is rapidly obtained.

An Optimization Technique for Diesel Engine Combustion Using a Micro Genetic Algorithm (유전알고리즘을 이용한 디젤엔진의 연소최적화 기법에 대한 연구)

  • 김동광;조남효;차순창;조순호
    • Transactions of the Korean Society of Automotive Engineers / v.12 no.3 / pp.51-58 / 2004
  • Optimization of engine design and operation parameters using a genetic algorithm was demonstrated for direct injection diesel engine combustion. A micro genetic algorithm and a modified KIVA-3V code were used for the analysis and optimization of the engine combustion. At each generation of the optimization step the micro genetic algorithm generated five groups of parameter sets, and the five cases of KIVA-3V analysis were to be performed either in series or in parallel. The micro genetic algorithm code was also parallelized by using MPI programming, and a multi-CPU parallel supercomputer was used to speed up the optimization process by a factor of four. An example case for a fixed engine speed was performed with six parameters: intake swirl ratio, compression ratio, fuel injection included angle, injector hole number, SOI, and injection duration. A simultaneous optimization technique for the whole range of engine speeds would be suggested for further studies.
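The general shape of a micro genetic algorithm can be sketched as follows. This is a generic illustration, not the authors' code: the population of five matches the abstract, but the tournament selection, uniform crossover, restart threshold, and the `fitness` callable (a stand-in for one KIVA-3V run) are all assumptions made for the example, with parameters normalized to [0, 1].

```python
import random


def micro_ga(fitness, n_params, generations=200, pop_size=5, seed=0):
    """Maximize fitness over vectors in [0, 1]^n_params with a tiny population."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.random() for _ in range(n_params)]

    population = [random_individual() for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            # Two binary tournaments pick the parents; uniform crossover mixes them.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            children.append(
                [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            )
        population = children
        best = max(population, key=fitness)
        # Restart: once the population has collapsed around the elite,
        # keep the elite and refill the rest with random individuals.
        spread = max(abs(x - y) for ind in population for x, y in zip(ind, best))
        if spread < 0.05:
            population = [best[:]] + [
                random_individual() for _ in range(pop_size - 1)
            ]
    return best
```

Because every generation evaluates only five candidates, the five evaluations are independent and can run in series or in parallel, which is the property the abstract exploits with MPI.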

An approach to model the temperature effects on I-V characteristics of CNTFETs

  • Marani, Roberto;Perri, Anna G.
    • Advances in nano research / v.5 no.1 / pp.61-67 / 2017
  • A semi-empirical approach to model the temperature effects on I-V characteristics of Carbon Nanotube Field Effect Transistors (CNTFETs) is proposed. The model includes two thermal parameters describing CNTFET behaviour in terms of saturation drain current and threshold voltage, whose values are extracted from the simulated and trans-characteristics of the device in different temperature conditions. Our results are compared with those of a numerical model available online, obtaining comparable I-V characteristics with a lower CPU calculation time.