• Title/Summary/Keyword: 병렬 I/O (Parallel I/O)

A Parallel I/O System on Workstation Clustering Environment for Irregular Applications (비정형 응용을 위한 워크스테이션 클러스터링 환경에서의 병렬 입출력 시스템)

  • No, Jae-Chun;Park, Sung-Soon;Choudhary, Alok
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.27 no.5
    • /
    • pp.496-505
    • /
    • 2000
  • Clusters of workstations (COW) are becoming an attractive option for parallel scientific computing, a field formerly reserved to MPPs, because their cost-performance ratio is usually better than that of comparable MPPs, and their hardware and software can easily be upgraded to the latest generations. In this paper we present the design and implementation of our runtime library for clusters of workstations, called "Collective I/O Clustering". The library provides a friendly programming model for the I/O of irregular applications on clusters of workstations and is completely integrated with the underlying communication and I/O system. In collective I/O clustering, two I/O configurations are possible. In the first configuration, all allocated processors can act as I/O servers as well as compute nodes. In the second configuration, only a subset of the processors act as I/O servers. Compression and software caching facilities have been incorporated into collective I/O clustering to optimize the communication and I/O costs. All performance results were obtained on the IBM SP machine located at Argonne National Laboratory. (An illustrative sketch of collective I/O follows this entry.)
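
The collective I/O idea underlying this library can be illustrated with a minimal sketch in standard MPI-IO, where all processes cooperate in a single collective write call. This is only an illustration of the general technique, not the Collective I/O Clustering library itself; the file name and block size are arbitrary assumptions.

```c
/* Minimal collective I/O sketch using standard MPI-IO (not the paper's library).
 * Each rank writes its own contiguous block of a shared file in one collective
 * call, letting the MPI library merge and schedule the physical I/O.
 * Compile: mpicc coll_io.c -o coll_io    Run: mpirun -np 4 ./coll_io
 */
#include <mpi.h>
#include <stdlib.h>

#define BLOCK 1024  /* integers written per process (arbitrary) */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int *buf = malloc(BLOCK * sizeof(int));
    for (int i = 0; i < BLOCK; i++)
        buf[i] = rank;                      /* each rank's payload */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Collective write: every rank participates, offsets are disjoint. */
    MPI_Offset offset = (MPI_Offset)rank * BLOCK * sizeof(int);
    MPI_File_write_at_all(fh, offset, buf, BLOCK, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}
```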

Optimizing LRU Lock Management in the Linux Kernel for Improving Parallel Write Throughput in Many-Core CPU Systems (매니코어 CPU 시스템의 병렬 쓰기 성능 향상을 위한 리눅스 커널의 LRU 관리 최적화 기법)

  • Eun-Kyu Byun;Gibeom Gu;Kwang-Jin Oh;Jiwoo Bang
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.12 no.7
    • /
    • pp.209-216
    • /
    • 2023
  • Modern HPC systems are equipped with many-core CPUs containing dozens of cores. When performing parallel I/O on such a system, scalability is limited by the LRU lock management policy of the Linux kernel. This study proposes FinerLRU to solve the problem. FinerLRU improves the parallel write performance of file systems that use the buffer cache through fine-grained lock management, increasing the number of LRU locks up to the number of cores. The proposed method was implemented in Linux 5.18.11 and evaluated on two CPUs with different characteristics, Intel Ice Lake Xeon and Intel Knights Landing; a performance improvement of about two times was obtained on both types of systems. (An illustrative sketch of lock sharding follows this entry.)
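
The abstract does not include the kernel code, but the core idea of replacing one coarse LRU lock with many finer-grained locks can be sketched in user space. The shard count, structures, and function names below are illustrative assumptions, not the FinerLRU patch.

```c
/* Illustrative user-space sketch of lock sharding for an LRU list (assumption:
 * not the paper's kernel patch). A single global LRU lock serializes all
 * cores; splitting the LRU into per-shard lists, each with its own lock,
 * lets cores that touch different shards proceed concurrently. */
#include <pthread.h>
#include <stdio.h>

#define NSHARDS 64             /* e.g. up to the number of CPU cores */

struct page {                  /* stand-in for a cached page; must start zeroed */
    unsigned long id;
    struct page *prev, *next;  /* doubly linked LRU list */
};

struct lru_shard {
    pthread_mutex_t lock;
    struct page *head;         /* most recently used */
};

static struct lru_shard shards[NSHARDS];

static struct lru_shard *shard_of(unsigned long id)
{
    return &shards[id % NSHARDS];   /* hash a page to one shard */
}

/* Move (or insert) a page at the MRU end of its shard's list. Only the owning
 * shard's lock is taken, so unrelated pages do not contend. */
void lru_touch(struct page *pg)
{
    struct lru_shard *s = shard_of(pg->id);
    pthread_mutex_lock(&s->lock);
    /* unlink if already on the list */
    if (pg->prev) pg->prev->next = pg->next;
    if (pg->next) pg->next->prev = pg->prev;
    if (s->head == pg) s->head = pg->next;
    /* push to head (MRU position) */
    pg->prev = NULL;
    pg->next = s->head;
    if (s->head) s->head->prev = pg;
    s->head = pg;
    pthread_mutex_unlock(&s->lock);
}

void lru_init(void)
{
    for (int i = 0; i < NSHARDS; i++) {
        pthread_mutex_init(&shards[i].lock, NULL);
        shards[i].head = NULL;
    }
}

int main(void)
{
    lru_init();
    struct page a = { .id = 1 }, b = { .id = 65 };  /* both map to shard 1 */
    lru_touch(&a);
    lru_touch(&b);
    lru_touch(&a);                                  /* a becomes MRU again */
    printf("shard 1 MRU page id = %lu\n", shard_of(1)->head->id);
    return 0;
}
```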

A Methodology to Simulate I/O-Intensive Applications (I/O 집약적인 응용의 시뮬레이션 방법론)

  • Eom, Hyeon-Sang
    • The KIPS Transactions:PartA
    • /
    • v.13A no.5 s.102
    • /
    • pp.445-454
    • /
    • 2006
  • We introduce a family of simulators for I/O-intensive distributed or parallel applications, and a methodology that permits selecting the most efficient simulator meeting a given user-defined accuracy requirement. The methodology consists of a series of tests that choose an appropriate simulator based on the attributes of the application. In addition, each simulator provides two estimates of application execution time: the minimum expected time and the maximum. We present the results of applying our methodology to existing applications, and show that we can accurately simulate applications tens to hundreds of times faster than the application execution times. (An illustrative sketch of the selection idea follows this entry.)
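
As a rough illustration of the selection step described above, the hypothetical sketch below picks the cheapest simulator whose error bound meets a user-defined accuracy requirement and reports a [minimum, maximum] execution-time estimate. All names and numbers are invented for illustration and are not taken from the paper.

```c
/* Hypothetical sketch: choose the cheapest simulator that satisfies the
 * required accuracy, then report a (min, max) execution-time estimate. */
#include <stdio.h>

struct simulator {
    const char *name;
    double max_error;   /* worst-case relative error of this simulator */
    double sim_cost;    /* relative cost of running the simulator itself */
};

struct estimate { double t_min, t_max; };

/* Return the cheapest simulator whose error bound meets `required_error`. */
static const struct simulator *select_simulator(const struct simulator *s,
                                                int n, double required_error)
{
    const struct simulator *best = NULL;
    for (int i = 0; i < n; i++)
        if (s[i].max_error <= required_error &&
            (!best || s[i].sim_cost < best->sim_cost))
            best = &s[i];
    return best;
}

int main(void)
{
    struct simulator sims[] = {
        { "coarse-io-model",   0.30,  1.0 },   /* fast, loose bounds */
        { "detailed-io-model", 0.05, 20.0 }    /* slow, tight bounds */
    };
    const struct simulator *s = select_simulator(sims, 2, 0.10);
    if (s) {
        /* Placeholder numbers; a real simulator would derive these from the
         * application's I/O attributes. */
        struct estimate e = { 90.0, 110.0 };
        printf("%s: execution time in [%.1f, %.1f] s\n", s->name, e.t_min, e.t_max);
    }
    return 0;
}
```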

Performance Analysis of NVMe SSDs and Design of Direct Access Engine on Virtualized Environment (가상화 환경에서 NVMe SSD 성능 분석 및 직접 접근 엔진 개발)

  • Kim, Sewoog;Choi, Jongmoo
    • KIISE Transactions on Computing Practices
    • /
    • v.24 no.3
    • /
    • pp.129-137
    • /
    • 2018
  • NVMe (Non-Volatile Memory Express) SSD (Solid State Drive) is a high-performance storage device that uses flash memory as its storage cells, PCIe as its interface, and NVMe as the protocol on that interface. It supports multiple I/O queues, which makes it feasible to process parallel I/Os on multi-core systems and to provide higher bandwidth than SATA SSDs. Hence, the NVMe SSD is considered a next-generation storage device for data-center and cloud computing systems. In virtualized systems, however, the performance of NVMe SSDs is not fully utilized due to the bottleneck of the software I/O stack. In particular, when the I/O stack of a hypervisor or host operating system such as Xen or KVM is used, I/O performance degrades seriously because of the doubled I/O stack between the host and the virtual machine. In this paper, we propose a new I/O engine, called the Direct-AIO (Direct Asynchronous I/O) engine, that can access the NVMe SSD directly for I/O performance improvement in the QEMU emulator. We implement the proposed I/O engine and analyze the I/O performance differences between the existing I/O engine and the Direct-AIO engine. (An illustrative sketch of asynchronous direct I/O follows this entry.)
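
The Direct-AIO engine itself lives inside QEMU and is not reproduced here, but the mechanism it builds on, asynchronous direct I/O that bypasses the host page cache, can be sketched with Linux libaio and O_DIRECT. The file name and sizes below are placeholders.

```c
/* Illustrative sketch of asynchronous direct I/O on Linux using libaio and
 * O_DIRECT (not the paper's Direct-AIO engine). One block is written
 * asynchronously, bypassing the page cache.
 * Compile: gcc -D_GNU_SOURCE direct_aio.c -o direct_aio -laio
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLK 4096   /* O_DIRECT requires block-aligned buffers and sizes */

int main(void)
{
    int fd = open("testfile.dat", O_RDWR | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    void *buf;
    posix_memalign(&buf, BLK, BLK);          /* aligned buffer for O_DIRECT */
    memset(buf, 'x', BLK);

    io_context_t ctx = 0;
    io_setup(8, &ctx);                       /* kernel AIO context, queue depth 8 */

    struct iocb cb, *cbs[1] = { &cb };
    io_prep_pwrite(&cb, fd, buf, BLK, 0);    /* async write of one block at offset 0 */
    io_submit(ctx, 1, cbs);                  /* submit without blocking on the I/O */

    struct io_event ev;
    io_getevents(ctx, 1, 1, &ev, NULL);      /* wait for completion */
    printf("completed, result = %ld bytes\n", (long)ev.res);

    io_destroy(ctx);
    close(fd);
    free(buf);
    return 0;
}
```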

The Design of Parallel Routing Algorithm on a Recursive Circulant Network (재귀원형군에서 병렬 경로 알고리즘의 설계)

  • Bae, Yong-Keun;Park, Byung-Kwon;Chung, Il-Yong
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.11
    • /
    • pp.2701-2710
    • /
    • 1997
  • The recursive circulant graph has recently been developed as a new model of multiprocessor networks and has drawn considerable attention in supercomputing. In this paper, we investigate message routing in recursive circulant networks, which is a key to the performance of this network. On a recursive circulant network, we would like to transmit $m$ packets from a source node to a destination node simultaneously along paths, where the $i$th packet traverses the $i$th path $(0 \le i \le m-1)$. In order for all packets to arrive at the destination node quickly and securely, the $i$th path must be node-disjoint from all other paths. To construct these paths, we employ the Hamiltonian Circuit Latin Square (HCLS), a special class of $n \times n$ matrices, and present an $O(n^2)$ parallel routing algorithm for recursive circulant networks. (An illustrative sketch of the topology follows this entry.)
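
For readers unfamiliar with the topology, the sketch below enumerates the neighbors of a vertex under the standard definition of the recursive circulant graph $G(N, d)$, in which vertex $v$ is adjacent to $v \pm d^i \pmod N$ for every power $d^i < N$. This is only an assumption about the usual definition; the paper's HCLS-based disjoint-path construction is not reproduced.

```c
/* Neighbor enumeration for the recursive circulant graph G(N, d), assuming
 * the standard definition: v is adjacent to v +/- d^i (mod N) for d^i < N.
 * (When d^i = N/2 the two directions coincide and the neighbor is printed twice.) */
#include <stdio.h>

static void neighbors(int N, int d, int v)
{
    printf("G(%d,%d), vertex %d:", N, d, v);
    for (long jump = 1; jump < N; jump *= d) {          /* jump = d^i */
        printf(" %ld", (v + jump) % N);
        printf(" %ld", ((v - jump) % N + N) % N);
    }
    printf("\n");
}

int main(void)
{
    neighbors(8, 2, 0);    /* G(8,2): jumps 1, 2, 4 */
    neighbors(16, 4, 5);   /* G(16,4): jumps 1, 4   */
    return 0;
}
```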

Design and Implementation of I/O Tracer for PVFS (PVFS를 위한 I/O Tracer 설계 및 구현)

  • Hyeyoung Cho;Kwangho Cha;Sungho Kim;SangDong Lee
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2008.11a
    • /
    • pp.966-969
    • /
    • 2008
  • Many studies have aimed to collect dynamic I/O logs from file systems in operation, in order to analyze the I/O patterns of user programs or to characterize file system workloads more accurately. However, obtaining dynamic I/O logs from a file system that handles a large volume of I/O transactions is limited by the system overhead and the enormous amount of log data involved. In particular, I/O tracing on a large-scale distributed/parallel file system shared by many users is more complex and incurs more overhead than I/O tracing on a local file system. In this paper, we review existing file system logging methods and propose and design a logging system that can generate logs of dynamic I/O operations in PVFS (Parallel Virtual File System), a distributed file system widely used in cluster systems. (A generic illustration of dynamic I/O tracing follows this entry.)
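
As a generic illustration of dynamic I/O tracing, and not the PVFS-internal logging system proposed in the paper, the sketch below intercepts write() calls through LD_PRELOAD and records a timestamp, file descriptor, and size for each operation.

```c
/* Generic I/O tracing by library interposition (illustrative only; the paper
 * instruments PVFS itself). Every write() is logged with a timestamp.
 * Build: gcc -shared -fPIC -D_GNU_SOURCE iotrace.c -o iotrace.so -ldl
 * Use:   LD_PRELOAD=./iotrace.so ./your_program
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static ssize_t (*real_write)(int, const void *, size_t) = NULL;

ssize_t write(int fd, const void *buf, size_t count)
{
    if (!real_write)   /* locate the libc write() behind this wrapper */
        real_write = (ssize_t (*)(int, const void *, size_t))
                     dlsym(RTLD_NEXT, "write");

    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);

    ssize_t ret = real_write(fd, buf, count);

    /* Skip fd 2 so logging to stderr does not recurse through this wrapper. */
    if (fd != 2)
        fprintf(stderr, "[iotrace] %ld.%09ld write fd=%d bytes=%zd\n",
                (long)ts.tv_sec, ts.tv_nsec, fd, ret);
    return ret;
}
```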

Configuration System Implementation Algorithm to Manage the I/O Device of the Parallel Processing Programmable Logic Controller (병렬 처리 기법을 이용한 프로그래머블 로직 컨트롤러의 입출력 접점 관리를 위한 컨피규레이션 시스템 구현 알고리즘)

  • Kim, Kwang-Jin;Kwon, Wook-Hyun
    • Proceedings of the KIEE Conference
    • /
    • 1998.07g
    • /
    • pp.2327-2329
    • /
    • 1998
  • In this paper, an algorithm for building a configuration system to manage the I/O devices of a programmable logic controller (PLC) is proposed. A parallel processing architecture is used to handle a large number of I/O devices; with this architecture, contention between processors can arise. To resolve this problem, a configuration system that contains information about the I/O devices is introduced. The configuration system is used to check for contention between processors over I/O devices and is also used during program execution. (An illustrative sketch of such a contention check follows this entry.)
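
A hypothetical sketch of the contention check described above: a configuration table records which processor owns each output point, and a claim is rejected when another processor already owns that point. All structures and names are invented for illustration and are not from the paper.

```c
/* Hypothetical configuration-table sketch for detecting I/O contention
 * between processors in a parallel PLC (illustrative only). */
#include <stdio.h>

#define NPOINTS 8

struct io_config {
    int owner[NPOINTS];        /* processor id that may drive each output, -1 = free */
};

/* Returns 0 and records ownership if `proc` may use output `point`;
 * returns -1 if another processor already owns it (contention). */
int claim_output(struct io_config *cfg, int proc, int point)
{
    if (cfg->owner[point] != -1 && cfg->owner[point] != proc)
        return -1;             /* contention detected via the configuration */
    cfg->owner[point] = proc;
    return 0;
}

int main(void)
{
    struct io_config cfg;
    for (int i = 0; i < NPOINTS; i++) cfg.owner[i] = -1;

    printf("proc 0 claims output 0: %d\n", claim_output(&cfg, 0, 0));  /* ok */
    printf("proc 1 claims output 0: %d\n", claim_output(&cfg, 1, 0));  /* contention */
    return 0;
}
```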

Design and Implementation of the Parallel Multimedia File System on Fast Ethernet (Fast Ethernet 환경에서 병렬 멀티미디어 파일 시스템의 설계와 구현)

  • Park, Seong-Ho;Kim, Gwang-Mun;Jeong, Gi-Dong
    • The KIPS Transactions:PartB
    • /
    • v.8B no.1
    • /
    • pp.89-97
    • /
    • 2001
  • To overcome the I/O bottleneck in building a large-scale multimedia server, a two-tier distributed cluster architecture consisting of storage servers and a control server is widely used. A two-tier distributed cluster server is advantageous in terms of load balancing, bandwidth management, and storage server administration, but it introduces communication overhead between the storage servers and the control server. To reduce this overhead, media data read from a storage server should be delivered directly to the client without passing through the control server. In addition, to allow storage capacity to be expanded and damaged disks to be replaced, a distributed cluster server must support heterogeneous disks with different performance characteristics. Furthermore, as I/O devices and operating systems evolve rapidly, a media server must be easily portable to new I/O devices and operating systems, and application developers must be able to flexibly adjust the block size, data placement policy, and replication policy according to the system environment. Considering these requirements, this paper designs and implements a Parallel Multimedia File System (PMFS) on Fast Ethernet and compares its performance with PVFS (Parallel Virtual File System) through experiments. The results show that PMFS outperforms PVFS by 3%∼15% for multimedia data. (An illustrative sketch of striped data placement follows this entry.)
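
As an illustration of the kind of data-placement policy the abstract says should be tunable (not PMFS code), the sketch below maps a byte offset of a striped file to a storage server and a local offset under simple round-robin striping; the block size and server count are arbitrary assumptions.

```c
/* Round-robin striping sketch: which storage server holds a given byte of a
 * striped file, and at which local offset (illustrative, not PMFS code). */
#include <stdio.h>

struct stripe_loc { int server; long long local_offset; };

/* Map a byte offset in a striped file to (storage server, offset on it). */
static struct stripe_loc locate(long long file_offset, long long block_size,
                                int num_servers)
{
    long long block = file_offset / block_size;
    struct stripe_loc loc;
    loc.server = (int)(block % num_servers);
    loc.local_offset = (block / num_servers) * block_size
                     + file_offset % block_size;
    return loc;
}

int main(void)
{
    /* 64 KiB blocks striped across 4 storage servers. */
    for (long long off = 0; off < 5 * 65536LL; off += 65536) {
        struct stripe_loc l = locate(off, 65536, 4);
        printf("file offset %8lld -> server %d, local offset %lld\n",
               off, l.server, l.local_offset);
    }
    return 0;
}
```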

The Design and Implementation of the Distributed Shared Disk for Efficient Parallel I/O (효율적인 병렬 입출력을 지원하기 위한 분산공유디스크의 설계 및 구현)

  • 송창호;남영진;박찬익
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 1998.10a
    • /
    • pp.718-720
    • /
    • 1998
  • Implementing a parallel file system in a distributed environment requires complex internal mechanisms to manage and maintain its underlying functions. Separating the low-level hardware management functions from the high-level file service functions can reduce the complexity of implementing a parallel file system. To this end, this paper proposes the concept of a distributed shared disk, which presents physically distributed disks in a distributed environment as one large logical virtual disk. By providing a low-level interface for supporting a parallel file system, the distributed shared disk can supply the underlying functions that a parallel file system requires. We also implemented a prototype of the distributed shared disk on a cluster-based system and verified its operation experimentally. (An illustrative sketch of this layering follows this entry.)
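
A hypothetical sketch of the layering idea: the file-service layer sees one logical virtual disk through a narrow block interface, while the mapping of logical blocks to physically distributed disks stays hidden behind it. Here an in-memory array stands in for each node's local disk and real network access is omitted; all names and sizes are illustrative, not the paper's implementation.

```c
/* Distributed-shared-disk layering sketch (illustrative): callers see one
 * logical disk; the layer maps each logical block to (node, local block). */
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 64
#define NODES 3
#define BLOCKS_PER_NODE 4

/* In-memory stand-in for each node's local disk. */
static char local_disk[NODES][BLOCKS_PER_NODE][BLOCK_SIZE];

struct location { int node; int local_block; };

/* Map a logical block of the virtual disk to (node, local block). */
static struct location map_block(int lblock)
{
    struct location loc = { lblock % NODES, lblock / NODES };
    return loc;
}

/* Low-level interface a parallel file system would call: the caller only
 * sees one logical disk of NODES * BLOCKS_PER_NODE blocks. */
static int disk_write(int lblock, const void *buf)
{
    if (lblock < 0 || lblock >= NODES * BLOCKS_PER_NODE) return -1;
    struct location loc = map_block(lblock);
    memcpy(local_disk[loc.node][loc.local_block], buf, BLOCK_SIZE);
    return 0;
}

static int disk_read(int lblock, void *buf)
{
    if (lblock < 0 || lblock >= NODES * BLOCKS_PER_NODE) return -1;
    struct location loc = map_block(lblock);
    memcpy(buf, local_disk[loc.node][loc.local_block], BLOCK_SIZE);
    return 0;
}

int main(void)
{
    char in[BLOCK_SIZE] = "hello shared disk";
    char out[BLOCK_SIZE];
    disk_write(5, in);          /* logical block 5 -> node 2, local block 1 */
    disk_read(5, out);
    printf("%s\n", out);
    return 0;
}
```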

High Noise Margin LVDS I/O Circuits for Highly Parallel I/O Environments (다수의 병렬 입.출력 환경을 위한 높은 노이즈 마진을 갖는 LVDS I/O 회로)

  • Kim, Dong-Gu;Kim, Sam-Dong;Hwang, In-Seok
    • Journal of the Institute of Electronics Engineers of Korea SC
    • /
    • v.44 no.1
    • /
    • pp.85-93
    • /
    • 2007
  • This paper presents new LVDS I/O circuits with a high noise margin for use in highly parallel I/O environments. The proposed LVDS I/O includes transmitter and receiver parts. The transmitter circuits consist of a differential phase splitter and an output stage with common-mode feedback (CMFB). The differential phase splitter generates a pair of differential signals with a balanced duty cycle and a $180^{\circ}$ phase difference over the wide supply voltage variation caused by SSO (simultaneous switching output) noise. The CMFB output stage produces the required constant output current and maintains the required common-mode voltage (VCM) within a $\pm 0.1$ V tolerance without external circuits in an SSO environment. The proposed receiver circuits utilize a three-stage structure (single-ended differential amplifier, common-source amplifier, output stage) to accurately receive high-speed signals. The receiver part employs a very wide common-mode input range differential amplifier (VCDA). As a result, the receiver improves immunity to common-mode noise and to the supply voltage difference between the transmitter and receiver sides, represented by Vgdp. The receiver also produces a rail-to-rail, full-swing output voltage with a balanced duty cycle (50% $\pm$ 3%) without external circuits in an SSO environment, which enables correct data recovery. The proposed LVDS I/O circuits have been designed and simulated with a 0.18 um TSMC library using HSPICE. (A note on the common-mode definition follows this entry.)
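
For reference, the common-mode voltage that the CMFB output stage regulates is the average of the two line voltages of the differential pair, $V_{CM} = (V^{+} + V^{-})/2$, while the signal itself is carried by the difference $V^{+} - V^{-}$; these are the standard LVDS definitions and are not specific to this paper.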