• Title/Summary/Keyword: Memory Improvement


A Distributed VOD Server Based on Virtual Interface Architecture and Interval Cache (버추얼 인터페이스 아키텍처 및 인터벌 캐쉬에 기반한 분산 VOD 서버)

  • Oh, Soo-Cheol;Chung, Sang-Hwa
    • Journal of KIISE: Computer Systems and Theory / v.33 no.10 / pp.734-745 / 2006
  • This paper presents a PC cluster-based distributed VOD server that minimizes the load on the interconnection network by adopting the VIA communication protocol and the interval cache algorithm. Video data is distributed across the disks of the distributed VOD server; each server node receives the data through the interconnection network and sends it to clients. The load on the interconnection network increases because of the large amount of video data transferred. The paper develops a VIA-based distributed VOD file system to minimize the cost of using the interconnection network when accessing remote disks. VIA is a user-level communication protocol that removes the overhead of TCP/IP. The paper also improves the performance of the interconnection network by expanding the maximum transfer size of VIA. In addition, the interval cache reduces traffic on the interconnection network by caching, in main memory, the video data transferred from the disks of remote server nodes. Experiments on a four-node PC cluster showed a maximum performance improvement of 21.3% compared with a distributed VOD server without VIA and the interval cache.
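
The abstract does not spell out the interval cache policy, so the following is only a minimal sketch of the general interval caching idea: for two consecutive streams of the same video, the blocks between the leading and the following stream form an interval, and the smallest intervals are cached first so that the most followers are served from memory. The function name `choose_cached_intervals`, the tuple layout, and the block-based sizes are assumptions made for illustration, not the paper's implementation.

```python
from collections import defaultdict


def choose_cached_intervals(streams, cache_blocks):
    """Pick the intervals to keep in memory, smallest first.

    streams: list of (stream_id, video_id, position_in_blocks) tuples
             (hypothetical layout, chosen for this sketch).
    cache_blocks: cache capacity, in blocks.
    Returns a list of (leader_id, follower_id, interval_size).
    """
    # Group the playback positions by video.
    by_video = defaultdict(list)
    for stream_id, video_id, position in streams:
        by_video[video_id].append((position, stream_id))

    # An interval is the gap between two consecutive streams of one video.
    intervals = []
    for entries in by_video.values():
        entries.sort(reverse=True)  # furthest-ahead stream first
        for (lead_pos, lead), (follow_pos, follow) in zip(entries, entries[1:]):
            intervals.append((lead_pos - follow_pos, lead, follow))

    # Cache the smallest intervals until the next one no longer fits.
    intervals.sort()
    chosen, used = [], 0
    for size, lead, follow in intervals:
        if used + size > cache_blocks:
            break
        chosen.append((lead, follow, size))
        used += size
    return chosen


streams = [("a", "v1", 100), ("b", "v1", 90), ("c", "v1", 40),
           ("d", "v2", 30), ("e", "v2", 25)]
print(choose_cached_intervals(streams, cache_blocks=20))
```

With a 20-block cache, the two short intervals (`d`→`e` and `a`→`b`) fit, while the 50-block interval between `b` and `c` does not, so stream `c` would still be read from disk.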

Implementation of Massive FDTD Simulation Computing Model Based on MPI Cluster for Semi-conductor Process (반도체 검증을 위한 MPI 기반 클러스터에서의 대용량 FDTD 시뮬레이션 연산환경 구축)

  • Lee, Seung-Il;Kim, Yeon-Il;Lee, Sang-Gil;Lee, Cheol-Hoon
    • The Journal of the Korea Contents Association / v.15 no.9 / pp.21-28 / 2015
  • In the semiconductor process, simulation is used to detect defects by analyzing the behavior of impurities through physical-quantity calculations on the inner elements. The Finite-Difference Time-Domain (FDTD) algorithm is used to perform the simulation. As semiconductors composed of nanoscale elements advance, the size of the simulation keeps growing. Problems can arise in which a single processor such as a CPU or GPU cannot perform the simulation because of the massive matrix size, or a computer consisting of multiple processors cannot handle a massive FDTD simulation. Parallel/distributed computing has been studied to address these problems, but past work used only a single type of processor. A GPU is fast but has limited memory, whereas a CPU is slower than a GPU. To solve this problem, we implemented a computing model that can handle an FDTD simulation of any size on a cluster consisting of heterogeneous processors. We tested the simulation using MPI libraries based on point-to-point communication and verified that it operates correctly regardless of the number and type of nodes. We also analyzed performance by measuring the total execution time and specific times for the simulation in each test.

Properties of $RuO_2$ Thin Films for Bottom Electrode in Ferroelectric Memory by Using the RF Sputtering (RF Sputtering 법으로 제작한 강유전체 메모리의 하부전극용$RuO_2$ 박막의 특성에 관한 연구)

  • 강성준;정양희
    • Journal of the Korea Institute of Information and Communication Engineering / v.4 no.5 / pp.1127-1134 / 2000
  • $RuO_2$ thin films are prepared by RF magnetron reactive sputtering, and their crystallization, microstructure, surface roughness, and resistivity are studied for various O2/(Ar+O2) ratios and substrate temperatures. As the O2/(Ar+O2) ratio decreases and the substrate temperature increases, the preferred growth plane of the $RuO_2$ thin films changes from (110) to (101). As the O2/(Ar+O2) ratio increases from 20% to 50%, the surface roughness and the resistivity of the $RuO_2$ thin films increase from 2.38 nm to 7.81 nm and from $103.6\ \mu\Omega\cdot cm$ to $227\ \mu\Omega\cdot cm$, respectively, while the deposition rate decreases from 47 nm/min to 17 nm/min. On the other hand, as the substrate temperature increases from room temperature to $500^{\circ}C$, the resistivity decreases from $210.5\ \mu\Omega\cdot cm$ to $93.7\ \mu\Omega\cdot cm$. The $RuO_2$ thin film deposited at $300^{\circ}C$ shows an excellent surface roughness of 2.38 nm. As the annealing temperature increases in the range between $400^{\circ}C$ and $650^{\circ}C$, the resistivity decreases because of the improvement in crystallinity. We find that the $RuO_2$ thin film deposited at an O2/(Ar+O2) ratio of 20% and a substrate temperature of $300^{\circ}C$ shows an excellent combination of surface smoothness and low resistivity, so that it is well qualified as a bottom electrode for ferroelectric thin films.


A Study on the Procedure for Constructing Linked Open Data of Records Information by Using Open Source Tool (오픈소스 도구를 이용한 기록정보 링크드 오픈 데이터 구축 절차 연구)

  • Ha, Seung Rok;Yim, Jin Hee;Rieh, Hae-young
    • Journal of the Korean Society for Information Management / v.34 no.1 / pp.341-371 / 2017
  • Recently, the web service environment has shifted from a document-centered to a data-oriented focus, and Linked Open Data (LOD) lies at the core of the new environment. In accordance with this trend, specific procedures and methods for building LOD of records information were examined. With the service sustainability of small-scale archives in mind, this paper develops an example of the LOD building process using open source software. To this end, a 5-step service framework for LOD construction was proposed and applied to a collection of diary records from the 'Human and Memory Archive'. A Proof of Concept (POC) using the open source software Protege and Apache Jena Fuseki was conducted according to the proposed 5-step framework. After the LOD of records information was established with the open source software, connection with external LOD through interlinking and SPARQL search was performed successfully. In addition, considerations for archives in LOD construction, including improving the quality of content information and the role of the archivist, were suggested based on the understanding gained through the LOD construction process.

Utilizing Channel Bonding-based M-VIA and Interval Cache on a Distributed VOD Server (효율적인 분산 VOD 서버를 위한 Channel Bonding 기반 M-VIA 및 인터벌 캐쉬의 활용)

  • Chung, Sang-Hwa;Oh, Soo-Cheol;Yoon, Won-Ju;Kim, Hyun-Pil;Choi, Young-In
    • The KIPS Transactions: Part A / v.12A no.7 s.97 / pp.627-636 / 2005
  • This paper presents a PC cluster-based distributed video on demand (VOD) server that minimizes the load on the interconnection network by adopting channel bonding-based M-VIA and the interval cache algorithm. Video data is distributed across the disks of the server nodes of the distributed VOD server; each server node receives the data through the interconnection network and sends it to clients. The load on the interconnection network increases because of the large volume of video data transferred. We adopt two techniques to reduce this load. First, M-VIA with a channel bonding technique is adopted for the interconnection network. M-VIA, a user-level communication protocol that reduces the overhead of the TCP/IP protocol in cluster systems, minimizes the time spent communicating. We increase the bandwidth of the interconnection network by using the channel bonding technique with M-VIA. Channel bonding expands the bandwidth by sending data concurrently through multiple network cards. Second, the interval cache reduces traffic on the interconnection network by caching, in main memory, the video data transferred from remote disks. Experiments on a four-node PC cluster showed a maximum performance improvement of $30\%$ compared with a distributed VOD server without channel bonding-based M-VIA and the interval cache.

Parallel Cell-Connectivity Information Extraction Algorithm for Ray-casting on Unstructured Grid Data (비정렬 격자에 대한 광선 투사를 위한 셀 사이 연결정보 추출 병렬처리 알고리즘)

  • Lee, Jihun;Kim, Duksu
    • Journal of the Korea Computer Graphics Society / v.26 no.1 / pp.17-25 / 2020
  • We present a novel multi-core CPU-based parallel algorithm for cell-connectivity information extraction, which is one of the preprocessing steps for volume rendering of unstructured grid data. We first examine the synchronization issues that arise when the prior serial algorithm is parallelized naively. We then propose a 3-step parallel algorithm that achieves high parallelization efficiency by removing synchronization in each step. Our 3-step algorithm also improves cache utilization by increasing spatial locality in the duplicated triangle test, which is the core operation in building cell-connectivity information. We further improve the efficiency of our parallel algorithm by employing a memory pool for each thread. To check the benefit of our approach, we implemented our method on a system with two octa-core CPUs and measured its performance. Our method shows continuous performance improvement as threads are added, and it achieves up to 82.9 times higher performance than the prior serial algorithm when using thirty-two threads (sixteen physical cores). These results demonstrate the high parallelization efficiency and high cache utilization of our method. They also validate the suitability of our algorithm for large-scale unstructured data.
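
As a rough illustration of the duplicated triangle test mentioned above, the sketch below builds cell connectivity for a tetrahedral mesh serially: two cells are neighbors when they share a triangular face, which shows up as the same sorted vertex triple appearing twice. This is a plain hash-based serial version under assumed names (`build_connectivity`, `-1` for boundary faces); it is not the paper's 3-step parallel algorithm.

```python
from itertools import combinations


def build_connectivity(tets):
    """Return, for each tetrahedron, the neighbor cell across each of its 4 faces.

    tets: list of 4-tuples of vertex indices.
    A value of -1 marks a boundary face (no neighbor).
    Face i of a cell is the i-th triple produced by combinations(verts, 3).
    """
    face_owner = {}  # sorted vertex triple -> (cell, face index) seen so far
    neighbors = [[-1] * 4 for _ in tets]

    for cell, verts in enumerate(tets):
        for face_idx, face in enumerate(combinations(verts, 3)):
            key = tuple(sorted(face))
            if key in face_owner:
                # Duplicate triangle: the two cells share this face.
                other_cell, other_face = face_owner.pop(key)
                neighbors[cell][face_idx] = other_cell
                neighbors[other_cell][other_face] = cell
            else:
                face_owner[key] = (cell, face_idx)
    return neighbors


# Two tetrahedra sharing the face (1, 2, 3).
print(build_connectivity([(0, 1, 2, 3), (1, 2, 3, 4)]))
```

The shared dictionary is exactly the kind of structure that would need synchronization if several threads inserted faces at once, which is the issue a naive parallelization runs into.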

Performance Evaluation of the GPU Architecture Executing Parallel Applications (병렬 응용프로그램 실행 시 GPU 구조에 따른 성능 분석)

  • Choi, Hong-Jun;Kim, Cheol-Hong
    • The Journal of the Korea Contents Association / v.12 no.5 / pp.10-21 / 2012
  • The role of the GPU has evolved from graphics-specific processing to general-purpose processing with the development of the unified shader core architecture. In particular, methods for executing general-purpose parallel applications on GPUs have been researched intensively, since the parallel hardware can be utilized efficiently when parallel applications are executed. However, current GPU architecture has limitations in executing general-purpose parallel applications, since the GPU is not yet specialized for general-purpose computing. To improve GPU performance for such applications, the GPU architecture must evolve. In this work, we analyze GPU performance as the number of cores and the clock frequency are varied. Our simulation results show that GPU performance improves by up to 125.8% as the number of cores increases and by up to 16.2% as the clock frequency increases. However, the performance improvement saturates even as the number of cores and the clock frequency continue to increase, because data cannot be supplied to the GPU fast enough due to limited memory bandwidth. Consequently, to achieve high performance efficiency on a GPU, computational resources must be considered more carefully.
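
The saturation effect described above can be illustrated with a simple roofline-style bound, in which attainable throughput is the smaller of the compute peak and what memory bandwidth can feed. This model and all numbers below are assumptions for illustration only; the paper's own results come from simulation, not from this formula.

```python
def attainable_gflops(cores, clock_ghz, flops_per_cycle_per_core,
                      mem_bw_gbs, flops_per_byte):
    """Roofline-style upper bound on throughput (GFLOP/s).

    All parameters are hypothetical inputs for this sketch:
    flops_per_byte is the arithmetic intensity of the application.
    """
    compute_peak = cores * clock_ghz * flops_per_cycle_per_core
    memory_bound = mem_bw_gbs * flops_per_byte
    return min(compute_peak, memory_bound)


# Adding cores helps only until the memory-bandwidth ceiling is reached.
for cores in (8, 16, 32, 64):
    print(cores, attainable_gflops(cores, 1.0, 2, 100, 0.5))
```

In this toy setting the bound grows with the core count up to the bandwidth ceiling and then stays flat, which mirrors the qualitative behavior reported in the abstract.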

Efficient DRAM Buffer Access Scheduling Techniques for SSD Storage System (SSD 스토리지 시스템을 위한 효율적인 DRAM 버퍼 액세스 스케줄링 기법)

  • Park, Jun-Su;Hwang, Yong-Joong;Han, Tae-Hee
    • Journal of the Institute of Electronics Engineers of Korea SD / v.48 no.7 / pp.48-56 / 2011
  • Recently, SSDs (Solid State Disks), a new type of storage device based on NAND flash memory, have been gradually replacing HDDs (Hard Disk Drives) in mobile devices, and a variety of research efforts are under way to find cost-effective ways to improve their performance. As the number of NAND flash channels is increased to enhance bandwidth through parallel processing, the DRAM buffer, which acts as a buffer cache between the host (PC) and NAND flash, has become the bottleneck. To resolve this problem, this paper proposes an efficient low-cost scheme that increases SSD performance by improving DRAM buffer bandwidth through scheduling techniques that exploit DRAM multi-banks. When the host and multiple NAND flash channels request access to the DRAM buffer concurrently, the proposed technique checks their destinations and schedules them appropriately, taking the properties of DRAM into account. It can significantly reduce the overheads of bank active time and row latency and thus optimizes DRAM buffer bandwidth utilization. The results show that the proposed technique improves SSD performance by 47.4% for read and 47.7% for write operations compared with conventional methods, with negligible changes and increases in hardware.
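
To make the idea of destination-aware scheduling concrete, the sketch below compares first-come-first-served order with a simplified row-hit-first policy, where a pending request that targets the row already open in its bank is served before older requests. The policy, the function names, and the cycle costs are assumptions for illustration; the abstract does not give the paper's actual scheduling rules, and this sketch ignores starvation and bank-level overlap.

```python
ROW_HIT_CYCLES = 4    # hypothetical cost when the target row is already open
ROW_MISS_CYCLES = 12  # hypothetical cost of precharge + activate + access


def service_cost(order):
    """Total cycles to serve (bank, row) requests in the given order."""
    open_rows = {}  # bank -> currently open row
    total = 0
    for bank, row in order:
        if open_rows.get(bank) == row:
            total += ROW_HIT_CYCLES
        else:
            total += ROW_MISS_CYCLES
            open_rows[bank] = row
    return total


def row_hit_first(requests):
    """Reorder requests: serve the oldest row hit if any, else the oldest request."""
    pending = list(requests)
    open_rows = {}
    order = []
    while pending:
        pick = next((r for r in pending if open_rows.get(r[0]) == r[1]),
                    pending[0])
        pending.remove(pick)
        open_rows[pick[0]] = pick[1]
        order.append(pick)
    return order


# Host and flash channels alternate between two rows of the same bank.
requests = [(0, 1), (0, 2), (0, 1), (0, 2)]
print(service_cost(requests), service_cost(row_hit_first(requests)))
```

In this example the reordered sequence groups accesses to the same row, so fewer row activations are paid than in arrival order.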

Review on the Three-Dimensional Magnetotelluric Modeling (MT 법의 3차원 모델링 개관)

  • Kim, Hee-Joon;Nam, Myung-Jin;Song, Yoon-Ho;Suh, Jung-Hee
    • Geophysics and Geophysical Exploration / v.7 no.2 / pp.148-154 / 2004
  • This article reviews the development of three-dimensional (3-D) magnetotelluric (MT) modeling. The 3-D modeling of electromagnetic fields is essential for understanding the physics of MT soundings and for implementing an inversion method to reconstruct a 3-D resistivity image. Although various numerical schemes have been developed over the last two decades, practical methods have been quite limited. However, the recent rapid improvement in computer speed and memory, together with advances in iterative solution algorithms for large systems of equations, makes it possible to model the MT responses of complex 3-D structures, which were very difficult to simulate before. The use of staggered grids in the finite difference method has become popular because it conserves magnetic flux and electric current and allows for realistic discontinuous fields. The convergence of numerical solutions has been greatly accelerated by adopting Krylov subspace methods, proper preconditioning techniques, and static divergence corrections. The vector finite-element method using edge elements is also free from the discontinuity problem and seems a natural choice for modeling complex structures including irregular topography, because its flexibility allows one to capture full geometric complexity.

RSP-DS: Real Time Sequential Patterns Analysis in Data Streams (RSP-DS: 데이터 스트림에서의 실시간 순차 패턴 분석)

  • Shin Jae-Jyn;Kim Ho-Seok;Kim Kyoung-Bae;Bae Hae-Young
    • Journal of Korea Multimedia Society / v.9 no.9 / pp.1118-1130 / 2006
  • Existing pattern analysis algorithms for data stream environments have focused on performance improvement and effective memory usage. However, when new data streams arrive, these algorithms must analyze the patterns again and regenerate the pattern tree. This approach requires a great deal of computation in real situations that need real-time pattern analysis. This paper proposes a method that continuously analyzes patterns of incoming data streams in real time. The method analyzes patterns quickly and thereafter obtains real-time patterns by updating previously analyzed patterns. The incoming data streams are divided into several sequences using a time-based window. Information about the sequences is inserted into a hash table. When the number of sequences exceeds a predefined bound, patterns are analyzed from the hash table. The patterns form a pattern tree, and new patterns created later update the pattern tree. In this way, real-time patterns are always maintained in the pattern tree. During pattern analysis, the suffix of a new pattern can be the same as that of an existing pattern in the tree. In that case, a pointer is created from the new pattern to the existing pattern. This reduces calculation time when analyzing duplicated patterns. Old patterns in the tree are easily deleted in FIFO order. The advantage of our algorithm is demonstrated by a performance comparison with an existing method, MILE, under conditions in which patterns change continuously. We also examine how performance varies as several variables of the algorithm are changed.
