• Title/Summary/Keyword: buffer cache (버퍼 캐시)

Hierarchically Encoded Multimedia-data Management System for Over The Top Service (OTT 서비스를 위한 계층적 부호화 기반 멀티미디어 데이터 관리 시스템)

  • Lee, Taehoon;Jung, Kidong
    • Journal of KIISE / v.42 no.6 / pp.723-733 / 2015
  • OTT services that deliver multimedia video over the Internet have spread to terminals with a wide range of resolutions. These terminals communicate over networks such as 3G, LTE, VDSL, and ADSL. As network services expand to an increasingly diverse set of terminals, the need for a new way of encoding multimedia is growing. SVC (Scalable Video Coding) is an encoding technique optimized for OTT services. We propose an efficient multimedia management system for SVC-encoded multimedia data. I/O traces were generated using a Zipf distribution, and the proposed system's performance was compared with that of an existing system.
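The abstract above mentions I/O traces generated from a Zipf distribution. As a rough illustration of what such a trace generator can look like, here is a minimal Python sketch. The function name `zipf_trace`, the `skew` parameter, and the item counts are assumptions made for this example and are not taken from the paper.

```python
import random
from collections import Counter


def zipf_trace(num_items, length, skew=1.0, seed=0):
    """Return `length` item ids in [0, num_items).

    Item id r-1 (popularity rank r) is drawn with probability
    proportional to 1 / r**skew, so low ids are requested most often.
    """
    rng = random.Random(seed)  # seeded so the trace is reproducible
    weights = [1.0 / (rank ** skew) for rank in range(1, num_items + 1)]
    return rng.choices(range(num_items), weights=weights, k=length)


# Hypothetical usage: 10,000 requests over 100 media objects.
trace = zipf_trace(num_items=100, length=10_000)
counts = Counter(trace)
```

A trace like this can be replayed against a cache or storage simulator; a larger `skew` concentrates more requests on the few most popular items.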

Study of Zero-copy Mechanism in TCP/IP (TCP/IP 에서의 Zero-copy 매커니즘 연구)

  • Chae, Byoung soo;Tcha, Seung Ju
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.1 no.2 / pp.131-136 / 2008
  • With systems interconnected over the Internet, research on improving overall system efficiency has focused on methods that reduce delays in message transmission. Here, we compare the User Datagram Protocol (UDP) with a zero-copy approach that does not use the buffer cache, in order to find a valid method for improving efficiency. In this paper, we change the message copying performed through the buffer cache in TCP/IP processing on Unix to a corresponding process on Linux. The goal of this change is to show that zero copy, which does not use the buffer cache in the transmission control layer, improves communication efficiency.
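To make the contrast between a copying send path and a zero-copy send path concrete, here is a small Python sketch. It is not the mechanism studied in the paper: it uses the standard `socket.sendfile` method, which relies on `os.sendfile` where the platform provides it and otherwise falls back to ordinary `send` calls. On Linux this avoids staging file data in a user-space buffer, although the kernel still reads the file through its page cache. The helper names and the 2 KiB payload are assumptions for illustration.

```python
import os
import socket
import tempfile


def send_with_copy(sock, path, chunk=65536):
    """Conventional path: read() into a user-space buffer, then send it."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            sock.sendall(data)
            sent += len(data)
    return sent


def send_zero_copy(sock, path):
    """sendfile path: the kernel moves file data to the socket directly
    where os.sendfile is available; returns the number of bytes sent."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo():
    """Send the same small file both ways over a local socket pair."""
    payload = b"x" * 2048  # small enough to fit in the socket buffer
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    a, b = socket.socketpair()
    try:
        n_copy = send_with_copy(a, path)
        n_zero = send_zero_copy(a, path)
        a.close()  # signal end of stream to the receiver
        received = b""
        while True:
            chunk = b.recv(4096)
            if not chunk:
                break
            received += chunk
    finally:
        a.close()
        b.close()
        os.remove(path)
    return n_copy, n_zero, received == payload * 2
```

Both helpers deliver identical bytes; the difference is only in how many times the data is copied between kernel and user space on the way out.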

Performance Enhancement through Prefetching Based On Looping Reference Characteristics (순환 참조 특성을 기반한 선반입 성능의 개선)

  • Lee, Hyo-Jeong;Doh, In-Hwan;Noh, Sam-H.
    • Proceedings of the Korean Information Science Society Conference / 2007.06b / pp.327-332 / 2007
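  • In the buffer cache, prefetching is, along with the replacement policy, one of the important techniques for improving performance. However, depending on the characteristics of the reference pattern, prefetching has also been reported to increase total execution time. In this paper, we propose looping-reference prefetching, a technique that detects reference patterns and responds appropriately to the detected pattern, so that the benefits of prefetching are retained without harming performance. For the evaluation, we compared the execution time of the read-ahead prefetching currently used in Linux with that of looping-reference prefetching. In simulations over traces with a variety of reference patterns, looping-reference prefetching achieved a 3 to 5% performance improvement on traces containing many sequential references, similar to that of Linux read-ahead. Moreover, on a particular trace where the read-ahead policy degraded performance by about 40%, looping-reference prefetching caused only a negligible degradation of 0.07%. The experiments show that the proposed technique prefetches aggressively only when there is a benefit, improving system performance, and stops prefetching when it would cause a loss, preventing performance degradation.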

Functionality-based Processing-In-Memory Accelerator for Deep Neural Networks (딥뉴럴네트워크를 위한 기능성 기반의 핌 가속기)

  • Kim, Min-Jae;Kim, Shin-Dug
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.8-11 / 2020
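  • With the advent of the Fourth Industrial Revolution and the ongoing convergence of AI and ICT technologies, AI services are now being requested even on user-level devices. Image-related AI services are used for subject recognition, defect inspection, autonomous driving, and similar tasks, and Deep Convolutional Neural Networks (DCNNs) in particular perform very well at capturing image features. However, as images grow larger and networks grow deeper, computation suffers from low data locality and frequent memory references. As a result, conventional hierarchical system architectures are limited in their ability to process DCNNs quickly and scalably. In this study, we propose a Processing-In-Memory (PIM) accelerator based on a 3D memory structure for fast, scalable DCNN processing. To this end, we add hardware and software modules to an existing 3D memory, the Hybrid Memory Cube (HMC). Specifically, we add a shared cache and software stack that allow Processing Elements (PEs) to share data, a pipelined multiplier, and dual prefetch buffers. Evaluated on the well-known DCNN models LeNet, AlexNet, ZFNet, VGGNet, GoogLeNet, and ResNet, the design achieved a 40.3% speedup and a 29.4% bandwidth improvement over the baseline HMC.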

NVM-based Write Amplification Reduction to Avoid Performance Fluctuation of Flash Storage (플래시 스토리지의 성능 지연 방지를 위한 비휘발성램 기반 쓰기 증폭 감소 기법)

  • Lee, Eunji;Jeong, Minseong;Bahn, Hyokyung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.16 no.4 / pp.15-20 / 2016
  • Write amplification is a critical factor that limits the stable performance of flash-based storage systems. To reduce write amplification, this paper presents a new technique that cooperatively manages data in flash storage and nonvolatile memory (NVM). Our scheme basically considers NVM as the cache of flash storage, but allows the original data in flash storage to be invalidated if there is a cached copy in NVM, which can temporarily serve as the original data. This scheme eliminates the copy-out operation for a substantial number of cached data, thereby enhancing garbage collection efficiency. Experimental results show that the proposed scheme reduces the copy-out overhead of garbage collection by 51.4% and decreases the standard deviation of response time by 35.4% on average.

Prefetching Framework for General Workloads Using Breakpoint (브레이크포인트를 이용한 범용 워크로드 프리페칭 프레임워크)

  • Ko, Kwangjin;Ryu, Junhee;Kang, Kyungtae;Shin, Heonshik
    • Journal of KIISE / v.41 no.10 / pp.832-837 / 2014
  • Application loading speed can be improved by timely prefetching disk blocks likely to be needed by an application. However, existing prefetchers -- if they are not specialized to a particular application -- incur high overheads and are poor at identifying the blocks that will actually be required. There are many sequences in which blocks may be needed and, even if two access sequences are identical, block tracing and access timings can be affected significantly by the state of the buffer cache. We propose a new application-independent software-based prefetching technique, in which breakpoints are inserted at appropriate places in an application to collect the information on correlations between the blocks and to prefetch the potential blocks ahead of their schedule based on it. Experiments on an HDD-based desktop PC demonstrated an average 30% reduction in application launch time and 15% in general I/O, while reducing the wasted overhead.

Data De-duplication and Recycling Technique in SSD-based Storage System for Increasing De-duplication Rate and I/O Performance (SSD 기반 스토리지 시스템에서 중복률과 입출력 성능 향상을 위한 데이터 중복제거 및 재활용 기법)

  • Kim, Ju-Kyeong;Lee, Seung-Kyu;Kim, Deok-Hwan
    • Journal of the Institute of Electronics and Information Engineers / v.49 no.12 / pp.149-155 / 2012
  • An SSD is a storage device that has a high-performance controller and a cache buffer and consists of many NAND flash memory chips. Because NAND flash memory does not support in-place updates, valid pages are invalidated when the file system issues update and erase operations, and the invalid pages are later deleted completely by garbage collection. However, garbage collection performs many long-latency erase operations, which reduces I/O performance and increases wear on the SSD. In this paper, we propose a new method that de-duplicates valid data and recycles invalid data, which improves the de-duplication ratio. By reducing the number of writes and garbage collections, the method can increase I/O performance and decrease wear on the SSD. Experimental results show that it reduces the number of garbage collections by up to 20% and I/O latency by 9% compared with the general case.

Optimizing LRU Lock Management in the Linux Kernel for Improving Parallel Write Throughout in Many-Core CPU Systems (매니코어 CPU 시스템의 병렬 쓰기 성능 향상을 위한 리눅스 커널의 LRU 관리 최적화 기법)

  • Eun-Kyu Byun;Gibeom Gu;Kwang-Jin Oh;Jiwoo Bang
    • KIPS Transactions on Computer and Communication Systems / v.12 no.7 / pp.209-216 / 2023
  • Modern HPC systems are equipped with many-core CPUs that have dozens of cores. When parallel I/O is performed on such systems, scalability is limited by the LRU lock management policy of the Linux kernel. This study proposes FinerLRU, an improved scheme that addresses this problem. FinerLRU improves the parallel write performance of file systems that use the buffer cache through finer-grained lock management, increasing the number of LRU locks up to the maximum number of cores. The proposed method was implemented in Linux 5.18.11, and performance was measured on two types of CPUs with different characteristics, Intel Ice Lake Xeon and Intel Knights Landing. Performance improved by about a factor of two on both types of systems.
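The general idea of replacing one global LRU lock with many finer-grained locks can be sketched in user space as follows. This is not the FinerLRU kernel implementation, whose details the abstract does not give. The class name `ShardedLRU`, the hash-based shard selection, and the per-shard capacity are all assumptions chosen for illustration; the only element borrowed from the abstract is that the number of locks defaults to the number of cores.

```python
import os
import threading
from collections import OrderedDict


class ShardedLRU:
    """LRU cache split into independent shards, each with its own lock.

    Threads that touch keys in different shards do not contend on a
    single global lock. Eviction is LRU within a shard, not globally.
    """

    def __init__(self, capacity_per_shard, num_shards=None):
        # Assumption: one shard (and lock) per CPU core by default.
        num_shards = num_shards or os.cpu_count() or 1
        self._capacity = capacity_per_shard
        self._shards = [OrderedDict() for _ in range(num_shards)]
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def _index(self, key):
        return hash(key) % len(self._shards)

    def get(self, key, default=None):
        i = self._index(key)
        with self._locks[i]:
            shard = self._shards[i]
            if key not in shard:
                return default
            shard.move_to_end(key)  # mark as most recently used
            return shard[key]

    def put(self, key, value):
        i = self._index(key)
        with self._locks[i]:
            shard = self._shards[i]
            shard[key] = value
            shard.move_to_end(key)
            if len(shard) > self._capacity:
                shard.popitem(last=False)  # evict least recently used


# Hypothetical usage: small integer keys hash to themselves, so keys
# 0, 4, and 8 all land in shard 0 when there are 4 shards.
cache = ShardedLRU(capacity_per_shard=2, num_shards=4)
for key in (0, 4, 8, 1):
    cache.put(key, f"block-{key}")
```

The trade-off in this kind of design is that recency is tracked per shard, so the eviction order only approximates a single global LRU list.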