• Title/Summary/Keyword: Cache System

Search Result 457, Processing Time 0.026 seconds

Locally weighted linear regression prefetching method for hybrid memory system (하이브리드 메모리 시스템의 지역 가중 선형회귀 프리페치 방법)

  • Tang, Qian;Kim, Jeong-Geun;Kim, Shin-Dug
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2020.11a
    • /
    • pp.12-15
    • /
    • 2020
  • Data access characteristics can directly affect the efficiency of the system execution. This research is to design an accurate predictor by using historical memory access information, where highly accessible data can be migrated from low-speed storage (SSD/HHD) to high-speed memory (Memory/CPU Cache) in advance, thereby reducing data access latency and further improving overall performance. For this goal, we design a locally weighted linear regression prefetch scheme to cope with irregular access patterns in large graph processing applications for a DARM-PCM hybrid memory structure. By analyzing the testing result, the appropriate structural parameters can be selected, which greatly improves the cache prefetching performance, resulting in overall performance improvement.

Real-Time Search System using Distributed Cache (분산 캐시를 적용한 실시간 검색 시스템)

  • Ren, Jian-Ji;Lee, Jae-Kee
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.4
    • /
    • pp.472-476
    • /
    • 2010
  • Nowadays, as the indices of the major search engines grow to a tremendous proportion, vertical search services can help customers to find what they need. Real time search is valuable because it lets you know what's happening right now on any given topic. In this paper, we designed a new architecture to implement a high performance real time search system. Based on the real time search's characters, we divided the whole system to two parts which are collection system and search system. The evaluation results showed that our design has the potential to provide the real time search transparent scalability while maintaining the replication overhead costs in check.

2Q-CFP: A Client Cache Management Scheme for Broadcast-based Information Systems (2Q-CFP: 방송에 기초한 정보 시스템을 위한 클라이언트 캐쉬 관리 기법)

  • 권혁민
    • Journal of KIISE:Databases
    • /
    • v.30 no.6
    • /
    • pp.561-572
    • /
    • 2003
  • Broadcast-based data delivery has attracted a lot of attention as an efficient way of disseminating data to very large client populations. The main motivation of broadcast-based information systems (BBISs) is that the number of clients that they serve can grow arbitrarily large without any effect on their performance. The performance of BBISs depends mainly on client caching strategies and on data broadcast scheduling mechanisms. This paper addresses the former issue and proposes a new client cache management scheme, named 2Q-CFP, that is suitable to BBISs. This paper also evaluates the performance of 2Q-CFP on the basis of a simulation model. The performance results indicate that 2Q-CFP scheme shows superior performances over GRAY, LRU and CF in the average response time.

Performance Analysis of Futurebus+ based Multiprocessor Systems with MESI Cache Coherence Protocol (MESI 캐쉬 코히어런스 프로토콜을 사용하는 Futurebus+ 기반 멀티프로세서 시스템의 성능 평가)

  • 고석범;강인곤;박성우;김영천
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.12
    • /
    • pp.1815-1827
    • /
    • 1993
  • In this paper, we evaluate the performance of a Futurebus based multiprocessor system with MESI cache coherence protocol for four bus transaction types. Graphical symbols and compiler of SLAM II are used in modeling and simulation. A steady-state probability of each state for MESI protocol is computed by a Markov chain. The probability of each state is used as an input value for a correct simulation. Processor utilization, memory utilization, bus utilization, and the waiting time for bus arbitration are measured in terms of the number of processors, the hit ratio of cache memory, the probability of internal operation, and bus bandwidth.

  • PDF

Analysis and Improvement of I/O Performance Degradation by Journaling in a Virtualized Environment (가상화 환경에서 저널링 기법에 의한 입출력 성능저하 분석 및 개선)

  • Kim, Sunghwan;Lee, Eunji
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.16 no.6
    • /
    • pp.177-181
    • /
    • 2016
  • This paper analyzes the host cache effectiveness in full virtualization, particularly associated with journaling of guests. We observe that the journal access of guests degrades cache performance significantly due to the write-once access pattern and the frequent sync operations. To remedy this problem, we design and implement a novel caching policy, called PDC (Pollution Defensive Caching), that detects the journal accesses and prevents them from entering the host cache. The proposed PDC is implemented in QEMU-KVM 2.1 on Linux 4.14 and provides 3-32% performance improvement for various file and I/O benchmarks.

A Lock Mechanism for HiPi-bus Based Multiprocessor Systems (HiPi-bus 구조의 다중 프로세서 시스템에서의 잠금장치)

  • 윤용호;임인칠
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.30B no.2
    • /
    • pp.33-43
    • /
    • 1993
  • Lock mechanism is essential for synchronization on the multiprocessor systems. Lock mechanism needs to reduce the time for lock operation in low lock contention. Lock mechanism must consider the case of the high lock contention. The conventional lock control scheme in memory results in the increase of bus traffic and memory utilization in lock operation. This paper suggests a lock scheme which stores the lock data in cache and manages it efficiently to reduce the time spent in lock operation when the lock contention is low on a multiprocessor system built on HiPi-bus(Highly Pipelined bus). This paper also presents the design of the HIPi-CLOCK (Highly Pipelined bus Cache LOCK mechanism) which transfere the data from on cache to another when the lock contention is high. The designed simulator compares the conventional lock scheme which controls the lock in memory with the suggested HiPi-CLOCK scheme in terms of the RMW(Read-Modify-Write) operation time using simulated trace. It is shown that the suggested lock control scheme performance is over twice than that of the conventional method in low lock contention. When the lock contention is high, the performance of the suggested scheme increases as the number of the shared lock data increases.

  • PDF

Task failure resilience technique for improving the performance of MapReduce in Hadoop

  • Kavitha, C;Anita, X
    • ETRI Journal
    • /
    • v.42 no.5
    • /
    • pp.748-760
    • /
    • 2020
  • MapReduce is a framework that can process huge datasets in parallel and distributed computing environments. However, a single machine failure during the runtime of MapReduce tasks can increase completion time by 50%. MapReduce handles task failures by restarting the failed task and re-computing all input data from scratch, regardless of how much data had already been processed. To solve this issue, we need the computed key-value pairs to persist in a storage system to avoid re-computing them during the restarting process. In this paper, the task failure resilience (TFR) technique is proposed, which allows the execution of a failed task to continue from the point it was interrupted without having to redo all the work. Amazon ElastiCache for Redis is used as a non-volatile cache for the key-value pairs. We measured the performance of TFR by running different Hadoop benchmarking suites. TFR was implemented using the Hadoop software framework, and the experimental results showed significant performance improvements when compared with the performance of the default Hadoop implementation.

Study of Zero-copy Mechanism in TCP/IP (TCP/IP 에서의 Zero-copy 매커니즘 연구)

  • Chae, Byoung soo;Tcha, Seung Ju
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.1 no.2
    • /
    • pp.131-136
    • /
    • 2008
  • From the reciprocal connection by this Internet network researchs about the efficiency improvement of the whole system is accomplished with the method which reduces delays in message transmission. From here, we will do a comparative study between the user data program protocol (UDP) and the zero copy which does not use the buffer cache to fine out the valid method to improve the efficiency. In this thesis, I will change the message copy from execution process of the buffer cache of the TCP/IP on Unix OS with process on Linux OS. The object of conversion is to show you that the zero copy which doesn't use the buffer cache from transfer control class improves the communication efficiency.

  • PDF

A Distributed VOD Server Based on Virtual Interface Architecture and Interval Cache (버추얼 인터페이스 아키텍처 및 인터벌 캐쉬에 기반한 분산 VOD 서버)

  • Oh, Soo-Cheol;Chung, Sang-Hwa
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.33 no.10
    • /
    • pp.734-745
    • /
    • 2006
  • This paper presents a PC cluster-based distributed VOD server that minimizes the load of an interconnection network by adopting the VIA communication protocol and the interval cache algorithm. Video data is distributed to the disks of the distributed VOD server and each server node receives the data through the interconnection network and sends it to clients. The load of the interconnection network increases because of the large amount of video data transferred. This paper developed a distributed VOD file system, which is based on VIA, to minimize cost using interconnection network when accessing remote disks. VIA is a user-level communication protocol removing the overhead of TCP/IP. This papers also improved the performance of the interconnection network by expanding the maximum transfer size of VIA. In addition, the interval cache reduces traffic on the interconnection network by caching, in main memory, the video data transferred from disks of remote server nodes. Experiments using the distributed VOD server of this paper showed a maximum performance improvement of 21.3% compared with a distributed VOD server without VIA and the interval cache, when used with a four-node PC cluster.

DJFS: Providing Highly Reliable and High-Performance File System with Small-Sized NVRAM

  • Kim, Junghoon;Lee, Minho;Song, Yongju;Eom, Young Ik
    • ETRI Journal
    • /
    • v.39 no.6
    • /
    • pp.820-831
    • /
    • 2017
  • File systems and applications try to implement their own update protocols to guarantee data consistency, which is one of the most crucial aspects of computing systems. However, we found that the storage devices are substantially under-utilized when preserving data consistency because they generate massive storage write traffic with many disk cache flush operations and force-unit-access (FUA) commands. In this paper, we present DJFS (Delta-Journaling File System) that provides both a high level of performance and data consistency for different applications. We made three technical contributions to achieve our goal. First, to remove all storage accesses with disk cache flush operations and FUA commands, DJFS uses small-sized NVRAM for a file system journal. Second, to reduce the access latency and space requirements of NVRAM, DJFS attempts to journal compress the differences in the modified blocks. Finally, to relieve explicit checkpointing overhead, DJFS aggressively reflects the checkpoint transactions to file system area in the unit of the specified region. Our evaluation on TPC-C SQLite benchmark shows that, using our novel optimization schemes, DJFS outperforms Ext4 by up to 64.2 times with only 128 MB of NVRAM.