• Title/Summary/Keyword: Cache Cost

Performance Analysis of Multicore Processor Architectures Based On Cache Size Effects (캐쉬 용량 효과에 대한 멀티코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.12 no.6 / pp.175-180 / 2012
  • To overcome the complexity and performance limits of superscalar processors, multicore architectures have recently become prevalent. The configuration and size of the instruction and data caches greatly affect the performance of multicore processors. Using SPEC 2000 benchmarks as input, extensive trace-driven simulations were performed for 2-core to 16-core architectures with different cache sizes. As a result, 2-way set-associative instruction and data caches of 64 KB delivered the most cost-effective performance.
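
The kind of trace-driven cache simulation mentioned in this abstract can be sketched roughly as follows. This is only an illustrative model, not the simulator used in the paper: the 64 KB size and 2-way associativity come from the abstract, while the 64-byte line size, the LRU replacement policy, and the class name `SetAssociativeCache` are assumptions made for the example.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Minimal trace-driven model of a set-associative cache.

    Assumptions (not from the paper): 64-byte lines and LRU replacement.
    """

    def __init__(self, size_bytes=64 * 1024, ways=2, line_bytes=64):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (ways * line_bytes)
        # One OrderedDict per set: keys are tags, order tracks recency.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Simulate one memory reference; return True on a hit."""
        line = address // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        entries = self.sets[index]
        if tag in entries:
            entries.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(entries) >= self.ways:
            entries.popitem(last=False)  # evict the least recently used tag
        entries[tag] = True
        return False

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Feeding an address trace through `access()` for several `size_bytes` and `ways` settings and comparing `hit_ratio()` gives a rough picture of how cache size and associativity trade off against hit rate.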

lpCSB+-tree : An Enhanced Main Memory Index Structure Employing the Level Prefetching Technique (레벨 프리페칭 기법을 이용한 향상된 주기억장치 상주형 색인구조)

  • Hong Hyun-Taek;Kang Tae-Ho;Yoo Jae-Soo
    • Journal of Internet Computing and Services / v.4 no.6 / pp.75-86 / 2003
  • In main-memory resident index structures, secondary cache misses considerably affect performance. Recently, several cache-conscious main-memory resident index structures have been proposed to reduce the impact of secondary cache misses. However, they still suffer from full secondary cache misses whenever they visit each level of an index tree. In this paper, we propose a new index structure that eliminates cache misses even when visiting each level of the index tree. The proposed index structure prefetches the grandchildren of the current node. Its basic structure is derived from the CSB+-tree, which uses the concept of node groups to increase fan-out. However, the insert algorithm of the proposed index structure reduces the cost of a split significantly. We also show the superiority of our algorithm through various performance evaluations.

Implementation of a Large-scale Web Query Processing System Using the Multi-level Cache Scheme (계층적 캐시 기법을 이용한 대용량 웹 검색 질의 처리 시스템의 구현)

  • Lim, Sung-Chae
    • Journal of KIISE:Computing Practices and Letters / v.14 no.7 / pp.669-679 / 2008
  • With the increasing demand for information sharing and searching via the web, the web search engine has drawn much attention. Although much research has addressed the technical challenges of building a web search engine, its query processing system is rarely dealt with. Since the software architecture and operational schemes of the query processing system are hard to elaborate, we present here the related techniques implemented in a commercial system. The implemented system is a very large-scale system that can process 5 million user queries per day by using index files built on about 65 million web pages. For performance, we implement a multi-level cache scheme that saves already returned query results, and the multi-level cache is managed in four levels of cache storage areas. Using the multi-level cache, we improve the system throughput by a factor of 4, thereby reducing the server cost by around 70%.

Performance analysis of cache strategy for signaling traffic management in wireless ATM network (무선 ATM망에서 신호 트래픽 관리를 위한 기억공간 적재기법의 성능분석)

  • 최기무;조동호
    • The Journal of Korean Institute of Communications and Information Sciences / v.23 no.7 / pp.1639-1649 / 1998
  • Wireless ATM (Asynchronous Transfer Mode) networks are being actively studied for mobile multimedia services. In a wireless ATM network, the existing signaling protocols generate heavy traffic at the HLR (Home Location Register) because of the centralized structure, in which all signaling loads must be handled by the HLR. The centralized structure also causes critical connection setup delays. Thus, it is important for wireless ATM to reduce the connection setup delays caused by high signaling traffic loads by means of distributed processing. In this thesis, we propose a cache strategy for call delivery as well as the cache updates of registration based on ATM multicasting, and compare the cost of the cache scheme with that of the conventional scheme. Our study shows that the cache scheme performs better than the conventional methods when portable mobility is low and traffic density is high.

Performance Impact of Large File Transfer on Web Proxy Caching: A Case Study in a High Bandwidth Campus Network Environment

  • Kim, Hyun-Chul;Lee, Dong-Man;Chon, Kil-Nam;Jang, Beak-Cheol;Kwon, Tae-Kyoung;Choi, Yang-Hee
    • Journal of Communications and Networks / v.12 no.1 / pp.52-66 / 2010
  • Since large objects consume substantial resources, web proxy caching incurs a fundamental trade-off between performance (i.e., hit-ratio and latency) and overhead (i.e., resource usage), in terms of caching and relaying large objects to users. This paper investigates how and to what extent the current dedicated-server based web proxy caching scheme is affected by large file transfers in a high bandwidth campus network environment. We use a series of trace-based performance analyses and profiling of various resource components in our experimental Squid proxy cache server. Large file transfers often overwhelm our cache server. This causes a bottleneck in a web network, by saturating the network bandwidth of the cache server. Due to the requests for large objects, response times required for delivery of concurrently requested small objects increase, by a factor as high as a few million, in the worst cases. We argue that this cache bandwidth bottleneck problem is due to the fundamental limitations of the current centralized web proxy caching model, which scales poorly when there is a limited amount of dedicated resources. This is a serious threat to the viability of the current web proxy caching model, particularly in a high bandwidth access network, since it leads to sporadic disconnections of the downstream access network from the global web network. We propose a peer-to-peer cooperative web caching scheme to address the cache bandwidth bottleneck problem. We show that it performs the task of caching and delivery of large objects in an efficient and cost-effective manner, without generating significant overheads for participating peers.

An Efficient Buffer Cache Management Scheme for Heterogeneous Storage Environments (이기종 저장 장치 환경을 위한 버퍼 캐시 관리 기법)

  • Lee, Se-Hwan;Koh, Kern;Bahn, Hyo-Kyung
    • Journal of KIISE:Computer Systems and Theory / v.37 no.5 / pp.285-291 / 2010
  • Flash memory has many good features such as small size, shock resistance, and low power consumption, but its cost is still too high for it to replace hard disks entirely. Recently, some mobile devices, such as laptops, have attempted to use flash memory and hard disk together to take advantage of the merits of both. However, existing operating systems (OSs) are not optimized for heterogeneous storage media. This paper presents a new buffer cache management scheme. First, we allocate buffer cache space according to the access patterns of block references and the characteristics of the storage media. Second, we prefetch data blocks selectively according to their location and access patterns. Third, we move destaged data from the buffer cache to hard disk or flash memory considering the access patterns of block references. Trace-driven simulation shows that the proposed scheme enhances the buffer cache hit ratio by up to 29.9% and reduces the total I/O elapsed time by up to 49.5%.

lpCSB+-tree : An Enhanced Main Memory Index Structure Employing the Level Prefetching Technique (lpCSB+-트리 : 레벨 프리페칭 기법을 이용하는 향상된 주기억장치 상주형 색인구조)

  • Hong Hyun Taek;Pee Jun Il;Song Seok Il;Yoo Jae Soo
    • Journal of KIISE:Databases / v.31 no.6 / pp.675-683 / 2004
  • In main-memory resident index structures, secondary cache misses considerably affect performance. Recently, several cache-conscious main-memory resident index structures have been proposed to reduce the impact of secondary cache misses. However, they still suffer from full secondary cache misses whenever they visit each level of an index tree. In this paper, we propose a new index structure that eliminates cache misses even when visiting each level of the index tree. The proposed index structure prefetches the grandchildren of the current node. Its basic structure is derived from the CSB+-tree, which uses the concept of node groups to increase fan-out. However, the insert algorithm of the proposed index structure reduces the cost of a split significantly. We also show the superiority of our algorithm through various performance evaluations.

Design and Implementation of an In-Memory File System Cache with Selective Compression (대용량 파일시스템을 위한 선택적 압축을 지원하는 인-메모리 캐시의 설계와 구현)

  • Choe, Hyeongwon;Seo, Euiseong
    • Journal of KIISE / v.44 no.7 / pp.658-667 / 2017
  • The demand for large-scale storage systems has continued to grow due to the emergence of multimedia, social-network, and big-data services. In order to improve the response time and reduce the load of such large-scale storage systems, DRAM-based in-memory cache systems are becoming popular. However, the high cost of DRAM severely restricts their capacity. While compressing cache entries has been proposed to deal with the capacity limitation, compression and decompression, which are technically difficult to parallelize, induce significant processing overhead and in turn lengthen the response time. This paper proposes a selective compression scheme for in-memory file system caches that rapidly estimates the compression ratio of incoming cache entries from their Shannon entropies and compresses only the entries with a low compression ratio. In addition, the design and implementation of an in-kernel in-memory file system cache with the proposed selective compression scheme are described. The evaluation showed that the proposed scheme reduced the execution time of benchmarks by approximately 18% in comparison to the conventional non-compressing in-memory cache scheme. It also provided a cache hit ratio similar to that of the all-compressing counterpart and reduced the execution time by 7.5% by reducing the compression overhead. In addition, it was shown that the selective compression scheme can reduce the CPU time used for compression by 28% compared to the all-compressing scheme.
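
The entropy-based selection described in this abstract might look roughly like the sketch below. It is a user-space illustration under stated assumptions, not the in-kernel implementation from the paper: `zlib` stands in for whatever compressor the authors used, the threshold of 6.0 bits per byte is an arbitrary illustrative value, and the function names `shannon_entropy`, `store_entry`, and `load_entry` are hypothetical. It also assumes that low-entropy entries are the ones expected to compress well.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def store_entry(data: bytes, entropy_threshold: float = 6.0):
    """Return (compressed_flag, payload) for a cache entry.

    Entries whose entropy is below the (assumed) threshold are compressed;
    high-entropy entries are stored as-is to avoid wasted CPU time.
    """
    if shannon_entropy(data) < entropy_threshold:
        return True, zlib.compress(data)
    return False, data


def load_entry(compressed: bool, payload: bytes) -> bytes:
    """Recover the original bytes of a cache entry."""
    return zlib.decompress(payload) if compressed else payload
```

Computing the byte histogram is a single linear pass, which is why an entropy estimate can be much cheaper than trial compression; the threshold would need tuning against a real workload.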

Cache Table Management for Effective Label Switching (효율적인 레이블 스위칭을 위한 캐쉬 테이블 관리)

  • Kim, Nam-Gi;Yoon, Hyun-Soo
    • Journal of KIISE:Information Networking / v.28 no.2 / pp.251-261 / 2001
  • Traffic on the Internet has been growing exponentially for some time, and this growth is beginning to stress current routers. Switching technology, however, offers much higher performance, so label switching networks, which combine IP routing with switching technology, have emerged. In data-driven label switching in particular, flow classification and cache table management are needed. Flow classification classifies packets into switching and non-switching packets, and cache table management maintains the cache table, which contains the information for flow classification and label switching. Cache table management affects the performance of a label switching network as considerably as flow classification does, because a bigger cache table allows more packets to be switched and keeps the setup cost lower, but the cache is restricted by local router resources. For that reason, cache replacement schemes need to be studied for efficient cache table management under Internet traffic characterized by user. In this paper, we propose several cache replacement schemes for label switching networks. First, without a limit on the switching capacity of the router, we introduce the FIFO (First In First Out), LFC (Least Flow Count), and LRU (Least Recently Used) schemes and propose the priority LRU and weighted priority LRU schemes. Second, with a limit on the switching capacity of the router, we introduce the LFC-LFC, LFC-LRU, LRU-LFC, and LRU-LRU schemes and propose the LRU-weighted LRU scheme. The weighted priority LRU scheme performed best without the limit, and the LRU-weighted LRU scheme performed best with it.
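
The three baseline replacement schemes named in this abstract can be compared with a small sketch like the one below. It is an illustration only: the class name `FlowCacheTable` is hypothetical, LFC is interpreted here as evicting the flow with the fewest packets seen so far (an assumption, since the abstract does not define it), and the proposed priority and weighted variants are not modeled.

```python
from collections import OrderedDict


class FlowCacheTable:
    """Fixed-capacity flow cache table with a selectable replacement policy."""

    def __init__(self, capacity, policy="LRU"):
        if policy not in ("FIFO", "LRU", "LFC"):
            raise ValueError("policy must be 'FIFO', 'LRU', or 'LFC'")
        self.capacity = capacity
        self.policy = policy
        # flow_id -> packet count; dict order is insertion order (FIFO, LFC)
        # or recency order (LRU).
        self.table = OrderedDict()

    def on_packet(self, flow_id):
        """Record a packet; return True if its flow was already cached."""
        if flow_id in self.table:
            self.table[flow_id] += 1
            if self.policy == "LRU":
                self.table.move_to_end(flow_id)  # mark as most recently used
            return True
        if len(self.table) >= self.capacity:
            self._evict()
        self.table[flow_id] = 1
        return False

    def _evict(self):
        if self.policy == "LFC":
            # Assumed meaning of Least Flow Count: drop the flow with the
            # fewest packets (ties go to the oldest entry).
            victim = min(self.table, key=self.table.get)
            del self.table[victim]
        else:
            # FIFO: oldest inserted entry; LRU: least recently used entry.
            self.table.popitem(last=False)
```

Replaying the same packet trace through tables with different `policy` values and counting the `True` results gives a simple hit-rate comparison between the schemes.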

Analysis on the Performance and Temperature of the 3D Quad-core Processor according to Cache Organization (캐쉬 구성에 따른 3차원 쿼드코어 프로세서의 성능 및 온도 분석)

  • Son, Dong-Oh;Ahn, Jin-Woo;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information / v.17 no.6 / pp.1-11 / 2012
  • As process technology scales down, multi-core processors suffer from serious problems such as increased interconnection delay, high power consumption, and thermal problems. To solve these problems in 2D multi-core processors, researchers have focused on the 3D multi-core processor architecture. Compared to the 2D multi-core processor, the 3D multi-core processor decreases interconnection delay by reducing wire length significantly, since cores on different layers are connected using vertical through-silicon vias (TSVs). However, the power density of the 3D multi-core processor is dramatically higher than that of the 2D multi-core processor, because multiple cores are stacked vertically. Unfortunately, increased power density causes thermal problems, resulting in high cooling cost and a negative impact on reliability. Therefore, temperature should be considered together with performance in designing 3D multi-core processors. In this work, we analyze the temperature of the cache in quad-core processors while varying the cache organization. Then, we propose a low-temperature cache organization to overcome the thermal problems. Our evaluation shows that the peak temperature of the instruction cache is lower than the threshold, whereas the peak temperature of the data cache is higher than the threshold when the cache is composed of many ways. According to the results, our proposed cache organization not only efficiently reduces the peak temperature but also reduces the performance degradation for 3D quad-core processors.