• Title/Summary/Keyword: Set-associative Cache

Search Result 26, Processing Time 0.022 seconds

Way-set Associative Management for Low Power Hybrid L2 Cache Memory (고성능 저전력 하이브리드 L2 캐시 메모리를 위한 연관사상 집합 관리)

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.13 no.3
    • /
    • pp.125-131
    • /
    • 2018
  • STT-RAM is attracting as a next generation Non-volatile memory for replacing cache memory with low leakage energy, high integration and memory access performance similar to SRAM. However, there is problem of write operations as the other Non_volatile memory. Hybrid cache memory using SRAM and STT-RAM is attracting attention as a cache memory structure with lowe power consumption. Despite this, reducing the leakage energy consumption by the STT-RAM is still lacking access to the Dynamic energy. In this paper, we proposed as energy management method such as a way-selection approach for hybrid L2 cache fo SRAM and STT-RAM and memory selection method of write/read operation. According to the simulation results, the proposed hybrid cache memory reduced the average energy consumption by 40% on SPEC CPU 2006, compared with SRAM cache memory.

A Proposal for Hit Ratio Improvement of a Microprocessor's Cache Memory (마이크로프로세서 캐쉬메모리의 적중률 개선을 위한 제안)

  • 조용훈;김정선
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.25 no.4B
    • /
    • pp.783-787
    • /
    • 2000
  • A microprocessor, which is used as a CPU for state-of-the-art personal computers, adopts 256KB or 512KB L2(Level 2) cache memory. This cache hires Direct Mapping Procedure, 32B Line Size, and no Write Allocation. In this cache architecture, we can expert about 2.5% hit ratio improvement by using 8-way Set Associative Mapping instead of Direct Mapping, 128B Line Size instead of 32B, and Write Allocation.

  • PDF

An Area Efficient Low Power Data Cache for Multimedia Embedded Systems (멀티미디어 내장형 시스템을 위한 저전력 데이터 캐쉬 설계)

  • Kim Cheong-Ghil;Kim Shin-Dug
    • The KIPS Transactions:PartA
    • /
    • v.13A no.2 s.99
    • /
    • pp.101-110
    • /
    • 2006
  • One of the most effective ways to improve cache performance is to exploit both temporal and spatial locality given by any program executional characteristics. This paper proposes a data cache with small space for low power but high performance on multimedia applications. The basic architecture is a split-cache consisting of a direct-mapped cache with small block sire and a fully-associative buffer with large block size. To overcome the disadvantage of small cache space, two mechanisms are enhanced by considering operational behaviors of multimedia applications: an adaptive multi-block prefetching to initiate various fetch sizes and an efficient block filtering to remove rarely reused data. The simulations on MediaBench show that the proposed 5KB-cache can provide equivalent performance and reduce energy consumption up to 40% as compared with 16KB 4-way set associative cache.

Performance and Power Consumption Improvement of Embedded RISC Core (임베디드 RISC 코어의 성능 및 전력 개선)

  • Jung, Hong-Kyun;Ryoo, Kwang-Ki
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.14 no.2
    • /
    • pp.453-461
    • /
    • 2010
  • This paper presents a branch prediction algorithm and a 4-way set-associative cache for performance improvement of embedded RISC core and a clock-gating algorithm using ODC (Observability Don't Care) operation to improve the power consumption of the core. The branch prediction algorithm has a structure using BTB(Branch Target Buffer) and 4-way set associative cache has lower miss rate than direct-mapped cache. Pseudo-LRU Policy, which is one of the Line Replacement Policies, is used for decreasing the number of bits that store LRU value. The clock gating algorithm reduces dynamic power consumption. As a result of estimation of performance and dynamic power, the performance of the OpenRISC core applied the proposed architecture is improved about 29% and dynamic power of the core using Chartered $0.18{\mu}m$ technology library is reduced by 16%.

Design of an Asynchronous Data Cache with FIFO Buffer for Write Back Mode (Write Back 모드용 FIFO 버퍼 기능을 갖는 비동기식 데이터 캐시)

  • Park, Jong-Min;Kim, Seok-Man;Oh, Myeong-Hoon;Cho, Kyoung-Rok
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.6
    • /
    • pp.72-79
    • /
    • 2010
  • In this paper, we propose the data cache architecture with a write buffer for a 32bit asynchronous embedded processor. The data cache consists of CAM and data memory. It accelerates data up lood cycle between the processor and the main memory that improves processor performance. The proposed data cache has 8 KB cache memory. The cache uses the 4-way set associative mapping with line size of 4 words (16 bytes) and pseudo LRU replacement algorithm for data replacement in the memory. Dirty register and write buffer is used for write policy of the cache. The designed data cache is synthesized to a gate level design using $0.13-{\mu}m$ process. Its average hit rate is 94%. And the system performance has been improved by 46.53%. The proposed data cache with write buffer is very suitable for a 32-bit asynchronous processor.

New Drowsy Cashing Method by Using Way-Line Prediction Unit for Low Power Cache (저전력 캐쉬를 위한 웨이-라인 예측 유닛을 이용한 새로운 드로시 캐싱 기법)

  • Lee, Jung-Hoon
    • Journal of The Institute of Information and Telecommunication Facilities Engineering
    • /
    • v.10 no.2
    • /
    • pp.74-79
    • /
    • 2011
  • The goal of this research is to reduce dynamic and static power consumption for a low power cache system. The proposed cache can achieve a low power consumption by using a drowsy and a way prediction mechanism. For reducing the static power, the drowsy technique is used at 4-way set associative cache. And for reducing the dynamic energy, one among four ways is selectively accessed on the basis of information in the Way-Line Prediction Unit (WLPU). This prediction mechanism does not introduce any additional delay though prediction misses are occurred. The WLPU can effectively reduce the performance overhead of the conventional drowsy caching by waking only a drowsy cache line and one way in advance. Our results show that the proposed cache can reduce the power consumption by about 40% compared with the 4-way drowsy cache.

  • PDF

An Associative Class Set Generation Method for supporting Location-based Services (위치 기반 서비스 지원을 위한 연관 클래스 집합 생성 기법)

  • 김호숙;용환승
    • Journal of KIISE:Databases
    • /
    • v.31 no.3
    • /
    • pp.287-296
    • /
    • 2004
  • Recently, various location-based services are becoming very popular in mobile environments. In this paper, we propose a new concept of a frequent item set, called “associative class set”, for supporting the location-based service which uses a large quantity of a spatial database in mobile computing environments, and then present a new method for efficiently generating the associative class set. The associative class set is generated with considering the temporal relation of queries, the spatial distance of required objects, and access patterns of users. The result of our research can play a fundamental role in efficiently supporting location-based services and in overcoming the limitation of mobile environments. The associative class set can be applied by a recommendation system of a geographic information system in mobile computing environments, mobile advertisement, city development planning, and client cache police of mobile users.

Cache and Pipeline Architecture Improvement and Low Power Design of Embedded Processor (임베디드 프로세서의 캐시와 파이프라인 구조개선 및 저전력 설계)

  • Jung, Hong-Kyun;Ryoo, Kwang-Ki
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2008.10a
    • /
    • pp.289-292
    • /
    • 2008
  • This paper presents a branch prediction algorithm and a 4-way set-associative cache for performance improvement of OpenRISC processor and a clock gating algorithm using ODC (Observability Don't Care) operation for a low-power processor. The branch prediction algorithm has a structure using BTB(Branch Target Buffer) and 4-way set associative cache has lower miss rate than direct-mapped cache. The clock gating algorithm reduces dynamic power consumption. As a result of estimation of performance and dynamic power, the performance of the OpenRISC processor using the proposed algorithm is improved about 8.9% and dynamic power of the processor using samsung $0.18{\mu}m$ technology library is reduced by 13.9%.

  • PDF

Designing a low-power L1 cache system using aggressive data of frequent reference patterns

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.7
    • /
    • pp.9-16
    • /
    • 2022
  • Today, with the advent of the 4th industrial revolution, IoT (Internet of Things) systems are advancing rapidly. For this reason, a various application with high-performance and large-capacity are emerging. Therefore, there is a need for low-power and high-performance memory for computing systems with these applications. In this paper, we propose an effective structure for the L1 cache memory, which consumes the most energy in the computing system. The proposed cache system is largely composed of two parts, the L1 main cache and the buffer cache. The main cache is 2 banks, and each bank consists of a 2-way set association. When the L1 cache hits, the data is copied into buffer cache according to the proposed algorithm. According to simulation, the proposed L1 cache system improved the performance of energy delay products by about 65% compared to the existing 4-way set associative cache memory.

The Effects of Cache Memory on the System Bus Traffic (캐쉬 메모리가 버스 트래픽에 끼치는 영향)

  • 조용훈;김정선
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.21 no.1
    • /
    • pp.224-240
    • /
    • 1996
  • It is common sense for at least one or more levels of cache memory to be used in these day's computer systems. In this paper, the impact of the internal cache memory organization on the performance of the computer is investigated by using a simulator program, which is wirtten by authors and run on SUN SPARC workstation, with several real execution, with several real execution trace files. 280 cache organizations have been simulated using n-way set associative mapping and LRU(Least Recently Used) replacement algorithm with write allocation policy. As a result, 16-way setassociative cache is the best configuration, and when we select 256KB cache memory and 64 byte line size, the bus traffic ratio was decreased compared to that of the noncache system so that a single bus could support almost 7 processors without any delay and degradationof high ratio(hit ratio was 99.21%). The smaller the line size we choose, the little lower hit ratio we can get, but the more processors can be supported by a single bus(maximum 18 processors). Therefore, using a proper cache memory organization can make a single bus structure be able to support multiple processors without any performance degradation.

  • PDF