• 제목/요약/키워드: Trace cache

검색결과 45건 처리시간 0.028초

다양한 블럭 크기를 갖는 섹터 캐시 메모리의 Trace-driven 시뮬레이션 알고리즘 (A New trace-driven Simulation Algorithm for Sector Cache Memories with Various Block Sizes)

  • Dong Gue Park
    • 전자공학회논문지B
    • /
    • 제32B권6호
    • /
    • pp.849-861
    • /
    • 1995
  • In this paper, a new trace driven simulation algorithm is proposed to evaluate the bus traffic and the miss ration of the various sector cache memories, which have various sub-block sizes and block sizes and associativities and number of sets, with a single pass through an address trace. Trace-driven simulaton is usually used as a method for performance evaluation of sector cache memories, but it spends a lot of simulation time for simulating the diverse cache configurations with a long address trace. The proposed algorithm shortens the simulation time by evaluating the performance of the various sector cache configurations. which have various sub-block sizes and block sizes and associativities and number of sets , with a single pass through an address trace. Our simulation results show that the run times of the proposed simulation algorithm can be considerably reduced than those of existing simulation algorithms, when the proposed algorithm is miplemented in C language and the address traces obtained from the various sample programs are used as a input of trace-driven simulation.

  • PDF

Low Power Trace Cache for Embedded Processor

  • Moon Je-Gil;Jeong Ha-Young;Lee Yong-Surk
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2004년도 ICEIC The International Conference on Electronics Informations and Communications
    • /
    • pp.204-208
    • /
    • 2004
  • Embedded business will be expanded market more and more since customers seek more wearable and ubiquitous systems. Cellular telephones, PDAs, notebooks and portable multimedia devices could bring higher microprocessor revenues and more rewarding improvements in performance and functions. Increasing battery capacity is still creeping along the roadmap. Until a small practical fuel cell becomes available, microprocessor developers must come up with power-reduction methods. According to MPR 2003, the instruction and data caches of ARM920T processor consume $44\%$ of total processor power. The rest of it is split into the power consumptions of the integer core, memory management units, bus interface unit and other essential CPU circuitry. And the relationships among CPU, peripherals and caches may change in the future. The processor working on higher operating frequency will exact larger cache RAM and consume more energy. In this paper, we propose advanced low power trace cache which caches traces of the dynamic instruction stream, and reduces cache access times. And we evaluate the performance of the trace cache and estimate the power of the trace cache, which is compared with conventional cache.

  • PDF

효율적인 버퍼 캐시 관리를 위한 동적 캐시 분할 블록교체 기법 (Dynamic Cache Partitioning Strategy for Efficient Buffer Cache Management)

  • 진재선;허의남;추현승
    • 한국시뮬레이션학회논문지
    • /
    • 제12권2호
    • /
    • pp.35-44
    • /
    • 2003
  • The effectiveness of buffer cache replacement algorithms is critical to the performance of I/O systems. In this paper, we propose the degree of inter-reference gap (DIG) based block replacement scheme that retains merits of the least recently used (LRU) such as simple implementation and good cache hit ratio (CHR) for general patterns of references, and improves CHR further. In the proposed scheme, cache blocks with low DIGs are distinguished from blocks with high DIGs and the replacement block is selected among high DIGs blocks as done in the low inter-reference recency set (LIRS) scheme. Thus, by having the effect of the partitioning the cache memory dynamically based on DIGs, CHR is improved. Trace-driven simulation is employed to verified the superiority of the DIG based scheme and shows that the performance improves up to about 175% compared to the LRU scheme and 3% compared to the LIRS scheme for the same traces.

  • PDF

CPC: A File I/O Cache Management Policy for Compute-Bound Workloads

  • Bahn, Hyokyung
    • International journal of advanced smart convergence
    • /
    • 제11권2호
    • /
    • pp.1-6
    • /
    • 2022
  • With the emergence of the new era of the 4th industrial revolution, compute-bound workloads with large memory footprint like big data processing increase dramatically. Even in such compute-bound workloads, however, we observe bulky I/Os while loading big data from storage to memory. Although file I/O cache plays a role of accelerating the performance of storage I/O, we found out that the cache hit rate in such environments is not improved even though we increase the file I/O cache capacity because of some special I/O references generated by compute-bound workloads. To cope with this situation, we propose a new file I/O cache management policy that improves the cache hit rate for compute-bound workloads significantly. Trace-driven simulations by replaying file I/O reference logs of compute-bound workloads show that the proposed cache management policy improves the cache hit rate compared to the well-acknowledged CLOCK algorithm by a large margin.

n-way Set Associative Cache와 Fully Associative Cache성능 분석 (Performance Analysis of n-way Associative Cache and Fully Associative Cache)

  • 조용훈;김정선
    • 한국정보처리학회논문지
    • /
    • 제4권3호
    • /
    • pp.802-810
    • /
    • 1997
  • 본 논문에서는 n-way Set Associative Cache와 Fully Associative Cache의 유용성 검증을 위하여 direct mapping,2_,4_,16_way set associative mapping 뿐만 아니라 32_, 64_,128_,256_,512_,1024_,2048_,그리고4096_way set assiciative mapping을 사용하는 캐취 의 성능을 제안된 시뮬레이터 프로그램을 실행시켜 분석한다. 일반적으로 캐쉬 메모리 내에있는 하나의 라인보호 내에 수용 가능한 주기억장치의 라인 수 n이 커짐에 따라 그 성능 선형적으로 개선될 것으로 기대되지만, 본 논문의 분석에 따르면 512K 이상의 대용량 캐쉬에서는 n의 변화에 따른 성능 개선이 거의 없는 상태였고 소용량 캐쉬의 경우에도 사용된 라이사이즈가 작은 경우 그 성능개선이 미미하였으며 라인사이즈가 비교적 큰 캐쉬에서는 괄목할 만한 성능개선이 있음을 확인하였다.

  • PDF

낮은 쓰기 성능을 갖는 비휘발성 메인 메모리 시스템을 위한 성능 및 에너지 최적화 기법 (Performance and Energy Optimization for Low-Write Performance Non-volatile Main Memory Systems)

  • 정우순;이형규
    • 대한임베디드공학회논문지
    • /
    • 제13권5호
    • /
    • pp.245-252
    • /
    • 2018
  • Non-volatile RAM devices have been increasingly viewed as an alternative of DRAM main memory system. However some technologies including phase-change memory (PCM) are still suffering from relatively poor write performance as well as limited endurance. In this paper, we introduce a proactive last-level cache management to efficiently hide a low write performance of non-volatile main memory systems. The proposed method significantly reduces the cache miss penalty by proactively evicting the part of cachelines when the non-volatile main memory system is in idle state. Our trace-driven simulation demonstrates 24% performance enhancement, compared with a conventional LRU cache management, on the average.

Bitmap-based Prefix Caching for Fast IP Lookup

  • Kim, Jinsoo;Ko, Myeong-Cheol;Nam, Junghyun;Kim, Junghwan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제8권3호
    • /
    • pp.873-889
    • /
    • 2014
  • IP address lookup is very crucial in performance of routers. Several works have been done on prefix caching to enhance the performance of IP address lookup. Since a prefix represents a range of IP addresses, a prefix cache shows better performance than an IP address cache. However, not every prefix is cacheable in itself. In a prefix cache it causes false hit to cache a non-leaf prefix because there is possibly the longer matching prefix in the routing table. Prefix expansion techniques such as complete prefix tree expansion (CPTE) make it possible to cache the non-leaf prefixes as the expanded forms, but it is hard to manage the expanded prefixes. The expanded prefixes sometimes incur a great deal of update overhead in a routing table. We propose a bitmap-based prefix cache (BMCache) to provide low update overhead as well as low cache miss ratio. The proposed scheme does not have any expanded prefixes in the routing table, but it can expand a non-leaf prefix using a bitmap on caching time. The trace-driven simulation shows that BMCache has very low miss ratio in spite of its low update overhead compared to other schemes.

캐쉬 메모리가 버스 트래픽에 끼치는 영향 (The Effects of Cache Memory on the System Bus Traffic)

  • 조용훈;김정선
    • 한국통신학회논문지
    • /
    • 제21권1호
    • /
    • pp.224-240
    • /
    • 1996
  • It is common sense for at least one or more levels of cache memory to be used in these day's computer systems. In this paper, the impact of the internal cache memory organization on the performance of the computer is investigated by using a simulator program, which is wirtten by authors and run on SUN SPARC workstation, with several real execution, with several real execution trace files. 280 cache organizations have been simulated using n-way set associative mapping and LRU(Least Recently Used) replacement algorithm with write allocation policy. As a result, 16-way setassociative cache is the best configuration, and when we select 256KB cache memory and 64 byte line size, the bus traffic ratio was decreased compared to that of the noncache system so that a single bus could support almost 7 processors without any delay and degradationof high ratio(hit ratio was 99.21%). The smaller the line size we choose, the little lower hit ratio we can get, but the more processors can be supported by a single bus(maximum 18 processors). Therefore, using a proper cache memory organization can make a single bus structure be able to support multiple processors without any performance degradation.

  • PDF

2-레벨 디스크 캐쉬 시스템에서 디스크 블록 중복 저장을 최소화하는 효율적인 캐싱 알고리즘 (An Efficient Caching Algorithm to Minimize Duplicated Disk Blocks in 2-level Disk Cache System)

  • 류갑상;정수목
    • 한국컴퓨터산업학회논문지
    • /
    • 제5권1호
    • /
    • pp.57-64
    • /
    • 2004
  • 처리기와 디스크의 속도 차가 커지고 있어 I/O subsystem이 컴퓨터 시스템의 성능향상에 병목 현상을 일으키게 된다. 이러한 처리기와 디스크와의 속도 차를 극복하기 위한 한 방법으로 캐쉬가 사용되고 있다. 캐쉬를 사용하면 디스크 블록에 대한 접근 횟수를 줄일 수 있어 전체 시스템의 성능을 향상시킬 수 있다. 본 논문에서는 버퍼 캐쉬와 디스크 캐쉬를 가지는 시스템에서 서로 독립적으로 캐쉬가 관리되어 다수의 디스크 블록이 중복되게 유지되는 문제를 해결하기 위하여 디스크 블록의 중복을 최소화함으로 시스템의 성능을 개선하는 캐쉬 관리 기법을 제안하였다 시뮬레이션을 통하여 제안된 기법을 적용하였을 경우 디스크 블록에 대한 평균 접근 지연시간이 감소됨을 확인하였다.

  • PDF

오퍼랜드 참조 예측 캐쉬(ORPC)를 활용한 오퍼랜드 페치의 성능 개선 (Performance Improvement of Operand Fetching with the Operand Reference Prediction Cache(ORPC))

  • 김흥준;조경산
    • 한국정보처리학회논문지
    • /
    • 제5권6호
    • /
    • pp.1652-1659
    • /
    • 1998
  • 본 논문에서는 오퍼랜드 참조 지연과 자료 캐쉬에 대한 대역폭 요구를 줄이기 위하여, 명령어 페치 단계에서 오퍼랜드의 값과 주소 변환 정보를 예측하고 초기에 예측의 정확성을 검증하여 예측 실패에 의한 성능 손실을 최소화할 수 있는 오퍼랜드 차조 예측 캐쉬(ORPC) 구조를 제안하였다. 제안된 ORPC의 세 가지 운영 구조 (ORPC1, ORPC2, ORPC3)에 의한 예측의 정확도와 성능 개선은 6개의 벤치마크 프로그램의 trace-driven 시뮬레이션을 통해 분석되었다. 512항목의 ORPC2, ORPC3은 평균적으로 오퍼랜드 적재 참조의 45.3%에 대해 정확한 오퍼랜드를 예측하여 오퍼랜드 적재 시간 및 자료캐쉬의 대역폭 요구를 감소시키며, 또한 ORPC3은 전체 오퍼랜드 참조에 대해 98.1%의 주소 변환 정보를 제공하여 자료 TLB의 기능을 대신한다.

  • PDF