• Title/Summary/Keyword: 캐쉬 선인출 (cache prefetching)

A Study on the Prediction Accuracy Bounds of Instruction Prefetching (명령어 선인출 예측 정확도의 한계에 관한 연구)

  • Kim, Seong-Baeg;Min, Sang-Lyul;Kim, Chong-Sang
    • Journal of KIISE: Computer Systems and Theory / v.27 no.8 / pp.719-729 / 2000
  • Prefetching aims at reducing memory latency by fetching, in advance, data that are likely to be requested by the processor in the near future. The effectiveness of prefetching is determined by how accurately the needed instructions and data can be predicted. Most previous studies on prefetching were limited to proposing a particular prefetch scheme and evaluating its performance, paying little attention to the theoretical aspects of prefetching. This paper focuses on the theoretical aspects of instruction prefetching. For this purpose, we propose a clairvoyant prefetch model that makes use of perfect history information. Based on this theoretical model, we analyze upper limits on the prefetch prediction accuracies of the SPEC benchmarks. The results show that the prefetch prediction accuracy is very high when there is no cache. However, as the size of the instruction cache increases, the prefetch prediction accuracy drops drastically. For example, in the case of the spice benchmark, the prefetch prediction accuracy drops from 53% to 39% when the cache size increases from 2 Kbytes to 16 Kbytes (assuming a 16-byte block size). These results indicate that, as the cache size increases, most localities are captured by the cache, and that instruction prefetching based on information extracted from the references that miss in the cache suffers from prediction inaccuracies.
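
One plausible way to state the bound such a clairvoyant model measures (the notation below is ours, not the paper's): over a fixed miss-reference stream, let C(b, s) count how often block s immediately follows block b; then no predictor that names a single next block per miss can exceed

```latex
% Upper bound on one-block-ahead prediction accuracy over a fixed
% miss stream, where C(b,s) counts transitions b -> s (our notation).
A_{\max} \;=\; \frac{\sum_{b}\max_{s} C(b,s)}{\sum_{b}\sum_{s} C(b,s)}
```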

An Efficient Instruction Prefetching Scheme Based on the Page Access Information (페이지 접근 정보에 기반한 효율적인 명령어 캐쉬 선인출 기법)

  • Shin Soong-Hyun;Kim Cheol-Hong;Jhon Chu-Shik
    • Journal of KIISE: Computer Systems and Theory / v.33 no.5 / pp.306-315 / 2006
  • In general, the hit ratio of the first-level cache is one of the most important factors in determining the performance of computer systems. Prefetching from a lower-level memory structure is one of the most useful techniques for improving the hit ratio of the first-level cache. In this paper, we propose a prefetch on continuous same page access (CSPA) scheme which improves the prefetch efficiency of the instruction cache and reduces prefetch cost at the same time. The proposed CSPA scheme traces the page addresses of executed instructions to count how many times the same memory page is accessed continuously. To increase the prefetch efficiency, the CSPA scheme initiates a prefetch only if the number of accesses to the same page exceeds the threshold value. Generally, the size of an L1 cache block is smaller than that of an L2 cache block, so one L2 cache block contains a number of L1 cache blocks. To reduce the number of unnecessary accesses to the L2 cache due to prefetching, the CSPA scheme enables a prefetch only when the missed L1 block and the L1 block to be prefetched are in the same L2 cache block, leading to reduced prefetch cost. According to our simulations, the proposed prefetching scheme improves performance by up to 6.7%.
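
A minimal sketch of the CSPA decision logic described in the abstract (block sizes, the page size, the threshold value, and all names are our assumptions for illustration; the paper's hardware tables are not modeled):

```c
#include <stdio.h>
#include <stdbool.h>

#define PAGE_SHIFT   12   /* assumed 4 KB pages            */
#define L1_BLOCK     32   /* assumed L1 block size (bytes) */
#define L2_BLOCK    128   /* assumed L2 block size (bytes) */
#define THRESHOLD     3   /* assumed CSPA threshold        */

static unsigned long last_page = (unsigned long)-1;
static unsigned int  same_page_count = 0;

/* On an L1 miss at 'addr', decide whether to prefetch the next
 * sequential L1 block, following the two CSPA conditions:
 * 1) the same page has been accessed THRESHOLD times in a row, and
 * 2) the missed block and the prefetch candidate fall in the same
 *    L2 block, so the prefetch costs no extra L2 access.           */
bool cspa_should_prefetch(unsigned long addr)
{
    unsigned long page = addr >> PAGE_SHIFT;
    same_page_count = (page == last_page) ? same_page_count + 1 : 1;
    last_page = page;

    unsigned long next = addr + L1_BLOCK;   /* next-line candidate */
    bool same_l2_block = (addr / L2_BLOCK) == (next / L2_BLOCK);

    return same_page_count >= THRESHOLD && same_l2_block;
}

int main(void)
{
    /* A tiny synthetic miss stream inside one page. */
    unsigned long misses[] = { 0x1000, 0x1020, 0x1040, 0x1060, 0x10a0 };
    for (int i = 0; i < 5; i++)
        printf("miss 0x%lx -> prefetch? %d\n",
               misses[i], cspa_should_prefetch(misses[i]));
    return 0;
}
```

In this toy trace the third miss passes both tests, while the fourth is suppressed because its next-line candidate crosses an L2 block boundary, which is exactly the cost-saving condition the abstract describes.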

Prefetching Mechanism for Efficient Disconnected Operations in Mobile Computing Environment (이동 컴퓨팅 환경에서 효율적인 단절 연산을 위한 선인출 메카니즘)

  • 최창호;김명일;박상서;김성조
    • Proceedings of the Korean Information Science Society Conference / 1998.10a / pp.538-585 / 1998
  • In a mobile computing environment, a mobile host can acquire and process data only after connecting to a server over a wireless network. However, characteristics of wireless networks such as low bandwidth, the resulting latency, and network disconnection cause considerable inconvenience and inefficiency in practice. In particular, when a mobile host cannot reach the server because of a network disconnection and the data needed for its work are not in the cache, the work cannot proceed. To solve this problem, this paper proposes a prefetching mechanism in which the mobile host fetches in advance the data it will work on in the future and stores them in its cache. The proposed prefetching mechanism consists of a recorder, an analyzer, a prefetch-list generator, and a comparator. The recorder and the analyzer build a profile based on the mobile host's file reference patterns, and a prefetching mechanism that integrates prefetching with the cache replacement policy is presented to minimize the performance degradation caused by prefetching.

Disk Cache Operating Strategy Using Hints in Disk Drive (디스크 드라이브 레벨에서 힌트정보를 이용한 디스크 캐쉬 운영 방안)

  • 조재동;장태무
    • Proceedings of the Korean Information Science Society Conference / 2000.10c / pp.27-29 / 2000
  • The performance gap between microprocessor speed and disk access speed is pointed out as one of the major factors limiting computer system performance. Disk caching has been studied as a technique to narrow this gap, and prefetching has been widely studied as a way to improve disk cache performance. This paper proposes a prefetching method for a cache implemented in the disk drive that uses the characteristic type of each disk request as a hint, validates the effectiveness of the proposed method through simulation, and shows that the adaptively adjusted prefetching method can improve performance.
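
The abstract gives no implementation detail, but one way to read "request-type hints" is a per-stream sequentiality classifier that enables readahead and scales it up only for sequential request runs. The sketch below is entirely our own assumption, not the paper's design:

```c
#include <stdio.h>

/* Hypothetical drive-level readahead controller: classify each
 * request as sequential or random relative to the previous one and
 * adapt the prefetch amount accordingly (all constants assumed). */
typedef struct {
    long next_expected;  /* LBA expected if the stream is sequential */
    int  readahead;      /* current prefetch amount, in sectors      */
} ra_ctl;

#define RA_MAX 64

int on_request(ra_ctl *c, long lba, int len)
{
    if (lba == c->next_expected)        /* sequential hint: grow  */
        c->readahead = c->readahead ? c->readahead * 2 : 8;
    else                                /* random hint: back off  */
        c->readahead /= 2;
    if (c->readahead > RA_MAX) c->readahead = RA_MAX;
    c->next_expected = lba + len;
    return c->readahead;               /* sectors to prefetch now */
}

int main(void)
{
    ra_ctl c = { 0, 0 };
    long reqs[][2] = { {100,8}, {108,8}, {116,8}, {500,8}, {508,8} };
    for (int i = 0; i < 5; i++)
        printf("lba %ld -> readahead %d\n",
               reqs[i][0], on_request(&c, reqs[i][0], (int)reqs[i][1]));
    return 0;
}
```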

Data Prefetching Effect of the Stride Merging-Arrays Method (스트라이드 배열 병합 방법의 데이터 선인출 효과)

  • Jeong, In-Beom;Lee, Jun-Won
    • Journal of KIISE: Computer Systems and Theory / v.26 no.11 / pp.1429-1436 / 1999
  • A cache memory is composed of cache lines with multiple words to achieve the effect of data prefetching. However, if the prefetched data are not used, space in the cache memory is wasted and thus the cache miss rate increases. The merging-arrays method is used to reduce cache conflict misses, one of the causes of cache misses, but the existing merging-arrays method results in prefetching useless data into cache lines. In this paper, a stride merging-arrays method is proposed to improve on this behavior. Simulation results show that when a cache line is composed of multiple words, the stride merging-arrays method improves cache performance through not only a reduction of cache conflict misses but also useful data prefetching. This enhanced cache performance also translates into more scalable program performance as the number of processors increases.
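
For reference, the baseline transformation the paper improves on looks like this in C. The paper's stride variant additionally chooses the interleaving according to the loop's access stride; since the abstract does not give the exact layout rule, we only indicate that in a comment:

```c
#include <stdio.h>

#define N 1024

/* Separate arrays: a[i] and b[i] live far apart and can map to the
 * same cache set, causing conflict misses when used together.      */
double a[N], b[N];

/* Conventional merging-arrays: interleave the two arrays so that a
 * cache line fetched for .a also prefetches the .b used with it.
 * The paper's stride variant would pick the element grouping from
 * the loop's access stride instead of fixed pairwise interleaving. */
struct pair { double a, b; } ab[N];

double dot_separate(void)
{
    double s = 0.0;
    for (int i = 0; i < N; i++) s += a[i] * b[i];
    return s;
}

double dot_merged(void)
{
    double s = 0.0;
    for (int i = 0; i < N; i++) s += ab[i].a * ab[i].b;
    return s;
}

int main(void)
{
    for (int i = 0; i < N; i++) {
        a[i] = b[i] = 1.0;
        ab[i].a = ab[i].b = 1.0;
    }
    printf("%f %f\n", dot_separate(), dot_merged());
    return 0;
}
```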

An Adaptive Sequential Prefetching using Traffic Information in Shared-Memory Multiprocessors (공유메모리 다중처리기에서 상호연결망의 통신량을 고려하는 선인출 기법)

  • 박정우;손영철;정한조;맹승렬
    • Proceedings of the Korean Information Science Society Conference / 2000.04a / pp.633-635 / 2000
  • The performance of shared-memory multiprocessors built on an interconnection network is strongly affected by shared-memory access latency. Prefetching reduces effective memory access time by overlapping processor computation with data access. Existing prefetching schemes consider only reducing the number of cache misses and do not take the state of the interconnection network into account. This paper proposes a new prefetching scheme that reduces contention on the interconnection network by adjusting the amount of prefetching based on late prefetches (prefetches whose responses arrive late), and shows through program-driven simulation that it outperforms an existing prefetching scheme [1].
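
One plausible reading of the abstract's control loop (our interpretation; the paper's actual counters and thresholds are not given): treat prefetch responses that arrive after the demand reference as a congestion signal and shrink the sequential prefetch degree, growing it back while prefetches return on time.

```c
#include <stdio.h>

/* Hypothetical per-processor controller for adaptive sequential
 * prefetching: 'late' prefetches (response arrives after the demand
 * access needed the block) signal interconnect contention, so the
 * prefetch degree is reduced; timely prefetches let it grow back.
 * All window sizes and bounds are assumptions.                     */
#define WINDOW   16   /* prefetches per adaptation interval */
#define MAX_DEG   8

static int degree = 4;         /* current # of sequential blocks */
static int issued = 0, late = 0;

void on_prefetch_done(int was_late)
{
    issued++;
    late += was_late;
    if (issued < WINDOW) return;
    if (late * 4 > issued) {       /* >25% late: back off */
        if (degree > 1) degree--;
    } else if (late == 0) {        /* all timely: ramp up */
        if (degree < MAX_DEG) degree++;
    }
    issued = late = 0;
}

int main(void)
{
    /* Congested interval: half the prefetches come back late. */
    for (int i = 0; i < WINDOW; i++) on_prefetch_done(i % 2);
    printf("degree after congestion: %d\n", degree);   /* 3 */
    /* Quiet interval: everything on time. */
    for (int i = 0; i < WINDOW; i++) on_prefetch_done(0);
    printf("degree after quiet run:  %d\n", degree);   /* 4 */
    return 0;
}
```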

Prefetching Mechanism using the User's File Access Pattern Profile in Mobile Computing Environment (이동 컴퓨팅 환경에서 사용자의 FAP 프로파일을 이용한 선인출 메커니즘)

  • Choi, Chang-Ho;Kim, Myung-Il;Kim, Sung-Jo
    • Journal of KIISE: Information Networking / v.27 no.2 / pp.138-148 / 2000
  • In the mobile computing environment, in order to make copies of important files available when disconnected, the mobile host (client) must store them in its local cache while the connection is maintained. In this paper, we propose a prefetching mechanism for the client to save files which may be accessed in the near future. Our mechanism utilizes an analyzer, a prefetch-list producer, and a prefetch manager. The analyzer records the file access patterns of the user in a FAP (File Access Patterns) profile. Using the profile, the prefetch-list producer creates the prefetch-list. The prefetch manager requests a file server to return this list. We set the parameter TRP (Threshold of Reference Probability) to ensure that only reasonably related files can be prefetched: the prefetch-list producer adds files to the prefetch-list only if their reference probability is greater than the TRP. We also use the parameter TACP (Threshold of Access Counter Probability) to reduce the hoarding size required to store a prefetch-list. Finally, we measure metrics such as the cache hit ratio, the number of files referenced by the client after disconnection, and the hoarding size. The simulation results show that the performance of our mechanism is superior to that of the LRU caching mechanism. Our results also show that prefetching with the TACP can reduce the hoarding size while maintaining performance similar to prefetching without the TACP.
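
A minimal sketch of the TRP filtering step described above (the FAP-profile layout and all names are our assumptions for brevity; TACP pruning of the hoard is omitted):

```c
#include <stdio.h>

/* One FAP-profile entry: a candidate file and its reference
 * probability taken from the profile (layout assumed).         */
typedef struct {
    const char *name;
    double ref_prob;   /* reference probability from the profile */
} fap_entry;

#define TRP 0.30       /* assumed Threshold of Reference Probability */

/* Build the prefetch-list: only files whose reference probability
 * exceeds TRP are hoarded, as the paper's prefetch-list producer
 * is described to do.                                              */
int build_prefetch_list(const fap_entry *prof, int n,
                        const fap_entry **out)
{
    int m = 0;
    for (int i = 0; i < n; i++)
        if (prof[i].ref_prob > TRP)
            out[m++] = &prof[i];
    return m;
}

int main(void)
{
    fap_entry profile[] = {
        { "report.doc", 0.72 }, { "notes.txt", 0.15 },
        { "slides.ppt", 0.41 }, { "tmp.log",   0.02 },
    };
    const fap_entry *list[4];
    int m = build_prefetch_list(profile, 4, list);
    for (int i = 0; i < m; i++)
        printf("prefetch %s (p=%.2f)\n", list[i]->name, list[i]->ref_prob);
    return 0;
}
```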

Analysis of Web Server Referencing Characteristics and Performance Improvement of Web Server (웹 서버의 참조 특성 분석과 성능 개선)

  • Ahn, Hyo-Beom;Cho, Kyung-San
    • The KIPS Transactions: Part A / v.8A no.3 / pp.201-208 / 2001
  • The explosive growth of the Web and the non-uniform characteristics of client requests result in the performance degradation of Web servers, and the server cache has been recognized as the solution. We analyzed Web server access characteristics: repetition, size, and locality of access. Based on the results, we analyzed cache removal policies and proposed a prefetch strategy to improve the hit ratio of server caches. In addition, through trace-driven simulation based on traces from real Web sites, we showed the performance improvement achieved by our proposal.

A Data Prefetching Scheme Exploiting the Grain Size in Parallel Programs using Data Arrays (데이타 배열을 사용하는 병렬 프로그램에서 그레인 크기를 이용한 데이타 선인출 기법)

  • Jung, In-Bum;Lee, Joon-Won
    • Journal of KIISE: Computer Systems and Theory / v.27 no.1 / pp.101-108 / 2000
  • The data prefetching scheme is an effective technique to reduce main memory access latency by overlapping processor computations with data accesses. However, if the prefetched data replace useful existing data in the cache memory and are not used in computations, the performance of programs is aggravated. This phenomenon results from the lack of correct predictions for the data to be used in the future. When parallel programs use data arrays for computations, the grain size is useful information for a data prefetching scheme because it indicates the range of data used in computations. Based on this information, we suggest a new data prefetching scheme that exploits the grain size of the parallel program. Simulation results show that the suggested prefetching scheme improves the performance of the simulated parallel programs owing to the reduction of bus transactions as well as more useful prefetching operations.
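
A minimal sketch of the idea as we read it (names and sizes are ours): since the grain assigned to a processor bounds the array range its loop will touch, prefetch exactly the cache blocks covering that grain rather than a fixed next-N-blocks window.

```c
#include <stdio.h>

#define BLOCK 32   /* assumed cache block size in bytes */

/* Stand-in for a real prefetch instruction. */
static void issue_prefetch(unsigned long addr)
{
    printf("prefetch block at 0x%lx\n", addr);
}

/* Prefetch exactly the blocks covering one grain of the data array:
 * the grain size bounds the range this processor's loop will touch,
 * so nothing useless is fetched.                                    */
void prefetch_grain(unsigned long base, unsigned long grain_bytes)
{
    unsigned long first = base / BLOCK * BLOCK;
    unsigned long last  = (base + grain_bytes - 1) / BLOCK * BLOCK;
    for (unsigned long b = first; b <= last; b += BLOCK)
        issue_prefetch(b);
}

int main(void)
{
    /* One processor's grain: 96 bytes starting at offset 0x2010. */
    prefetch_grain(0x2010, 96);
    return 0;
}
```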

A Dynamic Prefetch Filtering Scheme to Enhance the Usefulness of Cache Memory (캐시 메모리의 유용성을 높이는 동적 선인출 필터링 기법)

  • Chon Young-Suk;Lee Byung-Kwon;Lee Chun-Hee;Kim Suk-Il;Jeon Joong-Nam
    • The KIPS Transactions: Part A / v.13A no.2 s.99 / pp.123-136 / 2006
  • Prefetching is an effective way to reduce the latency caused by memory accesses. However, excessively aggressive prefetching not only leads to cache pollution, which can cancel out the benefits of prefetching, but also increases bus traffic, leading to overall performance degradation. In this paper, a prefetch filtering scheme is proposed which dynamically decides whether to commence prefetching by referring to a filtering table, in order to reduce the cache pollution caused by unnecessary prefetches. First, a prefetch hashing table 1-bit state code filtering scheme (PHT1bSC) is analyzed to show the lock problem of the conventional scheme: like the conventional scheme it uses N:1 mapping, but each 1-bit entry can represent only two states. A complete block address table filtering scheme (CBAT) is introduced as a reference for the comparative study. A prefetch block address lookup table scheme (PBALT), the main idea of this paper, is then proposed, which exhibits the most exact filtering performance: its table has the same length as in the PHT1bSC scheme, each entry has the same fields as in the CBAT scheme, and a recently prefetched but not yet referenced data block address is mapped 1:1 to an entry of the filter table. Commonly used prefetch schemes are simulated on general benchmarks and multimedia programs while varying the cache parameters. Compared with no filtering, the PBALT scheme shows an enhancement of up to 22%, and its cache miss ratio is decreased by 7.9% compared with the conventional PHT2bSC scheme by virtue of the enhanced filtering accuracy. The MADT of the proposed PBALT scheme is decreased by 6.1% compared with conventional schemes, reducing the total execution time.
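
A minimal sketch of the 1:1 address-lookup filtering that PBALT appears to describe (table size, indexing, and replacement policy are our assumptions from the abstract): a prefetch is suppressed when its block address is already present in the filter table, and recorded otherwise.

```c
#include <stdio.h>
#include <stdbool.h>

#define ENTRIES 256   /* assumed filter-table length */
#define BLOCK    32   /* assumed cache block size    */

/* Direct-mapped filter table holding full prefetched block
 * addresses: one block address per entry (1:1, as opposed to the
 * N:1 hashed mapping of PHT-style filters), so a hit is exact.   */
static unsigned long table[ENTRIES];   /* 0 = empty (assumed)     */

bool pbalt_allow_prefetch(unsigned long addr)
{
    unsigned long blk = addr / BLOCK * BLOCK;
    unsigned int  idx = (blk / BLOCK) % ENTRIES;
    if (table[idx] == blk)
        return false;       /* already prefetched: filter it out  */
    table[idx] = blk;       /* record and let the prefetch go out */
    return true;
}

int main(void)
{
    printf("%d\n", pbalt_allow_prefetch(0x4000));  /* 1: issued       */
    printf("%d\n", pbalt_allow_prefetch(0x4008));  /* 0: same block   */
    printf("%d\n", pbalt_allow_prefetch(0x6000));  /* 1: entry reused */
    return 0;
}
```

Because the stored address is complete, a table hit can never be a false positive; the direct-mapped indexing (shown by the third lookup evicting the first entry) is where the scheme trades capacity for the exactness the abstract claims.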