Search | Korea Science

Analytical Models of Instruction Fetch on Superscalar Processors

Kim, Sun-Mo;Jung, Jin-Ha;Park, Sang-Bang
- Proceedings of the IEEK Conference
- /
- 2000.07b
- /
- pp.619-622
- /
- 2000
This research presents an analytical model to predict the instruction fetch rate on superscalar Processors. The proposed model is also able to analyze the performance relationship between cache miss and branch prediction miss. The proposed model takes into account various kind of architectural parameters such as branch instruction probability, cache miss rate, branch prediction miss rate, and etc.. To prove the correctness of the proposed model, we performed extensive simulations and compared the results with those of the analytical models. Simulation results showed that the pro-posed model can estimate the instruction fetch rate accurately within 10% error in most cases. The model is also able to show the effects of the cache miss and branch prediction miss on the performance of instruction fetch rate, which can provide an valuable information in designing a balanced system.
PDF

Design of a DI model-based Content Addressable Memory for Asynchronous Cache

Battogtokh, Jigjidsuren;Cho, Kyoung-Rok
- International Journal of Contents
- /
- v.5 no.2
- /
- pp.53-58
- /
- 2009
This paper presents a novel approach in the design of a CAM for an asynchronous cache. The architecture of cache mainly consists of four units: control logics, content addressable memory, completion signal logic units and instruction memory. The pseudo-DCVSL is useful to make a completion signal which is a reference for handshake control. The proposed CAM is a very simple extension of the basic circuitry that makes a completion signal based on DI model. The cache has 2.75KB CAM for 8KB instruction memory. We designed and simulated the proposed asynchronous cache including CAM. The results show that the cache hit ratio is up to 95% based on pseudo-LRU replacement policy.
https://doi.org/10.5392/IJoC.2009.5.2.053 인용 PDF

Radiation-Induced Soft Error Detection Method for High Speed SRAM Instruction Cache (고속 정적 RAM 명령어 캐시를 위한 방사선 소프트오류 검출 기법)

Kwon, Soon-Gyu;Choi, Hyun-Suk;Park, Jong-Kang;Kim, Jong-Tae
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.35 no.6B
- /
- pp.948-953
- /
- 2010
In this paper, we propose multi-bit soft error detection method which can use an instruction cache of superscalar CPU architecture. Proposed method is applied to high-speed static RAM for instruction cache. Using 1D parity and interleaving, it has less memory overhead and detects more multi-bit errors comparing with other methods. It only detects occurrence of soft errors in static RAM. Error correction is treated like a cache miss situation. When soft errors are occurred, it is detected by 1D parity. Instruction cache just fetch the words from lower-level memory to correct errors. This method can detect multi-bit errors in maximum 4$\times$4 window.
PDF KSCI

Design and Performance Evaluation of Expansion Buffer Cache (확장 버퍼 캐쉬의 설계 및 성능 평가)

Hong Won-Kee
- The KIPS Transactions:PartA
- /
- v.11A no.7 s.91
- /
- pp.489-498
- /
- 2004
VLIW processor is considered to be an appropriate processor for the embedded system, provided with high performance and low power con-sumption due to its simple hardware structure. Unfortunately, the VLIW processor often suffers from high memory access latency due to the variable length of I-packets, which consist of independent instructions to be issued in parallel. It is because of the variable I-packet length that some I-packets must be placed over two cache blocks, which are called straddle I-packets, so that two cache accesses are required to fetch such I-packets. In this paper, an expansion buffer cache is proposed to improve not only the instruction fetch bandwidth, but also the power consumption of the I-cache with moderate hardware cost. The expansion buffer cache has a small expansion buffer containing a fraction of a straddle packet along with the main cache to reduce the additional cache accesses due to the straddle I-packets. With a great reduction in the cache accesses due to the straddle packets, the expansion buffer cache can achieve $5{\~}9{\%}$improvement over the conventional I-caches in the $Delay{\cdot}Power{\cdot}Area$ metric.
https://doi.org/10.3745/KIPSTA.2004.11A.7.489 인용 PDF KSCI

The Efficient Buffer Size in A Dual Flash Memory Structure with Buffer System (이중 NAND 플래시 구조의 버퍼시스템에서 효율적 버퍼 크기)

Jung, Bo-Sung;Lee, Jung-Hoon
- IEMEK Journal of Embedded Systems and Applications
- /
- v.6 no.6
- /
- pp.383-391
- /
- 2011
As we know the effects of cache memory research, instruction and data caches can be separated for higher performance with Harvard CPUs. In this paper, we shows the efficiency of buffer system in the instruction and data flash storage medium. And we analyzed characteristics of the data and instruction flash and evaluated the performance. Finally, we propose the best buffer structure with an optimal block size and buffer size for the instruction and data flash.
https://doi.org/10.14372/IEMEK.2011.6.6.6 인용 PDF KSCI

Performance Analysis of Multicore Processor Architectures Based On Cache Size Effects (캐쉬 용량 효과에 대한 멀티코어 프로세서의 성능 연구)

Lee, Jongbok
- The Journal of the Institute of Internet, Broadcasting and Communication
- /
- v.12 no.6
- /
- pp.175-180
- /
- 2012
In order to overcome the complexity and performance limit problems of superscalar processors, the multicore architecture has been prevalent recently. The configuration and the size of instruction and data caches greatly gives effect on the performance of multicore processors. Using SPEC 2000 benchmarks as input, the trace-driven simulation has been performed for the 2-core to 16-core architectures with different sizes of caches extensively. As a result, the 2-way set associative instruction and data cache with the size of 64KB brought the best cost-effective performance.
https://doi.org/10.7236/JIWIT.2012.12.6.175 인용 PDF KSCI

An Efficient Instruction Prefetching Scheme Based on the Page Access Information (페이지 접근 정보에 기반한 효율적인 명령어 캐쉬 선인출 기법)

Shin Soong-Hyun;Kim Cheol-Hong;Jhon Chu-Shik
- Journal of KIISE:Computer Systems and Theory
- /
- v.33 no.5
- /
- pp.306-315
- /
- 2006
In general, the hit ratio of the first level cache is one of the most important factors in determining the performance of computer systems. Prefetching from lower level memory structure is one of the most useful techniques for improving the hit ratio of the first level cache. In this paper, we propose a prefetch on continuous same page access (CSPA) scheme which improves the prefetch efficiency of the instruction cache and reduces prefetch cost at the same time. The proposed CSPA scheme traces the page addresses of executed instructions to count how many times the same memory page is accessed continuously. To increase the prefetch efficiency, the CSPA scheme initiates prefetch only if the number of accesses to the same page exceeds the threshold value. Generally, the size of a L1 cache block is smaller than that of a L2 cache block. Therefore, one L2 cache block contains a number of L1 cache blocks. To reduce the number of unnecessary accesses to the L2 cache due to prefetch, the CSPA scheme enables prefetch only when the missed L1 block and the prefetch L1 block are in the same L2 cache block, leading to reduced prefetch cost. According to our simulations, the proposed prefetching scheme improves the performance by up to 6.7%.
PDF KSCI

Energy-efficient Set-associative Cache Using Bi-mode Way-selector (에너지 효율이 높은 이중웨이선택형 연관사상캐시)

Lee, Sungjae;Kang, Jinku;Lee, Juho;Youn, Jiyong;Lee, Inhwan
- KIPS Transactions on Computer and Communication Systems
- /
- v.1 no.1
- /
- pp.1-10
- /
- 2012
The way-lookup cache and the way-tracking cache are considered to be the most energy-efficient when used for level 1 and level 2 caches, respectively. This paper proposes an energy-efficient set-associative cache using the bi-mode way-selector that combines the way selecting techniques of the way-tracking cache and the way-lookup cache. The simulation results using an Alpha 21264-based system show that the bi-mode way-selecting L1 instruction cache consumes 27.57% of the energy consumed by the conventional set-associative cache and that it is as energy-efficient as the way-lookup cache when used for L1 instruction cache. The bi-mode way-selecting L1 data cache consumes 28.42% of the energy consumed by the conventional set-associative cache, which means that it is more energy-efficient than the way-lookup cache by 15.54% when used for L1 data cache. The bi-mode way-selecting L2 cache consumes 15.41% of the energy consumed by the conventional set-associative cache, which means that it is more energy-efficient than the way-tracking cache by 16.16% when used for unified L2 cache. These results show that the proposed cache can provide the best level of energy-efficiency regardless of the cache level.
https://doi.org/10.3745/KTCCS.2012.1.1.001 인용 PDF

Hybrid Value Predictor in Wide-Issue Superscalar Processor (슈퍼스칼라 프로세서에서 명령 윈도우 크기에 따른 혼합형 값 예측기)

Jeon, Byoung-Chan;Choi, Gyoo-Seok
- The Journal of the Institute of Internet, Broadcasting and Communication
- /
- v.9 no.2
- /
- pp.97-103
- /
- 2009
In this paper, the performance of a hybrid value predictor according to the instruction fetch rate on window size superscalar processors is evaluated. In general, the data dependency relations of instructions are increased with the number of the fetched instructions. Therefore, it is expected that the performance of a value predictor will be higher when the instruction fetch rate is increased. The performance is studied for the machine with collapsing buffer and he one with trace cache as an instruction fetch mechanism. As a result of experiment, it is showed that the performance effect of a value predictor is higher as the instruction fetch rate of instruction window size, IPC, predict rate on apply with non-tc and tc is increased.
PDF

A Study on the Prediction Accuracy Bounds of Instruction Prefetching (명령어 선인출 예측 정확도의 한계에 관한 연구)

Kim, Seong-Baeg;Min, Sang-Lyul;Kim, Chong-Sang
- Journal of KIISE:Computer Systems and Theory
- /
- v.27 no.8
- /
- pp.719-729
- /
- 2000
Prefetching aims at reducing memory latency by fetching, in advance, data that are likely to be requested by the processor in a near future. The effectiveness of prefetching is determined by how accurate the prediction on the needed instructions and data is. Most previous studies on prefetching were limited to proposing a particular prefetch scheme and its performance evaluation, paying little attention to theoretical aspects of prefetching. This paper focuses on the theoretical aspects of instruction prefetching. For this purpose, we propose a clairvoyant prefetch model that makes use of perfect history information. Based on this theoretical model, we analyzed upper limits on the prefetch prediction accuracies of the SPEC benchmarks. The results show that the prefetch prediction accuracy is very high when there is no cache. However, as the size of the instruction cache increases, the prefetch prediction accuracy drops drastically. For example, in the case of the spice benchmark, the prefetch prediction accuracy drops from 53% to 39% when the cache size increases from 2Kbyte to 16Kbyte (assuming 16byte block size). These results indicate that as the cache size increases, most localities are captured by the cache and that instruction prefetching based on the information extracted from the references that missed in the cache suffers from prediction inaccuracies
PDF

Search Result 67, Processing Time 0.021 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)