• Title/Summary/Keyword: 분기 명령어

Search Result 70, Processing Time 0.02 seconds

Performance Analysis of Eager Dual Path Strategy (적극적 이중 경로 전략의 성능 분석)

  • Joo, Young-Sang;Cho, Kyung-San
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.1
    • /
    • pp.245-251
    • /
    • 2000
  • 파이프라인 프로세서를 위한 이중 경로 전략의 성능을 개선하기 위해, 본 논문에서는 통합 신뢰 매커지즘과 적극적 이중경로 전략(EDPS)을 제안한다. 통합 신뢰 매커니즘은 동적 신뢰 매커니즘과 정적 신뢰 매커니즘을 결합한 것으로 기존의 신뢰 매커니즘보다 신뢰 예측 정확도를 높일 수 있고 제안하는 EDPS와 결합하여 사용한다. EDPS는 높은 신뢰 집합에 g속하는 분기 명령어도 가능한 경우에는 두 경로를 모두 사용하여 조건 분기 명령어로 인해 발생하는 분기 지연의 총합을 줄일 수 있다. 6개 벤치마크에 대한 추적 기반의 시뮬레이션을 통해, 제안된 통합 신뢰 매커니즘을 사용하는 EDPS가 기존의 선택적 이중 경로 실행에 비해 분기 지연의 총합을 22%을 줄일 수 있다.

  • PDF

Direction-Embedded Branch Prediction based on the Analysis of Neural Network (신경망의 분석을 통한 방향 정보를 내포하는 분기 예측 기법)

  • Kwak Jong Wook;Kim Ju-Hwan;Jhon Chu Shik
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.42 no.1
    • /
    • pp.9-26
    • /
    • 2005
  • In the pursuit of ever higher levels of performance, recent computer systems have made use of deep pipeline, dynamic scheduling and multi-issue superscalar processor technologies. In this situations, branch prediction schemes are an essential part of modem microarchitectures because the penalty for a branch misprediction increases as pipelines deepen and the number of instructions issued per cycle increases. In this paper, we propose a novel branch prediction scheme, direction-gshare(d-gshare), to improve the prediction accuracy. At first, we model a neural network with the components that possibly affect the branch prediction accuracy, and analyze the variation of their weights based on the neural network information. Then, we newly add the component that has a high weight value to an original gshare scheme. We simulate our branch prediction scheme using Simple Scalar, a powerful event-driven simulator, and analyze the simulation results. Our results show that, compared to bimodal, two-level adaptive and gshare predictor, direction-gshare predictor(d-gshare. 3) outperforms, without additional hardware costs, by up to 4.1% and 1.5% in average for the default mont of embedded direction, and 11.8% in maximum and 3.7% in average for the optimal one.

2-Level Adaptive Branch Prediction Based on Set-Associative Cache (세트 연관 캐쉬를 사용한 2단계 적응적 분기 예측)

  • Shim, Won
    • The KIPS Transactions:PartA
    • /
    • v.9A no.4
    • /
    • pp.497-502
    • /
    • 2002
  • Conditional branches can severely limit the performance of instruction level parallelism by causing branch penalties. 2-level adaptive branch predictors were developed to get accurate branch prediction in high performance superscalar processors. Although 2 level adaptive branch predictors achieve very high prediction accuracy, they tend to be very costly. In this paper, set-associative cached correlated 2-level branch predictors are proposed to overcome the cost problem in conventional 2-level adaptive branch predictors. According to simulation results, cached correlated predictors deliver higher prediction accuracy than conventional predictors at a significantly lower cost. The best misprediction rates of global and local cached correlated predictors using set-associative caches are 5.99% and 6.28% respectively. They achieve 54% and 17% improvements over those of the conventional 2-level adaptive branch predictors.

The Instruction Flash memory system with the high performance dual buffer system (명령어 플래시 메모리를 위한 고성능 이중 버퍼 시스템 설계)

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.2
    • /
    • pp.1-8
    • /
    • 2011
  • NAND type Flash memory has performing much researches for a hard disk substitution due to its low power consumption, cheap prices and a large storage. Especially, the NAND type flash memory is using general buffer systems of a cache memory for improving overall system performance, but this has shown a tendency to emphasize in terms of data. So, our research is to design a high performance instruction NAND type flash memory structure by using a buffer system. The proposed buffer system in a NAND flash memory consists of two parts, i.e., a fully associative temporal buffer for branch instruction and a fully associative spatial buffer for spatial locality. The spatial buffer with a large fetching size turns out to be effective serial instructions, and the temporal buffer with a small fetching size can achieve effective branch instructions. According to the simulation results, we can reduce average miss ratios by around 77% and the average memory access time can achieve a similar performance compared with the 2-way, victim and fully associative buffer with two or four sizes.

Energy-aware Instruction Cache Design using Backward Branch Information for Embedded Processors (임베디드 시스템에서 후방 분기 명령어 정보를 이용한 저전력 명령어 캐쉬 설계 기법)

  • Yang, Na-Ra;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.13 no.6
    • /
    • pp.33-39
    • /
    • 2008
  • Energy efficiency should be considered together with performance when designing embedded processors. This paper proposes a new energy-aware instruction cache design using backward branch information to reduce the energy consumption in an embedded processor, since instruction caches consume a significant fraction of the on-chip energy. Proposed instruction cache is composed of two caches: a large main instruction cache and a small loop instruction cache. Proposed technique enables the selective access between the main instruction cache and the loop instruction cache to reduce the number of accesses to the main instruction cache, leading to good energy efficiency. Analysis results show that the proposed instruction cache reduces the energy consumption by 20% on the average, compared to the traditional instruction cache.

  • PDF

Accurate Prediction of Polymorphic Indirect Branch Target (간접 분기의 타형태 타겟 주소의 정확한 예측)

  • 백경호;김은성
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.41 no.6
    • /
    • pp.1-11
    • /
    • 2004
  • Modern processors achieve high performance exploiting avaliable Instruction Level Parallelism(ILP) by using speculative technique such as branch prediction. Traditionally, branch direction can be predicted at very high accuracy by 2-level predictor, and branch target address is predicted by Branch Target Buffer(BTB). Except for indirect branch, each of the branch has the unique target, so its prediction is very accurate via BTB. But because indirect branch has dynamically polymorphic target, indirect branch target prediction is very difficult. In general, the technique of branch direction prediction is applied to indirect branch target prediction, and much better accuracy than traditional BTB is obtained for indirect branch. We present a new indirect branch target prediction scheme which combines a indirect branch instruction with its data dependent register of the instruction executed earlier than the branch. The result of SPEC benchmark simulation which are obtained on SimpleScalar simulator shows that the proposed predictor obtains the most perfect prediction accuracy than any other existing scheme.

Instruction-corruption-less Binary Modification Mechanism for Static Stack Protections (이진 조작을 통한 정적 스택 보호 시 발생하는 명령어 밀림현상 방지 기법)

  • Lee, Young-Rim;Kim, Young-Pil;Yoo, Hyuck
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.1
    • /
    • pp.71-75
    • /
    • 2008
  • Many sensor operating systems have memory limitation constraint; therefore, stack memory areas of threads resides in a single memory space. Because most target platforms do not have hardware MMY (Memory Management Unit), it is difficult to protect each stack area. The method to solve this problem is to exchange original stack handling instructions in binary code for wrapper routines to protect stack area. In this exchanging phase, instruction corruption problem occurs due to difference of each instruction length between stack handling instructions and branch instructions. In this paper, we propose the algorithm to call a target routine without instruction corruption problem. This algorithm can reach a target routine by repeating branch instructions to have a short range. Our solution makes it easy to apply security patch and maintain upgrade of software of sensor node.

A Novel Approach to Improve Branch Prediction Accuracy by Neural Network Information (신경망을 이용한 분기 예측의 개선)

  • Kwak, Jong Wook;Kim, Ju-Hwan;Jhon, Chu Shik
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2004.05a
    • /
    • pp.1651-1654
    • /
    • 2004
  • 파이프라인과 슈퍼스칼라 방식이 일반화된 시스템 구조 하에서, 분기 명령어는 시스템 전체적인 성능에 중요한 영향을 미친다. 특히 분기 예측이 실패했을 경우, 잘못된 분기 예측으로 인한 페널티가 발생한다는 점에서 분기 예측의 정확도에 대한 중요성은 크다고 할 수 있다. 본 논문에서는 분기 예측의 정확도를 높이기 위해서, 분기 예측과 관련된 신경망을 구축하여 이를 통해 분기 예측에 필요한 각 요소별 가중치의 변화를 분석하고, 이를 분기 예측에 새롭게 반영하고자 한다. 본 논문에서는 이를 위해 실행 구동 방식의 시뮬레이터인 SimpleScalar를 통하여 모의 실험을 수행하였으며, 실험 결과 본 논문에서 제시한 새로운 기법이 기존의 일반적인 이단계 적응형 분기 예측 기법이나 gshare 기법에 비하여 더 우수한 결과를 보였다.

  • PDF

Recovery Modules for Speculative Update Branch History (분기 정보의 투기적 사용에 대한 효율적인 복구 기법)

  • Kwak Jong Wook;Kim Ju-Hwan;Jhang Seong Tae;Jhon Chu Shik
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2005.11a
    • /
    • pp.766-768
    • /
    • 2005
  • 분기 영령어의 예측 정확도는 시스템 전체 성능에 중대한 영향을 미친다. 여러 분기 예측 방식 가운데 하나인 "분기 정보의 투기적 사용" 은 분기 명령어의 가장 최근 기록을 일관되게 사용할 수 있도록 도와줌으로 해서 분기 예측의 정확도 향상에 크게 기여한다. 하지만 이와 같은 기법은 미완료 분기에 대한 히스토리를 투기적으로 사용하는 방식이다. 따라서 사용되는 정보가 올바르지 못할 수 있으며, 이런 경우 적절한 복구 기법을 필요로 한다. 본 논문에서는 분기 정보의 투기적 사용에 대한 필요성과 효율적인 복구 기법을 제안한다. 제안된 기법은 이전 연구와 비교하여 상당한 하드웨어 요구량의 감소를 가져왔으며, 또한 프로그램 수행의 정확성을 해치지 않으면서 최대 $3.3\%$의 성능향상을 보였다.

  • PDF

The Compressed Instruction Set Architecture for the OpenRISC Processor (OpenRISC 프로세서를 위한 압축 명령어 집합 구조)

  • Kim, Dae-Hwan
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.10
    • /
    • pp.11-23
    • /
    • 2012
  • To achieve efficient code size reduction, this paper proposes a new compressed instruction set architecture for the OpenRISC architecture. The new instructions and their corresponding formats are designed by the profiling information of the existing instruction usage. New 16-bit instructions and 32-bit instructions are proposed to compressed the existing 32-bit instructions and instruction sequences, respectively. The proposed instructions can be classified into three types. The first is the new 16-bit instructions for the frequent normal 32-bit instructions such as add, load, store, branch, and jump instructions. The second type is the new 32-bit instructions for the consecutive two load instructions, two store instructions, and 32-bit data mov instructions. Finally, two new 32-bit instructions are proposed to compress function prolog and epilog code, respectively. OpenRISC hardware decoder is extended to support the new instructions. Experiments show that the efficiency of code size reduction improves by an average of 30.4% when compared to the OR1200 instruction set architecture without loss of execution performance.