Search | Korea Science

Performance Improvement of SVLIW Architectures by Removing LNOPs from An Object Code (목적 코드에서 LNOP 코드가 제거됨에 따른 SVLIW 구조의 성능 향상)

Jeong, Bo-Yun;Jeon, Joong-Nam;Kim, Suk-Il
- The Transactions of the Korea Information Processing Society
- /
- v.4 no.9
- /
- pp.2269-2279
- /
- 1997
SVLIW (Superscalar VLIW) processor, a family of VLIW processors schedules very long instruction words at runtime. If a very long instruction word that is to be issued occurs data dependence relations and/or resource conflicts with those words that were under execution, a long NOP word is issued instead of the word until all the data dependence relations and/or resource conflicts have been resolved. Thus, LNOPs can be removed in object codes for SVLIW processors. In this paper, we measure an improvement of the cache hit ratio caused by removing LNOPs in the object code. We also analyze an improvement of the processor performance due to higher cache hit ratio of the processor. Benchmark tests promise that the performance of SVLIW processors is improved more than 5% compared with that of traditional VLIW processors.
PDF

A Study on Processor Monitoring for Integration Test of Flight Control Computer equipped with A Modern Processor (최신 프로세서 탑재 비행제어 컴퓨터의 통합시험을 위한 프로세서 모니터링 연구)

Lee, Cheol;Kim, Jae-Cheol;Cho, In-Jae
- Journal of Institute of Control, Robotics and Systems
- /
- v.14 no.10
- /
- pp.1081-1087
- /
- 2008
This paper describes limitations and solutions of the existing processor-monitoring concept for a military supersonics aircraft Flight Control Computer (FLCC) equipped with modern architecture processor to perform the system integration test. Safecritical FLCC integration test, which requires automatic test for thousands of test cases and real-time input/output test condition generation, depends on the processor-monitoring device called Processor Interface (PI). The PI, which relies upon on the FLCC processor's external address and data-bus data, has some limitations due to multi-fetching capability of the modern sophisticated military processors, like C6000's VLIW (Very-Long Instruction Word) architecture and PowerPC's Superscalar architecture. Several techniques for limitations were developed and proper monitoring approach was presented for modem processor-adopted FLCC system integration test.
https://doi.org/10.5302/J.ICROS.2008.14.10.1081 인용 PDF KSCI

Superscalar RISC Microprocessor Architecture with enhanced Multimedia Instructions (멀티미디어 명령어를 강화한 수퍼스칼라 RISC 마이크로프로세서 구조)

이용환;문병인;이용석
- Proceedings of the IEEK Conference
- /
- 1999.11a
- /
- pp.931-934
- /
- 1999
For applications in multimedia to which genuine RISC microprocessors are not suitably applicable, a new generation of fast and flexible microprocessors is required. In this paper, as a technique of integrating DSP functionality in a general RISC processor, a RISC that can execute DSP extension instructions is developed to improve the performance of multimedia application execution. This processor can execute DSP instructions in parallel with the execution of ALU instructions for efficient and fast execution. In addition, the execution ability of integer instructions is improved by enhancing the RISC core itself.
PDF

A Performance Study on Many-core Processor Architectures with SPEC Benchmark Programs (SPEC 벤치마크 프로그램에 대한 매니코어 프로세서의 성능 연구)

Lee, Jongbok
- The Transactions of The Korean Institute of Electrical Engineers
- /
- v.62 no.2
- /
- pp.252-256
- /
- 2013
In order to overcome the complexity and performance limit problems of superscalar processors, the multi-core architecture has been prevalent recently. Usually, the number of cores mostly used for the multi-core processor architecture ranges from 2 to 16. However in the near future, more than 32-cores are likely to be utilized, which is called as many-core processor architecture. Using SPEC 2000 benchmarks as input, the trace-driven simulation has been performed for the 32 to 1024 many-core architectures extensively. For 1024-cores, the average performance scores 15.7 IPC, but the performance increase rate is saturated.
https://doi.org/10.5370/KIEE.2013.62.2.252 인용 PDF KSCI

Branch Misprediction Recovery Mechanism That Exploits Control Independence on Program (프로그램 상의 제어 독립성을 이용한 분기 예상 실패 복구 메커니즘)

Yoon, Sung-Lyong;Lee, Won-Mo;Cho, Yeong-Il
- Journal of KIISE:Computer Systems and Theory
- /
- v.29 no.7
- /
- pp.401-410
- /
- 2002
Control independence has been put forward as a new significant source of instruction-level parallelism for superscalar processors. In branch prediction mechanisms, all instructions after a mispredicted branch have to be squashed and then instructions of a correct path have to be re-fetched and re-executed. This paper presents a new branch misprediction recovery mechanism to reduce the number of instructions squashed on a misprediction. Detection of control independent instructions is accomplished with the help of the static method using a profiling and the dynamic method using a control flow of program sequences. We show that the suggested branch misprediction recovery mechanism improves the performance by 2~7% on a 4-issue processor, by 4~15% on an 8-issue processor and by 8~28% on a 16-issue processor.
PDF KSCI

A Branch Predictor with New Recovery Mechanism in ILP Processors for Agriculture Information Technology (농업정보기술을 위한 ILP 프로세서에서 새로운 복구 메커니즘 적용 분기예측기)

Ko, Kwang Hyun;Cho, Young Il
- Agribusiness and Information Management
- /
- v.1 no.2
- /
- pp.43-60
- /
- 2009
To improve the performance of wide-issue superscalar processors, it is essential to increase the width of instruction fetch and the issue rate. Removal of control hazard has been put forward as a significant new source of instruction-level parallelism for superscalar processors and the conditional branch prediction is an important technique for improving processor performance. Branch mispredictions, however, waste a large number of cycles, inhibit out-of-order execution, and waste electric power on mis-speculated instructions. Hence, the branch predictor with higher accuracy is necessary for good processor performance. In global-history-based predictors like gshare and GAg, many mispredictions come from commit update of the branch history. Some works on this subject have discussed the need for speculative update of the history and recovery mechanisms for branch mispredictions. In this paper, we present a new mechanism for recovering the branch history after a misprediction. The proposed mechanism adds an age_counter to the original predictor and doubles the size of the branch history register. The age_counter counts the number of outstanding branches and uses it to recover the branch history register. Simulation results on the SimpleScalar 3.0/PISA tool set and the SPECINT95 benchmarks show that gshare and GAg with the proposed recovery mechanism improved the average prediction accuracy by 2.14% and 9.21%, respectively and the average IPC by 8.75% and 18.08%, respectively over the original predictor.
PDF

The Processor Performance Model Using Statistical Simulation (통계적 모의실험을 이용하는 프로세서의 성능 모델)

Lee Jong-Bok
- Journal of KIISE:Computer Systems and Theory
- /
- v.33 no.5
- /
- pp.297-305
- /
- 2006
Trace-driven simulation is widely used for measuring the performance of a microprocessor in its initial design phase. However, since it requires much time and disk space, the statistical simulation has been studied as an alternative method. In this paper, statistical simulations are performed for a high performance superscalar microprocessor with a perceptron-based multiple branch predictor. For the verification, various hardware configurations are simulated using SPEC2000 benchmarks programs as input. As a result, we show that the statistical simulation is quite accurate and time saving for the evaluation of microprocessor architectures with multiple branch prediction.
PDF KSCI

3-way SuperScalar Decoder Design for ARMv7 Core (ARMv7 Core를 위한 3-way SuperScalar Decoder 설계)

Kim, Hyo-Won;Kim, In-Soo;Baek, Chul-Ki;Min, Hyoung-Bok
- Proceedings of the KIEE Conference
- /
- 2008.10c
- /
- pp.246-247
- /
- 2008
Further evolutions of technologies and needs of users will make mobile equipments improved. To make this happen, processor's good performance is essential. Hence, This paper propose a reform of Instruction Execute and Instruction Decode of contemporary ARMv7 which needs low-power and has the high performance for a faster processor. The first chapter explains why the performance of a processor has to be upgraded, the second chapter shows current technologies. The third chapter explains about the proposal and illustrates the structure. Finally, in the forth chapter, the conclusion will be made. 3-way Superscalar, that is proposed in this paper, will make designing a faster processor possible. And it will contribute for the advanced performance of mobile equipments.
PDF

Hybrid Value Predictor in Wide-Issue Superscalar Processor (슈퍼스칼라 프로세서에서 명령 윈도우 크기에 따른 혼합형 값 예측기)

Jeon, Byoung-Chan;Choi, Gyoo-Seok
- The Journal of the Institute of Internet, Broadcasting and Communication
- /
- v.9 no.2
- /
- pp.97-103
- /
- 2009
In this paper, the performance of a hybrid value predictor according to the instruction fetch rate on window size superscalar processors is evaluated. In general, the data dependency relations of instructions are increased with the number of the fetched instructions. Therefore, it is expected that the performance of a value predictor will be higher when the instruction fetch rate is increased. The performance is studied for the machine with collapsing buffer and he one with trace cache as an instruction fetch mechanism. As a result of experiment, it is showed that the performance effect of a value predictor is higher as the instruction fetch rate of instruction window size, IPC, predict rate on apply with non-tc and tc is increased.
PDF

Performance Balance between the Window size and Issue Width (다양한 윈도우 크기와 이슈폭에 따른 시스템의 성능 변화)

Kim, Tae-Mok;Yi, Jong-Su;Kim, Jun-Seong
- Proceedings of the IEEK Conference
- /
- 2008.06a
- /
- pp.655-656
- /
- 2008
There are trade-offs between a window size and an issue width for superscalar processors. A good balance between them prevents waste of system resources. In this paper, we investigate the performance of a superscalar processor with various sizes of window and issue width. From the experiments, we find that there is a linear relationship between window size and issue width.
PDF

Search Result 58, Processing Time 0.028 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)