Search | Korea Science

Simulation and Synthesis of RISC-V Processor (RISC-V 프로세서의 모의실행 및 합성)

Lee, Jongbok
- The Journal of the Institute of Internet, Broadcasting and Communication, v.19 no.1, pp.239-245, 2019
RISC-V is a free and open ISA enabling a new era of processor innovation through open standard collaboration. Born in academia and research, the RISC-V ISA delivers a new level of free, extensible software and hardware freedom on architecture, paving the way for the next 50 years of computing design and innovation. In this paper, in light of the emergence of the RISC-V architecture, we describe the RISC-V processor instruction set, which consists of arithmetic-logic, memory, branch, control, status register, environment call, and breakpoint instructions. Using ModelSim and Quartus-II, 38 RISC-V instructions have been successfully simulated and synthesized.
https://doi.org/10.7236/JIIBC.2019.19.1.239

An Improved Register Allocation Technique for ILP Processors (ILP 프로세서를 위한 개선된 레지스터 할당 기법)

Sin, Hwa-Jeong;Lee, Gi-Ho
- Journal of KIISE:Software and Applications, v.28 no.2, pp.201-209, 2001
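High-performance microprocessors support instruction-level parallelism (ILP) to improve performance. To maximize parallelism, many performance-limiting factors must be removed. Recently, there have been active efforts to reduce these factors by expanding the role of the compiler. To handle conditional branches, which are one such factor, this paper proposes an improved register allocation algorithm that combines conditional execution with register allocation, minimizing spills to memory and improving parallelism. In experiments applying the proposed method, the number of edges in the interference graph decreased by 4.47%, and as a result the number of required spill variables decreased by 21.35%. Performance improved by 19.38% compared with the existing method. In short, the proposed register allocation technique removes conditional branch instructions through conditional execution, which increases the number of instructions in a basic block and thus the opportunities for parallel processing, and it removes unnecessary edges from the interference graph through condition analysis, achieving more efficient register allocation. These results confirm the validity of the proposed method.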
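
As a rough illustration of how condition analysis can drop interference edges, the sketch below builds an interference graph from live ranges and skips an edge when two variables are guarded by the same predicate with opposite polarity. The live-range format, the predicate labels, and the function name `interference_edges` are assumptions made for this example; this is not the paper's algorithm.

```python
from itertools import combinations

def interference_edges(ranges, predicates=None):
    """Return the set of interference edges between variables.

    ranges: dict name -> (start, end) inclusive live range (hypothetical format).
    predicates: optional dict name -> (predicate, polarity). Two variables
    guarded by the same predicate with opposite polarity are never live
    at the same time, so they do not interfere.
    """
    predicates = predicates or {}
    edges = set()
    for a, b in combinations(sorted(ranges), 2):
        (s1, e1), (s2, e2) = ranges[a], ranges[b]
        if s1 > e2 or s2 > e1:
            continue  # live ranges do not overlap
        pa, pb = predicates.get(a), predicates.get(b)
        if pa and pb and pa[0] == pb[0] and pa[1] != pb[1]:
            continue  # mutually exclusive predicates: drop the edge
        edges.add((a, b))
    return edges

ranges = {"t1": (0, 5), "t2": (2, 6), "t3": (3, 7)}
preds = {"t2": ("p", True), "t3": ("p", False)}
plain = interference_edges(ranges)         # ignores predicates
aware = interference_edges(ranges, preds)  # uses condition analysis
```

In this toy input, `plain` contains three edges while `aware` drops the `t2`/`t3` edge, so `t2` and `t3` could share a register.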

JMP+RAND: Mitigating Memory Sharing-based Side-channel Attacks by Embedding Random Values in Binaries (JMP+RAND: 바이너리 난수 삽입을 통한 메모리 공유 기반 부채널 공격 방어 기법)

Kim, Taehun;Shin, Youngjoo
- Proceedings of the Korea Information Processing Society Conference, 2019.10a, pp.456-458, 2019
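Since computers became widespread, much effort has gone into achieving information security. Memory protection techniques have been studied the most, but as computer performance improved, problems with earlier memory protection techniques were discovered, and the emergence of side-channel attacks created a need for new defenses. This paper presents two methods that defend against memory sharing-based side-channel attacks by using static binary rewriting to insert 4 to 8 bytes of random values into each page of a program. Because recent architectures handle jmp instructions very quickly and accurately through branch prediction, the jmp+rand scheme used to insert the random values has very low overhead. In addition, because random values can be inserted into specific programs only, we expect the approach to be highly efficient when used together with deduplication, particularly in cloud environments.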
https://doi.org/10.3745/PKIPS.y2019m10a.456

The Design and Simulation of Out-of-Order Execution Processor using Tomasulo Algorithm (토마술로 알고리즘을 이용하는 비순차실행 프로세서의 설계 및 모의실행)

Lee, Jongbok
- The Journal of the Institute of Internet, Broadcasting and Communication, v.20 no.4, pp.135-141, 2020
Today, CPUs in general-purpose computers such as servers, desktops, and laptops, as well as in home appliances and embedded systems, are mostly multicore processors. To improve performance, each core needs to be an out-of-order execution processor based on the Tomasulo algorithm. Such a processor can execute ready instructions in any order and can speculate in order to reduce the effect of control dependencies. As a result, its performance can be significantly higher than that of an in-order execution processor. In this paper, an out-of-order execution processor that uses the Tomasulo algorithm and the ARM instruction set is designed with VHDL record data types and simulated with GHDL. The resulting design successfully runs programs written in ARM instructions.
https://doi.org/10.7236/JIIBC.2020.20.4.135

Analytical Models and their Performance Analysis of Superscalar Processors (수퍼스칼라 프로세서의 해석적 모델 및 성능 분석)

Kim, Hak-Jun;Kim, Seon-Mo;Choe, Sang-Bang
- Journal of KIISE:Computer Systems and Theory, v.26 no.7, pp.847-862, 1999
This paper proposes a new analytical model that predicts the instruction execution rate of superscalar processors using a synchronous queueing model with finite buffers. The model accounts for architectural parameters such as instruction-level parallelism, branch frequency, branch prediction accuracy, and cache misses, and it can analyze the relationship between cache performance and pipeline performance. To validate the model, we ran extensive simulations and compared the results with the model's predictions; in most cases the two agreed within 10% error. The model can identify causes of performance bottlenecks that simulation alone does not reveal, which yields design data for improving performance. It also shows the effect of cache misses on out-of-order issue superscalar processors, which provides valuable information for designing a balanced system.

A Design of Multimedia Application SoC based with Processor using BTB (BTB를 이용한 프로세서 기반 멀티미디어 응용 SoC 설계)

Jung, Younjin;Lee, Byungyup;Ryoo, Kwangki
- Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2009.10a, pp.397-400, 2009
This paper describes the ASIC design of a multimedia application SoC platform based on a RISC processor with a BTB (Branch Target Buffer). To improve platform performance, we use a simple branch prediction scheme, the BTB structure, which stores the target address of a branch instruction to remove pipeline hazards. The platform also includes a number of peripherals, such as a VGA controller, an AC97 controller, a UART controller, an SRAM interface, and a debug interface. The platform is designed and verified on a Xilinx Virtex-4 FPGA using a number of test programs for functional tests and timing constraints. Finally, the platform is implemented as a single ASIC chip that can operate at a 100 MHz clock frequency using the Chartered 0.18 um process. In performance estimation, the proposed platform shows about a 5~9% performance improvement over the previous SoC platform.
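
The BTB idea in the abstract above, storing a target address per branch instruction, can be sketched as a small direct-mapped table. This is a minimal software model under assumptions: the class name `BranchTargetBuffer`, the entry count, and the 4-byte instruction alignment are illustrative and are not taken from the paper's hardware design.

```python
class BranchTargetBuffer:
    """Direct-mapped BTB model: maps a branch PC to its last known target."""

    def __init__(self, entries=16):
        self.entries = entries
        self.table = [None] * entries  # each slot holds (tag, target) or None

    def _index(self, pc):
        # Assumption: instructions are 4-byte aligned, so drop the low 2 bits.
        return (pc >> 2) % self.entries

    def lookup(self, pc):
        """Return the predicted target for pc, or None on a BTB miss."""
        slot = self.table[self._index(pc)]
        if slot is not None and slot[0] == pc:
            return slot[1]
        return None

    def update(self, pc, target):
        """Record the resolved target of a taken branch at pc."""
        self.table[self._index(pc)] = (pc, target)

btb = BranchTargetBuffer(entries=4)
first = btb.lookup(0x100)   # miss: the branch has not been seen yet
btb.update(0x100, 0x200)
second = btb.lookup(0x100)  # hit: the fetch stage can redirect to 0x200
btb.update(0x110, 0x300)    # maps to the same slot and evicts 0x100
third = btb.lookup(0x100)   # miss again after the eviction
```

On a hit, the fetch stage can start fetching from the stored target without waiting for the branch to resolve, which is how the stall is avoided.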

Design of a Delayed Dual-Core Lock-Step Processor with Automatic Recovery in Soft Errors (소프트 에러 발생 시 자동 복구하는 이중 코어 지연 락스텝 프로세서의 설계)

Juho Kim;Seonghyun Yang;Seongsoo Lee
- Journal of IKEEE, v.27 no.4, pp.683-686, 2023
In this paper, we designed a Delayed Dual-Core Lock-Step (D-DCLS) processor in which two cores execute the same instructions with a delay and their results are compared, in order to mitigate soft errors and common-mode failures in automotive electronic systems. Because D-DCLS cannot tell which core the error occurred in, both cores must be restored to a point before the error occurred, but returning all intermediate values in the pipeline stages requires complex hardware modifications. To keep the hardware implementation simple, all register values are saved to a buffer whenever a branch instruction is executed. When an error is detected, the saved register values are automatically restored, and then a 'BX LR' instruction is executed to return to the last branch point. The proposed D-DCLS processor was designed in Verilog HDL and was confirmed to continue normal operation after automatically recovering from an error.
https://doi.org/10.7471/ikeee.2023.27.4.683
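
The recovery scheme in the abstract above (save all registers at each branch, restore them when the two cores disagree) can be modeled in a few lines. This is a simplified sketch under assumptions: the names `CheckpointedCore` and `lockstep_check` are hypothetical, the delay between the cores is ignored, and the 'BX LR' redirect back to the branch point is not modeled.

```python
class CheckpointedCore:
    """Toy core model that checkpoints its register file at every branch."""

    def __init__(self, num_regs=16):
        self.regs = [0] * num_regs
        self.checkpoint = list(self.regs)  # buffer holding saved registers

    def write(self, idx, value):
        self.regs[idx] = value

    def on_branch(self):
        # Save all register values whenever a branch instruction executes.
        self.checkpoint = list(self.regs)

    def recover(self):
        # Restore the register values saved at the last branch.
        self.regs = list(self.checkpoint)

def lockstep_check(main, shadow):
    """Compare both cores; on mismatch restore both and return True."""
    if main.regs != shadow.regs:
        # The comparison cannot tell which core is wrong, so recover both.
        main.recover()
        shadow.recover()
        return True
    return False

main, shadow = CheckpointedCore(), CheckpointedCore()
for core in (main, shadow):
    core.write(0, 7)
    core.on_branch()   # checkpoint taken with r0 = 7
    core.write(1, 42)
main.write(1, 43)      # inject a soft error into one core
recovered = lockstep_check(main, shadow)
```

After recovery both register files match the state at the last branch, so re-executing from that branch point reproduces the correct results.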

Analysis on the Thermal Efficiency of Branch Prediction Techniques in 3D Multicore Processors (3차원 구조 멀티코어 프로세서의 분기 예측 기법에 관한 온도 효율성 분석)

Ahn, Jin-Woo;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
- The KIPS Transactions:PartA, v.19A no.2, pp.77-84, 2012
Speculative execution for improving instruction-level parallelism is widely used in high-performance processors. In speculative execution, the most important factor is the accuracy of the branch predictor. Unfortunately, the complex branch predictors used to improve accuracy can cause serious thermal problems in 3D multicore processors, and thermal problems have a negative impact on processor performance. This paper analyzes two methods for solving the thermal problems of the branch predictor in 3D multicore processors. The first is dynamic thermal management, which turns off the branch predictor when its temperature exceeds a threshold. The second is a thermal-aware branch predictor placement policy that considers each layer's temperature in 3D multicore processors. According to our evaluation, the branch predictor placement policy yields an average temperature of $87.69^{\circ}C$ and an average maximum temperature gradient of $11.17^{\circ}C$, whereas dynamic thermal management yields an average temperature of $89.64^{\circ}C$ and an average maximum temperature gradient of $17.62^{\circ}C$. The proposed placement policy therefore has better thermal efficiency than dynamic thermal management. In terms of performance, the proposed placement policy degrades performance by 3.61%, while dynamic thermal management degrades it by 27.66%.
https://doi.org/10.3745/KIPSTA.2012.19A.2.077

Efficient Maximum Intensity Projection using SIMD Instruction and Streaming Memory Transfer (단일 명령 복수 데이터 연산과 순차적 메모리 참조를 이용한 효율적인 최대 휘소 투영 볼륨 가시화)

Kye, Hee-Won
- Journal of Korea Multimedia Society, v.12 no.4, pp.512-520, 2009
Maximum intensity projection (MIP) is a volume rendering method that extracts the maximum values along the viewing direction through volume data. Because it visualizes high-density structures, as in angiographic datasets, it is frequently used in medical imaging systems. We propose an efficient two-step MIP acceleration method for recent CPUs. First, we exploit SIMD instructions to reduce conditional branch instructions, which take up a considerable part of the whole rendering process, and thereby improve rendering speed. Second, we propose a new method that accesses volume and image data sequentially by modifying shear-warp rendering. This method improves memory access patterns so that cache misses are reduced. On current CPUs, our method improves rendering speed by a factor of 7 over shear-warp rendering.
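
The core MIP operation described above can be sketched as a slice-by-slice element-wise maximum. This is a plain-Python illustration under assumptions: it projects only along the z axis, the function name `mip_along_z` and the `volume[z][y][x]` layout are hypothetical, and the row-wise `max` merely stands in for what a SIMD maximum instruction would do on several voxels at once; the paper's shear-warp modification is not modeled.

```python
def mip_along_z(volume):
    """Project volume[z][y][x] to image[y][x] by taking the maximum along z."""
    # Start from a copy of the first slice so the input is not modified.
    image = [row[:] for row in volume[0]]
    # Visit slices and rows in storage order, giving sequential memory access.
    for slab in volume[1:]:
        for y, row in enumerate(slab):
            # Element-wise max over a whole row replaces a per-voxel
            # "if voxel > current" conditional branch.
            image[y] = [max(a, b) for a, b in zip(image[y], row)]
    return image

volume = [
    [[1, 5], [3, 0]],
    [[4, 2], [3, 9]],
    [[0, 7], [8, 1]],
]
image = mip_along_z(volume)
```

For the 3x2x2 volume above, `image` is `[[4, 7], [8, 9]]`, the per-ray maxima along z.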

Graph based Binary Code Execution Path Exploration Platform for Dynamic Symbolic Execution (동적 기호 실행을 이용한 그래프 기반 바이너리 코드 실행 경로 탐색 플랫폼)

Kang, Byeongho;Im, Eul Gyu
- Journal of the Korea Institute of Information Security & Cryptology, v.24 no.3, pp.437-444, 2014
In this paper, we introduce a graph-based binary code execution path exploration platform. In the graph, a node is defined as a conditional branch instruction, and an edge is defined as the other instructions. We implemented a prototype of the proposed method, and it works well on real binary code. Experimental results show that the proposed method correctly explores the execution paths of the target binary code. We expect that our method can make software assurance, secure programming, and malware analysis more accurate and efficient.
https://doi.org/10.13089/JKIISC.2014.24.3.437
