Search | Korea Science

A Predicate-Sensitive Scheduling Algorithm in Instruction-Level Parallelism Processors (ILP 프로세서를 위한 조건실행 지원 스케쥴링 알고리즘)

Yoo, Byung-Kang;Lee, Sang-Jeong
- The Transactions of the Korea Information Processing Society, v.5 no.1, pp.202-214, 1998
Exploiting instruction-level parallelism (ILP) is an effective way to improve the performance of modern superscalar and VLIW processors. Various software techniques can be applied to increase ILP. Among them, predicated execution increases the degree of ILP by removing branch instructions so that instructions from different basic blocks can be merged into a single basic block. In this paper, a global predicate-sensitive scheduling algorithm is proposed to improve the performance of ILP processors that support predicated execution. To examine the performance of the proposed algorithm, a C compiler and a simulator were developed. By simulating various benchmark programs with the compiler and the simulator, the performance of the algorithm was measured and its effectiveness verified. Measurements with 1-, 2-, and 4-issue execution confirmed an average performance improvement of 20% or more.
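The if-conversion step that predicated execution relies on can be illustrated with a small sketch. This is not the scheduling algorithm from the paper; the tuple-based instruction format, the `run_predicated` interpreter, and the register names are hypothetical and exist only to show how a two-way branch collapses into one straight-line block of guarded instructions.

```python
# Toy model of predicated execution (all names here are hypothetical).
# Each instruction is (guard, dest, fn, srcs). An instruction whose guard
# register holds False is nullified; a guard of None means "always execute".

def run_predicated(program, regs):
    """Execute a straight-line instruction list with no control flow in it."""
    for guard, dest, fn, srcs in program:
        # This check models the hardware guard test, not a program branch.
        if guard is None or regs[guard]:
            regs[dest] = fn(*(regs[s] for s in srcs))
    return regs

# Original source with a branch:
#     if a > b: r = a - b
#     else:     r = b - a
# After if-conversion, both arms live in one basic block under predicates.
IF_CONVERTED_ABS_DIFF = [
    (None, "p",  lambda a, b: a > b, ("a", "b")),  # define predicate p
    (None, "np", lambda p: not p,    ("p",)),      # complementary predicate
    ("p",  "r",  lambda a, b: a - b, ("a", "b")),  # executes only if p
    ("np", "r",  lambda a, b: b - a, ("a", "b")),  # executes only if not p
]

def abs_diff(a, b):
    regs = run_predicated(IF_CONVERTED_ABS_DIFF, {"a": a, "b": b})
    return regs["r"]
```

Because the two guarded subtractions no longer sit in separate basic blocks, a scheduler is free to issue them in the same cycle, which is the opportunity a predicate-aware scheduler would try to exploit.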

Early Null Pointer Check using Predication in Java Just-In-Time Compilation (자바 적시 컴파일에서의 조건 수행을 이용한 비어 있는 포인터의 조기검사)

Lee Sanggyu;Choi Hyug-Kyu;Moon Soo-Mook
- Journal of KIISE:Software and Applications, v.32 no.7, pp.683-692, 2005
The Java specification states that every access to an object must be checked at runtime to determine whether the object reference is null. Since Java is an object-oriented language, object accesses are frequent enough for null pointer checks to affect performance significantly. To reduce this performance degradation, there have been attempts to remove redundant null pointer checks. For example, in a Java environment that uses a just-in-time (JIT) compiler, the JIT compiler removes redundant null pointer check code via code analysis. This paper proposes a technique that removes additional null pointer check code that previous JIT compilation techniques could not remove, by performing an early null pointer check with an architectural feature called predication. Generally, null pointer check code consists of two instructions: a compare and a branch. Our idea is to move the compare instruction, which is usually located just before a use of an object, to the point right after the object is defined, so that the total number of compare instructions is reduced. This reduces dynamic and static compare instructions by 3.21% and 1.98%, respectively, on the SPECjvm98 benchmarks, compared to code that has already been optimized by previous null pointer check elimination techniques. The performance impact on an Itanium machine is an improvement of 0.32%.

The Enhancement of Indirect Branch Prediction Accuracy via Double Return Address Stack (이중 함수 복귀 스택의 활용을 통한 간접 분기 명령어의 예측 정확도 향상 기법)

Kwak, Jong-Wook;Kim, Ju-Hwan
- Proceedings of the Korean Information Science Society Conference, 2011.06a, pp.494-497, 2011
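In theory, return address prediction should be 100% accurate as long as no overflow occurs. In modern microprocessors that support speculative execution, however, the return address stack (RAS) becomes corrupted when the results of execution along a wrong path are squashed, and this leads to return address mispredictions. This paper proposes a RAS renaming technique to prevent such RAS corruption. The technique manages the RAS by dividing it into a soft stack and a hard stack. The soft stack manages entries whose changes caused by speculative execution can be recovered, and the hard stack manages, among the data overwritten because of the soft stack's size limit, the data that will be reused later. In simulation, the proposed technique reduced return mispredictions to about 1/90 of those in a system without RAS corruption prevention and improved IPC by up to 6.95%.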
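The corruption problem described in this abstract can be reproduced with a very small model. The sketch below does not implement the soft/hard stack scheme, whose details the abstract does not give; the `ReturnAddressStack` class, the addresses, and the two recovery modes are illustrative assumptions that only show why restoring the top-of-stack pointer alone is not enough after a squash.

```python
class ReturnAddressStack:
    """Minimal circular return address stack (hypothetical model)."""

    def __init__(self, size):
        self.entries = [None] * size
        self.top = 0  # index of the next free slot

    def push(self, addr):
        self.entries[self.top % len(self.entries)] = addr
        self.top += 1

    def pop(self):
        self.top -= 1
        return self.entries[self.top % len(self.entries)]


def predict_after_squash(full_checkpoint):
    """Return the address predicted for the first correct-path return."""
    ras = ReturnAddressStack(4)
    ras.push(0x100)                     # correct-path call, returns to 0x100

    # Checkpoint taken at a predicted branch.
    saved_top = ras.top
    saved_entries = list(ras.entries)

    # Wrong-path execution: a return followed by a call.
    ras.pop()                           # speculative return
    ras.push(0x999)                     # speculative call overwrites the slot

    # Misprediction detected: squash and recover.
    ras.top = saved_top                 # pointer-only recovery
    if full_checkpoint:
        ras.entries = saved_entries     # content recovery as well

    return ras.pop()                    # correct-path return is predicted
```

With pointer-only recovery the predicted return address is the wrong-path value, while restoring the contents as well yields the original address. Copying the whole stack at every branch is costly in hardware, which is the kind of overhead that schemes such as the one proposed here aim to avoid.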

JMP+RAND: Mitigating Memory Sharing-Based Side-Channel Attack by Embedding Random Values in Binaries (JMP+RAND: 바이너리 난수 삽입을 통한 메모리 공유 기반 부채널 공격 방어 기법)

Kim, Taehun;Shin, Youngjoo
- KIPS Transactions on Computer and Communication Systems, v.9 no.5, pp.101-106, 2020
Since computers became available, much effort has been devoted to information security. Although memory protection mechanisms have been studied the most among these efforts, improved computer performance has exposed problems in existing memory protection mechanisms, and the advent of side-channel attacks has created a need for new defenses. In this paper, we propose JMP+RAND, which embeds random values of 5 to 8 bytes per page to defend against memory sharing-based side-channel attacks and to bridge the gaps in existing memory protection mechanisms. Unlike existing defenses against side-channel attacks, JMP+RAND uses static binary rewriting with consecutive jmp instructions and random values to defend against the attacks in advance. We numerically calculated the time a memory sharing-based side-channel attack would take against a binary to which JMP+RAND has been applied and verified that such attacks are infeasible in a realistic time frame. On modern architectures the overhead of JMP+RAND is very low, because branch prediction makes the jmp instructions fast and accurately predicted. Since random values can be embedded only in specific programs, JMP+RAND is expected to be highly efficient when used together with memory deduplication, especially in cloud computing environments.
https://doi.org/10.3745/KTCCS.2020.9.5.101
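The kind of numerical estimate mentioned in this abstract can be sketched as simple arithmetic. The only figure taken from the abstract is the 5 to 8 random bytes per page; the function names and the attacker's guess rate are assumptions made for illustration, and the paper's own attack model and numbers may differ.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def search_space(random_bytes):
    """Number of candidate values an attacker must consider for one page."""
    return 256 ** random_bytes


def expected_years(random_bytes, guesses_per_second):
    """Expected time to hit the right value, trying half the space on average.

    guesses_per_second is a hypothetical attacker capability, not a figure
    from the paper.
    """
    expected_guesses = search_space(random_bytes) / 2
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    ASSUMED_RATE = 1.0  # purely illustrative: one page guess per second
    for n in range(5, 9):
        years = expected_years(n, ASSUMED_RATE)
        print(f"{n} bytes: 2^{8 * n} candidates, about {years:.3g} years")
```

Each additional random byte multiplies the search space by 256, so the estimate is far more sensitive to the number of embedded bytes than to the assumed guess rate.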

The Effect of Robot-Based STEAM Class on the Korean Learning of Multicultural School Children -Focusing on After School Learning of Elementary School- (로봇 활용 STEAM 수업이 다문화 아동의 한국어 학습에 미치는 영향 -초등학교 방과 후 수업을 중심으로-)

Kim, Se-Min;You, Kang-Soo
- Journal of Digital Convergence, v.13 no.8, pp.1-8, 2015
This paper analyzes the effect of a robot-based STEAM class on the Korean language learning of multicultural elementary school students. To this end, the difficulty and interest perceived by the students were measured. Using a programming tool based on Korean-language input, the students learned programming constructs such as variables, data types, branching statements, and loop statements in Korean, and the effect on their Korean learning was measured. Two interviews were conducted, at the beginning and at the end of the second semester, to measure the effect on Korean language learning. The results show that multicultural children from similar linguistic and cultural backgrounds understood Korean easily when the Korean language class used a robot, and that the class had an effect on their acquisition of Korean.
https://doi.org/10.14400/JDC.2015.13.8.1

An Implementation of Efficient Quicksort Utilizing SIMD-Based VBP Technique (SIMD 기반의 VBP 기법을 적용한 효율적인 퀵정렬의 구현)

Hong, Gilseok;Kim, Hongyeon;Kang, Seonghyeon;Min, Jun-Ki
- KIISE Transactions on Computing Practices, v.23 no.8, pp.498-503, 2017
SIMD (Single Instruction Multiple Data) is a representative parallel architecture that processes multiple data items loaded in a SIMD register with a single instruction. Quicksort is a sorting algorithm that picks an element of the array as a pivot and reorders the array so that all elements smaller than the pivot are placed to its left and all elements greater than the pivot are placed to its right; it then performs the same task recursively on both sublists. In this paper, we propose an efficient Quicksort algorithm that uses SIMD instructions and invokes as few conditional branches as possible, to avoid the performance degradation caused by branch misprediction in a pipelined architecture. In addition, we improve the performance of the Quicksort algorithm by fetching data into a SIMD register in byte units so that VBP (Vertical Bit Parallel) and an early pruning technique can be applied.
https://doi.org/10.5626/KTCP.2017.23.8.498
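The idea of partitioning without a data-dependent branch can be shown with a scalar sketch. This is neither the SIMD nor the VBP implementation from the paper; it is a well-known branch-free variant of Lomuto partitioning, written here with hypothetical function names, in which the comparison result is used as a number instead of steering an `if`. Python itself does not compile this to branch-free machine code, so the sketch illustrates the logic only.

```python
def partition_branchless(a, lo, hi):
    """Partition a[lo..hi] around a[hi] with no if/else on element values."""
    pivot = a[hi]
    store = lo                    # a[lo..store-1] holds elements < pivot
    for i in range(lo, hi):
        x = a[i]
        smaller = int(x < pivot)  # comparison result used as data (0 or 1)
        a[i] = a[store]           # unconditional move
        a[store] = x              # unconditional move
        store += smaller          # advance only when x belongs on the left
    a[store], a[hi] = a[hi], a[store]
    return store


def quicksort(a, lo=0, hi=None):
    """In-place quicksort built on the branch-free partition step."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition_branchless(a, lo, hi)
        quicksort(a, lo, p - 1)
        quicksort(a, p + 1, hi)
    return a
```

The inner loop performs the same two moves for every element, so its only remaining branch is the loop condition, which a predictor handles well. A SIMD version would apply the same comparison to many elements at once.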

Filter Cache Predictor Using Mode Selection Bit (모드 선택 비트를 사용한 필터 캐시 예측기)

Kwak, Jong-Wook
- Journal of the Institute of Electronics Engineers of Korea CI, v.46 no.5, pp.1-13, 2009
The filter cache has been introduced as one way to reduce cache power consumption. It reduces power by more than 50% but compromises performance by more than 20%. To minimize the performance degradation of the filter cache, the predictive filter cache has been proposed. In this paper, we review previous filter cache predictors and analyze their problems. We identify the main problems that cause prediction misses in previous filter cache schemes and, to resolve them, propose a new prediction policy. In our scheme, reference bit entries called MSBs are inserted into the filter cache and the BTB to adaptively control filter cache access. In the simulation, we use a modified SimpleScalar simulator with the MiBench benchmark programs to verify the proposed filter cache. The simulation results show a performance improvement of 5% on average compared to previous schemes.

FPGA-Based Implementation of a Practical 8-Bit Microprocessor (FPGA 기반 실용적 마이크로프로세서의 구현)

Ahn Jung-Il;Park Sung-Hwan;Kwon Sung-Jae
- Proceedings of the Korea Society for Industrial Systems Conference, 2006.05a, pp.119-123, 2006
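In this paper, we define a total of 64 instructions that are essential to microprocessor operation and frequently used, construct a datapath to process them, and control it with a state machine; in this way, a practical 8-bit microprocessor was designed in VHDL and implemented in an FPGA. Papers on microprocessors have typically stopped at functional simulation, lacked interrupt capability, were not implemented in hardware, or did not present development details. In this paper, the processor executes not only data movement, logic, and addition operations but also branch and jump operations, making it suitable for both computation and control, and it also supports a stack and external interrupts so that it is a complete, practical microprocessor in its own right. In addition, the program ROM is placed on the chip, so the entire microprocessor is implemented as a single chip. After verification by timing simulation, the design was built and confirmed to operate correctly. It was implemented on an EP1K50TC144-3 FPGA chip under the Altera MAX+PLUS II integrated development environment; the maximum operating frequency was 9.39 MHz, and 2813 logic elements were used, for a logic utilization of 97%.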

Unplugged Robot Coding System Based on Remote Interface (리모컨 인터페이스 기반의 언플러그드 로봇 코딩 시스템)

Lee, Jun;Seo, Yong-Ho
- The Journal of the Institute of Internet, Broadcasting and Communication, v.19 no.5, pp.157-162, 2019
Recently, awareness of software (S/W) education, once confined to professionals, has been changing as the industrial environment shifts toward ICT. Major countries are investing competitively in S/W education, and the target age group is getting younger. Among the available approaches, unplugged coding using a robot platform is known as one of the most effective S/W education methods for elementary-age children because of its intuitive coding method and the feedback provided by the robot platform. However, unplugged coding with a robot platform has the disadvantage that hardware limitations prevent it from offering varied interfaces for complicated coding. In this paper, we propose an unplugged coding system that can input various commands for robot control using an IR remote control as the interface, together with minute signals from the robot's sensors.
https://doi.org/10.7236/JIIBC.2019.19.5.157

Design and Implementation of a Single-Chip 8-Bit Microcontroller (단일 칩 8비트 마이크로컨트롤러의 설계 및 구현)

Ahn, Jung-Il;Park, Sung-Hwan;Kwon, Sung-Jae
- Journal of Korea Society of Industrial Information Systems, v.11 no.4, pp.72-81, 2006
In this paper, we first define a total of 64 instructions that are considered essential and frequently used, construct a datapath diagram, determine the control sequence using a finite state machine, and implement an 8-bit microcontroller in VHDL on an FPGA. In previous work, only functional simulation results of a rudimentary microcontroller were reported, the microcontroller lacked interrupt handling capability, or it was not implemented in hardware. We have designed a self-contained 8-bit microcontroller that can perform data transfer, addition, and logical operations, as well as stack and external interrupt operations. Following timing simulation of the designed microcontroller, we implemented it in an FPGA and verified its operation successfully. The design and implementation were carried out under the Altera MAX+PLUS II integrated development environment using the EP1K50TC144-3 chip. The maximum operating frequency, the total number of logic elements used, and the logic utilization were found to be 9.39 MHz, 2813, and 97%, respectively. The result can be used as a microcontroller IP, and the VHDL code can be modified as needs arise.

Search Result 70, Processing Time 0.022 seconds
