Integrated Search | Korea Science

Code Size Reduction and Execution performance Improvement with Instruction Set Architecture Design based on Non-homogeneous Register Partition

권영준;이혁재
- 대한전기학회논문지:전력기술부문A
- Vol. 48, No. 12
- pp. 1575-1579
- 1999
Embedded processors often accommodate two instruction sets: a standard instruction set and a compressed instruction set. The compressed instruction set reduces code size but can increase instruction count (and consequently execution time). To reduce code size without a significant increase in execution time, this paper proposes a new compressed instruction set architecture called TOE (Two Operations Execution). The proposed instruction format includes a parallel bit that indicates whether an instruction can be executed simultaneously with the next instruction. To make room for the parallel bit, the TOE instruction format shrinks the destination register field, which limits the number of registers accessible by an instruction. To overcome this limited accessibility, TOE adopts a non-homogeneous register partition in which registers are divided into multiple subsets, each of which is accessed by a different group of instructions. With non-homogeneous registers, each instruction can access only a limited number of registers, but the program as a whole can access all available registers. With an efficient non-homogeneous register allocator, all registers can be used in a balanced manner. As a result, the increase in code size due to register spills is negligible. Experimental results show that more than 30% of TOE instructions can be executed in parallel without a significant increase in code size compared with the existing Thumb instruction set.

Parallel Branch Instruction Extension for Thumb-2 Instruction Set Architecture

김대환
- 한국컴퓨터정보학회논문지
- Vol. 18, No. 7
- pp. 1-10
- 2013
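To improve the performance of the Thumb-2 instruction set architecture, this paper presents a parallel branch instruction set that executes a branch instruction simultaneously with a frequently used instruction. The proposed technique introduces new 32-bit instructions, each of which combines a 16-bit branch instruction with one of the frequently used 16-bit LOAD, ADD, MOV, STORE, and SUB instructions. To provide encoding space for the new instructions, the number of bits used for the register fields of infrequently used existing instructions is reduced, and the bits saved are used to encode the parallel branch instructions. Experimental results show that the proposed method improves performance by an average of 8.0% over the conventional approach without increasing code size.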
https://doi.org/10.9708/jksci.2013.18.7.001

Design of A On-Chip Caches for RISC Processors

홍인식;임인칠
- 대한전자공학회논문지
- Vol. 27, No. 8
- pp. 1201-1210
- 1990
This paper proposes on-chip instruction and data cache memories for a RISC (reduced instruction set computer) architecture. The caches support fast instruction fetch and data read/write, and enable the RISC processor under research to obtain high performance. In the execution of HLL (high-level language) programs, heavily used local scalar variables are stored in a large register file, but arrays, structures, and global scalar variables are difficult for the compiler to allocate to registers. These problems can be solved by on-chip instruction/data caches. In addition, in each instruction fetch cycle, pad delay lowers the processor's performance. The cache memories are designed in CMOS technology, and SRAM (static RAM), which saves layout area and power dissipation, is used for instruction and data storage. To speed up and efficiently support the RISC processor's pipelined architecture, hardwired logic is used throughout the circuits in the cache blocks. The schematic capture and timing simulation of the proposed cache memories are performed on an Apollo DN4000 workstation using Mentor Graphics CAD tools.

A 32-bit Microprocessor with enhanced digital signal process functionality

문상국
- 한국정보통신학회:학술대회논문집
- 한국해양정보통신학회 2005 Fall General Conference
- pp. 820-822
- 2005
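In this paper, we designed a reduced instruction set microprocessor with enhanced digital signal processing functionality that supports 16-bit or 32-bit fixed-point operations. Following the standard for reduced instruction set micro-architectures, the designed microprocessor combines the functions of a general-purpose microprocessor with those of a digital signal processor. It consists of an arithmetic unit, a digital signal processing unit, and a memory control unit. These units execute in parallel so that the latency of digital signal processing instructions and load/store instructions can be compensated. By operating these units in parallel, we implemented a high-performance microprocessor with a five-stage pipeline.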

Benchmarking Korean Block Ciphers on 32-Bit RISC-V Processor

곽유진;김영범;서석충
- 정보보호학회논문지
- Vol. 31, No. 3
- pp. 331-340
- 2021
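As the communications industry, including 5G, advances, development of the SoC (System on Chip), a special-purpose ultra-small computer for mobile embedded systems, is increasing. Accordingly, the technology design paradigm of industry and companies is changing. Whereas companies previously purchased micro-architectures, they now purchase an ISA (Instruction Set Architecture) and design the architecture themselves. RISC-V is an open instruction set based on the reduced instruction set computer. RISC-V provides an ISA that can be extended through modularization, and extended versions of the ISA are currently being developed with the support of companies worldwide. This paper provides performance benchmarking and analysis results for the Korean block ciphers ARIA, LEA, and PIPO on RISC-V. It also proposes implementation methods that exploit the base instruction set and characteristics of RISC-V, and discusses their performance.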
https://doi.org/10.13089/JKIISC.2021.31.3.331

The Design of A Program Counter Unit for RISC Processors

홍인식;임인칠
- 대한전자공학회논문지
- Vol. 27, No. 7
- pp. 1015-1024
- 1990
This paper proposes a program counter unit (PCU) for the pipelined architecture of RISC (Reduced Instruction Set Computer) type high-performance processors. The PCU is used to supply instruction addresses to the memory units (instruction cache) efficiently. A RISC processor's PCU has to compute the instruction address continuously within the required intervals. Therefore, using a self-generated incrementor is more efficient than the conventional approach of using the ALU or a private adder. The proposed PCU is designed to have a fast +4 (byte address) incrementor that has no carry propagation delay. Design specifications are obtained by analyzing the whole data path operation of the target processor's default and exceptional mode instructions. CMOS and wired logic circuit technologies are used in the PCU for fast operation with small layout area and power dissipation. The schematic capture and the logic and timing simulation of the proposed PCU are performed on an Apollo workstation using Mentor Graphics CAD tools.

Selecting a Synthesizable RISC-V Processor Core for Low-cost Hardware Devices

Gookyi, Dennis Agyemanh Nana;Ryoo, Kwangki
- Journal of Information Processing Systems
- Vol. 15, No. 6
- pp. 1406-1421
- 2019
The Internet-of-Things (IoT) has been deployed in almost every facet of our day-to-day activities. This is made possible because sensing and data collection devices have been given computing and communication capabilities. The devices implement System-on-Chips (SoCs) that incorporate many functionalities, yet they are severely constrained in terms of memory capacity, hardware area, and power consumption. With the increase in the functionalities of sensing devices, there is a need for low-cost synthesizable processors to handle control, interfacing, and error processing. The first step in selecting a synthesizable processor core for low-cost devices is to examine the hardware resource utilization to make sure that it fulfills the requirements of the device. This paper gives an analysis of the hardware resource usage of ten synthesizable processors that implement the Reduced Instruction Set Computer Five (RISC-V) Instruction Set Architecture (ISA). All ten processors are synthesized using Vivado v2018.02. The maximum frequency, area, and power reports are extracted, and a comparison is made to determine which processor is ideal for low-cost hardware devices.
https://doi.org/10.3745/JIPS.03.0129

Design of Architecture of Programmable Stack-based Video Processor with VHDL

박주현;김영민
- 전자공학회논문지C
- Vol. 36C, No. 4
- pp. 31-43
- 1999
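The main goal of this paper is to design a high-performance SVP (Stack-based Video Processor). The SVP is a comprehensive architecture that achieves a better structure by selecting only the best aspects of previously proposed stack machines and video processors. Because object-oriented programs contain many small subroutines, the architecture processes object-oriented video data using a semi-general-purpose S-RISC (Stack-based Reduced Instruction Set Computer) with a stack buffer. It supports MPEG-4 half-pixel processing, advanced-mode motion compensation, motion estimation, and SA-DCT (Shape Adaptive-Discrete Cosine Transform), and it includes an absolute-value unit and a halving unit so that it can be extended into an encoder. The SVP was designed using a 0.6 μm 3-metal-layer CMOS standard cell library, consists of 110K logic gates and a 12 Kbit internal SRAM buffer, and operates at 50 MHz. It runs the video playback algorithm at QCIF 15 fps (frames per second), the maximum transmission rate of MPEG-4 VLBV (Very Low Bitrate Video).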

Multicore Flow Processor with Wire-Speed Flow Admission Control

Doo, Kyeong-Hwan;Yoon, Bin-Yeong;Lee, Bhum-Cheol;Lee, Soon-Seok;Han, Man Soo;Kim, Whan-Woo
- ETRI Journal
- Vol. 34, No. 6
- pp. 827-837
- 2012
We propose a flow admission control (FAC) for setting up a wire-speed connection for new flows based on their negotiated bandwidth. It also terminates a flow that does not have a packet transmitted within a certain period determined by the users. The FAC can be used to provide reliable transmission for user datagram protocol and transmission control protocol applications. If the flow period can be set to a short time, we can monitor active flows that carry a packet over networks during the flow period. Such powerful flow management can also be applied to security systems to detect a denial-of-service attack. We implement a network processor called a flow management network processor (FMNP), which is the second generation of the device that supports FAC. It has forty reduced instruction set computer core processors optimized for packet processing. It is fabricated in 65-nm CMOS technology and has 40-Gbps processing performance. We show that a flow router equipped with an FMNP is better than legacy systems in terms of throughput and packet loss.
https://doi.org/10.4218/etrij.12.1812.0046

On the Design of a Low-Power MCU Core

안형근;정봉영;노형래
- 전자공학회지
- Vol. 25, No. 5
- pp. 31-41
- 1998
With the advent of portable electronic systems, power consumption has recently become a major issue in circuit and system design. Furthermore, sophisticated fabrication technology makes it possible to embed more functions and features in a VLSI chip, consequently calling for both higher performance and lower power than in the past to deal with the ever-growing complexity of system algorithms. VLSI designers must cope with two conflicting constraints, high performance and low power, and offer an optimum trade-off between them to meet system requirements. Historically, VLSI designers have focused on performance improvement, and power dissipation was not a design criterion but an afterthought. This design paradigm should change, as power is emerging as the most critical design constraint. In VLSI design, low-power design can be accomplished in many ways, for instance through process, circuit/logic design, and architectural design. In this paper, a few low-power design examples, which have been used in an 8-bit micro-controller core and can also be used in 4/16/32-bit micro-controller cores, are presented in the areas of circuit, logic, and architectural design. We first propose low-power guidelines for micro-controller design at SAMSUNG, and then give more detailed design examples applying 4 specific design guidelines. The 1st example shows power reduction through a reduction in the number of state clocks per instruction. The 2nd example achieves power reduction by applying the RISC (Reduced Instruction Set Computer) concept. The 3rd example optimizes the algorithm for the ALU (Arithmetic Logic Unit) to lower power consumption. Lastly, circuit cells designed for low power are described.