Search | Korea Science

Design and Implementation of a 128-bit Block Cypher Algorithm SEED Using Low-Cost FPGA for Embedded Systems (내장형 시스템을 위한 128-비트 블록 암호화 알고리즘 SEED의 저비용 FPGA를 이용한 설계 및 구현)

Yi, Kang;Park, Ye-Chul
- Journal of KIISE:Computer Systems and Theory
- /
- v.31 no.7
- /
- pp.402-413
- /
- 2004
This paper presents an Implementation of Korean standard 128-bit block cipher SEED for the small (8 or 16-bits) embedded system using a low-cost FPGA(Field Programmable Gate Array) chip. Due to their limited computing and storage capacities most of the 8-bits/16-bits small embedded systems require a separate and dedicated cryptography processor for data encryption and decryption process which require relatively heavy computation job. So, in order to integrate the SEED with other logic circuit block in a single chip we need to invent a design which minimizes the area demand while maintaining the proper performance. But, the straight-forward mapping of the SEED specification into hardware design results in exceedingly large circuit area for a low-cost FPGA capacity. Therefore, in this paper we present a design which maximize the resource sharing and utilizing the modern FPGA features to reduce the area demand resulting in the successful implementation of the SEED plus interface logic with single low-cost FPGA. We achieved 66% area accupation by our SEED design for the XC2S100 (a Spartan-II series FPGA from Xilinx) and data throughput more than 66Mbps. This Performance is sufficient for the small scale embedded system while achieving tight area requirement.
PDF KSCI

A 16 bit FPGA Microprocessor for Embedded Applications (실장제어 16 비트 FPGA 마이크로프로세서)

차영호;조경연;최혁환
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.5 no.7
- /
- pp.1332-1339
- /
- 2001
SoC(System on Chip) technology is widely used in the field of embedded systems by providing high flexibility for a specific application domain. An important aspect of development any new embedded system is verification which usually requires lengthy software and hardware co-design. To reduce development cost of design effort, the instruction set of microprocessor must be suitable for a high level language compiler. And FPGA prototype system could be derived and tested for design verification. In this paper, we propose a 16 bit FPGA microprocessor, which is tentatively-named EISC16, based on an EISC(Extendable Instruction Set Computer) architecture for embedded applications. The proposed EISC16 has a 16 bit fixed length instruction set which has the short length offset and small immediate operand. A 16 bit offset and immediate operand could be extended using by an extension register and an extension flag. We developed a cross C/C++ compiler and development software of the EISC16 by porting GNU on an IBM-PC and SUN workstation and compared the object code size created after compiling a C/C. standard library, concluding that EISC16 exhibits a higher code density than existing 16 microprocessors. The proposed EISC16 requires approximately 6,000 gates when designed and synthesized with RTL level VHDL at Xilinix's Virtex XCV300 FPGA. And we design a test board which consists of EISC16 ROM, RAM, LED/LCD panel, periodic timer, input key pad and RS-232C controller. 11 works normally at 7MHz Clock.
PDF

A practial design of direct digital frequency synthesizer with multi-ROM configuration (병렬 구조의 직접 디지털 주파수 합성기의 설계)

이종선;김대용;유영갑
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.21 no.12
- /
- pp.3235-3245
- /
- 1996
A DDFS(Direct Digital Frequency Synthesizer) used in spread spectrum communication systems must need fast switching speed, high resolution(the step size of the synthesizer), small size and low power. The chip has been designed with four parallel sine look-up table to achieve four times throughput of a single DDFS. To achieve a high processing speed DDFS chip, a 24-bit pipelined CMOS technique has been applied to the phase accumulator design. To reduce the size of the ROM, each sine ROM of the DDFS is stored 0-.pi./2 sine wave data by taking advantage of the fact that only one quadrant of the sine needs to be stored, since the sine the sine has symmetric property. And the 8 bit of phase accumulator's output are used as ROM addresses, and the 2 MSBs control the quadrants to synthesis the sine wave. To compensate the spectrum purity ty phase truncation, the DDFS use a noise shaper that structure like a phase accumlator. The system input clock is divided clock, 1/2*clock, and 1/4*clock. and the system use a low frequency(1/4*clock) except MUX block, so reduce the power consumption. A 107MHz DDFS(Direct Digital Frequency Synthesizer) implemented using 0.8.mu.m CMOS gate array technologies is presented. The synthesizer covers a bandwidth from DC to 26.5MHz in steps of 1.48Hz with a switching speed of 0.5.mu.s and a turing latency of 55 clock cycles. The DDFS synthesizes 10 bit sine waveforms with a spectral purity of -65dBc. Power consumption is 276.5mW at 40MHz and 5V.
PDF

An FPGA Implementation of the Synthesis Filter for MPEG-1 Audio Layer III by a Distributed Arithmetic Lookup Table (분산산술연산방식을 이용한 MPEG-1 오디오 계층 3 합성필터의 FPGA 군현)

Koh Sung-Shik;Choi Hyun-Yong;Kim Jong-Bin;Ku Dae-Sung
- The Journal of the Acoustical Society of Korea
- /
- v.23 no.8
- /
- pp.554-561
- /
- 2004
As the technologies of semiconductor and multimedia communication have been improved. the high-quality video and the multi-channel audio have been highlighted. MPEG Audio Layer 3 decoder has been implemented as a Processor using a standard. Since the synthesis filter of MPEG-1 Audio Layer 3 decoder requires the most outstanding operation in the entire decoder. the synthesis filter that can reduce the amount of operation is needed for the design of the high-speed processor. Therefore, in this paper, the synthesis filter. the most important part of MPEG Audio, is materialized in FPGA using the method of DAULT (distributed arithemetic look-up table). For the design of high-speed synthesis filter, the DAULT method is used instead of a multiplier and a Pipeline structure is used. The Performance improvement by 30% is obtained by additionally making the result of multiplication of data with cosine function into the table. All hardware design of this Paper are described using VHDL (VHIC Hardware Description Language) Active-HDL 6.1 of ALDEC is used for VHDL simulation and Synplify Pro 7.2V is used for Model-sim and synthesis. The corresponding library is materialized by XC4013E and XC4020EX. XC4052XL of XILINX and XACT M1.4 is used for P&R tool. The materialized processor operates from 20MHz to 70MHz.
PDF KSCI

FPGA Mapping Incorporated with Multiplexer Tree Synthesis (멀티플렉서 트리 합성이 통합된 FPGA 매핑)

Kim, Kyosun
- Journal of the Institute of Electronics and Information Engineers
- /
- v.53 no.4
- /
- pp.37-47
- /
- 2016
The practical constraints on the commercial FPGAs which contain dedicated wide function multiplexers in their slice structure are incorporated with one of the most advanced FPGA mapping algorithms based on the AIG (And-Inverter Graph), one of the best logic representations in academia. As the first step of the mapping process, cuts are enumerated as intermediate structures. And then, the cuts which can be mapped to the multiplexers are recognized. Without any increased complexity, the delay and area of multiplexers as well as LUTs are calculated after checking the requirements for the tree construction such as symmetry and depth limit against dynamically changing mapping of neighboring nodes. Besides, the root positions of multiplexer trees are identified from the RTL code, and annotated to the AIG as AOs (Auxiliary Outputs). A new AIG embedding the multiplexer tree structures which are intentionally synthesized by Shannon expansion at the AOs, is overlapped with the optimized AIG. The lossless synthesis technique which employs FRAIG (Functionally Reduced AIG) is applied to this approach. The proposed approach and techniques are validated by implementing and applying them to two RISC processor examples, which yielded 13~30% area reduction, and up to 32% delay reduction. The research will be extended to take into account the constraints on the dedicated hardware for carry chains.
https://doi.org/10.5573/ieie.2016.53.4.037 인용 PDF KSCI

The Design of 32 Bit Microprocessor for Sequence Control Using FPGA (FPGA를 이용한 시퀀스 제어용 32비트 마이크로프로세서 설계)

Yang, Oh
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.40 no.6
- /
- pp.431-441
- /
- 2003
This paper presents the design of 32 bit microprocessor for a sequence control using a field programmable gate array(FPGA). The microprocessor was designed by a VHDL with top down method, the program memory was separated from the data memory for high speed execution of sequence instructions. Therefore it was possible that sequence instructions could be operated at the same time during the instruction fetch cycle. In order to reduce the instruction decoding time and the interface time of the data memory interface, an instruction code size was implemented by 32 bits. And the real time debug operation was implemented for easeful debugging the designed processor with a single step run, PC break point run, data memory break point run. Also in this designed microprocessor, pulse instructions, step controllers, master controllers, BM and BCD type arithmetic instructions, barrel shift instructions were implemented for sequence logic control. The FPGA was synthesized under a Xilinx's Foundation 4.2i Project Manager using a V600EHQ240 which contains 600,000 gates. Finally simulation and experiment were successfully performed respectively. For showing good performance, the designed microprocessor for the sequence logic control was compared with the H8S/2148 microprocessor which contained many bit instructions for sequence logic control. The designed processor for the sequence logic showed good performance.
PDF KSCI

Cascade CNN with CPU-FPGA Architecture for Real-time Face Detection (실시간 얼굴 검출을 위한 Cascade CNN의 CPU-FPGA 구조 연구)

Nam, Kwang-Min;Jeong, Yong-Jin
- Journal of IKEEE
- /
- v.21 no.4
- /
- pp.388-396
- /
- 2017
Since there are many variables such as various poses, illuminations and occlusions in a face detection problem, a high performance detection system is required. Although CNN is excellent in image classification, CNN operatioin requires high-performance hardware resources. But low cost low power environments are essential for small and mobile systems. So in this paper, the CPU-FPGA integrated system is designed based on 3-stage cascade CNN architecture using small size FPGA. Adaptive Region of Interest (ROI) is applied to reduce the number of CNN operations using face information of the previous frame. We use a Field Programmable Gate Array(FPGA) to accelerate the CNN computations. The accelerator reads multiple featuremap at once on the FPGA and performs a Multiply-Accumulate (MAC) operation in parallel for convolution operation. The system is implemented on Altera Cyclone V FPGA in which ARM Cortex A-9 and on-chip SRAM are embedded. The system runs at 30FPS with HD resolution input images. The CPU-FPGA integrated system showed 8.5 times of the power efficiency compared to systems using CPU only.
https://doi.org/10.7471/ikeee.2017.21.4.388 인용 PDF KSCI

A Study on Signal Analysis of the Data Aquisition System for Photosensor (데이터 획득장치에 이용되는 포토센서에 대한 DAS의 신호분석연구)

Hwang, InHo;Yoo, Sun Kook
- Journal of rehabilitation welfare engineering & assistive technology
- /
- v.10 no.3
- /
- pp.237-242
- /
- 2016
The major advantage of slip-ring technology in Spiral CT is that it facilitates continuous rotation of the x-ray tube, so that volume data can be acquired from a patient quickly. Not only for such a fast scan, but also for the dose reduction purpose, high signal-to-noise ratio and fast data acquisition system is required. In this study, we have built a multi-channel photodetector and multi-channel data acquisition system for CT application. The detector module consisted of CdWO4 crystal and Si photodiode in 16 channels. For the performance test of the preamplifier stage, both the transimpedance and switched integrator types are optimized for the photodetector modules. Switched integrator showed better noise performance in the limited bandwidth which is suitable for the current CT application. The control sequence for data acquisition and 20 bit ADC is designed with VHDL(Very High Speed Integrated Circuit Hardware Description Language) and implemented on FPGA(Field Programmable Gate Array) chip. Our Si photodiode detector module coupled to CdWO4 crystal showed comparable signal with other commercially available photodiode for CT. Switched integrator type showed higher SNR but narrower bandwidth compared to transimpedance preamplifier. Digital hardware is designed by FPGA, so that the control signal could be redesigned without hardware alteration.
https://doi.org/10.21288/resko.2016.10.3.237 인용 PDF KSCI

Design of PMOS-Diode Type eFuse OTP Memory IP (PMOS-다이오드 형태의 eFuse OTP IP 설계)

Kim, Young-Hee;Jin, Hongzhou;Ha, Yoon-Gyu;Ha, Pan-Bong
- The Journal of Korea Institute of Information, Electronics, and Communication Technology
- /
- v.13 no.1
- /
- pp.64-71
- /
- 2020
eFuse OTP memory IP is required to trim the analog circuit of the gate driving chip of the power semiconductor device. Conventional NMOS diode-type eFuse OTP memory cells have a small cell size, but require one more deep N-well (DNW) mask. In this paper, we propose a small PMOS-diode type eFuse OTP memory cell without the need for additional processing in the CMOS process. The proposed PMOS-diode type eFuse OTP memory cell is composed of a PMOS transistor formed in the N-WELL and an eFuse link, which is a memory element and uses a pn junction diode parasitic in the PMOS transistor. A core driving circuit for driving the array of PMOS diode-type eFuse memory cells is proposed, and the SPICE simulation results show that the proposed core circuit can be used to sense post-program resistance of 61㏀. The layout sizes of PMOS-diode type eFuse OTP memory cell and 512b eFuse OTP memory IP designed using 0.13㎛ BCD process are 3.475㎛ × 4.21㎛ (= 14.62975㎛2) and 119.315㎛ × 341.95㎛ (= 0.0408mm2), respectively. After testing at the wafer level, it was confirmed that it was normally programmed.
https://doi.org/10.17661/jkiiect.2020.13.1.64 인용 PDF KSCI

Simulation Study of a Large Area CMOS Image Sensor for X-ray DR Detector with Separate ROICs (센서-회로 분리형 엑스선 DR 검출기를 위한 대면적 CMOS 영상센서 모사 연구)

Kim, Myung Soo;Kim, Hyoungtak;Kang, Dong-uk;Yoo, Hyun Jun;Cho, Minsik;Lee, Dae Hee;Bae, Jun Hyung;Kim, Jongyul;Kim, Hyunduk;Cho, Gyuseong
- Journal of Radiation Industry
- /
- v.6 no.1
- /
- pp.31-40
- /
- 2012
There are two methods to fabricate the readout electronic to a large-area CMOS image sensor (LACIS). One is to design and manufacture the sensor part and signal processing electronics in a single chip and the other is to integrate both parts with bump bonding or wire bonding after manufacturing both parts separately. The latter method has an advantage of the high yield because the optimized and specialized fabrication process can be chosen in designing and manufacturing each part. In this paper, LACIS chip, that is optimized design for the latter method of fabrication, is presented. The LACIS chip consists of a 3-TR pixel photodiode array, row driver (or called as a gate driver) circuit, and bonding pads to the external readout ICs. Among 4 types of the photodiode structure available in a standard CMOS process, $N_{photo}/P_{epi}$ type photodiode showed the highest quantum efficiency in the simulation study, though it requires one additional mask to control the doping concentration of $N_{photo}$ layer. The optimized channel widths and lengths of 3 pixel transistors are also determined by simulation. The select transistor is not significantly affected by channel length and width. But source follower transistor is strongly influenced by length and width. In row driver, to reduce signal time delay by high capacitance at output node, three stage inverter drivers are used. And channel width of the inverter driver increases gradually in each step. The sensor has very long metal wire that is about 170 mm. The repeater consisted of inverters is applied proper amount of pixel rows. It can help to reduce the long metal-line delay.

Search Result 584, Processing Time 0.027 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)