• Title/Summary/Keyword: FPGA architecture

Search Result 406, Processing Time 0.032 seconds

A detailed FPGA routing by 2-D track assignment (이차원 트랙 할당에 의한 FPGA 상세 배선)

  • 이정주;임종석
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.34C no.10
    • /
    • pp.8-18
    • /
    • 1997
  • In FPGAs, we may use the property of the routing architecture for their routing compared to the routing in the conventional layout style. Especially, the Xilinx XC4000 series FPGAs have very special routing architecture in which the routing problem is equivalent to the two dimensional track assignment problem. In this paper, we propose a new FPgA detailed routing method by developing a two dimensional trackassigment heuristic algorithm. The proposed routing mehtod accept a global routing result as an input and obtain a detailed routing such that the number of necessary wire segments in each connection block is minimized. For all benchmark circuits tested, our routing methd complete routing results. The number of used tracks are also similar to the results by thedirect routing methods.

  • PDF

Memory saving architecture of number theoretic transform for lattice cryptography (동형 암호 시스템을 위한 정수 푸리에 변환의 메모리 절약 구조)

  • Moon, Sangook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2016.05a
    • /
    • pp.762-763
    • /
    • 2016
  • In realizing a homomorphic encryption system, the operations of encrypt, decypt, and recrypt constitute major portions. The most important common operation for each back-bone operations include a polynomial modulo multiplication for over million-bit integers, which can be obtained by performing integer Fourier transform, also known as number theoretic transform. In this paper, we adopt and modify an algorithm for calculating big integer multiplications introduced by Schonhage-Strassen to propose an efficient algorithm which can save memory. The proposed architecture of number theoretic transform has been implemented on an FPGA and evaluated.

  • PDF

Concurrent Support Vector Machine Processor (Concurrent Support Vector Machine 프로세서)

  • 위재우;이종호
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.53 no.8
    • /
    • pp.578-584
    • /
    • 2004
  • The CSVM(Current Support Vector Machine) that is a digital architecture performing all phases of recognition process including kernel computing, learning, and recall of SVM(Support Vector Machine) on a chip is proposed. Concurrent operation by parallel architecture of elements generates high speed and throughput. The classification problems of bio data having high dimension are solved fast and easily using the CSVM. Quadratic programming in original SVM learning algorithm is not suitable for hardware implementation, due to its complexity and large memory consumption. Hardware-friendly SVM learning algorithms, kernel adatron and kernel perceptron, are embedded on a chip. Experiments on fixed-point algorithm having quantization error are performed and their results are compared with floating-point algorithm. CSVM implemented on FPGA chip generates fast and accurate results on high dimensional cancer data.

A Hardware/Software Codesign for Image Processing in a Processor Based Embedded System for Vehicle Detection

  • Moon, Ho-Sun;Moon, Sung-Hwan;Seo, Young-Bin;Kim, Yong-Deak
    • Journal of Information Processing Systems
    • /
    • v.1 no.1 s.1
    • /
    • pp.27-31
    • /
    • 2005
  • Vehicle detector system based on image processing technology is a significant domain of ITS (Intelligent Transportation System) applications due to its advantages such as low installation cost and it does not obstruct traffic during the installation of vehicle detection systems on the road[1]. In this paper, we propose architecture for vehicle detection by using image processing. The architecture consists of two main parts such as an image processing part, using high speed FPGA, decision and calculation part using CPU. The CPU part takes care of total system control and synthetic decision of vehicle detection. The FPGA part assumes charge of input and output image using video encoder and decoder, image classification and image memory control.

Design of a Serial Port Interface Suitable for Bluetooth Embedded Systems (블루투스 임베디드 시스템에 적용 가능한 직렬 포트 인터페이스 설계)

  • Moon, Sangook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2009.05a
    • /
    • pp.903-906
    • /
    • 2009
  • In this contribution, we designed a serial port interface (SPI) suitable for embedded systems, especially for Bluetooth baseband. Proposed architecture is compatible for the APB bus in AMBA bus architecture. The 8-bit design of the SPI module is in charge of transferring the data and the instructions between the external devices and the coprocessors. We adopted the cyclic redundancy check method for the error correction. Also, we provided the interface for multimedia cards. The designed SPI module was automatically synthesized, placed, and routed. Implementation was performed through the Altera FPGA and well operated at 25MHz clock frequency.

  • PDF

Design and Implementation of Accelerator Architecture for Binary Weight Network on FPGA with Limited Resources (한정된 자원을 갖는 FPGA에서의 이진가중치 신경망 가속처리 구조 설계 및 구현)

  • Kim, Jong-Hyun;Yun, SangKyun
    • Journal of IKEEE
    • /
    • v.24 no.1
    • /
    • pp.225-231
    • /
    • 2020
  • In this paper, we propose a method to accelerate BWN based on FPGA with limited resources for embedded system. Because of the limited number of logic elements available, a single computing unit capable of handling Conv-layer, FC-layer of various sizes must be designed and reused. Also, if the input feature map can not be parallel processed at one time, the output must be calculated by reading the inputs several times. Since the number of available BRAM modules is limited, the number of data bits in the BWN accelerator must be minimized. The image classification processing time of the BWN accelerator is superior when compared with a embedded CPU and is faster than a desktop PC and 50% slower than a GPU system. Since the BWN accelerator uses a slow clock of 50MHz, it can be seen that the BWN accelerator is advantageous in performance versus power.

Development of High-Speed Real-Time Signal Processing Unit for Small Radio Frequency Tracking Radar Using TMS320C6678 (TMS320C6678을 적용한 소형 Radio Frequency 추적레이다용 고속 실시간 신호처리기 설계)

  • Kim, Hong-Rak;Hyun, Hyo-Young;Kim, Younjin;Woo, Seonkeol;Kim, Gwanghee
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.21 no.5
    • /
    • pp.11-18
    • /
    • 2021
  • The small radio frequency tracking radar is a tracking system with a radio frequency sensor that identifies a target through all-weather radio frequency signal processing for a target and searches, detects and tracks the target for the major target. In this paper, we describe the development of a board equipped with TMS320C6678 and XILINX FPGA (Field Programmable Gate Array), a high-speed multi-core DSP that acquires target information through all-weather radio frequency and identifies a target through real-time signal processing. We propose DSP-FPGA combination architecture for DSP and FPGA selection and signal processing, and also explain the design of SRIO for high-speed data transmission.

Design and Evaluation of 32-Bit RISC-V Processor Using FPGA (FPGA를 이용한 32-Bit RISC-V 프로세서 설계 및 평가)

  • Jang, Sungyeong;Park, Sangwoo;Kwon, Guyun;Suh, Taeweon
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.1
    • /
    • pp.1-8
    • /
    • 2022
  • RISC-V is an open-source instruction set architecture which has a simple base structure and can be extensible depending on the purpose. In this paper, we designed a small and low-power 32-bit RISC-V processor to establish the base for research on RISC-V embedded systems. We designed a 2-stage pipelined processor which supports RISC-V base integer instruction set except for FENCE and EBREAK instructions. The processor also supports privileged ISA for trap handling. It used 1895 LUTs and 1195 flip-flops, and consumed 0.001W on Xilinx Zynq-7000 FPGA when synthesized using Vivado Design Suite. GPIO, UART, and timer peripherals are additionally used to compose the system. We verified the operation of the processor on FPGA with FreeRTOS at 16MHz. We used Dhrystone and Coremark benchmarks to measure the performance of the processor. This study aims to provide a low-power, high-efficiency microprocessor for future extension.

Design of an Integrated Circuit for Controlling the Printer Head Ink Nozzle (프린터 헤드 노즐분사 제어용 집적회로설계)

  • 정승민;김정태;이문기
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.7 no.4
    • /
    • pp.798-804
    • /
    • 2003
  • In this paper, We have designed an advanced circuits for controlling the Ink Nozzle of Printer Head We can fully increase the number of nozzle by reducing the number of Input/Output PADs using the proposed new circuit. The proposed circuit is tested with only 20 nozzles to evaluate functional test using FPGA sample chip. The new circuit architecture can be estimated. Full circuit for controlling 320 nozzles was designed and simulated from ASIC full custom methodology, then the circuit was fabricated by applying 3${\mu}{\textrm}{m}$ CMOS process design rule.

A Design of Parallel Processing for Wavelet Transformation on FPGA (ICCAS 2005)

  • Ngowsuwan, Krairuek;Chisobhuk, Orachat;Vongchumyen, Charoen
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2005.06a
    • /
    • pp.864-867
    • /
    • 2005
  • In this paper we introduce a design of parallel architecture for wavelet transformation on FPGA. We implement wavelet transforms though lifting scheme and apply Daubechies4 transform equations. This technique has an advantage that we can obtain perfect reconstruction of the data. We divide our process to high pass filter and low pass filter. With this division, we can find coefficients from low and high pass filters simultaneously using parallel processing properties of FPGA to reduce processing time. From the equations, we have to design real number computation module, referred to IEEE754 standard. We choose 32 bit computation that is fine enough to reconstruct data. After that we arrange the real number module according to Daubechies4 transform though lifting scheme.

  • PDF