• Title/Summary/Keyword: Parallel Implementation


Design of the Digital Neuron Processor (디지털 뉴런프로세서의 설계에 관한 연구)

  • Hong, Bong-Wha;Lee, Ho-Sun;Park, Wha-Se
    • 전자공학회논문지 IE / v.44 no.3 / pp.12-22 / 2007
  • In this paper, we design a high-speed digital neuron processor for implementing digital neural networks. For high-speed operation, we design a MAC (multiplier and accumulator) operation unit that uses a residue number system, which avoids carry propagation. We also implement the sigmoid activation function, which makes neuron processors difficult to design. The designed circuits are described in VHDL and synthesized with Compass tools. In simulation, the designed MAC operation unit and sigmoid processing unit achieved an operation time of 19.6 ns and a hardware size reduction of about 50%, respectively. The designed digital neuron processor can be implemented in parallel distributed processing systems that require real-time processing.
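
The carry-free property of a residue number system MAC can be illustrated with a small software sketch. This is not the authors' VHDL design: the moduli set `MODULI` and the helper names `to_rns`, `rns_mac`, `from_rns`, and `rns_dot` are assumptions chosen for illustration, and only non-negative integers are handled. Each residue channel multiplies and accumulates independently, so no carry passes between channels; the Chinese Remainder Theorem recovers the ordinary integer at the end.

```python
from math import prod

# Hypothetical pairwise-coprime moduli (not taken from the paper).
MODULI = (251, 253, 255, 256)
DYNAMIC_RANGE = prod(MODULI)  # results are exact only below this value


def to_rns(x):
    """Encode a non-negative integer as one residue per modulus."""
    return tuple(x % m for m in MODULI)


def rns_mac(acc, a, b):
    """acc + a * b, computed independently in every residue channel."""
    return tuple((r + x * y) % m for r, x, y, m in zip(acc, a, b, MODULI))


def from_rns(residues):
    """Decode residues back to an integer with the Chinese Remainder Theorem."""
    total = 0
    for r, m in zip(residues, MODULI):
        partial = DYNAMIC_RANGE // m
        # pow(x, -1, m) is the modular inverse (Python 3.8+).
        total += r * partial * pow(partial, -1, m)
    return total % DYNAMIC_RANGE


def rns_dot(weights, inputs):
    """Weighted sum of a neuron's inputs, accumulated entirely in RNS form."""
    acc = to_rns(0)
    for w, x in zip(weights, inputs):
        acc = rns_mac(acc, to_rns(w), to_rns(x))
    return from_rns(acc)
```

In hardware the per-modulus channels would run side by side; the loop over `MODULI` here only models that independence.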

Real-Time Implementation of the Navigation Parameter Extraction from the Aerial Image Sequence (항공영상을 이용한 항법변수 추출 알고리듬의 실시간 구현)

  • 박인준;신상윤;전동욱;김관석;오영석;이민규;김인철;박래홍;이상욱
    • Proceedings of the IEEK Conference / 2000.09a / pp.489-492 / 2000
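  • In this paper, we study the real-time implementation of an image-based navigation parameter extraction algorithm. The algorithm consists of a relative position estimation algorithm, which estimates the current position with respect to the previous position, and an absolute position correction algorithm, which corrects the error accumulated by relative position estimation. The absolute position correction algorithm comprises a method that uses high-resolution images and IRS (Indian Remote Sensing) satellite images as reference images, and a method that uses a DEM (Digital Elevation Model). To implement the hybrid navigation parameter extraction algorithm in real time, we used the TMS320C80 DSP (Digital Signal Processor) chip, known as the MVP (Multimedia Video Processor). In the implemented system, the MP (Master Processor), the floating-point processor of the MVP, controls the PPs (Parallel Processors), which are fixed-point processors, and computes floating-point functions such as trigonometric functions, while most operations are performed on the PPs. We developed fast algorithms for the modules that require long processing times and proposed an image partitioning method for using the four PPs efficiently. A real-time simulation was performed using as input a sequence of aerial images captured with a camcorder from an aircraft, together with the aircraft's attitude information. The experimental results show that the hybrid navigation parameter extraction algorithm was effectively implemented in real time.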


A Helicopter-borne Pulse Doppler Radar Signal Processor Development using High Speed Multi-DSP (고속 Multi-DSP를 이용한 헬기탑재 펄스 도플러 레이다 신호처리기 개발)

  • Kwag, Young-Kil;Choi, Min-Su;Jeun, In-Pyung;Hwang, Gwang-Yeon;Lee, Kang-Hoon;Lee, Jae-Ho
    • Proceedings of the Korea Electromagnetic Engineering Society Conference / 2005.11a / pp.23-28 / 2005
  • An airborne radar is an essential avionics system that allows a helicopter to perform various missions in all-weather environments. This paper presents the design and implementation of an airborne pulse Doppler radar signal processor that uses high-speed multi-DSP hardware to provide multi-function radar capability, such as short-range, medium-range, and long-range modes, depending on the mission of the vehicle. In particular, the radar signal processor is developed using two DSP boards in parallel to run the various radar signal processing algorithms. The key algorithms include LFM chirp waveform-based pulse compression, an MTI clutter filter, an MTD processor, adaptive CFAR, and a clutter map. An airborne moving-clutter Doppler spectrum compensation algorithm, such as TACCAR, is also implemented for the multi-mode airborne radar system. The test results show good Doppler spectral separation between the clutter and the moving target in a flight test environment using a helicopter.


A TX Clock Timing Technique for the CIJ Compensation of Coupled Microstrip Lines

  • Jung, Hae-Kang;Lee, Soo-Min;Sim, Jae-Yoon;Park, Hong-June
    • JSTS:Journal of Semiconductor Technology and Science / v.10 no.3 / pp.232-239 / 2010
  • By using clock timing control at the transmitter (TX), the crosstalk-induced jitter (CIJ) is compensated for in 2-bit parallel data transmission through coupled microstrip lines on a printed circuit board (PCB). Compared to the authors' prior work, the delay block circuit is simplified by combining a delay block with a minimal number of stages and a 3-to-1 multiplexer. The delay block generates three clock signals with different delays corresponding to the channel delays of three different signal modes. The 3-to-1 multiplexer selects one of the three clock signals for TX timing depending on the signal mode. The TX is implemented in a $0.18\,\mu\mathrm{m}$ CMOS process. Measurements show that the TX reduces the RX jitter by about 38 ps at data rates from 2.6 Gbps to 3.8 Gbps. Compared to the authors' prior work, the RX jitter reduction increases from 28 ps to 38 ps with the improved implementation.

Implementation of Neural Networks using GPU (GPU를 이용한 신경망 구현)

  • Oh Kyoung-su;Jung Keechul
    • The KIPS Transactions:PartB / v.11B no.6 / pp.735-742 / 2004
  • We present a new use of common graphics hardware to run artificial neural networks faster, and we examine how the GPU improves the time performance of an image processing system that uses a neural network. When multiple input sets are computed in parallel, the vector-matrix products become matrix-matrix multiplications, so the parallelism of the GPU can be fully utilized. The sigmoid operation and bias term addition are also implemented with pixel shaders on the GPU. Our preliminary results show a speedup of about thirty times using an ATI RADEON 9800 XT board.
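
The step from vector-matrix to matrix-matrix products can be sketched on the CPU in plain Python. This is a minimal illustration only, not the paper's pixel-shader code; the function names `layer_single` and `layer_batch` and the weight layout (one row per input, one column per output) are assumptions. Stacking the input vectors as rows of a matrix turns a layer evaluation into one matrix-matrix product followed by element-wise bias addition and sigmoid, which is the form that maps well onto parallel hardware.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_single(x, weights, bias):
    """One input vector: a vector-matrix product, then bias and sigmoid."""
    return [
        sigmoid(sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j])
        for j in range(len(bias))
    ]


def layer_batch(inputs, weights, bias):
    """Many input vectors at once: a single matrix-matrix product."""
    n_out = len(bias)
    # Every entry of `products` is independent of the others, which is
    # what makes the batched form suitable for parallel evaluation.
    products = [
        [sum(row[i] * weights[i][j] for i in range(len(row))) for j in range(n_out)]
        for row in inputs
    ]
    return [[sigmoid(z + bias[j]) for j, z in enumerate(z_row)] for z_row in products]
```

Calling `layer_batch` on a list of inputs gives the same values as calling `layer_single` on each input in turn; only the shape of the computation changes.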

Spark Framework Based on a Heterogenous Pipeline Computing with OpenCL (OpenCL을 활용한 이기종 파이프라인 컴퓨팅 기반 Spark 프레임워크)

  • Kim, Daehee;Park, Neungsoo
    • The Transactions of The Korean Institute of Electrical Engineers / v.67 no.2 / pp.270-276 / 2018
  • Apache Spark is a high-performance in-memory computing framework for big-data processing. Recently, general-purpose computing on graphics processing units (GPGPU) has been applied to the Apache Spark framework to improve performance. Previous Spark-GPGPU frameworks focus on overcoming the implementation difficulty that results from the difference between the computation environments of GPGPU and the Spark framework. In this paper, we propose a Spark framework based on heterogeneous pipeline computing with OpenCL to further improve performance. The proposed framework overlaps the Java-to-native memory copies on the CPU with CPU-GPU communication (DMA) and GPU kernel computation to hide CPU idle time. In addition, the CPU-GPU communication buffers are implemented as switching dual buffers, which reduce the mapped memory region and thereby decrease memory mapping overhead. Experimental results show that the proposed Spark framework based on heterogeneous pipeline computing with OpenCL is up to 2.13 times faster than the previous Spark framework using OpenCL.
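
The overlap of copying and computing through two switching buffers can be sketched with host threads. This is an analogy under stated assumptions, not the paper's OpenCL or Spark code: `run_pipeline` and its parameters are hypothetical, the producer thread stands in for the Java-to-native copy, and the `kernel` callable stands in for the DMA transfer plus GPU kernel. While the consumer processes one buffer, the producer is free to fill the other.

```python
import queue
import threading


def run_pipeline(chunks, kernel):
    """Process chunks through two switching buffers, preserving order."""
    buffers = [None, None]   # the dual buffers
    free = queue.Queue()     # indices of buffers that may be overwritten
    ready = queue.Queue()    # indices of buffers that hold fresh data
    free.put(0)
    free.put(1)
    results = []

    def producer():
        for chunk in chunks:
            idx = free.get()             # wait for a free buffer
            buffers[idx] = list(chunk)   # stands in for the memory copy
            ready.put(idx)
        ready.put(None)                  # sentinel: no more data

    worker = threading.Thread(target=producer)
    worker.start()
    while True:
        idx = ready.get()
        if idx is None:
            break
        results.append(kernel(buffers[idx]))  # stands in for the GPU stage
        free.put(idx)                         # hand the buffer back
    worker.join()
    return results
```

For example, `run_pipeline([[1, 2], [3, 4], [5]], sum)` returns one result per chunk in input order. Only two buffers are ever allocated, regardless of how many chunks flow through.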

Generalized Binary Second-order Recurrent Neural Networks Equivalent to Regular Grammars (정규문법과 동등한 일반화된 이진 이차 재귀 신경망)

  • Jung Soon-Ho
    • Journal of Intelligence and Information Systems / v.12 no.1 / pp.107-123 / 2006
  • We propose Generalized Binary Second-order Recurrent Neural Networks (GBSRNN), which are equivalent to regular grammars, and show the implementation of a lexical analyzer that recognizes regular languages using them. All equivalent representations of regular grammars can be implemented in circuits using GBSRNN, since it has binary-valued components and reflects the structural relationship of a regular grammar. For a regular grammar with m symbols, p terminals, q nonterminals, and an input string of length k, the size of the corresponding GBSRNN is $O(m(p+q)^2)$, its parallel processing time is $O(k)$, and its sequential processing time is $O(k(p+q)^2)$.
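
The idea of a binary second-order recurrent step can be sketched for a small regular language. This is a generic illustration, not the paper's GBSRNN construction: the weight tensor `W`, the two-state layout, and the example language (ab)* are assumptions. "Second-order" means each next-state unit depends on products of a current-state unit and an input unit; with binary values those products reduce to AND and the sum reduces to OR.

```python
ALPHABET = "ab"

# W[j][i][k] == 1 means: state i on symbol k activates state j.
# Hypothetical two-state recognizer for (ab)*: q0 -a-> q1, q1 -b-> q0.
W = [
    [[0, 0], [0, 1]],  # next q0 is active if q1 is active and the symbol is 'b'
    [[1, 0], [0, 0]],  # next q1 is active if q0 is active and the symbol is 'a'
]


def step(state, symbol, weights):
    """One recurrent update; every output unit j can be computed in parallel."""
    n = len(state)
    m = len(symbol)
    return [
        int(any(weights[j][i][k] and state[i] and symbol[k]
                for i in range(n) for k in range(m)))
        for j in range(n)
    ]


def accepts(text):
    state = [1, 0]  # start in q0, which is also the accepting state
    for ch in text:
        one_hot = [int(ch == c) for c in ALPHABET]
        state = step(state, one_hot, W)
    return state[0] == 1
```

The string is consumed in k steps, and within a step all units are independent, which matches the shape of the linear parallel time stated in the abstract. Evaluating the units one after another costs extra work per step.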


The Design and Implementation of Arc Power supply for Neutral Beam Injection (중성입자빔 가열을 위한 아크 전원 공급장치 설계 및 구현)

  • Lee, Hee-Jun;Shin, Soo-Cheol;Lee, Seung-Gyo;Jung, Yong-Chae;Won, Chung-Yuen
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.27 no.6 / pp.50-58 / 2013
  • The Neutral Beam Injection (NBI) system generates ultra-high temperature energy in the tokamak of a nuclear fusion device. The NBI consists of a filament power supply, acceleration and deceleration power supplies, and an arc power supply (APS). The APS is characterized by low voltage and high current, and it generates an arc by supplying constant output voltage and current. This paper therefore proposes a buck converter suited to low-voltage, high-current operation. The proposed buck converter uses parallel switches, which increase capacity and reduce conduction loss. When an arc is generated, a high voltage occurs in the NBI chamber, which can destroy the output capacitor of the buck converter, so the output capacitor is removed in the proposed converter. As a result, the design of the output inductor is the most important factor in achieving a constant output from the buck converter. The operation and stability of the designed APS are verified through simulation and a prototype.

A Grid Service based on OGSA for Process Fault Detection (프로세스 결함 검출을 위한 OGSA 기반 그리드 서비스의 설계 및 구현)

  • Kang, Yun-Hee
    • Proceedings of the Korea Contents Association Conference / 2004.11a / pp.314-317 / 2004
  • With advances in network and software infrastructure, grid computing technology on clusters of heterogeneous computing resources has become pervasive. Grid computing requires the coordinated use of an assembly of distributed computers linked by a WAN. As the number of grid system components increases, the probability of failure in grid computing becomes higher than in traditional parallel computing. To make grid applications robust, fault detection is a critical and essential element of design and implementation. In this paper, an OGSA-based process fault-detection service is presented to provide high reliability in a low network traffic environment.


Design and Implementation of Distributed In-Memory DBMS-based Parallel K-Means as In-database Analytics Function (분산 인 메모리 DBMS 기반 병렬 K-Means의 In-database 분석 함수로의 설계와 구현)

  • Kou, Heymo;Nam, Changmin;Lee, Woohyun;Lee, Yongjae;Kim, HyoungJoo
    • KIISE Transactions on Computing Practices / v.24 no.3 / pp.105-112 / 2018
  • As data size increases, a single database is no longer enough to serve the current volume of tasks. Since data is partitioned and stored across multiple databases, analysis should also support parallelism in order to increase efficiency. However, traditional analysis requires data to be transferred out of the database to the nodes where the analytic service runs, and the user must know both the database and the analytic framework. In this paper, we propose an efficient way to perform the K-means clustering algorithm inside a distributed column-based database and a relational database. We also suggest an efficient way to optimize the K-means algorithm within a relational database.
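
The way K-means parallelizes over partitioned data can be sketched as follows. This is a generic illustration, not the paper's in-database function: `local_stats` and `kmeans_step` are hypothetical names, and plain Python lists stand in for database partitions. Each partition computes per-cluster sums and counts locally, and only those small aggregates are merged, so the raw rows never leave their partition.

```python
def local_stats(points, centroids):
    """Per-partition work: assign each point and accumulate sums and counts."""
    k = len(centroids)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        nearest = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
        )
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += p[d]
    return sums, counts


def kmeans_step(partitions, centroids):
    """One iteration: merge the partition aggregates and recompute centroids."""
    k = len(centroids)
    dim = len(centroids[0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    for part in partitions:  # each call could run on a separate node
        sums, counts = local_stats(part, centroids)
        for j in range(k):
            total_counts[j] += counts[j]
            for d in range(dim):
                total_sums[j][d] += sums[j][d]
    # An empty cluster keeps its previous centroid.
    return [
        [s / total_counts[j] for s in total_sums[j]] if total_counts[j]
        else list(centroids[j])
        for j in range(k)
    ]
```

The merged result matches a single-node iteration over the same points, up to floating-point rounding, because sums and counts combine across partitions by simple addition.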