• Title/Summary/Keyword: Parallel Implementation


A Spatiotemporal Parallel Processing Model for the MLP Neural Network (MLP 신경망을 위한 시공간 병렬처리모델)

  • Kim Sung-Oan
    • Journal of the Korea Society of Computer and Information / v.10 no.5 s.37 / pp.95-102 / 2005
  • A parallel processing model that exploits spatiotemporal parallelism is presented for the training procedure of the MLP neural network. The model is designed to be flexible by simultaneously applying training-set decomposition for temporal parallelism and network decomposition for spatial parallelism. The analytical performance evaluation model shows that, when the problem size is extremely large, the speedup of each implementation depends on whether the problem is pattern-size intensive or pattern-quantity intensive.
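The two decompositions named in this abstract can be illustrated with a minimal sketch. It assumes a single linear layer with a summed squared-error loss, which is a simplification and not the model from the paper; all function names are hypothetical. Training-set decomposition gives each worker a slice of the patterns and sums the partial gradients; network decomposition gives each worker a slice of the output neurons and all patterns. The workers are simulated sequentially here.

```python
def forward(weights, pattern):
    """Output of one linear layer (no activation) for a single pattern."""
    return [sum(w * x for w, x in zip(row, pattern)) for row in weights]


def gradient(weights, patterns, targets):
    """Gradient of 0.5 * sum of squared errors with respect to the weights."""
    grad = [[0.0] * len(weights[0]) for _ in weights]
    for x, t in zip(patterns, targets):
        y = forward(weights, x)
        for i, (yi, ti) in enumerate(zip(y, t)):
            for j, xj in enumerate(x):
                grad[i][j] += (yi - ti) * xj
    return grad


def training_set_decomposition(weights, patterns, targets, workers):
    """Each simulated worker handles a slice of the patterns; partial gradients are summed."""
    total = [[0.0] * len(weights[0]) for _ in weights]
    for k in range(workers):
        part = gradient(weights, patterns[k::workers], targets[k::workers])
        total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(total, part)]
    return total


def network_decomposition(weights, patterns, targets, workers):
    """Each simulated worker owns a slice of the output neurons and sees every pattern."""
    grad = [None] * len(weights)
    for k in range(workers):
        rows = list(range(k, len(weights), workers))
        if not rows:
            continue  # more workers than neurons: this worker stays idle
        sub_weights = [weights[i] for i in rows]
        sub_targets = [[t[i] for i in rows] for t in targets]
        part = gradient(sub_weights, patterns, sub_targets)
        for row_index, row_grad in zip(rows, part):
            grad[row_index] = row_grad
    return grad
```

Both functions return the same gradient as the undivided computation. In a multi-layer network the network decomposition would also require workers to exchange activations between layers, which this single-layer sketch leaves out.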


Interpolator Design for Cubic Parallel Manipulator (육면형 병렬공작기계의 보간기 설계)

  • Kim, H.;Hong, D.;Choi, W. C.;Song, J.-B.
    • Proceedings of the Korean Society of Precision Engineering Conference / 2001.04a / pp.492-495 / 2001
  • To use a parallel machine tool in a CAM system, an adequate interpolator must be developed. This paper presents a quintic B-spline interpolator with an algorithm that limits the maximum interpolation error. The favorable property of near arc-length parametrization in the curve representation is used to generate the reference commands. The interpolator is then applied to a cubic parallel manipulator to show its validity.


Leaky wave antenna analysis, design, and implementation (누설파 안테나 해석 설계 및 제작)

  • 홍재표;조웅희;이종익;윤리호;이정형;조영기;엄효준
    • Journal of the Korean Institute of Telematics and Electronics A / v.33A no.11 / pp.88-96 / 1996
  • A periodically slotted, dielectrically filled parallel-plate waveguide is designed and fabricated as a leaky wave antenna with a center frequency of 10.0 GHz. The antenna was fed through a hog-horn structure. The hog-horn, the two side walls, and the lower plate of the parallel-plate waveguide were fabricated from duralumin. The upper plate, with 48 periodic slots, was made of 1 mm thick copper plate. Paraffin was chosen as the dielectric material inside the waveguide. The experimental radiation pattern of the fabricated antenna was compared with the theoretical results for the finite periodic structure.


A Design and Implementation of Parallelizing Compiler in Loop Structure (루프구조의 병렬화 컴파일러 설계 및 구현)

  • 송월봉
    • Journal of the Korea Computer Industry Society / v.3 no.8 / pp.981-988 / 2002
  • In this paper, a simple parallelizing compiler for sequential loops is presented. It provides a procedure for automatically converting a sequential loop into nested parallel DOALL loops at compile time. To this end, the source program of the Parafrase II parallel compiler is analyzed, and a new general method for extracting parallelism, so that nested loops can be processed in parallel effectively, is implemented.


Benchmarks for Performance Testing of MPI-IO on the General Parallel File System (범용 병렬화일 시스템 상에서 MPI-IO 방안의 성능 평가 벤티마크)

  • Park, Seong-Sun
    • The KIPS Transactions: Part A / v.8A no.2 / pp.125-132 / 2001
  • IBM developed MPI-IO, which we refer to as MPI-2, on the General Parallel File System. We designed and implemented several matrix multiplication benchmarks to evaluate its performance. MPI-IO on the General Parallel File System offers four kinds of data access methods: non-collective blocking, collective blocking, non-collective non-blocking, and split collective operations. In this paper, we propose benchmarks that measure the I/O time and the computation time for each data access method. We describe both their implementation and the performance evaluation results.


Implementation of a Parallel Inverted Pendulum System with Decoupling Control (병렬형 역진자 시스템 제작 및 분리제어)

  • 김주호;박운식;최재원
    • Journal of the Korean Society for Precision Engineering / v.17 no.7 / pp.162-169 / 2000
  • In this paper, we develop a parallel inverted pendulum system characterized by dynamics strongly coupled through an elastic spring, time-variant system parameters, and inherent instability. Various physical systems can therefore be approximated by this representative system, and various control theories can be applied to it to verify their fidelity and efficiency. For this purpose, an experimental parallel inverted pendulum system has been implemented, and a control scheme using eigenstructure assignment for decoupling control is presented and compared with the conventional LQR optimal control method. Furthermore, the system can serve as a testbed for developing and evaluating new control algorithms through various setups. Finally, the experimental results are compared with numerical simulations for validation.


Implementation of High-Speed Reed-Solomon Decoder Using the Modified Euclid's Algorithm (개선된 수정 유클리드 알고리듬을 이용한 고속의 Reed-Solomon 복호기의 설계)

  • 김동선;최종찬;정덕진
    • The Transactions of the Korean Institute of Electrical Engineers A / v.48 no.7 / pp.909-915 / 1999
  • In this paper, we propose an efficient VLSI architecture for a Reed-Solomon (RS) decoder. To improve the speed, we develop an architecture featuring parallel and pipelined processing. To implement this architecture, we analyze the RS decoding algorithm and Horner's algorithm for parallel processing, and we modify Euclid's algorithm so that an efficient parallel structure can be applied in the RS decoder. The performance of the proposed RS decoder is compared with Shao's: it achieves a 10% gain in area efficiency and is three times faster than Shao's time-domain decoder. In addition, we implemented the proposed RS decoder on an Altera FPGA Flex10K-50.


Parallel implementations and their performance evaluations of a SOFM neural network on the multicomputer (다중컴퓨터망에서 SOFM 신경회로망의 병렬구현 및 성능평가)

  • 김선종;최흥문
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.10 / pp.90-97 / 1996
  • This paper presents efficient parallel implementations of a SOFM neural network on a multicomputer, together with their performance evaluations. We investigate the parallel performance as the size of the neural network $N$, the number of patterns $L$, and the number of processors $p$ increase. We propose an analytical performance evaluation model for each of the parallel implementations and verify the validity of the model through experiments. The analytical results show that the number of processors giving the maximum speedup increases in proportion to $\sqrt{N}$ for the network decomposition and to $\sqrt{L}$ for the training-set decomposition. The performance of both decompositions depends on the number of training patterns $L$ and the size of the neural network $N$; if $L \geq 0.423N$, the training-set decomposition is shown to perform better than the network decomposition.


Initial Timing Acquisition for Binary Phase-Shift Keying Direct Sequence Ultra-wideband Transmission

  • Kang, Kyu-Min;Choi, Sang-Sung
    • ETRI Journal / v.30 no.4 / pp.495-505 / 2008
  • This paper presents a parallel processing searcher structure for the initial synchronization of a direct sequence ultra-wideband (DS-UWB) system, which is suitable for the digital implementation of baseband functionalities with a 1.32 Gsample/s chip rate analog-to-digital converter. An initial timing acquisition algorithm and a data demodulation method are also studied. The proposed searcher effectively acquires initial symbol and frame timing during the preamble transmission period. A hardware-efficient receiver structure using 24 parallel digital correlators for binary phase-shift keying DS-UWB transmission is presented. The proposed correlator structure, operating at 55 MHz, is shared for correlation operations in a searcher, a channel estimator, and the demodulator of a RAKE receiver. We also present a pseudo-random noise sequence generated with a primitive polynomial, $1+x^2+x^5$, for packet detection, automatic gain control, and initial timing acquisition. Simulation results show that the proposed parallel processing searcher employing the presented pseudo-random noise sequence outperforms one employing a preamble sequence from the IEEE 802.15.3a DS-UWB proposal.
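As a rough illustration of the pseudo-random noise sequence mentioned above, the sketch below generates bits from the primitive polynomial $1+x^2+x^5$ with a linear feedback shift register. The tap convention (recurrence $a_{n+5} = a_{n+2} \oplus a_n$), the seed, and the function names are assumptions for illustration; the paper's actual generator and preamble format are not given in the abstract. Because the polynomial is primitive, any nonzero seed yields a sequence of period $2^5 - 1 = 31$ whose cyclic autocorrelation is sharply peaked, which is the property a correlator-based searcher relies on.

```python
def pn_sequence(seed=(1, 0, 0, 0, 0), length=31):
    """Generate bits with the recurrence a[n+5] = a[n+2] XOR a[n].

    The recurrence corresponds to the polynomial 1 + x^2 + x^5 under one
    common tap convention (an assumption; the reciprocal convention is also used).
    """
    if len(seed) != 5 or not any(seed):
        raise ValueError("seed must hold five bits and must not be all zero")
    bits = list(seed)
    while len(bits) < length:
        bits.append(bits[-3] ^ bits[-5])
    return bits[:length]


def period(seed=(1, 0, 0, 0, 0)):
    """Smallest shift after which the 5-bit register state repeats."""
    bits = pn_sequence(seed, length=2 * 31 + 5)
    for p in range(1, 2 * 31 + 1):
        if bits[p:p + 5] == bits[:5]:
            return p
    return None


def cyclic_autocorrelation(bits, shift):
    """Correlate the sequence (mapped 0 -> +1, 1 -> -1) with a cyclic shift of itself."""
    chips = [1 - 2 * b for b in bits]
    n = len(chips)
    return sum(chips[i] * chips[(i + shift) % n] for i in range(n))
```

Calling `cyclic_autocorrelation(pn_sequence(), shift)` for each shift from 0 to 30 shows a single peak at zero shift and a constant low value elsewhere.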


Design of High-speed Digit Serial-Parallel Multiplier in Finite Field GF($2^m$) (Finite Field GF($2^m$)상의 Digit Serial-Parallel Multiplier 구현)

  • Choi, Won-Ho;Hong, Sung-Pyo
    • Proceedings of the KIEE Conference / 2003.11c / pp.928-931 / 2003
  • This paper presents a digit-serial/parallel multiplier for finite fields GF($2^m$). The hardware requirements of the implemented multiplier are less than those of the existing multiplier of the same class, while processing time and area complexity. The implemented multiplier is regular and modular, and is therefore well suited to VLSI implementation. If the digit size $D$ of the implemented digit-serial multiplier is chosen appropriately, it can meet the throughput requirement of a given application with minimum hardware. The multipliers and squarers analyzed in this paper can be used efficiently in crypto processors for elliptic curve cryptosystems.
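The role of the digit size $D$ can be seen in a small software model. The sketch below is a generic most-significant-digit-first digit-serial multiplication in GF($2^m$), not the hardware architecture of the paper; the function names and the polynomial encoding (an integer whose bit $i$ is the coefficient of $x^i$, with the $x^m$ term included) are assumptions. Each loop iteration consumes $D$ bits of one operand, so the multiplication takes $\lceil m/D \rceil$ iterations: a larger $D$ means fewer iterations but more logic per iteration.

```python
def reduce_mod(value, m, poly):
    """Reduce a binary polynomial modulo poly, where poly has degree m."""
    for bit in range(value.bit_length() - 1, m - 1, -1):
        if (value >> bit) & 1:
            value ^= poly << (bit - m)  # cancel the term x^bit
    return value


def gf2m_mul_digit_serial(a, b, m, poly, digit_size):
    """Multiply a and b in GF(2^m), consuming digit_size bits of b per iteration.

    Assumes a and b are already reduced (both below 2**m).
    """
    num_digits = -(-m // digit_size)  # ceil(m / digit_size)
    mask = (1 << digit_size) - 1
    acc = 0
    for k in reversed(range(num_digits)):  # most significant digit first
        digit = (b >> (k * digit_size)) & mask
        acc <<= digit_size  # multiply the accumulator by x^D
        for i in range(digit_size):  # add a * digit (carry-less)
            if (digit >> i) & 1:
                acc ^= a << i
        acc = reduce_mod(acc, m, poly)
    return acc
```

For example, with the field GF($2^8$) defined by $x^8+x^4+x^3+x+1$ (encoded as `0x11B`), `gf2m_mul_digit_serial(0x57, 0x83, 8, 0x11B, 4)` completes in two iterations, while a digit size of 1 needs eight; the product is the same in both cases.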
