• Title/Summary/Keyword: cell processor

A linear array SliM-II image processor chip (선형 어레이 SliM-II 이미지 프로세서 칩)

  • 장현만;선우명훈
    • Journal of the Korean Institute of Telematics and Electronics C / v.35C no.2 / pp.29-35 / 1998
  • This paper describes the architecture and design of a SIMD-type parallel image processing chip called SliM-II. The chip has a linear array of 64 processing elements (PEs), operates at 30 MHz in worst-case simulation, and delivers at least 1.92 GIPS. In contrast to existing array processors such as IMAP, MGAP-2, and VIP, each PE has a multiplier, which is quite effective for convolution, template matching, and similar operations. The instruction set can execute an ALU operation, data I/O, and inter-PE communication simultaneously in a single instruction cycle. In addition, during an ALU/multiplier operation, SliM-II provides parallel moves between the register file and on-chip memory, as in DSP chips. SliM-II can greatly reduce the inter-PE communication overhead through sliding, a technique that overlaps inter-PE communication with computation. Moreover, bit-parallel data paths increase the bandwidth of data I/O and inter-PE communication. We used the COMPASS$^{TM}$ 3.3 V 0.6 $\mu$m standard cell library (v8r4.10). The total number of transistors is about 1.5 million, the core size is 13.2 $\times$ 13.0 mm$^{2}$, and the package type is a 208-pin PQ2 (Power Quad 2). The performance evaluation shows that, compared to existing array processors, the proposed architecture gives a significant improvement for algorithms requiring multiplications.
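As a rough check on the throughput figure in this abstract, the peak rate follows from the PE count and the clock rate if each PE is assumed to complete one instruction per cycle. That assumption and the function name `peak_gips` are illustrative and are not taken from the paper.

```python
def peak_gips(num_pes: int, clock_mhz: float, instr_per_cycle: int = 1) -> float:
    """Peak throughput in giga-instructions per second.

    Assumes every PE completes `instr_per_cycle` instructions each clock
    cycle (an assumption for illustration, not a figure from the paper).
    """
    instructions_per_second = num_pes * clock_mhz * 1e6 * instr_per_cycle
    return instructions_per_second / 1e9


# 64 PEs at 30 MHz, one instruction per PE per cycle
print(peak_gips(64, 30))
```

Under these assumptions the result matches the 1.92 GIPS quoted for the worst-case 30 MHz clock.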

A New Fast Algorithm for Short Range Force Calculation (근거리 힘 계산의 새로운 고속화 방법)

  • Lee, Sang-Hwan;Ahn, Cheol-O
    • 유체기계공업학회:학술대회논문집 / 2006.08a / pp.383-386 / 2006
  • In this study, we propose a new fast algorithm for calculating short-range forces in molecular dynamics. This algorithm uses a new hierarchical tree data structure that adapts well to the particle distribution. It can divide a parent cell into k daughter cells, and the tree structure is independent of the coordinate system and particle distribution. We investigated the characteristics and the performance of the tree structure according to k. For parallel computation, we used the orthogonal recursive bisection method for domain decomposition to distribute particles to each processor, and the numerical experiments were performed on a 32-node Linux cluster. We compared the performance of the oct-tree and the newly developed algorithm according to the particle distributions, problem sizes, and number of processors. The comparison was performed using a tree-independent method, and the results are independent of computing platform, parallelization, and programming language. It was found that the new algorithm can reduce the computing cost for a large problem whose search range is short compared to the computational domain. However, there are only small differences in wall-clock time, because the proposed algorithm requires more time to construct the tree structure than the oct-tree and the performance gain is small compared to the time for a single time-step calculation.
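The domain decomposition step mentioned in this abstract, orthogonal recursive bisection, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the choice to split at the median along the axis of greatest extent, and the names `orb_partition` and `levels`, are assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def orb_partition(points: Sequence[Point], levels: int) -> List[List[Point]]:
    """Split `points` into 2**levels groups of nearly equal size.

    At each level the particles are sorted along the axis with the largest
    extent and cut at the median (an assumed, commonly used ORB variant).
    """
    if levels == 0:
        return [list(points)]
    if not points:
        # Nothing left to split: still return the expected number of groups.
        return [[] for _ in range(2 ** levels)]
    # Pick the coordinate axis along which the particles spread the most.
    axis = max(
        range(3),
        key=lambda a: max(p[a] for p in points) - min(p[a] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (orb_partition(ordered[:mid], levels - 1)
            + orb_partition(ordered[mid:], levels - 1))


# Example: distribute 1000 synthetic particles over 32 processors (5 levels).
particles = [((i * 0.37) % 1.0, (i * 0.73) % 1.0, (i * 0.11) % 1.0)
             for i in range(1000)]
groups = orb_partition(particles, 5)
print(len(groups), min(map(len, groups)), max(map(len, groups)))
```

Cutting at the median keeps the particle counts per processor balanced, which is the usual motivation for this decomposition.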

A study on the characteristics of gas flow in inlet port of 2 cycle engine (2사이클 기관 흡기 포오트의 가스 유동 특성에 관한 연구)

  • 이창식
    • Transactions of the Korean Society of Mechanical Engineers / v.11 no.5 / pp.725-730 / 1987
  • The air flow through the inlet pipe of a reciprocating two-cycle engine was investigated experimentally under motored conditions. Measurements of the two components of velocity, the velocity fluctuation, and other behavior of the inlet flow were obtained with a laser Doppler anemometer system. The research engine comprised the cylinder head of a two-cycle engine mounted on an optical spacer with a measuring window and a glass inlet entry for laser anemometer measurement. A dual-beam laser Doppler anemometer was used with the conventional forward-scatter method and comprised an argon-ion laser, a frequency shifter with a Bragg cell module, and a signal processor. The mean velocity fluctuation of the inlet flow was measured for different engine speeds, measuring positions, and changes in cylinder volume. The results show that the mean velocity of the inlet air is strongly influenced by changes in engine speed. The effect of measuring position and cylinder volume on the inlet velocity was also investigated for the inlet port entry and is shown to be small compared to that of the engine speed.

A New Blind Beamforming Procedure Based on the Conjugate Gradient Method for CDMA Mobile Communications

  • Shin, Eung-Soon;Choi, Seung-Won;Shim, Dong-Hee;Kyeong, Mun-Geon;Chang, Kyung-Hi;Park, Youn-Ok;Han, Ki-Chul;Lee, Chung-Kun
    • ETRI Journal / v.20 no.2 / pp.133-148 / 1998
  • The objective of this paper is to present an adaptive algorithm for computing the weight vector that provides a beam pattern having its maximum gain along the direction of the mobile target signal source in the presence of interfering signals within a cell. The conjugate gradient method (CGM) is modified in such a way that the suboptimal weight vector is produced with a computational load of O(16N), where N denotes the number of antenna elements of the array. This load has been found to be small enough for real-time processing of signals in most land mobile communications with an off-the-shelf digital signal processor (DSP). The adaptive procedure proposed in this paper is applied to a code division multiple access (CDMA) mobile communication system to show its excellent performance in terms of signal to interference plus noise ratio (SINR), bit error rate (BER), and capacity, which are enhanced by about 7 dB, ${\frac{1}{100}}$ times, and 7 times, respectively, when the number of antenna elements is 6 and the processing gain is 20 dB.
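For readers unfamiliar with the conjugate gradient method named in this abstract, the following is a sketch of the textbook iteration for a real symmetric positive-definite system $Ax = b$. It is not the modified O(16N) procedure of the paper, whose details the abstract does not give; the function name and the real-valued, dense-matrix setting are assumptions made for illustration.

```python
from typing import List, Optional


def conjugate_gradient(A: List[List[float]], b: List[float],
                       tol: float = 1e-12,
                       max_iter: Optional[int] = None) -> List[float]:
    """Textbook conjugate gradient solver for A x = b.

    A must be symmetric positive definite. This is the standard method,
    shown only to illustrate the iteration the paper starts from.
    """
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    def matvec(M, v):
        return [dot(row, v) for row in M]

    x = [0.0] * n
    r = list(b)            # residual b - A x for the initial guess x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old <= tol * tol:
            break          # residual norm is small enough
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                      # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                           # direction update
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small example: the exact solution is x = [1/11, 7/11].
print(conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```

Each iteration costs one matrix-vector product plus a few vector operations, which is why the method is attractive when the per-snapshot computation budget is tight.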

Control Model of 1 kW Class Tactical Hybrid Power Generation System with Liquid Fuel Processor (야전용 액체 연료개질 1 kW급 하이브리드 전원시스템 제어 연구)

  • Ji, Hyun-Jin;Ha, Sang-Hyun;Kim, Young-Chul;Cho, Sung-Baek
    • Journal of the Korea Institute of Military Science and Technology / v.14 no.4 / pp.732-739 / 2011
  • A fuel cell/secondary battery hybrid power generation system could extend well beyond the efficiency and interoperability of the conventional diesel generator. The suggested power source system consists of a 2.3 kW class PEMFC, a 100 Ah lithium polymer battery, and two DC/DC converters in a serial connection. It is known that interoperability of the sub-systems is the key factor for stable and optimal control of a hybrid power generation system. Modeling and simulation methods are proposed to reduce the number of configurations and performance tests needed for component selection and to select the optimized control condition of the power generation system. The control model for the power source system is based on empirical formulation and is implemented in the Matlab/Simulink environment. The results show that the simulation can be used to establish the algorithm of the prototype and to increase the durability of the power source system.

Code Acquisition with Receive Diversity and Constant False Alarm Rate Schemes: 1. Homogeneous Fading Circumstance (수신기 다양성과 일정 오경보 확률 방법을 쓴 부호획득: 1. 균질 감쇄 환경)

  • Kwon Hyoung-Moon;Oh Jong-Ho;Song Iick-Ho;Lee Ju-Mi
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.4C / pp.371-380 / 2006
  • The performance characteristics of the cell averaging (CA), greatest of (GO), and smallest of (SO) constant false alarm rate (CFAR) processors in a homogeneous environment are obtained and compared when receiving antenna diversity is employed in the pseudonoise code acquisition of direct-sequence code division multiple access (DS/CDMA) systems. From the simulation results, it is observed that the CA CFAR scheme has the best performance and that the GO CFAR scheme has almost the same performance as the CA CFAR scheme in a homogeneous environment. In Part 2 of this paper, the CA, GO, and SO CFAR processors for code acquisition in a nonhomogeneous environment are addressed.
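The three CFAR variants compared in this abstract differ only in how the noise level is estimated from the reference cells on either side of the cell under test. A minimal sketch, assuming the common formulation in which each half-window is averaged first; the function names and the `scale` parameter are hypothetical, and in practice the scale factor is chosen to meet the desired false alarm rate.

```python
from typing import Sequence


def cfar_noise_estimate(leading: Sequence[float], lagging: Sequence[float],
                        scheme: str = "CA") -> float:
    """Noise estimate from the two reference half-windows.

    CA: average of the two half-window means (cell averaging)
    GO: greater of the two half-window means (greatest of)
    SO: smaller of the two half-window means (smallest of)
    """
    lead_mean = sum(leading) / len(leading)
    lag_mean = sum(lagging) / len(lagging)
    if scheme == "CA":
        return (lead_mean + lag_mean) / 2.0
    if scheme == "GO":
        return max(lead_mean, lag_mean)
    if scheme == "SO":
        return min(lead_mean, lag_mean)
    raise ValueError(f"unknown CFAR scheme: {scheme}")


def cfar_detect(cell: float, leading: Sequence[float],
                lagging: Sequence[float], scale: float,
                scheme: str = "CA") -> bool:
    """Declare a detection when the cell exceeds scale * noise estimate."""
    return cell > scale * cfar_noise_estimate(leading, lagging, scheme)


# Example with unequal noise levels on the two sides of the test cell.
lead, lag = [1.0, 1.0, 1.0, 1.0], [3.0, 3.0, 3.0, 3.0]
for name in ("CA", "GO", "SO"):
    print(name, cfar_detect(5.0, lead, lag, scale=2.0, scheme=name))
```

With equal noise on both sides the three estimates coincide on average, which is consistent with CA and GO performing almost identically in a homogeneous environment.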

A Study on the method for the measurement of vibrating amplitude and frequency with Laser Doppler Vibrometer (레이저 도플러 진동계를 이용한 진동변위와 주파수 측정방법 연구)

  • Kim, Seong-Hoon;Kim, Ho-Seong
    • Proceedings of the KIEE Conference / 1998.07e / pp.1824-1827 / 1998
  • A laser Doppler vibrometer (LDV) was developed using a He-Ne laser as the light source. The heterodyne method was employed; the output signal was digitally processed with a $\mu$-processor and the result was displayed on an LCD. The object beam, frequency-shifted by 40 MHz with a Bragg cell, was focused on the surface of the moving target, and the Doppler-shifted reflected beam was recombined with the reference beam at a fast photodetector to produce a frequency-modulated signal centered at 40 MHz. The signal from the detector was amplified and, after mixing, downconverted to an intermediate frequency centered at 1 MHz. A voltage output proportional to the velocity of the moving surface was obtained using a PLL. With the same method, the fringe pattern signal of the moving surface was obtained. This fringe pattern signal was converted to a TTL signal with a zero-crossing detector (ZCD) and then counted to calculate the displacement due to the vibration, which was displayed on the LCD. This LDV can be used to measure the resonant frequency of electrical equipment such as circuit breakers and transformers, whose resonant frequencies change when they are damaged.
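The two quantities this abstract derives from the optical signal can be related to the target motion with the standard relations for light reflected back along the beam: a Doppler shift of $2v/\lambda$ and a displacement of $\lambda/2$ per fringe. The sketch below assumes the common He-Ne wavelength of 632.8 nm and one count per full fringe; the actual count-to-fringe ratio of the ZCD circuit is not stated in the abstract, and the function names are illustrative.

```python
HE_NE_WAVELENGTH_M = 632.8e-9  # common He-Ne line; assumed, not stated in the abstract


def doppler_shift_hz(velocity_m_s: float,
                     wavelength_m: float = HE_NE_WAVELENGTH_M) -> float:
    """Doppler shift of light reflected from a surface moving along the beam."""
    return 2.0 * velocity_m_s / wavelength_m


def displacement_from_fringes(fringe_count: int,
                              wavelength_m: float = HE_NE_WAVELENGTH_M) -> float:
    """Displacement in metres, assuming one count per full fringe (lambda/2)."""
    return fringe_count * wavelength_m / 2.0


# A surface moving at 0.3164 m/s shifts the beam by about 1 MHz.
print(doppler_shift_hz(0.3164))
# 1000 counted fringes correspond to about 316.4 micrometres of travel.
print(displacement_from_fringes(1000))
```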

SoC including 2M-byte on-chip SRAM and analog circuits for Miniaturization and low power consumption (소형화와 저전력화를 위해 2M-byte on-chip SRAM과 아날로그 회로를 포함하는 SoC)

  • Park, Sung Hoon;Kim, Ju Eon;Baek, Joon Hyun
    • Journal of IKEEE / v.21 no.3 / pp.260-263 / 2017
  • Based on several CPU cores, an SoC including ADCs, a DC-DC converter, and 2M-byte SRAM is proposed in this paper. The CPU core consists of a 12-bit MENSA, a 32-bit symmetric multi-core processor, and a 16-bit CDSP. To eliminate the external SDRAM memory, an internal 2M-byte SRAM is implemented. Because the SRAM normally occupies a huge area, the parasitic components reduce the speed of the SoC. In this work, the SRAM blocks are divided into small pieces to reduce the parasitic components. The proposed SoC is developed in a standard 55 nm CMOS process and the speed of the SoC is 200 MHz.

A GF($2^{163}$) Scalar Multiplier for Elliptic Curve Cryptography for Smartcard Security (스마트카드 보안용 타원곡선 암호를 위한 GF($2^{163}$) 스칼라 곱셈기)

  • Jeong, Sang-Hyeok;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.10 / pp.2154-2162 / 2009
  • This paper describes a scalar multiplier for elliptic curve cryptography for smart card security. The scalar multiplier has a 163-bit key size, which supports the specifications of the smart card standard. To reduce the computational complexity of scalar multiplication over the finite field, the non-adjacent format (NAF) conversion algorithm based on complementary recoding is adopted. The scalar multiplier core, synthesized with a 0.35-${\mu}m$ CMOS cell library, has 32,768 gates and can operate at up to 150-MHz@3.3-V. It can be used in the hardware design of elliptic curve cryptography processors for smart card security.
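The NAF representation mentioned in this abstract rewrites the scalar with digits from {-1, 0, 1} so that no two adjacent digits are nonzero, which reduces the number of point additions in scalar multiplication. The sketch below uses the standard division-based NAF algorithm rather than the complementary recoding scheme of the paper, whose details the abstract does not give; the function names are illustrative.

```python
from typing import List


def naf(k: int) -> List[int]:
    """Non-adjacent form of a non-negative integer, least significant digit first.

    Standard algorithm: when k is odd, pick the digit (1 or -1) that makes
    the remaining value divisible by 4, so the next digit is forced to 0.
    """
    digits: List[int] = []
    while k > 0:
        if k & 1:
            d = 2 - (k % 4)   # 1 if k % 4 == 1, -1 if k % 4 == 3
            k -= d
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits


def naf_value(digits: List[int]) -> int:
    """Reconstruct the integer from its NAF digits (LSB first)."""
    return sum(d << i for i, d in enumerate(digits))


print(naf(7))  # [-1, 0, 0, 1], that is 7 = 8 - 1
```

In a scalar multiplier, each nonzero digit corresponds to a point addition or subtraction and every digit to a point doubling, so fewer nonzero digits means fewer additions.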

Design of Floating-Point Multiplier for Mobile Graphics Application (모바일 그래픽스 응용을 위한 부동소수점 승산기의 설계)

  • Choi, Byeong-Yoon;Salcic, Zoran
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.3 / pp.547-554 / 2008
  • In this paper, a two-stage pipelined floating-point multiplier (FP-MUL) is designed. The FP-MUL processor supports single-precision multiplication for 3D graphics APIs such as OpenGL and Direct3D, and has an area-efficient, low-latency architecture achieved through saturated arithmetic, an area-efficient sticky-bit generator, and a flagged prefix adder. The FP-MUL has a delay of about 4 ns with a $0.13{\mu}m$ CMOS standard cell library and consists of about 7,500 gates. Because its maximum performance is about 250 MFLOPS, it is applicable to mobile 3D graphics applications.