• Title/Summary/Keyword: parallel layout

Improvement of Address Pointer Assignment in DSP Code Generation (DSP용 코드 생성에서 주소 포인터 할당 성능 향상 기법)

  • Lee, Hee-Jin;Lee, Jong-Yeol
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.1 / pp.37-47 / 2008
  • Exploiting the address generation units typically provided in DSPs plays an important role in DSP code generation, since they perform fast address computation in parallel with the central data path. Offset assignment, the optimization of the memory layout of program variables to take advantage of the capabilities of address generation units, consists of two steps: memory layout generation and address pointer assignment. In this paper, we propose an effective address pointer assignment method that minimizes the number of address calculation instructions in DSP code generation. The proposed approach reduces the time complexity of a conventional address pointer assignment algorithm with fixed memory layouts by using minimum cost-nodes breaking. To reduce memory usage and processing time, we employ a powerful pruning technique. Moreover, because the memory layout affects the result of the address pointer assignment algorithm, our approach improves the initial solution iteratively by changing the memory layout in each iteration. We applied the proposed approach to about 3,000 sequences of the OffsetStone benchmarks to demonstrate its effectiveness. Experimental results show an average improvement of 25.9% in the address codes over previous works.
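
To make the cost being minimized concrete, the following is a minimal sketch of how address calculation instructions can be counted for a fixed memory layout. It is not the authors' algorithm: it assumes a single address register and an auto-increment/decrement range of 1, and the names `address_cost`, `layout`, `access_sequence`, and `max_step` are hypothetical.

```python
def address_cost(layout, access_sequence, max_step=1):
    """Count explicit address-update instructions for ONE address register.

    layout          -- variable names in memory order (list index = offset)
    access_sequence -- variable names in the order the program accesses them
    max_step        -- largest offset change the address generation unit can
                       apply for free (assumed auto-increment/decrement of 1)
    """
    offset = {var: i for i, var in enumerate(layout)}
    cost = 0
    for prev, cur in zip(access_sequence, access_sequence[1:]):
        # A jump larger than the free range needs an explicit instruction.
        if abs(offset[cur] - offset[prev]) > max_step:
            cost += 1
    return cost


sequence = ["a", "b", "c", "a", "d", "b"]
print(address_cost(["a", "b", "c", "d"], sequence))  # declaration order
print(address_cost(["c", "a", "d", "b"], sequence))  # alternative layout
```

With this access sequence the alternative layout needs fewer explicit address updates than the declaration-order layout, which illustrates why changing the layout changes the result of pointer assignment.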

Techniques of Internally Generating Waves on A Curve and Specifying Partial Reflection Conditions (파랑 수치모형에서 곡선형 내부조파기법과 부분반사조건 적용기법 개발)

  • Lee, Chang-Hoon;Kim, Min-Kyun;Kim, Duk-Gu;Choi, Hyuk-Jin;Cho, Yong-Jun
    • Proceedings of the Korea Water Resources Association Conference / 2005.05b / pp.532-537 / 2005
  • Techniques for internally generating waves on a curve in a rectangular grid system are developed using the line source method. Numerical experiments are conducted using the extended mild-slope equations of Suh et al. (1997). For five different types of wave generation layout, numerical experiments are conducted for the propagation of waves on a flat bottom and for the refraction and shoaling of waves on a plane slope. The fifth type of wave generation, which consists of two parallel lines connected to a semicircle, gives the best solutions, especially when the grid size is sufficiently small.

Preliminary design of cable-stayed bridges for vertical static loads

  • Michaltsos, G.T.;Ermopoulos, J.C.;Konstantakopoulos, T.G.
    • Structural Engineering and Mechanics / v.16 no.1 / pp.1-15 / 2003
  • This paper proposes a new method for the preliminary design of radial-system cable-stayed bridges subjected to static loads (self-weight, traffic loads, concentrated loads, etc.). The method is based on determining the relation that holds in each case between the tension forces of the cables and the corresponding bridge-deck deformations, and it can be extended to any type of cable layout (fan, parallel, or mixed system). Galerkin's method is used for the final determination of the cable stresses and the bridge deformation. Determining the equation that gives the cable forces in terms of the deck's configuration allows the problem to be converted into that of solving a continuous beam without cables.

A real-time high speed full search block matching motion estimation processor (고속 실시간 처리 full search block matching 움직임 추정 프로세서)

  • 유재희;김준호
    • Journal of the Korean Institute of Telematics and Electronics A / v.33A no.12 / pp.110-119 / 1996
  • A novel high-speed VLSI architecture, and VLSI realization methodologies for it, are presented for a motion estimation processor based on the full search block matching algorithm. The architecture is designed to suit highly parallel, pipelined processing with identical PEs, and its performance and hardware cost can be adjusted to the application area. Throughput is maximized by raising PE utilization to 100%, and the chip pin count is reduced by reusing image data held in embedded image memories. In addition, the uniform, identical data processing structure of the PEs eases VLSI implementation, and the clock rate of the external I/O data can be made slower than the internal clock rate to resolve the I/O bottleneck. Logic and SPICE simulation results of the proposed architecture are presented. Its performance is evaluated and compared with that of other architectures. Finally, the chip layout is shown.
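
For readers unfamiliar with full search block matching, the sketch below shows the computation in software form. It is only an illustration of the algorithm the processor accelerates, not the paper's hardware architecture; the sum of absolute differences (SAD) is assumed as the matching criterion, and the function names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Try every displacement in the search window and return
    (best_sad, dx, dy). Frames are lists of rows of pixel values."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidate blocks that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Tiny example: a 2 x 2 pattern sits at (2, 2) in the current frame
# and at (4, 3) in the reference frame.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
ref[3][4], ref[4][5] = 9, 7
cur[2][2], cur[3][3] = 9, 7
print(full_search(cur, ref, bx=2, by=2, n=2, search_range=2))
```

Every candidate displacement is evaluated independently with the same inner loop, which is the regularity that makes the algorithm a natural fit for an array of identical processing elements.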

Nonparametric Method using Placement in an Analysis of a Covariance Model

  • Hwang, Dong-Min;Kim, Dong-Jae
    • Communications for Statistical Applications and Methods / v.19 no.5 / pp.721-729 / 2012
  • Various methods control for the influence of a covariate on a response variable. These include analysis of covariance (ANCOVA), RANK ANCOVA, ANOVA of (covariate-adjusted) residuals, and Kruskal-Wallis tests on residuals. Covariate-adjusted residuals are obtained from the overall regression line fitted to the entire data set, ignoring the treatment levels or factors. It has been demonstrated that methods based on covariate-adjusted residuals are appropriate only when the regression lines are parallel and the covariate means are equal for all treatments. In this paper, we propose a new nonparametric method for the ANCOVA model that applies joint placement in a one-way layout to the residuals, as described in Chung and Kim (2007). A Monte Carlo simulation study is used to compare the power of the proposed procedure with that of the previous procedures.

Design and Fabrication of Parallel Coupled Line Band Pass Filter for 5.8GHz ISM Band (5.8GHz ISM밴드용 평행 결합선로 대역통과 여파기의 설계)

  • Jang, In-Seok;Son, Tae-Ho
    • Proceedings of the KAIS Fall Conference / 2006.05a / pp.381-383 / 2006
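  • In this paper, a parallel coupled-line band-pass filter for the 5.8 GHz ISM band is designed and fabricated. The basic design first transforms a low-pass filter into a band-pass filter, and the parallel coupled-line band-pass filter is then realized using a design based on series and parallel resonators together with J-inverters. Because two resonant frequencies are difficult to realize in practice, inverters are used so that only one resonator is needed. In addition, to determine the actual layout dimensions of the microstrip lines, the even- and odd-mode impedances are analyzed and the strip-line dimensions are determined from approximate formulas. Following this procedure, a parallel coupled-line band-pass filter for the 5.8 GHz ISM band was designed and fabricated.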

Effect of Capacitance Error on the A/D conversion Accuracy (커패시턴스 오차가 아날로그 디지털 변환의 정확도에 미치는 영향)

  • Lee, Yun-Tae;Kim, Chung-Gi;Gyeong, Jong-Min
    • Journal of the Korean Institute of Telematics and Electronics / v.22 no.5 / pp.57-61 / 1985
  • The effect of capacitance error on the A/D conversion accuracy in an A/D converter using a binary-weighted capacitor array was scrutinized. Besides the Monte Carlo method, which treats the inter-capacitance ratios as random variables, a "correlation approach" that considers the correlation coefficient between capacitances is proposed in this paper. Measurements of the capacitances of monolithic MOS capacitors showed that the correlation coefficient between capacitors decreases as the capacitor size increases. It was also verified that the parallel connection of unit capacitors and the common-centroid layout scheme significantly increase the inter-capacitance correlation coefficients.

Real-Time Storage and Retrieval Techniques for Continuous Media Storage Server (연속미디어 저장 서버에서의 실시간 저장 및 검색 기법)

  • CheolSu Lim
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.11 / pp.1365-1373 / 1995
  • In this paper, we address the issues of storing and retrieving continuous media (CM) data that arise in designing multimedia on-demand (MOD) storage servers. To support the two orthogonal factors of MOD server design, i.e., storage and retrieval of CM data, this paper discusses disk layout, disk striping, and real-time disk scheduling techniques, which are integrated into a combined solution for a high-performance MOD storage subsystem. The proposed clustered striping technique enables either a multiple-disk or a parallel system to guarantee continuous retrieval of CM data at the bandwidth required to support the user playback rate by avoiding the formation of I/O bottlenecks.

Large-scale 3D fast Fourier transform computation on a GPU

  • Jaehong Lee;Duksu Kim
    • ETRI Journal / v.45 no.6 / pp.1035-1045 / 2023
  • We propose a novel graphics processing unit (GPU) algorithm that can handle a large-scale 3D fast Fourier transform (i.e., 3D-FFT) problem whose data size is larger than the GPU's memory. A 1D FFT-based 3D-FFT computational approach is used to work around the limited device memory. Moreover, to reduce the communication overhead between the CPU and GPU, we propose a 3D data-transposition method that converts the target 1D vector into a contiguous memory layout and improves data transfer efficiency. The transposed data are transferred between the host and device memories efficiently through a pinned buffer and multiple streams. We apply our method to various large-scale benchmarks and compare its performance with a state-of-the-art multicore CPU FFT library (i.e., fastest Fourier transform in the West [FFTW]) and a prior GPU-based 3D-FFT algorithm. Our method achieves higher performance (up to 2.89 times) than FFTW, and the performance gap widens as the data size increases. The performance of the prior GPU algorithm decreases considerably on massive-scale problems, whereas our method's performance remains stable.
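
The 1D-based decomposition mentioned in the abstract can be sketched on the CPU as follows. This is a minimal illustration and not the paper's GPU algorithm: a naive O(n^2) DFT stands in for a real FFT routine, and the names `dft_1d` and `dft_3d` are hypothetical. Gathering each strided line into a contiguous list before transforming it loosely mirrors the idea of transposing data into a contiguous layout.

```python
import cmath


def dft_1d(values):
    """Naive O(n^2) 1D discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft_3d(volume):
    """3D DFT of volume[z][y][x], computed as 1D transforms along
    x, then y, then z."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    out = [[[complex(v) for v in row] for row in plane] for plane in volume]

    # Along x: each row is already a contiguous vector.
    for z in range(nz):
        for y in range(ny):
            out[z][y] = dft_1d(out[z][y])

    # Along y: gather the strided line, transform it, scatter it back.
    for z in range(nz):
        for x in range(nx):
            line = dft_1d([out[z][y][x] for y in range(ny)])
            for y in range(ny):
                out[z][y][x] = line[y]

    # Along z: same gather / transform / scatter pattern.
    for y in range(ny):
        for x in range(nx):
            line = dft_1d([out[z][y][x] for z in range(nz)])
            for z in range(nz):
                out[z][y][x] = line[z]
    return out


# A unit impulse at the origin transforms to a volume of all ones.
impulse = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
impulse[0][0][0] = 1.0
spectrum = dft_3d(impulse)
print(all(abs(v - 1) < 1e-9 for plane in spectrum for row in plane for v in row))
```

Because each pass only needs one line of data at a time, the volume can in principle be processed in pieces that fit in a smaller memory, which is the property the 1D-based approach relies on.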

Design of a 2.4GHz CMOS Low Noise Amplifier (2.4GHz CMOS 저잡음 증폭기)

  • 최혁환;오현숙;김성우;임채성;권태하
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.1 / pp.106-113 / 2003
  • In this paper, we propose a low-noise amplifier (LNA) for the 2.4 GHz ISM band in CMOS technology. Noise and gain are improved by a cascode architecture. The architecture, in which the common-source output of the cascode is connected to the input of a parallel MOS transistor, reduces IM. The LNA results are based on the Hynix 0.35 μm 2-poly 4-metal CMOS process with a 3.3 V supply. In HSPICE simulation, it achieves a gain of 13 dB, a noise figure of 1.7 dB, an IP3 of 8 dBm, input/output matching of -31 dB/-28 dB, reverse isolation of -25 dB, and power dissipation of 4.7 mW. The layout, done with Mentor, is smaller than 2 × 2 mm.