• Title/Summary/Keyword: Memory Requirement


Discontinuous Grids and Time-Step Finite-Difference Method for Simulation of Seismic Wave Propagation (지진파 전파 모의를 위한 불균등 격자 및 시간간격 유한차분법)

  • 강태섭;박창업
    • Proceedings of the Earthquake Engineering Society of Korea Conference
    • /
    • 2003.03a
    • /
    • pp.50-58
    • /
    • 2003
  • We have developed a locally variable time-step scheme matched to discontinuous grids in the finite-difference method for the efficient simulation of seismic wave propagation. The first-order velocity-stress formulations are used to obtain the spatial derivatives using finite-difference operators on a staggered grid. A grid three times coarser in the high-velocity region than in the low-velocity region is used to avoid spatial oversampling. Temporal steps corresponding to the spatial sampling ratio between the two regions are determined based on proper stability criteria. The wavefield in the margin of the region with the smaller time step is linearly interpolated in time using the values calculated in the region with the larger one. The accuracy of the proposed scheme is tested through comparisons with analytic solutions and with a conventional finite-difference scheme with constant grid spacing and time step. The use of the locally variable time-step scheme with discontinuous grids results in remarkable savings in computation time and memory requirement, with the efficiency depending on the simulation model. This implies that ground motion for realistic velocity structures including near-surface sediments can be modeled to high frequency (several Hz) without requiring excessive computer memory.
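
The time interpolation at the margin of the fine region can be illustrated with a minimal Python sketch. The 3:1 ratio between coarse and fine steps comes from the abstract; the function name, the use of a single scalar wavefield value, and the sample numbers are assumptions made for illustration and are not taken from the paper.

```python
def interpolate_fine_steps(u_coarse_prev, u_coarse_next, ratio=3):
    """Linearly interpolate in time between two coarse-step values.

    Returns the values at t0 + k * dt_coarse / ratio for k = 1..ratio,
    i.e. at every fine time level inside one coarse time step.
    (Hypothetical helper; the paper's own implementation is not shown.)
    """
    values = []
    for k in range(1, ratio + 1):
        w = k / ratio  # fraction of the coarse step that has elapsed
        values.append((1.0 - w) * u_coarse_prev + w * u_coarse_next)
    return values


# Margin value known at two consecutive coarse time levels (made-up numbers).
print(interpolate_fine_steps(0.0, 3.0))
```

With a ratio of 3, the fine region takes three time steps for every step of the coarse region, so the margin values it needs at the two intermediate levels are supplied by the interpolation rather than by extra coarse-grid updates.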


An IE-FFT Algorithm to Analyze PEC Objects for MFIE Formulation

  • Seo, Seung Mo
    • Journal of electromagnetic engineering and science
    • /
    • v.19 no.1
    • /
    • pp.6-12
    • /
    • 2019
  • An IE-FFT algorithm is implemented and applied to the electromagnetic (EM) solution of perfect electric conducting (PEC) scattering problems. The solution of the method of moments (MoM), based on the magnetic field integral equation (MFIE), is obtained for PEC objects with closed surfaces. The IE-FFT algorithm uses a uniform Cartesian grid to apply a global fast Fourier transform (FFT), which significantly reduces the memory requirement and speeds up the CPU computation with an iterative solver. The IE-FFT algorithm utilizes two discretizations, one for the unknown induced surface current on the planar triangular patches of 3D arbitrary geometries and the other on a uniform Cartesian grid for interpolating the free-space Green's function. The uniform interpolation of the Green's function allows for a global FFT for the far-field interaction terms, while the near-field interaction terms should be adequately corrected. A 3D block-Toeplitz structure for the Lagrangian interpolation of the Green's function is proposed. The MFIE formulation with the IE-FFT algorithm, without the help of a preconditioner, converges within a certain number of iterations with the generalized minimal residual (GMRES) method. The complexity of the IE-FFT is found to be approximately $O(N^{1.5})$ and $O(N^{1.5}\log N)$ for memory requirements and CPU time, respectively.
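
The reason a Toeplitz structure saves memory and time can be shown with a one-dimensional analogue. The following Python sketch multiplies a Toeplitz matrix by a vector through circulant embedding and an FFT, storing only the first column and first row instead of the full matrix. It is not the IE-FFT algorithm itself: the 1D setting, the function names, and the small radix-2 FFT are assumptions chosen to keep the example self-contained.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT. len(x) must be a power of two; the inverse is unscaled."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def toeplitz_matvec(first_col, first_row, v):
    """Compute T @ v for the real Toeplitz matrix with T[i][j] = t(i - j).

    first_col holds t(0), t(1), ...; first_row holds t(0), t(-1), ...
    """
    n = len(v)
    size = 1
    while size < 2 * n:
        size *= 2
    # First column of the embedding circulant:
    # t(0..n-1), zero padding, then t(-(n-1)), ..., t(-1).
    c = list(first_col) + [0.0] * (size - 2 * n + 1) + list(reversed(first_row[1:]))
    x = list(v) + [0.0] * (size - n)
    product = [a * b for a, b in zip(fft(c), fft(x))]
    y = fft(product, inverse=True)
    return [(val / size).real for val in y[:n]]


# 3x3 example: T = [[4, 3, 5], [1, 4, 3], [2, 1, 4]], v = [1, 2, 3]
print(toeplitz_matvec([4.0, 1.0, 2.0], [4.0, 3.0, 5.0], [1.0, 2.0, 3.0]))
```

The result agrees with the dense product (25, 18, 16) up to rounding error, while the storage grows linearly with the matrix dimension rather than quadratically.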

Mesh Stability Study for the Performance Assessment of a Deep Geological Repository Using APro

  • Hyun Ho Cho;Hong Jang;Dong Hyuk Lee;Jung-Woo Kim
    • Journal of Nuclear Fuel Cycle and Waste Technology(JNFCWT)
    • /
    • v.21 no.2
    • /
    • pp.283-294
    • /
    • 2023
  • APro, developed at KAERI for the process-based total system performance assessment (TSPA) of deep geological disposal systems, performs finite element method (FEM)-based multiphysics analysis. In FEM-based analysis, the mesh element quality influences the numerical solution accuracy, memory requirement, and computation time. Therefore, an appropriate mesh structure should be constructed before the mesh stability analysis to achieve an accurate and efficient process-based TSPA. A generic reference case of DECOVALEX-2023 Task F, which has been proposed for simulating stationary groundwater flow and time-dependent conservative transport of two tracers, was used in this study for mesh stability analysis. The relative differences in tracer concentration for varying mesh structures were determined by comparison with the results for the finest mesh structure. For calculation efficiency, the memory requirements and computation time were compared. Based on the mesh stability analysis, an approach based on adaptive mesh refinement was developed to resolve the error in the early stage of the simulation time period. It was observed that the relative difference in the tracer concentration decreased significantly while high calculation efficiency was maintained.

Fast Image Compression and Pixel-wise Switching Technique for Hardware Efficient Implementation of Dynamic Capacitance Compensation (하드웨어 효율적인 동적 커패시턴스 보상 구현을 위한 고속 영상 압축 및 화소별 스위칭 기법)

  • Choi, Joon-Hwan;Song, Won-Suk;Choi, Hyuk
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.8
    • /
    • pp.616-622
    • /
    • 2009
  • Thanks to the Dynamic Capacitance Control (DCC) technique, the response time of LCD displays has greatly improved. However, DCC requires high-speed memory for real-time writing and reading of the image of the previous frame, which increases hardware overhead and cost. In this paper, we propose Modified Exponential Golomb (MEG) coding, a low-complexity, high-speed image compression method that can remarkably reduce the memory requirement for DCC. We also propose a pixel-wise DCC switching technique to prevent compression errors from affecting the quality of the final image on the LCD. In our experiment, the degradation in visual quality was not noticeable when we cut the DCC memory size of 1080i HD data by 1/3.
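
The abstract does not describe how MEG coding modifies the base code, so the following Python sketch shows only the standard order-0 Exponential-Golomb code that the name refers to. The function names are hypothetical, and representing codewords as strings of '0' and '1' is an assumption made for readability, not a hardware-oriented design.

```python
def exp_golomb_encode(n):
    """Standard order-0 Exp-Golomb codeword of a non-negative integer, as a bit string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    binary = bin(n + 1)[2:]  # binary digits of n + 1, without the '0b' prefix
    return "0" * (len(binary) - 1) + binary  # zero prefix announces the length


def exp_golomb_decode(bits):
    """Decode a well-formed concatenation of order-0 Exp-Golomb codewords."""
    values, pos = [], 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":  # count the leading zeros of this codeword
            zeros += 1
            pos += 1
        values.append(int(bits[pos:pos + zeros + 1], 2) - 1)
        pos += zeros + 1
    return values


samples = [0, 1, 2, 3, 4, 8]
stream = "".join(exp_golomb_encode(n) for n in samples)
print(stream, exp_golomb_decode(stream))
```

Small values receive short codewords (0 is a single bit), which is why codes of this family suit data whose values are concentrated near zero, and both directions need only bit counting and shifts.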

Embedded Compression Codec Algorithm for Motion Compensated Wavelet Video Coding System (움직임 보상된 웨이블릿 기반의 비디오 코딩 시스템에 적용 가능한 임베디드 압축 코덱 알고리즘)

  • Kim, Song-Ju
    • The Journal of the Korea Contents Association
    • /
    • v.12 no.3
    • /
    • pp.77-83
    • /
    • 2012
  • In this paper, a low-complexity embedded compression (EC) codec algorithm is applied to a wavelet video coder to reduce its excessive external memory requirements. The EC algorithm achieves a fixed compression ratio of 50 % under a near-lossless compression constraint. Compared with a direct implementation of the wavelet video encoder, the EC technique reduces the memory requirement for intermediate low-frequency coefficients during the multiple discrete wavelet transform (DWT) stages by 50 %. Furthermore, the EC scheme, based on forward adaptive quantization and fixed-length coding, reduces the bandwidth and the buffer size between the DWT and SPIHT stages to 50 %. Simulation results show that our EC algorithm causes an average PSNR degradation of only 0.179 and 0.162 dB when the target bit rate of the video coder is 1 and 0.5 bpp, respectively.

Performance Analysis and Enhancing Techniques of Kd-Tree Traversal Methods on GPU (GPU용 Kd-트리 탐색 방법의 성능 분석 및 향상 기법)

  • Chang, Byung-Joon;Ihm, In-Sung
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.2
    • /
    • pp.177-185
    • /
    • 2010
  • Ray-object intersection is an important element of ray tracing that takes up a substantial amount of computing time. In general, spatial data structures such as the kd-tree have frequently been used for static scenes to accelerate the intersection computation. Recently, a few variants of kd-tree traversal have been proposed that suit the GPU, which has a relatively restricted computing architecture compared with the CPU. In this article, we propose two further implementation techniques that improve on those previous ones. First, we present a cached stack method that aims to reduce the costly global memory access time incurred when the stack is allocated in global memory. Second, we present a rope-with-short-stack method that eases the substantial memory requirement that the previous rope method often entails. To show the effectiveness of our techniques, we compare their performance with that of the previous GPU traversal methods. The experimental results will provide prospective GPU ray tracer developers with valuable information, helping them choose a proper kd-tree traversal method.

FPGA Implementation of SURF-based Feature extraction and Descriptor generation (SURF 기반 특징점 추출 및 서술자 생성의 FPGA 구현)

  • Na, Eun-Soo;Jeong, Yong-Jin
    • Journal of Korea Multimedia Society
    • /
    • v.16 no.4
    • /
    • pp.483-492
    • /
    • 2013
  • SURF is an algorithm that extracts feature points and generates their descriptors from input images, and it is used in many applications such as object recognition, tracking, and panorama construction. Although SURF is known to be robust to changes of scale, rotation, and viewpoint, it is hard to implement in real time because of its complex and repetitive computations. In our experiment on a 3.3 GHz Pentium, it takes 240 ms to extract feature points and create descriptors for a VGA image containing about 1,000 feature points, which means that a software implementation cannot meet the real-time requirement, especially in embedded systems. In this paper, we present a hardware architecture that computes the SURF algorithm very fast while consuming minimal hardware resources. The two key concepts of our architecture are parallelism (for repetitive computations) and efficient line memory usage (obtained by analyzing memory access patterns). When synthesized for a Xilinx Virtex5LX330 FPGA, it occupies 101,348 LUTs and 1,367 KB of on-chip memory and achieves 30 frames per second at a 100 MHz clock.

Optimized Binary-Search-on-Range Architecture for IP Address Lookup (IP 주소 검색을 위한 최적화된 영역분할 이진검색 구조)

  • Park, Kyong-Hye;Lim, Hye-Sook
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.33 no.12B
    • /
    • pp.1103-1111
    • /
    • 2008
  • Internet routers forward an incoming packet to an output port toward its final destination through IP address lookup. Since each incoming packet should be forwarded at wire speed, high-speed search performance is essential. In this paper, IP address lookup algorithms using binary search are studied. Most binary search algorithms do not provide a balanced search, and hence the required number of memory accesses is excessive and the search performance is poor. The binary-search-on-range algorithm, on the other hand, provides high-speed search performance but requires a large amount of memory. This paper presents an optimized binary-search-on-range structure that reduces the memory requirement by deleting unnecessary entries and an entry field. With this optimization, it is shown that binary search on range can be performed in a routing table whose number of entries is similar to or smaller than the number of prefixes. Using real backbone routing data, the optimized structure is compared with the original binary-search-on-range algorithm in terms of search performance. A performance comparison with various binary search algorithms is also provided.
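
The basic idea of binary search on range can be sketched in Python as follows. This is a simplified illustration, not the optimized structure proposed in the paper: prefixes are expanded into address intervals, the best matching prefix of each interval is precomputed, and a lookup is one binary search over the sorted interval start points. The function names, the tuple layout, and the 8-bit address width in the example are assumptions.

```python
from bisect import bisect_right


def build_range_table(prefixes, width=32):
    """Build sorted interval starts and their precomputed next hops.

    prefixes: list of (value, length, next_hop), where value holds the
    prefix bits left-aligned in a width-bit address (hypothetical layout).
    """
    covers = []
    for value, length, hop in prefixes:
        span = 1 << (width - length)      # number of addresses under the prefix
        start = value & ~(span - 1)       # clear any bits beyond the prefix
        covers.append((start, start + span - 1, length, hop))
    limit = 1 << width
    # A new interval begins at every prefix start and just past every prefix end.
    points = {0}
    for start, end, _, _ in covers:
        points.add(start)
        if end + 1 < limit:
            points.add(end + 1)
    points = sorted(points)
    hops = []
    for p in points:
        matching = [c for c in covers if c[0] <= p <= c[1]]
        best = max(matching, key=lambda c: c[2], default=None)  # longest prefix wins
        hops.append(best[3] if best else None)
    return points, hops


def lookup(table, addr):
    """Return the next hop of the longest matching prefix, or None."""
    points, hops = table
    return hops[bisect_right(points, addr) - 1]


table = build_range_table(
    [(0b10000000, 1, "A"), (0b10100000, 3, "B"), (0b10101000, 5, "C")], width=8
)
print(lookup(table, 170), lookup(table, 180), lookup(table, 200), lookup(table, 5))
```

Because no prefix starts or ends inside an interval, every address in an interval shares the same longest matching prefix, so the search is balanced and takes a logarithmic number of comparisons. The cost is the table size: up to two entries per prefix in this naive form, which is the memory overhead the paper sets out to reduce.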

A Memory-based Reasoning Algorithm using Adaptive Recursive Partition Averaging Method (적응형 재귀 분할 평균법을 이용한 메모리기반 추론 알고리즘)

  • 이형일;최학윤
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.4
    • /
    • pp.478-487
    • /
    • 2004
  • We previously proposed the RPA (Recursive Partition Averaging) method to improve the storage requirement and classification rate of memory-based reasoning. That algorithm worked reasonably well in many areas; however, its major drawbacks are its partitioning condition and its way of extracting major patterns. We propose an adaptive RPA (ARPA) algorithm that uses the FPD (feature-based population densimeter) to stop the partitioning process and that produces optimized hyperrectangles instead of RPA's averaged major patterns. The proposed algorithm required only approximately 40% of the memory space needed by the k-NN classifier and showed classification performance superior to that of RPA. Also, by reducing the number of stored patterns, it showed excellent classification results compared with k-NN.

Resolving Cycle Extension Overhead Multimedia Data Retrieval

  • Won, Youjip;Cho, Kyungsun
    • Transactions on Control, Automation and Systems Engineering
    • /
    • v.4 no.2
    • /
    • pp.164-168
    • /
    • 2002
  • In this article, we present a novel approach to avoiding jitter, the temporary insufficiency of data blocks that occurs when a new session commences. We propose making a sufficient amount of data blocks available in memory so that ongoing sessions can survive the cycle extension. This technique is called "pre-buffering". We examine two approaches to pre-buffering: (i) loading all required data blocks prior to starting playback and (ii) incrementally accumulating the data blocks in each cycle. We develop an elaborate model to determine the amount of data blocks necessary to survive the cycle extension and to compute the startup latency involved in loading these data blocks. The simulation results show that limiting the disk bandwidth utilization to 60% can greatly improve the startup latency as well as the buffer requirement for individual streams.