• Title/Summary/Keyword: Memory Problem

Memory-Efficient Belief Propagation for Stereo Matching on GPU (GPU 에서의 고속 스테레오 정합을 위한 메모리 효율적인 Belief Propagation)

  • Choi, Young-Kyu;Williem, Williem;Park, In Kyu
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2012.11a / pp.52-53 / 2012
  • Belief propagation (BP) is a commonly used global energy minimization algorithm for solving the stereo matching problem in 3D reconstruction. However, it requires a large memory bandwidth and data size. In this paper, we propose a novel memory-efficient BP algorithm for stereo matching on the graphics processing unit (GPU). The data size and transfer bandwidth are significantly reduced by storing only a part of the whole message. In order to maintain the accuracy of the matching result, the local messages are reconstructed using the shared memory available on the GPU. Experimental results show almost an order-of-magnitude reduction in global memory consumption and a 21 to 46% saving in memory bandwidth compared to the conventional algorithm. The implementation on a recent GPU achieves a 22.8x speedup in execution time compared to execution on the CPU.
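To make the message-passing step concrete, here is a minimal sequential sketch of a min-sum belief propagation message update as commonly used in stereo matching. It is illustrative only: the paper's GPU kernels, partial message storage, and shared-memory reconstruction are not reproduced, and LABELS, LAMBDA, and the cost arrays are placeholder assumptions.

```c
/* Minimal sequential sketch of a min-sum belief propagation message
 * update for stereo matching.  Illustrative only: the paper's GPU
 * kernel, partial message storage, and shared-memory reconstruction
 * are not reproduced; LABELS, LAMBDA, and the cost arrays are
 * hypothetical placeholders. */
#include <float.h>
#include <stdio.h>

#define LABELS 16      /* number of disparity hypotheses (assumed) */
#define LAMBDA 1.0f    /* smoothness weight (assumed)              */

/* msg_out[d] = min over d' of ( data_cost[d'] + sum_in[d']
 *                               + LAMBDA * |d - d'| )              */
static void update_message(const float data_cost[LABELS],
                           const float sum_in[LABELS],
                           float msg_out[LABELS])
{
    for (int d = 0; d < LABELS; ++d) {
        float best = FLT_MAX;
        for (int dp = 0; dp < LABELS; ++dp) {
            float c = data_cost[dp] + sum_in[dp]
                    + LAMBDA * (d > dp ? d - dp : dp - d);
            if (c < best) best = c;
        }
        msg_out[d] = best;
    }
}

int main(void)
{
    float data[LABELS], in[LABELS], out[LABELS];
    for (int d = 0; d < LABELS; ++d) { data[d] = (float)d; in[d] = 0.0f; }
    update_message(data, in, out);
    printf("msg[0]=%.1f msg[%d]=%.1f\n", out[0], LABELS - 1, out[LABELS - 1]);
    return 0;
}
```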

A Technique for Improving the Performance of Cache Memories

  • Cho, Doosan
    • International Journal of Internet, Broadcasting and Communication / v.13 no.3 / pp.104-108 / 2021
  • In IoT and edge computing systems, memory is usually organized in a hierarchical structure to improve performance. Access speed slows down with distance from the CPU, in the order of registers, cache memory, main memory, and storage; energy consumption likewise increases with that distance. Therefore, it is important to place frequently used data in the upper levels of the memory hierarchy as much as possible to improve both performance and energy consumption. However, such a technique must also solve the cache performance degradation caused by the lack of spatial locality that occurs when the data access stride is large. This study proposes a technique that selectively places data with a large access stride in a software-controlled cache. With the proposed technique, spatial locality is improved by reducing the data access interval, and consequently cache performance is improved.
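The sketch below illustrates the general idea of gathering data accessed with a large stride into a small contiguous buffer (standing in for a software-controlled cache or scratchpad) so that later passes touch consecutive addresses. The sizes and the scratchpad array are hypothetical; this is not the paper's actual placement algorithm.

```c
/* Illustrative sketch: gather elements that are accessed with a large
 * stride into a small contiguous buffer (standing in for a
 * software-controlled cache / scratchpad), so subsequent passes touch
 * consecutive addresses.  Sizes and the "spm" array are hypothetical;
 * this is not the paper's actual data placement algorithm. */
#include <stdio.h>

#define N       (1 << 16)
#define STRIDE  64          /* large stride that defeats a normal cache */
#define SPM_LEN (N / STRIDE)

static float big[N];
static float spm[SPM_LEN];  /* stands in for the software-controlled cache */

int main(void)
{
    for (int i = 0; i < N; ++i) big[i] = (float)i;

    /* one strided gather into the contiguous buffer ... */
    for (int j = 0; j < SPM_LEN; ++j) spm[j] = big[(long)j * STRIDE];

    /* ... then repeated passes run over consecutive addresses. */
    float sum = 0.0f;
    for (int pass = 0; pass < 8; ++pass)
        for (int j = 0; j < SPM_LEN; ++j) sum += spm[j];

    printf("sum = %f\n", sum);
    return 0;
}
```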

An Efficient Parallel Algorithm for the Single Function Coarsest Partition Problem on the EREW PRAM

  • Ha, Kyeoung-Ju;Ku, Kyo-Min;Park, Hae-Kyeong;Kim, Young-Kook;Ryu, Kwan-Woo
    • ETRI Journal / v.21 no.2 / pp.22-30 / 1999
  • In this paper, we derive an efficient parallel algorithm to solve the single function coarsest partition problem. The algorithm runs in O(log²n) time using O(n log n) operations on the EREW PRAM with O(n) memory cells. Compared with previous PRAM algorithms that consume O(n^(1+ε)) memory cells for some positive constant ε > 0, our algorithm consumes fewer memory cells without increasing the total number of operations.
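For readers unfamiliar with the problem itself, the following naive sequential fixed-point refinement states what a coarsest partition for a single function is: refine an initial partition of {0..n-1} until any two elements in the same block also have their f-images in the same block. It is unrelated to the paper's O(log²n)-time EREW PRAM algorithm, and the example function and partition are arbitrary.

```c
/* Sequential sketch of the single function coarsest partition problem.
 * This naive O(n^2)-per-round fixed point only states the problem; it
 * is NOT the paper's EREW PRAM algorithm. */
#include <stdio.h>

#define N 8

int main(void)
{
    int f[N]     = { 1, 2, 3, 0, 5, 6, 7, 4 };  /* example function   */
    int label[N] = { 0, 0, 0, 0, 1, 1, 1, 1 };  /* initial partition  */
    int next[N];

    for (;;) {
        int changed = 0;
        /* canonical new label: smallest x with the same (label, f-label) pair */
        for (int x = 0; x < N; ++x) {
            int y;
            for (y = 0; y < x; ++y)
                if (label[y] == label[x] && label[f[y]] == label[f[x]]) break;
            next[x] = y;
        }
        for (int x = 0; x < N; ++x) {
            if (next[x] != label[x]) changed = 1;
            label[x] = next[x];
        }
        if (!changed) break;
    }

    for (int x = 0; x < N; ++x) printf("%d ", label[x]);
    printf("\n");
    return 0;
}
```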

Efficient Data Management for Finite Element Analysis with Pre-Post Processing of Large Structures (전-후 처리 과정을 포함한 거대 구조물의 유한요소 해석을 위한 효율적 데이터 구조)

  • 박시형;박진우;윤태호;김승조
    • Proceedings of the Computational Structural Engineering Institute Conference / 2004.04a / pp.389-395 / 2004
  • We consider the interface between a parallel distributed-memory multifrontal solver and the finite element method. We describe in detail the requirements and the data structure of the parallel FEM interface, which includes the element data and the node array. The full procedure for solving a large-scale structural problem is assumed to include pre- and post-processors, whose algorithms are not considered in this paper. The main advantage of the parallel FEM interface appears when a distributed-memory system with a large number of processors is used to solve a very large-scale problem. The memory efficiency and performance impact are examined by analyzing examples on the Pegasus cluster system.
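As a rough illustration of what such an interface has to carry, here is a hypothetical per-process data layout for a distributed-memory FEM partition, with an element table and a node array that separates locally owned nodes from ghost nodes shared with neighbouring processes. Field names and sizes are assumptions, not the paper's actual interface.

```c
/* Hypothetical sketch of per-process data for a distributed-memory FEM
 * interface.  Field names and sizes are assumptions, not the paper's
 * actual data structure. */
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    long   global_id;     /* global node number                      */
    int    owner_rank;    /* process (e.g. MPI rank) owning the node */
    double coord[3];      /* nodal coordinates                       */
} Node;

typedef struct {
    long global_id;       /* global element number                   */
    int  n_nodes;         /* nodes per element (e.g. 8 for a hexa)   */
    int *conn;            /* indices into the local node array       */
} Element;

typedef struct {
    int      n_local_nodes;   /* nodes owned by this process         */
    int      n_ghost_nodes;   /* copies of neighbours' nodes         */
    Node    *nodes;           /* owned nodes first, ghosts after     */
    int      n_elements;
    Element *elements;
} FemPartition;

/* Allocate an empty partition; a real interface would fill it from the
 * pre-processor's mesh partitioning before handing it to the solver. */
static FemPartition *fem_partition_alloc(int nn, int ng, int ne)
{
    FemPartition *p = calloc(1, sizeof *p);
    if (!p) return NULL;
    p->n_local_nodes = nn;
    p->n_ghost_nodes = ng;
    p->n_elements    = ne;
    p->nodes    = calloc((size_t)(nn + ng), sizeof *p->nodes);
    p->elements = calloc((size_t)ne, sizeof *p->elements);
    return p;
}

int main(void)
{
    FemPartition *p = fem_partition_alloc(1000, 120, 800);
    if (!p) return 1;
    printf("nodes: %d owned + %d ghost, elements: %d\n",
           p->n_local_nodes, p->n_ghost_nodes, p->n_elements);
    free(p->nodes); free(p->elements); free(p);
    return 0;
}
```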

An Effective Run-before Decoding Method Based on FSM (FSM 기법을 이용한 효과적인 run_before 복원 방식)

  • Moon, Yong-Ho
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.3C / pp.245-249 / 2006
  • In general, a large number of memory accesses are required to decode CAVLC in H.264/AVC. This is a serious problem for applications such as DMB and videophone services because of the considerable power consumed in accessing memory. To overcome this problem, we propose an efficient run_before decoding method in which memory accesses are eliminated by using an FSM together with arithmetic operations. The simulation results show that the proposed algorithm saves power without degrading video quality.
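The sketch below only illustrates the general idea of folding a small variable-length-code table into a finite state machine driven by control flow and arithmetic instead of memory lookups. The toy prefix code it decodes is invented for the example and is not the run_before table of H.264/AVC.

```c
/* Illustrative sketch of replacing a VLC lookup table with a tiny FSM
 * expressed as control flow and arithmetic, so no table is fetched from
 * memory.  It decodes a toy prefix code { "1"->0, "01"->1, "001"->2,
 * "000"->3 }; it is NOT the run_before table of the H.264/AVC standard. */
#include <stdio.h>

/* Return the decoded symbol and advance *pos within the bitstream. */
static int decode_toy_symbol(const unsigned char *bits, int *pos)
{
    int state = 0;                       /* number of leading zeros so far */
    for (;;) {
        int b = bits[(*pos)++];          /* next bit, 0 or 1 */
        if (b == 1 || state == 2)        /* '1' terminates; "000" also terminates */
            return (b == 1) ? state : 3;
        ++state;                         /* consumed one more leading '0' */
    }
}

int main(void)
{
    /* bitstream:  1 | 01 | 001 | 000  ->  symbols 0, 1, 2, 3 */
    const unsigned char bits[] = { 1, 0,1, 0,0,1, 0,0,0 };
    int pos = 0;
    for (int i = 0; i < 4; ++i)
        printf("symbol %d = %d\n", i, decode_toy_symbol(bits, &pos));
    return 0;
}
```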

SEC-DED-DAEC codes without mis-correction for protecting on-chip memories (오정정 없이 온칩 메모리 보호를 위한 SEC-DED-DAEC 부호)

  • Jun, Hoyoon
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.10 / pp.1559-1562 / 2022
  • As device technology scales down into the deep-submicron regime to achieve high-density, low-power, and high-performance integrated circuits, multiple-bit upsets caused by soft errors have become a major threat to on-chip memory systems. To address the soft error problem, single error correction, double error detection, and double adjacent error correction (SEC-DED-DAEC) codes have recently been proposed, but these codes do not resolve the mis-correction problem. We propose a SEC-DED-DAEC code without mis-correction. The decoder for the proposed code is implemented in hardware and verified. The results show that there is no mis-correction in the proposed code and that the decoder can be employed in on-chip memory systems.
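To make the mis-correction issue concrete, the following checker takes a candidate parity-check matrix H (each column is the syndrome of a single-bit error), verifies that all correctable syndromes (single errors and adjacent double errors) are nonzero and pairwise distinct, and then counts how many non-adjacent double errors would alias a correctable syndrome and hence be silently mis-corrected. The matrix below is an arbitrary placeholder for illustration, not a code from the paper.

```c
/* Checker for the mis-correction condition of a SEC-DED-DAEC code.
 * Columns of H are single-error syndromes; a double-error syndrome is
 * the XOR of the two columns.  Mis-correction can occur when a
 * non-adjacent double error produces the same syndrome as a correctable
 * pattern.  The H below is an arbitrary placeholder, not the paper's code. */
#include <stdio.h>

#define N 8   /* codeword length (data + check bits), illustrative */

/* each column stored as a small bitmask of parity checks */
static const unsigned H[N] = { 0x01, 0x02, 0x04, 0x08, 0x10, 0x1F, 0x0D, 0x16 };

int main(void)
{
    unsigned correctable[2 * N];
    int nc = 0;

    for (int i = 0; i < N; ++i) correctable[nc++] = H[i];              /* singles  */
    for (int i = 0; i + 1 < N; ++i) correctable[nc++] = H[i] ^ H[i+1]; /* adjacent */

    /* correctable syndromes must be nonzero and pairwise distinct */
    for (int i = 0; i < nc; ++i) {
        if (correctable[i] == 0) { puts("zero syndrome: invalid"); return 1; }
        for (int j = i + 1; j < nc; ++j)
            if (correctable[i] == correctable[j]) {
                puts("two correctable patterns share a syndrome: invalid");
                return 1;
            }
    }

    /* any non-adjacent double error hitting a correctable syndrome
     * would be silently mis-corrected */
    int miscorrections = 0;
    for (int i = 0; i < N; ++i)
        for (int j = i + 2; j < N; ++j) {
            unsigned s = H[i] ^ H[j];
            for (int k = 0; k < nc; ++k)
                if (s == correctable[k]) { ++miscorrections; break; }
        }

    printf("non-adjacent double errors that would be mis-corrected: %d\n",
           miscorrections);
    return 0;
}
```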

Fiber-reinforced micropolar thermoelastic rotating Solid with voids and two-temperature in the context of memory-dependent derivative

  • Alharbi, Amnah M.;Said, Samia M.;Abd-Elaziz, Elsayed M.;Othman, Mohamed I.A.
    • Geomechanics and Engineering / v.28 no.4 / pp.347-358 / 2022
  • The main concern of this article is the problem of a two-temperature fiber-reinforced micropolar thermoelastic medium with voids under the effects of rotation and mechanical force, in the context of several theories with a memory-dependent derivative (MDD) and variable thermal conductivity. The three-phase-lag model (3PHL), the dual-phase-lag model (DPL), the Green-Naghdi theories (G-N II, G-N III), the coupled theory, and the Lord-Shulman theory (L-S) are employed to solve the problem. Analytical expressions for the physical quantities are obtained using the Laplace-Fourier transform technique. Numerical results are shown graphically and analyzed, and the most significant points are highlighted.
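For context, the memory-dependent derivative is commonly defined as follows; this is the standard form from the MDD literature, and the particular kernel and time delay used in this paper are not reproduced here.

```latex
D_{\omega} f(t) = \frac{1}{\omega} \int_{t-\omega}^{t} K(t-\xi)\, f'(\xi)\, \mathrm{d}\xi
```

Here ω > 0 is the time delay and K(t - ξ) is a kernel function chosen on [t - ω, t]; taking K ≡ 1 reduces the expression to the backward difference (f(t) - f(t - ω))/ω.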

Multi-Core Transactional Memory for High Contention Parallel Processing (집중 충돌 병렬 처리를 위한 효율적인 다중 코어 트랜잭셔널 메모리)

  • Kim, Seung-Hun;Kim, Sun-Woo;Ro, Won-Woo
    • Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.1 / pp.72-79 / 2011
  • The importance of parallel programming has grown significantly since modern microprocessor architectures shifted to multi-core systems. Transactional memory has been proposed to address synchronization, which is usually implemented with locks; lock-based synchronization reduces parallelism and can cause deadlock. In this paper, we propose an efficient method for utilizing transactional memory in situations with high contention. The proposed idea is based on theoretical analysis and is verified by simulation, with the simulation environment implemented on an HTM (hardware transactional memory) system. We also use a model of the dining philosophers problem to discuss efficient resource management with the transactional memory technique.
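The following sketch emulates, in plain C with atomics, the all-or-nothing fork acquisition that a transaction would provide for the dining philosophers: take the left fork, try the right, and on failure release the left and retry, so no philosopher ever blocks while holding a fork. This is only a software illustration of the abort/retry behaviour; it is not the paper's HTM-based scheme, the counts are arbitrary, and a real system would add contention management to avoid livelock.

```c
/* Dining philosophers with transaction-like all-or-nothing fork
 * acquisition: a software illustration of abort/retry, not the paper's
 * HTM scheme.  Build with -pthread. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define PHILOSOPHERS 5
#define MEALS        10000

static atomic_int fork_taken[PHILOSOPHERS];   /* 0 = free, 1 = taken */

static int try_take(int f)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&fork_taken[f], &expected, 1);
}

static void *philosopher(void *arg)
{
    int id = (int)(long)arg;
    int left = id, right = (id + 1) % PHILOSOPHERS;

    for (int meal = 0; meal < MEALS; ++meal) {
        for (;;) {
            if (try_take(left)) {
                if (try_take(right)) break;            /* "commit": eat  */
                atomic_store(&fork_taken[left], 0);    /* "abort": retry */
            }
        }
        /* eat, then put both forks back */
        atomic_store(&fork_taken[right], 0);
        atomic_store(&fork_taken[left], 0);
    }
    return NULL;
}

int main(void)
{
    pthread_t t[PHILOSOPHERS];
    for (long i = 0; i < PHILOSOPHERS; ++i)
        pthread_create(&t[i], NULL, philosopher, (void *)i);
    for (int i = 0; i < PHILOSOPHERS; ++i)
        pthread_join(t[i], NULL);
    puts("all philosophers finished without deadlock");
    return 0;
}
```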

Design of a Massive Storage System Based on NAND Flash Memory (NAND 플래시 메모리 기반의 대용량 저장장치 설계)

  • Ryu, Dong-Woo;Kim, Sang-Wook;Maeng, Doo-Lyel
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.8 / pp.1962-1969 / 2009
  • During the past 20 years we have witnessed remarkable advances in the major components of computer systems, including the CPU, memory, network devices, and the HDD. Among these components, despite its tremendous advance in capacity, the HDD remains the one that drags down performance the most, and there is little indication that this problem will be resolved in the near future. We present a new approach to this problem using NAND flash memory. Research on flash memory as a storage medium is abundant these days, but almost all of it targets mobile or embedded devices. Our research aims to develop a NAND flash memory based storage system suitable even for enterprise-level server systems. This paper presents structural and operational mechanisms that overcome the weaknesses of existing NAND flash memory based storage systems, along with their evaluation.
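As background, NAND-flash storage systems are generally built on out-of-place updates managed by a flash translation layer. The sketch below shows a minimal page-mapping FTL; all sizes and names are illustrative, and it is not the storage architecture proposed in the paper.

```c
/* Minimal sketch of a page-mapping flash translation layer (FTL):
 * a logical-to-physical map plus out-of-place updates.  Illustrative
 * only; not the paper's proposed architecture. */
#include <stdio.h>
#include <string.h>

#define PAGES      64
#define PAGE_SIZE  32

static int  map_l2p[PAGES];            /* logical -> physical, -1 = unmapped */
static char flash[PAGES][PAGE_SIZE];   /* NAND pages (write once per erase)  */
static int  page_valid[PAGES];
static int  next_free = 0;

static int ftl_write(int lpn, const char *data)
{
    if (next_free >= PAGES) return -1;                    /* a real FTL would garbage-collect */
    if (map_l2p[lpn] >= 0) page_valid[map_l2p[lpn]] = 0;  /* invalidate the old copy */
    memcpy(flash[next_free], data, PAGE_SIZE);            /* out-of-place write      */
    page_valid[next_free] = 1;
    map_l2p[lpn] = next_free++;
    return 0;
}

static const char *ftl_read(int lpn)
{
    return map_l2p[lpn] < 0 ? NULL : flash[map_l2p[lpn]];
}

int main(void)
{
    for (int i = 0; i < PAGES; ++i) map_l2p[i] = -1;
    char buf[PAGE_SIZE] = "version 1";
    ftl_write(3, buf);
    strcpy(buf, "version 2");
    ftl_write(3, buf);                  /* same logical page, new physical page */
    printf("lpn 3 -> \"%s\" (physical page %d)\n", ftl_read(3), map_l2p[3]);
    return 0;
}
```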

A Self-Description File System for NAND Flash Memory (낸드 플래시 메모리를 위한 자기-서술 파일 시스템)

  • Han, Jun-Yeong;Park, Sang-Oh;Kim, Sung-Jo
    • Journal of KIISE: Computing Practices and Letters / v.15 no.2 / pp.98-113 / 2009
  • Conventional file systems for hard disk drives cannot be applied to NAND flash memory because its physical characteristics differ from those of a hard disk drive. To address this problem, various file systems with better reliability and efficiency have been developed recently. However, these file systems have inherent overhead for updating a file's meta-data pages because they store a file's meta-data and data separately. Furthermore, they have a critical reliability problem: the file system fails when either a meta-data page of the file system or of a file fails. In this paper, we propose a self-description page technique and an In Memory Core File System technique to address these efficiency and reliability problems, and we develop a new file system, SDFS (Self-Description File System). SDFS can be recovered safely even when some pages fail; it improves write and read performance by 36% and 15%, respectively, and reduces mounting time to 1/20 of that of YAFFS2.
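A minimal sketch of the self-description idea, under assumed field names: each page's spare area records which file and offset the page belongs to, so a small in-memory table can be rebuilt by scanning pages rather than by trusting separate meta-data pages. This is not SDFS's actual on-flash layout.

```c
/* Sketch of a "self-description page": every data page carries enough
 * metadata in its spare (OOB) area to identify its file, offset and
 * version, so an in-memory map can be rebuilt by scanning pages.
 * Field names and sizes are assumptions, not SDFS's real layout. */
#include <stdio.h>
#include <string.h>

#define PAGE_DATA  512
#define NUM_PAGES  16

typedef struct {
    unsigned file_id;      /* which file this page belongs to         */
    unsigned file_offset;  /* byte offset of the data inside the file */
    unsigned version;      /* newer version wins during the scan      */
    unsigned in_use;       /* 0 = free page                           */
} SpareArea;

typedef struct {
    char      data[PAGE_DATA];
    SpareArea spare;       /* self-describing metadata                */
} FlashPage;

static FlashPage flash[NUM_PAGES];

/* Rebuild a tiny "in-memory core" table for one file by scanning all
 * pages and keeping the newest page for each offset slot. */
static int rebuild_file_map(unsigned file_id, int map[], int map_len)
{
    int found = 0;
    for (int i = 0; i < map_len; ++i) map[i] = -1;
    for (int p = 0; p < NUM_PAGES; ++p) {
        const SpareArea *s = &flash[p].spare;
        if (!s->in_use || s->file_id != file_id) continue;
        int slot = (int)(s->file_offset / PAGE_DATA);
        if (slot >= map_len) continue;
        if (map[slot] < 0 || flash[map[slot]].spare.version < s->version)
            map[slot] = p;
        ++found;
    }
    return found;
}

int main(void)
{
    /* two versions of offset 0 of file 7; the scan must pick version 2 */
    flash[3].spare = (SpareArea){ 7, 0, 1, 1 };
    strcpy(flash[3].data, "old");
    flash[9].spare = (SpareArea){ 7, 0, 2, 1 };
    strcpy(flash[9].data, "new");

    int map[4];
    rebuild_file_map(7, map, 4);
    printf("file 7, offset 0 lives in page %d (\"%s\")\n", map[0], flash[map[0]].data);
    return 0;
}
```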