• Title/Summary/Keyword: Processing-in-Memory

Time-domain 3D Wave Propagation Modeling and Memory Management Using Graphics Processing Units (그래픽 프로세서를 이용한 시간 영역 3차원 파동 전파 모델링과 메모리 관리)

  • Kim, Ahreum;Ryu, Donghyun;Ha, Wansoo
    • Geophysics and Geophysical Exploration / v.19 no.3 / pp.145-152 / 2016
  • We used graphics processing units for efficient time-domain 3D wave propagation modeling. Since graphics processing units are designed for massively parallel processing, the calculation and memory management must be optimized to fully exploit them. We focused on memory management and examined how program performance depends on the memory management method. We also tested the effect of memory transfer on performance by varying the order of the finite difference equation and the size of the velocity model. The results show that, in programs that transfer the whole 3D wavefield, memory transfer takes a larger portion of the running time than the finite difference calculation.
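
As a rough illustration of the finite difference calculation this abstract refers to, here is a minimal CPU sketch in plain Python. It is not the authors' GPU code: the function names, the flat-list layout, the zero boundary, and the second-order stencil are all assumptions made for the example. The helper `wavefield_bytes` only shows how large a whole 3D wavefield transfer can be.

```python
def step(p_prev, p_cur, vel, n, dt, h):
    """Advance a cubic n*n*n acoustic wavefield by one time step.

    Assumed scheme: second order in time and space, boundary cells held
    at zero. Arrays are flat lists indexed as (i * n + j) * n + k.
    """
    p_next = [0.0] * (n * n * n)
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                c = (i * n + j) * n + k
                # 7-point Laplacian of the current wavefield.
                lap = (p_cur[c - n * n] + p_cur[c + n * n]
                       + p_cur[c - n] + p_cur[c + n]
                       + p_cur[c - 1] + p_cur[c + 1]
                       - 6.0 * p_cur[c]) / (h * h)
                p_next[c] = (2.0 * p_cur[c] - p_prev[c]
                             + (vel[c] * dt) ** 2 * lap)
    return p_next


def wavefield_bytes(n, bytes_per_value=4):
    """Size of one n*n*n wavefield snapshot (single precision by default)."""
    return n * n * n * bytes_per_value
```

A higher-order stencil reads more neighbours per cell and so does more arithmetic per step, while the size of one snapshot depends only on the grid size; these are the two parameters the abstract says were varied.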

Cost Analysis of Window Memory Relocation for Data Stream Processing (데이터 스트림 처리를 위한 윈도우 메모리 재배치의 비용 분석)

  • Lee, Sang-Don
    • The Journal of the Korea Contents Association / v.8 no.4 / pp.48-54 / 2008
  • This paper analyzes cost tradeoffs between memory usage and computation for window-based operators in data stream environments. It identifies generic operator network constructs and sets up a cost model that estimates the expected memory reduction and the computation overhead when window memory relocation is applied to each construct. The cost model helps to assess the utility of window memory relocation and to apply it to a query execution plan to reduce memory usage. The proposed approach helps expand the scope of query processing and optimization in data stream environments. It also provides a basis for developing a cost estimation model for query optimization using window memory relocation.

Designing a Bitonic Sorting Algorithm for Shared-Memory Parallel Computers and an Efficient Implementation of its Communication (공유 메모리 병렬 컴퓨터 환경에서 Bitonic Sorting 알고리즘 설계와 효율적인 통신의 구현)

  • Lee, Jae-Dong;Kwon, Kyung-Hee;Park, Yong-Beom
    • The Transactions of the Korea Information Processing Society / v.4 no.11 / pp.2690-2700 / 1997
  • This paper presents two parallel sorting algorithms, SHARED-MEMORY-BS and REDUCED-BS, which are implemented on shared-memory parallel computers. These algorithms sort N keys in $O(\log^2 N)$ time. REDUCED-BS uses a parity strategy that suggests an efficient use of the local memory associated with each processor. By taking advantage of this local memory, REDUCED-BS decreases communication to approximately half that of SHARED-MEMORY-BS. Because of the reduced communication, REDUCED-BS achieves a significant performance improvement.
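
For readers unfamiliar with bitonic sorting, the sketch below runs the classic bitonic sorting network sequentially in Python. It is not SHARED-MEMORY-BS or REDUCED-BS, whose details the abstract does not give; the function name and the returned stage counter are assumptions added for illustration. Each stage consists of independent compare-exchange operations, so on a parallel machine with enough processors the number of stages corresponds to the $O(\log^2 N)$ time bound quoted above.

```python
def bitonic_sort(keys):
    """Sort a power-of-two-length sequence with a bitonic sorting network.

    Returns the sorted list and the number of compare-exchange stages.
    All comparisons within one stage are independent of each other.
    """
    a = list(keys)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    stages = 0
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            stages += 1
            j //= 2
        k *= 2
    return a, stages
```

For N keys the stage count is log2(N) * (log2(N) + 1) / 2, for example 6 stages for 8 keys.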

GPU Memory Management Technique to Improve the Performance of GPGPU Task of Virtual Machines in RPC-Based GPU Virtualization Environments (RPC 기반 GPU 가상화 환경에서 가상머신의 GPGPU 작업 성능 향상을 위한 GPU 메모리 관리 기법)

  • Kang, Jihun
    • KIPS Transactions on Computer and Communication Systems / v.10 no.5 / pp.123-136 / 2021
  • RPC (Remote Procedure Call)-based Graphics Processing Unit (GPU) virtualization is one of the technologies for sharing GPUs among multiple user virtual machines. However, in a cloud environment, unlike CPU or memory, general GPUs do not provide a resource isolation technology that can limit the resource usage of virtual machines. In particular, in an RPC-based virtualization environment, the GPU tasks executed in each virtual machine run as multiple processes, so the lack of resource isolation causes performance degradation due to resource competition. In addition, GPU memory competition accelerates the performance degradation as the resource demand of the virtual machines increases, and fairness decreases because equal performance between virtual machines cannot be guaranteed. For the RPC-based GPU virtualization environment, this paper analyzes the performance degradation caused by resource contention when the GPU memory requirement of the virtual machines exceeds the available GPU memory capacity, and proposes a GPU memory management technique to solve this problem. Experiments show that the proposed GPU memory management technique can improve the performance of GPGPU tasks.

Features of an Error Correction Memory to Enhance Technical Texts Authoring in LELIE

  • SAINT-DIZIER, Patrick
    • International Journal of Knowledge Content Development & Technology / v.5 no.2 / pp.75-101 / 2015
  • In this paper, we investigate the notion of error correction memory applied to technical texts. The main purpose is to introduce flexibility and context sensitivity in the detection and correction of errors related to Constrained Natural Language (CNL) principles. This is realized by enhancing error detection paired with relatively generic correction patterns and contextual correction recommendations. Patterns are induced from previous corrections made by technical writers for a given type of text. The impact of such an error correction memory is also investigated from the point of view of the technical writer's cognitive activity. The notion of error correction memory is developed within the framework of the LELIE project. An experiment is carried out on the case of fuzzy lexical items and negation, which are both major problems in technical writing. Language processing and knowledge representation aspects are developed together with evaluation directions.

A Method of Multi-processing of ACS and Survivor Path Metric Memory Management for TCM Decoder (TCM 복호기의 ACS 다중화 및 생존경로척도 기억장치 관리 방법)

  • 최시연;강병희;김진우;오길남;김덕현
    • Proceedings of the IEEK Conference / 2001.09a / pp.865-868 / 2001
  • TCM offers considerable coding gains without compromising bandwidth or signal power. However, a TCM decoder is more complex than a convolutional Viterbi decoder, because the number of branches increases exponentially with the constraint length and the number of input symbol bits. The parallelism of ACS and the memory management technique for SPMM are important factors for speed-up and hardware complexity. This paper proposes a multi-processing technique for ACS and a memory management technique for SPMM in TCM decoders.
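
To make the ACS (add-compare-select) operation concrete, the following Python sketch performs one generic Viterbi ACS step over a trellis. It is only an illustration of the textbook operation, not the multi-processing technique proposed in the paper; the function name, the data layout of `branches`, and the smaller-is-better metric convention are assumptions.

```python
def acs_step(path_metrics, branches):
    """One add-compare-select step of a Viterbi-style decoder.

    path_metrics[s] is the accumulated metric of the survivor ending in
    state s. branches[t] lists (previous_state, branch_metric) pairs for
    every branch entering next state t. Smaller metrics are better.
    """
    new_metrics = []
    survivors = []
    for candidates in branches:
        # Add: extend each incoming path by its branch metric.
        totals = [(path_metrics[prev] + bm, prev) for prev, bm in candidates]
        # Compare and select: keep the path with the smallest total.
        best_metric, best_prev = min(totals)
        new_metrics.append(best_metric)
        survivors.append(best_prev)
    return new_metrics, survivors
```

Each next state is processed independently of the others, which is what makes ACS a natural target for parallel hardware; the returned metrics are the values a survivor path metric memory would have to store between steps.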

Multiaccess Memory System supporting Local Buffer Memory System to Processing Elements (처리기에 지역 버퍼 메모리 시스템을 지원하는 다중접근기억장치)

  • Lee, Hyung
    • The Journal of the Korea Contents Association / v.12 no.1 / pp.30-37 / 2012
  • A memory system with a linear skewing scheme has been regarded as a suitable memory system for single instruction, multiple data (SIMD) architectures. The memory system supports simultaneous access to n data elements in m memory modules, with various access types and a constant interval, starting at an arbitrary position in a two-dimensional $M \times N$ data array. Although $m \times cells$ memory cells are physically required to support the logical two-dimensional $M \times N$ data array, at least $(m-n) \times cells$ memory cells remain unused, where cells is $(M-1)/q + (N-1)/p \times \lceil M/q \rceil + 1$. This paper proposes using $(n \times t) \times N/p$ of these unused memory cells, where $t > 0$, as local buffer memories for the n processing elements, while keeping the functionality the memory system already supports.
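
The idea of linear skewing can be sketched with a generic module assignment of the form (a*i + b*j) mod m. This is a textbook form chosen for illustration; the abstract does not state the exact assignment or address functions used in the paper, so the function names and the coefficients a and b below are assumptions.

```python
def module_of(i, j, a, b, m):
    """Memory module holding array element (i, j) under linear skewing."""
    return (a * i + b * j) % m


def is_conflict_free(cells, a, b, m):
    """True if all (i, j) positions in cells map to distinct modules."""
    modules = [module_of(i, j, a, b, m) for i, j in cells]
    return len(set(modules)) == len(modules)
```

With m = 5 modules, a = 2 and b = 1, four consecutive elements of a row, of a column, or of a 2 x 2 block all fall in different modules and can be fetched simultaneously. Without skewing (a = 0), every element of a column lands in the same module. Since only n = 4 of the m = 5 modules are used per access, one module is idle each time, which is the kind of spare capacity the abstract proposes to reuse.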

Performance Analysis of Interconnection Network for Multiprocessor Systems (다중프로세서 시스템을 위한 상호결합 네트워크의 성능 분석)

  • 김원섭;오재철
    • The Transactions of the Korean Institute of Electrical Engineers / v.37 no.9 / pp.663-670 / 1988
  • Advances in VLSI technology have made it possible to include a larger number of processing elements in a highly parallel processor system. A system with a large number of processing elements and memory requires a complex data path. Multistage interconnection networks (MINs) are useful for providing a programmable data path between processing elements and memory modules in a multiprocessor system. In this thesis, the performance of MINs for the star network is analyzed and compared with that of other networks, such as the generalized shuffle network, the delta network, and a reference crossbar network.

A Quantitative Analysis for An Efficient Memory Allocation (효과적인 메모리 할당을 위한 정량적 분석)

  • Hong, Yun-Shik
    • The Transactions of the Korea Information Processing Society / v.5 no.9 / pp.2395-2403 / 1998
  • The memory allocation problem has two independent goals: minimizing the number of memories and minimizing the number of registers in one memory. Our concern is the ordering of the bindings during memory allocation. We formulate and analyze three different memory allocation algorithms by changing their binding order. It is shown that when we combine these subtasks and solve them simultaneously using a heuristic cost function, significant savings (up to 20%) can be obtained in the total area of memories.

An evaluation of the effects of VDT tasks on multiple resources processing in working memory using MD, PD method (MD, PD법을 이용한 VDT 직무의 단기기억 다중자원처리에의 영향평가)

  • 윤철호;노병옥
    • Journal of the Ergonomics Society of Korea / v.16 no.1 / pp.85-96 / 1997
  • This article reviews the effects of VDT tasks on multiple resources for processing and storage in short-term working memory. The MD and PD methods were introduced to evaluate the modalities (auditory-visual) in the multiple resources model. The subjects conducted two sessions of 50-minute VDT tasks. Before, between, and after the VDT tasks, MD and PD task performance scores and CFF (critical flicker frequency) values were measured. The review suggested that the modalities of human information processing in working memory were affected by VDT tasks with different task contents.
