• Title/Summary/Keyword: Memory processing

Search Result 2,096, Processing Time 0.031 seconds

Time-domain 3D Wave Propagation Modeling and Memory Management Using Graphics Processing Units (그래픽 프로세서를 이용한 시간 영역 3차원 파동 전파 모델링과 메모리 관리)

  • Kim, Ahreum;Ryu, Donghyun;Ha, Wansoo
    • Geophysics and Geophysical Exploration
    • /
    • v.19 no.3
    • /
    • pp.145-152
    • /
    • 2016
  • We used graphics processing units for an efficient time-domain 3D wave propagation modeling. Since graphics processing units are designed for massively parallel processes, we need to optimize the calculation and memory management to fully exploit graphics processing units. We focused on the memory management and examined the performance of programs with respect to the memory management methods. We also tested the effects of memory transfer on the performance of the program by varying the order of finite difference equation and the size of velocity models. The results show that the memory transfer takes a larger portion of the running time than that of the finite difference calculation in programs transferring whole 3D wavefield.

Development of Flash Memory Page Management Techniques

  • Kim, Jeong-Joon
    • Journal of Information Processing Systems
    • /
    • v.14 no.3
    • /
    • pp.631-644
    • /
    • 2018
  • Many studies on flash memory-based buffer replacement algorithms that consider the characteristics of flash memory have recently been developed. Conventional flash memory-based buffer replacement algorithms have the disadvantage that the operation speed slows down, because only the reference is checked when selecting a replacement target page and either the reference count is not considered, or when the reference time is considered, the elapsed time is considered. Therefore, this paper seeks to solve the problem of conventional flash memory-based buffer replacement algorithm by dividing pages into groups and considering the reference frequency and reference time when selecting the replacement target page. In addition, because flash memory has a limited lifespan, candidates for replacement pages are selected based on the number of deletions.

Understanding The Role of Smart Pointers in the Rust Memory (Rust 언어 메모리 안전 모델에서 스마트 포인터의 역할에 대한 연구)

  • Martin Kayondo;Inyoung Bang;Yunheung Paek
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.05a
    • /
    • pp.345-347
    • /
    • 2023
  • Rust has gained popularity as a memory safe systems programming language. At the center of its memory safety is a strict memory ownership model with stringent rules enforced by the compiler. This paper aims to shed light on this memory safety model and the role smart pointers play towards its success. We study specific smart pointers, their purposes and contribution to Rust's memory safety. We further explore weaknesses of these smart pointers and their APIs, and provide scenarios under which they may lead to memory vulnerabilities in Rust programs.

Algorithmic GPGPU Memory Optimization

  • Jang, Byunghyun;Choi, Minsu;Kim, Kyung Ki
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.14 no.4
    • /
    • pp.391-406
    • /
    • 2014
  • The performance of General-Purpose computation on Graphics Processing Units (GPGPU) is heavily dependent on the memory access behavior. This sensitivity is due to a combination of the underlying Massively Parallel Processing (MPP) execution model present on GPUs and the lack of architectural support to handle irregular memory access patterns. Application performance can be significantly improved by applying memory-access-pattern-aware optimizations that can exploit knowledge of the characteristics of each access pattern. In this paper, we present an algorithmic methodology to semi-automatically find the best mapping of memory accesses present in serial loop nest to underlying data-parallel architectures based on a comprehensive static memory access pattern analysis. To that end we present a simple, yet powerful, mathematical model that captures all memory access pattern information present in serial data-parallel loop nests. We then show how this model is used in practice to select the most appropriate memory space for data and to search for an appropriate thread mapping and work group size from a large design space. To evaluate the effectiveness of our methodology, we report on execution speedup using selected benchmark kernels that cover a wide range of memory access patterns commonly found in GPGPU workloads. Our experimental results are reported using the industry standard heterogeneous programming language, OpenCL, targeting the NVIDIA GT200 architecture.

Technology Trends in CXL Memory and Utilization Software (CXL 메모리 및 활용 소프트웨어 기술 동향 )

  • H.Y. Ahn;S.Y. Kim;Y.M. Park;W.J. Han
    • Electronics and Telecommunications Trends
    • /
    • v.39 no.1
    • /
    • pp.62-73
    • /
    • 2024
  • Artificial intelligence relies on data-driven analysis, and the data processing performance strongly depends on factors such as memory capacity, bandwidth, and latency. Fast and large-capacity memory can be achieved by composing numerous high-performance memory units connected via high-performance interconnects, such as Compute Express Link (CXL). CXL is designed to enable efficient communication between central processing units, memory, accelerators, storage, and other computing resources. By adopting CXL, a composable computing architecture can be implemented, enabling flexible server resource configuration using a pool of computing resources. Thus, manufacturers are actively developing hardware and software solutions to support CXL. We present a survey of the latest software for CXL memory utilization and the most recent CXL memory emulation software. The former supports efficient use of CXL memory, and the latter offers a development environment that allows developers to optimize their software for the hardware architecture before commercial release of CXL memory devices. Furthermore, we review key technologies for improving the performance of both the CXL memory pool and CXL-based composable computing architecture along with various use cases.

Enhancing the performance of taxi application based on in-memory data grid technology (In-memory data grid 기술을 활용한 택시 애플리케이션 성능 향상 기법 연구)

  • Choi, Chi-Hwan;Kim, Jin-Hyuk;Park, Min-Kyu;Kwon, Kaaen;Jung, Seung-Hyun;Nazareno, Franco;Cho, Wan-Sup
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.5
    • /
    • pp.1035-1045
    • /
    • 2015
  • Recent studies in Big Data Analysis are showing promising results, utilizing the main memory for rapid data processing. In-memory computing technology can be highly advantageous when used with high-performing servers having tens of gigabytes of RAM with multi-core processors. The constraint in network in these infrastructure can be lessen by combining in-memory technology with distributed parallel processing. This paper discusses the research in the aforementioned concept applying to a test taxi hailing application without disregard to its underlying RDBMS structure. The application of IMDG technology in the application's backend API without restructuring the database schema yields 6 to 9 times increase in performance in data processing and throughput. Specifically, the change in throughput is very small even with increase in data load processing.

Design of the new parallel processing architecture for commercial applications (상용 응용을 위한 병렬처리 구조 설계)

  • 한우종;윤석한;임기욱
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.33B no.5
    • /
    • pp.41-51
    • /
    • 1996
  • In this paper, anew parallel processing system based on a cluster architecture which provides scalability of a parallel processing system while maintains shared memory multiprocessor characteristics is proposed. In recent days low cost, high performnce microprocessors have led to construction of large scale parallel processing systems. Such parallel processing systems provides large scalability but are mainly used for scientific applications which have large data parallelism. A shared memory multiprocessor system like TICOM is currently used as aserver for the commercial application, however, the shared memory multiprocessor system is known to have very limited scalability. The proposed architecture can support scalability and performance of the parallel processing system while it provides adaptability for the commerical application, hence it can overcome the limitation of the shared memory multiprocessor. The architecture and characteristics of the proposed system shall be described. A proprietary hierarchical crsossbar network is designed for this system, of which the protocol, routing and switching technique and the signal transfer technique are optimized for the proposed architecture. The design trade-offs for the network are described in this paper and with simulation usihng the SES/workbench, it is explored that the network fits to the proposed architecture.

  • PDF

Proposal of 3D Graphic Processor Using Multi-Access Memory System (Multi-Access Memory System을 이용한 3D 그래픽 프로세서 제안)

  • Lee, S-Ra-El;Kim, Jae-Hee;Ko, Kyung-Sik;Park, Jong-Won
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.19 no.4
    • /
    • pp.119-128
    • /
    • 2019
  • Due to the nature of the 3D graphics processor system, many mathematical calculations are required and parallel processing research using GPU (Graphics Processing Unit) is being performed for high-speed processing. In this paper, we propose a 3D graphics processor using MAMS, a parallel processor that does not use cache memory, to solve the GPU problem of increasing bandwidth caused by cache memory miss and the problem that 3D shader processing speed is not constant. The 3D graphics processor using MAMS proposed in this paper designed Vertex shader, Pixel shader, Tiling and Rasterizing structure using DirectX command analysis, the FPGA(Xilinx Virtex6@100MHz) board for MAMS was constructed and designed using Verilog. We compared the processing time of the developed FPGA (100Mhz) and nVidia GeForce GTX 660 (980Mhz), the processing time using GTX 660 was not constant and suing MAMS was constant.

High Quality Vertical Silicon Channel by Laser-Induced Epitaxial Growth for Nanoscale Memory Integration

  • Son, Yong-Hoon;Baik, Seung Jae;Kang, Myounggon;Hwang, Kihyun;Yoon, Euijoon
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.14 no.2
    • /
    • pp.169-174
    • /
    • 2014
  • As a versatile processing method for nanoscale memory integration, laser-induced epitaxial growth is proposed for the fabrication of vertical Si channel (VSC) transistor. The fabricated VSC transistor with 80 nm gate length and 130 nm pillar diameter exhibited field effect mobility of $300cm^2/Vs$, which guarantees "device quality". In addition, we have shown that this VSC transistor provides memory operations with a memory window of 700 mV, and moreover, the memory window further increases by employing charge trap dielectrics in our VSC transistor. Our proposed processing method and device structure would provide a promising route for the further scaling of state-of-the-art memory technology.

A Study on Software-based Memory Testing of Embedded System (임베디드 시스템의 소프트웨어 기반 메모리 테스팅에 관한 연구)

  • Roh, Myong-Ki;Kim, Sang-Il;Rhew, Sung-Yul
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2004.05a
    • /
    • pp.309-312
    • /
    • 2004
  • 임베디드 시스템은 특별한 목적을 수행하기 위해 컴퓨터 하드웨어와 소프트웨어를 결합시킨 것이다. 임베디드 시스템은 일반 데스크탑보다 작은 규모의 하드웨어에서 운영된다. 임베디드 시스템은 파워, 공간, 메모리 등의 여러 가지 환경적 요소에 제약을 받는다. 그리고 임베디드 시스템은 실시간으로 동작하기 때문에 임베디드 시스템에서 소프트웨어의 실패는 일반 데스크탑에서보다 훨씬 심각한 문제를 발생시킨다. 따라서 임베디드 시스템은 주어진 자원을 효율적으로 사용하여야 하고 임베디드 시스템의 실패율을 낮춰야만 한다. 치명적인 문제를 발생시킬 수 있는 임베디드 시스템의 실패의 원인 중 하나가 메모리에 관련한 문제이다. 임베디드 시스템 특정상 메모리 문제는 크게 하드웨어 기반의 메모리 문제와 소프트웨어 기반의 메모리 문제로 분류된다. 소프트웨어 기반의 메모리에 관련한 문제는 Memory Leak, Freeing Free Memory, Freeing Unallocated Memory, Memory Allocation Failed, Late Detect Array Bounds Write, Late Detect Freed Memory Write 등과 같은 것들이 있다. 본 논문에서는 임베디드 시스템의 메모리 관련에 대한 문제점을 파악하고 관련 툴을 연구하여 그 문제점들을 효율적으로 해결할 수 있는 기법을 점증적으로 연구하고자 한다.

  • PDF