• Title/Summary/Keyword: in-memory computing

A Connection of Information in the Ubiquitous Space (유비쿼터스 공간에서의 정보 연결)

  • Ko Sung-Bum
    • Journal of Internet Computing and Services, v.5 no.2, pp.1-15, 2004
  • The current Internet space is evolving into the so-called ubiquitous space. Unlike the Internet space, information in the ubiquitous space is distributed across places such as computer memory, the human brain, and physical machines. Hypertext, the information-connection model originally designed for the Internet space, is not well suited to the ubiquitous space. From this point of view, this paper proposes the CPM model, which is designed to encompass three computing mechanisms: analog computing, digital computing, and human computing. We show that the characteristics of the CPM model can serve the purpose of connecting information in the ubiquitous space.

  • PDF

GPU Memory Management Technique to Improve the Performance of GPGPU Task of Virtual Machines in RPC-Based GPU Virtualization Environments (RPC 기반 GPU 가상화 환경에서 가상머신의 GPGPU 작업 성능 향상을 위한 GPU 메모리 관리 기법)

  • Kang, Jihun
    • KIPS Transactions on Computer and Communication Systems, v.10 no.5, pp.123-136, 2021
  • RPC (Remote Procedure Call)-based graphics processing unit (GPU) virtualization is one of the technologies for sharing GPUs among multiple user virtual machines. However, in a cloud environment, general GPUs, unlike CPUs or memory, do not provide resource isolation that can limit the resource usage of virtual machines. In particular, in an RPC-based virtualization environment, the GPU tasks of different virtual machines execute as separate processes, so the lack of resource isolation causes performance degradation through resource contention. Moreover, GPU memory contention worsens as the resource demands of the virtual machines increase, and fairness suffers because equal performance between virtual machines cannot be guaranteed. This paper analyzes the performance degradation caused by resource contention in an RPC-based GPU virtualization environment when the GPU memory requirement of the virtual machines exceeds the available GPU memory capacity, and proposes a GPU memory management technique to solve this problem. Experiments show that the proposed technique improves the performance of GPGPU tasks.
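
The abstract above does not spell out the management policy itself. As a hedged illustration of the kind of bookkeeping such a scheme needs, the following Python sketch tracks per-VM GPU memory in an RPC-based sharing layer and spills the least-recently-active VM's memory to the host when demand exceeds device capacity; the class `GpuMemoryBroker`, its methods, and the eviction rule are hypothetical, not the paper's actual technique.

```python
class GpuMemoryBroker:
    """Hypothetical per-VM GPU memory accounting for an RPC-based GPU sharing layer.

    When the combined demand of all virtual machines exceeds physical GPU memory,
    allocations of the least-recently-active VM are spilled to host memory so that
    every VM keeps making progress (a sketch, not the paper's actual policy).
    """

    def __init__(self, gpu_capacity_mb: int):
        self.capacity = gpu_capacity_mb
        self.used = 0
        self.resident = {}      # vm_id -> MB currently resident on the GPU
        self.last_active = {}   # vm_id -> logical timestamp of last request
        self.clock = 0

    def request(self, vm_id: str, size_mb: int) -> str:
        """Handle an allocation request forwarded over RPC from a guest VM."""
        self.clock += 1
        self.last_active[vm_id] = self.clock

        # Evict memory of idle VMs until the request fits.
        while self.used + size_mb > self.capacity and self._evict_one(vm_id):
            pass

        if self.used + size_mb > self.capacity:
            return "defer"      # still no room: queue the GPGPU task

        self.resident[vm_id] = self.resident.get(vm_id, 0) + size_mb
        self.used += size_mb
        return "device"

    def _evict_one(self, requester: str) -> bool:
        """Spill the least-recently-active other VM's resident memory to host RAM."""
        victims = [v for v in self.resident if v != requester and self.resident[v] > 0]
        if not victims:
            return False
        victim = min(victims, key=lambda v: self.last_active.get(v, 0))
        self.used -= self.resident[victim]
        self.resident[victim] = 0   # assume its data is copied back to host memory
        return True


broker = GpuMemoryBroker(gpu_capacity_mb=8192)
print(broker.request("vm-1", 6000))   # device
print(broker.request("vm-2", 4000))   # device (vm-1's memory spilled to host)
```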

Quantitative Analyses of System Level Performance of Dynamic Memory Allocation In Embedded Systems (내장형 시스템 동적 메모리 할당 기법의 시스템 수준 성능에 관한 정량적 분석)

  • Park, Sang-Soo;Shin, Heon-Shik
    • Journal of KIISE: Computing Practices and Letters, v.11 no.6, pp.477-487, 2005
  • As embedded systems grow in size and complexity, the importance of dynamic memory allocation techniques has increased. The objective of this paper is to measure the performance of dynamic memory allocation while varying both hardware and software design parameters of embedded systems. Unlike previous performance evaluation studies, which assumed a single-threaded system with a single address space and no OS support, our study adopts a realistic environment in which the embedded system runs on Linux. The paper presents experimental performance analyses of dynamic memory allocation methods, investigating the effects of each software layer and of several hardware design parameters. Our quantitative results can help system designers build high-performance, low-power embedded systems.
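
The study above is about measurement rather than a specific algorithm. Purely as a methodological sketch (in Python for brevity, not the C-level allocators the paper actually measures), the snippet below times a mixed allocate/free pattern while sweeping one design parameter, mirroring the approach of varying parameters and comparing quantitative results; all sizes and iteration counts are arbitrary.

```python
import random
import time

def time_alloc_pattern(n_allocs: int, size_range, seed: int = 0) -> float:
    """Time a mixed allocate/free pattern.

    This only mimics the methodology (vary object sizes and lifetimes, measure
    wall-clock time); the paper measures C-level allocators on embedded Linux.
    """
    rng = random.Random(seed)
    live = []
    start = time.perf_counter()
    for _ in range(n_allocs):
        size = rng.randint(*size_range)
        live.append(bytearray(size))            # stand-in for malloc
        if rng.random() < 0.5:
            live.pop(rng.randrange(len(live)))  # stand-in for free of a random object
    return time.perf_counter() - start

# Sweep one design parameter (object size) the way the paper sweeps HW/SW parameters.
for size_range in [(16, 64), (256, 1024), (4096, 16384)]:
    elapsed = time_alloc_pattern(20000, size_range)
    print(f"sizes {size_range}: {elapsed * 1000:.1f} ms")
```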

Tabu Search Heuristics for Solving a Class of Clustering Problems (타부 탐색에 근거한 집락문제의 발견적 해법)

  • Jung, Joo-Sung;Yum, Bong-Jin
    • Journal of Korean Institute of Industrial Engineers, v.23 no.3, pp.451-467, 1997
  • Tabu search (TS) is a strategy that has been successfully applied to a number of complex combinatorial optimization problems. By guiding the search with flexible memory processes and accepting non-improving solutions at some iterations, TS alleviates the risk of becoming trapped at a local optimum (a simplified sketch of this memory mechanism follows after this entry). In this article, we propose TS-based heuristics for solving a class of clustering problems and compare their relative performance with that of the simulated annealing (SA) algorithm. Computational experiments show that the TS-based heuristic with a long-term memory offers a higher probability of finding a better solution, while the TS-based heuristic without a long-term memory performs best in terms of the combined measure of solution quality and computational effort.

  • PDF
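
Referenced from the tabu-search entry above: the following Python sketch shows the generic TS machinery the abstract describes, applied to a toy clustering objective: a short-term tabu list that forbids recently reversed moves, an aspiration rule, and a frequency-based long-term memory used as a diversification penalty. It is an illustration of the strategy under these assumptions, not the authors' heuristic.

```python
import random

def tabu_search_clustering(points, k, iters=200, tabu_tenure=10, lt_weight=0.01, seed=0):
    """Toy tabu search that assigns 2D points to k clusters, minimizing the
    within-cluster sum of squared distances. Short-term memory: undoing a recent
    move is tabu. Long-term memory: move frequencies add a diversification penalty.
    """
    rng = random.Random(seed)
    assign = [rng.randrange(k) for _ in points]

    def cost(a):
        total = 0.0
        for c in range(k):
            members = [p for p, ci in zip(points, a) if ci == c]
            if not members:
                continue
            cx = sum(p[0] for p in members) / len(members)
            cy = sum(p[1] for p in members) / len(members)
            total += sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in members)
        return total

    best, best_cost = assign[:], cost(assign)
    tabu = {}   # (point, cluster) -> iteration until which that move is forbidden
    freq = {}   # long-term memory: how often each move has been made

    for it in range(iters):
        candidates = []
        for i in range(len(points)):
            for c in range(k):
                if c == assign[i]:
                    continue
                trial = assign[:]
                trial[i] = c
                trial_cost = cost(trial)
                score = trial_cost + lt_weight * freq.get((i, c), 0)
                is_tabu = tabu.get((i, c), -1) >= it
                # Aspiration: a tabu move is allowed if it beats the best solution.
                if not is_tabu or trial_cost < best_cost:
                    candidates.append((score, trial_cost, i, c, trial))
        if not candidates:
            continue
        score, trial_cost, i, c, new_assign = min(candidates)
        tabu[(i, assign[i])] = it + tabu_tenure   # moving i back soon is tabu
        freq[(i, c)] = freq.get((i, c), 0) + 1
        assign = new_assign                       # may be a non-improving move
        if trial_cost < best_cost:
            best, best_cost = new_assign[:], trial_cost
    return best, best_cost

points = [(0, 0), (0, 1), (5, 5), (6, 5), (10, 0), (10, 1)]
labels, obj = tabu_search_clustering(points, k=3)
print(labels, round(obj, 2))
```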

Large-scale 3D fast Fourier transform computation on a GPU

  • Jaehong Lee;Duksu Kim
    • ETRI Journal, v.45 no.6, pp.1035-1045, 2023
  • We propose a novel graphics processing unit (GPU) algorithm that can handle a large-scale 3D fast Fourier transform (3D-FFT) problem whose data size is larger than the GPU's memory. A 1D-FFT-based 3D-FFT computational approach is used to work around the limited device memory. Moreover, to reduce the communication overhead between the CPU and GPU, we propose a 3D data-transposition method that converts the target 1D vectors into a contiguous memory layout and improves data transfer efficiency. The transposed data are communicated efficiently between the host and device memories through a pinned buffer and multiple streams. We apply our method to various large-scale benchmarks and compare its performance with the state-of-the-art multicore CPU FFT library (the fastest Fourier transform in the West, FFTW) and a prior GPU-based 3D-FFT algorithm. Our method achieves up to 2.89 times the performance of FFTW, and the performance gap widens as the data size increases. The performance of the prior GPU algorithm drops considerably on massive-scale problems, whereas our method's performance remains stable.
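
As a minimal NumPy illustration of the decomposition the paper builds on, the sketch below computes a 3D FFT as three batched 1D-FFT passes, transposing the data so that each pass reads a contiguous vector; the CUDA-specific parts of the paper (chunked host-device transfers, pinned buffers, multiple streams) are deliberately not modeled.

```python
import numpy as np

def fft3d_by_1d_passes(x: np.ndarray) -> np.ndarray:
    """Compute a 3D FFT as three batched 1D FFT passes with explicit transposes.

    Each pass first rearranges the data so that the transform axis is the last,
    contiguous one, mirroring the idea of transposing into a contiguous memory
    layout before transforming 1D vectors.
    """
    out = x.astype(np.complex128)
    for axis in (2, 1, 0):
        # Move the target axis last and make the buffer contiguous,
        # so every 1D FFT reads a contiguous vector.
        moved = np.ascontiguousarray(np.moveaxis(out, axis, -1))
        moved = np.fft.fft(moved, axis=-1)
        out = np.moveaxis(moved, -1, axis)
    return out

rng = np.random.default_rng(0)
a = rng.standard_normal((32, 32, 32))
assert np.allclose(fft3d_by_1d_passes(a), np.fft.fftn(a))
print("matches np.fft.fftn")
```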

A Study on Vulnerability Analysis and Memory Forensics of ESP32

  • Jiyeon Baek;Jiwon Jang;Seongmin Kim
    • Journal of Internet Computing and Services, v.25 no.3, pp.1-8, 2024
  • As the Internet of Things (IoT) has gained significant prominence in our daily lives, most IoT devices rely on over-the-air (OTA) technology to update firmware or software automatically over the network, relieving users of the burden of manual updates. Securing the OTA interface is therefore one of the main requirements for defending against potential threats. This paper presents a simulated attack scenario against the ESP32, a commodity system-on-a-chip used in drones, during its OTA update process. We demonstrate three types of attacks (Wi-Fi cracking, ARP spoofing, and TCP SYN flooding) that delay the OTA update procedure on an ESP32 drone. As this scenario shows, unpatched IoT devices are exposed to a variety of potential threats. Additionally, we examine the chip from a forensics perspective and acquire memory artifacts that indicate the SYN flooding attack.

Building a Dynamic Analyzer for CUDA based System.

  • SALAH T. ALSHAMMARI
    • International Journal of Computer Science & Network Security, v.23 no.8, pp.77-84, 2023
  • The use of GPUs in general-purpose computing is on the rise owing to their increasing programmability and performance. Tools such as NVIDIA's CUDA allow programmers to write algorithms in a C-like language for execution on the GPU. Unfortunately, parallel programs are prone to performance and correctness bugs, and CUDA tool support for finding them is still limited. A dynamic analyzer that detects performance and correctness bugs in CUDA programs therefore eases the development of sophisticated applications under modern computing requirements. Race-condition bugs affect program correctness, while shared-memory bank conflicts degrade overall performance. The proposed technique instruments programs so that the memory locations accessed by different threads can be tracked and the code checked for such bugs. The instrumented source code is executed directly in CUDA's device emulation mode, which reports all detected errors to the user. This degree of automation helps programmers resolve subtle bugs in highly complex programs, or in programs that cannot be analyzed manually.
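
The paper's analyzer instruments CUDA code and runs it under device emulation; the Python sketch below illustrates only the analysis step on a hypothetical per-thread access trace, flagging two of the bug classes mentioned: shared-memory data races and shared-memory bank conflicts. The trace format, the 32-bank/4-byte-word layout, and the absence of synchronization events are simplifying assumptions.

```python
from collections import defaultdict

# Hypothetical access trace: (thread_id, warp_id, op, shared_mem_address)
TRACE = [
    (0, 0, "write", 0x40), (1, 0, "read", 0x40),    # race: same address, one write
    (2, 0, "read", 0x84), (3, 0, "read", 0x84),     # reads only: no race
    (4, 0, "read", 0x100), (5, 0, "read", 0x180),   # different addresses, same bank
]

NUM_BANKS = 32
WORD_BYTES = 4   # illustrative: 32 banks of 4-byte words

def bank(addr: int) -> int:
    return (addr // WORD_BYTES) % NUM_BANKS

def find_races(trace):
    """Flag addresses touched by different threads where at least one access is a
    write (synchronization events are not modeled in this sketch)."""
    by_addr = defaultdict(list)
    for tid, _, op, addr in trace:
        by_addr[addr].append((tid, op))
    return [addr for addr, accs in by_addr.items()
            if len({t for t, _ in accs}) > 1 and any(op == "write" for _, op in accs)]

def find_bank_conflicts(trace):
    """Flag warps in which two threads hit the same bank at different addresses."""
    by_warp_bank = defaultdict(set)
    for _, wid, _, addr in trace:
        by_warp_bank[(wid, bank(addr))].add(addr)
    return [key for key, addrs in by_warp_bank.items() if len(addrs) > 1]

print("possible races at:", [hex(a) for a in find_races(TRACE)])
print("bank conflicts (warp, bank):", find_bank_conflicts(TRACE))
```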

UI for Supporting Old Age's Prospective Memory (노인의 미래기억을 보조하는 UI)

  • Yoon, Yong-Sik;Sohn, Young-Woo
    • Journal of the HCI Society of Korea, v.1 no.1, pp.89-95, 2006
  • Prospective memory is memory for activities to be performed in the future, such as remembering to purchase a piece of fruit on the way home or remembering to give someone a telephone message. Because of declining memory ability, the aged have difficulty remembering the tasks they intend to perform in the future. Employing survey and experimental methods, we identified UI requirements for enhancing prospective memory (PM) performance in the aged. The survey included subjective assessments of PM performance by the aged and their preferred usability components and PM-supporting systems in a ubiquitous computing environment. The experiment examined the effect of contextual cues on PM performance for the young and the aged. Practical implications of our results are discussed with respect to PM-supporting UI design requirements for the aged.

  • PDF

Efficient Hardware Transactional Memory Scheme for Processing Transactions in Multi-core In-Memory Environment (멀티코어 인메모리 환경에서 트랜잭션을 처리하기 위한 효율적인 HTM 기법)

  • Jang, Yeonwoo;Kang, Moonhwan;Yoon, Min;Chang, Jaewoo
    • KIISE Transactions on Computing Practices, v.23 no.8, pp.466-472, 2017
  • Hardware transactional memory (HTM) has greatly changed the parallel programming paradigm for transaction processing. Since Intel proposed Transactional Synchronization Extensions (TSX), a number of HTM-based studies have been conducted. However, existing schemes consider only a single cause of transaction aborts when predicting conflicts and apply one standardized TSX configuration to all workloads. To solve these problems, we propose an efficient hardware transactional memory scheme for processing transactions in a multi-core in-memory environment. First, the proposed scheme decides whether to use software transactional memory (STM) or serial execution as the fallback path of HTM, using a prediction matrix that collects information about previously executed transactions. Second, it processes transactions efficiently according to the characteristics of a given workload by providing a retry policy based on machine learning algorithms. Finally, in an experimental evaluation using the Stanford Transactional Applications for Multi-Processing (STAMP) benchmark, the proposed scheme shows 10-20% better performance than existing schemes.
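
A hedged reading of the control flow described above: a prediction matrix of abort causes per transaction type drives the choice between an STM and a serial fallback, and an abort-history-based retry budget stands in for the machine-learning-driven retry policy. The Python sketch below models only this decision logic; TSX itself and the actual learning algorithm are not represented.

```python
from collections import defaultdict

# Per-transaction-type history of why hardware transactions aborted.
# Rows: transaction type; columns: abort-cause counts (the "prediction matrix").
ABORT_CAUSES = ("conflict", "capacity", "other")
history = defaultdict(lambda: dict.fromkeys(ABORT_CAUSES, 0))

def record_abort(txn_type: str, cause: str) -> None:
    history[txn_type][cause] += 1

def choose_fallback(txn_type: str) -> str:
    """Pick the fallback path after HTM retries are exhausted.

    Heuristic sketch: capacity-dominated transaction types are too large for HTM
    and fall back to STM, while conflict-dominated ones serialize. This stands in
    for the paper's prediction-matrix decision.
    """
    row = history[txn_type]
    if sum(row.values()) == 0:
        return "stm"
    return "stm" if row["capacity"] >= row["conflict"] else "serial"

def retry_budget(txn_type: str, base: int = 3) -> int:
    """Workload-aware retry policy: give fewer HTM retries to transaction types
    that historically abort a lot (a stand-in for the ML-tuned policy)."""
    aborts = sum(history[txn_type].values())
    return max(1, base - aborts // 10)

# Example: a read-heavy transaction type that keeps hitting capacity aborts.
for _ in range(12):
    record_abort("scan", "capacity")
print(choose_fallback("scan"), retry_budget("scan"))   # stm 2
```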

An Efficient Index Buffer Management Scheme for a B+ tree on Flash Memory (플래시 메모리상에 B+트리를 위한 효율적인 색인 버퍼 관리 정책)

  • Lee, Hyun-Seob;Joo, Young-Do;Lee, Dong-Ho
    • The KIPS Transactions: Part D, v.14D no.7, pp.719-726, 2007
  • Recently, NAND flash memory has been used as a storage device in various mobile computing devices such as MP3 players, mobile phones, and laptops because of its shock resistance, low power consumption, and non-volatility. However, due to the very distinct characteristics of flash memory, disk-based systems and applications may suffer severe performance degradation when adopted directly on flash memory storage systems. In particular, when a B+-tree is constructed, record insertion, deletion, and index reorganization cause intensive overwrite operations, which can lead to severe performance degradation on NAND flash memory. In this paper, we propose an efficient index buffer management scheme, called IBSF, which eliminates redundant index units in the index buffer and thereby delays the point at which the buffer fills up. Consequently, IBSF significantly reduces the number of write operations to flash memory when constructing a B+-tree. Through various experiments, we also show that IBSF yields better performance on flash memory than the related technique BFTL.
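
As a hedged sketch of the buffering idea the abstract describes (not IBSF's actual data layout or policy), the Python snippet below accumulates B+-tree index units in a buffer, cancels or collapses redundant units for the same key, and writes to simulated flash only when the buffer fills, so fewer flash write operations are issued.

```python
FLASH_WRITES = 0   # counts simulated flash write operations

def flash_write(units):
    """Stand-in for committing a batch of buffered index units to NAND flash."""
    global FLASH_WRITES
    FLASH_WRITES += 1

class IndexBuffer:
    """Buffers B+-tree index units and drops redundant ones before flushing.

    An 'insert' followed by a 'delete' of the same key cancels out, and repeated
    units for the same key collapse to the latest one, so the buffer fills (and
    flash is written) less often. A sketch of the idea, not IBSF's exact policy.
    """

    def __init__(self, capacity: int = 8):
        self.capacity = capacity
        self.units = {}          # key -> ("insert" | "delete", payload)

    def add(self, op: str, key, payload=None):
        prev = self.units.get(key)
        if prev and prev[0] == "insert" and op == "delete":
            del self.units[key]              # insert+delete cancel: nothing to write
        else:
            self.units[key] = (op, payload)  # keep only the latest unit per key
        if len(self.units) >= self.capacity:
            self.flush()

    def flush(self):
        if self.units:
            flash_write(list(self.units.items()))
            self.units.clear()

buf = IndexBuffer(capacity=4)
for k in range(3):
    buf.add("insert", k, payload=f"rec{k}")
buf.add("delete", 1)          # cancels the buffered insert of key 1
buf.add("insert", 3, payload="rec3")
buf.flush()
print("flash writes:", FLASH_WRITES)   # 1 batched write instead of one per operation
```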