• Title/Summary/Keyword: in-memory computing


An Augmented Memory System using Associated Words and Social Network Service (소셜네트워크 서비스와 연상단어를 활용한 증강기억 시스템)

  • Kim, Tai-Wan;Park, Bum-Jun;Park, Tae-Keun
    • Journal of Internet Computing and Services / v.11 no.6 / pp.41-50 / 2010
  • As time passes, most information escapes a person's memory, however hard he or she tries to retain it. When looking at an image, however, a person recollects forgotten memories and relates specific objects in the photo to associated words, which in turn trigger new memories. The recalled memories also bring back the emotions of that time. This paper therefore proposes an augmented memory system that assists recollection of a user's past memories by using images from social network services and a user dictionary of associated words. In the proposed system, when a user selects an object in an image, words associated with that object are provided to the user. When the user selects one of the associated words, the system offers a list of other images containing the object named by the selected word. Repeating this process can help the user recollect memories and the emotions attached to them. The proposed system is expected to be useful for revitalizing social network services.

Delay Operation Techniques for Efficient MR-Tree on Nand Flash Memory (낸드 플래시 메모리 상에서 효율적인 MR-트리 동작을 위한 지연 연산 기법)

  • Lee, Hyun-Seung;Song, Ha-Yoon;Kim, Kyung-Chang
    • Journal of KIISE:Computing Practices and Letters / v.14 no.8 / pp.758-762 / 2008
  • Embedded systems commonly use flash memory because of its non-volatility, low access time, and low power consumption. For multimedia database systems, the R-tree is an index structure well suited to multimedia access. The MR-tree, an improved version of the R-tree, has shown better performance than the R-tree in search, insert, and delete operations. Flash memory uses sectors and blocks as the units of read, write, and delete operations. In particular, deletion is performed in units of 512-byte blocks and takes a very long time, and block-unit read and write operations are known to match the caching behavior of the MR-tree. Our research optimizes MR-tree operations to work in units of flash memory blocks. This adjustment leads to better indexing performance for database accesses. With the MR-tree on 512 B block units, we achieved fast index search times thanks to the low height of the MR-tree, as well as faster index update times thanks to the good fit with flash memory blocks. The MR-tree with optimized operations is therefore a good candidate for a database index scheme on any system with flash memory.

DVFS based Memory-Contention Aware Scheduling Method for Multi-threaded Workloads (멀티쓰레드 워크로드를 위한 DVFS 기반 메모리 경합 인지 스케줄링 기법)

  • Nam, Yoonsung;Kang, Minkyu;Yeom, HeonYoung;Eom, Hyeonsang
    • KIISE Transactions on Computing Practices / v.24 no.1 / pp.10-16 / 2018
  • Consolidating server workloads is critical for reducing datacenter costs. However, as more workloads are consolidated onto a single server, their performance may degrade because they contend for limited shared resources. Reducing this degradation requires scheduling that mitigates contention for shared resources. In this paper, we present a Dynamic Voltage and Frequency Scaling (DVFS) based memory-contention-aware scheduling method for multi-threaded workloads. The proposed method uses two approaches: running memory-intensive threads on a limited set of cores to avoid concurrent memory accesses, and reducing the frequencies of the cores that run memory-intensive threads. With the proposed algorithm, we increased performance by 43% and reduced power consumption by 38% compared to the Completely Fair Scheduler (CFS), the default scheduler of Linux.
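
The two approaches named in this abstract can be illustrated with a small sketch. It is not the authors' scheduler: the `mem_intensity` metric, the threshold, the round-robin placement, and the two frequency levels are all assumptions made for illustration, and the function only computes a placement plan rather than touching real CPU affinity or DVFS interfaces.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    # Hypothetical metric, e.g. last-level cache misses per kilo-instruction.
    mem_intensity: float


def plan_schedule(threads, num_cores, mem_cores, threshold, high_freq, low_freq):
    """Return (placement, core_freq) for a contention-aware plan.

    Memory-intensive threads are packed onto the first `mem_cores` cores,
    and those cores are assigned the lower frequency. All other threads
    are spread over the remaining cores at the higher frequency.
    """
    if not 0 < mem_cores < num_cores:
        raise ValueError("mem_cores must be between 1 and num_cores - 1")

    placement = {}
    core_freq = {core: high_freq for core in range(num_cores)}

    mem_threads = [t for t in threads if t.mem_intensity >= threshold]
    cpu_threads = [t for t in threads if t.mem_intensity < threshold]

    # Approach 1: confine memory-intensive threads to a limited set of cores.
    for i, t in enumerate(mem_threads):
        core = i % mem_cores
        placement[t.name] = core
        # Approach 2: lower the frequency of cores running such threads.
        core_freq[core] = low_freq

    other_cores = list(range(mem_cores, num_cores))
    for i, t in enumerate(cpu_threads):
        placement[t.name] = other_cores[i % len(other_cores)]

    return placement, core_freq
```

Packing the memory-intensive threads serializes their memory accesses on a few cores, which is why lowering the frequency of those cores is expected to cost little performance while saving power.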

Design and Evaluation of a Fast Boot-up Technique for Flash Memory based Computer Systems (플래시메모리 기반 컴퓨터시스템을 위한 고속 부팅 기법의 설계 및 성능평가)

  • Yim, Keun-Soo;Kim, Ji-Hong;Koh, Kern
    • Journal of KIISE:Computer Systems and Theory / v.32 no.11_12 / pp.587-597 / 2005
  • Flash memory based embedded computing systems are becoming increasingly prevalent. These systems typically have to provide an instant start-up time. However, we observe that mounting a file system on flash memory takes 1 to 25 seconds, depending mainly on the flash capacity. Since flash chip capacity doubles every year, this mounting time will soon become the dominant cause of system start-up delay. Therefore, in this paper, we present instant mounting techniques for flash file systems that store the in-memory file system metadata to flash memory when unmounting the file system and quickly reload the stored metadata when mounting it. These metadata snapshot techniques are specifically developed for NOR- and NAND-type flash memories and overcome their physical constraints. The proposed techniques check the validity of the stored snapshot and use the proposed fast crash recovery techniques when the snapshot is invalid. Based on the experimental results, the proposed techniques can reduce the flash mounting time by about two orders of magnitude compared with the existing de facto standard flash file system, JFFS2.
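
The snapshot idea described in this abstract (save metadata at unmount, reload at mount, validate, fall back when invalid) can be sketched as follows. This is a simplified illustration, not the paper's on-flash format: the JSON encoding, the SHA-256 checksum, and the `full_scan` callback standing in for the paper's recovery techniques are all assumptions.

```python
import hashlib
import json


def write_snapshot(metadata: dict) -> bytes:
    """Serialize in-memory metadata at unmount time, prefixed by a checksum."""
    payload = json.dumps(metadata, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest().encode("ascii")
    return digest + b"\n" + payload


def load_snapshot(blob: bytes):
    """Return the stored metadata, or None if the snapshot fails validation."""
    digest, sep, payload = blob.partition(b"\n")
    if not sep:
        return None
    if hashlib.sha256(payload).hexdigest().encode("ascii") != digest:
        return None
    return json.loads(payload.decode("utf-8"))


def mount(blob, full_scan):
    """Mount from a valid snapshot if possible, otherwise rebuild metadata.

    `full_scan` is a hypothetical callback that rebuilds the metadata the
    slow way (a stand-in for the recovery path used on an invalid snapshot).
    """
    metadata = load_snapshot(blob) if blob else None
    if metadata is not None:
        return metadata, "snapshot"
    return full_scan(), "scan"
```

The fast path reads one contiguous blob instead of scanning the whole medium, which is where the mounting-time savings in such a scheme would come from.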

Memory Efficient Parallel Ray Casting Algorithm for Unstructured Grid Volume Rendering on Multi-core CPUs (비정렬 격자 볼륨 렌더링을 위한 다중코어 CPU기반 메모리 효율적 광선 투사 병렬 알고리즘)

  • Kim, Duksu
    • Journal of KIISE / v.43 no.3 / pp.304-313 / 2016
  • We present a novel memory-efficient parallel ray casting algorithm for unstructured grid volume rendering on multi-core CPUs. Our method is based on the Bunyk ray casting algorithm. To solve the high memory overhead problem of the Bunyk algorithm, we allocate a fixed-size local buffer for each thread; the local buffers hold information about recently visited faces. The stored information is either reused by other rays or replaced by another face's information. To improve the utilization of the local buffers, we propose an image-plane based ray grouping algorithm that gives ray groups high coherency. The ray groups are then distributed to computing threads, and each thread processes its groups independently. We also propose a novel hash function that uses face indices as keys to calculate the buffer index at which each face stores its information. To evaluate our method, we applied it to three unstructured grid datasets of different sizes and measured the performance. Our method requires just 6% of the memory space that the Bunyk algorithm uses for storing face information, yet it shows performance comparable to the Bunyk algorithm. In addition, our method achieves up to 22% higher performance on a large-scale unstructured grid dataset while using less memory than the Bunyk algorithm. These results show the robustness and efficiency of our method and demonstrate that it is suitable for volume rendering of large-scale unstructured grid datasets.
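
A fixed-size per-thread buffer indexed by a hash of the face index behaves like a direct-mapped cache, which the sketch below illustrates. The abstract does not give the paper's hash function, so the modulo hash here is only a placeholder, and the class name, the `compute` callback, and the hit/miss counters are assumptions added for illustration.

```python
class FaceBuffer:
    """Fixed-size, direct-mapped buffer of per-face information for one thread."""

    def __init__(self, size: int):
        if size <= 0:
            raise ValueError("size must be positive")
        self.size = size
        self.keys = [None] * size    # face index currently held in each slot
        self.values = [None] * size  # cached face information
        self.hits = 0
        self.misses = 0

    def slot(self, face_index: int) -> int:
        # Placeholder hash: the paper proposes its own function of the face index.
        return face_index % self.size

    def get(self, face_index: int, compute):
        """Return cached info for a face, computing and storing it on a miss."""
        s = self.slot(face_index)
        if self.keys[s] == face_index:
            self.hits += 1
            return self.values[s]
        # Miss: compute the information and overwrite whatever the slot held.
        self.misses += 1
        value = compute(face_index)
        self.keys[s] = face_index
        self.values[s] = value
        return value
```

Memory use is bounded by the buffer size rather than by the number of faces in the mesh, and coherent ray groups raise the chance that consecutive lookups hit the same slots.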

Analysis of Component Performance using Open Source for Guarantee SLA of Cloud Education System (클라우드 교육 시스템의 SLA 보장을 위한 오픈소스기반 요소 성능 분석)

  • Yoon, JunWeon;Song, Ui-Sung
    • Journal of Digital Contents Society / v.18 no.1 / pp.167-173 / 2017
  • As the use of cloud computing increases, virtualization technology has been combined with it and applied to a variety of requirements. Cloud computing has the advantage of providing computing resources flexibly and scalably, as users need them, and it is used in a variety of distributed computing settings. For this, it is especially important to ensure the stability of cloud computing. In this paper, we build a cloud testbed environment for an education system and measure a variety of components using open-source tools in order to ensure system performance. We extract the performance factors that may affect the virtualization environment, such as processor, memory, cache, and network, on both the host machine and the virtual machine. Using these results, we can clearly grasp the state of the system and quickly diagnose problems. Furthermore, the Service Level Agreement (SLA) of the cloud service can be guaranteed.

AN EFFICIENT INCOMPRESSIBLE FREE SURFACE FLOW SIMULATION USING GPU (GPU를 이용한 효율적인 비압축성 자유표면유동 해석)

  • Hong, H.E.;Ahn, H.T.;Myung, H.J.
    • Journal of computational fluids engineering / v.17 no.2 / pp.35-41 / 2012
  • This paper presents an incompressible Navier-Stokes (INS) solution algorithm for 2D free-surface flow problems on a Cartesian mesh, implemented to run on Graphics Processing Units (GPUs). The INS solver uses the variable arrangement on the Cartesian mesh and Finite Volume discretization along with the Constrained Interpolation Profile-Conservative Semi-Lagrangian (CIP-CSL) scheme. Solving the incompressible Navier-Stokes equations for free-surface flow takes a considerable amount of computation time and memory space, even on modern multi-core architectures based on Central Processing Units (CPUs). With recent developments in computer architecture, the scientific computing performance of GPUs now exceeds that of CPUs. This paper focuses on exploiting the high-performance computing capability of GPUs and presents an efficient solution algorithm for free-surface flow simulation. The performance of the GPU implementation with double-precision accuracy is compared to that of the CPU code using a representative free-surface flow problem, namely the dam-break problem.

Implementing Efficient Camera ISP Filters on GPGPUs Using OpenCL (GPGPU 기반의 효율적인 카메라 ISP 구현)

  • Park, Jongtae;Facchini, Beron;Hong, Jingun;Burgstaller, Bernd
    • Annual Conference of KIPS / 2010.11a / pp.1784-1787 / 2010
  • General Purpose Graphic Processing Unit (GPGPU) computing is a technique that utilizes the high-performance many-core processors of high-end graphic cards for general-purpose computations such as 3D graphics, video/image processing, computer vision, scientific computing, HPC and many more. GPGPUs offer a vast amount of raw computing power, but programming is extremely challenging because of hardware idiosyncrasies. The open computing language (OpenCL) has been proposed as a vendor-independent GPGPU programming interface. OpenCL is very close to the hardware and thus does little to increase GPGPU programmability. In this paper we present how a set of digital camera image signal processing (ISP) filters can be realized efficiently on GPGPUs using OpenCL. Although we found ISP filters to be memory-bound computations, our GPGPU implementations achieve speedups of up to a factor of 64.8 over their sequential counterparts. On GPGPUs, our proposed optimizations achieved speedups between 145% and 275% over their baseline GPGPU implementations. Our experiments have been conducted on a Geforce GTX 275; because of OpenCL we expect our optimizations to be applicable to other architectures as well.

A Comparative Study of Deep Learning Techniques for Alzheimer's disease Detection in Medical Radiography

  • Amal Alshahrani;Jenan Mustafa;Manar Almatrafi;Layan Albaqami;Raneem Aljabri;Shahad Almuntashri
    • International Journal of Computer Science & Network Security / v.24 no.5 / pp.53-63 / 2024
  • Alzheimer's disease is a brain disorder that worsens over time and affects millions of people around the world. It leads to a gradual deterioration in memory, thinking ability, and behavioral and social skills until the person loses the ability to adapt to society. Technological progress in medical imaging and the use of artificial intelligence have made it possible to detect Alzheimer's disease from medical images such as magnetic resonance imaging (MRI). Deep learning algorithms, especially convolutional neural networks (CNNs), have shown great success in analyzing medical images for disease diagnosis and classification. CNNs can recognize patterns and objects in images, which makes them well suited to this study. In this paper, we compare the performance of Alzheimer's disease detection using two deep learning methods: You Only Look Once (YOLO), a CNN-based object recognition algorithm, and VGG16 from the Visual Geometry Group, a deep convolutional neural network used primarily for image classification. We compare our results using these modern models instead of using a plain CNN as in previous research. The results showed different levels of accuracy for the various versions of YOLO and the VGG16 model. YOLO v5 reached 56.4% accuracy at 50 epochs and 61.5% accuracy at 100 epochs. YOLO v8, which is for classification, reached 84% accuracy overall at 100 epochs. YOLO v9, which is for object detection, reached an overall accuracy of 84.6%. The VGG16 model reached 99% accuracy for training after 25 epochs but only 78% accuracy for testing. Hence, the best model overall is YOLO v9, with the highest overall accuracy of 86.1%.

A Study of Multiple Dynamic Programming (Multiple dynamic programming에 관한 연구)

  • Young Moon Park
    • 전기의세계 / v.21 no.1 / pp.13-16 / 1972
  • Dynamic Programming is regarded as a very powerful tool for solving nonlinear optimization problems subject to a number of constraints on state and control variables, but it has the definite disadvantage of requiring much more computing time and memory space than other techniques. To eliminate these drawbacks, this paper suggests a new technique called Multiple Dynamic Programming. The underlying principle is the concept of multiple passes. Instead of forming fine lattices in the time-state plane, as conventional Dynamic Programming does, Multiple Dynamic Programming forms coarse lattices over the feasible domain of the time-state plane at the first pass and determines the optimal state trajectory by the usual method of Dynamic Programming. At the second pass, it forms finer lattices in the narrower domain bounded by the upper and lower edges adjacent to the lattice edges through which the first-pass optimal trajectory passes, and determines a more accurate optimal state trajectory. The third pass repeats the same process, and so on. The suggested technique ensures a remarkable reduction in computer memory space and computing time, and its applicability has been demonstrated by a case study on hydro-thermal power coordination in the Korean power system.
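
The multiple-pass idea in this abstract can be sketched as a coarse-to-fine search. The code below is an illustration under assumptions, not the paper's formulation: it treats a one-dimensional state, takes hypothetical `stage_cost(t, x)` and `move_cost(x_prev, x)` functions, uses the same number of lattice points per stage in every pass, and narrows each stage's window to one lattice step on either side of the previous pass's trajectory.

```python
def dp_pass(grids, stage_cost, move_cost):
    """Conventional DP over per-stage lists of candidate states."""
    best = [stage_cost(0, x) for x in grids[0]]
    back = []
    for t in range(1, len(grids)):
        prev = grids[t - 1]
        new_best, ptr = [], []
        for x in grids[t]:
            # Best predecessor for reaching state x at stage t.
            j = min(range(len(prev)), key=lambda k: best[k] + move_cost(prev[k], x))
            new_best.append(best[j] + move_cost(prev[j], x) + stage_cost(t, x))
            ptr.append(j)
        best = new_best
        back.append(ptr)
    # Trace the optimal trajectory back from the cheapest final state.
    i = min(range(len(best)), key=best.__getitem__)
    total = best[i]
    path = [grids[-1][i]]
    for t in range(len(grids) - 1, 0, -1):
        i = back[t - 1][i]
        path.append(grids[t - 1][i])
    path.reverse()
    return path, total


def multiple_dp(lo, hi, stages, points, passes, stage_cost, move_cost):
    """Repeat DP on successively narrower, finer lattices (points >= 2)."""
    windows = [(lo, hi)] * stages
    path, total = None, None
    for _ in range(passes):
        grids, steps = [], []
        for a, b in windows:
            step = (b - a) / (points - 1)
            grids.append([a + k * step for k in range(points)])
            steps.append(step)
        path, total = dp_pass(grids, stage_cost, move_cost)
        # Next pass: one lattice step around the current trajectory, clamped.
        windows = [(max(lo, x - s), min(hi, x + s)) for x, s in zip(path, steps)]
    return path, total
```

Each pass stores only `points` states per stage, so memory stays constant while the lattice spacing shrinks from pass to pass. A single fine lattice of the same final resolution would need far more points per stage.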
