• Title/Summary/Keyword: Benchmarks


Energy-Efficient Last-Level Cache Management for PCM Memory Systems

  • Bahn, Hyokyung
    • International Journal of Internet, Broadcasting and Communication / v.14 no.1 / pp.188-193 / 2022
  • The energy efficiency of memory systems is an important consideration in designing future computer systems, as memory capacity continues to increase to accommodate growing big data workloads. In this article, we present an energy-efficient last-level cache management policy for future mobile systems. The proposed policy uses low-power PCM (phase-change memory) as the main memory medium and reduces the amount of data written to PCM, thereby saving memory energy consumption. To do so, the policy keeps track of the modified cache lines within each cache block and, upon a replacement request, evicts the last-level cache block that would incur the smallest PCM write. The policy also considers the access bit of cache blocks along with the cache line modifications so as not to degrade the cache hit ratio. Simulation experiments using SPEC benchmarks show that the proposed policy reduces the power consumption of PCM memory by 22.7% on average without degrading performance.
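The eviction rule the abstract describes — prefer blocks whose access bit is clear, and among those pick the block with the fewest modified lines to minimize PCM write-back — can be sketched as below. This is a minimal illustrative model; the class and field names are assumptions, not the authors' implementation.

```python
# Sketch of a write-aware last-level cache victim selection policy:
# evict the block that would cause the fewest PCM writes, while
# preferring not-recently-accessed blocks to protect the hit ratio.
from dataclasses import dataclass

@dataclass
class CacheBlock:
    tag: int
    dirty_lines: int   # number of modified cache lines in this block
    accessed: bool     # access (reference) bit

def select_victim(blocks):
    """Among blocks with a clear access bit (if any exist), pick the
    one with the fewest dirty lines; fall back to all blocks."""
    candidates = [b for b in blocks if not b.accessed] or blocks
    return min(candidates, key=lambda b: b.dirty_lines)

blocks = [
    CacheBlock(tag=0x1A, dirty_lines=5, accessed=True),
    CacheBlock(tag=0x2B, dirty_lines=3, accessed=False),
    CacheBlock(tag=0x3C, dirty_lines=0, accessed=True),   # clean but recently used
    CacheBlock(tag=0x4D, dirty_lines=7, accessed=False),
]
victim = select_victim(blocks)  # 0x2B: not accessed, fewest dirty lines
```

Note how block 0x3C, although it has no dirty lines, is protected because its access bit is set — this is the hit-ratio safeguard the abstract mentions.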

The method for protecting contents on a multimedia system

  • Kim, Seong-Ki
    • Journal of the Korea Society of Computer and Information / v.14 no.7 / pp.113-121 / 2009
  • As DRM is now being removed from many sites, content protection on a video server has become important. However, many protection methods have limitations of their own or go unused because they degrade streaming performance. This paper proposes a content protection method that uses the eCryptFS and SELinux together, and measures the performance of the proposed method using various benchmarks. The results verify that, although the method reduces performance in other respects, it does not significantly decrease streaming performance, so it can be used for content protection in a multimedia system.

Modern Face Recognition using New Masked Face Dataset Generated by Deep Learning

  • Pann, Vandet;Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.647-650 / 2021
  • The most powerful modern face recognition techniques use deep learning methods, which have delivered impressive performance. As the COVID-19 outbreak spread worldwide, people began wearing face masks to prevent transmission of the virus, which caused existing face recognition methods to fail to identify people. As a result, masked face recognition has become one of the most challenging problems in the face recognition domain. However, deep learning methods require numerous data samples, and few benchmark masked-face datasets are publicly available. In this work, we develop a new simulated masked face dataset that can be used for masked face recognition tasks. To evaluate the usability of the proposed dataset, we also retrained an ArcFace-based system, one of the most popular state-of-the-art face recognition methods, on it.

Optimizing SR-GAN for Resource-Efficient Single-Image Super-Resolution via Knowledge Distillation

  • Sajid Hussain;Jung-Hun Shin;Kum-Won Cho
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.479-481 / 2023
  • Generative Adversarial Networks (GANs) have enabled substantial improvements in single-image super-resolution (SR) by generating photo-realistic images. However, the high memory requirements of GAN-based SR models (mainly their generators) lead to reduced performance and increased energy consumption, making them difficult to deploy on resource-constrained devices. In this study, we propose an efficient, compressed architecture for the SR-GAN generator using the model compression technique knowledge distillation. Our approach transfers knowledge from a heavy network to a lightweight one, reducing the model's storage requirement by 58% while also improving its performance. Experimental results on various benchmarks indicate that our compressed model improves PSNR, SSIM, and overall image quality on x4 super-resolution tasks.
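The core of knowledge distillation as used here is a training objective that blends supervision from the ground truth with imitation of the heavy teacher's output. The following is a toy, pure-Python sketch of such an objective with scalar "pixels"; the blending weight `alpha` and function names are illustrative assumptions, not the paper's exact loss.

```python
# Toy knowledge-distillation objective: the lightweight student is
# penalized both for deviating from the ground truth and for deviating
# from the heavy teacher's output, blended by a weight alpha.
def distillation_loss(student_out, teacher_out, target, alpha=0.5):
    """alpha * MSE(student, target) + (1 - alpha) * MSE(student, teacher)."""
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return alpha * mse(student_out, target) + (1 - alpha) * mse(student_out, teacher_out)

# Two "pixels": student vs. teacher vs. ground-truth values.
loss = distillation_loss(student_out=[0.5, 0.7],
                         teacher_out=[0.6, 0.8],
                         target=[0.4, 0.9],
                         alpha=0.5)
```

In a real SR-GAN setting the same structure applies per image tensor, typically with perceptual or adversarial terms added; only the blended student/teacher supervision is shown here.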

Design and Implementation of a Massively Parallel Multithreaded Architecture: DAVRID

  • Sangho Ha;Junghwan Kim;Eunha Park;Yoonhee Hah;Sangyong Han;Daejoon Hwang;Heunghwan Kim;Seungho Cho
    • Journal of Electrical Engineering and Information Science / v.1 no.2 / pp.15-26 / 1996
  • MPAs (Massively Parallel Architectures) must address two fundamental issues for scalability: synchronization and communication latency. Dataflow architectures suffer from excessive synchronization overhead and inefficient execution of sequential programs, even though they can exploit the massive parallelism inherent in programs. In contrast, MPAs based on the von Neumann computational model may suffer from inefficient synchronization mechanisms and communication latency. DAVRID (DAtaflow/Von Neumann RISC hybrID) is a massively parallel multithreaded architecture that combines the advantages of the von Neumann and dataflow models. It offers good single-thread performance while tolerating synchronization and communication latency. In this paper, we describe the DAVRID architecture in detail and evaluate its performance through simulation runs over several benchmarks.


Large-scale 3D fast Fourier transform computation on a GPU

  • Jaehong Lee;Duksu Kim
    • ETRI Journal / v.45 no.6 / pp.1035-1045 / 2023
  • We propose a novel graphics processing unit (GPU) algorithm that can handle a large-scale 3D fast Fourier transform (i.e., 3D-FFT) problem whose data size is larger than the GPU's memory. A 1D FFT-based 3D-FFT computational approach is used to solve the limited device memory issue. Moreover, to reduce the communication overhead between the CPU and GPU, we propose a 3D data-transposition method that converts the target 1D vector into a contiguous memory layout and improves data transfer efficiency. The transposed data are communicated between the host and device memories efficiently through the pinned buffer and multiple streams. We apply our method to various large-scale benchmarks and compare its performance with the state-of-the-art multicore CPU FFT library (i.e., the fastest Fourier transform in the West [FFTW]) and a prior GPU-based 3D-FFT algorithm. Our method achieves up to 2.89 times the performance of FFTW, and the performance gap widens as the data size increases. The performance of the prior GPU algorithm degrades considerably on massive-scale problems, whereas our method's performance remains stable.
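The data-transposition idea can be illustrated with index arithmetic: in a flat row-major (x, y, z) layout, the elements of an x-line are strided, so transposing to a (y, z, x) layout makes each x-line contiguous and cheap to stream to the GPU for 1D FFTs. The sketch below uses toy dimensions and plain Python lists; it shows only the layout conversion, not the paper's pinned-buffer/multi-stream pipeline.

```python
# Transpose a flat row-major (x, y, z) array into row-major (y, z, x)
# order, so that elements along the x axis become contiguous.
def transpose_xyz_to_yzx(data, nx, ny, nz):
    out = [0.0] * (nx * ny * nz)
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                src = (x * ny + y) * nz + z   # index in (x, y, z) layout
                dst = (y * nz + z) * nx + x   # index in (y, z, x) layout
                out[dst] = data[src]
    return out

nx, ny, nz = 2, 2, 2
flat = list(range(nx * ny * nz))          # [0, 1, 2, 3, 4, 5, 6, 7]
t = transpose_xyz_to_yzx(flat, nx, ny, nz)  # -> [0, 4, 1, 5, 2, 6, 3, 7]
```

After the transpose, each consecutive pair (e.g., `[0, 4]`) is one x-line, so a batched 1D FFT can read it as a contiguous vector.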

Deep Reference-based Dynamic Scene Deblurring

  • Cunzhe Liu;Zhen Hua;Jinjiang Li
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.3 / pp.653-669 / 2024
  • Dynamic scene deblurring is a complex computer vision problem because it is difficult to model mathematically. In this paper, we present a novel approach to image deblurring that uses a sharp reference image to produce high-quality results with high-frequency detail. To better exploit the clear reference image, we develop an encoder-decoder network with two novel modules that guide the network toward better image restoration. The proposed Reference Extraction and Aggregation Module effectively establishes the correspondence between the blurry image and the reference image and extracts the most relevant features for blur removal, while the proposed Spatial Feature Fusion Module enables the encoder to perceive blur information at different spatial scales. Finally, the multi-scale feature maps from the encoder and the cascaded Reference Extraction and Aggregation Modules are integrated into the decoder for global fusion and representation. Extensive quantitative and qualitative experimental results on different benchmarks show the effectiveness of our proposed method.

Development of a new CVAP structural analysis methodology of APR1400 reactor internals using scaled model tests

  • Jongsung Moon;Inseong Jin;Doyoung Ko;Kyuhyung Kim
    • Nuclear Engineering and Technology / v.56 no.1 / pp.309-316 / 2024
  • The U.S. Nuclear Regulatory Commission (NRC) Regulatory Guide (RG) 1.20 provides guidance on the comprehensive vibration assessment program (CVAP) to be performed on reactor internals during preoperational and startup tests. The purpose of the program is to identify loads that could cause vibration in the reactor internals and to ensure that these vibrations do not affect their structural integrity. The structural vibration analysis program involves creating finite element analysis models of the reactor internals and calculating their structural responses when subjected to vibration loads. The appropriateness of the structural analysis methodology must be demonstrated through benchmarks or other reasonable means. Although existing structural analysis methodologies have been proven appropriate and are widely used, this paper presents the development of an improved structural analysis methodology for APR1400 reactor internals using scaled model tests.

Enhancing Automated Report Generation: Integrating Rivet and RAG with Advanced Retrieval Techniques

  • Doo-Il Kwak;Kwang-Young Park
    • Proceedings of the Korea Information Processing Society Conference / 2024.05a / pp.753-756 / 2024
  • This study integrates Rivet and Retrieval-Augmented Generation (RAG) technologies to enhance automated report generation, addressing the challenges of large-scale data management. We introduce novel algorithms, such as Dynamic Data Synchronization and Contextual Compression, which are expected to improve report generation speed by 40% and accuracy by 25%. The application, demonstrated through a model corporate entity, "Company L," shows how such integrations can enhance business intelligence. Planned empirical validations will use metrics such as precision, recall, and BLEU to substantiate the improvements, setting new benchmarks for the industry. This research highlights the potential of advanced technologies to transform corporate data processes.
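A RAG pipeline's defining step is retrieval: rank stored passages against the query and pass the top-k to the generator as context. The sketch below uses deliberately simple term-overlap scoring as a stand-in; it does not reproduce the paper's Rivet stack or its Dynamic Data Synchronization / Contextual Compression algorithms, and all names and data are illustrative.

```python
# Minimal retrieval step of a RAG pipeline: score passages by term
# overlap with the query and keep the top-k as generation context.
def retrieve(query, passages, k=2):
    q_terms = set(query.lower().split())
    def score(passage):
        return len(q_terms & set(passage.lower().split()))
    return sorted(passages, key=score, reverse=True)[:k]

passages = [
    "quarterly revenue grew across all regions",
    "the office cafeteria menu was updated",
    "revenue forecasts for the next quarter look strong",
]
context = retrieve("quarterly revenue report", passages, k=2)
```

Production systems replace the overlap score with dense embeddings and add re-ranking or compression of the retrieved context, but the retrieve-then-generate shape is the same.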

An Efficient mmWave MIMO Transmission with Hybrid Precoding

  • Ying Liu;Jinhong Bian;Yuanyuan Wang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.7 / pp.2010-2026 / 2024
  • This work investigates a hybrid precoding scheme for a millimeter-wave (mmWave) multi-user MIMO system. We study a sum-rate maximization scheme that jointly designs the digital precoder and the analog precoder. To handle the non-convex problem, a block coordinate descent (BCD) method is formulated in which the digital precoder is obtained via a bisection search and the analog precoder via penalty dual decomposition (PDD), applied alternately. We then extend the proposed algorithm to sub-connected schemes. In addition, the proposed algorithm has lower computational complexity than other benchmarks. Simulation results verify the performance of the proposed scheme and provide meaningful insights.
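The bisection step used for the digital precoder typically solves for a Lagrange multiplier that makes a monotonically decreasing power function meet the transmit-power budget. The sketch below shows that generic mechanism with a toy power function; the paper's actual precoder expression is not reproduced, and all names here are illustrative.

```python
# Generic bisection search for a multiplier mu such that a monotonically
# decreasing power function power(mu) meets the power budget.
def bisect_multiplier(power, budget, lo=0.0, hi=100.0, tol=1e-9):
    """Assumes power(lo) > budget > power(hi), with power decreasing in mu."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if power(mid) > budget:
            lo = mid   # power still above budget: increase the multiplier
        else:
            hi = mid   # power within budget: try a smaller multiplier
    return 0.5 * (lo + hi)

# Toy stand-in: transmit power 10 / (1 + mu) must meet a budget of 2,
# which is satisfied with equality at mu = 4.
mu = bisect_multiplier(lambda m: 10.0 / (1.0 + m), budget=2.0)
```

In the BCD loop, this bisection alternates with the PDD update of the analog precoder until the sum rate converges.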