• Title/Summary/Keyword: Performance benchmark

Design of A Media Processor Equipped with Dual Cache (복수 캐시로 구성한 미디어 프로세서의 설계)

  • Moon, Hyun-Ju;Jeon, Joong-Nam;Kim, Suk-Il
    • Journal of KIISE:Computer Systems and Theory / v.29 no.10 / pp.573-581 / 2002
  • In this paper, we propose a media processor with a dual-cache architecture, composed of a multimedia data cache and a general-purpose data cache, to prevent performance degradation caused by memory delay. In the proposed processor architecture, multimedia data that are written in subword instructions are loaded into the multimedia data cache, and the remaining data are loaded into the general-purpose data cache. We also use a multi-block prefetching scheme that fetches two consecutive data blocks into a cache at a time to exploit the locality of multimedia data. Experimental results on MPEG and JPEG benchmark programs show that the proposed processor architecture performs better than a processor equipped with a single data cache.
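
The two-block prefetching idea in this abstract can be illustrated with a small sketch. It is not the authors' design: the fully associative LRU cache, the block size, the capacity, and the function name `count_misses` are all assumptions chosen for illustration. It only shows why fetching two consecutive blocks per miss helps on a sequential, streaming access pattern.

```python
from collections import OrderedDict


def count_misses(addresses, num_blocks=4, block_size=16, prefetch_blocks=1):
    """Count misses in a toy fully associative LRU cache (assumed model).

    On each miss, `prefetch_blocks` consecutive blocks are fetched,
    starting with the block that caused the miss.
    """
    cache = OrderedDict()  # block number -> True, ordered from LRU to MRU
    misses = 0
    for addr in addresses:
        block = addr // block_size
        if block in cache:
            cache.move_to_end(block)  # refresh LRU position on a hit
            continue
        misses += 1
        for b in range(block, block + prefetch_blocks):
            cache[b] = True
            cache.move_to_end(b)
            if len(cache) > num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return misses


# Sequential 4-byte accesses over 256 bytes, as in a streaming media kernel.
stream = list(range(0, 256, 4))
single_block_misses = count_misses(stream, prefetch_blocks=1)
two_block_misses = count_misses(stream, prefetch_blocks=2)
print(single_block_misses, two_block_misses)
```

Under these assumptions, the sequential stream touches 16 blocks, so fetching one block per miss gives 16 misses while fetching two gives 8. Real multimedia workloads and the paper's actual cache organization may behave differently.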

A Study about Performance Evaluation of Various NoSQL Databases (다양한 NoSQL 데이터베이스의 성능 평가 연구)

  • Park, Hong-Jin
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.9 no.3 / pp.298-305 / 2016
  • NoSQL databases are better suited to processing large volumes of big data than existing relational databases such as MySQL, PostgreSQL and Oracle. Among widely used NoSQL databases, the performance of HBase, Cassandra, MongoDB and Redis was comparatively assessed. For distributed processing of a large amount of data, 12 servers were connected through a switching hub, and Ubuntu was installed as the operating system. YCSB was used as the benchmark tool. The read and update ratios were varied from 50% and 50%, to 95% and 5%, and finally to 100% and 0%, and for each case the number of commands was increased from 200,000 to 1,200,000. Cassandra performed best in transactions processed per second, while MongoDB performed best in the number of processes carried out per unit time.
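
The three read/update mixes described in this abstract can be sketched as a simple operation generator. This is not YCSB's API or configuration format; the names `MIXES` and `generate_ops`, the fixed seed, and the sample size are hypothetical and only illustrate how a workload with a given read ratio might be drawn.

```python
import random
from collections import Counter

# Read fractions for the three mixes named in the abstract (read/update).
MIXES = {"50/50": 0.50, "95/5": 0.95, "100/0": 1.00}


def generate_ops(read_ratio, count, seed=0):
    """Return a list of 'read'/'update' labels with the given read fraction.

    Hypothetical helper: each operation is drawn independently, so the
    realized mix only approximates `read_ratio` for finite `count`.
    """
    rng = random.Random(seed)  # seeded for repeatable runs
    return ["read" if rng.random() < read_ratio else "update" for _ in range(count)]


for name, ratio in MIXES.items():
    ops = generate_ops(ratio, count=10_000)
    print(name, Counter(ops))
```

A driver built on this would issue each label as the corresponding database call and record throughput; the operation counts in the study (200,000 up to 1,200,000) would replace the small sample size used here.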

Design and Implementation of an SCI-Based Network Cache Coherent NUMA System for High-Performance PC Clustering (고성능 PC 클러스터 링을 위한 SCI 기반 Network Cache Coherent NUMA 시스템의 설계 및 구현)

  • Oh Soo-Cheol;Chung Sang-Hwa
    • Journal of KIISE:Computer Systems and Theory / v.31 no.12 / pp.716-725 / 2004
  • It is extremely important to minimize network access time when constructing a high-performance PC cluster system. For PC cluster systems, network access time can be reduced by maintaining a network cache in each cluster node. This paper presents a Network Cache Coherent NUMA (NCC-NUMA) system that utilizes a network cache by locating shared memory on the PCI bus, and describes the development of the NCC-NUMA card, which is the core module of the NCC-NUMA system. The NCC-NUMA card is plugged directly into the PCI slot of each node and contains shared memory, a network cache, a shared memory control module and a network control module. The network cache is maintained for the shared memory on the PCI bus of the cluster nodes. The coherency mechanism between the network cache and the shared memory is based on the IEEE SCI standard. In the SPLASH-2 benchmark experiments, the NCC-NUMA system showed improvements of 56% compared with an SCI-based cluster without a network cache.

Synthesis of Multi-level Reed Muller Circuits using BDDs (BDD를 이용한 다단계 리드뮬러회로의 합성)

  • Jang, Jun-Yeong;Lee, Gwi-Sang
    • The Transactions of the Korea Information Processing Society / v.3 no.3 / pp.640-654 / 1996
  • This paper presents a synthesis method for multi-level Reed-Muller circuits using BDDs (Binary Decision Diagrams). The existing synthesis tool for Reed-Muller circuits, FACTOR, is not appropriate for the synthesis of large circuits because it uses a matrix (map-type) representation of the given logic functions, which requires time and space exponential in the number of inputs to the circuit. To solve this problem, a synthesis method based on BDDs is presented. Using BDDs, logic functions are represented compactly. Therefore, the storage space and computing time for synthesizing logic functions are greatly decreased, and the technique can be easily applied to large circuits. Using BDD representations, the proposed method extracts the best patterns to minimize multi-level Reed-Muller circuits, with good performance in area optimization and testability. Experimental results using the proposed method show better performance than those using previous methods [2]. For large circuits, synthesis results were improved by considering the best input partition.

Container-based Cluster Management System for User-driven Distributed Computing (사용자 맞춤형 분산 컴퓨팅을 위한 컨테이너 기반 클러스터 관리 시스템)

  • Park, Ju-Won;Hahm, Jaegyoon
    • KIISE Transactions on Computing Practices / v.21 no.9 / pp.587-595 / 2015
  • Several fields of science have traditionally demanded large-scale workflow support, which requires thousands of central processing unit (CPU) cores. To support such large-scale scientific workflows, large-capacity cluster systems such as supercomputers are widely used. However, because users require a diversity of software packages and configurations, a system administrator has difficulty providing a service environment in real time. In this paper, we present a container-based cluster management platform and introduce an implementation case that minimizes performance reduction and dynamically provides the distributed computing environment desired by users. This paper offers the following contributions. First, a container-based virtualization technology is integrated with a resource and job management system to expand its applicability to large-scale scientific workflows. Second, an implementation case in which Docker and HTCondor are integrated is introduced. Lastly, performance comparisons between Docker and native execution are presented, using two widely known benchmark tools and a Monte Carlo simulation implemented in various programming languages.

A Software VIA based PC Cluster System on SCI Network (SCI 네트워크 상의 소프트웨어 VIA기반 PC글러스터 시스템)

  • Shin, Jeong-Hee;Chung, Sang-Hwa;Park, Se-Jin
    • Journal of KIISE:Computer Systems and Theory / v.29 no.4 / pp.192-200 / 2002
  • The performance of a PC cluster system is limited by the use of traditional communication protocols such as TCP/IP, because these protocols incur significant software overhead. To overcome this problem, systems based on user-level interfaces for message passing without kernel intervention have been developed. VIA (Virtual Interface Architecture) is one of the representative user-level interfaces and provides low latency and high bandwidth. In this paper, a VIA system is implemented on an SCI (Scalable Coherent Interface) network-based PC cluster. The system provides both message-passing and shared-memory programming environments and shows a maximum bandwidth of 84 MB/s and a latency of 8 μs. The system also shows better performance than other comparable computer systems when running parallel benchmark programs.

Analysis of Tensor Processing Unit and Simulation Using Python (텐서 처리부의 분석 및 파이썬을 이용한 모의실행)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.3 / pp.165-171 / 2019
  • Studies in computer architecture have shown that major improvements in price-to-energy performance stem from domain-specific hardware development. This paper analyzes the tensor processing unit (TPU) ASIC, which can accelerate the inference of artificial neural networks (NNs). The core devices of the TPU are a MAC matrix multiplier capable of high-speed operation and a software-managed on-chip memory. The execution model of the TPU can meet the response time requirements of artificial neural networks better than the existing CPU and GPU execution models, with a small area and low power consumption even though it has many MACs and a large memory. When the TPU is used with the TensorFlow benchmark framework, it can achieve higher performance and better power efficiency than a CPU or GPU. In this paper, we analyze the TPU, simulate OpenTPU, which is modeled in Python, and synthesize the matrix multiplication unit, which is the key hardware.

Implementation of High Performance TCP Proxy Logic against TCP Flooding Attack on Network Interface Card (TCP 플러딩 공격 방어를 위한 네트워크 인터페이스용 고성능 TCP 프락시 제어 로직 구현)

  • Kim, Byoung-Koo;Kim, Ik-Kyun;Kim, Dae-Won;Oh, Jin-Tae;Jang, Jong-Soo;Chung, Tai-Myoung
    • Journal of the Korea Institute of Information Security & Cryptology / v.21 no.2 / pp.119-129 / 2011
  • TCP-related flooding attacks still dominate Distributed Denial of Service (DDoS) attacks. It is a great challenge to accurately detect TCP flood attacks in high-speed networks. In this paper, we propose an implementation of the NIC_Cookie logic, a kind of security offload engine against TCP-related DDoS attacks, on a network interface card. NIC_Cookie is itself robust against DDoS attacks, and it is independent of the server OS and the external network configuration. It supports a packet-level response rather than an IP-based response method, and can therefore handle attacks from NAT-based user groups. We measure the latency of the NIC_Cookie logic to be $7 \times 10^{-6}$ seconds, and we show 2 Gbps wire-speed performance through a benchmark test.

A Study on the Types of Public Hospitals in the Region by Cluster Analysis (군집분석을 통한 지역거점공공병원의 유형화)

  • Seo, Ji-Woo;Sohn, Minsung;Choi, Mankyu
    • The Journal of the Korea Contents Association / v.21 no.8 / pp.329-336 / 2021
  • This study selected indicators that can represent the characteristics of general hospitals, including local medical centers and Red Cross hospitals, which are representative public health institutions, and performed a cluster analysis. We also present hospitals to benchmark in each cluster. According to the analysis, 276 general hospitals were classified into 13 clusters, and local medical centers and Red Cross hospitals were classified into clusters 1 through 7 of the 13 clusters because of their small size. The local medical centers and Red Cross hospitals selected as excellent hospitals in each cluster showed significant differences in management performance despite similar regional environments and medical performance, notably in surgical consultation and internal medical care rates and in inpatient and outpatient rates. For local medical centers and Red Cross hospitals to play their role as secondary acute hospitals in the region, inpatient care services and surgical functions must be activated.

Distortion-guided Module for Image Deblurring (왜곡 정보 모듈을 이용한 이미지 디블러 방법)

  • Kim, Jeonghwan;Kim, Wonjun
    • Journal of Broadcast Engineering / v.27 no.3 / pp.351-360 / 2022
  • Image blurring is a phenomenon caused by factors such as movement of the subject and camera shake. Recently, research on image deblurring based on convolutional neural networks has been actively conducted. In particular, methods that guide the restoration process using the difference between blurred and sharp images have shown promising performance. This paper proposes a novel method for improving deblurring performance based on distortion information. To this end, a transformer-based neural network module is designed to guide the restoration process. The proposed method efficiently reflects the distorted region, which is predicted through global inference, during the deblurring process. We demonstrate the efficiency and robustness of the proposed module through experimental results with various deblurring architectures and benchmark datasets.