• Title/Summary/Keyword: Shared Memory

Search Result 386, Processing Time 0.026 seconds

270 MHz Full HD H.264/AVC High Profile Encoder with Shared Multibank Memory-Based Fast Motion Estimation

  • Lee, Suk-Ho;Park, Seong-Mo;Park, Jong-Won
    • ETRI Journal
    • /
    • v.31 no.6
    • /
    • pp.784-794
    • /
    • 2009
  • We present a full HD (1080p) H.264/AVC High Profile hardware encoder based on fast motion estimation (ME). Most processing cycles are occupied with ME and use external memory access to fetch samples, which degrades the performance of the encoder. A novel approach to fast ME which uses shared multibank memory can solve these problems. The proposed pixel subsampling ME algorithm is suitable for fast motion vector searches for high-quality resolution images. The proposed algorithm achieves an 87.5% reduction of computational complexity compared with the full search algorithm in the JM reference software, while sustaining the video quality without any conspicuous PSNR loss. The usage amount of shared multibank memory between the coarse ME and fine ME blocks is 93.6%, which saves external memory access cycles and speeds up ME. It is feasible to perform the algorithm at a 270 MHz clock speed for 30 frame/s real-time full HD encoding. Its total gate count is 872k, and internal SRAM size is 41.8 kB.

A Remote Cache Coherence Protocol for Single Shared Memory in Multiprocessor System (단일 공유 메모리를 가지는 다중 프로세서 시스템의 원격 캐시 일관성 유지 프로토콜)

  • Kim, Seong-Woon;Kim, Bo-Gwan
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.42 no.6
    • /
    • pp.19-28
    • /
    • 2005
  • The multiprocessor architecture is a good method to improve the computer system performance. The CC-NUMA provides a single shared space with the physically distributed memories is used widely in the multiprocessor computer system. A CC-NUMA has the full-mapped directory for the shared memory md uses a remote cache memory for tile fast memory access. In this paper, we propose a processing node architecture for a CC-NUMA system and a cache coherency protocol on the physically distributed but logically shared system. We show an implementation result of the system which is adopted the cache coherency protocol.

Design and Implementation of an SCI-Based Network Cache Coherent NUMA System for High-Performance PC Clustering (고성능 PC 클러스터 링을 위한 SCI 기반 Network Cache Coherent NUMA 시스템의 설계 및 구현)

  • Oh Soo-Cheol;Chung Sang-Hwa
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.31 no.12
    • /
    • pp.716-725
    • /
    • 2004
  • It is extremely important to minimize network access time in constructing a high-performance PC cluster system. For PC cluster systems, it is possible to reduce network access time by maintaining network cache in each cluster node. This paper presents a Network Cache Coherent NUMA (NCC-NUMA) system to utilize network cache by locating shared memory on the PCI bus, and the NCC-NUMA card which is core module of the NCC-NUMA system is developed. The NCC-NUMA card is directly plugged into the PCI slot of each node, and contains shared memory, network cache, shared memory control module and network control module. The network cache is maintained for the shared memory on the PCI bus of cluster nodes. The coherency mechanism between the network cache and the shared memory is based on the IEEE SCI standard. According to the SPLASH-2 benchmark experiments, the NCC-NUMA system showed improvements of 56% compared with an SCI-based cluster without network cache.

Performance Analysis of Shared Stack Management for Sensor Operating Systems (센서 운영 체제를 위한 공유 스택 기법의 성능 분석)

  • Gu, Bon-Cheol;Heo, Jun-Young;Hong, Ji-Man;Cho, Yoo-Kun
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.35 no.1
    • /
    • pp.53-59
    • /
    • 2008
  • In spite of increasing complexity of wireless sensor network applications, most of the sensor node platforms still have severe resource constraints. Especially a small amount of memory and absence of a memory management unit (MMU) cause many problems in managing application thread stacks. Hence, a shared-stack was proposed, which allows several threads to share one single stack for minimizing the amount of memory wasted by fixed-size stacks. In this paper, we present the memory usage models for thread stacks by deriving the overflow probability of the fixed-size stack and the shared-stack and also show that the shared-stack is more reliable than the fixed-size stack.

Advanced Message Storing Method on mobilePOST SMSC (mobilePOST SMSC(Short Message Service Center)에서의 향상된 메시지 저장 기법)

  • Song, Byung-Kwen
    • Journal of the Korean Society for Railway
    • /
    • v.11 no.2
    • /
    • pp.126-138
    • /
    • 2008
  • This paper proposes the preservation method that can effectively process short messages at mobilePOST SMSC(Short Message Service Center) platform on CDMA(Code Division Multiple Access). There are three techniques for the preservation method. First one is shared memory technique where several processes within the system share same memory to process transmission of short messages with maximum performance. Second one is to back-up messages in shared memory to the file system to prevent lost during system initialization or other unstable period. Third technique is that when transmission of short message was completed, finished message is moved from the shared memory to relational database for accounting purposes.

Implementation and Performance Evaluation of Software Distributed Shared Memory for SMP Clusters (SMP 클러스터를 위한 소프트웨어 분산 공유메모리의 구현 및 성능 측정)

  • 이동현;이상권;박소연;맹승렬
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.30 no.7_8
    • /
    • pp.331-340
    • /
    • 2003
  • Low-cost commodity SMP(Symmetric Multiprocessor) is widely used as a node of cluster system. In this paper, we implement and evaluate the performance of SDSM system for SMP clusters. Our SDSM system provides HLRC(Home-based Lazy Release Consistency) memory consistency model. Our protocol utilize shared memory within same SMP node, so that page fetch and message passing through network can be reduced. It is implemented on 8 node of 2-way Pentium-III SMP interconnected with 100Mbps Fast Ethernet, and uses TCP/IP for transport/network layer protocol. The experiment with eight applications shows that our SMP protocol achieves maximum 33% speedup improvement and 13%-52% reduction of page fetch compared with uniprocessor protocol.

An Adaptive Prefetching Technique for Software Distributed Shared Memory Systems (소프트웨어 분산공유메모리시스템을 위한 적응적 선인출 기법)

  • Lee, Sang-Kwon;Yun, Hee-Chul;Lee, Joon-Won;Maeng, Seung-Ryoul
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.28 no.9
    • /
    • pp.461-468
    • /
    • 2001
  • Though shared virtual memory (SVM) system promise low cost solutions for high performance computing they suffer from long memory latencies. These latencies are usually caused by repetitive invalidations on shared data. Since shared data are accessed through synchronization and the patterns by which threads synchronizes are repetitive, a prefetching scheme bases on such repetitiveness would reduce memory latencies. Based on this observation, we propose a prefetching technique which predicts future access behavior by analyzing access history per synchronization variable. Our technique was evaluated on an 8-node SVM system using the SPLASH-2 benchmark. The results show the our technique could achieve 34%~45% reduction in memory access latencies.

  • PDF

A Study on Implementation of a VXIbus System Using Shared Memory Protocol (공유메모리 프로토콜을 이용한 VXIbus 시스템 구현에 관한 연구)

  • 노승환;강민호;김덕진
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.9
    • /
    • pp.1332-1347
    • /
    • 1993
  • Existing instruments are composed independently according to their function and user constructed instrumentation system with those instruments. But in the late 1980s VXI bus enables to construct instrumentation system with various modular type instruments. For an VXI bus system with the word serial protocol, an increase of data size can degrade the system performance. In this paper shared memory protocol is proposed to overcome performance degradation. The shared memory protocol is analyzed using the GSPN and compared with that of the word serial protocol. It is shown that the shared memory protocol has a better performance than the word serial protocol. The VXI bus message based-system with the proposed shared memory protocol is constructed and experimented with signal generating device and FFT analyzing device. Up to 80 KHz input signal the result of FFT analysis is accurate and that result is agree with that of conventional FFT analyzer. In signal generating experiment from 100 KHz to 1.1 GHz sine wave is generated.

  • PDF

Mutual exclusion of shared memory access in the simulation software of the midclass commuter (중형항공기 시뮬레이션 소프트웨어의 작업간 공유메모리 사용의 상호배제)

  • 이인석;이해창;이상혁
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 1996.10b
    • /
    • pp.207-209
    • /
    • 1996
  • The software of the midclass commuter flight simulation is running on multiprocessor/multitasking environments The software is consist of tasks which are periodically alive at a given interval. Each task communicates via shared memory. The data shared by tasks is divided by several block. Only one task, called producer, can produce data for a data block but several tasks, called consumers, can read data from the data block. Double buffer and conditional flag are used to implement a mutual exclusion which prevents the producer and consumers from accessing the same data block simultaneously.

  • PDF

A Multicast ATM Switch Architecture using Shared Bus and Shared Memory Switch (공유 버스와 공유 메모리 스위치를 이용한 멀티캐스트 ATM 스위치 구조)

  • 강행익;박영근
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.24 no.8B
    • /
    • pp.1401-1411
    • /
    • 1999
  • Due to the increase of multimedia services, multicasting is considered as important design factor for ATM switch. To resolve the traffic expansion problem that is occurred by multicast in multistage interconnection networks, this paper proposes the multicast switch using a high-speed bus and a shared memory switch. Since the proposed switch uses a high-speed time division bus as a connection medium and chooses a shared memory switch as a basic switch module, it provides good port scalability. The traffic arbitration scheme enables internal non-blocking. By simulation we proves a good performance in the data throughput and the cell delay.

  • PDF