• Title/Summary/Keyword: Memory contention


A Lock Mechanism for HiPi-bus Based Multiprocessor Systems (HiPi-bus 구조의 다중 프로세서 시스템에서의 잠금장치)

  • 윤용호;임인칠
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.2 / pp.33-43 / 1993
  • A lock mechanism is essential for synchronization in multiprocessor systems. It must minimize the time spent on lock operations when lock contention is low, and it must also handle the case of high lock contention. The conventional scheme, which controls locks in memory, increases bus traffic and memory utilization during lock operations. This paper proposes a lock scheme that stores the lock data in cache and manages it efficiently to reduce the time spent on lock operations when lock contention is low, on a multiprocessor system built on HiPi-bus (Highly Pipelined bus). This paper also presents the design of HiPi-CLOCK (Highly Pipelined bus Cache LOCK mechanism), which transfers the data from one cache to another when lock contention is high. A simulator was designed to compare the conventional scheme, which controls the lock in memory, with the proposed HiPi-CLOCK scheme in terms of RMW (Read-Modify-Write) operation time, using simulated traces. The results show that the proposed scheme performs more than twice as well as the conventional method under low lock contention. Under high lock contention, the performance of the proposed scheme increases as the number of shared lock data items increases.


DVFS based Memory-Contention Aware Scheduling Method for Multi-threaded Workloads (멀티쓰레드 워크로드를 위한 DVFS 기반 메모리 경합 인지 스케줄링 기법)

  • Nam, Yoonsung;Kang, Minkyu;Yeom, HeonYoung;Eom, Hyeonsang
    • KIISE Transactions on Computing Practices / v.24 no.1 / pp.10-16 / 2018
  • Consolidating server workloads is critical for reducing the operating costs of a datacenter. However, as more workloads are consolidated onto a single server, their performance may degrade because they contend for limited shared resources. Reducing this degradation requires scheduling that mitigates contention for shared resources. In this paper, we present a Dynamic Voltage Frequency Scaling (DVFS) based memory-contention aware scheduling method for multi-threaded workloads. The proposed method uses two approaches: running memory-intensive threads on a limited number of cores to avoid concurrent memory accesses, and reducing the frequencies of the cores that run memory-intensive threads. With the proposed algorithm, we increased performance by 43% and reduced power consumption by 38% compared to the Completely Fair Scheduler (CFS), the default scheduler of Linux.
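
The two approaches named in the abstract above (packing memory-intensive threads onto a few cores, then lowering the frequency of those cores) can be pictured with a small planning sketch. Everything here is an assumption made for illustration: the `plan_placement` function, the use of a per-thread miss rate with a fixed threshold to decide which threads are memory-intensive, and the round-robin assignment within each core group. The paper's actual classification and scheduling policy are not described in the abstract.

```python
def plan_placement(miss_rates, n_cores, mem_cores, threshold, f_low, f_high):
    """Return (thread -> core, core -> frequency) for a hypothetical machine.

    miss_rates: dict mapping thread name to an assumed memory-intensity
                metric (for example a cache miss rate); higher means more
                memory-intensive.
    mem_cores:  number of cores reserved for memory-intensive threads.
    """
    if not 0 < mem_cores < n_cores:
        raise ValueError("mem_cores must be between 1 and n_cores - 1")

    # Split threads by the assumed threshold on the metric.
    memory_bound = sorted(t for t, m in miss_rates.items() if m >= threshold)
    compute_bound = sorted(t for t, m in miss_rates.items() if m < threshold)

    mem_core_ids = list(range(mem_cores))
    cpu_core_ids = list(range(mem_cores, n_cores))

    # Confine memory-intensive threads to the reserved cores so that fewer
    # cores issue memory requests at the same time.
    placement = {}
    for k, thread in enumerate(memory_bound):
        placement[thread] = mem_core_ids[k % len(mem_core_ids)]
    for k, thread in enumerate(compute_bound):
        placement[thread] = cpu_core_ids[k % len(cpu_core_ids)]

    # Run the reserved cores at the lower frequency, the rest at the higher.
    freqs = {c: f_low for c in mem_core_ids}
    freqs.update({c: f_high for c in cpu_core_ids})
    return placement, freqs


placement, freqs = plan_placement(
    {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.05},
    n_cores=4, mem_cores=1, threshold=0.5, f_low=1.2, f_high=2.4,
)
```

In this example the two memory-intensive threads share core 0 at the lower frequency, while the other threads are spread over the remaining cores. A real scheduler would also have to apply the plan through operating-system facilities, which this sketch deliberately leaves out.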

Multi-Core Transactional Memory for High Contention Parallel Processing (집중 충돌 병렬 처리를 위한 효율적인 다중 코어 트랜잭셔널 메모리)

  • Kim, Seung-Hun;Kim, Sun-Woo;Ro, Won-Woo
    • Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.1 / pp.72-79 / 2011
  • The importance of parallel programming has grown markedly since modern microprocessor architecture shifted to multi-core systems. Transactional memory has been proposed to address synchronization, which is usually implemented using locks. However, lock-based synchronization reduces parallelism and can cause deadlock. In this paper, we propose an efficient method of using transactional memory in high-contention situations. The proposed idea is based on theoretical analysis and is verified with simulation results. The simulation environment was implemented using HTM (Hardware Transactional Memory) systems. We also propose a model of the dining philosophers problem to discuss efficient resource management using the transactional memory technique.

Design of Contention Free Parallel MAP Decode Module (메모리 경합이 없는 병렬 MAP 복호 모듈 설계)

  • Chung, Jae-Hun;Rim, Chong-Suck
    • Journal of the Institute of Electronics Engineers of Korea SD / v.48 no.1 / pp.39-49 / 2011
  • Turbo codes require a long decoding time because of iterative decoding. For high-speed communication, the decoding time must be shortened, which is possible with parallel processing. However, parallel processing can cause memory contention, which reduces decoder performance. To avoid memory contention, the QPP interleaver was proposed in 2006. In this paper, we propose an MDF method that suits the QPP interleaver and offers a relatively short decoding time and reduced logic. We also present the design of a MAP decode module using the MDF method. The designed decoder targets a Xilinx FPGA, and its maximum throughput is 80 Mbps.
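
As a rough illustration of why the QPP interleaver named in the abstract above avoids memory contention, the following Python sketch checks the contention-free property for one parameter set. It assumes the standard QPP form π(i) = (f1·i + f2·i²) mod K and uses K = 40, f1 = 3, f2 = 10 (the values for the smallest LTE block size) purely as an example. The function names and the bank model (bank = address // window size) are illustrative assumptions, not taken from the paper.

```python
def qpp_interleave(i, K, f1, f2):
    """Quadratic permutation polynomial: pi(i) = (f1*i + f2*i^2) mod K."""
    return (f1 * i + f2 * i * i) % K


def is_contention_free(K, f1, f2, M):
    """Check that M parallel decoders never hit the same memory bank.

    The block of K addresses is split into M windows of size W = K // M,
    and bank b is assumed to hold addresses [b*W, (b+1)*W). At step j,
    decoder t reads interleaved address pi(j + t*W). The access pattern
    is contention-free if, at every step, the M bank indices are distinct.
    """
    if K % M != 0:
        raise ValueError("M must divide K")
    W = K // M
    for j in range(W):
        banks = {qpp_interleave(j + t * W, K, f1, f2) // W for t in range(M)}
        if len(banks) != M:
            return False
    return True


K, F1, F2 = 40, 3, 10  # example parameters (assumed for illustration)
addresses = [qpp_interleave(i, K, F1, F2) for i in range(K)]
is_permutation = sorted(addresses) == list(range(K))
results = {M: is_contention_free(K, F1, F2, M) for M in (2, 4, 5, 8)}
```

For these parameters the check holds for each of the listed parallelism degrees, so every decoder reads from a different bank at every step and no extra buffering is needed to serialize accesses.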

A Design of Parallel Turbo Decoder based on Double Flow Method Using Even-Odd Cross Mapping (짝·홀 교차 사상을 이용한 Double Flow 기법 기반 병렬 터보 복호기 설계)

  • Jwa, Yu-Cheol;Rim, Chong-Suck
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.7 / pp.36-46 / 2017
  • The turbo code, an error correction code, needs a long decoding time because the same decoding process must be repeated several times to obtain good BER performance. Parallel processing can reduce the decoding time, but it may cause memory contention that requires additional buffers. QPP interleaving has been proposed to avoid this, but memory contention is still possible when a decoder is constructed using the so-called double flow technique. In this paper, we propose an even-odd cross mapping technique that avoids memory conflicts even when decoding with the double flow technique. This method uses the address generation characteristic of QPP interleaving and can be used to implement the interleaving circuit between the decoding blocks and the LLR memory blocks. Compared with a decoder using the conventional MDF technique, a decoder implemented with the double flow technique and the proposed method reduces the decoding time by up to 32% at the cost of an 8% increase in total area.

A photonic packet switching system with contention resolution capability (충돌제어 기능을 갖는 광 패킷 스위칭 시스템 연구)

  • 이기철;이성철;이성근;정지채;강철희;박진우
    • Journal of the Korean Institute of Telematics and Electronics D / v.34D no.8 / pp.52-61 / 1997
  • In this paper, a new architecture for an N*N optical packet switching system is proposed. It consists of an active-splitter type of packet router, a travelling type of optical buffer memory for packet contention resolution, and an electronic controller. The BER performance of the proposed switching system is analyzed with respect to channel crosstalk and amplified spontaneous emission noise from switching elements and optical amplifiers, respectively. The operational validity of the proposed switching system is also demonstrated experimentally by realizing a 2*2 optical packet switching system.


Distributed memory access architecture and control for fully disaggregated datacenter network

  • Kyeong-Eun Han;Ji Wook Youn;Jongtae Song;Dae-Ub Kim;Joon Ki Lee
    • ETRI Journal / v.44 no.6 / pp.1020-1033 / 2022
  • In this paper, we propose a novel disaggregated memory module (dMM) architecture and memory access control schemes to solve the collision and contention problems of memory disaggregation, reducing the average memory access time to less than 1 μs. In the schemes, the distributed scheduler in each dMM determines the order of memory read/write access based on delay-sensitive priority requests in the disaggregated memory access frame (dMAF). We used the memory-intensive first (MIF) algorithm and the priority-based MIF (p-MIF) algorithm, which prioritize delay-sensitive and/or memory-intensive (MI) traffic over CPU-intensive (CI) traffic. We evaluated the performance of the proposed schemes through simulation using OPNET and through hardware implementation. Our results showed that when the offered load was below 0.7 and the payload of dMAF was 256 bytes, the average round trip time (RTT) was the lowest, ~0.676 μs. The dMM scheduling algorithms, MIF and p-MIF, achieved a delay of less than 1 μs for all MI traffic with less than 10% transmission overhead.

Extended PCF(EPCF) Mechanism for Wireless LAN MAC (Wireless LAN MAC을 위한 Extended PCF(EPCF) 방법)

  • Lee, Ho-Seok;Suh, Byung-Suhl
    • Proceedings of the KIEE Conference / 2002.11c / pp.31-34 / 2002
  • There are two kinds of network architectures in IEEE 802.11 [1]: distributed (ad-hoc) and centralized (infrastructure) wireless networks. Centralized networks have an access point (base station) that can control the wireless medium access of stations in these networks. The 802.11 MAC protocol of an access point is the same as that of other stations in the contention period. This paper proposes a novel MAC protocol for an access point to solve these problems. This MAC protocol adds a new contention-free period called EPCF (Extended PCF) to drain the data accumulated in the queue of an access point. Simulation results show that the new protocol achieves better throughput than the 802.11 standard MAC with a smaller queue memory size requirement.


Size Reduction and Performance Analysis of the Bit-map Table Used in the Bus-based Shared Memory System (버스기반의 공유메모리 시스템에서 사용된 비트맵 테이블의 크기 축소와 성능 분석)

  • Woo, Jong-Jung;Lee, Ka-Young
    • The Transactions of the Korea Information Processing Society / v.5 no.1 / pp.24-32 / 1998
  • Bus contention in bus-based shared-memory multiprocessors limits their performance. In addition, in a split bus transaction environment, some memory requests may wait unnecessarily in the memory access buffer, which degrades system performance. This unnecessary waiting can be eliminated by maintaining a bitmap table that contains a status bit for each memory block. However, this mechanism requires a large SRAM for the status information, because the table is fully mapped to all memory blocks. To solve this problem, we propose a bitmap cache that exploits partial mapping and locality of reference. The simulation results show that the proposed system greatly reduces the SRAM capacity needed for the status information with little performance degradation.


Performance Analysis of Bus Arbitration Schemes for Multiple-bus Multiprocessor System (다중버스 다중프로세서 시스템을 위한 버스 중재 방식의 성능 분석)

  • 김종현
    • Journal of the Korea Society for Simulation / v.2 no.1 / pp.13-22 / 1993
  • In a multiple-bus multiprocessor system in which processors and memory modules are interconnected through system buses, the time delay due to bus contention degrades system performance. To reduce this problem, an optimal bus arbitration scheme and its hardware are necessary. In this study, the performance of four arbitration schemes is analyzed and compared: fixed-priority, equal-priority, rotating-priority, and round-robin priority schemes. For the study, a software simulator of a multiple-bus multiprocessor system was developed using SLAM II. Simulation results show that, when memory accesses are evenly distributed across all memory modules, the round-robin priority scheme provides the best performance. However, when a hot spot exists, the fixed-priority scheme results in the shortest access time.
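
To make the difference between two of the schemes named in the abstract above more concrete, here is a minimal Python sketch of a single-bus arbiter. It is only an assumed textbook formulation: `fixed_priority` always favors the lowest-numbered requester, and `round_robin` starts its search just after the most recently granted processor. The function names and signatures are hypothetical, and the paper's exact definitions (including how its rotating-priority and round-robin priority schemes differ) are not given in the abstract.

```python
def fixed_priority(requests):
    """Grant the bus to the lowest-numbered requesting processor.

    requests: list of booleans, one per processor (True = wants the bus).
    Returns the granted processor index, or None if nobody is requesting.
    """
    for i, wants_bus in enumerate(requests):
        if wants_bus:
            return i
    return None


def round_robin(requests, last_granted):
    """Grant the bus to the first requester after the last granted one.

    The search wraps around, so the processor that was served most
    recently is considered last on the next arbitration cycle.
    """
    n = len(requests)
    for step in range(1, n + 1):
        i = (last_granted + step) % n
        if requests[i]:
            return i
    return None


# With every processor requesting, fixed priority always picks processor 0,
# while round-robin cycles through all of them.
all_requesting = [True, True, True]
fixed_grants = [fixed_priority(all_requesting) for _ in range(3)]

rr_grants, last = [], 2
for _ in range(3):
    last = round_robin(all_requesting, last)
    rr_grants.append(last)
```

The example shows the fairness trade-off: under sustained load, the fixed-priority arbiter can starve higher-numbered processors, whereas the round-robin arbiter serves each requester in turn.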
