• Title/Abstract/Keywords: Shared Caches

Search results: 16 (processing time: 0.019 sec)

An Interference Matrix Based Approach to Bounding Worst-Case Inter-Thread Cache Interferences and WCET for Multi-Core Processors

  • Yan, Jun;Zhang, Wei
    • Journal of Computing Science and Engineering / Vol. 5, No. 2 / pp.131-140 / 2011
  • Different cores typically share the last-level cache in a multi-core processor, so threads running on different cores may interfere with each other. A multi-core worst-case execution time (WCET) analyzer must therefore be able to safely and accurately estimate the worst-case inter-thread cache interference, which current WCET analysis techniques, focused mainly on single-thread analysis, do not support. This paper presents a novel approach to analyzing worst-case cache interference and bounding the WCET for threads running on multi-core processors with shared L2 instruction caches. We propose an interference matrix to model inter-thread interference, on the basis of which the worst-case inter-thread cache interference can be calculated. Our experiments indicate that the proposed approach yields a worst-case overestimate of less than 1% for some benchmarks (e.g., fib-call) and 16.4% on average for threads running on a dual-core processor with a shared L2 cache, improving the accuracy of the WCET estimate by 20.0% on average compared to prior work. (A simplified sketch of a per-set interference bound follows this entry.)
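
Below is a minimal, self-contained sketch (in C, with made-up numbers) of how a per-set interference count could be combined with a thread's own access counts to bound extra shared-L2 misses. The matrix layout and the min-based combination rule are simplifying assumptions chosen for illustration, not the paper's exact formulation.

```c
/* Sketch: bounding worst-case inter-thread interference in a shared,
 * direct-mapped L2 instruction cache. Per set we record how many times
 * each thread touches that set; an access of the analyzed thread can be
 * turned into an extra miss at most once per interfering access that maps
 * to the same set. Simplified stand-in for the paper's interference matrix. */
#include <stdio.h>

#define L2_SETS 8               /* toy cache size for illustration */

static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }

/* interference[s][t] = number of accesses thread t makes to L2 set s */
unsigned worst_case_extra_misses(unsigned interference[L2_SETS][2],
                                 int analyzed, int interfering)
{
    unsigned extra = 0;
    for (int s = 0; s < L2_SETS; s++) {
        /* Each conflicting access of the other thread can evict at most
         * one block the analyzed thread would otherwise have reused. */
        extra += min_u(interference[s][analyzed],
                       interference[s][interfering]);
    }
    return extra;
}

int main(void)
{
    unsigned m[L2_SETS][2] = {
        {4, 0}, {2, 3}, {0, 5}, {7, 1},
        {1, 1}, {0, 0}, {6, 2}, {3, 3},
    };
    printf("worst-case extra L2 misses for thread 0: %u\n",
           worst_case_extra_misses(m, 0, 1));
    return 0;
}
```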

Buffer Invalidation Schemes for High Performance Transaction Processing in Shared Database Environment

  • 김신희;배정미;강병욱
    • 한국정보시스템학회지:정보시스템연구 / Vol. 6, No. 1 / pp.159-180 / 1997
  • A database sharing system (DBSS) is a system for high-performance transaction processing. In a DBSS, the processing nodes are locally coupled via a high-speed network and share a common database at the disk level. Each node has a local memory, a separate copy of the operating system, and a DBMS. To reduce the number of disk accesses, each node caches database pages in its local memory buffer. However, since a page may be cached simultaneously at multiple nodes, cache consistency must be ensured so that every node always accesses the latest version of each page. In this paper, we propose efficient buffer invalidation schemes for a DBSS in which the database is logically partitioned using primary copy authority to reduce locking overhead. The proposed schemes improve performance by reducing both the disk access overhead and the message overhead of maintaining cache consistency, and they perform well even when the database workload varies dynamically. (A simplified sketch of commit-time invalidation follows this entry.)

  • PDF
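
Below is a minimal sketch (in C, with hypothetical data structures) of the commit-time invalidation idea: the updating node broadcasts the IDs of modified pages and every other node marks its cached copy invalid, so the next access rereads the latest version from disk. This illustrates the general mechanism only, not the paper's specific primary-copy-based schemes.

```c
/* Sketch: buffer invalidation across database-sharing nodes. */
#include <stdbool.h>
#include <stdio.h>

#define NODES        3
#define BUFFER_SLOTS 4

typedef struct {
    int  page_id;   /* ID of the cached database page */
    bool valid;     /* false once the copy has been invalidated */
} BufferSlot;

static BufferSlot buffers[NODES][BUFFER_SLOTS];

/* Called at commit time by the updating node. */
void broadcast_invalidation(int updating_node, int page_id)
{
    for (int n = 0; n < NODES; n++) {
        if (n == updating_node)
            continue;
        for (int s = 0; s < BUFFER_SLOTS; s++) {
            if (buffers[n][s].page_id == page_id)
                buffers[n][s].valid = false;   /* stale copy dropped */
        }
    }
}

int main(void)
{
    /* Node 0 and node 1 both cache page 42. */
    buffers[0][0] = (BufferSlot){42, true};
    buffers[1][0] = (BufferSlot){42, true};

    /* Node 0 updates page 42 and commits. */
    broadcast_invalidation(0, 42);

    printf("node 1 copy of page 42 valid? %s\n",
           buffers[1][0].valid ? "yes" : "no");   /* prints "no" */
    return 0;
}
```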

Study of Cache Performance on GPGPU

  • Choi, Kyu Hyun;Kim, Seon Wook
    • IEIE Transactions on Smart Processing and Computing / Vol. 4, No. 2 / pp.78-82 / 2015
  • General-purpose graphics processing units (GPGPUs) provide tremendous computational and processing power. Despite its latency-hiding mechanism, a GPU architecture requires high memory bandwidth and low latency between the computational units and the memory system. For this reason, current GPU architectures have private L1 caches in each core and a shared L2 cache to increase performance by reducing memory latency. In some cases, however, this CPU-like cache design is not suitable for GPGPUs. In this paper, we analyze cache performance in detail with respect to GPGPU application characteristics, and we suggest technical alternatives for the GPGPU architecture as future work. (A simple average-memory-access-time example follows this entry.)
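
As a back-of-the-envelope illustration of why L1/L2 behaviour dominates GPGPU memory performance, the sketch below computes the average memory access time (AMAT) through a private L1 and a shared L2. All latencies and hit rates are assumed placeholder values, not measurements from the paper.

```c
/* Sketch: AMAT through a private L1 and a shared L2, with purely
 * illustrative latencies and hit rates. */
#include <stdio.h>

int main(void)
{
    double l1_latency   = 4.0;    /* cycles, assumed */
    double l2_latency   = 40.0;   /* cycles, assumed */
    double dram_latency = 300.0;  /* cycles, assumed */

    double l1_hit = 0.30;         /* GPGPU kernels often see low L1 hit rates */
    double l2_hit = 0.60;

    double amat = l1_latency
                + (1.0 - l1_hit) * (l2_latency
                + (1.0 - l2_hit) * dram_latency);

    printf("AMAT = %.1f cycles\n", amat);
    return 0;
}
```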

Architecture design and FPGA implementation of a system control unit for a multiprocessor chip

  • 박성모;정갑천
    • 전자공학회논문지C / Vol. 34C, No. 12 / pp.9-19 / 1997
  • This paper describes the design and FPGA implementation of a system control unit within a multiprocessor chip that can be used as a node processor in a massively parallel processing (MPP) system; the chip comprises four integer units, caches, memory management units, a bus unit, and the system control unit. The major functions of the system control unit are locking/unlocking of shared variables for protected access, synchronization of instruction execution among the four integer units, control of interrupts, generation and control of the processor status, etc. The system control unit was modeled at a high level in Verilog HDL, then simulated and verified in an environment that included a trap handler and an external interrupt controller. The functional blocks of the system control unit were converted into an RTL (register transfer level) model and synthesized with the Xilinx FPGA cell library using the Synopsys tool. After timing verification, the synthesized system control unit was implemented on a Xilinx FPGA chip (XC4025EPG299). (A software sketch of the lock/unlock semantics follows this entry.)

  • PDF
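
The lock/unlock function of the system control unit can be illustrated in software with C11 atomics, as in the sketch below; the real unit implements the equivalent test-and-set behaviour in hardware, so this is only an analogy, not the implemented design.

```c
/* Sketch: lock/unlock semantics for protected access to a shared variable,
 * expressed with C11 atomics. */
#include <stdatomic.h>
#include <stdio.h>

static atomic_flag shared_var_lock = ATOMIC_FLAG_INIT;

void lock_shared_variable(void)
{
    /* Spin until the flag is acquired (test-and-set). */
    while (atomic_flag_test_and_set_explicit(&shared_var_lock,
                                             memory_order_acquire))
        ;   /* busy-wait */
}

void unlock_shared_variable(void)
{
    atomic_flag_clear_explicit(&shared_var_lock, memory_order_release);
}

int main(void)
{
    lock_shared_variable();
    puts("shared variable locked: protected update can proceed");
    unlock_shared_variable();
    puts("shared variable unlocked");
    return 0;
}
```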

A Study on Direct Cache-to-Cache Transfer for Hybrid Cache Architecture to Reduce Write Operations

  • 최주희
    • 반도체디스플레이기술학회지 / Vol. 23, No. 1 / pp.65-70 / 2024
  • Direct cache-to-cache transfer has been studied to reduce the latency and bandwidth consumption associated with shared data in multiprocessor systems. Although these studies have produced meaningful results, they assume that the caches consist of SRAM. If the system instead employs non-volatile memory, one of the most important considerations is reducing the number of write operations. This paper proposes a hybrid write-avoidance cache coherence protocol that takes the hybrid cache architecture into account. A new state is added to finely control what is stored in the non-volatile memory area, and experimental results show that the number of writes is reduced by about 36% compared to existing schemes. (A simplified sketch of the write-avoidance idea follows this entry.)

  • PDF
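
The sketch below illustrates, with hypothetical state names, the intuition behind a write-avoidance state in a hybrid SRAM/NVM cache: a line known to have a remote SRAM copy is served by direct cache-to-cache transfer instead of being filled into the local NVM region, saving an NVM write. The states and policy here are assumptions for illustration, not the paper's exact protocol.

```c
/* Sketch: avoiding NVM fills by preferring cache-to-cache transfers. */
#include <stdio.h>

typedef enum { INVALID, SHARED, MODIFIED, REMOTE_SRAM } LineState;

static unsigned nvm_writes = 0;

/* Decides where a requested line is served from on a local read miss. */
const char *handle_read_miss(LineState remote_state)
{
    if (remote_state == REMOTE_SRAM || remote_state == MODIFIED) {
        /* Direct cache-to-cache transfer: data comes from the remote
         * SRAM copy and is NOT installed in the local NVM region. */
        return "cache-to-cache (NVM write avoided)";
    }
    /* Otherwise the line is fetched from memory and filled into NVM. */
    nvm_writes++;
    return "memory fill into NVM";
}

int main(void)
{
    printf("%s\n", handle_read_miss(REMOTE_SRAM));
    printf("%s\n", handle_read_miss(INVALID));
    printf("NVM writes so far: %u\n", nvm_writes);
    return 0;
}
```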

An Optimized Cache Coherence Protocol in Multiprocessor System Connected by Slotted Ring

  • 민준식;장태무
    • 한국정보처리학회논문지 / Vol. 7, No. 12 / pp.3964-3975 / 2000
  • In a multiprocessor system, the policies for maintaining consistency among the processor caches are write-invalidate and write-update. Under write-invalidate, whenever a processor attempts to write to a cache block, all copies of that block stored in other caches are invalidated; because of this frequent invalidation, write-invalidate suffers from a low cache hit ratio. Write-update, on the other hand, updates the copies of the block simultaneously instead of invalidating them, but the updated contents must be transmitted over the interconnection network regardless of whether the block is shared, which floods the network with traffic. This paper proposes an efficient cache coherence protocol for a shared-memory multiprocessor system connected by a slotted ring. The proposed protocol is based on write-update but transmits the updated contents only when a shared block is updated; if the updated block is not shared, nothing is transmitted. We analyze the proposed protocol and compare its performance with existing protocols through simulation. (A simplified sketch of this update rule follows this entry.)

  • PDF
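
The sketch below captures the core decision of the proposed protocol as described in the abstract: an update message is placed on the slotted ring only for blocks that are actually shared, while private blocks are updated locally with no ring traffic. The data structures are illustrative only.

```c
/* Sketch: write-update restricted to shared blocks. */
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    int  value;
    bool shared;      /* set when another cache holds a copy */
} CacheBlock;

static unsigned ring_messages = 0;

void write_block(CacheBlock *b, int new_value)
{
    b->value = new_value;          /* local copy is always updated */
    if (b->shared) {
        /* Only shared blocks push their new value onto the ring so that
         * the other copies are updated in place (no invalidation). */
        ring_messages++;
    }
}

int main(void)
{
    CacheBlock private_blk = { .value = 0, .shared = false };
    CacheBlock shared_blk  = { .value = 0, .shared = true  };

    write_block(&private_blk, 1);  /* no ring traffic */
    write_block(&shared_blk,  2);  /* one update message on the ring */

    printf("ring update messages: %u\n", ring_messages);
    return 0;
}
```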