Search | Korea Science

Remote Cache Replacement Policy using Processor Locality in Multi-Processor System (다중 프로세서 시스템에서 프로세서 지역성을 이용한 원격 캐쉬 교체 정책)

Han Sang Yoon;Kwak Jong Wook;Jhang Seong Tae;Jhon Chu Shik
- Journal of KIISE:Computer Systems and Theory / v.32 no.11_12 / pp.541-556 / 2005
Memory access latency has been a primary cause of performance degradation in both single-processor and multi-processor systems. In distributed shared-memory systems in particular, remote memory accesses incur far more overhead than local memory accesses. To address this problem, multi-level cache architectures that include a remote cache have been proposed for multi-processor systems. In this paper, we propose a new cache replacement policy that improves the performance of a multi-processor system with a remote cache. If the multi-level cache maintains the multi-level inclusion (MLI) property and uses the LRU (Least Recently Used) replacement policy, the LRU information of the higher-level cache (a processor cache) can differ from that of the lower-level cache (a remote cache). In this situation, replacing a remote cache line can force the replacement of a processor cache line that the processor is still using, which is a major source of performance degradation for the whole system. To alleviate this disadvantage of the LRU replacement policy, the new policy analyzes the processors' remote memory access pattern in each node and uses this information to reduce the number of invalidations of useful cache lines in the higher-level cache. Compared with the general LRU replacement policy, the new remote cache replacement policy improves performance by up to $3.5\%$ and by $2.5\%$ on average on the SPLASH-2 benchmarks.

Keeping-ownership Cache Replacement Policies for Remote Access Caches of NUMA System (NUMA 시스템에서 소유권에 근거한 원격 캐시 교체 정책)

신숭현;곽종욱;장성태;전주식
- Journal of KIISE:Computer Systems and Theory / v.31 no.8 / pp.473-486 / 2004
NUMA systems place a remote access cache (RAC) in each local node to reduce the overhead of repeated remote memory accesses. The RAC reduces memory latency and network traffic and thereby improves the performance of the multiprocessor system. Several cache replacement policies have been proposed in recent years, including policies for multiprocessor systems. In this paper, we propose a cache replacement policy based on cache line coherence information. Under this policy, a cache line that does not have ownership is replaced before a cache line that has ownership. In this way, the overhead of transferring ownership is avoided and memory latency can be decreased. We also propose “Keeping-Ownership replacement policy with MRU (KOM)” and “Keeping-Ownership replacement policy with Reference Bit (KORB)” to reduce the penalty of frequently replacing cache lines that lack ownership. We compare and analyze these policies against LRU and Pseudo-LRU (PLRU). Simulation shows that KOM outperforms PLRU by 25% and KORB outperforms PLRU by 13%. Although the hardware cost of KOM is very small, its performance is nearly equal to that of LRU.
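The core rule in this abstract (replace lines without ownership before lines with ownership) can be illustrated with a minimal sketch. The `Line` record, the `pick_victim` function, and the LRU tie-break among candidates are assumptions made for illustration; the abstract does not specify how KOM or KORB order the candidates, so this is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """Hypothetical cache line record (names are illustrative only)."""
    tag: int
    owned: bool       # True if this node holds ownership of the line
    last_used: int    # logical timestamp of the most recent access


def pick_victim(cache_set):
    """Choose a victim, preferring lines that do not hold ownership.

    Assumption: among the preferred candidates the least recently used
    line is chosen. If every line is owned, fall back to plain LRU.
    """
    non_owned = [line for line in cache_set if not line.owned]
    candidates = non_owned or cache_set
    return min(candidates, key=lambda line: line.last_used)
```

With this rule, an owned line survives even when it is the least recently used line in its set, as long as some line without ownership is available to evict.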

A Remote Cache Coherence Protocol for Single Shared Memory in Multiprocessor System (단일 공유 메모리를 가지는 다중 프로세서 시스템의 원격 캐시 일관성 유지 프로토콜)

Kim, Seong-Woon;Kim, Bo-Gwan
- Journal of the Institute of Electronics Engineers of Korea CI / v.42 no.6 / pp.19-28 / 2005
A multiprocessor architecture is an effective way to improve computer system performance. CC-NUMA, which provides a single shared space over physically distributed memories, is widely used in multiprocessor computer systems. A CC-NUMA system has a full-mapped directory for the shared memory and uses a remote cache memory for fast memory access. In this paper, we propose a processing node architecture for a CC-NUMA system and a cache coherency protocol for the physically distributed but logically shared system. We present implementation results for a system that adopts the cache coherency protocol.

Formal Verification of RACE Protocol Using VIS (VIS를 이용한 RACE 포로토콜의 정형검증)

Um, Hyun-Sun;Choi, JIn-Young;Han, Woo-Jong;Ki, An-Do;Shim, Kyu-Hyun
- The Transactions of the Korea Information Processing Society / v.7 no.7 / pp.2219-2228 / 2000
Caches in a multiprocessing environment introduce the cache coherence problem. When multiple processors maintain locally cached copies of a unique shared-memory location, any local modification of the location can result in a globally inconsistent view of memory. Cache coherence protocols are essential for operating a shared-memory multiprocessor system efficiently and correctly. Since random testing and simulation are not enough to validate the correctness of protocols, efficient and reliable verification methods are needed. In this paper, we present our experience in using VIS (Verification Interacting with Synthesis), a formal methods tool, to analyze a number of properties of a cache coherence protocol, RACE (Remote Access Cache coherent Enforcement).

Meta Data Caching Mechanism in Distributed Directory Database Systems (분산 디렉토리 데이터베이스 시스템에서의 메타 데이터 캐싱 기법)

Lee, Kang-Woo;Koh, Jin-Gwang
- The Transactions of the Korea Information Processing Society / v.7 no.6 / pp.1746-1752 / 2000
In this paper, a cache mechanism is proposed to improve the speed of query processing in distributed directory database systems. To decrease the search time for requested objects and the query processing time, query requests and results about objects at a remote site are stored in the cache of the local site. The cache system architecture is designed according to the classified information. Cache schemas are designed for each kind of cache information. Operational algorithms are developed for the meta data cache, which has a meta data tree. This tree improves the speed of query processing by reducing the search space. Finally, performance is evaluated by comparing the proposed cache mechanism with X.500.

A Study on the Block Lookup and Replacement in Global Memory (전역적 메모리에서의 블록 룩업과 재배치에 관한 연구)

이영섭;김은경;정병수
- Proceedings of the IEEK Conference / 2000.11c / pp.51-54 / 2000
With the emergence of high-speed networks, interest in accessing remote data has increased. This interest motivates cooperative caching, in which clients share their caches so that a remote cache can be used like a local cache. Conventional algorithms such as GMS (Global Memory Service) suffer from bottlenecks and degraded performance because many messages must be exchanged with a server or manager. Hint-based algorithms, on the other hand, resolve the GMS server bottleneck because each client holds hint information for all blocks. However, hint-based algorithms have their own problem: if the hint information is too old, it becomes inaccurate. In this paper, we propose a policy that addresses both the bottleneck and the inaccuracy by using a file identifier to search the lookup table and by periodically exchanging oldest-block information among clients.

Improving Performance of Internet by Using Hierarchical Proxy Cache (계층적 프록시 캐쉬를 이용한 인터넷 성능 향상 기법)

이효일;김종현
- Journal of the Korea Society for Simulation / v.9 no.2 / pp.1-14 / 2000
Recently, as the construction of information infrastructure, including high-speed communication networks, has expanded remarkably, a wider variety of information services has been provided. As a result, the number of Internet users is increasing rapidly, which places a heavy load on Web servers and increases network traffic. These phenomena lengthen response times and thus degrade quality of service. To solve these problems, much effort has been made to relieve the bottleneck at Web servers, reduce network traffic, and shorten response times by caching frequently accessed information at a proxy server located near the clients. Internet performance can be improved further by allowing clients to share information stored in proxy caches. In this paper, we simulate hierarchical proxy caches with a 3-level 4-ary tree structure using real Web traces, and analyze the cache hit ratio for various cache replacement policies and cache sizes when the delayed-store scheme is applied. According to the simulation results, the delayed-store scheme increases the remote cache hit ratio, which improves quality of service by shortening the service response time.
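To make the tree-structured lookup concrete, here is a small sketch of a proxy cache hierarchy in which a miss is forwarded to the parent and the object is stored on the way back. The `ProxyCache` class, the LRU replacement at every level, and the store-on-every-miss behavior are assumptions for illustration; the abstract does not define the delayed-store scheme, so it is not modeled here.

```python
from collections import OrderedDict


class ProxyCache:
    """Hypothetical proxy cache node with LRU replacement."""

    def __init__(self, capacity, parent=None):
        self.capacity = capacity
        self.parent = parent          # None for the root of the tree
        self.store = OrderedDict()    # URL -> True, least recent first

    def _insert(self, url):
        # Store the object and evict the least recently used one if full.
        self.store[url] = True
        self.store.move_to_end(url)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)

    def get(self, url):
        """Return how many levels up the hit occurred (0 = local hit),
        or the string "origin" if no cache on the path held the object."""
        if url in self.store:
            self.store.move_to_end(url)   # refresh LRU position
            return 0
        if self.parent is None:
            source = "origin"
        else:
            up = self.parent.get(url)
            source = "origin" if up == "origin" else up + 1
        self._insert(url)                 # assumption: store on every miss
        return source


# A 3-level 4-ary tree: 1 root, 4 mid-level proxies, 16 leaf proxies.
root = ProxyCache(capacity=64)
mids = [ProxyCache(capacity=16, parent=root) for _ in range(4)]
leaves = [ProxyCache(capacity=4, parent=mids[i // 4]) for i in range(16)]
```

In this sketch a request from `leaves[1]` for an object that `leaves[0]` fetched earlier is served by their shared mid-level proxy, which is the kind of remote cache hit the abstract measures.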

Design of cache mechanism in distributed directory environment (분산 디렉토리 환경 하에서 효율적인 캐시 메카니즘 설계)

이강우;이재호;임해철
- The Journal of Korean Institute of Communications and Information Sciences / v.22 no.2 / pp.205-214 / 1997
In this paper, we suggest a cache mechanism to improve the speed of query processing in a distributed directory environment. To this end, requests and results about objects at a remote site are stored in the cache of the local site. The cache mechanism is developed in six phases: 1) Cached information stored in the distributed directory system is classified as application data, system data, and meta data. 2) The cache system architecture is designed according to the classified information. 3) Cache schemas are designed for each kind of cache information. 4) Least-TTL algorithms, which use a weighted value of geographical information and access frequency for replacement, are developed for the data caches (application cache, system cache). 5) Operational algorithms are developed for the meta data cache, which has a meta data tree. This tree is based on information from past queries and improves the speed of query processing by reducing the search space. 6) Finally, performance is evaluated by comparing the proposed cache mechanism with other mechanisms.

A New Parameter Estimation Method for a Zipf-like Distribution for Geospatial Data Access

Li, Rui;Feng, Wei;Wang, Hao;Wu, Huayi
- ETRI Journal / v.36 no.1 / pp.134-140 / 2014
Many reports have shown that the access pattern for geospatial tiles follows Zipf's law and that its parameter ${\alpha}$ represents the access characteristics. However, visits to geospatial tiles have temporal and spatial popularities, and the ${\alpha}$-value changes as they change. We construct a mathematical model to simulate the user's access behavior by studying the attributes of frequently visited tile objects to determine parameter estimation algorithms. Because the least squares (LS) method in common use cannot obtain an exact ${\alpha}$-value and does not provide a suitable fit to data for frequently visited tiles, we present a new approach, which uses a moment method of estimation to obtain the value of ${\alpha}$ when ${\alpha}$ is close to 1. When ${\alpha}$ is further away from 1, the method uses the associated cache hit ratio for tile access and uses an LS method based on a critical cache size to estimate the value of ${\alpha}$. The decrease in the estimation error is presented and discussed in the section on experiment results. This new method, which provides a more accurate estimate of ${\alpha}$ than earlier methods, promises more effective prediction of requests for frequently accessed tiles for better caching and load balancing.
https://doi.org/10.4218/etrij.14.0113.0293
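For context, the common least squares (LS) baseline that this abstract contrasts with can be sketched as a straight-line fit on a log-log rank-frequency plot, where the Zipf-like model is $f(r) \propto r^{-\alpha}$. The function name `estimate_alpha_ls` and the input format (a list of access counts per tile) are assumptions for illustration; this is the generic baseline, not the moment-based method proposed in the paper.

```python
import math


def estimate_alpha_ls(counts):
    """Estimate the Zipf-like exponent by ordinary least squares.

    Fits log(count) = c - alpha * log(rank) over objects ranked by
    descending access count and returns alpha. Zero counts are dropped
    because their logarithm is undefined. Needs at least two objects.
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two objects with positive counts")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx   # the fitted slope is -alpha
```

On data that follows the model exactly, the fit recovers the exponent; on real access logs the head and tail of the distribution deviate from the line, which is one reason an LS fit may not describe frequently visited tiles well.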

Application Behavior-oriented Adaptive Remote Access Cache in Ring based NUMA System (링 구조 NUMA 시스템에서 적응형 다중 그레인 원격 캐쉬 설계)

곽종욱;장성태;전주식
- Journal of KIISE:Computer Systems and Theory / v.30 no.9 / pp.461-476 / 2003
Owing to its ease of implementation and its alleviation of the memory bottleneck, the NUMA architecture has dominated multiprocessor systems for the past several years. However, because a NUMA system distributes memory across nodes, frequent remote memory access is a key cause of performance degradation. Efficient design of the RAC (Remote Access Cache) in a NUMA system is therefore critical for performance improvement. In this paper, we suggest a Multi-Grain RAC that adaptively controls the RAC line size according to each application's behavior. We then simulate a NUMA system with the multi-grain RAC using MINT, an event-driven memory hierarchy simulator, and analyze the performance results. First, using a profile-based determination method, we determine the optimal RAC line size for each application; we then compare and analyze the performance differences among NUMA systems with a normal RAC, with an optimal-line-size RAC, and with the multi-grain RAC. The simulation shows that the worst case can always be avoided and that the results are very close to the optimal case for any combination of application and RAC format.

Search Result 33, Processing Time 0.028 seconds
