• Title/Summary/Keyword: Cache System

An On-chip Multiprocessor Microprocessor with Shared MMU and Cache

  • Lee, Yong-Hwan;Jeong, Woo-Kyeong;An, Sang-Jun;Lee, Yong-Surk
    • Journal of Electrical Engineering and Information Science / v.2 no.4 / pp.1-7 / 1997
  • A multiprocessor microprocessor named SMPC (scalable multiprocessor chip), which contains two IUs (integer units), is presented in this paper. It can execute multiple instructions from several tasks, exploiting task-level parallelism, which is free from instruction dependencies, and provides high performance and throughput in both single-program and multiprogramming environments. The IU is a 32-bit scalar processor especially designed to boost the performance of the string manipulations frequently used in RDBMS (relational database management system) applications. A memory management unit and a data cache shared by the two IUs improve performance and reduce the required chip area. The SMPC is implemented in a VLSI circuit using custom design and automated design tools.
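
As a rough illustration of the sharing idea in this abstract (not the SMPC design itself), the sketch below models two integer units issuing accesses to one shared direct-mapped data cache: a single tag array serves both units, which is where the area saving comes from. All sizes and access streams are invented.

```python
# Illustrative sketch only: two integer units sharing one direct-mapped
# data cache. One tag/data array serves both units, halving the SRAM
# needed versus two private caches, at the cost of possible conflicts.

class SharedCache:
    def __init__(self, num_lines=64, line_size=16):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # one tag per direct-mapped line
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        line = (addr // self.line_size) % self.num_lines
        tag = addr // (self.line_size * self.num_lines)
        if self.tags[line] == tag:
            self.hits += 1
        else:
            self.misses += 1
            self.tags[line] = tag  # fill the line on a miss

cache = SharedCache()
iu0 = range(0, 4096, 4)       # IU0 streams over one array
iu1 = range(8192, 12288, 4)   # IU1 streams over another
for a0, a1 in zip(iu0, iu1):  # both units issue an access each cycle
    cache.access(a0)
    cache.access(a1)
print(f"hits={cache.hits} misses={cache.misses}")
```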

Caching and Prefetching Policies Using Program Page Reference Patterns on a File System Layer for NAND Flash Memory (NAND 플래시 메모리용 파일 시스템 계층에서 프로그램의 페이지 참조 패턴을 고려한 캐싱 및 선반입 정책)

  • Kim, Gyeong-San;Kim, Seong-Jo
    • Proceedings of the IEEK Conference / 2006.06a / pp.777-778 / 2006
  • In this thesis, we design and implement a Flash Cache Core Module (FCCM), which operates on the YAFFS NAND flash file system. The FCCM applies memory replacement and prefetching policies based on the page reference patterns of applications. It also implements the Clean-First memory replacement technique, which considers the characteristics of flash memory; replacement decisions take page hits in the prefetch waiting area into account. Compared with the Linux caching and prefetching policies, the FCCM decreases I/O frequency by up to 37%, and it uses less memory for prefetching (up to 24%, 16% on average) than the Linux kernel.
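
The Clean-First idea can be sketched independently of the FCCM: because a flash write is far more expensive than a read, the policy below evicts the least recently used *clean* page first and falls back to a dirty page only when every cached page is dirty. This is a minimal reconstruction from the abstract; the FCCM's actual decision logic (page hits in the prefetch waiting area) is richer.

```python
from collections import OrderedDict

# Minimal Clean-First replacement sketch: evicting a clean page avoids a
# costly flash write-back, so clean pages are preferred victims.

class CleanFirstCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page -> dirty flag, kept in LRU order

    def access(self, page, write=False):
        if page in self.pages:
            dirty = self.pages.pop(page) or write  # stays dirty once written
        else:
            dirty = write
            if len(self.pages) >= self.capacity:
                self._evict()
        self.pages[page] = dirty  # most recently used goes to the end

    def _evict(self):
        for victim, dirty in self.pages.items():  # scan in LRU order
            if not dirty:                         # clean victim found
                del self.pages[victim]
                return
        self.pages.popitem(last=False)            # all dirty: evict true LRU

cache = CleanFirstCache(capacity=3)
cache.access(1, write=True)  # dirty
cache.access(2)              # clean
cache.access(3)              # clean
cache.access(4)              # evicts page 2 (LRU clean), keeps dirty page 1
print(list(cache.pages))     # -> [1, 3, 4]
```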

5G Network Communication, Caching, and Computing Algorithms Based on the Two-Tier Game Model

  • Kim, Sungwook
    • ETRI Journal / v.40 no.1 / pp.61-71 / 2018
  • In this study, we develop hybrid control algorithms for smart base stations (SBSs), together with communication, caching, and computing techniques. In the proposed scheme, SBSs are equipped with computing power and data storage to collectively offload computation from mobile user equipment and to cache data from clouds. To combine the communication, caching, and computing algorithms in a refined manner, game theory is adopted to characterize their competitive and cooperative interactions. The main contribution of the proposed scheme is to illuminate the synergy of a fully integrated approach while providing the adaptability and flexibility to satisfy different performance requirements. Simulation results demonstrate that the proposed approach outperforms existing schemes by approximately 5% to 15% in terms of bandwidth utilization, access delay, and system throughput.
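
The abstract does not spell out the game model, so the following is only a toy flavor of game-based control, not the paper's two-tier formulation: two hypothetical SBSs repeatedly play best responses when weighing their own caching benefit against the congestion they impose on each other, converging to an equilibrium. All utilities and constants are invented.

```python
# Toy best-response dynamics between two hypothetical SBSs choosing
# caching effort x >= 0 under a utility that trades benefit for
# congestion caused by the other's choice.

def best_response(other, benefit=10.0, congestion=1.0, cost=2.0):
    # Maximize u(x) = benefit*ln(1+x) - congestion*x*other - cost*x;
    # setting du/dx = 0 yields the closed-form best response below.
    x = benefit / (congestion * other + cost) - 1.0
    return max(0.0, x)

x1, x2 = 1.0, 1.0
for _ in range(50):  # alternate best responses until they settle
    x1 = best_response(x2)
    x2 = best_response(x1)
print(f"equilibrium caching efforts: x1={x1:.3f}, x2={x2:.3f}")
```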

Distributed Cache for High Performance of HDFS in a Real-Time Cloud (실시간 클라우드 환경에서 HDFS의 고 성능을 위한 분산캐시)

  • Choi, Ji Hyeon;Youn, Hee Yong
    • Proceedings of the Korean Society of Computer Information Conference / 2014.07a / pp.351-352 / 2014
  • A distributed file system achieves high scalability and availability by building the file system out of multiple distributed servers. HDFS is widely used as large-capacity data storage, but it gives little consideration to real-time file access. Reading a file involves interaction between the NameNode and DataNodes, and under very large data volumes and heavy concurrent workloads the access speed drops sharply. Improving HDFS file access speed in real-time cloud service environments is therefore an active research issue. In this paper, we propose a new cache system that places a distributed cache on top of HDFS.
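
A minimal sketch of layering a distributed cache above HDFS might look like the following: block reads are routed by hash to a cache node, and only a miss falls through to the NameNode/DataNode path. The names (read_from_hdfs, CacheNode, DistributedCache) are illustrative, not the paper's API.

```python
import hashlib

def read_from_hdfs(block_id):
    # Stand-in for the slow NameNode lookup plus DataNode transfer.
    return f"<data of {block_id}>"

class CacheNode:
    def __init__(self):
        self.store = {}

    def get(self, block_id):
        if block_id not in self.store:      # miss: fall through to HDFS
            self.store[block_id] = read_from_hdfs(block_id)
        return self.store[block_id]

class DistributedCache:
    """Routes each block to a home cache node chosen by hashing."""
    def __init__(self, num_nodes=4):
        self.nodes = [CacheNode() for _ in range(num_nodes)]

    def get(self, block_id):
        h = int(hashlib.md5(block_id.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)].get(block_id)

cache = DistributedCache()
print(cache.get("blk_0001"))  # first read: served from HDFS, then cached
print(cache.get("blk_0001"))  # repeat read: served by the cache node
```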

Design of Low-Power Object-based Mobile Storage System by WLAN Power Control (WLAN 전력제어를 통한 저전력 객체기반 모바일 스토리지 시스템의 설계)

  • Jeon, Young-Joon;Choi, Min-Seok;Nam, Young-Jin
    • Proceedings of the Korean Information Science Society Conference / 2007.06b / pp.441-444 / 2007
  • In this paper, we propose a low-power object-based mobile storage system architecture that uses object-based IP storage and is well suited to multimedia content playback on mobile devices. To improve playback performance, a buffer cache and a prefetch function are added to the OSD layer on the mobile device. Power consumption is further reduced through WLAN power control on the mobile device, which keeps the WLAN in the Sleep or Power Off state for as long as possible. In this work, we design a buffer cache manager and a prefetch manager for the caching and prefetching functions, and a WLAN manager for WLAN power management.
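
The power saving comes from amortizing the WLAN wake-up cost over a large prefetched buffer: the radio wakes, refills the buffer cache in one burst, and sleeps while playback drains it. The back-of-the-envelope model below, with made-up numbers, shows the awake fraction shrinking as the buffer grows.

```python
def awake_fraction(buffer_blocks, playback_rate, link_rate, wake_overhead):
    """Fraction of each refill cycle that the WLAN spends awake."""
    awake = wake_overhead + buffer_blocks / link_rate  # wake-up + burst fetch
    asleep = buffer_blocks / playback_rate             # playback drains buffer
    return awake / (awake + asleep)

# Invented numbers: 1 block/s playback, 10 blocks/s link, 2 s wake-up cost.
for buf in (8, 64, 256):
    f = awake_fraction(buf, playback_rate=1.0, link_rate=10.0,
                       wake_overhead=2.0)
    print(f"buffer = {buf:3d} blocks -> WLAN awake {f:.0%} of the cycle")
```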

Code Transformation Techniques for Scratch-Pad Memory (Scratch-Pad Memory를 위한 코드 변환 기법)

  • 문대경;이재진
    • Proceedings of the Korean Information Science Society Conference / 2004.10a / pp.577-579 / 2004
  • Mobile embedded systems that depend entirely on batteries must use energy efficiently because of limited battery capacity. The memory subsystem in particular accounts for a large share of the energy consumed by the whole system. This paper proposes source code transformation and management techniques for making efficient use of on-chip scratch-pad memory (SPM), which rivals a cache in performance yet consumes far less power owing to its simpler structure. Code is transformed function by function: for each transformation that would allocate a variable to the SPM, the variable's static reference count, obtained from source code analysis alone, is used as a weight to estimate the execution time and energy consumption of memory references after the transformation; based on these estimates, the variables to allocate to the SPM are selected and the corresponding transformation is applied. The proposed code transformation can be automated in a compiler. An evaluation with ten embedded benchmark programs shows that execution time improves by 23% on average and energy consumption falls by 49% on average.
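
The selection step described above is essentially a 0/1 knapsack: pick the set of variables, weighted by static reference counts, that fits in the SPM and maximizes benefit. A minimal sketch with invented sizes and counts (the paper additionally models post-transformation execution time and energy):

```python
def choose_spm_vars(variables, spm_size):
    """variables: list of (name, size_bytes, static_ref_count)."""
    # dp[s] = (best total refs using at most s bytes, chosen names)
    dp = [(0, [])] * (spm_size + 1)
    for name, size, refs in variables:
        for s in range(spm_size, size - 1, -1):  # classic 0/1 knapsack
            cand = (dp[s - size][0] + refs, dp[s - size][1] + [name])
            if cand[0] > dp[s][0]:
                dp[s] = cand
    return max(dp)[1]

variables = [("hist", 1024, 900), ("coeff", 256, 700),
             ("tmp", 2048, 300), ("lut", 512, 650)]
print(choose_spm_vars(variables, spm_size=2048))
# -> ['hist', 'coeff', 'lut']: 2250 weighted refs in 1792 of 2048 bytes
```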

David II: A New Architecture for Parallel Rendering Processors with an Effective Memory System (David II: 효과적인 메모리 시스템을 가지는 병렬 렌더링 프로세서)

  • Lee, Kil-Whan;Park, Woo-Chan;Kim, Il-San;Han, Tack-Don
    • Proceedings of the Korea Information Processing Society Conference / 2004.05a / pp.1655-1658 / 2004
  • Current rendering processors are organized mainly to process a single triangle as fast as possible, and parallel 3D rendering processors, which can process multiple triangles in parallel with multiple rasterizers, have recently begun to appear. For high triangle throughput, it is desirable for each rasterizer to have its own local pixel cache. However, a consistency problem can occur when more than one rasterizer accesses data at the same address simultaneously. In this paper, we propose a parallel rendering processor architecture, called DAVID II, that resolves this consistency problem effectively. Moreover, the proposed architecture significantly reduces the latency caused by pixel cache misses. The experimental results show that DAVID II achieves almost linear speedup in the best case, even with sixteen rasterizers.
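
This is not DAVID II's actual mechanism, but a simple way to see the consistency problem it addresses: if each screen tile is statically owned by exactly one rasterizer, as sketched below, two local pixel caches can never hold the same framebuffer address and no coherence traffic is needed. The tile size and interleaving pattern are invented.

```python
TILE = 8  # 8x8-pixel screen tiles

def owner(x, y, num_rasterizers):
    """Map a pixel to the one rasterizer whose local cache may hold it."""
    tx, ty = x // TILE, y // TILE
    return (tx + ty) % num_rasterizers  # checkerboard-style interleaving

# Every pixel of every triangle is routed to its owner, so writes to a
# given framebuffer address always come from the same rasterizer:
num_r = 4
for x, y in [(12, 5), (13, 5), (200, 117)]:
    print(f"pixel ({x},{y}) -> rasterizer {owner(x, y, num_r)}")
```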

A Performance Improvement of Linux TCP/IP Stack based on Flow-Level Parallelism in a Multi-Core System (멀티코어 시스템에서 흐름 수준 병렬처리에 기반한 리눅스 TCP/IP 스택의 성능 개선)

  • Kwon, Hui-Ung;Jung, Hyung-Jin;Kwak, Hu-Keun;Kim, Young-Jong;Chung, Kyu-Sik
    • The KIPS Transactions: Part A / v.16A no.2 / pp.113-124 / 2009
  • As multicore systems spread, much effort has been put into improving the performance of their applications. Because a multicore system has multiple processing units, its processing power is greater than that of a single-core system. In many cases, however, the advantages of multicore cannot be fully exploited because existing software and hardware were designed for single cores. When existing software runs on multicore, its performance improvement is limited by bottlenecks on shared resources and inefficient use of cache memory; as the number of cores increases, performance stops improving and, in the worst case, drops. In this paper, we propose a method of improving multicore system performance by applying flow-level parallelism to an existing TCP/IP network application and operating system. The proposed method sets up the execution environment so that each core operates as independently as possible across the network application, the TCP/IP stack in the operating system, the device driver, and the network interface, and it distributes network traffic to the cores through an L2 switch. This minimizes the sharing of application data, data structures, sockets, device drivers, and network interfaces between cores, minimizes contention among cores for resources, and increases the cache hit ratio. We implemented the proposed method on an 8-core system and performed experiments. The results show that network access speed and bandwidth increase almost linearly with the number of cores.
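
The core of flow-level parallelism is a deterministic flow-to-core mapping, so every packet of a connection lands on the core that holds its socket state and cache lines. The sketch below shows only that mapping; the paper realizes it with an L2 switch and per-core driver and stack instances.

```python
import hashlib

def flow_to_core(src_ip, src_port, dst_ip, dst_port, num_cores):
    """Hash a connection's 4-tuple to a fixed core index."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return int(hashlib.sha1(key).hexdigest(), 16) % num_cores

packets = [
    ("10.0.0.1", 40000, "10.0.0.9", 80),
    ("10.0.0.2", 40001, "10.0.0.9", 80),
    ("10.0.0.1", 40000, "10.0.0.9", 80),  # same flow -> same core
]
for p in packets:
    print(p, "-> core", flow_to_core(*p, num_cores=8))
```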

The Power and Pitfalls of Data Prefetching (데이터 미리읽기의 동작과 문제점)

  • Ki, An-do
    • Electronics and Telecommunications Trends / v.13 no.4 s.52 / pp.59-69 / 1998
  • The terminology of data prefetching is introduced, which includes stride, repeat distance, stall, pending stall, prefetch degree, prefetch distance, and prefetch offset. The effectiveness of hardware data prefetching in reducing cache misses is shown by presenting a square matrix multiplication example. Thereafter the pitfalls of prefetching and possible solutions are discussed.
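
A minimal model of the machinery behind this terminology: a reference prediction table keyed by load PC remembers the last address and stride, and once the stride repeats it prefetches a fixed number of strides ahead (the prefetch distance). This is a generic stride prefetcher sketch, not the article's exact formulation.

```python
class StridePrefetcher:
    def __init__(self, distance=2):
        self.distance = distance  # how many strides ahead to prefetch
        self.table = {}           # load PC -> (last_addr, last_stride)

    def access(self, pc, addr):
        """Record a load; return a prefetch address if a stride repeats."""
        prefetch = None
        last, stride = self.table.get(pc, (addr, 0))
        new_stride = addr - last
        if new_stride != 0 and new_stride == stride:  # stride confirmed
            prefetch = addr + self.distance * new_stride
        self.table[pc] = (addr, new_stride)
        return prefetch

pf = StridePrefetcher(distance=2)
for addr in [100, 108, 116, 124]:  # stride-8 load stream at PC 0x40
    print(f"access {addr} -> prefetch {pf.access(0x40, addr)}")
```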

A Cache Buffer and Read Request-aware Request Scheduling Method for NAND Flash-based Solid-State Disks (캐시 버퍼와 읽기 요청을 고려한 낸드 플래시 기반 솔리드 스테이트 디스크의 요청 스케줄링 기법)

  • Bang, Kwanhu;Park, Sang-Hoon;Lee, Hyuk-Jun;Chung, Eui-Young
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.8 / pp.143-150 / 2013
  • Solid-state disks (SSDs) have been widely used by high-performance personal computers or servers due to its good characteristics and performance. The NAND flash-based SSDs, which take large portion of the whole NAND flash market, are the major type of SSDs. They usually integrate a cache buffer which is built from DRAM and uses the write-back policy for better performance. Unfortunately, the policy makes existing scheduling methods less effective at the I/F level of SSDs Therefore, in this paper, we propose a scheduling method for the I/F with consideration of the cache buffer. The proposed method considers the hit/miss status of cache buffer and gives higher priority to the read requests. As a result, the requests whose data is hit on the cache buffer can be handled in advance and the read requests which have larger effects on the whole system performance than write requests experience shorter latency. The experimental results show that the proposed scheduling method improves read latency by 26%.