• Title/Summary/Keyword: 캐시 데이터

Search Result 256, Processing Time 0.027 seconds

Memory-Efficient High Performance Parallelization of Aho-Corasick Algorithm on Intel Xeon Phi (Intel Xeon Phi 에서의 Aho-Corasick 알고리즘을 위한 메모리 친화적인 고성능 병렬화)

  • Tran, Nhat-Phuong;Jeong, Yosang;Lee, Myungho
    • Annual Conference of KIPS
    • /
    • 2014.04a
    • /
    • pp.87-89
    • /
    • 2014
  • Aho-Corasick (AC) algorithm is a multiple patterns string matching algorithm commonly used in many applications with real-time performance requirements. In this paper, we parallelize the AC algorithm on the Intel's Many Integrated Core (MIC) Architecture, Xeon Phi Coprocessor. We propose a new technique to compress the Deterministic Finite Automaton structure which represents the set of pattern strings again which the input data is inspected for possible matches. The new technique reduces the cache misses and leads to significantly improved performance on Xeon Phi.

Distributed In-Memory Caching Method for ML Workload in Kubernetes (쿠버네티스에서 ML 워크로드를 위한 분산 인-메모리 캐싱 방법)

  • Dong-Hyeon Youn;Seokil Song
    • Journal of Platform Technology
    • /
    • v.11 no.4
    • /
    • pp.71-79
    • /
    • 2023
  • In this paper, we analyze the characteristics of machine learning workloads and, based on them, propose a distributed in-memory caching technique to improve the performance of machine learning workloads. The core of machine learning workload is model training, and model training is a computationally intensive task. Performing machine learning workloads in a Kubernetes-based cloud environment in which the computing framework and storage are separated can effectively allocate resources, but delays can occur because IO must be performed through network communication. In this paper, we propose a distributed in-memory caching technique to improve the performance of machine learning workloads performed in such an environment. In particular, we propose a new method of precaching data required for machine learning workloads into the distributed in-memory cache by considering Kubflow pipelines, a Kubernetes-based machine learning pipeline management tool.

  • PDF

A Study on Non-query Based Model Extraction Attacks (쿼리를 사용하지 않는 딥러닝 모델 탈취 공격 연구)

  • Cho, Yungi;Lee, Younghan;Jun, Sohee;Paek, Yunheung
    • Annual Conference of KIPS
    • /
    • 2021.05a
    • /
    • pp.219-222
    • /
    • 2021
  • 인공지능 기술은 모든 분야에서 혁신을 이뤄내고 있다. 이와 동시에 인공지능 모델에 대한 여러 보안적인 문제점이 야기되고 있다. 그 중 대표적인 문제는 많은 인적/물적 자원을 통해 개발한 모델을 악의적인 사용자가 탈취하는 것이다. 모델 탈취가 발생할 경우, 경제적인 문제뿐만 아니라 모델 자체의 취약성을 드러낼 수 있다. 현재 많은 연구가 쿼리를 통해 얻는 모델의 입력과 출력을 분석하여 모델의 의사경계면 또는 모델의 기능성을 탈취하고 있다. 하지만 쿼리 기반의 탈취 공격은 획득할 수 있는 정보가 제한적이기 때문에 완벽한 탈취가 어렵다. 이에 따라 딥러닝 모델 연산 과정에서 데이터 스니핑 또는 캐시 부채널 공격을 통해 추가적인 정보 또는 완전한 모델을 탈취하려는 연구가 진행되고 있다. 본 논문에서는 최근 연구 동향과 쿼리 기반 공격과의 차이점을 분석하고 연구한다.

NVM-based Write Amplification Reduction to Avoid Performance Fluctuation of Flash Storage (플래시 스토리지의 성능 지연 방지를 위한 비휘발성램 기반 쓰기 증폭 감소 기법)

  • Lee, Eunji;Jeong, Minseong;Bahn, Hyokyung
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.16 no.4
    • /
    • pp.15-20
    • /
    • 2016
  • Write amplification is a critical factor that limits the stable performance of flash-based storage systems. To reduce write amplification, this paper presents a new technique that cooperatively manages data in flash storage and nonvolatile memory (NVM). Our scheme basically considers NVM as the cache of flash storage, but allows the original data in flash storage to be invalidated if there is a cached copy in NVM, which can temporarily serve as the original data. This scheme eliminates the copy-out operation for a substantial number of cached data, thereby enhancing garbage collection efficiency. Experimental results show that the proposed scheme reduces the copy-out overhead of garbage collection by 51.4% and decreases the standard deviation of response time by 35.4% on average.

Main Memory Spatial Database Clusters for Large Scale Web Geographic Information Systems (대규모 웹 지리정보시스템을 위한 메모리 상주 공간 데이터베이스 클러스터)

  • Lee, Jae-Dong
    • Journal of Korea Spatial Information System Society
    • /
    • v.6 no.1 s.11
    • /
    • pp.3-17
    • /
    • 2004
  • With the rapid growth of the Internet geographic information services through the WWW such as a location-based service and so on. Web GISs (Geographic Information Systems) have also come to be a cluster-based architecture like most other information systems. That is, in order to guarntee high quality of geographic information service without regard to the rapid growth of the number of users, web GISs need cluster-based architecture that will be cost-effective and have high availability and scalability. This paper proposes the design of the cluster-based web GIS with high availability and scalability. For this, each node within a cluster-based web GIS consists of main memory spatial databases which accomplish role of caching by using data declustering and the locality of spatial query. Not only simple region queries but also the proposed system processed spatial join queries effectively. Compare to the existing method. Parallel R-tree spatial join for a shared-Nothing architecture, the result of simulation experiments represents that the proposed spatial join method achieves improvement of performance respectively 23% and 30% as data quantity and nodes of cluster become large.

  • PDF

Page Replacement Algorithm for Improving Performance of Hybrid Main Memory (하이브리드 메인 메모리의 성능 향상을 위한 페이지 교체 기법)

  • Lee, Minhoe;Kang, Dong Hyun;Kim, Junghoon;Eom, Young Ik
    • KIISE Transactions on Computing Practices
    • /
    • v.21 no.1
    • /
    • pp.88-93
    • /
    • 2015
  • In modern computer systems, DRAM is commonly used as main memory due to its low read/write latency and high endurance. However, DRAM is volatile memory that requires periodic power supply (i.e., memory refresh) to sustain the data stored in it. On the other hand, PCM is a promising candidate for replacement of DRAM because it is non-volatile memory, which could sustain the stored data without memory refresh. PCM is also available for byte-addressable access and in-place update. However, PCM is unsuitable for using main memory of a computer system because it has two limitations: high read/write latency and low endurance. To take the advantage of both DRAM and PCM, a hybrid main memory, which consists of DRAM and PCM, has been suggested and actively studied. In this paper, we propose a novel page replacement algorithm for hybrid main memory. To cope with the weaknesses of PCM, our scheme focuses on reducing the number of PCM writes in the hybrid main memory. Experimental results shows that our proposed page replacement algorithm reduces the number of PCM writes by up to 80.5% compared with the other page replacement algorithms.

An Energy Efficient Query Processing Mechanism using Cache Filtering in Cluster-based Wireless Sensor Networks (클러스터 기반 WSN에서 캐시 필터링을 이용한 에너지 효율적인 질의처리 기법)

  • Lee, Kwang-Won;Hwang, Yoon-Cheol;Oh, Ryum-Duck
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.8
    • /
    • pp.149-156
    • /
    • 2010
  • As following the development of the USN technology, sensor node used in sensor network has capability of quick data process and storage to support efficient network configuration is enabled. In addition, tree-based structure was transformed to cluster in the construction of sensor network. However, query processing based on existing tree structure could be inefficient under the cluster-based network. In this paper, we suggest energy efficient query processing mechanism using filtering through data attribute classification in cluster-based sensor network. The suggestion mechanism use advantage of cluster-based network so reduce energy of query processing and designed more intelligent query dissemination. And, we prove excellence of energy efficient side with MATLab.

SPARQL Query Processing in Distributed In-Memory System (분산 메모리 시스템에서의 SPARQL 질의 처리)

  • Jagvaral, Batselem;Lee, Wangon;Kim, Kang-Pil;Park, Young-Tack
    • Journal of KIISE
    • /
    • v.42 no.9
    • /
    • pp.1109-1116
    • /
    • 2015
  • In this paper, we propose a query processing approach that uses the Spark functional programming and distributed memory system to solve the computational overhead of SPARQL. In the semantic web, RDF ontology data is produced at large scale, and the main challenge for the semantic web is to query and manipulate such a large ontology with a high throughput. The most existing studies on SPARQL have focused on deploying the Hadoop MapReduce framework, and although approaches based on Hadoop MapReduce have shown promising results, they achieve a low level of throughput due to the underlying distributed file processes. Therefore, in order to speed up the query processes, we suggest query- processing methods that are based on memory caching in distributed memory system. Our approach is also integrated with a clause unification method for propagating between the clauses that exploits Spark join, map and filter methods along with caching. In our experiments, we have achieved a high level of performance relative to other approaches. In particular, our performance was nearly similar to that of Sempala, which has been considered to be the fastest query processing system.

Parallel Cell-Connectivity Information Extraction Algorithm for Ray-casting on Unstructured Grid Data (비정렬 격자에 대한 광선 투사를 위한 셀 사이 연결정보 추출 병렬처리 알고리즘)

  • Lee, Jihun;Kim, Duksu
    • Journal of the Korea Computer Graphics Society
    • /
    • v.26 no.1
    • /
    • pp.17-25
    • /
    • 2020
  • We present a novel multi-core CPU based parallel algorithm for the cell-connectivity information extraction algorithm, which is one of the preprocessing steps for volume rendering of unstructured grid data. We first check the synchronization issues when parallelizing the prior serial algorithm naively. Then, we propose a 3-step parallel algorithm that achieves high parallelization efficiency by removing synchronization in each step. Also, our 3-step algorithm improves the cache utilization efficiency by increasing the spatial locality for the duplicated triangle test process, which is the core operation of building cell-connectivity information. We further improve the efficiency of our parallel algorithm by employing a memory pool for each thread. To check the benefit of our approach, we implemented our method on a system consisting of two octa-core CPUs and measured the performance. As a result, our method shows continuous performance improvement as we add threads. Also, it achieves up to 82.9 times higher performance compared with the prior serial algorithm when we use thirty-two threads (sixteen physical cores). These results demonstrate the high parallelization efficiency and high cache utilization efficiency of our method. Also, it validates the suitability of our algorithm for large-scale unstructured data.

A Prefetching and Memory Management Policy for Personal Solid State Drives (개인용 SSD를 위한 선반입 및 메모리 관리 정책)

  • Baek, Sung-Hoon
    • The KIPS Transactions:PartA
    • /
    • v.19A no.1
    • /
    • pp.35-44
    • /
    • 2012
  • Traditional technologies that are used to improve the performance of hard disk drives show many negative cases if they are applied to solid state drives (SSD). Access time and block sequence in hard disk drives that consist of mechanical components are very important performance factors. Meanwhile, SSD provides superior random read performance that is not affected by block address sequence due to the characteristics of flash memory. Practically, it is recommended to disable prefetching if a SSD is installed in a personal computer. However, this paper presents a combinational method of a prefetching scheme and a memory management that consider the internal structure of SSD and the characteristics of NAND flash memory. It is important that SSD must concurrently operate multiple flash memory chips. The I/O unit size of NAND flash memory tends to increase and it exceeded the block size of operating systems. Hence, the proposed prefetching scheme performs in an operating unit of SSD. To complement a weak point of the prefetching scheme, the proposed memory management scheme adaptively evicts uselessly prefetched data to maximize the sum of cache hit rate and prefetch hit rate. We implemented the proposed schemes as a Linux kernel module and evaluated them using a commercial SSD. The schemes improved the I/O performance up to 26% in a given experiment.