• Title/Summary/Keyword: in-memory computing

A Study on the Remote Method Connection using RMI in the Distributed Computing System (분산 환경 시스템에서 RMI를 이용한 원격 메소드 연결에 관한 연구)

  • 소경영;최유순;박종구
    • Journal of the Korea Institute of Information and Communication Engineering / v.5 no.3 / pp.483-491 / 2001
  • In this paper, we design and implement a remote method connection system using Java RMI in a distributed computing environment. To this end, we implement a dynamic method connection interface and API, and then describe the dynamic memory management routine.

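The abstract above centers on the standard Java RMI mechanism for invoking methods on remote objects. As a point of reference only, here is a minimal, self-contained sketch of a remote interface, its server-side export, and a client-side lookup using the java.rmi API; the service name and methods (EchoService, echo) are hypothetical and not taken from the paper.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical remote interface; every remote method must declare RemoteException.
interface EchoService extends Remote {
    String echo(String message) throws RemoteException;
}

// Server-side implementation exported through UnicastRemoteObject.
class EchoServiceImpl extends UnicastRemoteObject implements EchoService {
    protected EchoServiceImpl() throws RemoteException { super(); }

    @Override
    public String echo(String message) throws RemoteException {
        return "echo: " + message;
    }
}

public class RmiSketch {
    public static void main(String[] args) throws Exception {
        // Server side: create a registry and bind the remote object under a name.
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("EchoService", new EchoServiceImpl());

        // Client side: look up the stub by name and invoke the remote method.
        EchoService stub = (EchoService) LocateRegistry.getRegistry("localhost", 1099)
                                                       .lookup("EchoService");
        System.out.println(stub.echo("hello"));
    }
}
```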

Time-Predictable Java Dynamic Compilation on Multicore Processors

  • Sun, Yu;Zhang, Wei
    • Journal of Computing Science and Engineering / v.6 no.1 / pp.26-38 / 2012
  • Java has been increasingly used in programming for real-time systems. However, some of Java's features, such as automatic memory management and dynamic compilation, are harmful to time predictability. If these problems are not addressed, they can fundamentally limit the use of Java for real-time systems, especially for hard real-time systems that require very high time predictability. In this paper, we propose to exploit multicore computing in order to reduce the timing unpredictability caused by dynamic compilation and adaptive optimization. Our goal is to retain performance comparable to that of traditional dynamic compilation while obtaining better time predictability for the Java virtual machine (JVM). We have studied pre-compilation techniques to utilize another core more efficiently, a pre-optimization-on-another-core (PoAC) scheme to replace the adaptive optimization system (AOS) in the Jikes JVM, and counter-based optimization (CBO). Our evaluation reveals that the proposed approaches attain high performance while greatly reducing the variation in execution time for Java applications.
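The paper's PoAC and CBO schemes work inside the Jikes JVM and cannot be reproduced at the application level, but the underlying idea of shifting compilation cost onto a spare core can be loosely illustrated in plain Java: warm up a hot method on a background thread so the JIT has already compiled it before the time-critical call. This is only a conceptual sketch under that assumption, not the paper's mechanism.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical warm-up sketch: exercise a hot code path on a spare core before the
// time-critical phase, so the JIT has already compiled it. This only illustrates the
// general idea of moving compilation cost off the critical path; the paper's PoAC and
// CBO schemes operate inside the JVM itself.
public class WarmupSketch {
    static int hotLoop(int n) {               // stand-in for a hot application method
        int acc = 0;
        for (int i = 0; i < n; i++) acc += i * 31;
        return acc;
    }

    public static void main(String[] args) {
        // Run the hot method many times on a background thread (another core).
        CompletableFuture<Void> warmup = CompletableFuture.runAsync(() -> {
            for (int i = 0; i < 20_000; i++) hotLoop(1_000);
        });
        warmup.join();                        // compilation has likely been triggered

        long t0 = System.nanoTime();
        int r = hotLoop(1_000_000);           // time-critical call, now more predictable
        System.out.printf("critical call took %d us (result %d)%n",
                          (System.nanoTime() - t0) / 1_000, r);
    }
}
```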

A Parallel Processing System for Visual Media Applications (시각매체를 위한 병렬처리 시스템)

  • Lee, Hyung;Park, Jong-Won
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.1A / pp.80-88 / 2002
  • Visual media (image, graphics, and video) processing poses challenges from several perspectives, specifically real-time implementation and scalability. There have been several approaches to obtaining the speedups needed to meet the computing demands of multimedia processing, ranging from media processors to special-purpose implementations, and a variety of parallel processing strategies are adopted in these implementations to achieve the required speedups. We have investigated a parallel processing system for improving the processing speed of visual-media-related applications. The system we propose resembles a pipelined memory system and is built on a multi-access memory system (MAMS) made up of m memory modules and a memory controller that perform parallel memory accesses in a variety of subarray combinations (1×pq, pq×1, and p×q), which improves both cost and control complexity. Facial recognition, Phong shading, and automatic segmentation of moving objects in image sequences have been run on the parallel processing system and achieved substantial processing speed. This paper describes the parallel processing system and its use in these three time-consuming applications.
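The subarray access patterns mentioned above rely on distributing array elements across the m memory modules so that the elements of one access fall into different modules. The following is a simple, hypothetical skewed-storage sketch of that idea; the constants and the address function are assumptions, not the MAMS design from the paper.

```java
// Hypothetical skewed-storage sketch: map pixel (row, col) of a width-W image onto one
// of M memory modules so that consecutive elements of a row or of a column land in
// distinct modules and can be fetched in parallel (width assumed to be a multiple of M).
// The paper's multi-access memory system supports 1xpq, pqx1, and pxq subarray accesses
// with its own, more elaborate address function.
public class SkewedStorageSketch {
    static final int M = 8;                   // number of memory modules (assumed)

    static int module(int row, int col) {
        return (row + col) % M;               // simple diagonal skewing
    }

    static int offset(int row, int col, int width) {
        return (row * width + col) / M;       // word address within the chosen module
    }

    public static void main(String[] args) {
        int width = 16;
        // A 1x8 row segment starting at (3, 4): each element maps to a different module.
        for (int c = 4; c < 12; c++) {
            System.out.printf("(3,%d) -> module %d, offset %d%n",
                              c, module(3, c), offset(3, c, width));
        }
    }
}
```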

MLC-LFU : The Multi-Level Buffer Cache Management Policy for Flash Memory (MLC-LFU : 플래시 메모리를 위한 멀티레벨 버퍼 캐시 관리 정책)

  • Ok, Dong-Seok;Lee, Tae-Hoon;Chung, Ki-Dong
    • Journal of KIISE:Computing Practices and Letters / v.15 no.1 / pp.14-20 / 2009
  • Recently, NAND flash memory has been used not only in portable devices but also in personal computers and server computers. Buffer cache replacement policies designed for hard disks, such as LRU and LFU, are not well suited to NAND flash memory because they do not consider its characteristics. CFLRU and its variants CFLRU/C, CFLRU/E, and DL-CFLRU/E (collectively, CFLRUs) are buffer cache replacement policies that do consider the characteristics of NAND flash memory, but their performance is not better than that of LRD. In this paper, we propose a new buffer cache replacement policy for NAND flash memory that is based on LFU and takes the characteristics of NAND flash memory into account, and we evaluate it in terms of hit ratio and the number of flush operations. The proposed policy shows a better hit ratio and fewer flush operations than the other policies.
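As a rough illustration of a flash-aware, frequency-based eviction policy in the spirit described above (prefer evicting clean, rarely referenced pages so that expensive flash writes are postponed), consider the following sketch; the data structures and tie-breaking rules are assumptions, not the exact MLC-LFU algorithm.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical flash-aware, frequency-based buffer cache sketch. Eviction prefers
// clean pages with low reference counts, because writing a dirty victim back to NAND
// flash is far more expensive than re-reading a clean one. This only illustrates the
// general idea, not the MLC-LFU policy from the paper.
public class FlashAwareLfuSketch {
    static final class Page {
        final int pageNo;
        int refCount = 1;
        boolean dirty = false;
        Page(int pageNo) { this.pageNo = pageNo; }
    }

    private final int capacity;
    private final Map<Integer, Page> cache = new HashMap<>();

    FlashAwareLfuSketch(int capacity) { this.capacity = capacity; }

    void access(int pageNo, boolean write) {
        Page p = cache.get(pageNo);
        if (p == null) {
            if (cache.size() >= capacity) evict();
            p = new Page(pageNo);
            cache.put(pageNo, p);
        } else {
            p.refCount++;
        }
        if (write) p.dirty = true;
    }

    private void evict() {
        // Order victims: clean before dirty, then by lowest reference count.
        Page victim = cache.values().stream()
                .min(Comparator.comparing((Page p) -> p.dirty)
                               .thenComparingInt(p -> p.refCount))
                .orElseThrow();
        if (victim.dirty) {
            System.out.println("flush page " + victim.pageNo + " to flash");
        }
        cache.remove(victim.pageNo);
    }

    public static void main(String[] args) {
        FlashAwareLfuSketch cache = new FlashAwareLfuSketch(2);
        cache.access(1, false);
        cache.access(2, true);
        cache.access(3, false);   // evicts the clean, least-referenced page 1
    }
}
```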

Reconfigurable Integrated Flash Memory Software Architecture with FAT Compatibility (재구성 가능한 FAT 호환 통합 플래시 메모리 소프트웨어 구조)

  • Kim, Yu-Mi;Choi, Yong-Suk;Baek, Seung-Jae;Choi, Jong-Moo
    • Journal of KIISE:Computing Practices and Letters / v.16 no.1 / pp.17-22 / 2010
  • As deployments of Flash memory spread rapidly from tiny USB storage to large DB servers, interoperability becomes an indispensable requirement for Flash memory software architecture. For this purpose, many systems use the conventional FAT file system together with FTL (Flash Translation Layer) software as a de facto standard. However, simply combining the FAT file system with an FTL does not satisfy the diverse requirements of different systems. In this paper, we propose a novel reconfigurable integrated Flash memory software architecture, named INFLAWARE (INtegrated FLAsh softWARE), that supports not only interoperability but also reconfigurability and performance enhancement. Experimental results from a real implementation show that INFLAWARE achieves memory-footprint improvements of up to 27%, with an average of 19%, compared with the conventional FAT and FTL combination. Also, by using the map_destroy technique, it can reduce the response times of various applications by up to 21%, with an average of 10%.
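Reconfigurability of the kind described above is commonly achieved by hiding interchangeable FTL modules behind one narrow interface that the FAT-compatible layer calls. The sketch below illustrates only that general structuring idea; the interface and class names are invented for illustration and are not the INFLAWARE APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a reconfigurable Flash software stack: a FAT-compatible
// file-system layer talks to whichever FTL implementation was chosen at configuration
// time through one narrow interface. All names here are illustrative.
interface FlashTranslationLayer {
    byte[] readSector(int lsn);
    void writeSector(int lsn, byte[] data);
}

// One interchangeable FTL module: a simple page-mapped translation kept in memory.
class PageMappedFtl implements FlashTranslationLayer {
    private final Map<Integer, byte[]> flash = new HashMap<>(); // stands in for NAND pages
    public byte[] readSector(int lsn) { return flash.getOrDefault(lsn, new byte[512]); }
    public void writeSector(int lsn, byte[] data) { flash.put(lsn, data.clone()); }
}

public class ReconfigurableStackSketch {
    public static void main(String[] args) {
        // "Reconfiguration" here is just choosing the FTL behind the same interface.
        FlashTranslationLayer ftl = new PageMappedFtl();
        ftl.writeSector(0, "BOOT".getBytes());
        System.out.println(new String(ftl.readSector(0)).trim());
    }
}
```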

An Embedded Text Index System for Mass Flash Memory (대용량 플래시 메모리를 위한 임베디드 텍스트 인덱스 시스템)

  • Yun, Sang-Hun;Cho, Haeng-Rae
    • Journal of the Korea Society of Computer and Information / v.14 no.6 / pp.1-10 / 2009
  • Flash memory has the advantages of non-volatility, low power consumption, light weight, and high endurance. This makes it suitable as the storage of mobile computing devices such as PMPs (Portable Multimedia Players). A portable device with a large flash memory can store various multimedia data such as video, audio, and images, but typical index systems for mobile computers are inefficient at searching text such as lyrics or titles. In this paper, we propose a new text index system named EMTEX (Embedded Text Index). EMTEX has the following salient features. First, it uses a compression algorithm suitable for embedded systems. Second, if a new insert or delete operation is executed on the base table, EMTEX updates the text index immediately. Third, EMTEX considers the characteristics of flash memory in the design of the insert, delete, and rebuild operations on the text index. Finally, EMTEX is executed as an upper layer of the DBMS and is therefore independent of the underlying DBMS. We evaluate the performance of EMTEX, and the experimental results show that it outperforms conventional index systems such as Oracle Text and FT3.
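The second feature above, immediate index maintenance on every insert or delete against the base table, can be pictured with a plain in-memory inverted index. The sketch below shows only that behavior; EMTEX's compression and flash-aware operations are omitted, and all names are illustrative.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an immediately-updated inverted text index: every insert or
// delete on the base table updates the term -> record-id postings right away, which is
// the update behavior the abstract describes.
public class TextIndexSketch {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    void insert(int recordId, String text) {
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new HashSet<>()).add(recordId);
            }
        }
    }

    void delete(int recordId, String text) {
        for (String term : text.toLowerCase().split("\\W+")) {
            Set<Integer> ids = postings.get(term);
            if (ids != null) {
                ids.remove(recordId);
                if (ids.isEmpty()) postings.remove(term);
            }
        }
    }

    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Set.of());
    }

    public static void main(String[] args) {
        TextIndexSketch index = new TextIndexSketch();
        index.insert(1, "Yesterday all my troubles seemed so far away");
        index.insert(2, "Here comes the sun");
        System.out.println(index.search("sun"));   // [2]
    }
}
```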

Design and Implementation of B-Tree on Flash Memory (플래시 메모리 상에서 B-트리 설계 및 구현)

  • Nam, Jung-Hyun;Park, Dong-Joo
    • Journal of KIISE:Databases / v.34 no.2 / pp.109-118 / 2007
  • Recently, flash memory has been used to store data in mobile computing devices such as PDAs, smart cards, mobile phones, and MP3 players. These devices need index structures like the B-tree to efficiently support operations such as insertion, deletion, and search. The BFTL (B-tree Flash Translation Layer) technique was the first introduced for implementing the B-tree on flash memory. Flash memory has the characteristics that a write operation is more costly than a read operation and that in-place overwriting is impossible; therefore, BFTL focuses on minimizing the number of write operations incurred while building the B-tree. However, we show in this paper that there is still much room for improving the I/O cost of building the B-tree with this method, and that it is impractical because it greatly increases SRAM usage. In this paper, we propose a BOF (B-tree On Flash memory) approach for implementing the B-tree on flash memory efficiently. The core of this approach is to store index units belonging to the same B-tree node in the same sector on flash memory when the buffer used to build the B-tree is replaced. We show that our BOF technique outperforms BFTL and other techniques.
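The core idea above, collecting the index units that belong to one B-tree node and writing them to a single flash sector when the build buffer is replaced, can be sketched as a per-node buffer that is flushed in node-sized groups. The buffer size and record layout below are assumptions, not the BOF paper's format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the buffering idea described above: buffered index units
// (key insertions/deletions destined for B-tree nodes) are grouped per node, so a
// flush writes all units of one node into one flash sector.
public class BofBufferSketch {
    record IndexUnit(int nodeId, int key, boolean delete) {}

    private static final int BUFFER_LIMIT = 64;     // assumed buffer capacity (units)
    private final Map<Integer, List<IndexUnit>> byNode = new HashMap<>();
    private int buffered = 0;

    void add(IndexUnit unit) {
        byNode.computeIfAbsent(unit.nodeId(), id -> new ArrayList<>()).add(unit);
        if (++buffered >= BUFFER_LIMIT) flush();
    }

    private void flush() {
        // One sector write per B-tree node, instead of scattering its index units
        // across many sectors.
        byNode.forEach((nodeId, units) ->
                System.out.printf("write sector for node %d: %d index units%n",
                                  nodeId, units.size()));
        byNode.clear();
        buffered = 0;
    }

    public static void main(String[] args) {
        BofBufferSketch buffer = new BofBufferSketch();
        for (int key = 0; key < 70; key++) {
            buffer.add(new IndexUnit(key % 5, key, false));  // units for 5 B-tree nodes
        }
    }
}
```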

High Throughput Parallel KMP Algorithm Considering CPU-GPU Memory Hierarchy (CPU-GPU 메모리 계층을 고려한 고처리율 병렬 KMP 알고리즘)

  • Park, Soeun;Kim, Daehee;Lee, Myungho;Park, Neungsoo
    • The Transactions of The Korean Institute of Electrical Engineers / v.67 no.5 / pp.656-662 / 2018
  • Pattern matching algorithms are widely used in many application fields such as bioinformatics, intrusion detection, etc. Among the many string matching algorithms, the KMP (Knuth-Morris-Pratt) algorithm is commonly used because of its fast execution time on large texts. However, the processing speed of the KMP algorithm is still limited when the text size increases significantly. In this paper, we propose a high-throughput parallel KMP algorithm that considers the CPU-GPU memory hierarchy, based on OpenCL for GPGPU (General-Purpose computing on Graphics Processing Units). We focus on optimizing the allocation of work-items and work-groups, copying the pattern data and the failure table into local memory, and overlapping data transfer with the string matching operations. The experimental results show that the optimized parallel KMP algorithm runs about 3.6 times faster than the non-optimized parallel KMP algorithm.
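For reference, the baseline algorithm being parallelized is standard KMP: a failure table computed from the pattern lets matching resume without re-scanning the text. The sketch below is the conventional sequential version; the paper's contribution, the OpenCL work-group partitioning and local-memory placement of the pattern and failure table on the GPU, is not shown.

```java
// Standard sequential KMP (Knuth-Morris-Pratt) string matching for reference.
// The paper parallelizes this across GPU work-groups and places the pattern and
// failure table in local memory; only the baseline algorithm is shown here.
public class KmpSketch {
    // failure[i] = length of the longest proper prefix of pattern[0..i] that is
    // also a suffix of it.
    static int[] buildFailureTable(String pattern) {
        int[] failure = new int[pattern.length()];
        int k = 0;
        for (int i = 1; i < pattern.length(); i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) k = failure[k - 1];
            if (pattern.charAt(i) == pattern.charAt(k)) k++;
            failure[i] = k;
        }
        return failure;
    }

    static int indexOf(String text, String pattern) {
        int[] failure = buildFailureTable(pattern);
        int k = 0;
        for (int i = 0; i < text.length(); i++) {
            while (k > 0 && text.charAt(i) != pattern.charAt(k)) k = failure[k - 1];
            if (text.charAt(i) == pattern.charAt(k)) k++;
            if (k == pattern.length()) return i - k + 1;   // match found
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("acgtacgatgacgt", "acga"));  // prints 4
    }
}
```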

Low-latency SAO Architecture and its SIMD Optimization for HEVC Decoder

  • Kim, Yong-Hwan;Kim, Dong-Hyeok;Yi, Joo-Young;Kim, Je-Woo
    • IEIE Transactions on Smart Processing and Computing / v.3 no.1 / pp.1-9 / 2014
  • This paper proposes a low-latency Sample Adaptive Offset (SAO) filter architecture and a Single Instruction Multiple Data (SIMD) optimization scheme to achieve fast High Efficiency Video Coding (HEVC) decoding in a multi-core environment. According to the HEVC standard and its Test Model (HM), the SAO operation is performed only at the picture level. Most real-time decoders, however, execute their sub-modules on a Coding Tree Unit (CTU) basis to reduce latency and memory bandwidth. The proposed low-latency SAO architecture has the following advantages over picture-based SAO: 1) significantly lower memory requirements, and 2) a low-latency property that enables efficient pipelined multi-core decoding. In addition, SIMD optimization of SAO filtering can reduce the SAO filtering time significantly. The simulation results show that the proposed low-latency SAO architecture, while using significantly less memory, yields a decoding time similar to picture-based SAO in single-core decoding. Furthermore, the SIMD optimization scheme speeds up SAO filtering by approximately 509% and increases the total decoding speed by approximately 7% compared to the existing look-up table approach of HM.
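To make the SAO operation concrete, the sketch below applies HEVC's band-offset mode in scalar form: the 8-bit sample range is split into 32 bands and four consecutive bands receive signalled offsets. It is a simplified illustration; edge-offset mode, higher bit depths, and the paper's SIMD and CTU-level pipelining are all omitted.

```java
// Simplified scalar sketch of HEVC SAO band-offset filtering: the 8-bit sample range
// is split into 32 bands of width 8, and four consecutive bands (starting at a
// signalled band position) each receive a signalled offset.
public class SaoBandOffsetSketch {
    static void applyBandOffset(int[] samples, int bandPosition, int[] offsets) {
        for (int i = 0; i < samples.length; i++) {
            int band = samples[i] >> 3;                    // 256 / 32 = 8 values per band
            int idx = band - bandPosition;
            if (idx >= 0 && idx < 4) {                     // one of the four active bands
                samples[i] = Math.min(255, Math.max(0, samples[i] + offsets[idx]));
            }
        }
    }

    public static void main(String[] args) {
        int[] row = {16, 40, 90, 200};
        applyBandOffset(row, 2, new int[]{3, -2, 1, 0});   // bands 2..5 get offsets
        System.out.println(java.util.Arrays.toString(row));
    }
}
```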

Performance Analysis of Flash Translation Layer using TPC-C Benchmark (플래시 변환 계층에 대한 TPC-C 벤치마크를 통한 성능분석)

  • Park, Sung-Hwan;Jang, Ju-Yeon;Suh, Young-Ju;Park, Won-Joo;Park, Sang-Won
    • Journal of KIISE:Computing Practices and Letters / v.14 no.2 / pp.201-205 / 2008
  • Flash memory is widely used as the main storage of embedded devices, and as its capacity grows it is also being adopted as database storage. We ran the TPC-C benchmark on various FTL algorithms. The database shows poor performance on flash memory because its I/O requests are almost completely random. In this paper, we show that the performance of all existing FTL algorithms is very poor under this workload; in particular, the FTL algorithm known to work well on small mobile devices shows the worst performance. In addition, chip interleaving, a technique for improving the performance of flash memory, does not work well here. Based on these results, we explain why a new FTL algorithm is needed and suggest a direction for databases on flash memory in the future.
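The random-write sensitivity discussed above follows from how an FTL remaps logical pages on every write. As a hypothetical illustration (not one of the FTL algorithms evaluated in the paper), a minimal page-level mapping with out-of-place updates looks like this:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical page-level FTL sketch: logical page numbers are remapped to free
// physical pages on every write (out-of-place update), and the old physical page is
// only marked invalid for later garbage collection. Random writes, as generated by
// TPC-C, keep invalidating pages all over the flash, which is one reason FTL
// performance degrades under such workloads.
public class PageFtlSketch {
    private final Map<Integer, Integer> l2p = new HashMap<>(); // logical -> physical page
    private final boolean[] valid;
    private int nextFreePage = 0;

    PageFtlSketch(int physicalPages) { valid = new boolean[physicalPages]; }

    void write(int logicalPage) {
        Integer old = l2p.get(logicalPage);
        if (old != null) valid[old] = false;        // old copy becomes garbage
        int phys = nextFreePage++;                  // garbage collection not modeled
        valid[phys] = true;
        l2p.put(logicalPage, phys);
    }

    int read(int logicalPage) {
        return l2p.getOrDefault(logicalPage, -1);   // physical location, -1 if unwritten
    }

    public static void main(String[] args) {
        PageFtlSketch ftl = new PageFtlSketch(1024);
        ftl.write(7);
        ftl.write(7);                               // second write goes out-of-place
        System.out.println("logical 7 now at physical page " + ftl.read(7)); // 1
    }
}
```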