• Title/Summary/Keyword: in-memory computing

A Low-Power LSI Design of Japanese Word Recognition System

  • Yoshizawa, Shingo;Miyanaga, Yoshikazu;Wada, Naoya;Yoshida, Norinobu
    • Proceedings of the IEEK Conference
    • /
    • 2002.07a
    • /
    • pp.98-101
    • /
    • 2002
  • This paper reports a parallel architecture for an HMM-based speech recognition system aimed at low-power LSI design. The proposed architecture calculates the output probability of a continuous HMM (CHMM) using concurrent and pipelined processing, which reduces memory access and achieves high computing efficiency. The novel point is the efficient use of register arrays, which reduces memory access considerably compared with conventional methods. With this technique, the implemented system achieves real-time response at a lower clock frequency in a medium-size vocabulary recognition task (100-1000 words).

Design of High-speed Pointer Switching Fabric (초고속 포인터 스위칭 패브릭의 설계)

  • Ryu, Kyoung-Sook;Choe, Byeong-Seog
    • Journal of Internet Computing and Services
    • /
    • v.8 no.5
    • /
    • pp.161-170
    • /
    • 2007
  • The proposed switch separates the data plane from the switching plane, so it can store packet data and switch memory address pointers in parallel while also switching variable-length IP packets. The architecture does not require the complicated arbitration algorithms of VOQ, and it is designed to support the QoS of a generic output-queued switch as well as an input-queued one. Simulation results show that the proposed architecture has a lower average packet delay than a shared-memory architecture and maintains a steady average packet delay as the switch size increases.

Low-power Buffer Cache Management for Mixed HDD and SSD Storage Systems (HDD와 SSD의 혼합형 저장 시스템을 위한 절전형 버퍼 캐쉬 관리)

  • Kang, Hyo-Jung;Park, Jun-Seok;Koh, Kern;Bahn, Hyo-Kyung
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.4
    • /
    • pp.462-466
    • /
    • 2010
  • A new buffer cache management scheme that aims at reducing power consumption in mixed HDD and NAND flash memory storage systems is presented. The proposed scheme reduces power consumption by considering the different energy-consumption rates of the storage devices, the I/O operation type (read or write), and the reference potential of cached blocks in terms of both recency and frequency. Simulation shows that the proposed scheme reduces power consumption by 18.0% on average and by up to 58.9%.
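
The factors named in the abstract above (device energy rate, operation type, recency, frequency) could be combined into an eviction policy roughly as follows. This is only a sketch: the energy table, the `Block` fields, and the scoring formula are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical per-operation energy costs in arbitrary units (not from the paper).
ENERGY = {
    ("hdd", "read"): 10.0, ("hdd", "write"): 12.0,
    ("ssd", "read"): 1.0, ("ssd", "write"): 3.0,
}

@dataclass
class Block:
    block_id: int
    device: str       # "hdd" or "ssd"
    dirty: bool       # a dirty block must be written back when evicted
    last_access: int  # logical time of the most recent reference
    hits: int         # number of references so far

def keep_value(block, now):
    """Assumed score: reference potential times the energy that eviction would cost."""
    recency = 1.0 / (1 + now - block.last_access)
    potential = recency * block.hits          # combines recency and frequency
    cost = ENERGY[(block.device, "read")]     # energy to re-read the block later
    if block.dirty:
        cost += ENERGY[(block.device, "write")]  # energy to write it back now
    return potential * cost

def choose_victim(blocks, now):
    """Evict the block whose loss is expected to cost the least energy."""
    return min(blocks, key=lambda b: keep_value(b, now))
```

With equal recency and frequency, this sketch evicts an SSD-backed block before an HDD-backed one, because re-reading it is assumed to be cheaper.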

Task failure resilience technique for improving the performance of MapReduce in Hadoop

  • Kavitha, C;Anita, X
    • ETRI Journal
    • /
    • v.42 no.5
    • /
    • pp.748-760
    • /
    • 2020
  • MapReduce is a framework that can process huge datasets in parallel and distributed computing environments. However, a single machine failure during the runtime of MapReduce tasks can increase completion time by 50%. MapReduce handles task failures by restarting the failed task and re-computing all input data from scratch, regardless of how much data had already been processed. To solve this issue, we need the computed key-value pairs to persist in a storage system to avoid re-computing them during the restarting process. In this paper, the task failure resilience (TFR) technique is proposed, which allows the execution of a failed task to continue from the point it was interrupted without having to redo all the work. Amazon ElastiCache for Redis is used as a non-volatile cache for the key-value pairs. We measured the performance of TFR by running different Hadoop benchmarking suites. TFR was implemented using the Hadoop software framework, and the experimental results showed significant performance improvements when compared with the performance of the default Hadoop implementation.
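
The resume-from-interruption idea in the abstract above can be illustrated with a toy map task. This is a minimal sketch under stated assumptions: a plain Python dict stands in for the external key-value store (the paper uses Amazon ElastiCache for Redis), and `run_map_task`, `SimulatedFailure`, and the `fail_at` parameter are hypothetical names introduced only for illustration.

```python
class SimulatedFailure(Exception):
    """Raised to imitate a task crash in this sketch."""

def run_map_task(task_id, records, store, fail_at=None):
    """Word-count map task that resumes from the last persisted offset.

    `store` is a stand-in for a persistent key-value cache; it keeps the
    emitted key-value pairs and the index of the next unprocessed record.
    """
    state = store.setdefault(task_id, {"offset": 0, "pairs": []})
    for i in range(state["offset"], len(records)):
        if fail_at is not None and i == fail_at:
            raise SimulatedFailure(f"task {task_id} failed at record {i}")
        emitted = [(word, 1) for word in records[i].split()]
        # Persist the output and the new offset together for this record.
        state["pairs"].extend(emitted)
        state["offset"] = i + 1
    return state["pairs"]

store = {}
records = ["a b", "b c", "c d"]
try:
    run_map_task("m0", records, store, fail_at=2)  # crashes before record 2
except SimulatedFailure:
    pass
pairs = run_map_task("m0", records, store)  # restart processes only record 2
```

On restart, the first two records are not recomputed; only the record at the saved offset and later ones are processed.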

Deep recurrent neural networks with word embeddings for Urdu named entity recognition

  • Khan, Wahab;Daud, Ali;Alotaibi, Fahd;Aljohani, Naif;Arafat, Sachi
    • ETRI Journal
    • /
    • v.42 no.1
    • /
    • pp.90-100
    • /
    • 2020
  • Named entity recognition (NER) continues to be an important task in natural language processing because it is featured as a subtask and/or subproblem in information extraction and machine translation. In Urdu language processing, it is a very difficult task. This paper proposes various deep recurrent neural network (DRNN) learning models with word embedding. Experimental results demonstrate that they improve upon current state-of-the-art NER approaches for Urdu. The DRNN models evaluated include forward and bidirectional extensions of the long short-term memory and back propagation through time approaches. The proposed models consider both language-dependent features, such as part-of-speech tags, and language-independent features, such as the "context windows" of words. The effectiveness of the DRNN models with word embedding for NER in Urdu is demonstrated using three datasets. The results reveal that the proposed approach significantly outperforms previous conditional random field and artificial neural network approaches. The best f-measure values achieved on the three benchmark datasets using the proposed deep learning approaches are 81.1%, 79.94%, and 63.21%, respectively.

User Mobility Model Based Computation Offloading Decision for Mobile Cloud

  • Lee, Kilho;Shin, Insik
    • Journal of Computing Science and Engineering
    • /
    • v.9 no.3
    • /
    • pp.155-162
    • /
    • 2015
  • The last decade has seen a rapid growth in the use of mobile devices all over the world. With an increasing use of mobile devices, mobile applications are becoming more diverse and complex, demanding more computational resources. However, mobile devices are typically resource-limited (e.g., a slower CPU, a smaller memory) for a variety of reasons. Mobile users will be capable of running applications with heavy computation if they can offload some of their computations to other places, such as desktop or server machines. However, mobile users are typically subject to dynamically changing network environments, particularly due to user mobility. This makes it hard to make good offloading decisions in mobile environments. In general, users' mobility can provide some hints for upcoming changes to network environments. Motivated by this, we propose a mobility model of each individual user that takes advantage of the regularity of his/her mobility pattern, and develop an offloading decision-making technique based on the mobility model. We evaluate our technique through trace-based simulation with real log data traces from 14 Android users. Our evaluation results show that the proposed technique can help boost the performance of mobile devices in terms of response time and energy consumption when users are highly mobile.

An Address Translation Technique for Large NAND Flash Memory using Page Level Mapping (페이지 단위 매핑 기반 대용량 NAND플래시를 위한 주소변환기법)

  • Seo, Hyun-Min;Kwon, Oh-Hoon;Park, Jun-Seok;Koh, Kern
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.3
    • /
    • pp.371-375
    • /
    • 2010
  • An SSD is a storage medium based on NAND flash memory. Because of its short latency, low power consumption, and resistance to shock, it is used not only in PCs but also in server computers. Most SSDs use an FTL to overcome the erase-before-overwrite characteristic of NAND flash. There are several types of FTL, and the page-mapped FTL performs better than the others, but its usefulness is limited by the large memory footprint of its mapping table. For example, a 64GB MLC SSD requires 64MB of memory for the mapping table alone. In this paper, we propose a novel caching scheme for the mapping table. Using the mapping table metadata, we construct a fully associative cache and translate addresses in O(1) time. The simulation results show a hit ratio of more than 80% with a 32KB cache and 90% with a 512KB cache. The overall memory footprint was only 1.9% of 64MB. The time overhead of a cache miss was measured at less than 2% for most workloads.
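
The 64MB figure in the abstract above is consistent with a simple size calculation for a page-level mapping table. The sketch below assumes 4KB pages and 4-byte mapping entries; neither value is stated in the abstract, so treat both as illustrative assumptions.

```python
GiB = 1024 ** 3
MiB = 1024 ** 2

def mapping_table_bytes(capacity_bytes, page_bytes=4096, entry_bytes=4):
    """Size of a page-level mapping table: one entry per flash page.

    page_bytes and entry_bytes are assumed values, not taken from the paper.
    """
    pages = capacity_bytes // page_bytes
    return pages * entry_bytes

table = mapping_table_bytes(64 * GiB)
print(table // MiB)  # mapping table size in MiB under these assumptions
```

Under these assumptions a 64GB device has 16,777,216 pages, and the table occupies 64MB, which is why caching only part of the table is attractive.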

Design and Implementation of a Query Processor for Real-Time Main Memory Database Systems (실시간 주기억장치 데이타베이스 시스템을 위한 질의 처리기의 설계 및 구현)

  • Kim, Gyoung-Bae;Bae, Hae-Young
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.6 no.2
    • /
    • pp.113-119
    • /
    • 2000
  • In this paper, we design and implement a query processor for real-time main memory database systems that reflects the characteristics of main memory database systems and satisfies timing constraints. The proposed query processor manages real-time data with timing constraints by exploiting a meta database. It supports a CLI for writing application programs, as well as an extended CLI and a stored CLI. The former can express information about real-time transactions; the latter is designed to support frequently processed transactions. The proposed query processor is implemented as the query processor of a real-time database management system. We present performance evaluation results showing that the ratio of transactions that meet their deadlines increases owing to the query processing ability of the system and the efficient management of real-time data.

Efficient Parallel Block-layered Nonbinary Quasi-cyclic Low-density Parity-check Decoding on a GPU

  • Thi, Huyen Pham;Lee, Hanho
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.6 no.3
    • /
    • pp.210-219
    • /
    • 2017
  • This paper proposes a modified min-max algorithm (MMMA) for nonbinary quasi-cyclic low-density parity-check (NB-QC-LDPC) codes and an efficient parallel block-layered decoder architecture corresponding to the algorithm on a graphics processing unit (GPU) platform. The algorithm removes multiplications over the Galois field (GF) in the merger step to reduce decoding latency without any performance loss. The decoding implementation on a GPU for NB-QC-LDPC codes achieves improvements in both flexibility and scalability. To perform the decoding on the GPU, data and memory structures suitable for parallel computing are designed. The implementation results for NB-QC-LDPC codes over GF(32) and GF(64) demonstrate that the parallel block-layered decoding on a GPU accelerates the decoding process to provide a faster decoding runtime, and obtains a higher coding gain under a low $10^{-10}$ bit error rate and low $10^{-7}$ frame error rate, compared to existing methods.

An Application of the Impedance Boundary Condition to Microwave Cavity Analysis using Vector Finite Element Method

  • Shin, Pan-Seok;Changyul Cheon;Sheppard J.Salon
    • KIEE International Transaction on Electrical Machinery and Energy Conversion Systems
    • /
    • v.3B no.1
    • /
    • pp.16-22
    • /
    • 2003
  • This paper presents an application of an impedance boundary condition to 3D vector finite element analysis of a multi-port cylindrical microwave cavity using Snell's law. Compared with the conventional finite element method (FEM), this method reduces both memory usage and computing time. To verify the method, a high-permittivity scatterer in free space is analyzed and compared with the results of the conventional FEM. In addition, the method has been used to analyze several types of cavities, including a water load, to demonstrate the validity and accuracy of the program.