• Title/Summary/Keyword: in-memory computing

Evaluation of Recurrent Neural Network Variants for Person Re-identification

  • Le, Cuong Vo;Tuan, Nghia Nguyen;Hong, Quan Nguyen;Lee, Hyuk-Jae
    • IEIE Transactions on Smart Processing and Computing / v.6 no.3 / pp.193-199 / 2017
  • Instead of using only spatial features from a single frame for person re-identification, a combination of spatial and temporal factors boosts the performance of the system. A recurrent neural network (RNN) is effective in generating highly discriminative sequence-level human representations. In this work, we implement a vanilla RNN, three Long Short-Term Memory (LSTM) network variants, and the Gated Recurrent Unit (GRU) on the Caffe deep learning framework, and we then conduct experiments to compare their performance in terms of model size and accuracy for person re-identification. We propose the GRU as the optimal choice, since the experimental results show that it achieves the highest accuracy despite having fewer parameters than the other variants.
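
As a rough illustration of why the GRU is smaller than the LSTM, the sketch below counts per-cell parameters from the standard gate structure (one gate set for a vanilla RNN, three for a GRU, four for an LSTM); the layer sizes are hypothetical and not taken from the paper.

```python
# Rough per-cell parameter counts for the recurrent variants compared above.
# Gate counts follow the standard formulations (1 for vanilla RNN, 3 for GRU,
# 4 for LSTM); the input/hidden sizes are hypothetical, not from the paper.

def recurrent_cell_params(input_size: int, hidden_size: int, num_gates: int) -> int:
    # Each gate has an input weight matrix, a recurrent weight matrix, and a bias.
    return num_gates * (input_size * hidden_size + hidden_size * hidden_size + hidden_size)

if __name__ == "__main__":
    input_size, hidden_size = 128, 128  # hypothetical sizes
    for name, gates in [("RNN", 1), ("GRU", 3), ("LSTM", 4)]:
        print(f"{name:4s}: {recurrent_cell_params(input_size, hidden_size, gates):,} parameters")
```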

Analysis of Thermal Flow Field Using Equal Order Linear Finite Element and Fractional Step Method (동차선형 유한요소와 Fractional Step방법을 이용한 열유동장의 해석)

  • ;;Yoo, Jung Yul
    • Transactions of the Korean Society of Mechanical Engineers / v.19 no.10 / pp.2667-2677 / 1995
  • A new numerical algorithm using equal-order linear finite elements and the fractional step method has been developed that is capable of analyzing unsteady fluid flow and heat transfer problems. The Streamline Upwind Petrov-Galerkin (SUPG) method is used for the weighted residual formulation of the Navier-Stokes equations. It is shown that the fractional step method, in which the pressure term is split from the momentum equation, reduces computer memory and computing time. In addition, since the pressure equation is derived without any approximation procedure, unlike in previously developed FEM codes based on the SIMPLE algorithm, the present numerical algorithm gives more accurate results. The algorithm has been applied to the well-known benchmark problems associated with steady flow and heat transfer, and proves to be more efficient and accurate.
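
For readers unfamiliar with the splitting, a generic first-order fractional-step (projection) scheme is sketched below; the notation is generic and not necessarily the exact discretization used in the paper.

```latex
% Generic fractional-step (projection) splitting: the pressure is decoupled
% from the momentum equation, so only a scalar Poisson problem must be solved.
\begin{align*}
  \frac{\tilde{\mathbf{u}} - \mathbf{u}^{n}}{\Delta t}
    &= -(\mathbf{u}^{n}\cdot\nabla)\mathbf{u}^{n} + \nu\nabla^{2}\mathbf{u}^{n}
    && \text{(momentum step without pressure)} \\
  \nabla^{2} p^{\,n+1}
    &= \frac{1}{\Delta t}\,\nabla\cdot\tilde{\mathbf{u}}
    && \text{(pressure Poisson equation)} \\
  \mathbf{u}^{n+1}
    &= \tilde{\mathbf{u}} - \Delta t\,\nabla p^{\,n+1}
    && \text{(correction, giving } \nabla\cdot\mathbf{u}^{n+1}=0\text{)}
\end{align*}
```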

Optimal control for voltage and reactive power using piecewise method (분할수법을 이용한 전압무효전력의 최적제어)

  • 유석구;임화영
    • 전기의세계 / v.31 no.5 / pp.375-382 / 1982
  • The optimal control of voltage and reactive power in a large system requires a large amount of complicated calculation. If a large power system is controlled by a centralized control scheme, the necessary computing time, memory requirements, and data transmission channels increase exponentially, and computer control of the system becomes difficult. The piecewise method, which aims to reduce the difficulties of the centralized scheme, decomposes a large power system into several subsystems, each of which is controlled by a local computer, while the control efforts of the subsystems are coordinated by a central computer. Unless sufficient coordination is made between subsystems, the control quality may become very poor. This paper describes how the piecewise method can be applied to the optimal control of voltage and reactive power in a large system, and presents an effective calculation algorithm for the solution of the problem. A numerical example for a model system is presented.
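
The decomposition-coordination idea can be illustrated with a deliberately tiny toy example (not the paper's formulation): each "subsystem" solves a local quadratic problem in closed form, and a "central computer" repeatedly updates the shared boundary value. All costs, weights, and the coordination rule below are hypothetical.

```python
# Toy illustration of decomposition-coordination, not the paper's algorithm:
# two subsystems each minimize a local quadratic cost in their own control
# variable, while a central computer iterates on a shared boundary quantity.

def local_optimum(a: float, b: float, coord: float, weight: float) -> float:
    # Closed-form minimizer of a*(x - b)^2 + weight*(x - coord)^2.
    return (a * b + weight * coord) / (a + weight)

def coordinate(num_iters: int = 50, weight: float = 1.0) -> float:
    coord = 0.0
    for _ in range(num_iters):
        x1 = local_optimum(a=2.0, b=1.00, coord=coord, weight=weight)  # subsystem 1
        x2 = local_optimum(a=1.0, b=0.95, coord=coord, weight=weight)  # subsystem 2
        coord = 0.5 * (x1 + x2)  # central computer updates the shared value
    return coord

if __name__ == "__main__":
    print(f"coordinated boundary value ~ {coordinate():.4f}")
```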

Algorithm for optimum operation of large-scale systems by the mathematical programming (수리계획법에 의한 대형시스템의 최적운용 앨고리즘)

  • 박영문;이봉용;백영식;김영창;김건중;김중훈;양원영
    • 전기의세계 / v.30 no.6 / pp.375-385 / 1981
  • New algorithms are derived for nonlinear programming problems characterized by large numbers of variables and by equality and inequality constraints. The algorithms are based on the introduction of the Dependent-Variable-Elimination method, the Independent-Variable-Reduction method, the Optimally-Ordered-Triangular-Factorization method, the Equality-Inequality-Sequential-Satisfaction method, etc. For a case study relating to the optimal determination of load flow in a 10-bus, 13-line sample power system, several approaches are undertaken, such as SUMT, the Lagrange multiplier method, and sequential applications of linear and quadratic programming. For applying the linear programming method, the conventional simplex algorithm is modified into a large-system-oriented one by introducing the Two-Phase method and the Variable-Upper-Bounding method, resulting in remarkable savings in memory requirements and computing time. The case study shows the validity and effectiveness of the algorithms presented herein.
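
In modern notation, the class of problems addressed is the standard equality- and inequality-constrained nonlinear program sketched below; the partition of the variables into dependent and independent parts mirrors the Dependent-Variable-Elimination and Independent-Variable-Reduction ideas, though the symbols here are generic rather than the paper's.

```latex
% Generic large-scale nonlinear program with the variables partitioned into
% dependent (u) and independent (v) parts; the equality constraints are used
% to eliminate u so that the search proceeds over v only.
\begin{align*}
  \min_{x = (u,\,v)} \quad & f(u, v) \\
  \text{subject to} \quad & g(u, v) = 0
      && \text{(e.g., load-flow equations; solved for } u \text{ given } v\text{)} \\
                          & h(u, v) \le 0
      && \text{(operating limits, satisfied sequentially)}
\end{align*}
```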

Design of a middleware for compound context-awareness on sensor-based mobile environments

  • Sung, Nak-Myoung;Rhee, Yunseok
    • Journal of the Korea Society of Computer and Information / v.21 no.2 / pp.25-32 / 2016
  • In this paper, we design a middleware for context awareness that provides compound contexts from diverse sensors on a mobile device. Until now, most context-aware application developers have taken responsibility for context processing from the sensing data themselves. Such application-level context processing causes heavily redundant data processing and leads to significant waste of energy as well as computing resources. In the proposed scheme, we define primitive and compound context maps, which consist of the relevant sensors and features. Based on these context definitions, each application requests a context of interest from the middleware, and thus similar context-aware applications inherently share context information and processing within the middleware. We show that the proposed scheme significantly reduces CPU, memory, and battery usage, and that the performance gain increases further when multiple applications that need similar contexts are running.
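
The sharing idea can be sketched as follows: applications subscribe to a named compound context, and the middleware evaluates each primitive context at most once per sensing cycle and reuses it for every subscriber. All names, sensors, and values below are hypothetical, not taken from the paper.

```python
# Hypothetical sketch of context sharing in a middleware: each primitive
# context (a sensor feature) is computed once per cycle and reused by every
# application that subscribed to a compound context built on it.

from typing import Callable, Dict, List

class ContextMiddleware:
    def __init__(self) -> None:
        self.primitives: Dict[str, Callable[[], float]] = {}   # feature extractors
        self.compounds: Dict[str, List[str]] = {}              # compound -> primitives
        self.subscribers: Dict[str, List[Callable[[dict], None]]] = {}

    def define_primitive(self, name: str, extractor: Callable[[], float]) -> None:
        self.primitives[name] = extractor

    def define_compound(self, name: str, primitive_names: List[str]) -> None:
        self.compounds[name] = primitive_names

    def subscribe(self, compound: str, callback: Callable[[dict], None]) -> None:
        self.subscribers.setdefault(compound, []).append(callback)

    def sensing_cycle(self) -> None:
        cache: Dict[str, float] = {}  # each primitive is evaluated at most once
        for compound, callbacks in self.subscribers.items():
            values = {}
            for p in self.compounds[compound]:
                if p not in cache:
                    cache[p] = self.primitives[p]()
                values[p] = cache[p]
            for cb in callbacks:
                cb(values)

# usage sketch: two applications share the same compound context
mw = ContextMiddleware()
mw.define_primitive("accel_variance", lambda: 0.12)
mw.define_primitive("ambient_noise_db", lambda: 41.0)
mw.define_compound("user_activity", ["accel_variance", "ambient_noise_db"])
mw.subscribe("user_activity", lambda ctx: print("app A:", ctx))
mw.subscribe("user_activity", lambda ctx: print("app B:", ctx))
mw.sensing_cycle()
```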

A New Upper Bound for Two-Dimensional Guillotine Cutting Problem (2차원 길로틴 절단문제를 위한 새로운 상한)

  • 윤기섭;지영근;강맹규
    • Journal of Korean Society of Industrial and Systems Engineering / v.24 no.62 / pp.21-32 / 2001
  • The two-dimensional guillotine cutting problem is to maximize the sum of the profits of pieces cut from one stock rectangle, and it is widely applied in industry. Branch-and-bound methods for this problem use several upper bounds in a complementary manner (the Gilmore and Gomory [8] two-dimensional knapsack function, the Hifi and Zissimopoulos [10] method using a one-dimensional knapsack problem, etc.) to reduce the number of searched nodes. These upper bounds have the shortcoming that they do not consider the bound and the layout of pieces simultaneously. In this paper, we propose an efficient upper bound which complements this shortcoming of the existing upper bounds. The proposed upper bound needs less memory space and computing time. Computational results show that the proposed upper bound contributes significantly to reducing the computation time and the number of nodes searched in the tree.
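
To make the knapsack-style bounding concrete, the sketch below relaxes the geometry entirely and solves an unbounded one-dimensional knapsack over piece areas; this is a simple member of the bound family discussed above, not the new upper bound proposed in the paper, and the instance data are hypothetical.

```python
# A simple relaxation-style upper bound (not the paper's new bound): drop the
# geometric layout and solve an unbounded knapsack over piece areas, which can
# never underestimate the best achievable guillotine cutting profit.

def area_knapsack_upper_bound(stock_w: int, stock_h: int,
                              pieces: list[tuple[int, int, int]]) -> int:
    """pieces: (width, height, profit); the capacity is the stock area."""
    capacity = stock_w * stock_h
    best = [0] * (capacity + 1)
    for w, h, profit in pieces:
        area = w * h
        for c in range(area, capacity + 1):   # unbounded knapsack DP
            best[c] = max(best[c], best[c - area] + profit)
    return best[capacity]

if __name__ == "__main__":
    # hypothetical instance: 10x10 stock, pieces given as (w, h, profit)
    print(area_knapsack_upper_bound(10, 10, [(4, 3, 7), (5, 5, 12), (2, 2, 3)]))
```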

Neural Network Model Compression Algorithms for Image Classification in Embedded Systems (임베디드 시스템에서의 객체 분류를 위한 인공 신경망 경량화 연구)

  • Shin, Heejung;Oh, Hyondong
    • The Journal of Korea Robotics Society / v.17 no.2 / pp.133-141 / 2022
  • This paper introduces model compression algorithms that make a deep neural network smaller and faster for embedded systems. Model compression algorithms can be largely categorized into pruning, quantization, and knowledge distillation. In this study, gradual pruning, quantization-aware training, and knowledge distillation that learns the activation boundaries in the hidden layers of the teacher network are integrated. As a large deep neural network is compressed and accelerated by these algorithms, embedded computing boards can run it much faster with less memory usage while preserving reasonable accuracy. To evaluate the performance of the compressed neural networks, we measure the size, latency, and accuracy of the deep neural network DenseNet201 for image classification on the CIFAR-10 dataset on the NVIDIA Jetson Xavier.
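
As a concrete illustration of one of the three ingredients, a minimal soft-label knowledge-distillation loss in PyTorch is sketched below; note that the paper's variant distills the activation boundaries of the teacher's hidden layers, which differs from this plain logit-matching form, and the temperature and weighting are assumed values.

```python
# Minimal soft-label knowledge-distillation loss (illustrative only; the paper
# uses activation-boundary distillation on hidden layers, which is not shown).

import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 4.0,
                      alpha: float = 0.7) -> torch.Tensor:
    # Soften both output distributions and match them with KL divergence.
    soft_targets = F.softmax(teacher_logits / temperature, dim=1)
    soft_student = F.log_softmax(student_logits / temperature, dim=1)
    kd = F.kl_div(soft_student, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)  # ordinary hard-label loss
    return alpha * kd + (1.0 - alpha) * ce

# usage sketch with random tensors standing in for real model outputs
student = torch.randn(8, 10)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(distillation_loss(student, teacher, labels))
```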

Technical Trends in On-device Small Language Model Technology Development (온디바이스 소형언어모델 기술개발 동향)

  • G. Kim;K. Yoon;R. Kim;J. H. Ryu;S. C. Kim
    • Electronics and Telecommunications Trends / v.39 no.4 / pp.82-92 / 2024
  • This paper introduces technological development trends in on-device small language models (SLMs). Large language models (LLMs) based on the transformer architecture have gained global attention with the emergence of ChatGPT, providing detailed and sophisticated responses across various knowledge domains and thereby increasing their impact across society. While major global tech companies continue to announce new LLMs or enhance their capabilities, the development of SLMs, which are lightweight versions of LLMs, is progressing rapidly. SLMs have the advantage of being able to run as on-device AI on smartphones or edge devices with limited memory and computing resources, enabling their application in various fields from a commercialization perspective. This paper examines the technical features of SLM development, lightweight model technologies, semiconductor technology trends for on-device AI, and potential applications across various industries.

The Elevation of Efficacy Identifying Pituitary Tissue Abnormalities within Brain Images by Employing Memory Contrast Learning Techniques

  • S. SINDHU;N. VIJAYALAKSHMI
    • Journal of Applied Mathematics & Informatics / v.42 no.4 / pp.931-943 / 2024
  • Accurately identifying brain tumors is crucial for precise diagnosis and treatment planning in medical imaging. This study presents a novel approach that uses cutting-edge image processing techniques to automatically segment brain tumors with the Pyramid Network algorithm. This technique accurately and robustly delineates tumor borders in MRI images. Our strategy incorporates special algorithms that efficiently address problems such as tumor heterogeneity and fluctuations in size and shape. An assessment using the RESECT dataset confirms the validity and reliability of the method and yields promising results in terms of accuracy and computing efficiency. This method holds a great deal of promise for helping physicians accurately identify tumors and assess the efficacy of treatments, which could lead to higher standards of care in the field of neuro-oncology.
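
The abstract reports accuracy on the RESECT dataset without naming a specific metric; the Dice coefficient sketched below is one common overlap measure for evaluating tumor segmentations, offered only as an illustrative, hypothetical snippet rather than the paper's evaluation code.

```python
# Dice overlap between a predicted and a reference binary segmentation mask;
# a common segmentation accuracy measure, shown here with toy data only.

import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-8) -> float:
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return (2.0 * intersection + eps) / (pred.sum() + truth.sum() + eps)

# usage sketch with toy binary masks
pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
print(f"Dice = {dice_coefficient(pred, truth):.3f}")
```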

A Study on GPU-based Iterative ML-EM Reconstruction Algorithm for Emission Computed Tomographic Imaging Systems (방출단층촬영 시스템을 위한 GPU 기반 반복적 기댓값 최대화 재구성 알고리즘 연구)

  • Ha, Woo-Seok;Kim, Soo-Mee;Park, Min-Jae;Lee, Dong-Soo;Lee, Jae-Sung
    • Nuclear Medicine and Molecular Imaging / v.43 no.5 / pp.459-467 / 2009
  • Purpose: Maximum likelihood-expectation maximization (ML-EM) is a statistical reconstruction algorithm derived from a probabilistic model of the emission and detection processes. Although ML-EM has many advantages in accuracy and utility, its use is limited by the computational burden of iterative processing on a CPU (central processing unit). In this study, we developed a parallel computing technique on a GPU (graphics processing unit) for the ML-EM algorithm. Materials and Methods: Using a GeForce 9800 GTX+ graphics card and NVIDIA's CUDA (Compute Unified Device Architecture) technology, the projection and backprojection in the ML-EM algorithm were parallelized. The computation times for the projection, the error computation between measured and estimated data, and the backprojection within one iteration were measured. The total time included the latency of data transmission between RAM and GPU memory. Results: The total computation times of the CPU- and GPU-based ML-EM with 32 iterations were 3.83 and 0.26 sec, respectively; in this case, the computing speed was improved about 15 times on the GPU. When the number of iterations was increased to 1,024, the CPU- and GPU-based computations took 18 min and 8 sec in total, respectively. The improvement was about 135 times, caused by the slowdown of the CPU-based computation after a certain number of iterations, whereas the GPU-based computation showed very little variation in time per iteration owing to the use of shared memory. Conclusion: The GPU-based parallel computation for ML-EM significantly improved the computing speed and stability. The developed GPU-based ML-EM algorithm can easily be modified for other imaging geometries.
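
For reference, the standard ML-EM image update that the projection and backprojection steps implement is shown below; the notation is generic rather than the paper's, with y_i the measured counts in detector bin i, a_ij the system-matrix probability that an emission from voxel j is detected in bin i, and lambda_j^(k) the image estimate at iteration k.

```latex
% Standard ML-EM update: the inner sum over j' is the forward projection and
% the outer sum over i is the backprojection -- the two steps parallelized on
% the GPU in the work above.
\lambda_j^{(k+1)}
  = \frac{\lambda_j^{(k)}}{\sum_i a_{ij}}
    \sum_i a_{ij}\,
    \frac{y_i}{\sum_{j'} a_{ij'}\,\lambda_{j'}^{(k)}}
```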