• Title/Summary/Keyword: parallel computer processing


Identification of Nonstationary Time Varying EMG Signal in the DCT Domain and a Real Time Implementation Using Parallel Processing Computer (DCT 평면에서의 비정상 시변 근전도 신호의 인식과 병렬처리컴퓨터를 이용한 실시간 구현)

  • Lee, Young-Seock;Lee, Jin;Kim, Sung-Hwan
    • Journal of Biomedical Engineering Research / v.16 no.4 / pp.507-516 / 1995
  • This study proposes a nonstationary identifier in the DCT domain for identifying the AR parameters of above-lesion upper-trunk electromyographic (EMG) signals, as a means of developing a reliable real-time signal to control functional electrical stimulation (FES) that enables primitive walking in paraplegics. As a paraplegic shifts posture from one attitude to another, there is a transition period in which the signal is clearly nonstationary. In addition, as muscles fatigue, nonstationarities become more prevalent even during stable postures. An identifier for time-varying nonstationary EMG signals is therefore required. In this paper, time-varying nonstationary EMG signals are transformed into the DCT domain, and the transformed signals are modeled and analyzed there. In the DCT domain, we verified a reduction in the condition number and an increase in the smallest eigenvalue of the input correlation matrix, both of which influence numerical properties; the mean square error was compared with that of the SLS algorithm; and the proposed algorithm was implemented on an IMS T-805 parallel processing computer for real-time application.


Design Space Exploration of Many-Core Processor for High-Speed Cluster Estimation (고속의 클러스터 추정을 위한 매니코어 프로세서의 디자인 공간 탐색)

  • Seo, Jun-Sang;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information / v.19 no.10 / pp.1-12 / 2014
  • This paper implements the computationally intensive subtractive clustering algorithm on a single instruction, multiple data (SIMD) based many-core processor and improves its performance. In addition, five processing element (PE) configurations (PEs=16, 64, 256, 1,024, 4,096) are implemented, and their execution time and energy efficiency are estimated in order to select the optimal PE architecture for the subtractive clustering algorithm. Experimental results using two different medical images at three resolutions ($128 \times 128$, $256 \times 256$, $512 \times 512$) show that PEs=4,096 achieves the highest performance and energy efficiency in all cases.
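
For readers unfamiliar with subtractive clustering, the sketch below shows the common textbook form of the algorithm in plain Python. It is not the paper's SIMD implementation; the function name, the radius `ra`, the factor 1.5 for the second radius, and the stopping ratio are assumptions chosen for illustration. The potential of each point is computed independently of the others, which is the step that would map naturally onto many processing elements.

```python
import math


def subtractive_clustering(points, ra=0.5, max_clusters=10, stop_ratio=0.15):
    """Pick cluster centers from `points` (textbook subtractive clustering).

    ra, max_clusters and stop_ratio are illustrative defaults, not values
    taken from the paper.
    """
    rb = 1.5 * ra                 # assumed neighbourhood radius for subtraction
    alpha = 4.0 / (ra * ra)
    beta = 4.0 / (rb * rb)

    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Potential of every point: each entry depends only on read-only data,
    # so all entries could be computed in parallel.
    potential = [
        sum(math.exp(-alpha * sqdist(p, q)) for q in points) for p in points
    ]

    centers = []
    first_peak = max(potential)
    for _ in range(max_clusters):
        k = max(range(len(points)), key=lambda i: potential[i])
        peak = potential[k]
        if peak < stop_ratio * first_peak:
            break                 # remaining potential is too low
        centers.append(points[k])
        # Subtract the influence of the new center from every potential.
        potential = [
            p_i - peak * math.exp(-beta * sqdist(x, points[k]))
            for p_i, x in zip(potential, points)
        ]
    return centers
```

With two well-separated groups of points and `ra=1.0`, the function returns one center per group.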

Acceleration of Feature-Based Image Morphing Using GPU (GPU를 이용한 특징 기반 영상모핑의 가속화)

  • Kim, Eun-Ji;Yoon, Seung-Hyun;Lee, Jieun
    • Journal of the Korea Computer Graphics Society / v.20 no.2 / pp.13-24 / 2014
  • In this study, a graphics-processing-unit (GPU)-based acceleration technique is proposed for feature-based image morphing. The technique uses the depth buffer of the graphics hardware to efficiently calculate the shortest distance between a pixel and the control lines. The pairs of control lines between the source image and the destination image are determined by user input, and the distance function of each control line is rendered using two rectangles and two cones. The distance between each pixel and its nearest control line is stored in the depth buffer through the graphics pipeline and is used to perform the morphing operation efficiently. The per-pixel morphing operation is parallelized using the compute unified device architecture (CUDA) to reduce the morphing time. We demonstrate the efficiency of the proposed technique with several experimental results.
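
The quantity the depth buffer holds in this technique, the distance from each pixel to its nearest control line, can be written as a brute-force CPU reference. The sketch below is only that reference, not the paper's GPU method; the function names and the `(ax, ay, bx, by)` segment layout are assumptions. One plausible reading of the rendering step is that the two cones cover the distance to the segment endpoints and the two rectangles cover the distance to either side of the segment interior, with the depth test keeping the minimum.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest Euclidean distance from point (px, py) to segment A-B."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: A and B coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of the point onto the line, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_line_map(width, height, lines):
    """Return rows of (distance, line_index) for every pixel.

    Each pixel is independent of the others, which is what makes the
    per-pixel work suitable for GPU parallelization.
    """
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            row.append(min(
                (point_segment_distance(x, y, *seg), i)
                for i, seg in enumerate(lines)
            ))
        result.append(row)
    return result
```

A map like this is useful mainly as a correctness check for a GPU version, since its cost grows with the number of pixels times the number of control lines.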

GP-GPU based Parallelization for Urban Terrain Atmospheric Model CFD_NIMR (도시기상모델 CFD_NIMR의 GP-GPU 실행을 위한 병렬 프로그램의 구현)

  • Kim, Youngtae;Park, Hyeja;Choi, Young-Jeen
    • Journal of Internet Computing and Services / v.15 no.2 / pp.41-47 / 2014
  • In this paper, we implemented a CUDA Fortran parallel program to run the CFD_NIMR model, which simulates air diffusion over urban terrain, on GP-GPUs. A GP-GPU is a graphics processing unit in the form of a PCI card that serves as a general-purpose calculation accelerator, performing large amounts of high-speed calculation at low cost and with low electric power. The GP-GPU version runs 15 times faster when an Nvidia Tesla C1060 GPU is compared with an Intel Xeon 2.0 GHz CPU. In addition, the program on a GP-GPU performs efficiently compared with an MPI parallel program on multiple CPUs. The proposed GP-GPU programming method is expected to be applicable to numerical models with a similar structure.

Optimizing LRU Lock Management in the Linux Kernel for Improving Parallel Write Throughout in Many-Core CPU Systems (매니코어 CPU 시스템의 병렬 쓰기 성능 향상을 위한 리눅스 커널의 LRU 관리 최적화 기법)

  • Eun-Kyu Byun;Gibeom Gu;Kwang-Jin Oh;Jiwoo Bang
    • KIPS Transactions on Computer and Communication Systems / v.12 no.7 / pp.209-216 / 2023
  • Modern HPC systems are equipped with many-core CPUs with dozens of cores. When parallel I/O is performed on such a system, scalability is limited by the LRU lock management policy of the Linux kernel. This study proposes FinerLRU, an improved scheme that addresses this problem. FinerLRU improves the parallel write performance of file systems that use the buffer cache through finer-grained lock management, increasing the number of LRU locks up to the maximum number of cores. The proposed method was implemented in Linux 5.18.11, and performance was measured on two types of CPUs with different characteristics, Intel Ice Lake Xeon and Intel Knights Landing; a performance improvement of about two times was obtained on both types of systems.
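
The general idea behind finer-grained LRU locking can be illustrated in user space. The sketch below is a hypothetical analogue, not kernel code and not the FinerLRU implementation: the class name `ShardedLRU` and the hash-based choice of shard are assumptions, since the abstract does not say how a lock is selected. Each shard has its own list and its own lock, so writers that land on different shards do not contend.

```python
import threading
from collections import OrderedDict


class ShardedLRU:
    """Hypothetical user-space analogue: one LRU list and one lock per shard."""

    def __init__(self, num_shards, capacity_per_shard):
        self.shards = [OrderedDict() for _ in range(num_shards)]
        self.locks = [threading.Lock() for _ in range(num_shards)]
        self.capacity = capacity_per_shard

    def _index(self, key):
        # Assumed mapping; a kernel could instead map by CPU or by page.
        return hash(key) % len(self.shards)

    def put(self, key, value):
        i = self._index(key)
        with self.locks[i]:            # only this shard is locked
            shard = self.shards[i]
            shard[key] = value
            shard.move_to_end(key)     # mark as most recently used
            if len(shard) > self.capacity:
                shard.popitem(last=False)   # evict least recently used

    def get(self, key):
        i = self._index(key)
        with self.locks[i]:
            shard = self.shards[i]
            if key not in shard:
                return None
            shard.move_to_end(key)
            return shard[key]
```

With a single shard this degenerates to one global lock, which is the contention pattern the abstract describes as limiting scalability.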

Enhanced Graph-Based Method in Spectral Partitioning Segmentation using Homogenous Optimum Cut Algorithm with Boundary Segmentation

  • S. Syed Ibrahim;G. Ravi
    • International Journal of Computer Science & Network Security / v.23 no.7 / pp.61-70 / 2023
  • Image segmentation is a crucial step in effective digital image processing. Several research contributions have been made in this field over the past decade, but a general segmentation algorithm suitable for various applications remains a challenge. Among the image segmentation approaches, the graph-based approach has gained popularity because of its basic ability to reflect global image properties. This paper proposes a methodology that partitions an image by pixel, region, and texture along with intensity. To speed up the segmentation of large images, the work is processed in parallel across several CPUs. One way to achieve this is to split images into tiles that are processed independently. However, regions overlapping a tile border are split or lost when the minimum size requirements of the segmentation algorithm are not met. The contributions are as follows. Pixel-based segmentation uses the min-cut/max-flow algorithm together with edge-based segmentation of the image. Region-based segmentation uses a homogenous optimum cut algorithm with boundary segmentation. For texture, the object type is identified using a spectral partitioning technique, which also minimizes the graph cut value.
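
The tile-border problem mentioned in the abstract can be made concrete with a short sketch. The code below is illustrative only and does not reproduce the paper's method; the function names and the half-open `(x0, y0, x1, y1)` box convention are assumptions. It splits an image into tiles and reports which tiles a region's bounding box touches; a region touching more than one tile is the kind that independent per-tile processing would split.

```python
def tile_bounds(width, height, tile_size):
    """Return half-open (x0, y0, x1, y1) boxes of tiles covering the image."""
    tiles = []
    for y0 in range(0, height, tile_size):
        for x0 in range(0, width, tile_size):
            tiles.append((
                x0,
                y0,
                min(x0 + tile_size, width),    # clip last column of tiles
                min(y0 + tile_size, height),   # clip last row of tiles
            ))
    return tiles


def tiles_touched(region_box, tiles):
    """Indices of tiles whose area intersects a region's bounding box."""
    rx0, ry0, rx1, ry1 = region_box
    return [
        i for i, (x0, y0, x1, y1) in enumerate(tiles)
        if rx0 < x1 and x0 < rx1 and ry0 < y1 and y0 < ry1
    ]
```

Because each tile is an independent unit of work, the list returned by `tile_bounds` could be handed to a pool of worker processes, with regions flagged by `tiles_touched` handled in a later merge step.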

A Runge-Kutta scheme for smart control mechanism with computer-vision robotics

  • ZY Chen;Huakun Wu;Yahui Meng;Timothy Chen
    • Smart Structures and Systems / v.34 no.2 / pp.117-127 / 2024
  • This paper presents a novel approach in which smart control of robotics is realized by a fuzzy controller and an appropriate Runge-Kutta scheme. A recently proposed integral inequality is selected based on the free weight matrix, and a less conservative stability criterion is given in the form of linear matrix inequalities (LMIs). We demonstrate that target information obtained through image processing can be used for smart control of a computer-vision robot through an Arduino, and an infrared beacon was used in the practical illustrations. A fuzzy controller derived with fuzzy Runge-Kutta type functions is injected into the system, which is then asymptotically stabilized. A fuzzy controller and a fuzzy observer are proposed via the parallel distributed compensation technique to stabilize the system. The paper achieves real-time following of three vehicles, with improvements made in many areas. Finally, each piece of information is transmitted to the Arduino via I2C to follow the self-propelled vehicle. The proposed method is validated in simulations and real-time smart control tests.

A Design of Pipelined-parallel CABAC Decoder Adaptive to HEVC Syntax Elements (HEVC 구문요소에 적응적인 파이프라인-병렬 CABAC 복호화기 설계)

  • Bae, Bong-Hee;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.5 / pp.155-164 / 2015
  • This paper describes the design and implementation of a CABAC decoder that handles HEVC syntax elements in an adaptively pipelined-parallel manner. Although CABAC offers a high compression rate, its decoding performance is limited by context-based sequential computation, strong data dependency between context models, and bin-by-bin decoding. To speed up HEVC CABAC decoding, flag-type syntax elements are adaptively pipelined by precomputing consecutive flag-type elements, and multi-bin syntax elements are decoded by processing up to three bins in parallel. Further, to accelerate the Binary Arithmetic Decoder by reducing the critical path delay, the update and renormalization of context modeling are precomputed in parallel for both the LPS and MPS cases, and the context model update is then selected according to the preceding decoding result. Simulation shows that the new HEVC CABAC architecture achieves a maximum performance of 1.01 bins/cycle, which is two times faster than the conventional approach. In an ASIC design with a 65nm library, the CABAC architecture handles 224 Mbins/sec, which can decode QFHD HEVC video data in real time.

A Study on Performance Improvement of Business Card Recognition in Mobile Environments (모바일 환경에서의 명함인식 성능 향상에 관한 연구)

  • Shin, Hyunsub;Kim, Chajong
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.2 / pp.318-328 / 2014
  • In this paper, to improve the performance of business card recognition in mobile environments, we propose a hybrid OCR agent that combines data through a parallel processing sequence across various algorithms and different kinds of business card recognition engines that have learning data. We also propose an image processing method for mobile cameras that adapts to changes in lighting, exposing axis, and card background caused by the photographic conditions. With the hybrid OCR agent composed as proposed, the average recognition rate for Korean business cards improved from 90.69% to 95.5% compared with using a single engine. With the image processing method, the image size decreased to 50% on average, and the recognition rate improved from 83% to 92.48%, a 9.4% improvement.

A study on the process of mapping data and conversion software using PC-clustering (PC-clustering을 이용한 매핑자료처리 및 변환소프트웨어에 관한 연구)

  • WhanBo, Taeg-Keun;Lee, Byung-Wook;Park, Hong-Gi
    • Journal of Korean Society for Geospatial Information Science / v.7 no.2 s.14 / pp.123-132 / 1999
  • With the rapid increase in the amount of data and computation, parallelization of computing algorithms has become more necessary than ever. However, until the mid-1990s parallelization was conducted mostly on supercomputers, which put it out of reach of general users because of the high price, the complexity of use, and so on. A new concept of parallel processing emerged in the form of PC-clustering in the late 1990s. It is an excellent alternative for applications that need high computing power at relatively low cost, although installation and use are still difficult for general users. The mapping algorithms in GIS (cut, join, resizing, warping, conversion from raster to vector and vice versa, etc.) are well suited to parallelization because of the characteristics of their data structures. If these algorithms are run using PC-clustering, the result will be satisfactory in terms of cost and performance, since they are processed in real time at low cost. This paper introduces the tools and libraries for parallel processing and PC-clustering and shows how they are applied to mapping algorithms in GIS. Parallel programs are developed for the mapping algorithms, and the experimental results show that the performance of most algorithms increases almost linearly with the number of nodes.
