• Title/Summary/Keyword: parallel computer processing


Recognition of the 3-D motion of a human arm with HIGIPS

  • Yao, Feng-Hui;Tamaki, Akikazu;Kato, Kiyoshi
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 1991.10b / pp.1724-1729 / 1991
  • This paper gives an overview of the HIGIPS design concepts and the prototype HIGIPS configuration, and discusses its application to recognizing the 3-D motion of a human arm. HIGIPS, which combines a pipeline architecture with a multiprocessor architecture, is a high-speed, high-performance, low-cost N * M multimicroprocessor parallel machine, where N is the number of pipeline stages and M is the number of processors in each stage. The algorithm for recognizing the motion of a human arm with a single TV camera was developed on a personal computer (NEC PC9801 series). As a constraint condition, simple ring marks are used: a ring mark is attached to each joint of the arm so that its centroid position can be obtained as the arm moves. These centroid positions in three-dimensional space are linked across successive pictures of the moving arm to recover its overall motion. The algorithm takes about 2 seconds to process one image frame on the general-purpose personal computer. This paper mainly discusses how to partition the algorithm and execute it on HIGIPS, and shows the resulting speedup. This application shows that HIGIPS is an efficient machine for image processing and recognition.

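
The HIGIPS abstract above relies on locating the centroid of a ring mark in each frame. A minimal sketch of that one step follows, assuming the mark has already been segmented into a binary mask; the function name `centroid` and the list-of-lists image format are illustrative assumptions, not details from the paper.

```python
def centroid(mask):
    """Return the (row, col) centroid of nonzero pixels in a 2-D list, or None if empty."""
    total = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        return None  # no mark pixels found in this frame
    return (row_sum / total, col_sum / total)


# A small ring-shaped mark: the hole in the middle does not shift the centroid.
ring = [
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
]
print(centroid(ring))
```

Because each mark's centroid can be computed independently, this step could plausibly be spread over the M processors of one pipeline stage, though the abstract does not say how the authors partitioned the work.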

Energy-Efficient Multi-Core Scheduling for Real-Time Video Processing (실시간 비디오 처리에 적합한 에너지 효율적인 멀티코어 스케쥴링)

  • Paek, Hyung-Goo;Yeo, Jeong-Mo;Lee, Wan-Yeon
    • Journal of the Korea Society of Computer and Information / v.16 no.6 / pp.11-20 / 2011
  • In this paper, we propose an optimal scheduling scheme that minimizes the energy consumption of a real-time video task on a multi-core platform supporting dynamic voltage and frequency scaling. Exploiting parallel execution on multiple cores to reduce energy consumption, the proposed scheme allocates an appropriate number of cores to the task, turns off the power of unused cores, and assigns the lowest clock frequency that meets the deadline. Our experiments show that the proposed scheme saves a significant amount of energy, up to 67% and 89% of the energy consumed by two previous methods that execute the task on a single core and on all cores, respectively.
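
The trade-off described in the abstract above (more cores at a lower clock versus fewer cores at a higher clock) can be illustrated with a brute-force search. This is only a sketch under assumed models, not the authors' scheme: execution time follows Amdahl's law, dynamic power per core is taken as `dyn_coeff * f**3`, and `static_w` is a constant per-core static power. All names and default values are hypothetical.

```python
from itertools import product


def pick_schedule(work_cycles, parallel_fraction, deadline_s, freqs_hz, max_cores,
                  dyn_coeff=1e-27, static_w=0.1):
    """Return (energy_J, cores, freq_hz) with minimal energy under the deadline, or None."""
    best = None
    for cores, f in product(range(1, max_cores + 1), freqs_hz):
        # Assumed Amdahl-style execution time on `cores` cores at frequency f.
        t = work_cycles * ((1 - parallel_fraction) + parallel_fraction / cores) / f
        if t > deadline_s:
            continue  # this configuration misses the deadline
        # Assumed power model: cubic dynamic term plus constant static term.
        power_per_core = dyn_coeff * f ** 3 + static_w
        # Unused cores are powered off, so only `cores` cores consume energy.
        energy = cores * power_per_core * t
        if best is None or energy < best[0]:
            best = (energy, cores, f)
    return best


print(pick_schedule(1e9, 0.9, 1.0, [0.5e9, 1e9, 2e9], max_cores=4))
```

With these assumed parameters the search favors several cores at the lowest frequency, because the cubic dynamic term outweighs the static cost of keeping extra cores on. The result returns `None` when no configuration meets the deadline.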

Integrated Parallelization of Video Decoding on Multi-core Systems (멀티코어 시스템에서의 통합된 비디오 디코딩 병렬화)

  • Hong, Jung-Hyun;Kim, Won-Jin;Chung, Ki-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD / v.49 no.7 / pp.39-49 / 2012
  • Demand for high-resolution video services has led to active study of high-speed video processing. In particular, the widespread deployment of multi-core systems is accelerating research on high-resolution video processing based on parallelization of multimedia software. Previously proposed parallelization approaches could improve decoding performance. However, some of these methods did not consider entropy decoding, and others parallelized only part of the decoding process. We therefore consider parallel entropy decoding integrated with the other parallel video decoding processes on a multi-core system. We propose a novel parallel decoding method called Integrated Parallelization (IP), and a method for optimizing the parallelization of video decoding on a multi-core system with many cores. We parallelized the KTA 2.7 decoder with the proposed technique on an Intel i7 quad-core platform with Intel Hyper-Threading technology and multi-thread scheduling. We achieved up to a 70% performance improvement using the IP method.

Spark Framework Based on a Heterogenous Pipeline Computing with OpenCL (OpenCL을 활용한 이기종 파이프라인 컴퓨팅 기반 Spark 프레임워크)

  • Kim, Daehee;Park, Neungsoo
    • The Transactions of The Korean Institute of Electrical Engineers / v.67 no.2 / pp.270-276 / 2018
  • Apache Spark is a high-performance in-memory computing framework for big-data processing. Recently, general-purpose computing on graphics processing units (GPGPU) has been adopted in the Apache Spark framework to improve performance. Previous Spark-GPGPU frameworks focus on overcoming the implementation difficulty that results from the difference between the computation environments of GPGPU and Spark. In this paper, we propose a Spark framework based on heterogenous pipeline computing with OpenCL to further improve performance. The proposed framework overlaps the Java-to-native memory copies on the CPU with CPU-GPU communication (DMA) and GPU kernel computation to hide CPU idle time. In addition, the CPU-GPU communication buffers are implemented as switching dual buffers, which shrinks the mapped memory region and thus decreases the memory-mapping overhead. Experimental results show that the proposed framework is up to 2.13 times faster than the previous Spark framework using OpenCL.

Image Browse for JPEG Decoder

  • Chong, Ui-Pil
    • Journal of IKEEE / v.2 no.1 s.2 / pp.96-100 / 1998
  • Given the expected widespread use of DCT-based image/video coding standards, it is advantageous to process data directly in the DCT domain rather than decoding the source back to the spatial domain. The block processing algorithm provides a parallel processing method, since multiple input data are processed in the block filter structure, and is therefore well suited to a fast implementation. In this paper, we propose JPEG browsing by Block Transform Domain Filtering (BTDF) using subband filter banks. Instead of decompressing the entire image to retrieve it at full resolution from the compressed format, a user can select the required level of expansion ($2^N \times 2^N$). This approach also reduces CPU time by reducing the number of multiplications through BTDF in the filter banks.


Implementation of the modified-signed digit(MSD) number adder using triple rail-coding input and symbolic substitution (Triple rail-coding 입력과 기호치환을 이용한 변형부호화자리수 가산기 구현)

  • Shin, Chang-Mok;Kim, Soo-Joong;Seo, Dong-Hoan
    • Journal of the Institute of Electronics Engineers of Korea SD / v.41 no.6 / pp.43-51 / 2004
  • An optical parallel modified signed-digit (MSD) number adder system is proposed that uses triple rail-coding input patterns and a serial arrangement method of symbolic substitution. By combining the overlapped arithmetic results, which are produced by encoding the MSD input as triple rail-coding patterns, into the same patterns, the number of symbolic substitution rules is reduced. In addition, by using serialized and space-shifted input patterns in the optical experiments, the optical adder is implemented without space-shifting, NOR, or threshold operations.

Data Sampling-based Angular Space Partitioning for Parallel Skyline Query Processing (데이터 샘플링을 통한 각 기반 공간 분할 병렬 스카이라인 질의처리 기법)

  • Chung, Jaehwa
    • The Journal of Korean Association of Computer Education / v.18 no.5 / pp.63-70 / 2015
  • Skyline queries have been applied in various fields where complex conditions must be satisfied. Several techniques have been suggested for processing a skyline query in a centralized scheme, and recently map/reduce platform-based approaches have been proposed that divide the data space into multiple partitions to handle vast volumes of multidimensional data. However, the performance of these approaches fluctuates because of uneven data loading between servers and redundant tasks. Motivated by these issues, this paper suggests a novel technique called MR-DEAP, which resolves the uneven data loading using random sampling. The experimental results show that the proposed MR-DEAP outperforms the MR-Angular and MR-BNL schemes.
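
The idea of sampling-based angular partitioning in the abstract above can be sketched for the two-dimensional case. This is not MR-DEAP itself: it is a single-process illustration that assumes smaller values are better, chooses angular boundaries from quantiles of a random sample so that partitions hold roughly equal numbers of points, computes a local skyline per partition (the "map" step), and merges them (the "reduce" step). All function names are hypothetical.

```python
import math
import random
from bisect import bisect_right


def dominates(a, b):
    """True if a is no worse than b in every dimension and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Block-nested-loop style skyline: keep only non-dominated points."""
    result = []
    for p in points:
        if any(dominates(q, p) for q in result):
            continue
        result = [q for q in result if not dominates(p, q)]
        result.append(p)
    return result


def angle(p):
    """Polar angle of a 2-D point, used as the partitioning key."""
    return math.atan2(p[1], p[0])


def partitioned_skyline(points, parts=4, sample_size=100, seed=0):
    if not points:
        return []
    rng = random.Random(seed)
    # Estimate angular quantiles from a random sample to balance the partitions.
    sample = sorted(angle(p) for p in rng.sample(points, min(sample_size, len(points))))
    bounds = [sample[len(sample) * i // parts] for i in range(1, parts)]
    buckets = [[] for _ in range(parts)]
    for p in points:
        buckets[bisect_right(bounds, angle(p))].append(p)
    local = [skyline(b) for b in buckets]            # "map": per-partition skylines
    return skyline([p for s in local for p in s])    # "reduce": merge local skylines


gen = random.Random(1)
pts = [(gen.random(), gen.random()) for _ in range(500)]
print(sorted(partitioned_skyline(pts)))
```

The merged result matches the skyline computed over the whole data set, since any point dropped locally is dominated by a point that survives to the merge step. Only the balance of work across partitions depends on the sampled boundaries.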

Development of a Systolic Array Design System(SADS) (시스톨릭 어레이 설계 시스템의 개발)

  • Yu, Gi-Hyeong;Lee, Seong-U;Park, Dong-Gi;Kim, Yun-Ho
    • The Transactions of the Korea Information Processing Society / v.4 no.5 / pp.1380-1390 / 1997
  • This paper presents a systolic array design method that derives optimal one- or two-dimensional planar systolic arrays from a given n-dimensional problem represented as a regular recurrence equation, together with its implementation, called the systolic array design system (SADS). SADS parses a regular recurrence equation and extracts information such as the problem space, data dependence vectors, and initial data positions. Systolic arrays are automatically derived by a space-time transformation from the information obtained in the parsing phase. SADS allows us to verify the parallel execution of the derived systolic array through a graphical interface.


Routing Strategy on the XMESH Topology for the Massively Parallel Computer Architecture (대규모 병렬컴퓨터에 적합한 교차메쉬구조에서의 경로설정)

  • Kim, Jong-Jin;Yun, Seong-Dae
    • The Transactions of the Korea Information Processing Society / v.5 no.12 / pp.3109-3116 / 1998
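  • This paper proposes methods for routing messages within the interconnection network of the XMESH (crossed mesh), a topology suited to implementing massively parallel computers, under realistic conditions with a uniform message distribution and message contention, and evaluates their performance by simulation using a deflection routing algorithm. Exploiting the fact that the XMESH offers a more varied number of optimal paths than other topologies, we measure and compare the maximum delay, average delay, and throughput of the XMESH under deflection criteria that reflect the number of optimal paths in the priority and under link-selection methods designed to use the diagonal links efficiently. When contention arises for a link on an optimal path, deflection criterion LD gives higher priority to the message with fewer optimal paths and, when these are equal, to the message that has been deflected more often. Using criterion LD, and sending a message preferentially over a diagonal link when the message whose turn it is to select a route has many optimal paths available, improved the maximum delay, average delay, and throughput by about 58%, 70%, and 31%, respectively, of the improvement target relative to the ideal value, compared with the method using deflection criterion A, in which older messages have higher priority.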

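
The two deflection criteria compared in the XMESH abstract above amount to different orderings of the messages contending for a link. A minimal sketch follows, assuming each message carries its count of remaining optimal paths, its deflection count, and its age. The `Message` class and the function names are hypothetical, and the diagonal-link preference is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Message:
    name: str
    optimal_paths: int  # number of optimal paths still available to the message
    deflections: int    # how many times the message has been deflected so far
    age: int            # cycles since the message was injected


def order_ld(messages):
    """Criterion LD: fewer optimal paths first; ties go to the more-deflected message."""
    return sorted(messages, key=lambda m: (m.optimal_paths, -m.deflections))


def order_a(messages):
    """Criterion A: older messages first."""
    return sorted(messages, key=lambda m: -m.age)


contenders = [
    Message("a", optimal_paths=3, deflections=0, age=10),
    Message("b", optimal_paths=1, deflections=0, age=2),
    Message("c", optimal_paths=1, deflections=2, age=5),
]
print([m.name for m in order_ld(contenders)])
print([m.name for m in order_a(contenders)])
```

The message at the front of the ordering would take the contested link, and the others would be deflected or routed along an alternative optimal path.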

Signal Processing and Implementation of Transmitter for Cochlear Implant (인공 와우를 위한 신호 처리 및 전달부의 구현)

  • Chae, D.;Choi, D.;Byun, J.;Baeck, S.;Kong, H.;Park, S.
    • Proceedings of the KIEE Conference / 1993.07a / pp.284-286 / 1993
  • Software and hardware for a cochlear implant system have been developed to create a speech signal processing system that extracts, in real time, model parameters including formants, pitch, and amplitude information. The system is based on the Texas Instruments TMS320 family. In hardware, a computer interface has been designed and implemented that allows presentation of biphasic pulse stimuli to hearing-impaired patients. The host computer sends a stream of bytes to the parallel port. Upon receipt of the data, the interface generates the appropriate burst sequence, which is delivered to the patient's external transmitter coil. The coded information is interpreted by the Nucleus-22 internal receiver, which delivers the pulse to the specified electrodes at the specified amplitude and pulse width.
