• Title/Summary/Keyword: parallel computer processing

An Applicability Study on Parallel Computing for webOS-based Smart TV (webOS 기반 스마트 TV에서의 병렬처리 가능성 연구)

  • Jeon, Yongkweon;Koo, Donghoon;Na, Byunggook;Yoon, Sungroh
    • Annual Conference of KIPS / 2014.11a / pp.336-339 / 2014
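  • Driven by the trend toward smart consumer electronics, embedded-system hardware and software are advancing competitively, yet the hardware remains underutilized relative to the pace of its development. Smart TVs in particular have the advantage of a large screen and are expected to play a central role in the Internet of Things era, so they are likely to be required to process large amounts of computation quickly. In this paper, to assess the feasibility of parallel processing for making full use of the computing resources of a webOS-based smart TV, we profile the webOS system and analyze the results.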

A Study on Developing Distributed and Parallel Traffic Simulation Program with Open MPI (Open MPI 를 이용한분산/병렬 교통 시뮬레이션 프로그램 개발에 관한 연구)

  • Cho, Min-Kyu;Kyung, MinGi;Shin, In-soo;Min, Dug-Ki
    • Annual Conference of KIPS / 2019.10a / pp.137-140 / 2019
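  • A traffic simulation system is a program that predicts future vehicle movements from real-world traffic and vehicle data, and it can serve as a tool for solving a variety of traffic problems. Scaling the simulation to the national level requires a distributed/parallel system; this paper proposes a method for Open MPI-based data exchange, which is central to the parallel/distributed process. By handling data exchange among distributed nodes through a single common communication module, the approach is expected to raise productivity and reduce the communication time incurred during simulation.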

A Parallel Reachability Analysis Method Based on Multiple Finite State Machine (다중 유한상태머신 기반 병렬적 도달성 분석 기법)

  • Lee, Jung Sun;Lee, Woo Jin;Shin, Youngsul;Cao, Thi Ly
    • Annual Conference of KIPS / 2010.04a / pp.966-968 / 2010
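  • As computing resources become easier to obtain, attempts to make the most of them are increasing. For model checking, a widely used formal technique for verifying systems, research has sought to use multiple computing resources at once to mitigate the state explosion problem. However, this approach still suffers from state explosion because multiple state models are merged into one. This paper identifies the cause of the problem and, to address it, presents a new parallel technique for reachability analysis, a basic building block of model checking.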

Implementation of LTE uplink System for SDR Platform using CUDA and UHD (CUDA와 UHD를 이용한 SDR 플랫폼 용 LTE 상향링크 시스템 구현)

  • Ahn, Chi Young;Kim, Yong;Choi, Seung Won
    • Journal of Korea Society of Digital Industry and Information Management / v.9 no.2 / pp.81-87 / 2013
  • In this paper, we present an implementation of a Long Term Evolution (LTE) Uplink (UL) system on a Software Defined Radio (SDR) platform built on a conventional Personal Computer (PC), which uses Graphic Processing Units (GPU) as the SDR software modem and a Universal Software Radio Peripheral2 (USRP2) with the USRP Hardware Driver (UHD) as the Radio Frequency (RF) transceiver. We adopted UHD because it provides flexibility in the design of the transceiver chain. A Cognitive Radio (CR) engine has also been implemented using libraries from UHD. We implemented the software modem on the GPU, which is well suited to parallel computing because of its powerful Arithmetic and Logic Units (ALUs). In our experiments, the total processing time for a single frame of LTE UL data was about 5.00 ms for transmit and 6.78 ms for receive. This means that the implemented system is capable of real-time processing of all the baseband signal processing algorithms required for an LTE UL system.
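
The real-time claim in this abstract can be read as a simple budget check: a frame must be processed in less time than it lasts on air. The sketch below assumes that "frame" means the 10 ms LTE radio frame and that the transmit and receive chains are judged separately, as the abstract reports them; the function name `is_real_time` is illustrative and not taken from the paper.

```python
# Assumption: "frame" in the abstract is the 10 ms LTE radio frame.
FRAME_MS = 10.0

# Processing times reported in the abstract (milliseconds per frame).
measured_ms = {"transmit": 5.00, "receive": 6.78}


def is_real_time(processing_ms: float, frame_ms: float = FRAME_MS) -> bool:
    """A chain keeps up if it finishes a frame before the next one arrives."""
    return processing_ms < frame_ms


for chain, ms in measured_ms.items():
    share = ms / FRAME_MS
    print(f"{chain}: real-time={is_real_time(ms)}, uses {share:.0%} of the frame budget")
```

Under these assumptions both chains fit within the frame duration. If a single device had to run the two chains back to back, the combined time would need its own check.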

Construction of a CPU Cluster and Implementation of a 3-D Domain Decomposition Parallel FDTD Algorithm (CPU 클러스터 구축 및 3차원 공간분할 병렬 FDTD 알고리즘 구현)

  • Park, Sungmin;Chu, Kwang-Uk;Ju, Saehoon;Park, Yoon-Mi;Kim, Ki-Baek;Jung, Kyung-Young
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.25 no.3 / pp.357-364 / 2014
  • In this work, we construct a CPU cluster to implement a parallel finite-difference time-domain (FDTD) algorithm for fast electromagnetic analyses. Compared to a single-processor FDTD counterpart, the parallel FDTD algorithm can significantly reduce the computational time and can also analyze electrically larger structures. The parallel FDTD algorithm needs communication between neighboring processors, which is performed with the MPI (Message Passing Interface) library, and a 3-D domain decomposition is employed to decrease the communication time between neighboring processors. We investigate the speedup factor of the CPU-cluster-based parallel FDTD algorithm relative to a single-processor FDTD for the normal mode and the hypermode, and finally analyze an electrically large concrete structure with the developed parallel algorithm.
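
A rough way to see why a 3-D decomposition lowers communication is to count the boundary cells that one interior subdomain exchanges with its neighbors in each time step. The sketch below is not the authors' code: the grid size (256 cells per axis), the process count (64), and the one-cell-thick boundary layer are assumptions chosen for illustration.

```python
def halo_cells_slab(n: int) -> int:
    """Boundary cells exchanged by one interior slab (1-D decomposition).

    Each slab has two neighbours and shares a full n x n face with each.
    """
    return 2 * n * n


def halo_cells_cube(n: int, parts_per_axis: int) -> int:
    """Boundary cells exchanged by one interior block (3-D decomposition).

    The grid is cut into parts_per_axis pieces along every axis, so each
    block has six neighbours and shares a side x side face with each.
    """
    side = n // parts_per_axis
    return 6 * side * side


n = 256                        # hypothetical grid: 256 x 256 x 256 cells
slab = halo_cells_slab(n)      # 64 processes arranged as 64 slabs
cube = halo_cells_cube(n, 4)   # 64 processes arranged as 4 x 4 x 4 blocks
print(slab, cube, slab / cube)
```

With these numbers the block layout exchanges several times fewer cells per subdomain than the slab layout, although it pays for that with more messages (six neighbors instead of two).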

Detecting the First Race in OpenMP Program with Nested Parallelism (내포 병렬성을 가지는 OpenMP 프로그램의 최초 경합 탐지)

  • Chon, Byoung-Gyu;Woo, Jong-Jung;Jun, Yong-Kee
    • The KIPS Transactions:PartA / v.8A no.3 / pp.253-260 / 2001
  • It is important to detect races when debugging shared-memory parallel programs, because races cause unintended nondeterministic program execution. Previous on-the-fly race detection techniques cannot guarantee detection of the first race in nested parallel programs. Detecting the first race is important for debugging parallel programs, since removing the first race may make subsequent races disappear. In this paper, we present an on-the-fly technique that detects all of the first races through reexecution of the debugged program. We assume that the debugged parallel program may have one-way nested parallelism. The number of reexecutions is at least the nesting depth of the program in the worst case. The space complexity is O(VT), and the time complexity of race detection for each access to the access history is O(T), where V is the number of shared variables and T is the maximum parallelism of the program. The efficiency of our technique in each execution is the same as that of previous on-the-fly detection techniques. Therefore, this technique makes debugging parallel programs more effective and practical.

The Bigdata Processing Environment Building for the Learning System (학습 시스템을 위한 빅데이터 처리 환경 구축)

  • Kim, Young-Geun;Kim, Seung-Hyun;Jo, Min-Hui;Kim, Won-Jung
    • The Journal of the Korea institute of electronic communication sciences / v.9 no.7 / pp.791-797 / 2014
  • Building an Apache Hadoop environment for parallel distributed processing of Bigdata requires either connecting multiple computers as nodes or configuring virtual nodes on a single computer to form a cloud computing environment. In practice, however, building such systems for education faces many constraints in cost and in the complexity of system configuration. An inexpensive, practical learning system that educational institutions and beginners in Bigdata processing can use for training is therefore urgently needed. In this study, we design and implement a Raspberry Pi board-based learning system for parallel distributed processing of Bigdata that supports training in and analysis of Bigdata processing with tools such as Hadoop and NoSQL. The implemented system is expected to be useful for education and for beginners starting out with Bigdata.

The vibration control of Flexible Manipulator using Parallel Fuzzy controller and Reference Trajectory Command (병렬퍼지 제어기와 기준궤적신호를 이용한 유연한 매니퓰레이터의 진동제어)

  • 박양수;박윤명
    • Journal of the Institute of Convergence Signal Processing / v.3 no.1 / pp.61-66 / 2002
  • A fuzzy control strategy is described for controlling the joint angle and tip deflection of a single flexible manipulator. In this paper, an existing model of a single flexible manipulator is used for the initial development of a fuzzy logic controller (FLC). One FLC is designed to govern the joint angle of the manipulator as it is rotated from one position to another, and a second FLC is designed to attenuate the tip deflection that results from joint-angle body motion. The Reference Trajectory Command is an important method for reducing vibration in a flexible beam. This paper presents a very simple command-shaping control that, combined with the parallel fuzzy controller, eliminates multiple-mode residual vibration in a flexible beam. The effectiveness of the proposed scheme is demonstrated through computer simulation.

Accelerating particle filter-based object tracking algorithms using parallel programming

  • Truong, Mai Thanh Nhat;Kim, Sanghoon
    • Annual Conference of KIPS / 2018.05a / pp.469-470 / 2018
  • Object tracking is a common task in computer vision and an essential part of various vision-based applications. After several years of development, object tracking in video is still a challenging problem because of the varied visual properties of objects and their surrounding environment. Among common approaches, the particle filter is a well-known technique that has proven effective in dealing with the difficulties of object tracking. However, the particle filter is a high-complexity algorithm, which is a severe disadvantage because object tracking algorithms are required to run in real time. In this research, we use parallel programming to accelerate particle filter-based object tracking algorithms. Experimental results show that our approach reduces the execution time significantly.
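
The part of a particle filter that parallelizes most naturally is the weight update, because each particle is scored against the observation independently of the others. The following is a minimal one-dimensional sketch of that idea, not the authors' implementation: the Gaussian likelihood, the names `likelihood`, `update_weights` and `estimate`, and the use of a thread pool are all assumptions made for illustration. Threads in CPython do not speed up pure-Python arithmetic, so the pool only shows the structure; a real system would map the same step onto a GPU or a process pool.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def likelihood(particle: float, observation: float, sigma: float = 1.0) -> float:
    """Unnormalised Gaussian likelihood of the observation given one particle."""
    return math.exp(-0.5 * ((observation - particle) / sigma) ** 2)


def update_weights(particles, observation, workers=4):
    """Score every particle independently, then normalise the weights."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the input order, so weights line up with particles.
        raw = list(pool.map(lambda p: likelihood(p, observation), particles))
    total = sum(raw)
    return [w / total for w in raw]


def estimate(particles, weights):
    """Weighted mean of the particles, used as the state estimate."""
    return sum(p * w for p, w in zip(particles, weights))


particles = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical 1-D positions
weights = update_weights(particles, observation=2.0)
print(estimate(particles, weights))
```

Resampling, the other main step, needs a prefix sum over the weights and is harder to parallelize than this map.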

Accelerating Soft-Decision Reed-Muller Decoding Using a Graphics Processing Unit

  • Uddin, Md. Sharif;Kim, Cheol Hong;Kim, Jong-Myon
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.4 no.2 / pp.369-378 / 2014
  • The Reed-Muller code is one of the efficient codes for multiple-bit error correction; however, the high computational requirement inherent in its decoding process prohibits its use in practical applications. To solve this problem, this paper proposes a graphics processing unit (GPU)-based parallel error control approach using Reed-Muller R(r, m) coding for real-time wireless communication systems. The GPU offers a high-throughput parallel computing platform that can achieve the desired high-performance decoding by exploiting the massive parallelism inherent in the algorithm. In addition, we compare the performance of the GPU-based approach with the equivalent sequential approach running on a traditional CPU. The experimental results indicate that the proposed GPU-based approach greatly outperforms the sequential approach in terms of execution time, yielding over 70× speedup.
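
To illustrate where the parallelism in soft-decision Reed-Muller decoding can come from, the sketch below decodes the first-order code R(1, m) with a fast Hadamard transform, a standard technique for that special case. It is not the paper's decoder, which targets general R(r, m) and whose algorithm the abstract does not describe. The function names and the sample received vector are hypothetical. Within each stage of the transform, every butterfly touches a distinct pair of entries, which is the kind of independent work a GPU can run concurrently.

```python
def encode_rm1(a0: int, linear: int, m: int) -> list:
    """Encode R(1, m): bit i of the codeword is a0 XOR parity(linear AND i)."""
    return [a0 ^ (bin(linear & i).count("1") & 1) for i in range(1 << m)]


def fast_hadamard(values) -> list:
    """Walsh-Hadamard transform in natural order using in-place butterflies."""
    out = list(values)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            # The butterflies inside one stage are mutually independent.
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def decode_rm1(received) -> tuple:
    """Soft-decision decoding: pick the largest-magnitude correlation."""
    spectrum = fast_hadamard(received)
    k = max(range(len(spectrum)), key=lambda idx: abs(spectrum[idx]))
    a0 = 0 if spectrum[k] > 0 else 1
    return a0, k


# Hypothetical example: m = 3, message (a0=1, linear=0b101), BPSK symbols
# (bit 0 -> +1, bit 1 -> -1) with hand-picked noise; the third symbol even
# has the wrong sign.
codeword = encode_rm1(1, 0b101, 3)
received = [-0.9, 1.1, 0.2, 0.8, 1.0, -1.2, 0.7, -0.6]
print(codeword, decode_rm1(received))
```

The transform needs m stages of 2^(m-1) butterflies each, so only the stages have to run one after another.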