• Title/Abstract/Keywords: and parallel processing

Search Result 2,006, Processing Time 0.025 seconds

Design and Implementation of Real-Time Parallel Engine for Discrete Event Wargame Simulation (이산사건 워게임 시뮬레이션을 위한 실시간 병렬 엔진의 설계 및 구현)

  • Kim, Jin-Soo;Kim, Dae-Seog;Kim, Jung-Guk;Ryu, Keun-Ho
    • The KIPS Transactions:PartA / v.10A no.2 / pp.111-122 / 2003
  • Military wargame simulation models must support the HLA to interoperate with other simulations, and a parallel simulation engine can efficiently reduce the system overhead that interoperability generates. However, legacy military simulation engines process events with a sequential event-driven method, because parallel processing raises problems such as synchronized reference to global data domains. In addition, legacy simulation platforms make insufficient use of multiple CPUs even when a multi-CPU system is in use. Therefore, in this paper, we propose converting the simulation engine to an object model-based parallel simulation engine to ensure the military wargame model's improved system processing capability, synchronized reference to global data domains, external simulation time processing, and the sequence of parallel-processed events during crash recovery. The converted parallel simulation engine is designed and implemented to enable parallel execution on a multi-CPU (SMP) system.

Performance of the Viterbi Decoder using Analog Parallel Processing circuit with Reference position (아날로그 병렬 처리 망을 이용한 비터비 디코더의 기준 입력 인가위치에 따른 성능 평가)

  • Kim, Hyung-Jung;Kim, In-Cheol;Lee, Wang-Hee;Kim, Hyong-Suk
    • Proceedings of the KIEE Conference / 2006.10c / pp.378-380 / 2006
  • A high-speed analog parallel processing-based Viterbi decoder with a circularly connected 2D analog processing cell array is proposed. It has a 2D parallel processing structure in which an analog processing cell is placed at each node of the trellis diagram, and the cells are connected circularly so that an infinitely expanding trellis diagram is realized with a fixed-size circuit. The proposed Viterbi decoder has the advantages of better error correction performance, shorter latency, and no need for path memories. In this paper, the error correction performance of the analog parallel processing-based Viterbi decoder is tested as a function of the reference input position via software simulation.

Parallel Machine Scheduling Considering the Moving Time of Multiple Servers

  • Chong, Kyun-Rak
    • Journal of the Korea Society of Computer and Information / v.22 no.10 / pp.101-107 / 2017
  • In this paper, we study the problem of parallel machine scheduling considering the moving time of multiple servers. Parallel machine scheduling assigns jobs to parallel machines so that the total completion time (makespan) is minimized. Each job has a setup phase, a processing phase, and a removal phase. The processing phase is performed by a parallel machine alone, while the setup and removal phases are performed by a server and a parallel machine together. A server must move to a parallel machine for each setup phase and removal phase, but previous research has assumed that the server moving time is zero. In this study, we propose an efficient algorithm for parallel machine scheduling that takes the moving time of multiple servers into account. We also investigate experimentally how the number of servers and the server moving time affect the total completion time.
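
The makespan objective described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses the classic longest-processing-time-first greedy heuristic, and it assumes (loudly) that a server is always free when a machine needs one, so each setup and each removal is simply delayed by one constant server move. The function name `greedy_makespan` and the parameter `move_time` are hypothetical.

```python
import heapq

def greedy_makespan(jobs, machines, move_time):
    """Estimate the makespan with a longest-job-first greedy heuristic.

    jobs: list of (setup, processing, removal) durations.
    Assumption (not from the paper): servers never contend, so a job
    occupies its machine for setup + processing + removal plus two
    server moves (one before setup, one before removal).
    """
    loads = sorted(
        (s + p + r + 2 * move_time for s, p, r in jobs), reverse=True
    )
    # Min-heap of machine finish times: the next job always goes to
    # the machine that becomes free first.
    finish = [0] * machines
    heapq.heapify(finish)
    for load in loads:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + load)
    return max(finish)

jobs = [(1, 4, 1), (1, 2, 1), (1, 3, 1), (1, 1, 1)]
print(greedy_makespan(jobs, machines=2, move_time=0))
print(greedy_makespan(jobs, machines=2, move_time=1))
```

Comparing the two calls shows how a nonzero moving time lengthens the schedule even in this simplified model. A faithful model would also track when each server is busy, which is what makes the number of servers matter.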

The GPU-based Parallel Processing Algorithm for Fast Inspection of Semiconductor Wafers (반도체 웨이퍼 고속 검사를 위한 GPU 기반 병렬처리 알고리즘)

  • Park, Youngdae;Kim, Joon Seek;Joo, Hyonam
    • Journal of Institute of Control, Robotics and Systems / v.19 no.12 / pp.1072-1080 / 2013
  • Today, many vision inspection techniques are used in industrial production. In the semiconductor industry in particular, the vision inspection system for wafers is very important. Inspection techniques for semiconductor wafer production must also deliver both high precision and high speed. To achieve these objectives, parallel processing of the inspection algorithm is essential. In this paper, we propose a GPU (Graphics Processing Unit)-based parallel processing algorithm for the fast inspection of semiconductor wafers. The proposed algorithm is implemented on GPU boards made by NVIDIA. The defect detection performance of the proposed algorithm on the GPU is the same as that on a single CPU, but it runs about 210 times faster than the single-CPU implementation.

Parallel Processing Architecture for Parity Checksum Generator Complying with ITU-T J.83 ANNEX B (ITU-T J.83 ANNEX B의 Parity Checksum Generator를 위한 병렬 처리 구조)

  • Lee, Jong-Yeop;Hong, Eon-Pyo;Har, Dong-Soo;Lim, Hai-Jeong
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.6C / pp.619-625 / 2009
  • This paper proposes a parallel architecture for the Parity Checksum Generator adopted for packet synchronization and error detection in ITU-T Recommendation J.83 Annex B. The proposed parallel processing architecture removes a performance bottleneck that occurs in the conventional serial processing architecture, leading to a significant decrease in the time needed to generate a Parity Checksum. The implementation results show that the proposed parallel processing architecture reduces the processing time by 83.1% at the expense of a 16% increase in area.

Inspection of guided missiles applied with parallel processing algorithm (병렬처리 알고리즘 적용 유도탄 점검)

  • Jung, Eui-Jae;Koh, Sang-Hoon;Lee, You-Sang;Kim, Young-Sung
    • Journal of Advanced Navigation Technology / v.25 no.4 / pp.293-298 / 2021
  • In general, the guided weapon seeker and the guidance control device process target search, recognition, and capture information to indicate the state of the guided missile, and they control the operation of the guided weapon. The signals required for guided weapons are the gaze change rate, the visual signal, and the end-stage fuselage orientation signal. To process the complex signals of recent missiles in real time, the data processing speed of the missiles must be increased. This study applied the stop and go and inverse enumeration algorithms, among the parallel algorithm methods of PINQ, and used the guided missile inspection program to compare in real time the processing speed of the signal data required for the guided missile. Based on the resulting data, we propose an effective method for processing missile data with a parallel processing algorithm by comparing the processing speed and the CPU core utilization rate of the multi-core and single-core processing methods.

Parallel Scrambling Techniques for SDH and ATM Transmissions (SDH와 ATM 전송을 위한 병렬혼화 기법)

  • 김석창;이병기
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.8 / pp.1146-1158 / 1993
  • In this paper, parallel scrambling techniques are considered for practical use in SDH transmission and ATM transmission. In ATM transmission, there are two ways of transmitting ATM cells, the SDH-based and the cell-based, and the corresponding scrambling techniques differ accordingly. For SDH transmission and SDH-based ATM transmission, FSS (frame synchronous scrambling) is applied to the STM frames, while for cell-based ATM transmission, DSS (distributed sample scrambling) is used on the ATM cell stream. The parallel scrambling techniques are examined for the FSS and the DSS, and applied to obtain parallel FSSs for use in SDH and SDH-based ATM transmission, along with a parallel DSS applicable to cell-based ATM transmission. The resulting (8, 4) PSRG (parallel shift register generator) and (8, 16) PSRG based parallel scramblings are directly applicable to STM-1 rate processing of the STM-4 and STM-16 scramblings, respectively. Likewise, the resulting (1, 8) PSRG and double-sampling-double-correction based parallel scrambling techniques can be practically used for low-rate processing of the SDH-based and the cell-based ATM signal scrambling, respectively.
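
The general idea behind parallel frame synchronous scrambling can be sketched as follows. This is not the paper's PSRG construction: it is a generic illustration of computing several scrambler bits per step from the current register state. The generator polynomial 1 + x^6 + x^7 with an all-ones reset is assumed here, since it is not stated in the abstract (it is the one commonly specified for SDH). The function names are hypothetical.

```python
def serial_bits(state, n):
    """Bit-serial LFSR for 1 + x^6 + x^7; state is a list of 7 bits."""
    s = list(state)
    out = []
    for _ in range(n):
        out.append(s[6])                 # output taken from the last stage
        s = [s[5] ^ s[6]] + s[:6]        # feedback from stages 6 and 7
    return out

def build_parallel_masks(width=8):
    """Run the LFSR symbolically for `width` steps.

    Each mask records which bits of the starting state are XORed
    together to form one output bit or one next-state bit.
    """
    m = [1 << i for i in range(7)]
    out_masks = []
    for _ in range(width):
        out_masks.append(m[6])
        m = [m[5] ^ m[6]] + m[:6]
    return out_masks, m

def parity(x):
    """XOR of all bits of x."""
    return bin(x).count("1") & 1

def parallel_bits(state, n, width=8):
    """Produce n bits (n a multiple of width), `width` bits per step."""
    out_masks, next_masks = build_parallel_masks(width)
    s = sum(bit << i for i, bit in enumerate(state))
    out = []
    for _ in range(n // width):
        # Every output bit depends only on the current state, so in
        # hardware all `width` bits could be formed in the same clock.
        out.extend(parity(s & mask) for mask in out_masks)
        s = sum(parity(s & mask) << i for i, mask in enumerate(next_masks))
    return out

reset = [1] * 7
print(serial_bits(reset, 64) == parallel_bits(reset, 64))
```

The final line checks that the byte-wide version reproduces the bit-serial sequence, which is the property a parallel scrambler must preserve while running at one eighth of the line rate.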

Development of the Dynamic Host Management Scheme for Parallel/Distributed Processing on the Web (웹 환경에서의 병렬/분산 처리를 위한 동적 호스트 관리 기법의 개발)

  • Song, Eun-Ha;Jeong, Young-Sik
    • Journal of KIISE:Computing Practices and Letters / v.8 no.3 / pp.251-260 / 2002
  • Parallel/distributed processing that uses the many idle hosts on the web offers a high cost-performance ratio for large-scale applications. Such processing has to handle unpredictable conditions such as the heterogeneity, variability, and autonomy of hosts, the need to sustain performance continuously, and the number of hosts participating in the computation. In this paper, we propose an adaptive task reallocation strategy based on the job processing performance of geographically dispersed hosts. We also present a dynamic host management scheme for the dynamic environment created as many hosts on the web change during parallel processing of large-scale applications. This paper implements the PDSWeb (Parallel/Distributed Scheme on Web) system, evaluates it, and applies it to the generation of rendered images, which requires highly intensive computation. The results show an increase of up to 90% with adaptive task reallocation under host variation, and an improvement in performance as hosts are added and deleted.

A Performance Comparison of Parallel Programming Models on Edge Devices (엣지 디바이스에서의 병렬 프로그래밍 모델 성능 비교 연구)

  • Dukyun Nam
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.4 / pp.165-172 / 2023
  • Heterogeneous computing is a technology that utilizes different types of processors to perform parallel processing. It maximizes task processing and energy efficiency by leveraging various computing resources such as CPUs, GPUs, and FPGAs. Edge computing, meanwhile, has developed alongside IoT and 5G technologies. It is a form of distributed computing that utilizes computing resources close to clients, thereby offloading the central server. It has evolved into intelligent edge computing combined with artificial intelligence. Intelligent edge computing enables total data processing, such as context awareness, prediction, control, and simple processing, for the data collected on the edge. If heterogeneous computing can be successfully applied at the edge, it is expected to maximize job processing efficiency while minimizing dependence on the central server. In this paper, experiments were conducted to verify the feasibility of various parallel programming models on high-end and low-end edge devices by using benchmark applications. We analyzed the performance of five parallel programming models on the Raspberry Pi 4 and Jetson Orin Nano as low-end and high-end devices, respectively. In the experiment, OpenACC showed the best performance on the low-end edge device and OpenSYCL on the high-end device due to the stability and optimization of system libraries.

High Throughput Parallel Decoding Method for H.264/AVC CAVLC

  • Yeo, Dong-Hoon;Shin, Hyun-Chul
    • ETRI Journal / v.31 no.5 / pp.510-517 / 2009
  • A high throughput parallel decoding method is developed for context-based adaptive variable length codes. In this paper, several new design ideas are devised and implemented for scalable parallel processing, a reduction in area, and a reduction in power requirements. First, simplified logical operations instead of memory lookups are used for parallel processing. Second, the codes are grouped based on their lengths for efficient logical operation. Third, up to M bits of the input stream can be analyzed simultaneously. For comparison, we designed a logical-operation-based parallel decoder for M=8 and a conventional parallel decoder. High-speed parallel decoding becomes possible with our method. In addition, for similar decoding rates (1.57 codes/cycle for M=8), our new approach uses 46% less chip area than the conventional method.