• Title/Summary/Keyword: 병렬성능

Search Result 1,946, Processing Time 0.032 seconds

Performance Analysis of Stepwise Parallel Processing for Cell Search in WCDMA over Rayleigh Fading Channels (레일리 페이딩 채널에서 WCDMA의 단계별 병렬 처리 셀 탐색의 성능 해석)

  • 송문규
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.27 no.2B
    • /
    • pp.125-136
    • /
    • 2002
  • It is very important to acquire the synchronization in a intercell asynchronous WCDMA system, and it is carried out through the three-step cell search process. The cell search can operate in a stepwise parallel manner, where each step works in pipelined operation, to reduce the cell search time. In case that the execution time is set to be the same in each step, excessive accumulations will be caused in both step 1 and step 3, because step 2 should take at least one frame for its processing. In general, the effect of post-detection integration becomes saturated as the number of the accumulations increases. Therefore, the stepwise parallel scheme does not give much enhancement. In this paper, the performance of the stepwise parallel processing for cell search in WCDMA system is analyzed over Rayleigh fading channels. Through the analysis, the effect of cell search parameters such as the number of accumulations in each step and the power ratio allocated among channels is investigated. In addition, the performance of the stepwise parallel cell search is improved by adjusting the execution time appropriately for each step and is compared with that of the conventional stepwise serial processing.

Parallel Rendering of High Quality Animation based on a Dynamic Workload Allocation Scheme (작업영역의 동적 할당을 통한 고화질 애니메이션의 병렬 렌더링)

  • Rhee, Yun-Seok
    • Journal of the Korea Society of Computer and Information
    • /
    • v.13 no.1
    • /
    • pp.109-116
    • /
    • 2008
  • Even though many studies on parallel rendering based on PC clusters have been done. most of those did not cope with non-uniform scenes, where locations of 3D models are biased. In this work. we have built a PC cluster system with POV-Ray, a free rendering software on the public domain, and developed an adaptive load balancing scheme to optimize the parallel efficiency Especially, we noticed that a frame of 3D animation are closely coherent with adjacent frames. and thus we could estimate distribution of computation amount, based on the computation time of previous frame. The experimental results with 2 real animation data show that the proposed scheme reduces by 40% of execution time compared to the simple static partitioning scheme.

  • PDF

Effective Parallel Hash Join Algorithm Based on Histoftam Equalization in the Presence of Data Skew (데이터 편재 하에서 히스토그램 변환기법에 기초한 효율적인 병렬 해쉬 결합 알고리즘)

  • Park, Ung-Gyu;Choe, Hwang-Gyu;Kim, Tak-Gon
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.2
    • /
    • pp.338-348
    • /
    • 1997
  • In this pater, we first propose a data distribution framework to resolve load imbalance and bucket oerflow in parallel hash join.Using the histogram equalization technique, the framework transforms a histogram of skewed data to the desired uniform distribution that corresponds to the relative computing power of node processors in the system.Next we propose an effcient parallel hash join algorithm for handing skwed data based on the proposed data distribution methodology.For performance comparison of our algorithm with other hash join algorithms.we perform similation experiments and actual exeution on COREDB database computer with 8-node hyperube architecture. In these experiments, skwed data distebution of the join atteibute is modeled using a Zipf-like distribution.The perfomance studies undicate that our algorithm outperforms other algorithms in the skewed cases.

  • PDF

Parallel Rabin Fingerprinting on GPGPU for Efficient Data Deduplication (효율적인 데이터 중복제거를 위한 GPGPU 병렬 라빈 핑거프린팅)

  • Ma, Jeonghyeon;Park, Sejin;Park, Chanik
    • Journal of KIISE
    • /
    • v.41 no.9
    • /
    • pp.611-616
    • /
    • 2014
  • Rabin fingerprinting used for chunking requires the largest amount computation time in data deduplication, In this paper, therefore, we proposed parallel Rabin fingerprinting on GPGPU for efficient data deduplication. In addition, for efficient parallelism in Rabin fingerprinting, four issues are considered. Firstly, when dividing input data stream into data sections, we consider the data located near the boundaries between data sections to calculate Rabin fingerprint continuously. Secondly, we consider exploiting the characteristics of Rabin fingerprinting for efficient operation. Thirdly, we consider the chunk boundaries which can be changed compared to sequential Rabin fingerprinting when adapting parallel Rabin fingerprinting. Finally, we consider optimizing GPGPU memory access. Parallel Rabin fingerprinting on GPGPU shows 16 times and 5.3 times better performance compared to sequential Rabin fingerprinting on CPU and compared to parallel Rabin fingerprinting on CPU, respectively. These throughput improvement of Rabin fingerprinting can lead to total performance improvement of data deduplication.

Performance Evaluation and Verification of MMX-type Instructions on an Embedded Parallel Processor (임베디드 병렬 프로세서 상에서 MMX타입 명령어의 성능평가 및 검증)

  • Jung, Yong-Bum;Kim, Yong-Min;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.10
    • /
    • pp.11-21
    • /
    • 2011
  • This paper introduces an SIMD(Single Instruction Multiple Data) based parallel processor that efficiently processes massive data inherent in multimedia. In addition, this paper implements MMX(MultiMedia eXtension)-type instructions on the data parallel processor and evaluates and analyzes the performance of the MMX-type instructions. The reference data parallel processor consists of 16 processors each of which has a 32-bit datapath. Experimental results for a JPEG compression application with a 1280x1024 pixel image indicate that MMX-type instructions achieves a 50% performance improvement over the baseline instructions on the same data parallel architecture. In addition, MMX-type instructions achieves 100% and 51% improvements over the baseline instructions in energy efficiency and area efficiency, respectively. These results demonstrate that multimedia specific instructions including MMX-type have potentials for widely used many-core GPU(Graphics Processing Unit) and any types of parallel processors.

Parallel lProcessing of Pre-conditioned Navier-Stokes Code on the Myrinet and Fast-Ethernet PC Cluster (Myrinet과 Fast-Ethernet PC Cluster에서 예조건화 Navier-Stokes코드의 병렬처리)

  • Lee, G.S.;Kim, M.H.;Choi, J.Y.;Kim, K.S.;Kim, S.L.;Jeung, I.S.
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.30 no.6
    • /
    • pp.21-30
    • /
    • 2002
  • A preconditioned Navier-Stokes code was parallelized by the domain decomposition technique, and the accuracy of the parallelized code was verified through a comparison with the result of a sequential code and experimental data. Parallel performance of the code was examined on a Myrinet based PC-cluster and a Fast-Ethernet system. Speed-up ratio was examined as a major performance parameter depending on the number of processor and the network communication topology. In this test, Myrinet system shows a superior parallel performance to the Fast-Ethernet system as was expected. A test for the dependency on problem size also shows that network communication speed in a crucial factor for parallel performance, and the Myrinet based PC-cluster is a plausible candidate for high performance parallel computing system.

Prediction-Based Parallel Gate-Level Timing Simulation Using Spatially Partial Simulation Strategy (공간적 부분시뮬레이션 전략이 적용된 예측기반 병렬 게이트수준 타이밍 시뮬레이션)

  • Han, Jaehoon;Yang, Seiyang
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.8 no.3
    • /
    • pp.57-64
    • /
    • 2019
  • In this paper, an efficient prediction-based parallel simulation method using spatially partial simulation strategy is proposed for improving both the performance of the event-driven gate-level timing simulation and the debugging efficiency. The proposed method quickly generates the prediction data on-the-fly, but still accurately for the input values and output values of parallel event-driven local simulations by applying the strategy to the simulation at the higher abstraction level. For those six designs which had used for the performance evaluation of the proposed strategy, our method had shown about 3.7x improvement over the most general sequential event-driven gate-level timing simulation, 9.7x improvement over the commercial multi-core based parallel event-driven gate-level timing simulation, and 2.7x improvement over the best of previous prediction-based parallel simulation results, on average.

A Study on the Performance of Parallelepiped Classification Algorithm (평행사변형 분류 알고리즘의 성능에 대한 연구)

  • Yong, Whan-Ki
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.4 no.4
    • /
    • pp.1-7
    • /
    • 2001
  • Remotely sensed data is the most fundamental data in acquiring the GIS informations, and may be analyzed to extract useful thematic information. Multi-spectral classification is one of the most often used methods of information extraction. The actual multi-spectral classification may be performed using either supervised or unsupervised approaches. This paper analyze the effect of assigning clever initial values to image classes on the performance of parallelepiped classification algorithm, which is one of the supervised classification algorithms. First, we investigate the effect on serial computing model, then expand it on MIMD(Multiple Instruction Multiple Data) parallel computing model. On serial computing model, the performance of the parallel pipe algorithm improved 2.4 times at most and, on MIMD parallel computing model the performance improved about 2.5 times as clever initial values are assigned to image class. Through computer simulation we find that initial values of image class greatly affect the performance of parallelepiped classification algorithms, and it can be improved greatly when classes on both serial computing model and MIMD parallel computation model.

  • PDF

Search scheme for parallel spatial index (병렬 공간 색인을 위한 검색 기법)

  • Seo, Young-Duk
    • Journal of Korea Spatial Information System Society
    • /
    • v.7 no.2 s.14
    • /
    • pp.81-89
    • /
    • 2005
  • Declustering and parallel index structures are important research areas to improve a performance of databases. Previous researches proposed several distribution schemes for parallel R-trees, however there is no search schemes to be suitable for the index. In this paper, we propose schemes to improve the performance of range queries for distribute parallel indexes. The proposed schemes use the features that a parallel disk can read multiple nodes from various disks. The proposed schemes are verified using various implementations and performance evaluations. We propose new schemes which can read multiple nodes from multiple disks in contrast that to the previous schemes which can read a node from disk. The experimental evaluation shows that the proposed schemes give us the performance improvement by 40% from the previous researches.

  • PDF

Design of Web Based Parallel I/O Control System Using IEEE 1284 Operating Modes (IEEE 1284 동작 모드를 사용하는 웹 기반 병렬 I/O 제어 장치의 설계)

  • Chang, Ho-Sung
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.11 no.3
    • /
    • pp.991-996
    • /
    • 2010
  • In this paper, we designed a parallel I/O control system using IEEE 1284 operating modes and implemented remote control communication under the internet environment. The IEEE 1284 standard defines an interface compatible with several distinct operation modes and brings higher performance to the PC parallel port. Therefore, parallel port devices become easier to configure and simplify interface because new operating systems bring PnP function to the parallel port with the Device/ID identification sequence. With these enhancements, the parallel port become an even better low-cost, readily available I/O port on the PC.