• Title/Summary/Keyword: and Parallel Processing


Parallel Data Extraction Architecture for High-speed Playback of High-density Optical Disc (고용량 광 디스크의 고속 재생을 위한 병렬 데이터 추출구조)

  • Choi, Goang-Seog
    • Journal of Korea Multimedia Society / v.12 no.3 / pp.329-334 / 2009
  • When an optical disc is played, the pick-up first converts light into an analog signal. The analog signal is equalized to remove inter-symbol interference, and the equalized signal is then converted into a digital signal from which the synchronized data and clock signals are extracted. Many algorithms minimize the BER of this extraction when a high-density optical disc such as BD is played at low speed. When such a disc is played at high speed, however, it is difficult to apply the same extraction algorithm to the data PLL and PRML architecture used in low-speed applications, because those architectures would have to process a signal of more than 800 MHz. In 0.13-$\mu$m CMOS technology, this generally requires high-speed analog cores and considerable layout effort. This paper proposes a parallel data PLL and PRML architecture for the data and clock extraction circuit that can operate at BD 8x, the maximum speed of the high-density optical disc. Test results show that the proposed architecture operates without processing errors at BD 8x speed.


THREE-DIMENSIONAL ROUND-ROBIN SCHEDULER FOR ADVANCED INPUT QUEUING SWITCHES (고속 입력큐 스위치 패브릭을 위한 3차원 라운드로빈 스케줄러)

  • Jeong, Gab-Joong;Lee, Bhum-Cheol
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2003.10a / pp.373-376 / 2003
  • This paper presents a new three-dimensional round-robin (3DRR) scheduler that provides high throughput and fair access in an advanced input-queued packet switch using shared input buffers. We consider an architecture, called the distributed common input buffer switch, in which each input port group shares a common buffer and maintains a separate queue for each output. In an NxN switch, our scheduler determines which of the MxN input queues are served during each time slot, where M is the number of common buffers. We suppose that each common buffer has K input ports and K output ports and manages N output queues. The 3DRR scheduler determines MxK queues in every K (M) cycles when $K \geq M$ ($K \leq M$), and provides massively parallel processing for high-speed switches with a large number of ports. The 3DRR scheduler can be built from duplicated simple logic components, allowing very high-speed implementation.

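
The queue-selection idea in the abstract above can be illustrated with a much simpler round-robin matcher. This is only a sketch under stated assumptions, not the 3DRR algorithm itself: each common buffer is granted at most one output per time slot (that is, K = 1), and the function name `round_robin_schedule` and the `start` offset are hypothetical.

```python
def round_robin_schedule(queues, start):
    """Pick non-conflicting (buffer, output) pairs for one time slot.

    queues[m][n] is the number of packets common buffer m holds for output n.
    start is a rotating offset, advanced by the caller every time slot,
    so that no buffer or output is permanently favoured.
    Assumption: each buffer sends at most one packet per slot (K = 1).
    """
    num_buffers = len(queues)
    num_outputs = len(queues[0])
    taken_outputs = set()
    grants = []
    for i in range(num_buffers):
        m = (start + i) % num_buffers          # rotate which buffer is served first
        for j in range(num_outputs):
            n = (start + m + j) % num_outputs  # rotate each buffer's preferred output
            if queues[m][n] > 0 and n not in taken_outputs:
                taken_outputs.add(n)
                grants.append((m, n))
                break                          # one grant per buffer in this sketch
    return grants


if __name__ == "__main__":
    occupancy = [[1, 1], [1, 0]]
    for slot in range(2):
        print(slot, round_robin_schedule(occupancy, slot))
```

Advancing `start` each slot changes which buffer gets first choice, which is the fairness property round-robin schedulers aim for. A hardware scheduler would evaluate these choices in parallel rather than in nested loops.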

FPGA-based Implementation of Fast Histogram Equalization for Image Enhancement (영상 품질 개선을 위한 FPGA 기반 고속 히스토그램 평활화 회로 구현)

  • Ryu, Sang-Moon
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.11 / pp.1377-1383 / 2019
  • Histogram equalization is the most frequently used algorithm for image enhancement. A hardware implementation runs significantly faster than a software version. The overall performance of an FPGA-based implementation of histogram equalization can be improved by pipelining the design and by exploiting the multipliers and the many SRAM blocks embedded in recent FPGAs. This work proposes a fast histogram equalization circuit for 8-bit gray-level images. The proposed design contains a FIFO so that one image can be equalized while the histogram of the next image is being calculated. Because of this overlap in time, the embedded multipliers, and the pipelined design, the proposed design can equalize nearly one pixel per clock cycle. Its dual parallel version is almost twice as fast as the original.
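
For readers unfamiliar with the algorithm, the following is a minimal software sketch of textbook histogram equalization for 8-bit gray levels. It is not the FPGA circuit from the paper; the function name `equalize_histogram` and the flat pixel list are assumptions made for illustration. The two phases (histogram accumulation, then per-pixel lookup) are the ones a hardware design can overlap across consecutive images.

```python
def equalize_histogram(pixels):
    """Equalize a flat list of 8-bit gray levels (integers 0..255)."""
    if not pixels:
        return []

    # Phase 1: accumulate the histogram and its cumulative distribution.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest level present
    denom = total - cdf_min
    if denom == 0:                           # constant image: nothing to spread
        return list(pixels)

    # Phase 2: build a lookup table, then map every pixel through it.
    # Integer arithmetic with round-half-up, as fixed-point hardware might do.
    lut = [max(0, ((c - cdf_min) * 255 + denom // 2) // denom) for c in cdf]
    return [lut[p] for p in pixels]


if __name__ == "__main__":
    print(equalize_histogram([50, 50, 100, 150]))
```

The multiplication and division in the lookup-table step are the part that embedded multipliers would accelerate in hardware.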

Design and Implementation of High-Performance Cryptanalysis System Based on GPUDirect RDMA (GPUDirect RDMA 기반의 고성능 암호 분석 시스템 설계 및 구현)

  • Lee, Seokmin;Shin, Youngjoo
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.6 / pp.1127-1137 / 2022
  • Cryptanalysis and decryption techniques that exploit the parallel operation of GPUs have been studied with the aim of shortening the computation time of password analysis systems. These studies focus on optimizing code to speed up cryptanalysis operations on a single GPU, or on simply increasing the number of GPUs to enhance parallelism. However, using a large number of GPUs without optimizing data transmission causes longer transmission latency than using a single GPU and increases the overall computation time of the cryptanalysis system. In this paper, we investigate GPUDirect RDMA and related technologies used for high-performance data processing in deep learning and HPC research in GPU clustering environments. In addition, we present a method of designing a high-performance cryptanalysis system using these technologies. Furthermore, based on the suggested system topology, we present a method of implementing a cryptanalysis system using password cracking and GPU reduction. Finally, we present performance evaluation results that demonstrate the effect of applying these high-performance technologies to the implemented cryptanalysis system, and show the expected benefits of the proposed system design.

Design and Performance Analysis of A Parallel Digital Signal Processing System (병렬 디지털신호처리시스템의 설계와 성능분석)

  • Moon, B.P.;Park, J.S.;Oh, D.S.;Jeon, C.H.;Park, S.J.;Lee, D.H.;Oh, W.C.;Han, K.T.
    • Proceedings of the Korean Information Science Society Conference / 1998.10a / pp.724-726 / 1998
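  • In this paper, we propose a parallel digital signal processing system for processing massive amounts of data in real time. We propose four signal processing unit models with different bus and memory structures and analyze their performance. The signal processing units were analyzed in two ways: by a performance measure defined as the sum of the hardware delay and the bus delay incurred in executing a sonar algorithm, and by comparing board complexity. The analysis showed that the model using both local memory and shared memory is the most efficient.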


WRF Physics Models Using GP-GPUs with CUDA Fortran (WRF 물리 과정의 GP-GPU 계산을 위한 CUDA Fortran 프로그램 구현)

  • Kim, Youngtae;Lee, Yong Hee;Chung, Kwan-Young
    • Atmosphere / v.23 no.2 / pp.231-235 / 2013
  • We parallelized the major physics routines of WRF for Nvidia GP-GPUs with CUDA Fortran. GP-GPUs were originally designed for graphics processing, but show high performance with low power consumption when computing numerical models. In the CUDA environment, a data domain is divided into thread blocks, and the threads in each thread block compute in parallel. We parallelized the WRF program to use thread blocks efficiently. We validated the GP-GPU program against the original CPU program, and the WRF model using GP-GPUs shows an efficient speedup.
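
The division of a data domain into thread blocks can be modelled in a few lines. The sketch below is a plain-Python emulation of the usual one-dimensional CUDA index calculation, not CUDA Fortran and not code from the paper; the function name `decompose` and its parameters are hypothetical.

```python
def decompose(domain_size, block_size):
    """Map every point of a 1-D domain to a (block, thread) pair.

    Mirrors the usual CUDA pattern:
        global_index = block_index * block_size + thread_index
    On a GPU all (block, thread) pairs run concurrently; here they are
    simply enumerated in order.
    """
    num_blocks = (domain_size + block_size - 1) // block_size  # ceiling division
    mapping = []
    for block in range(num_blocks):
        for thread in range(block_size):
            global_index = block * block_size + thread
            if global_index < domain_size:  # guard threads past the domain edge
                mapping.append((block, thread, global_index))
    return mapping


if __name__ == "__main__":
    for block, thread, index in decompose(5, 2):
        print(f"block {block}, thread {thread} -> point {index}")
```

The bounds guard matters whenever the domain size is not a multiple of the block size, since the last block then contains threads with no data point to process.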

Numerical Prediction of Incompressible Flows Using a Multi-Block Finite Volume Method on a Parallel Computer (병렬 컴퓨터에서 다중블록 유한체적법을 이용한 비압축성 유동해석)

  • Kang, Dong-Jin;Sohn, Jeong-Lak
    • The KSFM Journal of Fluid Machinery / v.1 no.1 s.1 / pp.72-80 / 1998
  • Incompressible flows are analyzed on a parallel computing system by numerically solving the Navier-Stokes equations with a multi-block finite volume method. The numerical algorithms adopted in this study include (1) the QUICK upwinding scheme for convective terms, (2) central differencing for other terms, and (3) second-order Euler differencing for the time-marching procedure. Structured grids are used on body-fitted coordinates with a multi-block concept that uses overlaid grids on the block-interfacing boundaries. The computational code is parallelized in the MPI environment. The numerical accuracy of the method is verified by solving a benchmark test case of the flow inside a two-dimensional rectangular cavity. Computation of the flow in an axial compressor cascade is conducted using 4 PEs; no numerical instabilities are observed, and the present computational method is expected to be applicable to turbomachinery flow problems without major difficulties.


Analysis of Parallel and Sequential processing for integrated XQuery query (통합 XQuery 질의의 병렬처리와 순차처리 성능분석)

  • Kang, Soon-Jong;Park, Jong-Hyun;Kang, Ji-Hoon
    • Proceedings of the Korean Information Science Society Conference / 2006.10c / pp.214-217 / 2006
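  • XQuery, a query language for searching XML documents, was designed so that query results with their own structure can be composed from data drawn from various data sources, and it has become the standard XML query language. In particular, for integrated queries over multiple XML documents in a distributed environment, the choice of query processing plan directly affects processing efficiency. Research on join processing methods, one of the factors that determine the query processing plan, is therefore important. However, in an integrated query it is not easy to determine the query processing method for a single XML document on the basis of the join structure. In this paper, for a variety of integrated queries, including joins, over multiple XML documents in a distributed environment, we experimentally apply a parallel processing method, a sequential processing method, and a hybrid method combining the two, compare and analyze their processing times, and explore efficient join methods and join orders for multiple documents.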


Volume Holographic Fingerprint Recognition System for Personal Identification (개인 인증을 위한 체적 홀로그래픽 지문인식 시스템)

  • 이승현
    • Journal of the Korean Society of Safety / v.13 no.4 / pp.256-263 / 1998
  • In this paper, we propose a volume holographic fingerprint recognition system based on an optical correlator for personal identification. An optical correlator offers the high speed and parallel processing characteristics of optics. Matched filters are recorded in a volume hologram, which can store data at high density, transfer them at high speed, and access any data element at random. The multiple reference images of the database are prerecorded in a photorefractive crystal in the form of Fourier transform images, simply by passing the image displayed on a spatial light modulator through a Fourier transform lens. Angular multiplexing of the multiple database holograms is achieved by rotating the crystal with a step motor. Experimental results show that the proposed system can be used as a security verification system.

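
The matched-filter correlation that the optical system performs with lenses can be mimicked numerically. The sketch below is a one-dimensional digital analogue under stated assumptions: it uses a naive discrete Fourier transform on short lists instead of 2-D images, and the function names (`dft`, `correlation_score`, `identify`) and the sample data are hypothetical, not taken from the paper.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out


def correlation_score(signal, reference):
    """Peak of the circular cross-correlation, normalized by reference energy.

    Correlation is computed in the Fourier domain: multiply the signal
    spectrum by the conjugate reference spectrum (the matched filter),
    then transform back.
    """
    spectrum = dft(signal)
    matched_filter = [r.conjugate() for r in dft(reference)]
    product = [s * f for s, f in zip(spectrum, matched_filter)]
    correlation = dft(product, inverse=True)
    energy = math.sqrt(sum(r * r for r in reference))
    return max(abs(c) for c in correlation) / energy


def identify(signal, references):
    """Return the name of the stored reference that correlates best."""
    return max(references, key=lambda name: correlation_score(signal, references[name]))


if __name__ == "__main__":
    database = {"a": [1, 0, 0, 1, 0, 0, 0, 0], "b": [1, 1, 1, 0, 0, 0, 0, 0]}
    probe = [0, 0, 1, 0, 0, 1, 0, 0]  # pattern "a", circularly shifted by two
    print(identify(probe, database))
```

Because the correlation peak is shift-invariant, the probe is matched even though it is displaced relative to the stored reference. In the optical system every stored filter is interrogated by light in parallel, whereas this sketch loops over the references one by one.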

A Design of the Preprocess Module for the Distributed Process of the ECG signals (ECG 신호의 분산처리를 위한 Preprocess Module에 관한 연구)

  • Song, H.B.;Lee, K.J.;Yoon, H.R.;Lee, M.H.
    • Proceedings of the KIEE Conference / 1987.07b / pp.1338-1340 / 1987
  • This paper describes the design of a data preprocessing module for ECG signals. The module processes the data obtained from two channels. It is composed of an A/D converter, a QRS detector, a one-chip microcomputer, and memory. The module performs the following functions: digital filtering, R wave detection, and determination of the reference point for the ST segment. The measured points are transferred to the next data module by an interrupt process. This preprocessing module can serve as the basis for parallel data processing in real-time automatic diagnosis.

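
Two of the preprocessing steps named in the abstract above, digital filtering and R wave detection, can be sketched in software. This is an illustrative simplification, not the module's actual method: the moving-average filter, the fixed threshold, the refractory period, and both function names are assumptions, and the ST-segment reference point is not covered.

```python
def moving_average(samples, width=3):
    """Smooth the signal with a centered moving average (simple digital filter)."""
    half = width // 2
    smoothed = []
    for i in range(len(samples)):
        window = samples[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def detect_r_peaks(samples, threshold, refractory=5):
    """Return indices of local maxima above threshold.

    refractory is the minimum number of samples between accepted peaks,
    which keeps one heartbeat from being counted twice.
    """
    peaks = []
    for i in range(1, len(samples) - 1):
        is_local_max = samples[i - 1] < samples[i] >= samples[i + 1]
        if is_local_max and samples[i] > threshold:
            if not peaks or i - peaks[-1] > refractory:
                peaks.append(i)
    return peaks


if __name__ == "__main__":
    ecg = [0, 0, 1, 5, 1, 0, 0, 0, 0, 0, 1, 6, 1, 0, 0]  # synthetic two-beat trace
    print(detect_r_peaks(moving_average(ecg), threshold=2.2))
```

In a two-channel setup such as the one described, the same routine could be applied to each channel independently, which is what makes the preprocessing stage a natural unit for parallel processing.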