• Title/Summary/Keyword: serial and parallel algorithms

Search Result 31, Processing Time 0.024 seconds

Sequential and Parallel Algorithms for Finding a Longest Non-negative Path in a Tree (트리에서 가장 긴 비음수 경로를 찾는 직렬 및 병렬 알고리즘)

  • Kim, Sung-Kwon
    • Journal of KIISE: Computer Systems and Theory / v.33 no.12 / pp.880-884 / 2006
  • In an edge-weighted tree (positive, negative, or zero weights are possible), we want to find a longest path such that the sum of the weights of the edges in the path is non-negative. To find a longest non-negative path of a tree, we present a sequential algorithm with $O(n \log n)$ time and a CREW PRAM parallel algorithm with $O(\log^2 n)$ time and $O(n)$ processors, where $n$ is the number of nodes in the tree.
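
The problem statement in this abstract can be made concrete with a brute-force reference. The sketch below is not the paper's $O(n \log n)$ algorithm; it is a simple $O(n^2)$ check that tries every start node. It assumes that path length is measured in number of edges and that nodes are labelled 0..n-1 with n >= 1; the function name `longest_nonnegative_path` is hypothetical.

```python
from collections import defaultdict


def longest_nonnegative_path(n, edges):
    """Brute-force O(n^2) reference for the longest non-negative path problem.

    n:     number of nodes, labelled 0..n-1 (assumed n >= 1)
    edges: list of (u, v, w) tuples describing a tree; w may be negative
    Returns (length_in_edges, (start, end)) for a longest path whose
    edge-weight sum is >= 0. A single node counts as a path of length 0.
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    best_len, best_ends = 0, (0, 0)
    for src in range(n):
        # Iterative DFS: in a tree the path from src to any node is unique,
        # so each stack entry carries that path's edge count and weight sum.
        stack = [(src, -1, 0, 0)]  # (node, parent, edge count, weight sum)
        while stack:
            node, parent, length, weight = stack.pop()
            if weight >= 0 and length > best_len:
                best_len, best_ends = length, (src, node)
            for nxt, w in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, length + 1, weight + w))
    return best_len, best_ends
```

On the path 0-1-2-3 with weights 2, -3, 1, the whole path sums to 0, so it qualifies even though the middle edge is negative. A reference like this is mainly useful for checking a faster implementation on small trees.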

Implementation and analysis of a parallel suffix tree construction algorithm using TBB and Cilk Plus (TBB, Cilk Plus를 이용한 병렬 접미사 트리 생성 알고리즘 구현 및 성능 분석)

  • Seo, Jun-Ho;Na, Joong-Chae
    • Proceedings of the Korean Information Science Society Conference / 2012.06a / pp.403-405 / 2012
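  • The suffix tree is an index data structure used in a variety of applications such as string compression, text processing, and bioinformatics. With the recent spread of 64-bit hardware and multicore CPUs, algorithms that construct suffix trees in memory in parallel have been actively studied. In this paper, we present an implementation method for constructing suffix trees in memory in parallel, based on McCreight's linear-time algorithm and Chen's parallel algorithm, and we implement the parallel algorithm using parallel programming libraries such as TBB and Cilk Plus. In our experiments, the parallel execution achieved up to about a 4-fold performance improvement over the serial execution, and the overhead incurred by using the parallel libraries was found to be very small.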

A Study on comparison of calculation between CPU-intensive and GPU-intensive and finding proper model for specific program (GPU기반의 계산속도와 CPU기반의 계산속도 비교 및 특정 프로그램에 따른 적합한 모델 찾기에 대한 연구)

  • Shin, Hyun-Soo
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.48-51 / 2019
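  • With recent advances in technology, we have entered an era that demands more computation in less time. In this study, we examine the architectures of the CPU and the GPU and compare their computation speeds. We study parallel algorithms in relation to serial algorithms, current applications of GPU parallel processing, and how to find a suitable model in the future.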

Improved Iterative Decoding of Parallel and Serially Concatenated Trellis Coded Modulation (병렬 및 직렬적으로 연접된 트렐리스 부호화 변조 기법을 위한 향상된 반복적 복호 기법)

  • You, Cheol-Woo;Seo, Dong-Sun
    • Journal of IKEEE / v.11 no.4 / pp.198-204 / 2007
  • For parallel and serially concatenated trellis coded modulation (TCM), improved iterative decoding schemes with a simple mechanism are proposed, and their performance is compared with that of conventional decoding schemes. Simulation results show that the proposed schemes provide a considerable decoding gain in additive white Gaussian noise (AWGN) channels and Rayleigh fading channels, even though they can be implemented by a simple modification of conventional decoding algorithms.

Thin Film Bulk Acoustic Resonator(FBAR) Bandpass Filter Design Technique Using Genetic Algorithm (유전자알고리즘을 이용한 FBAR RF 대역통과여파기 설계기법)

  • 이정흠;김형동
    • Journal of the Institute of Electronics Engineers of Korea TC / v.40 no.3 / pp.10-17 / 2003
  • In this paper, a genetic algorithm (GA)-based design technique for Thin Film Bulk Acoustic Resonator (FBAR) RF filters is proposed. Since the BVD (Butterworth-Van Dyke) lumped-element model is valid only around the resonance, an FBAR filter design technique based on the BVD circuit incurs approximation error. Instead of using the BVD model, the optimizing design method uses an analytical electrical impedance equation of the FBAR. The FBAR geometry, such as the thickness of the piezoelectric layer and the area, which significantly affects the filter response, is optimized by the GA. A US-PCS Rx bandpass filter obtained by the proposed technique shows a better response than the typical and BVD-based filters.

High Speed Turbo Product Code Decoding Algorithm (고속 Turbo Product 부호 복호 알고리즘 및 구현에 관한 연구)

  • Choi Duk-Gun;Lee In-Ki;Jung Ji-Won
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.6C / pp.442-449 / 2005
  • In this paper, we introduce three simplified high-speed decoding algorithms for a turbo product code (TPC) decoder. First, a parallel decoder structure in which the row and column decoders operate in parallel is proposed. Second, the HAD (Hard Decision Aided) algorithm is used for early stopping. Lastly, the P-Parallel TPC decoder is a parallel decoding scheme that processes P rows and P columns in parallel instead of decoding them one by one as in the original scheme.
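
The early-stopping idea mentioned in this abstract can be sketched as follows. This is a minimal illustration of the generic hard-decision-aided criterion (stop once the hard decisions no longer change between two consecutive iterations), not the paper's decoder. The `iterate` callable, the function names, and the sign convention (LLR >= 0 maps to bit 0) are assumptions made for the example.

```python
def hard_decisions(llrs):
    """Map soft values to bits. Assumed convention: LLR >= 0 -> 0, LLR < 0 -> 1."""
    return [0 if x >= 0 else 1 for x in llrs]


def decode_with_early_stop(llrs, iterate, max_iters=8):
    """Run an iterative decoder until the hard decisions stop changing.

    llrs:      initial soft values (one per bit)
    iterate:   hypothetical callable that performs one full decoding
               iteration and returns the updated soft values
    max_iters: upper bound on the number of iterations
    Returns (final_llrs, iterations_used).
    """
    prev = None  # hard decisions from the previous iteration, if any
    for it in range(1, max_iters + 1):
        llrs = iterate(llrs)
        curr = hard_decisions(llrs)
        # Stop when two consecutive iterations agree on every bit.
        if prev is not None and curr == prev:
            return llrs, it
        prev = curr
    return llrs, max_iters
```

With a toy `iterate` that only scales the soft values, the decisions agree after the second iteration and the loop exits early; a real decoder would pass its row/column decoding step instead.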

Parallel Cell-Connectivity Information Extraction Algorithm for Ray-casting on Unstructured Grid Data (비정렬 격자에 대한 광선 투사를 위한 셀 사이 연결정보 추출 병렬처리 알고리즘)

  • Lee, Jihun;Kim, Duksu
    • Journal of the Korea Computer Graphics Society / v.26 no.1 / pp.17-25 / 2020
  • We present a novel multi-core CPU based parallel algorithm for cell-connectivity information extraction, which is one of the preprocessing steps for volume rendering of unstructured grid data. We first examine the synchronization issues that arise when the prior serial algorithm is parallelized naively. Then, we propose a 3-step parallel algorithm that achieves high parallelization efficiency by removing synchronization in each step. Our 3-step algorithm also improves cache utilization by increasing the spatial locality of the duplicated triangle test, which is the core operation in building cell-connectivity information. We further improve the efficiency of our parallel algorithm by employing a memory pool for each thread. To check the benefit of our approach, we implemented our method on a system consisting of two octa-core CPUs and measured the performance. Our method shows continuous performance improvement as threads are added, and it achieves up to 82.9 times higher performance than the prior serial algorithm when using thirty-two threads (sixteen physical cores). These results demonstrate the high parallelization efficiency and high cache utilization efficiency of our method. They also validate the suitability of our algorithm for large-scale unstructured data.
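
To illustrate what cell-connectivity extraction and the duplicated triangle test involve, here is a small serial sketch. It is one common hash-map approach, not the paper's 3-step parallel algorithm and not necessarily the prior serial algorithm it refers to. It assumes tetrahedral cells given as 4-tuples of vertex indices; the function name `build_cell_connectivity` is hypothetical.

```python
from itertools import combinations


def build_cell_connectivity(cells):
    """Serial cell-connectivity extraction for a tetrahedral mesh.

    cells: list of 4-tuples of vertex indices (assumed tetrahedra)
    Returns neighbors, where neighbors[c][f] is the index of the cell that
    shares local face f of cell c, or -1 if that face is on the boundary.
    Local face f is the f-th vertex triple from combinations(cell, 3).
    """
    face_owner = {}  # sorted vertex triple -> (cell index, local face index)
    neighbors = [[-1] * 4 for _ in cells]
    for c, verts in enumerate(cells):
        for f, face in enumerate(combinations(verts, 3)):
            key = tuple(sorted(face))
            if key in face_owner:
                # Duplicated triangle: a second cell has the same face,
                # so the two cells are neighbors across it.
                other_cell, other_face = face_owner.pop(key)
                neighbors[c][f] = other_cell
                neighbors[other_cell][other_face] = c
            else:
                face_owner[key] = (c, f)
    return neighbors
```

Every cell reads and writes the single `face_owner` table, which is the kind of shared state that would need locking if the outer loop were split across threads naively.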

Request Two-Phase Locking Method for Series Sequence Re-adjustment of Concurrency Control in Multi-Level Secure DBMS (다단계 보안 데이터베이스 시스템에서 병행수행 제어의 직렬화 순서를 재조정하기 위한 요청 2단계 로킹기법)

  • Lee, Seungsoo;Cho, Jinsung;Jeong, Byungsoo
    • Proceedings of the Korea Information Processing Society Conference / 2004.05a / pp.105-108 / 2004
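  • In multilevel secure database systems, basic concurrency control techniques have suffered from problems such as covert channels and deadlocks. Approaches that address these problems by dynamically readjusting the serialization order have been proposed, but the complexity of their algorithms incurs overhead and long execution times, and therefore demands large amounts of system resources and high-specification systems. In addition, because these methods use multiple versions, the extra management cost is high, and individual transactions repeatedly go through unnecessary delays and re-executions. The algorithm proposed in this paper is Request Two-Phase Locking, which manages schedules while guaranteeing a serialization order suited to the purpose of the database; by adding a request lock to the basic principle of two-phase locking, it performs concurrency control more efficiently. The request lock allows lock acquisition and release in each transaction schedule to be handled flexibly according to the needs of concurrency control. With three locking modes (read lock, write lock, and request lock), the method prevents conflicts and dynamically adjusts the serialization order according to the characteristics of the conflicting operations, thereby avoiding blocking and maintaining parallelism.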

Design Methodology of LDPC Codes based on Partial Parallel Algorithm (부분병렬 알고리즘 기반의 LDPC 부호 구현 방안)

  • Jung, Ji-Won
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.4 no.4 / pp.278-285 / 2011
  • This paper analyzes the encoding structure and the decoding algorithm proposed by the DVB-S2 specification. An LDPC decoder can be implemented as a fully serial decoder, a partially parallel decoder, or a fully parallel decoder. The partially parallel scheme is an efficient choice for achieving an appropriate trade-off between hardware complexity and decoding speed. Therefore, this paper proposes an efficient memory structure for the check node update block, the bit node update block, and the LLR memory.

Massive Parallel Processing Algorithm for Semiconductor Process Simulation (반도체 공정 시뮬레이션을 위한 초고속 병렬 연산 알고리즘)

  • 이제희;반용찬;원태영
    • Journal of the Korean Institute of Telematics and Electronics D / v.36D no.3 / pp.48-58 / 1999
  • In this paper, a new parallel computation method that fully utilizes the parallel processors in both mesh generation and FEM calculation for 2D/3D process simulation is presented. High-performance parallel FEM and parallel linear algebra solving techniques showed that the excessive memory and CPU time requirements of three-dimensional simulation could be handled successfully. Our parallelized numerical solver successfully analyzed the transient enhanced diffusion (TED) of dopants and the irregular shape of R-LOCOS within 15 minutes. The Monte Carlo technique requires excessive CPU time; therefore, a high-performance parallel solving technique was applied to our cascade sputter simulation. Our sputter simulator achieved a calculation time of 520 sec and a speedup of 25 using 30 processors. We found that the optimal number of injected ions for our MC sputter simulation is 30,000.
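
As a quick reading aid for the reported figures, the parallel efficiency implied by a speedup of $S = 25$ on $p = 30$ processors can be worked out with the standard definition $E = S/p$. The symbols $S$, $p$, and $E$ are introduced here for illustration and do not appear in the abstract.

$$
E = \frac{S}{p} = \frac{25}{30} \approx 0.83
$$

That is, each processor is used at roughly 83% of ideal linear scaling under this definition.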
