• Title/Abstract/Keywords: Parallel Process


Acceleration of Intrusion Detection for Multi-core Video Surveillance Systems

  • 이길범;정상진;김태환;이명진
    • 전자공학회논문지 / Vol. 50, No. 12 / pp.141-149 / 2013
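  • This paper proposes a way to accelerate intrusion detection in video surveillance systems based on multi-core processors. The acceleration is achieved through parallelization: the existing intrusion detection algorithm is analyzed, and a frame-level parallel processing structure is designed with data dependencies taken into account. To validate the parallelized intrusion detection, it is implemented as a multi-threaded program and the degree of acceleration is measured. In an environment that supports up to eight logical threads, the detection speed of the implemented program improves by up to 353.76% compared with conventional single-threaded processing.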
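
The frame-level structure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the background-difference detector, the thresholds, and the names `detect_intrusion` and `detect_parallel` are assumptions, and the sketch assumes every frame is compared against a fixed background so that frames do not depend on one another.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_intrusion(frame, background, pixel_threshold=30, count_threshold=3):
    """Flag a frame when enough pixels differ from the background (assumed detector)."""
    changed = sum(
        1 for f, b in zip(frame, background) if abs(f - b) > pixel_threshold
    )
    return changed >= count_threshold


def detect_parallel(frames, background, workers=8):
    """Hand whole frames to a thread pool; results keep the input frame order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda fr: detect_intrusion(fr, background), frames))


# Hypothetical 16-pixel frames, flattened to one list of intensities each.
background = [10] * 16
frames = [
    [10] * 16,              # identical to the background
    [10] * 12 + [200] * 4,  # four strongly changed pixels
    [12] * 16,              # small noise only
]
results = detect_parallel(frames, background)
print(results)
```

The sketch shows the structure only. In CPython, CPU-bound pure-Python work does not speed up with threads because of the global interpreter lock, so a real measurement would need native code or separate processes.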

Parallel processing in structural reliability

  • Pellissetti, M.F.
    • Structural Engineering and Mechanics / Vol. 32, No. 1 / pp.95-126 / 2009
  • The present contribution addresses the parallelization of advanced simulation methods for structural reliability analysis, which have recently been developed for large-scale structures with a high number of uncertain parameters. In particular, the Line Sampling method and the Subset Simulation method are considered. The proposed parallel algorithms exploit the parallelism that arises from the possibility of performing independent FE analyses simultaneously. For the Line Sampling method, a parallelization scheme is proposed both for the actual sampling process and for the statistical gradient estimation method used to identify the so-called important direction of the Line Sampling scheme. Two parallelization strategies are investigated for the Subset Simulation method. The first consists in the embarrassingly parallel advancement of distinct Markov chains; in this case the speedup is bounded by the number of chains advanced simultaneously. The second parallel Subset Simulation algorithm utilizes the concept of speculative computing. Speedup measurements for the FE model of a multistory building (24,000 DOFs) show that the wall-clock time is reduced to a very viable amount (<10 minutes for Line Sampling and approximately 1 hour for Subset Simulation). The measurements, conducted on clusters of multi-core nodes, also indicate a strong sensitivity of the parallel performance to the load level of the nodes, in terms of the number of simultaneously used cores. This performance degradation is related to memory bottlenecks during the modal analysis required in each FE analysis.

Study on the Structural Analysis of Small Size Industrial High Speed Parallel Robot

  • 박찬훈;도현민;최태용;김병인
    • 한국정밀공학회지 / Vol. 30, No. 9 / pp.923-930 / 2013
  • Interest in high-speed handling robots is growing because lowering the unit cost of production is important for price competitiveness. As is well known, the parallel kinematic mechanism is well suited to implementing a high-speed robot system. The moving parts of a high-speed parallel robot must be designed to be lightweight. However, lightweight links induce vibration because they are driven at high acceleration and deceleration. For this reason, structural analysis is very important in the design of a high-speed parallel kinematic robot. This paper presents a study on the structural analysis of a high-speed parallel robot and introduces its results.

Distance between the Parallel Shield tunnel and Application

  • 곽철홍;김재영;김동현;이두화;이승복;김응태;심재범
    • 한국터널공학회:학술대회논문집 / 한국터널공학회 2005년도 학술발표회 논문집 / pp.225-232 / 2005
  • The construction of parallel tunnels using the shield TBM method has increased recently. Accordingly, this paper studies the applicability and suitability of parallel shield TBM tunnels through domestic and foreign construction cases. The behavior of the tunnel structure and the ground was also evaluated by numerical analysis for various ground conditions and distances between the parallel tunnels. The results indicate that a thorough investigation and ground reinforcement are required when the ratio (L/D) of the distance between the parallel tunnels (L) to the tunnel outer diameter (D) is less than 0.5, because interference is expected to occur. The appropriateness of the application method for parallel shield TBM tunnels was validated through a 2-dimensional numerical analysis that simulated the excavation process after ground reinforcement in the starting area of the OOO construction site, where the ratio (L/D) is 0.35.


A Direct Digital Frequency Synthesizer Using A Low Power Pipelined Parallel Accumulator

  • 양병도;김이섭
    • 대한전자공학회논문지SD / Vol. 40, No. 5 / pp.361-368 / 2003
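  • A new high-speed direct digital frequency synthesizer using a low-power pipelined parallel accumulator is proposed. The proposed pipelined parallel accumulator uses both pipelining and parallelism to increase speed and reduce power consumption. Compared with a 4-pipelined accumulator and a 4-parallel accumulator of the same throughput, the 2-pipelined 2-parallel accumulator consumes only 66% and 69% of the power, respectively. The proposed accumulator reaches the same speed at a lower clock frequency while using less area and less power. All circuits were simulated and fabricated in a 0.35 µm CMOS process with a 3.3 V supply.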
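
The idea of a parallel phase accumulator can be illustrated with a small behavioral model. This is a generic textbook-style sketch, not the circuit from the paper: the 8-bit width, the name `fcw` (frequency control word), and both function names are assumptions, and pipelining is not modeled. Each lane adds a fixed offset to a shared base that advances by `lanes * fcw` per slow clock, so several consecutive phases come out per clock while the sequence stays identical to the serial accumulator.

```python
def serial_phases(fcw, n_samples, bits=8):
    """Reference accumulator: one phase value per (fast) clock."""
    mask = (1 << bits) - 1
    acc, out = 0, []
    for _ in range(n_samples):
        out.append(acc)
        acc = (acc + fcw) & mask
    return out


def parallel_phases(fcw, n_samples, lanes=2, bits=8):
    """Parallel accumulator: `lanes` phase values per (slow) clock."""
    mask = (1 << bits) - 1
    offsets = [(k * fcw) & mask for k in range(lanes)]  # fixed per-lane offsets
    step = (lanes * fcw) & mask                         # base increment per slow clock
    base, out = 0, []
    while len(out) < n_samples:
        out.extend((base + off) & mask for off in offsets)
        base = (base + step) & mask
    return out[:n_samples]


# Both models produce the same phase sequence for a hypothetical control word.
print(serial_phases(37, 8))
print(parallel_phases(37, 8, lanes=2))
```

Because all additions are modulo 2^bits, the lane outputs match the serial sequence for any lane count; the parallel version simply needs a clock that is `lanes` times slower for the same output rate.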

Parallel String Matching and Optimization Using OpenCL on FPGA

  • 윤진명;최강일;김현진
    • 전기학회논문지 / Vol. 66, No. 1 / pp.100-106 / 2017
  • In this paper, we propose a parallel optimization method for the Aho-Corasick (AC) algorithm and the Parallel Failureless Aho-Corasick (PFAC) algorithm using Open Computing Language (OpenCL) on a Field Programmable Gate Array (FPGA). The low throughput of a string matching engine degrades the performance of network processing. Recently, many researchers have studied string matching engines that use parallel computing, and FPGA vendors offer parallel computing platforms based on OpenCL. We apply the AC and PFAC algorithms on a DE1-SoC board with a Cyclone V FPGA, with optimizations that take the FPGA architecture into account. Experiments are performed with the PFAC algorithm considering global id, local id, local memory, and loop unrolling optimizations. The performance improvement using loop unrolling is 129 times greater than that of the AC algorithm that does not adopt loop unrolling. The performance improvements using loop unrolling are 1.1, 0.2, and 1.5 times greater than those using the global id, local id, and local memory optimizations, respectively.
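
The failureless idea behind PFAC can be sketched in a few lines. This is an illustrative sequential model, not the OpenCL kernel from the paper, and the function names are assumptions: a plain trie without failure links is built from the patterns, and one independent traversal starts at every text position and stops at the first missing transition. On a parallel device, each start position would typically be handled by its own thread or work-item.

```python
def build_trie(patterns):
    """Build a goto-only trie (no failure links)."""
    trie = [{}]    # trie[state] maps a character to the next state
    accept = {}    # accept[state] is the pattern that ends at that state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in trie[state]:
                trie.append({})
                trie[state][ch] = len(trie) - 1
            state = trie[state][ch]
        accept[state] = pat
    return trie, accept


def match_from(text, start, trie, accept):
    """Work of one logical thread: walk the trie from a single start position."""
    state, found = 0, []
    for ch in text[start:]:
        state = trie[state].get(ch)
        if state is None:          # no valid transition: this thread terminates
            break
        if state in accept:
            found.append((start, accept[state]))
    return found


def pfac_search(text, patterns):
    """Run one traversal per start position; the iterations are independent."""
    trie, accept = build_trie(patterns)
    hits = []
    for start in range(len(text)):
        hits.extend(match_from(text, start, trie, accept))
    return hits


print(pfac_search("ushers", ["he", "she", "his", "hers"]))
```

Dropping the failure links is what makes the traversals independent: overlapping matches are found by the traversals that start at later positions rather than by backtracking within one traversal.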

A Massively Parallel Algorithm for Fuzzy Vector Quantization

  • ;김철홍;김종면
    • 정보처리학회논문지A / Vol. 16A, No. 6 / pp.411-418 / 2009
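  • Vector quantization algorithms based on fuzzy clustering have been widely used in data compression because fuzzy clustering analysis makes the early stage of the vector quantization process less sensitive to initialization. However, fuzzy clustering requires considerable computation because of the complex framework of uncertain quantitative formulations in the training vector space. To overcome this computational load, this paper proposes a parallel implementation of the fuzzy vector quantization algorithm on an array architecture consisting of 4,096 processing elements. Using the 4,096 processing elements, the proposed parallel implementation provides a computationally efficient solution by applying an effective vector assignment policy during the clustering process. Simulation results show that the proposed parallel implementation considerably improves performance and efficiency over implementations on other existing array architectures. In addition, in the same 130 nm technology, the proposed parallel implementation shows about a 1000-fold performance improvement and a 100-fold energy efficiency improvement compared with implementations on today's ARM or TI DSP processors. These results demonstrate the potential of the proposed parallel implementation for improved performance and energy efficiency.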

A Novel Parallel Viterbi Decoding Scheme for NoC-Based Software-Defined Radio System

  • Wang, Jian;Li, Yubai;Li, Huan
    • ETRI Journal
    • /
    • Vol. 35, No. 5
    • /
    • pp.767-774
    • /
    • 2013
  • In this paper, a novel parallel Viterbi decoding scheme is proposed to decrease the decoding latency and power consumption of the software-defined radio (SDR) system. It implements a divide-and-conquer approach by first dividing a block into a series of subblocks, then performing independent Viterbi decoding for each subblock, and finally merging the surviving subpaths into the final path. Moreover, a network-on-chip-based SDR platform is used to evaluate the performance of the proposed parallel Viterbi decoding scheme. The experimental results show that our scheme can speed up the Viterbi decoding process without increasing the BER, and that it performs better than the current state-of-the-art methods.

Parallel O.C. Algorithm for Optimal design of Plane Frame Structures

  • 김철용;박효선;박성무
    • 한국전산구조공학회:학술대회논문집 / 한국전산구조공학회 2000년도 봄 학술발표회논문집 / pp.466-473 / 2000
  • The Optimality Criteria (O.C.) algorithm, based on the derivation of reciprocal approximations, has been applied to the structural optimization of large-scale structures. However, the computational cost of serial analysis of large-scale structures with many degrees of freedom and members is too high for use in the solution process of the O.C. algorithm. Thus, a parallel version of the O.C. algorithm on a network of personal computers is presented in this paper. Parallelism in the O.C. algorithm may be classified into two parts: the analysis part and the optimizer part. As the first step in developing the parallel algorithm, a parallel structural analysis algorithm is developed and used in the O.C. algorithm. The algorithm is applied to the optimal design of a 54-story plane frame structure.


Parallel Data Mining with Distributed Frequent Pattern Trees

  • 조두산;김동승
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2003년도 하계종합학술대회 논문집 V / pp.2561-2564 / 2003
  • Data mining is an effective method for discovering useful information, such as rules and previously unknown patterns, in large databases. The discovery of association rules is an important data mining problem. We have developed a new parallel mining algorithm, called the Distributed Frequent Pattern Tree (DFPT) algorithm, that detects association rules on a distributed shared-nothing parallel system. The DFPT algorithm is devised for parallel execution of the FP-growth algorithm. It needs only two full disk scans of the database because it eliminates the need to generate candidate items. We achieved good workload balancing throughout the mining process by distributing the work equally to all processors. We implemented the algorithm on a PC cluster system and observed that it outperformed the Improved Count Distribution scheme.
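
The two database scans that FP-growth relies on can be sketched for partitioned data as follows. This is not the DFPT algorithm itself: it only illustrates, under assumed names (`local_counts`, `global_frequent_items`, `order_transaction`), how item supports might be counted per partition and merged in the first scan, and how each transaction is filtered and ordered before tree insertion in the second scan. The partitions stand in for the nodes of a shared-nothing system.

```python
from collections import Counter


def local_counts(partition):
    """Scan 1 on one node: count in how many local transactions each item occurs."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))
    return counts


def global_frequent_items(partitions, min_support):
    """Merge the per-partition counts and keep items that meet the minimum support."""
    total = Counter()
    for part in partitions:  # each call could run on a separate node
        total += local_counts(part)
    return {item: n for item, n in total.items() if n >= min_support}


def order_transaction(transaction, frequent):
    """Scan 2 preparation: drop infrequent items, sort by descending support."""
    kept = [item for item in set(transaction) if item in frequent]
    return sorted(kept, key=lambda item: (-frequent[item], item))


# Hypothetical database split across two partitions.
partitions = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "d"], ["b", "c"]],
]
frequent = global_frequent_items(partitions, min_support=2)
print(frequent)
print(order_transaction(["c", "d", "a"], frequent))
```

No candidate itemsets are generated at any point; only single-item counts are exchanged, which is why two passes over the data suffice before the pattern tree is mined.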
