• Title/Summary/Keyword: parallel computer processing

Development of CUBRID based Middleware supporting Distributed Parallel Query Processing (분산 병렬 질의 처리를 지원하는 CUBRID 기반 미들웨어 개발)

  • Kim, Hyeong-Il;Yoon, Min;Cho, Ahra;Choi, Mun-Chul;Chang, Jae-Woo
    • Annual Conference of KIPS / 2014.11a / pp.714-717 / 2014
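  • With the recent growth of SNS, the volume of information has increased sharply, and research on NoSQL for big data processing has become active as a result. However, NoSQL does not satisfy the ACID properties of databases. Research on big data processing based on RDBMS is therefore also active. CUBRID Shard, a representative technique for this purpose, horizontally partitions a database into shards and stores the data on different physical nodes. However, this technique cannot process a query when a single client's query must be executed on multiple servers. This paper therefore proposes a CUBRID-based distributed middleware that supports parallel query processing.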
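
The limitation described above, a single query that must run on several shards, is commonly handled with a scatter-gather step. The following is a minimal sketch of that general idea, not the middleware proposed in the paper: the shards are in-memory lists rather than CUBRID nodes, and `SHARDS`, `run_on_shard`, and `parallel_query` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each list stands in for a table on a separate physical node.
SHARDS = [
    [{"id": 1, "score": 70}, {"id": 4, "score": 95}],
    [{"id": 2, "score": 88}, {"id": 5, "score": 61}],
    [{"id": 3, "score": 91}],
]

def run_on_shard(shard, predicate):
    """Evaluate the same selection on one shard (a local sub-query)."""
    return [row for row in shard if predicate(row)]

def parallel_query(shards, predicate):
    """Send the sub-query to every shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(run_on_shard, shards, [predicate] * len(shards)))
    merged = [row for part in partials for row in part]
    return sorted(merged, key=lambda row: row["id"])

result = parallel_query(SHARDS, lambda row: row["score"] >= 88)
print([row["id"] for row in result])  # [2, 3, 4]
```

A real middleware would also have to push down aggregates and ordering to the shards; this sketch merges and sorts on the coordinator only.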

OpenCL-based Efficient Parallel Processing in a Heterogeneous Computing Environment (이기종 컴퓨팅 환경에서 OpenCL을 이용한 효율적인 병렬처리)

  • Kim, Heegon;Lee, Sungju;Chung, Yongwha;Park, Daihee
    • Annual Conference of KIPS / 2013.11a / pp.111-114 / 2013
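  • With the growing use of performance accelerators such as GPUs in high-performance and mobile computing, a variety of accelerator-based parallel processing methods have been introduced. However, for users who are new to accelerators or have little experience with accelerator-based parallel processing, using them effectively is not easy. This paper proposes a parallel processing method that uses the accelerator and the microprocessor simultaneously and is more efficient than using the accelerator alone, and compares its performance with that of accelerator-only processing. In the experiments, the proposed method achieved a speedup of about 40 times over sequential processing and a 25% performance improvement over the accelerator-only parallel processing method.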
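
One simple way to use both devices is a static split of the work in proportion to device throughput. The sketch below illustrates that idea only; it is not the OpenCL method of the paper. The rates are hypothetical values chosen to be consistent with the reported ratios (reading "25% improvement" as 1.25 times the accelerator-only speed), and data transfer overhead is ignored.

```python
def split_work(total_items, gpu_rate, cpu_rate):
    """Split items in proportion to device throughput so both finish together."""
    gpu_items = round(total_items * gpu_rate / (gpu_rate + cpu_rate))
    return gpu_items, total_items - gpu_items

def elapsed(gpu_items, cpu_items, gpu_rate, cpu_rate):
    """The devices run concurrently, so elapsed time is set by the slower side."""
    return max(gpu_items / gpu_rate, cpu_items / cpu_rate)

# Hypothetical rates in items per unit time, relative to a sequential rate of 1.
GPU_RATE, CPU_RATE = 32.0, 8.0
N = 4000

gpu_n, cpu_n = split_work(N, GPU_RATE, CPU_RATE)
t_seq = N / 1.0
t_gpu_only = N / GPU_RATE
t_both = elapsed(gpu_n, cpu_n, GPU_RATE, CPU_RATE)

print(gpu_n, cpu_n)         # 3200 800
print(t_seq / t_gpu_only)   # 32.0
print(t_seq / t_both)       # 40.0
```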

Performance Analysis of DNN inference using OpenCV Built in CPU and GPU Functions (OpenCV 내장 CPU 및 GPU 함수를 이용한 DNN 추론 시간 복잡도 분석)

  • Park, Chun-Su
    • Journal of the Semiconductor & Display Technology / v.21 no.1 / pp.75-78 / 2022
  • Deep neural networks (DNNs) have become an essential data processing architecture for many computer vision tasks. DNN-based algorithms now achieve much higher recognition accuracy than traditional algorithms based on shallow learning. However, training DNNs and running inference with them require far more computational capability than everyday computer use. Moreover, as DNNs grow in size and depth, CPUs may be inadequate because they process serially by default. GPUs offer greater speed than CPUs because of their parallel processing nature. In this paper, we analyze the inference time complexity of DNNs using the well-known computer vision library OpenCV. We measure and analyze inference time complexity for three cases: CPU, GPU-Float32, and GPU-Float16.
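
A measurement of this kind might be set up as follows. This is a sketch, not the paper's benchmark: `mean_forward_ms` and `configure` are hypothetical helpers, the mode names are made up for illustration, and the GPU modes assume an OpenCV build with CUDA support. The timing helper is demonstrated on a stand-in workload so that it runs without a model file.

```python
import time

def mean_forward_ms(forward, warmup=3, runs=20):
    """Average wall-clock time of forward() in milliseconds, after warm-up calls."""
    for _ in range(warmup):
        forward()
    start = time.perf_counter()
    for _ in range(runs):
        forward()
    return (time.perf_counter() - start) * 1000.0 / runs

def configure(net, mode):
    """Select backend and target on a cv2.dnn network for one of the three cases."""
    import cv2  # imported lazily so the timing helper works without OpenCV
    if mode == "cpu":
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
    elif mode == "gpu-fp32":
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    elif mode == "gpu-fp16":
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA_FP16)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return net

# Stand-in workload; with a real model, pass net.forward after net.setInput(blob).
ms = mean_forward_ms(lambda: sum(i * i for i in range(10_000)))
print(f"{ms:.3f} ms per call")
```

With a real network, one would load it with `cv2.dnn.readNet`, call `configure` once per case, and time `net.forward`. The warm-up calls matter on GPU targets because the first call typically includes one-time initialization.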

A 4-parallel Scheduling Architecture for High-performance H.264/AVC Deblocking Filter (고성능 H.264/AVC 디블로킹 필터를 위한 4-병렬 스케줄링 아키텍처)

  • Ko, Byung-Soo;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics Engineers of Korea SD / v.49 no.8 / pp.63-72 / 2012
  • In this paper, we propose a parallel architecture of line and block edge filters for a high-performance H.264/AVC deblocking filter that processes Quad Full High Definition (Quad FHD) video in real time. To improve throughput, we designed a 4-parallel block edge filter with 16 line edge filters. To reduce the internal buffer size and the number of processing cycles, we scheduled a 4-parallel zig-zag scan order as the deblocking filtering order. To avoid data conflicts, we placed one delay cycle between block edge filtering operations. We implemented an interleaving buffer as the internal buffer of the block edge filter so that the buffer is shared and its size is reduced. The proposed architecture was simulated with a 0.18 µm standard cell library. The maximum operating frequency is 108 MHz and the gate count is 140.16 Kgates. The proposed H.264/AVC deblocking filter can support Quad FHD at 113.17 frames per second when running at 90 MHz.
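
The reported frame rate and clock can be turned into a per-macroblock cycle budget. The calculation below is a back-of-the-envelope sketch; it assumes, beyond what the abstract states, that Quad FHD means 3840x2160 and that each frame is processed as 16x16 macroblocks.

```python
# Assumptions (not stated in the abstract): 3840x2160 frames, 16x16 macroblocks.
WIDTH, HEIGHT, MB_SIZE = 3840, 2160, 16
CLOCK_HZ = 90e6   # operating frequency from the abstract
FPS = 113.17      # frame rate from the abstract

mbs_per_frame = (WIDTH // MB_SIZE) * (HEIGHT // MB_SIZE)
cycles_per_frame = CLOCK_HZ / FPS
cycles_per_mb = cycles_per_frame / mbs_per_frame

print(mbs_per_frame)              # 32400
print(f"{cycles_per_mb:.1f}")     # roughly 24.5 cycles per macroblock
```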

Development of Parallel Event-Driven Remote IT Convergence (병렬 이벤트 기반 원격 IT 융합 개발)

  • Kim, Jung-Sook;Kim, Sung-Wan;Kim, Hong-Sup
    • Journal of the Korea Society of Computer and Information / v.15 no.12 / pp.1-9 / 2010
  • This paper describes parallel event-driven remote IT convergence applications, which combine traditional industry with IT, including advanced communication. In an IT convergence system, events can occur concurrently from many device sensors or users, so the system must have a parallel processing method. In this paper, the parallel processing method was implemented using threads, and we developed a method for connecting a device to a mode of communication, either wireless communication or power line communication. In addition, we developed object, device, user, and event models based on XML (eXtensible Markup Language) using an object-oriented modeling method. To show results efficiently in real time, the systems provide various graphical user interfaces such as a bar graph, a table, and a combination of the two.
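
Thread-based handling of concurrent events is often built around a shared queue and a pool of worker threads. The sketch below shows that general pattern under stated assumptions; the event format, the threshold, and the `handle` function are hypothetical and do not come from the paper.

```python
import queue
import threading

events = queue.Queue()
results = []
results_lock = threading.Lock()
STOP = object()  # sentinel that tells a worker to exit

def handle(event):
    """Hypothetical handler: flag a sensor reading that exceeds a threshold."""
    source, value = event
    return (source, value, "alert" if value > 50 else "ok")

def worker():
    while True:
        event = events.get()
        if event is STOP:
            break
        outcome = handle(event)
        with results_lock:  # protect the shared result list
            results.append(outcome)

workers = [threading.Thread(target=worker) for _ in range(3)]
for t in workers:
    t.start()

# Events arriving from several hypothetical sensors.
for event in [("temp", 72), ("humidity", 40), ("power", 55), ("temp", 20)]:
    events.put(event)
for _ in workers:
    events.put(STOP)  # one sentinel per worker, queued after all events
for t in workers:
    t.join()

print(sorted(results))
```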

Implementation and Performance Analysis of High Performance Computing Library for Parallel Processing (병렬처리를 위한 고성능 라이브러리의 구현과 성능 평가)

  • 김영태;이용권
    • Journal of KIISE:Computer Systems and Theory / v.31 no.7 / pp.379-386 / 2004
  • We designed a portable parallel library, HPCL (High Performance Computing Library), with the following objectives: (1) to keep the parallel code closely related to the original sequential code, which will help with future versions of the sequential code, and (2) to enhance the performance of the parallel code. The library is an interface, written in C and Fortran, between MPI (Message Passing Interface) and parallel programs in Fortran. Performance results were measured on PC clusters and an IBM SP4.

Parallel Procedure and Evaluation of Parallel Performance of Impact Simulation Based on Two-Step Eulerian Scheme (Two-Step Eulerian 기법에 기반 한 충돌 해석의 병렬처리 및 병렬효율 평가)

  • Kim Seung-Jo;Lee Min-Hyung;Paik Seung-Hoon
    • Transactions of the Korean Society of Mechanical Engineers A / v.30 no.10 s.253 / pp.1320-1327 / 2006
  • The parallel procedure and performance of two-step Eulerian codes have not been sufficiently reported, even though such codes have been developed and widely used in impact simulation. In this study, a parallel strategy for a two-step Eulerian code is proposed and described in detail. The performance was evaluated on a self-built Linux cluster. Compared with a commercial code, relatively good performance was achieved. The performance evaluation of each computation stage showed that remap is the most time-consuming part among stages such as FE processing, communication, and time marching.
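
Parallel performance of this kind is usually summarized as speedup and parallel efficiency, together with a per-stage time breakdown. The sketch below shows the standard definitions; all timings are hypothetical placeholders, not measurements from the paper.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T1 / Tp."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency E = S / p."""
    return speedup(t_serial, t_parallel) / n_procs

# Hypothetical wall-clock times in seconds, keyed by processor count.
timings = {1: 1200.0, 4: 330.0, 16: 100.0}
for p, t in timings.items():
    print(p, round(speedup(timings[1], t), 2), round(efficiency(timings[1], t, p), 2))

# Hypothetical per-stage times for one run, used to find the dominant stage.
stages = {"remap": 48.0, "FE processing": 30.0, "communication": 14.0, "time marching": 8.0}
dominant = max(stages, key=stages.get)
print(dominant)  # remap
```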

A New Prediction-Based Parallel Event-Driven Logic Simulation (새로운 예측기반 병렬 이벤트구동 로직 시뮬레이션)

  • Yang, Seiyang
    • KIPS Transactions on Computer and Communication Systems / v.4 no.3 / pp.85-90 / 2015
  • In this paper, a new parallel event-driven logic simulation is proposed. Because the proposed prediction-based parallel event-driven simulation method uses both prediction data and actual data for the input and output values of local simulations executed in parallel, the synchronization overhead and the communication overhead, the major bottlenecks to performance improvement, are greatly reduced. Experiments with multiple designs demonstrate the effectiveness of the proposed approach.

Privacy-Preserving Parallel Range Query Processing Algorithm Based on Data Filtering in Cloud Computing (클라우드 컴퓨팅에서 프라이버시 보호를 지원하는 데이터 필터링 기반 병렬 영역 질의 처리 알고리즘)

  • Kim, Hyeong Jin;Chang, Jae-Woo
    • KIPS Transactions on Computer and Communication Systems / v.10 no.9 / pp.243-250 / 2021
  • Recently, with the development of cloud computing, interest in database outsourcing has been increasing. However, when a database is outsourced, the data owner's information is exposed to internal and external attackers. Therefore, in this paper, we propose a parallel range query processing algorithm that supports privacy protection. The proposed algorithm uses the Paillier encryption system to support data protection, query protection, and access pattern protection. To reduce the operation cost of the protocol that checks for overlapping regions (SRO) in the existing algorithm, the efficiency of the SRO protocol is improved through a garbled circuit. The proposed algorithm consists of two steps: a parallel kd-tree search step, which searches the kd-tree in parallel and safely extracts the data of the leaf nodes that include the query, and a parallel data search step, which uses multiple threads to retrieve the data included in the query area. The proposed algorithm also provides high query processing performance by parallelizing the secure protocols and the index search. We show that the performance of the proposed algorithm increases in proportion to the number of threads and that it improves performance by about 5 times compared with the existing algorithm.
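
The Paillier cryptosystem named above is additively homomorphic, which is what lets a server compute on values it cannot read. The sketch below is a toy illustration of that property only: the primes are tiny and insecure, the function names are hypothetical, and the SRO protocol, garbled circuits, and kd-tree search of the paper are not shown.

```python
import math
import random

def keygen(p, q):
    """Build a Paillier key from two primes, using the common choice g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; this form is valid because g = n + 1
    return n, (n, lam, mu)

def encrypt(n, m):
    """Encrypt m as (n + 1)^m * r^n mod n^2 with a random r coprime to n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(priv, c):
    """Recover m as L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) / n."""
    n, lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

# Toy primes for illustration only; real keys use primes of 1024 bits or more.
pub, priv = keygen(1009, 1013)
a, b = 120, 35
c_sum = (encrypt(pub, a) * encrypt(pub, b)) % (pub * pub)  # multiply ciphertexts
c_scaled = pow(encrypt(pub, a), 3, pub * pub)              # raise to a constant

print(decrypt(priv, c_sum))     # 155, i.e. a + b
print(decrypt(priv, c_scaled))  # 360, i.e. 3 * a
```

Multiplying ciphertexts adds the plaintexts, and raising a ciphertext to a constant scales the plaintext; comparisons such as range checks need additional protocols on top of this.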

Introduction to general purpose GPU computing (GPU를 이용한 범용 계산의 소개)

  • Yu, Donghyeon;Lim, Johan
    • Journal of the Korean Data and Information Science Society / v.24 no.5 / pp.1043-1061 / 2013
  • Recent advances in computer technology have produced massive data, and their analysis has become important. High performance computing is one of the most essential parts of the analysis of massive data. In this paper, we review general-purpose computing on graphics processing units (GPUs) and its application to parallel computing, which has been of great interest in the statistics community.