• Title/Summary/Keyword: Parallel data processing


A study on the enhancement and performance optimization of parallel data processing model for Big Data on Emissions of Air Pollutants Emitted from Vehicles (차량에서 배출되는 대기 오염 물질의 빅 데이터에 대한 병렬 데이터 처리 모델의 강화 및 성능 최적화에 관한 연구)

  • Kang, Seong-In;Cho, Sung-youn;Kim, Ji-Whan;Kim, Hyeon-Joung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.6 / pp.1-6 / 2020
  • Road-traffic air-pollution big data links real-time traffic data such as vehicle type, speed, and load, collected by permanent traffic-survey equipment (AVC, VDS, WIM, and DTG), with road-geometry data (uphill, downhill, and turning sections) from GIS, forming traffic-flow data. Unlike general data, it is generated in large volumes per unit time and in a variety of formats. In particular, more than about 7.4 million records per hour of detailed traffic-flow information must be collected, stored, and processed, so a system that can process the data efficiently is required. This study therefore optimizes the performance of open-source-based parallel data processing for visualizing road-transport air-pollution big data.
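
  As an illustration of the kind of open-source parallel processing the study refers to, here is a minimal sketch assuming PySpark as the engine; the schema, column names, and paths are hypothetical, since the abstract does not specify the actual implementation.

    # Hypothetical sketch: hourly aggregation of detailed traffic-flow records
    # with PySpark. All column names and paths are assumptions for illustration.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("road-emission-bigdata").getOrCreate()

    flows = (spark.read.csv("hdfs:///traffic/flows/*.csv", header=True, inferSchema=True)
             .withColumn("observed_at", F.to_timestamp("observed_at")))

    # Reduce millions of per-vehicle records per hour to per-segment summaries,
    # a typical step before emission estimation and visualization.
    summary = (flows
               .groupBy("segment_id", F.window("observed_at", "1 hour"))
               .agg(F.count("*").alias("vehicle_count"),
                    F.avg("speed_kmh").alias("mean_speed_kmh")))

    summary.write.mode("overwrite").parquet("hdfs:///traffic/hourly_summary")
    spark.stop()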

Auto Regulated Data Provisioning Scheme with Adaptive Buffer Resilience Control on Federated Clouds

  • Kim, Byungsang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.11 / pp.5271-5289 / 2016
  • On large-scale data analysis platforms deployed on cloud infrastructures over the Internet, the instability of data transfer times and the dynamics of processing rates call for a more sophisticated data distribution scheme, one that maximizes parallel efficiency by balancing the load among the participating computing elements and eliminating their idle time. In particular, under real-time constraints and with a limited data buffer (in-memory storage), a more controllable mechanism is needed to prevent both overflow and underflow of the finite buffer. In this paper, we propose an auto-regulated data provisioning model based on a receiver-driven data pull model. On this model, we provide a synchronized data replenishment mechanism that implicitly avoids data buffer overflow and explicitly prevents data buffer underflow by adequately adjusting the buffer resilience. To estimate the optimal buffer resilience, we exploit an adaptive buffer resilience control scheme that minimizes both the data buffer space and the idle time of the processing elements, based on directly measured sample-path analysis. The simulation results show that the proposed scheme is an acceptable approximation of the numerical results. It is also efficient enough to apply in dynamic environments where the stochastic characteristics of the data transfer time or the data processing rate cannot be postulated, or where both fluctuate.
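
  The buffer-resilience idea can be illustrated with a minimal Python sketch: a receiver-driven pull buffer whose target backlog (resilience) is adjusted from a measured transfer time and processing rate. The control rule below is a deliberate simplification, not the paper's sample-path estimator.

    import collections

    class PullBuffer:
        """Receiver-driven buffer with an adjustable resilience (target backlog).
        Simplified illustration of the abstract's idea, not the paper's scheme."""

        def __init__(self, capacity, resilience):
            self.capacity = capacity        # hard limit of the in-memory buffer
            self.resilience = resilience    # target number of buffered items
            self.items = collections.deque()

        def should_request(self):
            # Pull more data only while the backlog is below the resilience level,
            # so the finite buffer can never overflow.
            return len(self.items) < min(self.resilience, self.capacity)

        def adjust(self, transfer_time, processing_rate):
            # Keep roughly enough items to cover one transfer round trip at the
            # current processing rate, so the consumer does not starve (underflow).
            target = int(transfer_time * processing_rate) + 1
            self.resilience = max(1, min(target, self.capacity))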

High-Performance Korean Morphological Analyzer Using the MapReduce Framework on the GPU

  • Cho, Shi-Won;Lee, Dong-Wook
    • Journal of Electrical Engineering and Technology / v.6 no.4 / pp.573-579 / 2011
  • To meet the scalability and performance requirements of data analyses, which often involve voluminous data, efficient parallel or concurrent algorithms and frameworks are essential. We present a high-performance Korean morphological analyzer that employs the MapReduce framework on the graphics processing unit (GPU). MapReduce is a programming framework introduced by Google to aid the development of web search applications on large numbers of central processing units (CPUs). GPUs are designed as special-purpose co-processors, and their programming interfaces are typically formulated for graphics applications. Compared to CPUs, GPUs offer greater computation power and memory bandwidth; however, they are more difficult to program because of their architecture. The performance of the Korean morphological analyzer using the MapReduce framework on the GPU is evaluated against a CPU-based model. The proposed Korean morphological analyzer shows promising, scalable performance for distributed computing on the GPU.
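
  The GPU execution itself is not reproduced here, but the MapReduce pattern the abstract relies on can be sketched in a few lines of Python; token counting stands in for morphological analysis, and the worker count and names are illustrative.

    from collections import defaultdict
    from multiprocessing import Pool

    def map_phase(sentence):
        # Stand-in for morphological analysis: emit (token, 1) pairs.
        # A real analyzer would emit (morpheme, tag) pairs instead.
        return [(tok, 1) for tok in sentence.split()]

    def reduce_phase(item):
        key, values = item
        return key, sum(values)

    def mapreduce(records, workers=4):
        with Pool(workers) as pool:
            mapped = pool.map(map_phase, records)       # map in parallel
            groups = defaultdict(list)                  # shuffle: group by key
            for pairs in mapped:
                for k, v in pairs:
                    groups[k].append(v)
            return dict(pool.map(reduce_phase, groups.items()))  # reduce in parallel

    if __name__ == "__main__":
        print(mapreduce(["the cat sat", "the cat ran"]))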

An Adaptive Workflow Scheduling Scheme Based on an Estimated Data Processing Rate for Next Generation Sequencing in Cloud Computing

  • Kim, Byungsang;Youn, Chan-Hyun;Park, Yong-Sung;Lee, Yonggyu;Choi, Wan
    • Journal of Information Processing Systems / v.8 no.4 / pp.555-566 / 2012
  • The cloud environment makes it possible to analyze large data sets on a scalable computing infrastructure. In the bioinformatics field, applications are composed of complex workflow tasks that require huge data storage as well as a computing-intensive parallel workload. Many distributed solutions have been introduced, but they focus on static resource provisioning with a batch-processing scheme in a local computing farm and data storage. For a large-scale workflow system, it is inevitable, and valuable, to outsource all or part of its tasks to public clouds to reduce resource costs. Problems arise, however, from the transfer time of huge datasets and from the unbalanced completion times of different problem sizes. In this paper, we propose an adaptive resource-provisioning scheme that includes run-time data distribution and collection services to hide the data transfer time. The proposed scheme optimizes the allocation ratio of computing elements to the different datasets in order to minimize the total makespan under resource constraints. We conducted experiments with a well-known sequence alignment algorithm, and the results show that the proposed scheme is efficient for the cloud environment.
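
  The core allocation idea, assigning data to computing elements in proportion to their estimated processing rates so that all elements finish at about the same time, can be sketched as follows; the rate estimation and data-transfer hiding described in the abstract are not reproduced, and the numbers are hypothetical.

    def allocate_by_rate(total_items, rates):
        """Split work units across computing elements in proportion to their
        estimated processing rates, so estimated finish times are equal.
        A simplified illustration of rate-based allocation, not the paper's scheme."""
        total_rate = sum(rates)
        shares = [int(total_items * r / total_rate) for r in rates]
        shares[rates.index(max(rates))] += total_items - sum(shares)  # rounding remainder
        return shares

    # Example: 1,000,000 reads over three nodes with measured rates (items/s).
    print(allocate_by_rate(1_000_000, [120.0, 80.0, 40.0]))
    # -> [500001, 333333, 166666]; each node finishes in roughly 4,167 s.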

Implementation of Parallel Local Alignment Method for DNA Sequence using Apache Spark (Apache Spark을 이용한 병렬 DNA 시퀀스 지역 정렬 기법 구현)

  • Kim, Bosung;Kim, Jinsu;Choi, Dojin;Kim, Sangsoo;Song, Seokil
    • The Journal of the Korea Contents Association / v.16 no.10 / pp.608-616 / 2016
  • The Smith-Waterman (SW) algorithm is a local alignment algorithm and one of the important operations in DNA sequence analysis. The SW algorithm finds the optimal local alignment with respect to the scoring system being used, but it requires a long execution time. To address this, several methods for performing SW in a distributed and parallel manner have been proposed. ADAM, a distributed and parallel processing framework for DNA sequences, includes a parallel SW. However, ADAM's parallel SW does not take into account that SW is a dynamic programming method, which limits its performance. In this paper, we propose a method to enhance the parallel SW of ADAM. The proposed parallel SW (PSW) is performed in two phases. In the first phase, the PSW splits a DNA sequence into partitions and assigns them to multiple nodes; the original Smith-Waterman algorithm is then performed in parallel at each node. In the second phase, the PSW estimates the portions of the sequence that must be recalculated, and the recalculation is performed on those portions in parallel at each node. In the experiments, we compare the proposed PSW to the parallel SW of ADAM to show the superiority of the PSW.
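
  For reference, the base recurrence that the two-phase PSW builds on is the classic Smith-Waterman score computation shown below; the scoring parameters are arbitrary, and the partition-and-recompute phases described in the abstract are not reproduced here.

    def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
        """Best local alignment score between sequences a and b using the
        classic Smith-Waterman dynamic programming recurrence."""
        rows, cols = len(a) + 1, len(b) + 1
        H = [[0] * cols for _ in range(rows)]
        best = 0
        for i in range(1, rows):
            for j in range(1, cols):
                diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
                H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
                best = max(best, H[i][j])
        return best

    print(smith_waterman_score("ACACACTA", "AGCACACA"))  # small toy example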

A 4-parallel Scheduling Architecture for High-performance H.264/AVC Deblocking Filter (고성능 H.264/AVC 디블로킹 필터를 위한 4-병렬 스케줄링 아키텍처)

  • Ko, Byung-Soo;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics Engineers of Korea SD / v.49 no.8 / pp.63-72 / 2012
  • In this paper, we propose a parallel line- and block-edge filter architecture for a high-performance H.264/AVC deblocking filter capable of real-time Quad Full High Definition (Quad FHD) video processing. To improve throughput, we design a 4-parallel block-edge filter with 16 line-edge filters. To reduce the internal buffer size and the number of processing cycles, we schedule the deblocking filtering order as a 4-parallel zig-zag scan. To avoid data conflicts, we place one delay cycle between block-edge filtering operations. We implement an interleaving buffer as the internal buffer of the block-edge filter so that the buffer can be shared, reducing the buffer size. The proposed architecture was simulated with a 0.18 um standard cell library. The maximum operating frequency is 108 MHz and the gate count is 140.16 K gates. Running at 90 MHz, the proposed H.264/AVC deblocking filter can support Quad FHD video at 113.17 frames per second.

Design of a Parallel Rendering Processor Architecture with Effective Memory System (효과적인 메모리 구조를 갖는 병렬 렌더링 프로세서 설계)

  • Park, Woo-Chan;Yoon, Duk-Ki;Kim, Kyoung-Su
    • The KIPS Transactions:PartA / v.13A no.4 s.101 / pp.305-316 / 2006
  • Current rendering processors are organized mainly to process a single triangle as fast as possible, and parallel 3D rendering processors, which can process multiple triangles in parallel with multiple rasterizers, have recently begun to appear. For high triangle-processing performance, it is desirable for each rasterizer to have its own local pixel cache. However, a consistency problem may occur when more than one rasterizer accesses data at the same address simultaneously. In this paper, we propose a parallel rendering processor architecture that resolves this consistency problem effectively. Moreover, the proposed architecture significantly reduces the latency caused by pixel cache misses. To achieve these two goals, effective memory organizations, including a new pixel cache architecture, are presented. The experimental results show that the proposed architecture achieves almost linear speedup in the best case, even with sixteen rasterizers.

Efficiency Low-Power Signal Processing for Multi-Channel LiDAR Sensor-Based Vehicle Detection Platform (멀티채널 LiDAR 센서 기반 차량 검출 플랫폼을 위한 효율적인 저전력 신호처리 기법)

  • Chong, Taewon;Park, Daejin
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.7 / pp.977-985 / 2021
  • The LiDAR sensor is attracting attention as a key sensor for autonomous vehicles. A LiDAR sensor uses a laser to measure three-dimensional distances within its range. However, because so much data is delivered to the external system, it is difficult to process the data in an external system or in a processor of the vehicle. To resolve this issue, we develop an integrated processing system for LiDAR sensors. The system is configured so that each client receives and processes data from a LiDAR sensor, while the server gathers the data from the clients and transmits the integrated data in real time. Tests were carried out to verify real-time processing by varying the data acquisition method, the processing method, and the way the processes are driven. In the experiment, when data were received from four LiDAR sensors and the client and server processes were run using background or multi-core processing, the system response time was about 13.2 ms for each client and about 12.6 ms for the server.
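
  The client/server split described above can be illustrated with a minimal Python multiprocessing sketch: one client process per sensor pushes preprocessed frames into a queue, and a server process gathers them. Sensor I/O is stubbed out, and all names and counts are illustrative rather than the paper's implementation.

    import multiprocessing as mp

    def client(sensor_id, out_q, frames=100):
        # Stand-in for reading and preprocessing one LiDAR sensor's point data.
        for seq in range(frames):
            points = [(sensor_id, seq, i) for i in range(16)]   # fake scan line
            out_q.put((sensor_id, seq, len(points)))
        out_q.put((sensor_id, None, None))                       # end-of-stream marker

    def server(out_q, n_clients):
        # Gather results from all clients and integrate them in arrival order.
        done = 0
        while done < n_clients:
            sensor_id, seq, n_points = out_q.get()
            if seq is None:
                done += 1
                continue
            # A real system would merge per-sensor frames here before transmitting.

    if __name__ == "__main__":
        q = mp.Queue()
        clients = [mp.Process(target=client, args=(i, q)) for i in range(4)]
        for p in clients:
            p.start()
        server(q, n_clients=4)
        for p in clients:
            p.join()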

Parallel BCH Encoding/decoding Method and VLSI Design for Nonvolatile Memory (비휘발성 메모리를 위한 병렬 BCH 인코딩/디코딩 방법 및 VLSI 설계)

  • Lee, Sang-Hyuk;Baek, Kwang-Hyun
    • Journal of the Institute of Electronics Engineers of Korea SD / v.47 no.5 / pp.41-47 / 2010
  • This paper proposes a parallel BCH code, one of the error-correction coding methods used in NAND flash memory for solid-state disks (SSDs). By adjusting the error-correction capability, the proposed design improves the reliability of data blocks whose error rates increase with use. The parallel processing bit width of the decoder is twice that of the encoder, which reduces the decoding time to roughly half that of conventional ECC.

Correct Implementation of Sub-warp Parallel Prefix Operations based on GPU Hardware Architecture (GPU 하드웨어 아키텍처 기반 sub-warp 단위 병렬 프리픽스(prefix) 연산의 정확한 구현)

  • Park, Taejung
    • Journal of Digital Contents Society / v.18 no.3 / pp.613-619 / 2017
  • This paper presents CUDA (Compute Unified Device Architecture) code that produces correct GPU parallel segmented prefix operation results for large data arrays when segment lengths are less than 32. Mark Harris and Michael Garland published CUDA code for this task; this paper shows that their code does not generate correct results when the local segment length is less than 32, discusses the cause of the problem, and presents CUDA code that produces correct results. The segmented parallel prefix operation presented in this paper can be applied as a building block in various large-scale parallel processing algorithms, including k-nearest neighbor search.
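
  The warp-level CUDA fix itself is not reproduced here, but as a reference for what the operation must compute, the following short Python function gives the expected results of an inclusive segmented prefix sum over a flat array with segment-head flags.

    def segmented_prefix_sum(values, heads):
        """Inclusive segmented prefix sum: heads[i] is True where a new segment
        starts. A sequential reference defining the expected results; it is not
        the GPU code discussed in the paper."""
        out, running = [], 0
        for v, h in zip(values, heads):
            running = v if h else running + v
            out.append(running)
        return out

    vals  = [3, 1, 4, 1, 5, 9, 2, 6]
    heads = [True, False, False, True, False, False, True, False]
    print(segmented_prefix_sum(vals, heads))  # [3, 4, 8, 1, 6, 15, 2, 8]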