• Title/Summary/Keyword: Parallel data processing


Design of a High-Speed Data Packet Allocation Circuit for Network-on-Chip (NoC 용 고속 데이터 패킷 할당 회로 설계)

  • Kim, Jeonghyun;Lee, Jaesung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.10a
    • /
    • pp.459-461
    • /
    • 2022
  • One of the big differences between Network-on-Chip (NoC) and existing parallel processing systems based on off-chip networks is that data packet routing is performed using a centralized control scheme. In such an environment, the best-effort packet routing problem becomes a real-time assignment problem in which data packet arrival time and processing time are the costs. In this paper, the Hungarian algorithm, a representative algorithm for reducing the computational complexity of the linear assignment problem, is implemented as a hardware accelerator. As a result of logic synthesis using the TSMC 0.18 um standard cell library, the area of the circuit designed through case analysis of the cost distribution is reduced by about 16% and its propagation delay is reduced by about 52%, compared to a circuit implementing the original operation sequence of the Hungarian algorithm.
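
The assignment step that the paper moves into hardware can be sketched in software. The cost matrix below is made up for illustration, and a brute-force solver stands in for the Hungarian algorithm, which finds the same optimum in O(n³):

```python
from itertools import permutations

def min_cost_assignment(cost):
    """Brute-force linear assignment: returns (assignment, total cost).

    The Hungarian algorithm reaches the same optimum in O(n^3); brute
    force is used here only to keep the sketch dependency-free.
    """
    n = len(cost)
    best = None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if best is None or total < best[1]:
            best = (perm, total)
    return best

# Hypothetical cost matrix: cost[i][j] = arrival time + processing time
# of packet i if assigned to output port j (values are illustrative).
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]

assignment, total = min_cost_assignment(cost)
print(assignment, total)  # optimal port per packet and the total cost
```

Here packets 0, 1, and 2 go to ports 1, 0, and 2 for a total cost of 5; the hardware version in the paper reorders these operations to shrink area and delay.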


Efficient Processing of Grouped Aggregation on Non-Uniform Memory Access Architecture (비균등 메모리 접근 구조에서의 효율적인 그룹화 집단 연산의 처리)

  • Choe, Seongjun;Min, Jun-Ki
    • Database Research
    • /
    • v.34 no.3
    • /
    • pp.14-27
    • /
    • 2018
  • Recently, the Non-Uniform Memory Access (NUMA) architecture was proposed to alleviate the memory bottleneck problem that occurs in the Symmetric Multiprocessing (SMP) architecture. In addition, since an aggregation operator is an important operator providing properties and summaries of data, its efficiency is crucial to the overall performance of a system. Thus, in this paper, we propose an efficient aggregation processing technique for the NUMA architecture. Our proposed technique consists of a partition phase and a merge phase. In the partition phase, the target relation is partitioned into several partial relations according to the grouping attribute. Since each thread can then process the aggregation operator on a partial relation independently, remote memory access during the merge phase is avoided. Furthermore, in the merge phase, we improve the performance of aggregation processing by letting each thread compute aggregates with a local hash table, avoiding lock contention when merging the aggregation results generated by all threads into one.
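
The two phases can be sketched as follows; this is a minimal single-machine analogy (names and the SUM aggregate are illustrative, and Python threads stand in for NUMA-pinned worker threads):

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def partition(relation, num_parts):
    """Partition phase: hash rows by grouping attribute so each partial
    relation holds all rows of its groups (no cross-partition groups)."""
    parts = [[] for _ in range(num_parts)]
    for key, value in relation:
        parts[hash(key) % num_parts].append((key, value))
    return parts

def aggregate_local(part):
    """Each thread aggregates its partition into a thread-local hash
    table, avoiding remote memory access and lock contention."""
    table = defaultdict(int)
    for key, value in part:
        table[key] += value  # SUM aggregate as an example
    return table

relation = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
parts = partition(relation, 2)
with ThreadPoolExecutor(max_workers=2) as pool:
    local_tables = list(pool.map(aggregate_local, parts))

# Merge phase is a disjoint union: partitions share no group keys.
result = {k: v for table in local_tables for k, v in table.items()}
print(result)
```

Because partitioning is done on the grouping attribute, the merge is lock-free by construction, which is the property the paper exploits on NUMA hardware.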

High-Speed Implementations of Block Ciphers on Graphics Processing Units Using CUDA Library (GPU용 연산 라이브러리 CUDA를 이용한 블록암호 고속 구현)

  • Yeom, Yong-Jin;Cho, Yong-Kuk
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.18 no.3
    • /
    • pp.23-32
    • /
    • 2008
  • The computing power of graphics processing units (GPU) has already surpassed that of CPUs, and the gap between them is getting wider. Thus, research on GPGPU, which applies GPUs to general-purpose computation, has become popular and shows great success, especially in the field of parallel data processing. Since the implementation of cryptographic algorithms on GPUs was started by Cook et al. in 2005, improved results using graphics libraries such as OpenGL and DirectX have been published. In this paper, we present techniques and results of implementing block ciphers using the CUDA library announced by NVIDIA in 2007. We also discuss a general method for converting CPU source code of block ciphers to GPU code. On an NVIDIA 8800GTX GPU, the resulting speeds of the block ciphers AES, ARIA, and DES are 4.5 Gbps, 7.0 Gbps, and 2.8 Gbps, respectively, which are faster than those on a CPU.
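
The structural idea behind the CPU-to-GPU conversion is that in ECB mode every block is encrypted independently, so one GPU thread can handle one block. A toy sketch (the "cipher" below is a deliberately insecure XOR-and-rotate stand-in for AES/ARIA/DES, and `map()` stands in for the kernel launch):

```python
def toy_block_encrypt(block, key):
    """Toy 8-byte 'cipher' (XOR + byte rotate); a stand-in for real
    cipher rounds -- NOT cryptographically secure, structure only."""
    out = bytes(b ^ k for b, k in zip(block, key))
    return out[1:] + out[:1]  # rotate left by one byte

def ecb_encrypt_parallel(data, key, block_size=8):
    """In ECB mode each block is independent, so the per-block kernel
    can run on one GPU thread per block; map() plays that role here."""
    blocks = [data[i:i + block_size]
              for i in range(0, len(data), block_size)]
    return b"".join(toy_block_encrypt(b, key) for b in blocks)

key = bytes(range(8))
data = b"parallel" * 4          # 32 bytes = 4 independent blocks
ct = ecb_encrypt_parallel(data, key)
print(len(ct))  # 32
```

Note the ECB property the parallelism relies on: identical plaintext blocks yield identical ciphertext blocks, which is also why real deployments prefer other modes.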

Spatial Computation on Spark Using GPGPU (GPGPU를 활용한 스파크 기반 공간 연산)

  • Son, Chanseung;Kim, Daehee;Park, Neungsoo
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.5 no.8
    • /
    • pp.181-188
    • /
    • 2016
  • Recently, as the amount of spatial information increases, interest in the study of spatial information processing has increased. Spatial database systems extended from traditional relational database systems have difficulty handling large data sets because of scalability limits. SpatialHadoop, extended from the Hadoop system, has low performance because spatial computations in SpatialHadoop require many write operations of intermediate results to disk, resulting in performance degradation. In this paper, Spatial Computation Spark (SC-Spark), an in-memory distributed processing framework, is proposed. SC-Spark extends Spark in order to efficiently perform spatial operations on large-scale data. In addition, a GPGPU-based SC-Spark is developed to further improve performance. SC-Spark exploits Spark's ability to hold intermediate results in memory, and GPGPU-based SC-Spark can perform spatial operations in parallel using the many processing elements of a GPU. To verify the proposed work, experiments on a single AMD system were performed using SC-Spark and GPGPU-based SC-Spark for point-in-polygon and spatial join operations. The experimental results showed that SC-Spark and GPGPU-based SC-Spark were up to 8 times faster than SpatialHadoop.
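
The point-in-polygon operation benchmarked above is embarrassingly parallel: each point is tested independently, which is what makes both the Spark and GPU mappings natural. A standard ray-casting sketch of the per-point kernel (the square polygon is a made-up example, not data from the paper):

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count crossings of a rightward horizontal ray
    from (x, y) against each polygon edge; odd crossings = inside."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):          # edge straddles the ray's y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon(2, 2, square))  # True
print(point_in_polygon(5, 2, square))  # False
```

In the GPGPU variant described in the abstract, each GPU processing element would evaluate this kernel for a different point.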

Investigation and Processing of Seismic Reflection Data Collected from a Water-Land Area Using a Land Nodal Airgun System (수륙 경계지역에서 얻어진 육상 노달 에어건 탄성파탐사 자료의 고찰 및 자료처리)

  • Lee, Donghoon;Jang, Seonghyung;Kang, Nyeonkeon;Kim, Hyun-do;Kim, Kwansoo;Kim, Ji-Soo
    • The Journal of Engineering Geology
    • /
    • v.31 no.4
    • /
    • pp.603-620
    • /
    • 2021
  • A land nodal seismic system was employed to acquire seismic reflection data using stand-alone cable-free receivers in a land-river area. Acquiring reliable data using this technology is very cost effective, as it avoids topographic problems in the deployment and collection of receivers. The land nodal airgun system deployed at the mouth of the Hyungsan River (in Pohang, Gyeongsangbuk Province) used airgun sources in the river and receivers on the riverbank, with subparallel source and receiver lines spaced approximately 120 m apart. Seismic data collected on the riverbank are characterized by a low signal-to-noise ratio (S/N) and inconsistent reflection events. Most of the events, including direct waves, guided waves, air waves, and Scholte surface waves, appear as hyperbolas in the field records, in contrast to the straight lines in data collected conventionally, where source and receiver lines coincide. The processing strategy included enhancing the signal behind the low-frequency, large-amplitude noise with a cascaded application of bandpass and f-k filters for the attenuation of air waves. Static time delays caused by the cross-offset distance between sources and receivers were corrected, with a focus on mapping the shallow reflections obscured by guided-wave and air-wave noise. A new time-distance equation and curve for direct and air waves are suggested for correcting the static time delay caused by the cross-offset between source and receiver. Investigation of the minimum cross-offset gathers shows well-aligned shallow reflections around 200 ms after time-shift correction. This time-delay static correction based on the direct wave is found to be essential to improving data from parallel source and receiver lines. The data acquisition and processing strategies developed in this study for land nodal airgun seismic systems will be readily applicable to seismic data from land-sea areas when high-resolution signal data become available in the future for investigating shallow gas reservoirs and faults and for engineering designs in the development of coastal areas.

Criticality benchmarking of ENDF/B-VIII.0 and JEFF-3.3 neutron data libraries with RMC code

  • Zheng, Lei;Huang, Shanfang;Wang, Kan
    • Nuclear Engineering and Technology
    • /
    • v.52 no.9
    • /
    • pp.1917-1925
    • /
    • 2020
  • New versions of the ENDF/B and JEFF data libraries have been released during the past two years, with significant updates in the neutron reaction sublibrary and the thermal neutron scattering sublibrary. In order to get a more comprehensive impression of the criticality quality of these two latest neutron data libraries, and to provide a reference for selecting evaluated nuclear data libraries for science and engineering applications of the Reactor Monte Carlo code RMC, criticality benchmarking of the two latest neutron data libraries has been performed. RMC was employed as the computational tool, and its processing capability for the continuous-representation ENDF/B-VIII.0 thermal neutron scattering laws was developed. An RMC criticality validation suite consisting of 116 benchmarks was established for the benchmarking work. The latest ACE-format data libraries of the neutron reactions and thermal neutron scattering laws for ENDF/B-VIII.0, ENDF/B-VII.1, and JEFF-3.3 were downloaded from the corresponding official sites. The ENDF/B-VII.0 data library was also employed to provide code-to-code validation for RMC. All the calculations for the four data libraries were performed using a parallel version of RMC, and all calculated standard deviations are lower than 30 pcm. Comprehensive analyses, including the C/E values with uncertainties, the δk/σ values, and the metrics χ2 and < |Δ| >, were conducted and presented. The calculated keff eigenvalues based on the four data libraries generally agree well with the benchmark evaluations for most cases. Among the 116 criticality benchmarks, the numbers of calculated keff eigenvalues that agree with the benchmark evaluations within the 3σ interval (with a confidence level of 99.6%) are 107, 109, 112, and 113 for ENDF/B-VII.0, ENDF/B-VII.1, ENDF/B-VIII.0, and JEFF-3.3, respectively. These results indicate that the ENDF/B-VIII.0 neutron data library has better performance on average.
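
The 3σ agreement count reported above reduces to a simple per-benchmark test. A sketch of that test, with the combined uncertainty of calculation and benchmark (the keff values and uncertainties below are illustrative, not from the paper):

```python
import math

def within_3sigma(k_calc, sigma_calc, k_bench, sigma_bench):
    """Criticality-validation agreement test: does the calculated k_eff
    fall within 3 combined standard deviations of the benchmark value?
    This is the delta-k/sigma metric with a threshold of 3."""
    sigma = math.sqrt(sigma_calc ** 2 + sigma_bench ** 2)
    return abs(k_calc - k_bench) <= 3 * sigma

# Illustrative numbers: benchmark k_eff = 1.0000 +/- 0.0010,
# calculated k_eff with a 30 pcm (0.0003) Monte Carlo uncertainty.
print(within_3sigma(1.0025, 0.0003, 1.0000, 0.0010))  # True
print(within_3sigma(1.0050, 0.0003, 1.0000, 0.0010))  # False
```

Counting how many of the 116 benchmarks pass this test per library yields the 107/109/112/113 tallies quoted in the abstract.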

A Study on a Lossless Compression Scheme for Cloud Point Data of the Target Construction (목표 구조물에 대한 점군데이터의 무손실 압축 기법에 관한 연구)

  • Bang, Min-Suk;Yun, Kee-Bang;Kim, Ki-Doo
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.48 no.5
    • /
    • pp.33-41
    • /
    • 2011
  • In this paper, we propose a lossless compression scheme for cloud point data of a target construction that exploits redundancy in the data and discards useless information. We use the Hough transform to find the horizontal angle between the construction and the terrestrial LIDAR; this angle is used to rotate the cloud point data. Once the data are parallel to the x-axis, redundancy along the y-axis increases, so the data can be compressed further. In addition, we apply two methods to reduce the number of useless points: one is decimation of the cloud point data, and the other is to extract the range of y-coordinates of the target construction and keep only the points within that range. Experimental results show the performance of the proposed scheme. To compress the data, we use only the position information without additional information, so the scheme increases the processing speed of the compression algorithm.
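
The rotation step can be sketched in 2-D as follows. The angle is assumed to come from a Hough transform of the dominant wall line; here a synthetic 45-degree line of points stands in for the scanned construction:

```python
import math

def rotate_points(points, theta):
    """Rotate 2-D points by -theta so a line at angle theta to the
    x-axis becomes parallel to the x-axis (theta would come from the
    Hough transform in the paper's pipeline)."""
    c, s = math.cos(-theta), math.sin(-theta)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

# Hypothetical points along a line at 45 degrees to the x-axis.
theta = math.radians(45)
points = [(i * math.cos(theta), i * math.sin(theta)) for i in range(5)]

aligned = rotate_points(points, theta)
# After rotation the y-coordinates collapse to ~0, so many points now
# share the same y value -- the redundancy the compression exploits.
print(all(abs(y) < 1e-9 for _, y in aligned))  # True
```

With the points axis-aligned, runs of identical y-coordinates compress well, which is the effect the abstract describes.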

Development of Integrated Retrieval System of the Biology Sequence Database Using Web Service (웹 서비스를 이용한 바이오 서열 정보 데이터베이스 및 통합 검색 시스템 개발)

  • Lee, Su-Jung;Yong, Hwan-Seung
    • The KIPS Transactions:PartD
    • /
    • v.11D no.4
    • /
    • pp.755-764
    • /
    • 2004
  • Recently, the rapid development of biotechnology has brought an explosion of biological data and biological data hosts. Moreover, these data are highly distributed and heterogeneous, reflecting the distribution and heterogeneity of the molecular biology research community. As a consequence, the integration and interoperability of molecular biology databases are issues of considerable importance. Up to now, however, most integrated systems, such as link-based systems and data-warehouse-based systems, have had difficulty keeping data up to date when the schema or data of a source change. For this reason, integrated systems using web service technology, which allow biological data to be fully exploited, have been proposed. In this paper, we built an integrated system for bio sequence information based on web service technology. The developed system allows users to get data in many formats, such as BSML, GenBank, and Fasta, while traversing disparate data resources. It also has better retrieval performance because the retrieval modules of the external databases proceed in parallel.

A Study of Big data-based Machine Learning Techniques for Wheel and Bearing Fault Diagnosis (차륜 및 차축베어링 고장진단을 위한 빅데이터 기반 머신러닝 기법 연구)

  • Jung, Hoon;Park, Moonsung
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.19 no.1
    • /
    • pp.75-84
    • /
    • 2018
  • Increasing the operation rate of components and stabilizing operation through timely management of core parts are crucial for improving the efficiency of the railroad maintenance industry. The demand for diagnosis technology to assess the condition of rolling stock components, employing history management and automated big data analysis, has increased in order to both improve reliability and reduce the maintenance cost of core components in response to the trend of rapid maintenance. This study developed a big data platform-based system to manage rolling stock component condition, acquiring, processing, and analyzing the big data generated at onboard and wayside devices of railroad cars in real time. The system can monitor the condition of railroad car components and system resources in real time. The study also proposed a machine learning technique that enables distributed and parallel processing of the acquired big data and automatic component fault diagnosis. A test using the virtual instance generation system of Amazon Web Services showed that the algorithm applying distributed and parallel technology decreased the runtime, and confirmed a fault diagnosis model utilizing random forest machine learning that predicts the condition of bearing and wheel parts with 83% accuracy.
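
A random-forest fault classifier of the kind described can be sketched with scikit-learn. The feature data below is synthetic (two Gaussian clusters standing in for healthy and faulty vibration features), not the paper's sensor data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical 4-dimensional vibration features for condition
# monitoring: healthy parts cluster near 0, faulty parts near 1.
X_healthy = rng.normal(0.0, 0.3, size=(200, 4))
X_faulty = rng.normal(1.0, 0.3, size=(200, 4))
X = np.vstack([X_healthy, X_faulty])
y = np.array([0] * 200 + [1] * 200)  # 0 = healthy, 1 = faulty

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y)
accuracy = model.score(X, y)
print(round(accuracy, 2))
```

In the paper's setting, the forest would be trained on features extracted from the onboard/wayside big data platform, with training distributed across workers.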

Performance Analysis on Declustering High-Dimensional Data by GRID Partitioning (그리드 분할에 의한 다차원 데이터 디클러스터링 성능 분석)

  • Kim, Hak-Cheol;Kim, Tae-Wan;Li, Ki-Joune
    • The KIPS Transactions:PartD
    • /
    • v.11D no.5
    • /
    • pp.1011-1020
    • /
    • 2004
  • A lot of work has been done to improve the I/O performance of systems that store and manage a massive amount of data by distributing it across multiple disks and accessing it in parallel. Most previous work has focused on an efficient mapping from a grid cell, which is determined by the interval number of each dimension, to a disk number, on the assumption that each dimension is split into disjoint intervals so that the entire data space is GRID-partitioned. However, the effects of the GRID partitioning scheme itself on declustering performance have been ignored. In this paper, we enhance the performance of mapping-function-based declustering algorithms by applying a good GRID partitioning method. For this, we propose an estimation model to count the number of grid cells intersected by a range query, and we apply the GRID partitioning scheme that minimizes the query result size among the possible schemes. While it is common to perform binary partitioning for high-dimensional data, we choose fewer dimensions than needed for binary partitioning and split several times along those dimensions, so that we can reduce the number of grid cells touched by a query. Several experimental results show that the proposed estimation model gives accuracy within a 0.5% error ratio regardless of query size and dimension. We can also improve the performance of the declustering algorithm based on the mapping function called the Kronecker Sequence, which has been known to be the best among mapping functions for high-dimensional data, by up to 23 times by applying an efficient GRID partitioning scheme.
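
The cell-counting idea can be illustrated with a simplified model (this is not the paper's estimation model; the split points and query ranges below are made up). For a query range per dimension, the number of touched cells is the product over dimensions of the number of overlapped intervals:

```python
from bisect import bisect_left, bisect_right

def cells_touched(splits, query):
    """Count grid cells intersected by a range query on the unit cube.

    splits[d] holds the sorted interior split points of dimension d;
    query[d] = (lo, hi) is the query range along that dimension.
    """
    total = 1
    for pts, (lo, hi) in zip(splits, query):
        # number of intervals of this dimension overlapped by [lo, hi]
        total *= bisect_left(pts, hi) - bisect_right(pts, lo) + 1
    return total

# Both schemes produce 4 cells in 2-D, but for a query narrow in
# dimension 0 and wide in dimension 1, splitting only dimension 0
# touches fewer cells than binary-splitting both dimensions.
query = [(0.1, 0.2), (0.1, 0.9)]
print(cells_touched([[0.25, 0.5, 0.75], []], query))  # 1 cell
print(cells_touched([[0.5], [0.5]], query))           # 2 cells
```

This is the intuition behind choosing fewer, more finely split dimensions instead of binary-splitting every dimension: for typical query shapes it reduces the number of cells, and hence disks, a query must touch.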