• Title/Summary/Keyword: Large-memory data processing


A Parallel Processing Technique for Large Spatial Data (대용량 공간 데이터를 위한 병렬 처리 기법)

  • Park, Seunghyun; Oh, Byoung-Woo
    • Spatial Information Research / v.23 no.2 / pp.1-9 / 2015
  • A graphics processing unit (GPU) contains many arithmetic logic units (ALUs) that can be exploited for parallel processing, so it can process data efficiently. Spatial data require many geographic coordinates to represent the shapes of features on a map, and the coordinates are usually stored as geodetic longitude and latitude. To display a map in a 2-dimensional Cartesian coordinate system, the geodetic longitude and latitude must be converted to the Universal Transverse Mercator (UTM) coordinate system. Both this coordinate conversion and the rendering process that draws the converted coordinates on the screen involve complex floating-point computations. In this paper, we propose a parallel processing technique that performs the conversion and the rendering on the GPU to improve performance. Large spatial data are stored in files on disk; to access them efficiently, we propose a technique that merges the spatial data files into one large file and accesses it as a memory-mapped file. We implemented the proposed technique and performed experiments with 747,302,971 points of TIGER/Line spatial data. In the experiments, coordinate conversion on the GPU was 30.16 times faster than the CPU-only method, and rendering was 80.40 times faster.
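The memory-mapped-file access described in this abstract can be illustrated with a minimal POSIX C++ sketch; the file name and the packed longitude/latitude record layout are assumptions for illustration, not details taken from the paper.

```cpp
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Assumed record layout: the merged file is a packed array of lon/lat pairs.
struct GeoPoint {
    double lon;  // geodetic longitude in degrees
    double lat;  // geodetic latitude in degrees
};

int main() {
    // "merged_spatial.bin" is a hypothetical name for the merged data file.
    int fd = open("merged_spatial.bin", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }

    // Map the whole file; points are then read without explicit read() calls.
    void* base = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    const GeoPoint* points = static_cast<const GeoPoint*>(base);
    size_t count = st.st_size / sizeof(GeoPoint);

    // The mapped array could now be handed to the coordinate-conversion and
    // rendering stages (done on the GPU in the paper) without further file I/O.
    double sum_lon = 0.0;
    for (size_t i = 0; i < count; ++i) sum_lon += points[i].lon;
    std::printf("points: %zu, mean lon: %f\n", count, count ? sum_lon / count : 0.0);

    munmap(base, st.st_size);
    close(fd);
    return 0;
}
```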

Design of Asynchronous Non-Volatile Memory Module Using NAND Flash Memory and PSRAM (낸드 플래시 메모리와 PSRAM을 이용한 비동기용 불휘발성 메모리 모듈 설계)

  • Kim, Tae Hyun; Yang, Oh; Yeon, Jun Sang
    • Journal of the Semiconductor & Display Technology / v.19 no.3 / pp.118-123 / 2020
  • In this paper, a design method for an asynchronous non-volatile memory module that can efficiently process and store large amounts of data without loss when the power is turned off is proposed and implemented. PSRAM, which combines the advantages of DRAM and SRAM, is used for data processing, and NAND flash memory is used for data storage and backup. Signal interference caused by the characteristics of the memory devices was resolved through a PCB design based on high-density integration technology. In addition, a boost circuit using a 0.47 F supercapacitor was designed to supply sufficient power to the system while data are backed up after the power is cut. As a result, an asynchronous non-volatile memory module that guarantees reliability and stability and can retain data semi-permanently, for about 10 years, was designed and implemented. The proposed method solves the problem of frequent data loss at industrial sites and, by providing convenience to users and managers, demonstrates its potential for commercialization.
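The power-loss backup flow described above can be sketched as follows; this is a simulation-only illustration in which PSRAM and NAND are modeled as plain arrays, and every name and size is an assumption rather than the module's actual firmware interface.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Assumed sizes; PSRAM and NAND are modeled as plain in-memory arrays.
constexpr std::size_t PAGE_SIZE = 2048;  // assumed NAND page size in bytes
constexpr std::size_t NUM_PAGES = 64;    // assumed size of the region to preserve

std::array<uint8_t, PAGE_SIZE * NUM_PAGES> psram{};            // simulated PSRAM
std::array<std::array<uint8_t, PAGE_SIZE>, NUM_PAGES> nand{};  // simulated NAND pages

// Copy the volatile PSRAM image into non-volatile NAND, one page at a time;
// in the real module the 0.47 F supercapacitor powers this copy after power loss.
void backup_on_power_fail() {
    for (std::size_t page = 0; page < NUM_PAGES; ++page) {
        const uint8_t* src = psram.data() + page * PAGE_SIZE;
        std::copy(src, src + PAGE_SIZE, nand[page].begin());
    }
}

int main() {
    psram[0] = 0xAB;            // pretend some data was being processed
    backup_on_power_fail();     // would be triggered by a power-fail interrupt
    std::printf("first backed-up byte: 0x%02X\n", nand[0][0]);
    return 0;
}
```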

A Design of Discrete Wavelet Transform Encoder for Multimedia Image Signal Processing (멀티미디어 영상신호 처리를 위한 DWT 부호화기 설계)

  • Lee, Kang-Hyun
    • Proceedings of the IEEK Conference / 2003.07d / pp.1685-1688 / 2003
  • Modern multimedia applications such as video processors, video conferencing, and video phones require real-time processing, and because of the large amount of image data they handle, they also require high compression performance. In this paper, the proposed image processing encoder is designed using wavelet transform coding. The proposed filter block can process image data at high speed because its individual function blocks are composed in parallel and it computes both the highpass and lowpass coefficients in the same clock cycle. When image data are decomposed into multiple resolutions, the proposed scheme needs external memory and a controller to store intermediate results, and it can operate at 33 MHz.

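The idea of producing the lowpass and highpass coefficients of a signal in a single pass, which the abstract above attributes to the parallel filter block, can be illustrated in software with the simple Haar filter below; the paper's hardware design and actual filter choice are not reproduced.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// One decomposition pass: each pair of input samples yields one lowpass and
// one highpass coefficient, computed together (the Haar filter is used here
// purely for illustration).
void haar_dwt_1d(const std::vector<float>& in,
                 std::vector<float>& low, std::vector<float>& high) {
    std::size_t half = in.size() / 2;
    low.resize(half);
    high.resize(half);
    for (std::size_t i = 0; i < half; ++i) {
        float a = in[2 * i], b = in[2 * i + 1];
        low[i]  = (a + b) * 0.5f;   // lowpass (average) coefficient
        high[i] = (a - b) * 0.5f;   // highpass (difference) coefficient
    }
}

int main() {
    std::vector<float> row = {10, 12, 14, 18, 20, 20, 8, 6};
    std::vector<float> low, high;
    haar_dwt_1d(row, low, high);    // one pass produces both subbands
    for (std::size_t i = 0; i < low.size(); ++i)
        std::printf("L=%.1f H=%.1f\n", low[i], high[i]);
    return 0;
}
```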

The Conceptual Design of Mass Memory Unit for High Speed Data Processing in the STSAT-3 (고속 데이터 처리를 위한 과학기술위성 3호 대용량 메모리 유닛의 개념 설계)

  • Seo, In-Ho; Oh, Dae-Soo; Myung, Noh-Hoon
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.38 no.4 / pp.389-394 / 2010
  • This paper describes the conceptual design of the mass memory unit for high-speed data processing and mass memory management in STSAT-3, compared with that of STSAT-2. To satisfy the requirements of receiving data from two payloads at a maximum speed of 100 Mbps and managing 32 Gb of mass memory, an FPGA directly controls both tasks. An SRAM-based FPGA from XILINX was chosen for its fast operating speed and large number of logic cells. Accordingly, Triple Modular Redundancy (TMR) and configuration memory scrubbing techniques will also be used to protect the FPGA from Single Event Upsets (SEU) in space.
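The Triple Modular Redundancy mentioned above can be illustrated with a short software sketch in which three copies of a value are kept and a bitwise majority vote masks a single upset copy; the paper applies TMR inside the FPGA logic, so this is only a conceptual analogue.

```cpp
#include <cstdint>
#include <cstdio>

// Bitwise majority vote: each result bit is set iff at least two of the
// three redundant copies have that bit set.
uint32_t tmr_vote(uint32_t a, uint32_t b, uint32_t c) {
    return (a & b) | (b & c) | (a & c);
}

int main() {
    uint32_t stored = 0xCAFEBABEu;
    uint32_t copy1 = stored, copy2 = stored, copy3 = stored;
    copy2 ^= 0x00000400u;            // simulate a single-event upset in one copy
    uint32_t voted = tmr_vote(copy1, copy2, copy3);
    std::printf("voted = 0x%08X (upset masked: %s)\n",
                voted, voted == stored ? "yes" : "no");
    return 0;
}
```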

An Efficient System Software of Flash Translation Layer for Large Block Flash Memory (대용량 플래시 메모리를 위한 효율적인 플래시 변환 계층 시스템 소프트웨어)

  • Chung, Tae-Sun; Park, Dong-Joo; Cho, Sehyeong
    • The KIPS Transactions: Part A / v.12A no.7 s.97 / pp.621-626 / 2005
  • Recently, flash memory has been widely used in various embedded applications because of its advantages of non-volatility, fast access speed, shock resistance, and low power consumption. However, it requires a software layer called the FTL (Flash Translation Layer) due to its hardware characteristics. We present a new FTL algorithm named LSTAFF (Large State Transition Applied Fast flash Translation Layer), which is designed for large-block flash memory. LSTAFF is adapted to flash memory whose pages are larger than the operating system's data sector size, and we provide performance results based on our implementation of LSTAFF and previous FTL algorithms using a flash simulator.
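As background to the abstract above, the following sketch shows what a flash translation layer does in its simplest page-mapping form: logical writes are redirected to free flash pages and a mapping table is updated, because flash pages cannot be overwritten in place. It illustrates the FTL concept only and is not the LSTAFF algorithm.

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <vector>

// Page-level mapping FTL in its simplest form: updates go to fresh physical
// pages and the logical-to-physical table is rewritten (garbage collection
// and wear leveling are omitted).
class SimpleFtl {
public:
    explicit SimpleFtl(uint32_t total_pages) : flash_(total_pages, 0), next_free_(0) {}

    bool write(uint32_t logical_page, uint32_t data) {
        if (next_free_ >= flash_.size()) return false;  // real FTLs would garbage-collect here
        flash_[next_free_] = data;
        map_[logical_page] = next_free_++;
        return true;
    }

    bool read(uint32_t logical_page, uint32_t& data) const {
        auto it = map_.find(logical_page);
        if (it == map_.end()) return false;
        data = flash_[it->second];
        return true;
    }

private:
    std::vector<uint32_t> flash_;                  // simulated physical flash pages
    std::unordered_map<uint32_t, uint32_t> map_;   // logical page -> physical page
    uint32_t next_free_;
};

int main() {
    SimpleFtl ftl(16);
    ftl.write(3, 111);
    ftl.write(3, 222);              // the update lands on a new physical page
    uint32_t v = 0;
    if (ftl.read(3, v)) std::printf("logical page 3 -> %u\n", v);  // prints 222
    return 0;
}
```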

Memory-Efficient Belief Propagation for Stereo Matching on GPU (GPU 에서의 고속 스테레오 정합을 위한 메모리 효율적인 Belief Propagation)

  • Choi, Young-Kyu; Williem, Williem; Park, In Kyu
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2012.11a / pp.52-53 / 2012
  • Belief propagation (BP) is a commonly used global energy minimization algorithm for solving the stereo matching problem in 3D reconstruction. However, it requires large memory bandwidth and data size. In this paper, we propose a novel memory-efficient BP algorithm for stereo matching on the graphics processing unit (GPU). The data size and transfer bandwidth are significantly reduced by storing only a part of the whole message; to maintain the accuracy of the matching result, the local messages are reconstructed using the shared memory available on the GPU. Experimental results show almost an order-of-magnitude reduction in global memory consumption and a 21 to 46% saving in memory bandwidth compared with the conventional algorithm. An implementation on a recent GPU achieves a 22.8 times speedup in execution time compared with execution on the CPU.

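For context, the sketch below shows the standard min-sum message update that memory-efficient GPU variants such as the one above build on; the paper's message compression and shared-memory reconstruction are not reproduced, and the disparity range and smoothness cost are illustrative assumptions.

```cpp
#include <algorithm>
#include <array>
#include <cstdio>

constexpr int DISPARITIES = 16;   // assumed disparity range
constexpr float LAMBDA = 1.0f;    // assumed linear smoothness weight

using Message = std::array<float, DISPARITIES>;

// Min-sum message update: m(d) = min over d' of ( h(d') + LAMBDA * |d - d'| ),
// where h already sums the data cost and the incoming messages from all
// neighbors except the target of this message.
Message update_message(const Message& h) {
    Message m;
    for (int d = 0; d < DISPARITIES; ++d) {
        float best = h[0] + LAMBDA * d;
        for (int dp = 1; dp < DISPARITIES; ++dp) {
            int diff = d > dp ? d - dp : dp - d;
            best = std::min(best, h[dp] + LAMBDA * diff);
        }
        m[d] = best;
    }
    return m;
}

int main() {
    Message h{};
    for (int d = 0; d < DISPARITIES; ++d) h[d] = (d == 5) ? 0.0f : 3.0f;  // toy costs
    Message m = update_message(h);
    for (int d = 0; d < DISPARITIES; ++d) std::printf("m[%d] = %.1f\n", d, m[d]);
    return 0;
}
```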

GPU Based Incremental Connected Component Processing in Dynamic Graphs (동적 그래프에서 GPU 기반의 점진적 연결 요소 처리)

  • Kim, Nam-Young; Choi, Do-Jin; Bok, Kyoung-Soo; Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.22 no.6 / pp.56-68 / 2022
  • Recently, as the demand for real-time processing increases, dynamic graphs that change over time have been actively studied. Connected components processing is one of the algorithms used to analyze dynamic graphs. GPUs are suitable for large-scale graph computation because of their high memory bandwidth and computational performance. However, when the connected components of a dynamic graph are computed on the GPU, its limited memory causes frequent data exchange between the CPU and the GPU while processing real graphs. The proposed scheme utilizes the Weighted-Quick-Union algorithm to process large-scale graphs on the GPU, supporting fast connected components computation by attaching the component size to the connected component label. It computes the connected components by determining which parts need to be recalculated and minimizing the data transmitted to the GPU. In addition, we propose a processing structure in which the GPU and the CPU execute asynchronously to reduce the data transfer time between them. We demonstrate the effectiveness of the proposed scheme through performance evaluations on real datasets.
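The Weighted-Quick-Union structure named in the abstract above can be sketched on the CPU as follows: each root stores its component size and the smaller tree is linked under the larger one, which keeps the trees shallow; the GPU and incremental aspects of the paper are not reproduced.

```cpp
#include <cstdio>
#include <numeric>
#include <utility>
#include <vector>

class WeightedQuickUnion {
public:
    explicit WeightedQuickUnion(int n) : parent_(n), size_(n, 1) {
        std::iota(parent_.begin(), parent_.end(), 0);  // every vertex starts as its own root
    }

    int find(int x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];          // path halving keeps trees flat
            x = parent_[x];
        }
        return x;
    }

    void unite(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return;
        if (size_[ra] < size_[rb]) std::swap(ra, rb);
        parent_[rb] = ra;                              // attach the smaller tree under the larger
        size_[ra] += size_[rb];                        // the root's label carries the component size
    }

    bool connected(int a, int b) { return find(a) == find(b); }

private:
    std::vector<int> parent_;
    std::vector<int> size_;
};

int main() {
    WeightedQuickUnion cc(6);
    cc.unite(0, 1);
    cc.unite(1, 2);    // inserting edges incrementally merges components
    cc.unite(4, 5);
    std::printf("0-2 connected: %d, 2-4 connected: %d\n",
                (int)cc.connected(0, 2), (int)cc.connected(2, 4));
    return 0;
}
```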

Design and Implementation of Memory-Centric Computing System for Big Data Analysis

  • Jung, Byung-Kwon
    • Journal of the Korea Society of Computer and Information / v.27 no.7 / pp.1-7 / 2022
  • Recently, applications such as big data and machine learning programs that generate large amounts of data inside the program itself have become common, and main memory alone is often insufficient, making it difficult to execute such programs quickly. In particular, the need to obtain results more quickly has emerged in situations such as the coronavirus outbreak, where it is necessary to analyze whether an entire genome sequence has been altered. When large-capacity data were processed on a computing system equipped with the memory-pool MOCA host adapter developed in this work instead of on an existing SSD, performance improved by 16% compared with the SSD-based system. In addition, in other benchmark tests on the workflow tasks SortSampleBam, ApplyBQSR, and GatherBamFiles, I/O performance on the system equipped with the memory-pool MOCA host adapter was 92.8%, 80.6%, and 32.8% faster than on the SSD, respectively. When analyzing large amounts of data, as in whole-genome analysis pipelines, the computing system equipped with the memory-pool MOCA host adapter developed in this research is expected to reduce the delays that occur at runtime.

Auto Regulated Data Provisioning Scheme with Adaptive Buffer Resilience Control on Federated Clouds

  • Kim, Byungsang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.11 / pp.5271-5289 / 2016
  • On large-scale data analysis platforms deployed on cloud infrastructures over the Internet, the instability of data transfer times and the dynamics of processing rates call for a more sophisticated data distribution scheme, one that maximizes parallel efficiency by balancing the load among the participating computing elements and eliminating their idle time. In particular, under real-time constraints and with a limited data buffer (in-memory storage), a more controllable mechanism is needed to prevent both overflow and underflow of the finite buffer. In this paper, we propose an auto-regulated data provisioning model based on a receiver-driven data pull model. In this model, a synchronized data replenishment mechanism implicitly avoids data buffer overflow and explicitly regulates data buffer underflow by adequately adjusting the buffer resilience. To estimate the optimal buffer resilience, we exploit an adaptive buffer resilience control scheme that minimizes both the data buffer space and the idle time of the processing elements, based on directly measured sample path analysis. Simulation results show that the proposed scheme provides an acceptable approximation of the numerical results. It is also efficient enough to apply to dynamic environments in which stochastic characteristics cannot be postulated for the data transfer time or the data processing rate, or in which both fluctuate.
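A simplified sketch of the receiver-driven replenishment idea above: the consumer requests only as much data as fits in its finite buffer, so it cannot overflow, and it issues a new request whenever the buffered amount falls below a resilience threshold, so it avoids underflow and idle time. The thresholds and sizes are illustrative and do not reflect the paper's control law.

```cpp
#include <cstddef>
#include <cstdio>
#include <deque>

class PullBuffer {
public:
    PullBuffer(std::size_t capacity, std::size_t resilience)
        : capacity_(capacity), resilience_(resilience) {}

    // How many items the receiver should request from the sender right now:
    // nothing while the buffer is above the low watermark, otherwise at most
    // the free space, so the finite buffer can never overflow.
    std::size_t replenish_request() const {
        if (buf_.size() >= resilience_) return 0;
        return capacity_ - buf_.size();
    }

    void deliver(int item) { if (buf_.size() < capacity_) buf_.push_back(item); }

    bool consume(int& item) {
        if (buf_.empty()) return false;   // underflow: the processing element would idle
        item = buf_.front();
        buf_.pop_front();
        return true;
    }

private:
    std::size_t capacity_;     // finite in-memory buffer size
    std::size_t resilience_;   // low watermark that triggers replenishment
    std::deque<int> buf_;
};

int main() {
    PullBuffer buf(8, 3);      // illustrative capacity and resilience values
    int next = 0;
    for (int step = 0; step < 5; ++step) {
        std::size_t req = buf.replenish_request();         // receiver-driven pull
        for (std::size_t i = 0; i < req; ++i) buf.deliver(next++);
        int item;
        while (buf.consume(item)) { /* process item */ }
        std::printf("step %d requested %zu items\n", step, req);
    }
    return 0;
}
```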

Metadata-Based Data Structure Analysis to Optimize Search Speed and Memory Efficiency (검색 속도와 메모리 효율 최적화를 위한 메타데이터 기반 데이터 구조 분석)

  • Kim, Se Yeon; Lim, Young Hoon
    • The Transactions of the Korea Information Processing Society / v.13 no.7 / pp.311-318 / 2024
  • As the amount of data grows with the development of artificial intelligence and the Internet, data management is becoming increasingly important, and efficient use of data retrieval and memory space is crucial. In this study, we investigate how to optimize search speed and memory efficiency by analyzing data structures based on metadata. As a research method, we compared and analyzed the performance of the array, association list, dictionary, binary tree, and graph data structures using the metadata of photographic images, focusing on time and space complexity. The experiments confirmed that, for large-scale image data, the dictionary data structure performs best in collection speed and the graph data structure performs best in search speed. We expect the results of this paper to provide practical guidelines for selecting data structures that optimize search speed and memory efficiency for image data.
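The dictionary-style indexing compared above can be illustrated with a small sketch in which image metadata (here, a capture-date string, chosen only for illustration) serves as a hash-map key, giving near-constant-time lookups instead of a linear scan over an array or list.

```cpp
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

int main() {
    // Dictionary: metadata value (capture date) -> IDs of matching images.
    std::unordered_map<std::string, std::vector<int>> by_date;
    by_date["2024-03-01"] = {101, 102};
    by_date["2024-03-02"] = {103};

    // Lookup by metadata key is average O(1), versus scanning every record
    // in an array or list.
    auto it = by_date.find("2024-03-01");
    if (it != by_date.end())
        std::printf("images taken on 2024-03-01: %zu\n", it->second.size());
    return 0;
}
```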