• Title/Summary/Keyword: I/O Processing

Pipelined Macroblock Processing to Reduce Internal Buffer Size of Motion Estimation in Multimedia SoCs

  • Lee, Seong-Soo
    • ETRI Journal
    • /
    • v.25 no.5
    • /
    • pp.297-304
    • /
    • 2003
  • A multimedia SoC often requires a large internal buffer because it must store the whole search window to reduce the huge I/O bandwidth of motion estimation. However, the silicon area of the internal buffer increases tremendously as the search range grows. This paper proposes a method that greatly reduces the internal buffer size of a multimedia SoC without changing the computational cost, I/O bandwidth, or image quality. In the proposed method, only the overlapped parts of the search windows for consecutive macroblocks are stored in the internal buffer. The proposed method reduces the internal buffer size to 1/5.0 and 1/8.8 when the search range is $\pm 64 \times \pm 64$ and $\pm 128 \times \pm 128$, respectively.
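
    To make the buffer-size problem concrete, the following is a minimal sketch of the search-window arithmetic, not the paper's pipelined method. It assumes a 16×16 macroblock and a square search window of side `mb_size + 2 * search_range`; the function names are hypothetical, and the sketch does not attempt to reproduce the 1/5.0 and 1/8.8 figures.

```python
def search_window_side(search_range: int, mb_size: int = 16) -> int:
    """Side length in pixels of a full search window (assumed geometry)."""
    return mb_size + 2 * search_range


def full_window_pixels(search_range: int, mb_size: int = 16) -> int:
    """Pixels a buffer must hold to store the whole search window."""
    side = search_window_side(search_range, mb_size)
    return side * side


def horizontal_overlap_fraction(search_range: int, mb_size: int = 16) -> float:
    """Fraction of the window shared with the next macroblock to the right.

    Moving one macroblock to the right shifts the window by mb_size columns,
    so all but mb_size columns are reused.
    """
    side = search_window_side(search_range, mb_size)
    return (side - mb_size) / side


if __name__ == "__main__":
    for p in (16, 64, 128):
        print(p, full_window_pixels(p), round(horizontal_overlap_fraction(p), 3))
```

    Under these assumptions the full window grows quadratically with the search range, while most of it is shared between consecutive macroblocks, which is why keeping it on chip saves I/O bandwidth at the cost of buffer area.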

Multiring: An Efficient Hardware Accelerator for Design Rule Checking (멀티링 설계규칙검사를 위한 효과적인 하드웨어 가속기)

  • 노길수;경종민
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.24 no.6
    • /
    • pp.1040-1048
    • /
    • 1987
  • We propose a hardware architecture called Multiring that is applicable to various geometrical operations on rectilinear objects, such as design rule checking in VLSI layout, and to many image processing operations, including noise suppression and contour extraction. It offers both fast execution and extremely high flexibility. The architecture is divided into four main parts: I/O between the host and Multiring, ring memory, a linear processor array, and an instruction decoder. Data transmission between the host and Multiring is bit-serial, which reduces the bandwidth requirement for the channel and the number of external pins, while each row of the bitmap stored in ring memory is processed by its corresponding processor in full parallelism. In each ring cycle, the instruction decoder/controller simultaneously configures every processor to perform one of 16 basic instructions, such as Boolean (AND, OR, NOT, and Copy), geometrical (Expand and Shrink), and I/O operations, which gives Multiring maximal flexibility with respect to design rule changes or instruction set enhancements. The correct functional behavior of Multiring was confirmed by successfully running a software simulator with one-to-one structural correspondence to the Multiring hardware.
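
    The row-wise Boolean and geometrical instructions can be illustrated with a small software sketch. The abstract does not define Expand and Shrink, so this assumes they behave like 3×3 dilation and erosion on a bitmap whose rows are stored as integers; `WIDTH`, `expand`, and `shrink` are hypothetical names, and the sequential loop only stands in for the per-row processors.

```python
WIDTH = 8                   # assumed bitmap width in bits
MASK = (1 << WIDTH) - 1     # keeps shifted rows inside the bitmap


def _neighbors(bitmap, i):
    """Rows above and below row i; rows outside the bitmap count as empty."""
    up = bitmap[i - 1] if i > 0 else 0
    down = bitmap[i + 1] if i < len(bitmap) - 1 else 0
    return up, down


def expand(bitmap):
    """Grow every feature by one pixel in all eight directions (OR-based)."""
    out = []
    for i, row in enumerate(bitmap):
        up, down = _neighbors(bitmap, i)
        merged = row | up | down
        out.append((merged | (merged << 1) | (merged >> 1)) & MASK)
    return out


def shrink(bitmap):
    """Remove one pixel from every feature boundary (AND-based)."""
    out = []
    for i, row in enumerate(bitmap):
        up, down = _neighbors(bitmap, i)
        core = row & up & down
        out.append(core & (core << 1) & (core >> 1) & MASK)
    return out


if __name__ == "__main__":
    wide = [0, 0b00111000, 0b00111000, 0b00111000, 0]
    narrow = [0, 0b00011000, 0b00011000, 0b00011000, 0]
    print(shrink(wide), shrink(narrow))
```

    Under this assumption, a feature narrower than three pixels disappears after one Shrink, which is one way a minimum-width rule could be checked with such primitives.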

Development of Large Scale Programmable Controller (대형 프로그래머블 콘트롤러의 개발 2 : Part II (S/W))

  • 권욱현;박홍성;최한홍;김덕우
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 1987.10b
    • /
    • pp.413-418
    • /
    • 1987
  • The software developed for the large-scale Programmable Controller consists of the Programmer's S/W, the Controller's S/W, the RBC's (Remote Base Controller's) S/W, and the Analog's S/W. The Programmer's S/W, which runs on the Programmer, includes the editor, the compiler, the communication program, and other programs for ease of use. The Controller's S/W, which requires a fast scan time, consists of the BTI (Block Type Instruction) solving program, the timer service routine, the I/O update program, the communication program, and others. The RBC's S/W includes the communication program, the error recovery program, and the I/O processing program. The Analog's S/W, controlled by the Programmer, includes the PID program. Data communication was developed between the Programmer and the Controller, between the Controller and the RBC, and between the RBC and the Analog.

The Design of the Cost Model for Query Processing in Parallel Spatial Database (병렬 공간 데이터베이스의 질의 처리를 위한 비용 모델의 설계)

  • 안성우;서영덕;홍봉희
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2000.04b
    • /
    • pp.90-92
    • /
    • 2000
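  • Cost models and estimation are basic tools for measuring performance in every DBMS. Many studies have proposed cost models for query execution, but most considered only the cost of specific parts, such as CPU cost, the number of I/Os for an index method, or I/O cost, and therefore fell short of deriving the total cost of executing a query. Moreover, no study has yet derived a cost model for a parallel spatial DBMS. This paper derives the overall cost of processing a query in a parallel spatial DBMS. By applying the cost models proposed in existing work and adding the considerations that arise when parallel computers and spatial data are combined, it derives an overall cost model suited to parallel spatial DBMSs, providing a foundation for cost modeling in further research on efficient query execution in parallel spatial DBMSs.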

The Design of the Cost Model for Query Processing in Parallel Spatial Databases (병렬 공간 데이터베이스의 질의 처리를 위한 비용 모델의 설계)

  • 안성우;서영덕;홍봉희
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2000.04b
    • /
    • pp.51-53
    • /
    • 2000
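  • Cost models and estimation are basic tools for measuring performance in every DBMS. Many studies have proposed cost models for query execution, but most considered only the cost of specific parts, such as CPU cost, the number of I/Os for an index method, or I/O cost, and therefore fell short of deriving the total cost of executing a query. Moreover, no study has yet derived a cost model for a parallel spatial DBMS. This paper derives the overall cost of processing a query in a parallel spatial DBMS. By applying the cost models proposed in existing work and adding the considerations that arise when parallel computers and spatial data are combined, it derives an overall cost model suited to parallel spatial DBMSs, providing a foundation for cost modeling in further research on efficient query execution in parallel spatial DBMSs.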

STUDY OF DETERMINISM OF DATA INTEGRITY DURING I/O DATA EXCHANGE BETWEEN TASKS AND DEVICE

  • Koo, Cheol-Hea;Park, Su-Hyun;Kang, Soo-Yeon;Yang, Koon-Ho;Choi, Sung-Bong
    • Proceedings of the KSRS Conference
    • /
    • 2007.10a
    • /
    • pp.77-80
    • /
    • 2007
  • This paper presents a method for preventing the data corruption that can occur when a collision happens during I/O data exchange between a device and tasks. An example diagram of a mechanism based on this method is shown, and the method's effects, merits, and demerits are evaluated.

Optimizing LRU Lock Management in the Linux Kernel for Improving Parallel Write Throughput in Many-Core CPU Systems (매니코어 CPU 시스템의 병렬 쓰기 성능 향상을 위한 리눅스 커널의 LRU 관리 최적화 기법)

  • Eun-Kyu Byun;Gibeom Gu;Kwang-Jin Oh;Jiwoo Bang
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.12 no.7
    • /
    • pp.209-216
    • /
    • 2023
  • Modern HPC systems are equipped with many-core CPUs with dozens of cores. When parallel I/O is performed on such a system, scalability is limited by the LRU lock management policy of Linux. This study proposes FinerLRU to solve this problem. FinerLRU improves the parallel write performance of file systems that use the buffer cache through finer-grained lock management, increasing the number of LRU locks up to the maximum number of cores. The proposed method was implemented in Linux 5.18.11, and performance was measured on two types of CPUs with different characteristics, Intel Ice Lake Xeon and Intel Knights Landing; a performance improvement of about two times was obtained on both types of systems.
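
    The general idea of replacing one lock with many can be sketched in user space. This is not the FinerLRU kernel code: it is a minimal illustration of lock striping, assuming an LRU structure split into independent shards with one lock each. The `ShardedLRU` class, the hash-based shard choice, and `parallel_fill` are all hypothetical, and Python threads show only the structure, not the performance gain.

```python
import threading
from collections import OrderedDict


class ShardedLRU:
    """LRU cache split into independent shards, one lock per shard."""

    def __init__(self, num_shards, capacity_per_shard):
        self._shards = [OrderedDict() for _ in range(num_shards)]
        self._locks = [threading.Lock() for _ in range(num_shards)]
        self._capacity = capacity_per_shard

    def _shard_index(self, key):
        # Hypothetical mapping: hash the key to pick a shard.
        return hash(key) % len(self._shards)

    def put(self, key, value):
        i = self._shard_index(key)
        with self._locks[i]:                 # only this shard is locked
            shard = self._shards[i]
            shard[key] = value
            shard.move_to_end(key)           # mark as most recently used
            if len(shard) > self._capacity:
                shard.popitem(last=False)    # evict the least recently used

    def get(self, key):
        i = self._shard_index(key)
        with self._locks[i]:
            shard = self._shards[i]
            if key not in shard:
                return None
            shard.move_to_end(key)
            return shard[key]

    def __len__(self):
        return sum(len(shard) for shard in self._shards)


def parallel_fill(cache, num_threads, keys_per_thread):
    """Insert disjoint integer keys from several threads at once."""
    def worker(tid):
        for k in range(keys_per_thread):
            cache.put(tid * keys_per_thread + k, tid)

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

    With a single shard, every writer contends for the same lock; with as many shards as cores, writers touching different shards do not block each other, at the cost of an LRU order that is only maintained per shard.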

Performance Optimization of Big Data Center Processing System - Big Data Analysis Algorithm Based on Location Awareness

  • Zhao, Wen-Xuan;Min, Byung-Won
    • International Journal of Contents
    • /
    • v.17 no.3
    • /
    • pp.74-83
    • /
    • 2021
  • This study proposes a location-aware algorithm to optimize the performance of distributed big data processing systems that suffer from low data reliability and low application performance. Compared with previous algorithms, the location-aware data block placement algorithm uses data block placement and node data recovery strategies to improve application performance and data reliability. Simulation and actual cluster tests showed that the proposed location-aware placement algorithm could greatly improve data reliability and shorten the application processing time of I/O interfaces in real time.

Comparative study of various buffer layers on IBAD-MgO template (IBAD-MgO 기판 위 다양한 완충층들의 비교 연구)

  • Ko, K.P.;Jang, K.S.;Yoo, S.I.;Oh, S.S.;Ko, R.K.;Moon, S.H.;Kim, H.K.
    • Progress in Superconductivity and Cryogenics
    • /
    • v.10 no.3
    • /
    • pp.5-8
    • /
    • 2008
  • On highly textured IBAD-MgO templates, we searched for suitable buffer layers among various candidate materials, including $LaMnO_3$ (LMO), $La_2Zr_2O_7$ (LZO), $LaAlO_3$ (LAO), $LaGaO_3$ (LGO), $NdGaO_3$ (NGO), and $BaZrO_3$ (BZO). All buffer layers were deposited on the IBAD-MgO templates by KrF pulsed laser deposition (PLD). The LAO layer showed an amorphous phase. The LZO, LGO, and NGO layers showed polycrystalline growth. Only the LMO and BZO layers formed c-axis oriented, biaxially textured films. An optimally processed LMO buffer layer, deposited at $750^{\circ}C$ and a $PO_2$ of 100 mTorr, exhibited a $\Delta\phi$ value of $\sim -5.2^{\circ}$ and an RMS roughness of 5.6 nm. Interestingly, BZO buffer layers with $\Delta\phi$ values of $\sim -6^{\circ}$ could be routinely produced over a wide range of PLD processing conditions.

Parallel Processing of k-Means Clustering Algorithm for Unsupervised Classification of Large Satellite Images: A Hybrid Method Using Multicores and a PC-Cluster (대용량 위성영상의 무감독 분류를 위한 k-Means Clustering 알고리즘의 병렬처리: 다중코어와 PC-Cluster를 이용한 Hybrid 방식)

  • Han, Soohee;Song, Jeong Heon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.37 no.6
    • /
    • pp.445-452
    • /
    • 2019
  • In this study, parallel processing codes for the k-means clustering algorithm were developed and implemented on a PC-cluster for unsupervised classification of large satellite images. We implemented intra-node code using the multiple cores of a CPU (Central Processing Unit) based on OpenMP (Open Multi-Processing), inter-node code using a PC-cluster based on the Message Passing Interface (MPI), and hybrid code using both. The PC-cluster consists of one master node and eight slave nodes, and each node has eight cores. Two operating systems, Microsoft Windows and Canonical Ubuntu, were installed on the PC-cluster in turn and tested to compare parallel processing performance. Two multispectral satellite images were tested: a medium-capacity LANDSAT 8 OLI (Operational Land Imager) image and a high-capacity Sentinel 2A image. To evaluate parallel processing performance, speedup and efficiency were measured. Overall, the speedup was over N/2 and the efficiency was over 0.5. In the comparison of the two operating systems, the Ubuntu system was two to three times faster. To confirm that the results of sequential and parallel processing coincide with each other, the center value of each band and the number of classified pixels were compared, and the result images were examined by pixel-to-pixel comparison. It was found that care should be taken to avoid false sharing in the OpenMP intra-node implementation. To process large satellite images on a PC-cluster, code and hardware should be designed to reduce the performance degradation caused by file I/O. It was also found that performance can differ depending on the operating system installed on a PC-cluster.
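
    The two metrics quoted above are related by a simple formula, sketched here under the usual definitions (speedup as sequential time divided by parallel time, efficiency as speedup divided by the number of processing units N). The abstract does not state its exact definitions, and the timings in the example are hypothetical, not measurements from the paper.

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """How many times faster the parallel run is than the sequential run."""
    return t_sequential / t_parallel


def efficiency(t_sequential: float, t_parallel: float, n_units: int) -> float:
    """Speedup per processing unit; 1.0 would be ideal linear scaling."""
    return speedup(t_sequential, t_parallel) / n_units


if __name__ == "__main__":
    # Hypothetical timings in seconds for a run on 8 cores.
    t_seq, t_par, n = 120.0, 20.0, 8
    print(speedup(t_seq, t_par), efficiency(t_seq, t_par, n))
```

    With these definitions, a speedup above N/2 is the same statement as an efficiency above 0.5, which matches how the two results are reported together.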