• Title/Summary/Keyword: pipeline method (파이프라인 방법)

Search Result 275, Processing Time 0.025 seconds

A High-speed Packet Filtering System Architecture in Signature-based Network Intrusion Prevention (시그내쳐 기반의 네트워크 침입 방지에서 고속의 패킷 필터링을 위한 시스템 구조)

  • Kim, Dae-Young;Kim, Sun-Il;Lee, Jun-Yong
    • Journal of KIISE:Computer Systems and Theory / v.34 no.2 / pp.73-83 / 2007
  • In network intrusion prevention, attack packets are detected and filtered out based on their attack signatures. Pattern matching is used extensively to find attack signatures and is the most time-consuming part of Network Intrusion Prevention Systems (NIPS). Pattern matching is usually accelerated by hardware and should be performed at wire speed in NIPS. However, that alone is not enough. First, the pattern matching hardware should be able to generate sufficient match information, including the pattern index number and the location of the match, at wire speed. Second, it should support pattern grouping to reduce unnecessary pattern matches. Third, it should have a constant worst-case performance even as the number of patterns increases. Finally, it should be able to update patterns within a few minutes or seconds without stopping its operation. We propose a system architecture that meets these requirements. The architecture processes multiple pattern characters in parallel and employs a pipeline architecture to achieve high speed. Using Xilinx FPGA simulation, we show that the new system scales well to achieve a speed of over 10 Gbps and satisfies all of the above requirements.
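
The first two requirements in the abstract above (reporting the pattern index and match location, and restricting the search to a pattern group) can be illustrated in software. The following is only a naive sketch, not the paper's parallel hardware pipeline: the group names, the signatures, and the function `match_group` are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical signature set: group name -> list of (pattern index, pattern bytes).
PATTERN_GROUPS: Dict[str, List[Tuple[int, bytes]]] = {
    "http": [(0, b"cmd.exe"), (1, b"/etc/passwd")],
    "ftp": [(2, b"SITE EXEC")],
}


def match_group(payload: bytes, group: str) -> List[Tuple[int, int]]:
    """Return (pattern index, byte offset) for every match of the group's patterns.

    Only the patterns of the selected group are scanned, which is the
    software analogue of pattern grouping: patterns that cannot apply
    to this traffic are never compared.
    """
    matches: List[Tuple[int, int]] = []
    for index, pattern in PATTERN_GROUPS.get(group, []):
        start = payload.find(pattern)
        while start != -1:
            matches.append((index, start))
            start = payload.find(pattern, start + 1)
    # Order the results by position in the payload.
    return sorted(matches, key=lambda m: m[1])


payload = b"GET /etc/passwd?x=cmd.exe HTTP/1.1"
http_matches = match_group(payload, "http")
ftp_matches = match_group(payload, "ftp")
```

A sequential scan like this slows down as patterns are added, which is exactly the behavior the third requirement (constant worst-case performance) rules out for the hardware design.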

Efficient Pipeline Architecture of CABAC in H.264/AVC (H.264/AVC의 효율적인 파이프라인 구조를 적용한 CABAC 하드웨어 설계)

  • Choi, Jin-Ha;Oh, Myung-Seok;Kim, Jae-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.7 / pp.61-68 / 2008
  • In this paper, we propose an efficient hardware architecture and algorithm that increase the encoding rate of CABAC (Context Adaptive Binary Arithmetic Coding), one of the entropy coding methods of the latest video compression technique, H.264/AVC (Advanced Video Coding), and we implement it in hardware. CABAC typically provides up to 15% better compression performance than CAVLC. However, the operational complexity of CABAC is significantly higher than that of CAVLC because of complicated data dependencies during the encoding process. Various architectures have therefore been proposed to reduce the amount of computation, but they still suffer from latency caused by these data dependencies. The proposed architecture uses two techniques to implement an efficient pipeline. One is the quick calculation of the 7th and 8th bits used to calculate a probability, which is the first step in binary arithmetic coding. The other is a pipeline shortened by one step when the encoded symbol is an MPS. By adopting these two techniques, the required processing time was reduced by about 27-29% compared with previous architectures. The design is written in a hardware description language, and the total logic gate count is 19K using a 0.18um standard cell library.
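
In the H.264/AVC standard's arithmetic coder, the LPS sub-range is looked up using a two-bit quantized index taken from the 9-bit range register, `(codIRange >> 6) & 3`. Counting bits from 1, these are the 7th and 8th bits, which is presumably what the abstract refers to. The sketch below shows only this index extraction as defined in the standard; it is not the paper's circuit, and the function name `range_quant_index` is an illustrative assumption.

```python
def range_quant_index(cod_i_range: int) -> int:
    """Return the 2-bit range index used to select an LPS sub-range.

    After renormalization the range register holds a 9-bit value in
    [256, 510]; the index is formed by bits 6 and 7 (0-based), i.e.
    the 7th and 8th bits when counting from 1.
    """
    if not 256 <= cod_i_range <= 510:
        raise ValueError("range register must be in [256, 510]")
    return (cod_i_range >> 6) & 0x3


indices = [range_quant_index(r) for r in (256, 320, 384, 510)]
```

Because the index depends on only two bits of the range, a hardware design can obtain it early in the cycle, before the rest of the range update has finished.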

Characteristics and Automatic Detection of Block Reference Patterns (블록 참조 패턴의 특성 분석과 자동 발견)

  • Choe, Jong-Mu;Lee, Dong-Hui;No, Sam-Hyeok;Min, Sang-Ryeol;Jo, Yu-Geun
    • Journal of KIISE:Computer Systems and Theory / v.26 no.9 / pp.1083-1095 / 1999
  • As the speed gap between processors and disks continues to increase, the role of the buffer cache located in main memory is becoming increasingly important. The buffer cache is managed by block replacement policies and prefetching policies, and each policy must decide the value of a block, that is, which block will be accessed in the nearer future. The value of a block is based on the characteristics of the block reference patterns of applications, so accurate characterization of block reference patterns can improve the performance of the buffer cache. In this paper, we study the characteristics of the block reference behavior of applications and propose a scheme that automatically detects block reference patterns. Detection is made by associating the attributes of a block with the forward distance of the block. With periodic detection using a two-stage pipeline technique, the scheme can detect block reference patterns on-line and monitor changes in them. We measured the detection capability of the proposed scheme using 8 real workload traces and found that it accurately detects the block reference patterns of applications. We also applied the detected block reference patterns to the block replacement policy and show that replacement policies appropriate for the detected patterns decrease the number of disk I/Os by up to 57% compared with the traditional LRU policy.
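
The abstract relies on the forward distance of a block. As a minimal sketch, assuming forward distance means the number of references between an access and the next access to the same block, it can be computed from a trace in one backward pass. The function name and the sample trace are illustrative and do not come from the paper.

```python
from typing import Dict, Hashable, List, Optional, Sequence


def forward_distances(trace: Sequence[Hashable]) -> List[Optional[int]]:
    """For each reference, return the distance to the next reference of the
    same block, or None if the block is never referenced again."""
    next_seen: Dict[Hashable, int] = {}
    result: List[Optional[int]] = [None] * len(trace)
    # Walk backwards so the next occurrence of each block is already known.
    for i in range(len(trace) - 1, -1, -1):
        block = trace[i]
        if block in next_seen:
            result[i] = next_seen[block] - i
        next_seen[block] = i
    return result


trace = [1, 2, 1, 3, 2]
distances = forward_distances(trace)
```

Forward distance needs knowledge of future references, so it is only available offline or after the fact; an on-line scheme such as the one described has to estimate it from observable block attributes.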

The Design of Transform and Quantization Hardware for High-Performance HEVC Encoder (고성능 HEVC 부호기를 위한 변환양자화기 하드웨어 설계)

  • Park, Seungyong;Jo, Heungseon;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.2 / pp.327-334 / 2016
  • In this paper, we propose a hardware architecture for transform and quantization in a high-performance HEVC (High Efficiency Video Coding) encoder. The HEVC transform decides the transform mode by comparing the RDCost of the candidate modes to find the best one. However, RDCost is computed from the bit rate and distortion, which in turn require transform, quantization, de-quantization, and inverse transform. Because of the large amount of computation and the resulting encoding time, it is hard to process high-resolution, high-definition images in real time. This paper proposes a method that decides the transform mode by comparing only the sum of the coefficients after the transform. We use BD-PSNR and BD-Bitrate as performance indicators. Based on the experimental results, we confirmed that this transform mode decision can process images with no significant change in image quality. We reduced the hardware area by assigning different values to the same output according to the transform mode and by overlapping the coefficient multiplications as much as possible. We also raised performance by implementing a sequential pipeline operation. Taking into account that we used a larger process than the reference paper, our design reduces the hardware area by half and increases performance by 2.3 times.

Design of Efficient Gradient Orientation Bin and Weight Calculation Circuit for HOG Feature Calculation (HOG 특징 연산에 적용하기 위한 효율적인 기울기 방향 bin 및 가중치 연산 회로 설계)

  • Kim, Soojin;Cho, Kyeongsoon
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.11 / pp.66-72 / 2014
  • The histogram of oriented gradients (HOG) feature is widely used in vision-based pedestrian detection. Interpolation is the most important technique in HOG feature calculation for achieving a high detection rate. The interpolation technique requires, for each pixel, the two orientation bins nearest to the gradient orientation and the corresponding weights. This paper therefore proposes an efficient gradient orientation bin and weight calculation circuit for the HOG feature. In the proposed circuit, pre-calculated values are stored in tables to avoid tangent and division operations, and the size of the tables is minimized by exploiting the characteristics of the tangent function and of the weights for each gradient orientation. A pipeline architecture is adopted in the proposed circuit to accelerate processing, and the orientation bins and corresponding weights for each pixel are calculated in two clock cycles by applying efficient coarse and fine search schemes. Since the proposed circuit calculates the gradient orientation of each pixel with an interval of $1^{\circ}$ and determines both the orientation bins and the weights required by the interpolation technique, it can be used in HOG feature calculation to support interpolation and provide a high detection rate.
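
The per-pixel quantity described above, the two nearest orientation bins and their interpolation weights, can be sketched in floating point. This is only an illustration under assumptions that the abstract does not state: 9 unsigned bins of 20 degrees with centers at 10, 30, ..., 170 degrees, and linear interpolation between neighboring bin centers. Unlike the proposed circuit, it calls `atan2` directly instead of using lookup tables.

```python
import math
from typing import Tuple

NUM_BINS = 9                   # assumption: 9 unsigned orientation bins
BIN_WIDTH = 180.0 / NUM_BINS   # 20 degrees per bin


def bins_and_weights(gx: float, gy: float) -> Tuple[int, float, int, float]:
    """Return (lower bin, lower weight, upper bin, upper weight) for one pixel."""
    # Unsigned gradient orientation in [0, 180) degrees.
    angle = math.degrees(math.atan2(gy, gx)) % 180.0
    # Position expressed in units of bin centers (center of bin k is at k).
    pos = angle / BIN_WIDTH - 0.5
    lower = math.floor(pos)
    w_upper = pos - lower
    w_lower = 1.0 - w_upper
    # Bins wrap around because 0 and 180 degrees are the same orientation.
    return lower % NUM_BINS, w_lower, (lower + 1) % NUM_BINS, w_upper


b0, w0, b1, w1 = bins_and_weights(1.0, 1.0)   # 45-degree gradient
```

For a 45-degree gradient the orientation lies between the bin centers at 30 and 50 degrees, so the vote is split between bins 1 and 2, with the larger share going to the closer center.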

Design and Analysis of a Digit-Serial $AB^{2}$ Systolic Arrays in $GF(2^{m})$ ($GF(2^{m})$ 상에서 새로운 디지트 시리얼 $AB^{2}$ 시스톨릭 어레이 설계 및 분석)

  • Kim Nam-Yeun;Yoo Kee-Young
    • Journal of KIISE:Computer Systems and Theory / v.32 no.4 / pp.160-167 / 2005
  • Among finite field arithmetic operations, division/inversion is known as a basic operation for public-key cryptosystems over $GF(2^{m})$, and it is computed by performing repeated $AB^{2}$ multiplications. This paper presents a digit-serial-in-serial-out systolic architecture for performing the $AB^{2}$ operation in $GF(2^{m})$. To obtain an L×L digit-serial-in-serial-out architecture, a new $AB^{2}$ algorithm is proposed, together with partitioning, index transformation, and cell merging for the architecture derived from the algorithm. Based on the area-time product, when the digit size of the digit-serial architecture, L, is selected to be less than about m, the proposed digit-serial architecture is more efficient than the bit-parallel architecture, and when L is selected to be less than about $(1/5)\log_{2}(m+1)$, the proposed architecture is more efficient than the bit-serial one. In addition, the area-time product complexity of the pipelined digit-serial $AB^{2}$ systolic architecture is approximately $10.9\%$ lower than that of the nonpipelined one, assuming m=160 and L=8. Additionally, the proposed architecture can serve as the basic architecture of a crypto-processor and is well suited to VLSI implementation because of its simplicity, regularity, and pipelinability.
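
To make the role of the $AB^{2}$ operation concrete, the sketch below computes it bit-serially in software and uses it to obtain an inverse through $a^{-1} = a^{2^{m}-2}$. It is a plain reference model, not the proposed systolic array. The small field $GF(2^{4})$ with the irreducible polynomial $x^{4}+x+1$ and all function names are assumptions chosen for illustration.

```python
M = 4            # assumption: small example field GF(2^4)
POLY = 0b10011   # x^4 + x + 1, irreducible over GF(2)


def gf_mul(a: int, b: int, m: int = M, poly: int = POLY) -> int:
    """Multiply two elements of GF(2^m) in polynomial basis."""
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a          # add a * x^i when bit i of b is set
        b >>= 1
        a <<= 1                  # multiply a by x
        if (a >> m) & 1:
            a ^= poly            # reduce modulo the field polynomial
    return result


def ab2(a: int, b: int, m: int = M, poly: int = POLY) -> int:
    """Compute A * B^2 in GF(2^m)."""
    return gf_mul(a, gf_mul(b, b, m, poly), m, poly)


def gf_inverse(a: int, m: int = M, poly: int = POLY) -> int:
    """Invert a nonzero element using only AB^2 operations."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse in GF(2^m)")
    r = 1
    for _ in range(m - 1):
        r = ab2(a, r, m, poly)   # after k steps r = a^(2^k - 1)
    return ab2(1, r, m, poly)    # r^2 = a^(2^m - 2) = a^-1


example = ab2(0b0011, 0b0010)    # (x + 1) * x^2 = x^3 + x^2
```

The inversion loop performs m - 1 dependent $AB^{2}$ steps followed by one squaring, which is why the cost of a single $AB^{2}$ unit dominates division and inversion.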

A Study on the User-Based Small Fishing Boat Collision Alarm Classification Model Using Semi-supervised Learning (준지도 학습을 활용한 사용자 기반 소형 어선 충돌 경보 분류모델에대한 연구)

  • Ho-June Seok;Seung Sim;Jeong-Hun Woo;Jun-Rae Cho;Jaeyong Jung;DeukJae Cho;Jong-Hwa Baek
    • Journal of Navigation and Port Research / v.47 no.6 / pp.358-366 / 2023
  • This study aimed to provide a solution for improving the ship collision alert of the 'accident vulnerable ship monitoring service', one of the 'intelligent marine traffic information system' services of the Ministry of Oceans and Fisheries. The current ship collision alert uses a supervised learning (SL) model with survey labels based on data from large ships and their operators. Consequently, small ship data and the opinions of small ship operators are not reflected in the current supervised collision model, and its effect is insufficient because the alarm is issued at a longer distance than small ship operators feel is appropriate. In addition, the SL method requires a large amount of labeled data, and the labeling process requires considerable resources and time. To overcome these limitations, this paper studied a classification model of collision alerts for small ships that uses unlabeled data with semi-supervised learning (SSL) algorithms (Label Propagation and TabNet). Results of real-time experiments with small ship operators using the collision alert classification model showed that operator satisfaction increased.

LASPI: Hardware friendly LArge-scale stereo matching using Support Point Interpolation (LASPI: 지원점 보간법을 이용한 H/W 구현에 용이한 스테레오 매칭 방법)

  • Park, Sanghyun;Ghimire, Deepak;Kim, Jung-guk;Han, Youngki
    • Journal of KIISE / v.44 no.9 / pp.932-945 / 2017
  • In this paper, a new hardware and software architecture was developed for a stereo vision processing system that includes rectification, disparity estimation, and visualization. The developed method, named LArge-scale stereo matching method using Support Point Interpolation (LASPI), excels at real-time processing for obtaining dense disparity maps from high-quality image regions that contain a high density of support points. In the real-time processing of high definition (HD) images, LASPI does not degrade the quality of disparity maps compared with existing stereo matching methods such as Efficient LArge-scale Stereo matching (ELAS). LASPI has been designed to achieve a high frame rate, accurate distance resolution, and low resource usage even in a resource-limited environment. These characteristics enable LASPI to be deployed in safety-critical applications such as obstacle recognition and distance detection systems for autonomous vehicles. The LASPI algorithm has been implemented on a Field Programmable Gate Array (FPGA) in order to support parallel processing and 4-stage pipelining. Various experiments verified that the developed FPGA system (Xilinx Virtex-7 FPGA, 148.5 MHz clock) is capable of processing 30 HD ($1280 \times 720$ pixels) frames per second in real time while generating disparity maps that are applicable to real vehicles.

Implementation of RTOS Simulator With Execution Time Estimation (실행시간 추정 가능한 RTOS 시뮬레이터의 구현)

  • 김방현;류성준;김종현;남영광;이광용
    • Proceedings of the Korea Society for Simulation Conference / 2002.05a / pp.125-129 / 2002
  • An RTOS simulator, one of the tools provided by real-time operating system (RTOS) development environments, offers a target simulation environment in which applications can be developed and debugged on the host even when the target hardware is not connected to it. This lets developers build applications quickly, and even before hardware development is complete. For this reason, most commercial RTOS development environments currently provide an RTOS simulator. However, most current commercial RTOS simulators implement only the functional aspects of the RTOS on the host, so they cannot estimate the actual time that the RTOS or an RTOS application takes when it runs on the real target. Given that a real-time system must produce its results within a fixed time, this is the most serious shortcoming of an RTOS simulator, and an RTOS simulator that estimates execution time and is also practical to deploy is therefore needed. To solve this problem, this study developed a machine instruction-based RTOS simulator that can estimate the execution time of the RTOS and RTOS applications on the actual target and that is suitable for commercialization. Furthermore, the accuracy of the execution time estimate was improved by also considering the effects of the pipeline and the cache, which are major factors in execution time. The RTOS used in this study is Q+, developed in 2000 by the Electronics and Telecommunications Research Institute (ETRI), and the target hardware on which Q+ runs is the EBSA-285 board, equipped with a StrongARM SA-110 microprocessor of the ARM family and a 21285 main controller.


A Study on the Installation of SCR System for Generator Diesel Engine of Existing Ship (기존 선박의 디젤발전기용 SCR 시스템 설치에 관한 연구)

  • Ryu, Younghyun;Kim, Hongryeol;Cho, Gyubaek;Kim, Hongsuk;Nam, Jeonggil
    • Journal of Advanced Marine Engineering and Technology / v.39 no.4 / pp.412-417 / 2015
  • The IMO MEPC has been progressively strengthening emission standards for marine environment protection. In particular, nitrogen oxide (NOx) emissions of all ocean-going ships built from 2016 will be required to comply with the Tier-III regulation. In this study, a vanadia-based SCR (Selective Catalytic Reduction) system developed for ship application was installed on a diesel engine for power generation on the training ship T/S SAENURI of Mokpo National Maritime University. For this study, the exhaust pipeline of the generator diesel engine was modified to fit the urea SCR system. The study investigated NOx reduction performance under two methods of injecting the urea solution (40%): auto mode through the PLC (Programmable Logic Controller) and manual mode. In manual mode, we were able to find the conditions under which ammonia slip occurs, so the optimal urea injection quantity can be controlled at each engine load condition (25, 35, 50%). An 80% reduction in nitrogen oxide was achieved. Furthermore, we found that the NOx reduction performance was better in the load up-down test (down to 25% from 50%) than in the load down-up test (up to 50% from 25%).