• Title/Abstract/Keywords: network processor

558 results, processing time 0.02 s

A class-based rate limiting method applicable to the network processor

  • 노진택;이진선;최경희;정기현;임강빈
    • 정보처리학회논문지C
    • /
    • Vol. 12C, No. 5
    • /
    • pp.725-732
    • /
    • 2005
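  • This paper proposes a way to implement rate limiting and class-based bandwidth management, techniques previously used in network systems built on general-purpose systems or dedicated hardware, on a network processor for gigabit traffic processing, and then implements and evaluates it. The implementation and experiments were carried out on Intel's IXP1200 network processor. The results show the accuracy of the traffic rate limited to the intended bandwidth and the stabilization time of the bandwidth-limiting algorithm under a varying input rate. They confirm that the class-based rate limiting function, implemented to suit the network processor, works well, with performance close to the $10\%$ error range of the token bucket algorithm on general systems.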
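
The abstract above compares against the token bucket algorithm. As a point of reference only, a minimal sketch of a per-class token bucket limiter follows. It is not the paper's IXP1200 implementation; the names `TokenBucket`, `ClassRateLimiter`, and `admit`, and the choice to pass unclassified traffic, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate_bps: float      # refill rate in bits per second (the class bandwidth limit)
    burst_bits: float    # bucket depth, i.e. the largest burst allowed
    tokens: float = 0.0  # tokens currently available, in bits
    last_ts: float = 0.0 # timestamp of the last refill, in seconds

    def allow(self, size_bits: float, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the bucket depth.
        elapsed = max(0.0, now - self.last_ts)
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        self.last_ts = now
        # Forward the packet only if enough tokens remain.
        if self.tokens >= size_bits:
            self.tokens -= size_bits
            return True
        return False


class ClassRateLimiter:
    """One token bucket per traffic class (hypothetical helper)."""

    def __init__(self, class_limits):
        # class_limits maps class name -> (rate_bps, burst_bits).
        # Each bucket starts full so an initial burst is allowed.
        self.buckets = {
            name: TokenBucket(rate, burst, tokens=burst)
            for name, (rate, burst) in class_limits.items()
        }

    def admit(self, traffic_class, size_bits, now):
        bucket = self.buckets.get(traffic_class)
        if bucket is None:
            return True  # assumption: unclassified traffic is not limited
        return bucket.allow(size_bits, now)
```

With a class limited to 1000 bit/s and a 1000-bit burst, an 800-bit packet at time 0 is admitted, a second 800-bit packet at the same instant is dropped, and another one second later is admitted once the bucket has refilled.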

FLUID MODEL SOLUTION OF FEEDFORWARD NETWORK OF OVERLOADED MULTICLASS PROCESSOR SHARING QUEUES

  • AMAL EZZIDANI;ABDELGHANI BEN TAHAR;MOHAMED HANINI
    • Journal of applied mathematics & informatics
    • /
    • Vol. 42, No. 2
    • /
    • pp.291-303
    • /
    • 2024
  • In this paper, we consider a feedforward network of overloaded multiclass processor sharing queues and we give a fluid model solution under the condition that the system is initially empty. The main theorem of the paper provides sufficient conditions for a fluid model solution to be linear with time. The results are illustrated through examples.

AB9: A neural processor for inference acceleration

  • Cho, Yong Cheol Peter;Chung, Jaehoon;Yang, Jeongmin;Lyuh, Chun-Gi;Kim, HyunMi;Kim, Chan;Ham, Je-seok;Choi, Minseok;Shin, Kyoungseon;Han, Jinho;Kwon, Youngsu
    • ETRI Journal
    • /
    • Vol. 42, No. 4
    • /
    • pp.491-504
    • /
    • 2020
  • We present AB9, a neural processor for inference acceleration. AB9 consists of a systolic tensor core (STC) neural network accelerator designed to accelerate artificial intelligence applications by exploiting the data reuse and parallelism characteristics inherent in neural networks while providing fast access to large on-chip memory. Complementing the hardware is an intuitive and user-friendly development environment that includes a simulator and an implementation flow that provides a high degree of programmability with a short development time. Along with a 40-TFLOP STC that includes 32k arithmetic units and over 36 MB of on-chip SRAM, our baseline implementation of AB9 consists of a 1-GHz quad-core setup with other various industry-standard peripheral intellectual properties. The acceleration performance and power efficiency were evaluated using YOLOv2, and the results show that AB9 has superior performance and power efficiency to that of a general-purpose graphics processing unit implementation. AB9 has been taped out in the TSMC 28-nm process with a chip size of 17 × 23 ㎟. Delivery is expected later this year.

Dynamic Scheduling of Network Processes for Multi-Core Systems

  • 장혜천;진현욱;김학영
    • 한국정보과학회논문지:컴퓨팅의 실제 및 레터
    • /
    • Vol. 15, No. 12
    • /
    • pp.968-972
    • /
    • 2009
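  • Multi-core processors are now used in many high-performance servers, and these servers increasingly demand high network bandwidth utilization. Meeting this demand requires using the cores efficiently to improve network throughput. However, current operating systems treat multi-core systems almost identically to multiprocessor environments, and performance optimizations that consider the unique characteristics of multi-core processors are still lacking. To address this problem, this paper studies how to improve communication performance by making process scheduling decisions that take the characteristics of multi-core processors fully into account. The proposed process scheduling determines the best core for a process, and schedules it there, based on the cache architecture of the multi-core processor, the communication intensity of the process, and the load on each core. The proposed technique was implemented in the Linux kernel, and measurements show that it improves the network throughput of the latest Linux kernel by up to 20% and saves 55% more processor resources.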
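
The abstract above names three inputs to the core selection: cache architecture, the communication intensity of the process, and per-core load. The sketch below shows one way such inputs could be combined. The scoring formula, the `Core` fields, the `pick_core` function, and the idea of favoring cores that share a cache with the core handling network processing are all assumptions for illustration, not the paper's Linux kernel implementation.

```python
from dataclasses import dataclass


@dataclass
class Core:
    core_id: int
    load: float                      # 0.0 (idle) to 1.0 (fully busy)
    shares_cache_with_net_core: bool # assumption: shares a cache with the core doing network processing


def pick_core(cores, comm_intensity, cache_weight=1.0):
    """Return the core with the highest hypothetical score.

    comm_intensity is in [0, 1]: the fraction of the process's work that is
    network communication. A communication-heavy process gains more from
    cache sharing, while every process is penalized by core load.
    """
    def score(core):
        bonus = cache_weight * comm_intensity if core.shares_cache_with_net_core else 0.0
        return bonus - core.load

    return max(cores, key=score)
```

Under this scoring, a communication-heavy process is drawn toward a cache-sharing core even when that core is somewhat busier, while a compute-bound process simply goes to the least loaded core.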

Effect of Hot Spot to Performance of Interconnection Network

  • 김성종;김태형;이영노;신인철
    • 대한전기학회:학술대회논문집
    • /
    • 대한전기학회 1988년도 전기.전자공학 학술대회 논문집
    • /
    • pp.655-658
    • /
    • 1988
  • An interconnection network provides communication among functional modules. The interconnections considered are Generalized Cube networks. Two situations are examined: one in which every memory module is equally likely to be addressed by a processor, and one in which a processor has a favorite memory. Through performance evaluation by simulation, this paper proposes effective operating conditions for the interconnection network.

A hardware implementation of neural network with modified HANNIBAL architecture

  • 이범엽;정덕진
    • 대한전기학회논문지
    • /
    • Vol. 45, No. 3
    • /
    • pp.444-450
    • /
    • 1996
  • This paper describes a digital hardware architecture for an artificial neural network with learning capability. It is a modified version of the hardware architecture known as HANNIBAL (Hardware Architecture for Neural Networks Implementing Back propagation Algorithm Learning). To implement efficient neural network hardware, we analyzed various types of multipliers, the multiplier being the major functional block of a neuro-processor cell. Based on the results, we designed efficient digital neural network hardware using a serial/parallel multiplier and tested its operation. We also analyzed the hardware efficiency with logic-level simulation.

Multicore Flow Processor with Wire-Speed Flow Admission Control

  • Doo, Kyeong-Hwan;Yoon, Bin-Yeong;Lee, Bhum-Cheol;Lee, Soon-Seok;Han, Man Soo;Kim, Whan-Woo
    • ETRI Journal
    • /
    • Vol. 34, No. 6
    • /
    • pp.827-837
    • /
    • 2012
  • We propose a flow admission control (FAC) for setting up a wire-speed connection for new flows based on their negotiated bandwidth. It also terminates a flow that does not have a packet transmitted within a certain period determined by the users. The FAC can be used to provide a reliable transmission of user datagram and transmission control protocol applications. If the period of flows can be set to a short time period, we can monitor active flows that carry a packet over networks during the flow period. Such powerful flow management can also be applied to security systems to detect a denial-of-service attack. We implement a network processor called a flow management network processor (FMNP), which is the second generation of the device that supports FAC. It has forty reduced instruction set computer core processors optimized for packet processing. It is fabricated in 65-nm CMOS technology and has a 40-Gbps process performance. We prove that a flow router equipped with an FMNP is better than legacy systems in terms of throughput and packet loss.
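
The two behaviors described in the abstract above, admitting new flows based on negotiated bandwidth and terminating flows that send no packet within a user-set period, can be illustrated in software. This is a sketch under stated assumptions, not the FMNP hardware design: the capacity check against a fixed link rate, the class name `FlowAdmissionControl`, and its methods are all hypothetical.

```python
class FlowAdmissionControl:
    """Software sketch of admission by negotiated bandwidth plus idle timeout."""

    def __init__(self, capacity_bps, idle_timeout_s):
        self.capacity_bps = capacity_bps      # assumption: a fixed link capacity
        self.idle_timeout_s = idle_timeout_s  # user-chosen flow period
        self.flows = {}                       # flow_id -> (negotiated_bps, last_packet_ts)

    def reserved_bps(self):
        # Total bandwidth currently reserved by admitted flows.
        return sum(bw for bw, _ in self.flows.values())

    def expire_idle(self, now):
        # Terminate flows with no packet inside the idle period.
        idle = [fid for fid, (_, ts) in self.flows.items()
                if now - ts > self.idle_timeout_s]
        for fid in idle:
            del self.flows[fid]
        return idle

    def on_packet(self, flow_id, negotiated_bps, now):
        """Return True if the packet's flow is (or becomes) admitted."""
        self.expire_idle(now)
        if flow_id in self.flows:
            bw, _ = self.flows[flow_id]
            self.flows[flow_id] = (bw, now)  # refresh the activity timestamp
            return True
        if self.reserved_bps() + negotiated_bps <= self.capacity_bps:
            self.flows[flow_id] = (negotiated_bps, now)
            return True
        return False  # not enough unreserved bandwidth for a new flow
```

Because idle flows are expired before each admission decision, the flow table at any moment lists only flows that were active within the last period, which is the property the abstract suggests using for monitoring active flows.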

Design of a Single-chip Multiprocessor with On-chip Learning for Large Scale Neural Network Simulation

  • 김종문;송윤선;김명원
    • 전자공학회논문지B
    • /
    • Vol. 33B, No. 2
    • /
    • pp.149-158
    • /
    • 1996
  • In this paper we describe the design and implementation of a digital neural chip and a parallel neural machine for simulating large scale neural networks. The chip is a single-chip multiprocessor which has four digital neural processors (DNP-II) of the same architecture. Each DNP-II has program memory and data memory, and the chip operates as a MIMD (multi-instruction, multi-data) parallel processor. The DNP-II has an instruction set tailored to neural computation, which can be used to effectively simulate various neural network models including on-chip learning. The DNP-II facilitates four-way data-driven communication supporting the extensibility of parallel systems. The parallel neural machine consists of a host computer, processor boards, a buffer board and an interface board. Each processor board consists of 8*8 array of DNP-II(equivalently 2*2 neural chips). Each processor board can be configured in topologies including linear array, 2-D mesh and 2-D torus. This flexibility supports efficient mapping of neural network models onto the parallel structure. The neural system achieves a maximum performance of 40 GCPS (giga connections per second) with 16 processor boards.

40-TFLOPS artificial intelligence processor with function-safe programmable many-cores for ISO26262 ASIL-D

  • Han, Jinho;Choi, Minseok;Kwon, Youngsu
    • ETRI Journal
    • /
    • Vol. 42, No. 4
    • /
    • pp.468-479
    • /
    • 2020
  • The proposed AI processor architecture has high throughput for accelerating the neural network and reduces the external memory bandwidth required for processing the neural network. For achieving high throughput, the proposed super thread core (STC) includes 128 × 128 nano cores operating at the clock frequency of 1.2 GHz. The function-safe architecture is proposed for a fault-tolerance system such as an electronics system for autonomous cars. The general-purpose processor (GPP) core is integrated with STC for controlling the STC and processing the AI algorithm. It has a self-recovering cache and dynamic lockstep function. The function-safe design has proved the fault performance has ASIL D of ISO26262 standard fault tolerance levels. Therefore, the entire AI processor is fabricated via the 28-nm CMOS process as a prototype chip. Its peak computing performance is 40 TFLOPS at 1.2 GHz with the supply voltage of 1.1 V. The measured energy efficiency is 1.3 TOPS/W. A GPP for control with a function-safe design can have ISO26262 ASIL-D with the single-point fault-tolerance rate of 99.64%.
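
The figures in the abstract above (128 × 128 nano cores, 1.2 GHz, 40 TFLOPS peak) can be cross-checked with simple arithmetic. The sketch assumes each nano core completes one multiply-accumulate per cycle, counted as 2 floating-point operations; the abstract does not state this, and the function name `peak_tflops` is hypothetical.

```python
def peak_tflops(rows, cols, clock_hz, flops_per_core_per_cycle=2):
    """Peak throughput in TFLOPS for a rows x cols array of cores.

    flops_per_core_per_cycle=2 assumes one multiply-accumulate per core
    per cycle, which is an assumption and not stated in the abstract.
    """
    return rows * cols * flops_per_core_per_cycle * clock_hz / 1e12


# 128 x 128 nano cores at 1.2 GHz
peak = peak_tflops(128, 128, 1.2e9)
print(f"{peak:.2f} TFLOPS")
```

Under that assumption the result is about 39.3 TFLOPS, which is consistent with the 40 TFLOPS peak quoted in the abstract.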

JOB Scheduling for Process Control in Hierarchical Computer Network

  • 박일
    • 한국통신학회논문지
    • /
    • Vol. 5, No. 1
    • /
    • pp.83-87
    • /
    • 1980
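  • In process control over a hierarchical computer network, processing is distributed to maximize fault tolerance, and the jobs of a distributed-control processor, which periodically monitors and controls the interrelations among complex and diverse variables, can be defined by their period and execution time. All jobs were organized into subsets related by a tree structure, and a job scheduling algorithm was derived for them. Compared with an FCFS (First Come/First Service) schedule, the resulting schedule reduced the loose time in processor utilization and was advantageous for securing available processing time.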
