• Title/Summary/Keyword: Network Processor[1]

Search Result 147, Processing Time 0.024 seconds

Low-Power Multiplication Processing Element Hardware to Support Parallel Convolutional Neural Network Processing (합성곱 신경망 병렬 연산처리를 지원하는 저전력 곱셈 프로세싱 엘리먼트 설계)

  • Eunpyoung Park;Jongsu Park
    • Journal of Platform Technology
    • /
    • v.12 no.2
    • /
    • pp.58-63
    • /
    • 2024
  • CNNs tend to take a long time to learn and consume a lot of power due to lack of system resources with many data processing units when there are repetitive handles that do not have high performance in the image field. In this paper, we propose a handling method based on a low-power bus that can increase the exchange rate of multipliers and multiplicands within the convolution mixer, which is a tendency activity that occurs when a convolution mixer has multiplication, which is the core element of combination. Convolutional neural networks have proprietary low-power shared processor support and the design was implemented on an Intel DE1-SoC FPGA board using Verilog-HDL. The experiments validated the performance by comparing it with the exchange rate of the multiplier originally proposed by Shen on MNIST's numeric image database.

  • PDF

Study on Korean Variable Message Format Construction for Battlefield Visualization (전장가시화를 위한 한국형 지상전술데이터링크 구축 연구)

  • Kim, Seung-Chun;Lee, Hyung-Keun
    • Journal of IKEEE
    • /
    • v.15 no.1
    • /
    • pp.104-112
    • /
    • 2011
  • During the ground operation of Korean army, the voice message is mainly used for exchanging informations related to the surveillance and reconnaissance, command and control, and precision strike. However, in order to the battlefield visualization among fighting powers participating in the ground force operation, automatic situational awareness and variable message format (VMF) for command and control are required. For securing core technologies necessary for the battlefield visualization, message standard and message handler are established through several applied researches. Besides, the VMF for equipping a weapon system is in development. In this paper, a study on the Korean variable message format (KVMF), where interoperability of integrated battle management system (BMS) is guaranteed due to performing joint, ground, and combined operations so that the situation awareness and strike system can be automated in almost real time, is presented. From the modeling and simulation (M&S) results of the message processor, delay time is varied in accordance with the number of nodes in unit platoon network, message length, and generation interval of routine messages. Therefore, it is shown that the system performance can be optimized by establishing proper network protocol for each situation.

An efficient interconnection network topology in dual-link CC-NUMA systems (이중 연결 구조 CC-NUMA 시스템의 효율적인 상호 연결망 구성 기법)

  • Suh, Hyo-Joong
    • The KIPS Transactions:PartA
    • /
    • v.11A no.1
    • /
    • pp.49-56
    • /
    • 2004
  • The performance of the multiprocessor systems is limited by the several factors. The system performance is affected by the processor speed, memory delay, and interconnection network bandwidth/latency. By the evolution of semiconductor technology, off the shelf microprocessor speed breaks beyond GHz, and the processors can be scalable up to multiprocessor system by connecting through the interconnection networks. In this situation, the system performances are bound by the latencies and the bandwidth of the interconnection networks. SCI, Myrinet, and Gigabit Ethernet are widely adopted as a high-speed interconnection network links for the high performance cluster systems. Performance improvement of the interconnection network can be achieved by the bandwidth extension and the latency minimization. Speed up of the operation clock speed is a simple way to accomplish the bandwidth and latency betterment, while its physical distance makes the difficulties to attain the high frequency clock. Hence the system performance and scalability suffered from the interconnection network limitation. Duplicating the link of the interconnection network is one of the solutions to resolve the bottleneck of the scalable systems. Dual-ring SCI link structure is an example of the interconnection network improvement. In this paper, I propose a network topology and a transaction path algorism, which optimize the latency and the efficiency under the duplicated links. By the simulation results, the proposed structure shows 1.05 to 1.11 times better latency, and exhibits 1.42 to 2.1 times faster execution compared to the dual ring systems.

Optimal Interface Design between Short-Range Air Defense Missile System and Dissimilar Combat Systems (단거리 대공방어유도탄체계와 이기종 함정 전투체계간 최적의 연동 설계 기법)

  • Park, Hyeon-Woo
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.18 no.3
    • /
    • pp.260-266
    • /
    • 2015
  • The warship is run based on the combat system which shares tactical information collected by target detection systems and navigation devices across a network, and conducts the command and control of weapons from target detection to kill assessment. The short-range air defense missile system defends a warship from anti-ship missiles, aircraft, helicopter and other threats in order to contribute to the survival of a warship and the success of missions. The short-range air defense missile system is applied to a various combat systems. In this paper, we have proposed the interface design between the short-range air defense missile and dissimilar combat systems. To employ the short-range air defense missile at dissimilar combat systems, each system is driven by independent processor, and the tasks which are performed by each system are assigned. The information created by them is exchanged through the interface, and the flow of messages is designed.

Direction-Embedded Branch Prediction based on the Analysis of Neural Network (신경망의 분석을 통한 방향 정보를 내포하는 분기 예측 기법)

  • Kwak Jong Wook;Kim Ju-Hwan;Jhon Chu Shik
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.42 no.1
    • /
    • pp.9-26
    • /
    • 2005
  • In the pursuit of ever higher levels of performance, recent computer systems have made use of deep pipeline, dynamic scheduling and multi-issue superscalar processor technologies. In this situations, branch prediction schemes are an essential part of modem microarchitectures because the penalty for a branch misprediction increases as pipelines deepen and the number of instructions issued per cycle increases. In this paper, we propose a novel branch prediction scheme, direction-gshare(d-gshare), to improve the prediction accuracy. At first, we model a neural network with the components that possibly affect the branch prediction accuracy, and analyze the variation of their weights based on the neural network information. Then, we newly add the component that has a high weight value to an original gshare scheme. We simulate our branch prediction scheme using Simple Scalar, a powerful event-driven simulator, and analyze the simulation results. Our results show that, compared to bimodal, two-level adaptive and gshare predictor, direction-gshare predictor(d-gshare. 3) outperforms, without additional hardware costs, by up to 4.1% and 1.5% in average for the default mont of embedded direction, and 11.8% in maximum and 3.7% in average for the optimal one.

The Design and Implementation of User Authorization Module based on Zigbee for Automotive Smart-key System (차량용 스마트키 시스템을 위한 지그비 기반의 사용자 인증 모듈 설계 및 구현)

  • Kim, Kyeong-Seob;Lee, Yun-Seob;Yun, Hyun-Min;Choi, Sang-Bang
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.14 no.11
    • /
    • pp.2442-2450
    • /
    • 2010
  • Using sensor devices applied to various objects will be needed wireless network that it is easy to install in them. Tiny devices configured to processor that bas comparatively low computing ability are inappropriate to use devices that are wireless LAN, etc. In result, network devices needed to not only have simple communication protocol, but have Plug and Play function that it works as soon as it connects without installing any device driver. it also will industrially have both low power and low cost because of mobility of it. From IEEE 802.11 standard, WPAN(Wireless Personal Area Network) included in LAN is being developed by WPAN WG(Working Group) on area with low power consumption and low complexity. In addition to, it is standardizing MAC and PRY of the standard that is expected to wirelessly communicate within 10m. WPAN will be used generally in the more near future because of both low power and low cost of Zigbee. In this paper we designed zigbee based user authentication module for a automotive smart-key system.

8.1 Gbps High-Throughput and Multi-Mode QC-LDPC Decoder based on Fully Parallel Structure (전 병렬구조 기반 8.1 Gbps 고속 및 다중 모드 QC-LDPC 복호기)

  • Jung, Yongmin;Jung, Yunho;Lee, Seongjoo;Kim, Jaeseok
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.11
    • /
    • pp.78-89
    • /
    • 2013
  • This paper proposes a high-throughput and multi-mode quasi-cyclic (QC) low-density parity-check (LDPC) decoder based on a fully parallel structure. The proposed QC-LDPC decoder employs the fully parallel structure to provide very high throughput. The high interconnection complexity, which is the general problem in the fully parallel structure, is solved by using a broadcasting-based sum-product algorithm and proposing a low-complexity cyclic shift network. The high complexity problem, which is caused by using a large amount of check node processors and variable node processors, is solved by proposing a combined check and variable node processor (CCVP). The proposed QC-LDPC decoder can support the multi-mode decoding by proposing a routing-based interconnection network, the flexible CCVP and the flexible cyclic shift network. The proposed QC-LDPC decoder is operated at 100 MHz clock frequency. The proposed QC-LDPC decoder supports multi-mode decoding and provides 8.1 Gbps throughput for a (1944, 1620) QC-LDPC code.

A Study on The RFID/WSN Integrated system for Ubiquitous Computing Environment (유비쿼터스 컴퓨팅 환경을 위한 RFID/WSN 통합 관리 시스템에 관한 연구)

  • Park, Yong-Min;Lee, Jun-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.49 no.1
    • /
    • pp.31-46
    • /
    • 2012
  • The most critical technology to implement ubiquitous health care is Ubiquitous Sensor Network (USN) technology which makes use of various sensor technologies, processor integration technology, and wireless network technology-Radio Frequency Identification (RFID) and Wireless Sensor Network (WSN)-to easily gather and monitor actual physical environment information from a remote site. With the feature, the USN technology can make the information technology of the existing virtual space expanded to actual environments. However, although the RFID and the WSN have technical similarities and mutual effects, they have been recognized to be studied separately, and sufficient studies have not been conducted on the technical integration of the RFID and the WSN. Therefore, EPCglobal which realized the issue proposed the EPC Sensor Network to efficiently integrate and interoperate the RFID and WSN technologies based on the international standard EPCglobal network. The proposed EPC Sensor Network technology uses the Complex Event Processing method in the middleware to integrate data occurring through the RFID and the WSN in a single environment and to interoperate the events based on the EPCglobal network. However, as the EPC Sensor Network technology continuously performs its operation even in the case that the minimum conditions are not to be met to find complex events in the middleware, its operation cost rises. Moreover, since the technology is based on the EPCglobal network, it can neither perform its operation only for the sake of sensor data, nor connect or interoperate with each information system in which the most important information in the ubiquitous computing environment is saved. Therefore, to address the problems of the existing system, we proposed the design and implementation of USN integration management system. For this, we first proposed an integration system that manages RFID and WSN data based on Session Initiation Protocol (SIP). Secondly, we defined the minimum conditions of the complex events to detect unnecessary complex events in the middleware, and proposed an algorithm that can extract complex events only when the minimum conditions are to be met. To evaluate the performance of the proposed methods we implemented SIP-based integration management system.

A Study on Efficient Executions of MPI Parallel Programs in Memory-Centric Computer Architecture

  • Lee, Je-Man;Lee, Seung-Chul;Shin, Dongha
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.1
    • /
    • pp.1-11
    • /
    • 2020
  • In this paper, we present a technique that executes MPI parallel programs, that are developed on processor-centric computer architecture, more efficiently on memory-centric computer architecture without program modification. The technique we present here improves performance by replacing low-speed data communication over the network of MPI library functions with high-speed data communication using the property called fast large shared memory of memory-centric computer architecture. The technique we present in the paper is implemented in two programs. The first program is a modified MPI library called MC-MPI-LIB that runs MPI parallel programs more efficiently on memory-centric computer architecture preserving the semantics of MPI library functions. The second program is a simulation program called MC-MPI-SIM that simulates the performance of memory-centric computer architecture on processor-centric computer architecture. We developed and tested the programs on distributed systems environment deployed on Docker based virtualization. We analyzed the performance of several MPI parallel programs and showed that we achieved better performance on memory-centric computer architecture. Especially we could see very high performance on the MPI parallel programs with high communication overhead.

A Study on Lightweight CNN-based Interpolation Method for Satellite Images (위성 영상을 위한 경량화된 CNN 기반의 보간 기술 연구)

  • Kim, Hyun-ho;Seo, Doochun;Jung, JaeHeon;Kim, Yongwoo
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.2
    • /
    • pp.167-177
    • /
    • 2022
  • In order to obtain satellite image products using the image transmitted to the ground station after capturing the satellite images, many image pre/post-processing steps are involved. During the pre/post-processing, when converting from level 1R images to level 1G images, geometric correction is essential. An interpolation method necessary for geometric correction is inevitably used, and the quality of the level 1G images is determined according to the accuracy of the interpolation method. Also, it is crucial to speed up the interpolation algorithm by the level processor. In this paper, we proposed a lightweight CNN-based interpolation method required for geometric correction when converting from level 1R to level 1G. The proposed method doubles the resolution of satellite images and constructs a deep learning network with a lightweight deep convolutional neural network for fast processing speed. In addition, a feature map fusion method capable of improving the image quality of multispectral (MS) bands using panchromatic (PAN) band information was proposed. The images obtained through the proposed interpolation method improved by about 0.4 dB for the PAN image and about 4.9 dB for the MS image in the quantitative peak signal-to-noise ratio (PSNR) index compared to the existing deep learning-based interpolation methods. In addition, it was confirmed that the time required to acquire an image that is twice the resolution of the 36,500×36,500 input image based on the PAN image size is improved by about 1.6 times compared to the existing deep learning-based interpolation method.