• Title/Summary/Keyword: Parallel Implementation

Low-Latency Programmable Look-Up Table Routing Engine for Parallel Computers (병렬 컴퓨터를 위한 저지연 프로그램형 조견표 경로지정 엔진)

  • Chang, Nae-Hyuck
    • Journal of KIISE: Computing Practices and Letters / v.6 no.2 / pp.244-253 / 2000
  • Since no single routing-switching combination performs best across different types of applications, a flexible network is required to support a range of policies. This paper introduces an implementation of a look-up table routing engine that offers flexible routing and switching policies without the performance degradation seen in microprocessor-based engines. Depending on the contents of its look-up tables, the engine can implement wormhole routing, virtual cut-through routing, and packet switching, as well as hybrid switching, under a variety of routing algorithms. Because the routing engine has a pipelined look-up table architecture, the routing delay is as small as one flit, and it can overlap multiple routing actions without performance degradation in comparison with hardwired routers dedicated to a specific policy. Because the four pipeline stages do not induce a hazard, expensive forwarding logic is not required. The routing engine can accommodate four physical links with a time-shared cut-through bus, or a single link with a crossbar switch. It is implemented using a Xilinx 4000-series FPGA.

The Design and Implementation of OSF/1 AD3 Based-Microkernel Initialization for SPAX (SPAX를 위한 OSF/1 AD3 기반의 마이크로 커널 초기화 설계 및 구현)

  • Kim, Jeong-Nyeo;Cho, Il-Yeon;Lee, Jae-Kyung;Kim, Hae-Jin
    • The Transactions of the Korea Information Processing Society / v.5 no.5 / pp.1333-1344 / 1998
  • Compared with a traditional monolithic kernel, a microkernel-based operating system is slower. However, a microkernel-based OS suits multi-computer systems because of its modularity and portability. Each processor unit and its memory must be initialized using the boot information so that the multi-computer OS can run the system's functions. This paper describes the microkernel initialization of OSF/1 AD3 MISIX, which is based on OSF/1 AD3, for SPAX. It introduces the initialization of the microkernel for SPAX, a high-speed parallel processing system, in terms of boot, initialization-related hardware, and memory address space construction. The paper also reports test results obtained in the test environment. The microkernel was tested on a single-node system with four processors.

Implementation of Adaptive MCS in The IEEE 802.11ac/ad Wireless LAN (IEEE 802.11ac/ad 무선 LAN의 적응형 MCS 구현 연구)

  • Lee, Ha-cheol
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.8 / pp.1613-1621 / 2015
  • This paper analyzes rate adaptation schemes and suggests an applicable MCS (Modulation and Coding Scheme) strategy for improving DCF throughput in IEEE 802.11ac and 802.11ad wireless LANs. IEEE 802.11ac and 802.11ad wireless LANs provide an MCS technique that dynamically adjusts the modulation level and code rate to time-varying channel conditions in order to obtain considerably high data rates. However, these standards do not specify a rate adaptation algorithm, so this paper surveys rate adaptation algorithms and suggests an MCS scheme applicable to IEEE 802.11ac and 802.11ad wireless LANs. In particular, the MAC (Medium Access Control) layer throughput is evaluated over an error-prone channel in an IEEE 802.11ac-based wireless LAN. This evaluation uses the DCF (Distributed Coordination Function) protocol and the A-MPDU (Aggregate MAC Protocol Data Unit) scheme. Using a theoretical analysis method, the MAC saturation throughput is evaluated against the PER (Packet Error Rate), with the number of stations, the transmission probability, the number of parallel beams, and the number of frames in each A-MPDU as variables.

Efficient Task Distribution Method for Load Balancing on Clusters of Heterogeneous Workstations (이기종 워크스테이션 클러스터 상에서 부하 균형을 위한 효과적 작업 분배 방법)

  • 지병준;이광모
    • Journal of Internet Computing and Services / v.2 no.3 / pp.81-92 / 2001
  • A clustering environment of heterogeneous workstations offers cost effectiveness and usability for executing applications in parallel. Load balancing is considered a necessary feature for clusters of heterogeneous workstations in order to minimize turnaround time. Since each workstation may have different users, groups, requests for different tasks, and different processing power, the capability of each processing unit is relative to that of the other units in the cluster. Previous work takes either a static approach, which assigns a predetermined weight to the processing capability of each workstation, or a dynamic approach, which executes a benchmark program to obtain the relative processing capability of each workstation. Executing the benchmark program, which has nothing to do with the application being run, consumes computation time and delays the overall turnaround time. In this paper, we present an efficient task distribution method and the implementation of a load balancing system for a clustering environment with heterogeneous workstations. The turnaround time of the presented methods is compared with that of a method without load balancing and with that of a load balancing method that uses a performance evaluation program. The experimental results show that our methods outperform all the other methods compared.
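
The static approach mentioned in this abstract, a predetermined weight per workstation, can be illustrated with a minimal sketch. This is not the method proposed in the paper, whose details the abstract does not give. The function name `distribute_tasks` and the integer weights are assumptions made for illustration; tasks are split in proportion to the weights, and leftovers go to the workstations with the largest remainders.

```python
def distribute_tasks(num_tasks, weights):
    """Split num_tasks across workstations in proportion to integer weights.

    Hypothetical helper: weights stand in for the predetermined relative
    processing capability of each workstation.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Integer share each workstation is guaranteed.
    shares = [num_tasks * w // total for w in weights]
    # Remainders decide who receives the tasks left over after flooring.
    remainders = [num_tasks * w % total for w in weights]
    leftover = num_tasks - sum(shares)
    order = sorted(range(len(weights)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


if __name__ == "__main__":
    # Three workstations with relative capabilities 1 : 2 : 3.
    print(distribute_tasks(12, [1, 2, 3]))
```

A dynamic scheme would replace the fixed weights with measured values, which is where the benchmark overhead criticized in the abstract comes from.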

The Application and Integration of an Improvement Technique for Layers of NETCONF (NETCONF 계층에 대한 개선 기법 적용 및 통합)

  • Lee, YangMin;Lee, JaeKee
    • Journal of KIISE / v.43 no.2 / pp.256-268 / 2016
  • Modern networks consisting of various heterogeneous equipment are often installed in a distributed manner. The NETCONF standard was therefore established to manage networks centrally and efficiently. In this paper, we present a method that integrates each NETCONF layer into a single system based on the results of previous studies. In the RPC Layer, an asynchronous communication channel and parallel processing are made possible by multi-threading. In the Operation Layer, operational efficiency is increased by using a data group with dependencies between the equipment configuration data and by improving the data structure, enabling efficient processing of XML queries even with multiple managers. The data modeling techniques and grouping methods in the Content Layer are presented in detail for interoperability between the Operation Layer and the Content Layer. Finally, a GUI program was implemented and its implementation is reported. We performed an experiment comparing the improved NETCONF with the standard NETCONF on factors such as query processing ratio, query processing speed, and CPU utilization. The improved NETCONF demonstrated a better query processing ratio and query processing speed, whereas the standard NETCONF had better CPU utilization.

Implementation of Acoustic Properties Measurement System Based on LabVIEW Using PXI for Marine Sediment (PXI를 이용한 LabVIEW기반 해양퇴적물의 음향특성 측정시스템 개발)

  • Park, Ki-Ju;Kim, Dae-Choul;Lee, Gwang-Soo;Bae, Sung Ho;Kim, Gil Young
    • Journal of the Korean Society for Marine Environment & Energy / v.18 no.3 / pp.216-222 / 2015
  • A previous velocity measurement system for marine sediment had several problems, such as errors in picking the first arrival time and an inconvenient measurement procedure. To resolve these problems, we developed a new acoustic properties measurement system using a PXI (PCI eXtensions for Instrumentation) module based on LabVIEW. To verify the new system, we measured the velocity and attenuation of sediment with the new system in parallel with the previous system under the same experimental environment. The measurements showed a 1~2% margin of error for the velocity as well as similar attenuation values. We concluded that the new system can efficiently measure the acoustic properties of marine sediment. It also has the advantage of making it possible to build a database of acoustic data and raw signals.

Low System Complexity Parallel Multiplier for a Class of Finite Fields based on AOP (시스템 복잡도 개선을 위한 AOP 기반의 병렬 유한체 승산기)

  • 변기영;나기수;윤병희;최영희;한성일;김흥수
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.3A / pp.331-336 / 2004
  • This study focuses on the hardware implementation of a fast, low-system-complexity multiplier over GF($2^m$). From the properties of an irreducible AOP of degree $m$, the modular reduction in GF($2^m$) multiplication can be simplified using a cyclic shift operation. The GF($2^m$) multiplication can then be realized using an array structure of AND and XOR gates. The proposed multiplier is composed of $m(m+1)$ 2-input AND gates and $(m+1)^2$ 2-input XOR gates, and the minimum critical path delay is $T_A + \lceil \log_2 m \rceil T_X$. The proposed multiplier has low circuit complexity and delay time, and the interconnections of the circuit are regular and well suited to VLSI realization.
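
The cyclic-shift reduction mentioned in this abstract can be sketched in software. The sketch below is a bit-level illustration of the general AOP technique, not the paper's gate-level circuit, and the function name `aop_multiply` is an assumption. It relies on the fact that an AOP $p(x) = x^m + \dots + x + 1$ divides $x^{m+1} + 1$, so multiplying by $x^i$ modulo $x^{m+1} + 1$ is a cyclic rotation of an $(m+1)$-bit word. Elements are integers whose bit $j$ is the coefficient of $x^j$.

```python
def aop_multiply(a: int, b: int, m: int) -> int:
    """Multiply a and b in GF(2^m) defined by the AOP x^m + ... + x + 1.

    Assumes the AOP of degree m is irreducible (e.g. m = 2, 4, 10) and
    that a and b are below 2**m.
    """
    n = m + 1                      # work modulo x^(m+1) + 1
    mask = (1 << n) - 1            # n-bit word of all ones
    acc = 0
    for i in range(m):
        if (b >> i) & 1:
            # a * x^i modulo x^n + 1 is a cyclic left rotation by i bits.
            rotated = ((a << i) | (a >> (n - i))) & mask
            acc ^= rotated         # addition in GF(2) is XOR
    # Fold the x^m term back: x^m = x^(m-1) + ... + x + 1 modulo the AOP.
    # XOR with all ones clears bit m and flips the lower m bits.
    if (acc >> m) & 1:
        acc ^= mask
    return acc


if __name__ == "__main__":
    # GF(4) with p(x) = x^2 + x + 1: x * x = x + 1.
    print(bin(aop_multiply(0b10, 0b10, 2)))
```

The loop over the set bits of `b` corresponds to AND gating each rotated copy of `a`, and the accumulation corresponds to the XOR network.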

On Designing 4-way Superscalar Digital Signal Processor Core (4-way 수퍼 스칼라 디지털 시그널 프로세서 코어 설계)

  • 김준석;유선국;박성욱;정남훈;고우석;이근섭;윤대희
    • The Journal of Korean Institute of Communications and Information Sciences / v.23 no.6 / pp.1409-1418 / 1998
  • Recent audio CODEC (coding/decoding) algorithms combine several coding techniques and can be divided into DSP tasks, controller tasks, and mixed tasks. Traditional DSP processors have been designed for fast processing of DSP tasks only, not for controller and mixed tasks. This paper presents a new architecture that achieves high throughput on both the controller and mixed tasks of such algorithms while maintaining high performance for DSP tasks. The proposed processor, YSP-3, operates four functional units (a multiplier, two ALUs, and a load/store unit) in parallel via a 4-issue superscalar instruction structure. The performance of YSP-3 has been evaluated by implementing several DSP algorithms and part of the AC-3 decoding algorithm.

Design and Implementation of Acoustic Echo Canceller (Acoustic Echo Canceller 설계 및 구현)

  • 장수안;문대철
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.2C / pp.291-297 / 2004
  • In this paper, a new structure for the AEC (Acoustic Echo Canceller) is proposed in which echo signal components that can arise in mobile communications are effectively eliminated. Block Data Flow Architecture is a parallel architecture that achieves high performance, high efficiency, high throughput, and almost linear speedup. The proposed architecture employs the AEC and is implemented on the TMS320C6711 for real-time applications. The proposed AEC shows improved performance by eliminating echoes on a 55 ms delay path. Since the proposed AEC can also be implemented in firmware, it is expected to work effectively on various types of echoes if applied to CDMA mobile devices. The TMS320C6711 shows much better performance than previous DSPs. For experimental verification, a filtering operation using an adaptive algorithm is performed on a TMS320C6711 board, the error signals resulting from the computation are monitored on a PC, and the performance of the implemented AEC is then verified through ERLE computation. According to the simulation results, a good characteristic of 100 dB is shown after 500 samples.
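
The ERLE (Echo Return Loss Enhancement) figure used for verification in this abstract is commonly defined as the ratio, in decibels, of the echo signal power to the residual error power after cancellation. A minimal sketch of that computation follows. The function name `erle_db` and the sample-list inputs are assumptions for illustration; the abstract does not describe the authors' measurement code or their adaptive algorithm.

```python
import math


def erle_db(echo, residual):
    """Return ERLE in dB: 10 * log10(mean(echo^2) / mean(residual^2)).

    echo     -- samples of the echo signal before cancellation
    residual -- samples of the error signal after cancellation
    """
    echo_power = sum(x * x for x in echo) / len(echo)
    residual_power = sum(x * x for x in residual) / len(residual)
    if residual_power == 0:
        raise ValueError("residual power is zero; ERLE is unbounded")
    return 10.0 * math.log10(echo_power / residual_power)


if __name__ == "__main__":
    # A residual at one tenth of the echo amplitude gives about 20 dB.
    print(erle_db([1.0, -1.0, 1.0, -1.0], [0.1, -0.1, 0.1, -0.1]))
```

In practice ERLE is usually tracked over a sliding window so that the convergence of the adaptive filter can be observed over time.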

A Design and Implementation of OTU4 Framer for 100G Ethernet (100G 이더넷 수용을 위한 OTU4 프레이머 표준기술 설계 및 구현)

  • Youn, Ji-Wook;Kim, Jong-Ho;Shin, Jong-Yoon;Kim, Kwang-Joon
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.12B / pp.1601-1610 / 2011
  • This paper discusses standardization activities, requirements, and enabling technologies for 100G Ethernet and 100G OTN. The need for 100 Gbps transport capacity has been gaining interest from service providers and carrier vendors. Moreover, optical transport networks based on OTN/DWDM are evolving to accommodate Ethernet traffic, which is increasing dramatically. We realize and experimentally demonstrate an OTU4 framer with a commercial FPGA. The key features of the realized OTU4 framer are a parallel signal processing function, a multi-lane distribution function, a GMP function, and an FEC function. The realized OTU4 framer has a large signal processing capacity of 120 Gbps, which allows it to transport about 120 Gbps of client signals such as 12×10G Ethernet and 3×40G Ethernet. By using a commercial FPGA instead of an ASIC, the realized OTU4 framer can be adapted quickly to changing markets and new technologies.