• Title/Abstract/Keywords: Memory Bandwidth

Search results: 245 items, processing time 0.023 s

Ethernet-Based Avionic Databus and Time-Space Partition Switch Design

  • Li, Jian;Yao, Jianguo;Huang, Dongshan
    • Journal of Communications and Networks
    • /
    • Vol. 17, No. 3
    • /
    • pp.286-295
    • /
    • 2015
  • Avionic databuses fulfill a critical function in the connection and communication of aircraft components and functions such as flight control, navigation, and monitoring. Ethernet-based avionic databuses have become the mainstream for large aircraft owing to their advantages of full-duplex communication with high bandwidth, low latency, low packet loss, and low cost. As a new-generation aviation network communication standard, avionics full-duplex switched Ethernet (AFDX) adopted concepts from the telecom standard asynchronous transfer mode (ATM). In this technology, the switches are the key devices influencing overall performance. This paper reviews the avionic databus with emphasis on the classification of switch architectures. Based on a comparison, analysis, and discussion of the different switch architectures, we propose a new avionic switch design based on a time-division switch fabric for high flexibility and scalability. The design also incorporates the concept of a space-partition switch fabric to achieve reliability and predictability. The new switch architecture, called the space partitioned shared memory switch (SPSMS), isolates the memory space for each output port. This can reduce competition for resources and avoid conflicts, decrease the packet forwarding latency through the switch, and reduce the packet loss rate. A simulation of the architecture with optimized network engineering tools (OPNET) confirms its efficiency and a significant performance improvement over a classic shared memory switch in terms of overall packet latency, queuing delay, and queue size.
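
The per-port memory isolation described in this abstract can be illustrated with a toy buffer model. The sketch below is not the SPSMS design itself: the class names, the equal static split of the buffer, and the unit-sized packets are all assumptions made for illustration. It only contrasts a common buffer pool with a buffer partitioned by output port.

```python
from collections import deque


class SharedMemorySwitch:
    """All output ports draw from one common buffer pool (hypothetical model)."""

    def __init__(self, ports, capacity):
        self.capacity = capacity
        self.queues = [deque() for _ in range(ports)]

    def enqueue(self, port, packet):
        # Admission depends on the total occupancy across every port.
        if sum(len(q) for q in self.queues) >= self.capacity:
            return False  # packet dropped
        self.queues[port].append(packet)
        return True


class PartitionedSwitch:
    """Each output port owns a private, equal slice of the buffer (assumed split)."""

    def __init__(self, ports, capacity):
        self.per_port = capacity // ports
        self.queues = [deque() for _ in range(ports)]

    def enqueue(self, port, packet):
        # Admission depends only on this port's own slice.
        if len(self.queues[port]) >= self.per_port:
            return False  # packet dropped
        self.queues[port].append(packet)
        return True


def flood_then_probe(switch):
    """Send a burst of 4 packets to port 0, then a single packet to port 1."""
    burst = [switch.enqueue(0, f"p{i}") for i in range(4)]
    probe = switch.enqueue(1, "probe")
    return burst.count(True), probe
```

With two ports and four buffer slots, the shared pool accepts the whole burst on port 0 and then drops the packet for port 1, while the partitioned version drops half of the burst but still admits the packet for port 1. In this toy model, isolation costs some burst absorption in exchange for predictable per-port behavior.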

Design of Optimized SWAP System for Next-Generation Storage Devices

  • 한혁
    • 한국콘텐츠학회논문지
    • /
    • Vol. 15, No. 4
    • /
    • pp.9-16
    • /
    • 2015
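  • Virtual memory management in advanced operating systems such as Linux uses main memory together with storage devices such as hard disks to give applications a large virtual address space. Because storage devices have recently become dramatically faster, using high-speed next-generation storage devices for memory extension should improve the performance of memory-intensive applications. However, the virtual memory management overhead of existing operating systems prevents applications from reaching their full performance. To address this problem, this paper proposes an improved algorithm for allocating block addresses for write operations, along with system tuning techniques, for the case where next-generation storage devices are used for memory extension, and implements the proposed techniques in the virtual memory management system of Linux 3.14.3. Benchmark experiments on the implemented system show an average 3x performance improvement for micro-benchmarks and a 24% improvement for scientific computing benchmark applications.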

Smart Bus Arbiter for QoS control in H.264 decoders

  • Lee, Chan-Ho
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • Vol. 11, No. 1
    • /
    • pp.33-39
    • /
    • 2011
  • H.264 decoders usually have a pipeline architecture organized by macroblock or by $4 \times 4$ sub-block. The period of the pipeline is usually fixed to guarantee operation in the worst case, which results in many idle cycles and higher data bandwidth. An adaptive pipeline architecture for H.264 decoders has been proposed to decode efficiently and to lower the bandwidth requirement on the memory bus. However, it requires a controller for adaptive priority control to exploit this advantage. We propose a smart bus arbiter that replaces the controller. It adjusts the priority adaptively for QoS (Quality of Service) control of the decoding process. The smart arbiter can be integrated into the arbiter of a bus system, and it operates only when certain conditions are met, so it does not affect the original functions of the arbiter. An H.264 decoder using the proposed architecture is designed and implemented on an FPGA to verify its operation.

Reliable Distributed Lookup Service Scheme for Mobile Ad-hoc Networks

  • Malik Muhammad Ali;Kim Jai-Hoon
    • 한국정보과학회:학술대회논문집
    • /
    • 한국정보과학회 2006년도 한국컴퓨터종합학술대회 논문집 Vol.33 No.1 (D)
    • /
    • pp.124-126
    • /
    • 2006
  • Mobile ad hoc networking is an emerging technology, and many applications are now being developed to run on these networks. Lookup services are very important in these networks because nodes do not all have the same resources in terms of memory and computing power, and nodes need to use different services offered by different servers. A reliable and efficient lookup scheme is needed because of the limited bandwidth and low computing power of devices in mobile ad hoc networks. Because of mobility and rapid changes in network topology, the lookup mechanisms used in wired networks are not appropriate. Service discovery mechanisms fall into two main categories: centralized and distributed. A centralized mechanism is not reliable because these networks have no central node; it also introduces a single point of failure. Service discovery can instead be handled by distributing the servers' information to every node, but this approach is also inappropriate given the limited bandwidth and rapid changes in the network. In this paper, we present a reliable lookup service scheme based on a distributed mechanism. We also avoid replication because of the low bandwidth of wireless networks. In this scheme, service discovery is handled through different lookup servers. Reliability is the key feature of the proposed scheme.

New Transient Request with Loose Ordering for Token Coherence Protocol

  • 박윤경;김대영
    • 대한전기학회논문지:시스템및제어부문D
    • /
    • Vol. 54, No. 10
    • /
    • pp.615-619
    • /
    • 2005
  • The token coherence protocol has several advantages over snooping- and directory-based protocols in terms of latency, bandwidth, and complexity. Token counting easily maintains the correctness of the protocol without the global ordering of requests that underlies the other dominant cache coherence protocols. However, this lack of global ordering causes starvation, which does not occur in snooping- or directory-based protocols. The token coherence protocol solves this problem by providing an emergency mechanism called the persistent request. It forces the other processors in the competition (those accessing the same shared memory block) to give up their tokens to feed a starving processor. However, as the number of processors in a system grows, starvation occurs more frequently. In other words, persistent requests become too frequent to be treated as an emergency. As their frequency increases, not only does the cost of each persistent request matter, since it is broadcast to all processors, but the added traffic will also saturate the bandwidth of the multiprocessor interconnection network. This paper proposes a new request mechanism that defines an order among requests to reduce the occurrence of persistent requests. The ordering mechanism is designed to be decentralized, because the centralized mechanisms in snooping-based and directory-based protocols are one of the primary reasons the token coherence protocol has an advantage over these two dominant protocols in latency and bandwidth.
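
The token-counting idea mentioned in this abstract can be sketched in a few lines. The abstract does not spell out the rule, so the sketch assumes the usual formulation from the token coherence literature: each memory block has a fixed number of tokens, a processor needs at least one token to read and all tokens to write. The class and method names are hypothetical, and the paper's proposed request ordering is not modeled.

```python
class TokenBlock:
    """Token bookkeeping for one memory block (illustrative sketch only)."""

    def __init__(self, total_tokens):
        self.total = total_tokens
        # Assumption: all tokens start at main memory.
        self.tokens = {"memory": total_tokens}

    def can_read(self, holder):
        # At least one token guarantees that no writer exists elsewhere.
        return self.tokens.get(holder, 0) >= 1

    def can_write(self, holder):
        # Holding every token guarantees that no other reader or writer exists.
        return self.tokens.get(holder, 0) == self.total

    def transfer(self, src, dst, count):
        """Move tokens between holders; tokens are never created or destroyed."""
        if count <= 0 or self.tokens.get(src, 0) < count:
            raise ValueError("source does not hold enough tokens")
        self.tokens[src] -= count
        self.tokens[dst] = self.tokens.get(dst, 0) + count
        # Conservation invariant that the correctness argument relies on.
        assert sum(self.tokens.values()) == self.total
```

Because correctness depends only on counting, no global order of requests is required. That is also why two processors that each hold part of the tokens can starve each other, which is the situation the persistent request is meant to resolve.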

A Novel 3-Level Transceiver using Multi Phase Modulation for High Bandwidth

  • Jung, Dae-Hee;Park, Jung-Hwan;Kim, Chan-Kyung;Kim, Chang-Hyun;Kim, Suki
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2003년도 하계종합학술대회 논문집 II
    • /
    • pp.791-794
    • /
    • 2003
  • The increasing computational capability of processors is driving the need for high-bandwidth links to communicate and store the information that is processed. Such links are often an important part of multiprocessor interconnection, processor-to-memory interfaces, and serial-network interfaces. This paper describes a 0.11-$\mu$m CMOS 4 Gbps/pin 3-level transceiver using RSL (Rambus Signaling Logic) for high bandwidth. The system uses a high-gain windowed integrating receiver with a wide common-mode range, designed to improve SNR when operating with the smaller input overdrive of 3-level signaling. For multi-gigabit/second applications, the data rate is limited by inter-symbol interference (ISI) caused by the low-pass effects of the channel, the process-limited on-chip clock frequency, and the serial link distance. To detect the transmitted 4 Gbps/pin 3-level data successfully, the receiver is designed with a 3-stage sense amplifier. The proposed transceiver employs multi-level signaling (3-level pulse amplitude modulation) using multi-phase clocks, double data rate, and a PRBS pattern generator. The transceiver shows a data rate of 3.2 to 4.0 Gbps/pin with a 1 GHz internal clock.

FCBAFL: An Energy-Conserving Federated Learning Approach in Industrial Internet of Things

  • Bin Qiu;Duan Li;Xian Li;Hailin Xiao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 18, No. 9
    • /
    • pp.2764-2781
    • /
    • 2024
  • Federated learning (FL) has been proposed as an emerging distributed machine learning framework that lowers the risk of privacy leakage by training models without uploading the original data. It has therefore been widely used in the Industrial Internet of Things (IIoT). Despite this, FL still faces challenges, including non-independent and identically distributed (Non-IID) data and device heterogeneity, which may make model convergence difficult. To address these issues, a local surrogate function is first constructed for each device to ensure a smooth decline in global loss. Subsequently, to minimize system energy consumption, an FL approach for joint CPU frequency control and bandwidth allocation, called FCBAFL, is proposed. Specifically, the maximum delay of a single round is first treated as a uniform delay constraint, and a limited-memory Broyden-Fletcher-Goldfarb-Shanno bounded (L-BFGS-B) algorithm is employed to find the optimal bandwidth allocation with a fixed CPU frequency. The result is then used to derive the optimal CPU frequency. Numerical simulation results show that the proposed FCBAFL algorithm converges better than the baseline algorithm and outperforms other schemes in reducing energy consumption.

HS Implementation Based on Music Scale

  • 이태봉
    • 한국정보전자통신기술학회논문지
    • /
    • Vol. 15, No. 5
    • /
    • pp.299-307
    • /
    • 2022
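  • Harmony Search (HS) is a relatively recently developed metaheuristic optimization algorithm that has been the subject of a wide range of recent studies. HS is based on the improvisation of musicians, with the decision variables playing the role of instruments. However, each instrument is given only a pitch range and has no notion of a musical scale, which is fundamental to music. This study introduces a musical scale into the conventional HS and quantizes the bandwidth to improve the performance of the algorithm. The scale is applied to harmony memory (HM) initialization, replacing the conventional approach of random initialization within the pitch range. The quantization step can be chosen arbitrarily, so that a relatively large bandwidth early in the run improves exploration and a small bandwidth later in the run improves exploitation. Introducing the scale and quantizing the bandwidth reduced the variation in performance caused by initial values and improved the convergence speed and success rate compared with the conventional HS. The results are verified by comparing numerical optimization examples for several functions against the conventional method. Detailed comparison figures are given in the simulation section.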
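
A minimal sketch of the two ideas in this abstract, scale-based HM initialization and a bandwidth that shrinks in discrete steps, is given below. The abstract does not define the scale or the quantization schedule, so both are assumptions here: the "scale" is taken to be evenly spaced notes across the pitch range, and the bandwidth falls linearly over a fixed number of stages. All function names and default parameter values are hypothetical.

```python
import random


def quantized_bandwidth(iteration, max_iter, bw_max, bw_min, levels):
    """Step the bandwidth from bw_max down to bw_min in `levels` discrete stages."""
    if levels == 1:
        return bw_max
    stage = min(levels - 1, iteration * levels // max_iter)
    return bw_max - (bw_max - bw_min) * stage / (levels - 1)


def scale_notes(low, high, notes):
    """Assumed 'scale': evenly spaced notes across the pitch range (notes >= 2)."""
    step = (high - low) / (notes - 1)
    return [low + i * step for i in range(notes)]


def harmony_search(f, low, high, dim, hms=10, hmcr=0.9, par=0.3,
                   max_iter=2000, bw_max=1.0, bw_min=0.001,
                   levels=5, notes=8, seed=0):
    """Minimize f over [low, high]^dim with a basic HS loop."""
    rng = random.Random(seed)
    scale = scale_notes(low, high, notes)
    # Harmony memory starts on scale notes instead of uniform random values.
    hm = [[rng.choice(scale) for _ in range(dim)] for _ in range(hms)]
    cost = [f(h) for h in hm]
    for it in range(max_iter):
        bw = quantized_bandwidth(it, max_iter, bw_max, bw_min, levels)
        new = []
        for d in range(dim):
            if rng.random() < hmcr:
                # Memory consideration: reuse a stored value for this variable.
                x = hm[rng.randrange(hms)][d]
                if rng.random() < par:
                    # Pitch adjustment within the current bandwidth.
                    x += rng.uniform(-bw, bw)
            else:
                # Random selection from the whole range.
                x = rng.uniform(low, high)
            new.append(min(high, max(low, x)))
        c = f(new)
        worst = max(range(hms), key=cost.__getitem__)
        if c < cost[worst]:
            # Replace the worst stored harmony with the better new one.
            hm[worst], cost[worst] = new, c
    best = min(range(hms), key=cost.__getitem__)
    return hm[best], cost[best]
```

A call such as `harmony_search(lambda x: sum(v * v for v in x), -5.0, 5.0, dim=2)` runs the loop on a sphere function. Wide pitch adjustments early in the run favor exploration, and narrow ones late in the run favor local refinement.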

Modeling and Analysis of High Speed Serial Links (SerDes) for Hybrid Memory Cube Systems

  • 전동익;정기석
    • 대한임베디드공학회논문지
    • /
    • Vol. 12, No. 4
    • /
    • pp.193-204
    • /
    • 2017
  • Various 3D-stacked DRAMs have been proposed to overcome the memory wall problem. Hybrid Memory Cube (HMC) is a true 3D-stacked DRAM with DRAM layers stacked on top of a logic layer. The logic die is mainly used to implement a memory controller for the HMC, and it is connected through a high-speed serial link called SerDes to a host that is either a processor or another HMC. In HMC, the serial link is crucial for both performance and power consumption. It is therefore important to configure the link properly so that the required performance is met while power consumption is minimized. In this paper, we propose an HMC system model that includes the high-speed serial link in order to estimate performance accurately. Because the link model strictly follows the link flow control mechanism defined in the HMC specification, the actual HMC performance can be estimated accurately for each link configuration. Various simulations are conducted to determine the correlation between HMC performance and link configuration with regard to memory utilization. The results confirm a strong correlation between the maximum achievable performance of HMC and the link configuration in terms of both bandwidth and latency. It is therefore possible to find the best link configuration when the required HMC performance is known in advance, and doing so leads to significant power savings while the performance requirement is satisfied.

Weighted Competitive Update Protocol for DSM Systems

  • 임성화;백상현;김재훈;김성수
    • 한국정보처리학회논문지
    • /
    • Vol. 6, No. 8
    • /
    • pp.2245-2252
    • /
    • 1999
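  • A Distributed Shared Memory (DSM) system presents users with a simple shared-memory abstraction, so they need not manage data movement between nodes. Each node consists of a processor, memory, and a network interface. Memory is divided into pages, and a page may have replicas on multiple nodes. To keep these replicas consistent, invalidate protocols and update protocols have traditionally been widely used. The performance of these two protocols depends on system parameters and on the application's shared-memory access pattern. To adapt to the memory access pattern, the competitive update protocol updates replicas that are likely to be used in the near future and invalidates the others. This paper proposes a weighted competitive update protocol that accounts for architectures in which communication costs between nodes are not uniform. Simulation-based performance measurements show that the weighted competitive update protocol improves performance.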
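
The competitive update idea in this abstract can be sketched as a per-replica budget. In the usual counter-based formulation, a replica keeps receiving updates until it has gone too long without a local access, after which it is invalidated. The abstract does not say how the weights enter the protocol, so the weighting below, which charges each update its link cost against a fixed limit, is purely an assumption for illustration and not the authors' algorithm. The `Replica` class and its parameters are hypothetical.

```python
class Replica:
    """One page replica on a remote node (illustrative sketch only)."""

    def __init__(self, link_cost, limit):
        self.link_cost = link_cost  # assumed cost of sending one update here
        self.limit = limit          # assumed budget of unused update cost
        self.spent = 0              # update cost accumulated since last local use
        self.valid = True

    def local_access(self):
        """A local access shows the copy is in use, so the budget is reset."""
        if self.valid:
            self.spent = 0
        return self.valid  # False means the page must be fetched again

    def remote_update(self):
        """Decide what the writer does with this replica: update or invalidate."""
        if not self.valid:
            return "skip"
        if self.spent + self.link_cost > self.limit:
            self.valid = False
            return "invalidate"
        self.spent += self.link_cost
        return "update"
```

With `limit=3`, a replica behind a link of cost 1 receives three unused updates before it is invalidated, while a replica behind a link of cost 3 receives only one. Under this assumed weighting, replicas that are expensive to reach are dropped sooner unless they are actually being used.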
