• Title/Summary/Keyword: Memory Bandwidth

Exploring R&D Policy Directions for Semiconductor Advanced Packaging in Korea Based on Expert Interviews (국내 반도체 첨단패키징 R&D 정책방향: 산학연 전문가 조사를 중심으로)

  • S.J. Min;J.H. Park;S.S. Choi
    • Electronics and Telecommunications Trends
    • /
    • v.39 no.3
    • /
    • pp.1-12
    • /
    • 2024
  • As the demand for high-performance semiconductors, such as chips for artificial intelligence and high-bandwidth memory devices, increases and ultrafine processing technology in the semiconductor in-line process approaches its limits, advanced packaging is becoming an increasingly important breakthrough technology for further improving semiconductor performance. Major countries, including Korea, the United States, Taiwan, and China, as well as large companies, are strengthening their technological and industrial capabilities through the development of advanced packaging technology and policy support. Nevertheless, the level of development of the related technologies in Korea is approximately 66% of that of the most advanced countries. We therefore aim to identify the needs for an advanced packaging research and development (R&D) policy through written expert interviews and an importance-satisfaction analysis. As a result, various implications for R&D policy directions are suggested to strengthen the technological capabilities and R&D ecosystem of Korea's advanced packaging sector.

Analysis on Spectral Regrowth of Bandwidth Expansion Module by Quadrature Modulation Error in Digital Chirp Generator (디지털 첩 발생기에서의 직교 변조 오차에 의한 대역 확장 모듈에서의 스펙트럴 재성장 분석)

  • Kim, Se-Young;Sung, Jin-Bong;Lee, Jong-Hwan;Yi, Dong-Woo
    • The Journal of Korean Institute of Electromagnetic Engineering and Science
    • /
    • v.21 no.7
    • /
    • pp.761-768
    • /
    • 2010
  • This paper presents an effective method for generating the wideband waveform required for high-resolution SAR (Synthetic Aperture Radar) using a frequency multiplication technique. It also analyzes the root causes of the spectral regrowth due to third-order intermodulation in a chirp bandwidth expansion scheme that uses a quadrature modulator and frequency multipliers. The amplitude and phase imbalance requirements are defined based on simulation results for quadrature channel imbalance, which minimizes the degradation of range resolution, peak sidelobe ratio, and integrated sidelobe ratio. A wideband chirp generator using frequency multipliers and a memory-map scheme was manufactured, and a compensation technique was presented that reduces the spectral regrowth of the SAR waveform by minimizing the amplitude and phase imbalance. After adjusting the I and Q channel imbalance, the carrier level is reduced from -28.7 dBm to -53.4 dBm. A chirp signal with 150 MHz bandwidth at S-band is expanded to 600 MHz bandwidth at X-band, and the sidelobe levels are reduced by about 8 to 9 dB by compensating the amplitude imbalance between the I and Q channels.
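The imbalance-to-spur mechanism summarized above can be illustrated numerically. The following Python/NumPy sketch, using assumed impairment values rather than anything from the paper, applies a gain/phase imbalance and carrier leakage to a baseband chirp and then multiplies its frequency by four (150 MHz to 600 MHz), which is where the image and carrier terms grow into in-band spurs.

```python
# Hedged sketch: baseband chirp with quadrature (I/Q) imbalance, followed by
# x4 frequency multiplication, to visualize imbalance-induced spectral regrowth.
# All parameter values are illustrative, not taken from the paper.
import numpy as np

fs = 2.4e9                      # sample rate (Hz), high enough for the x4 output
T = 20e-6                       # chirp duration (s)
B = 150e6                       # chirp bandwidth before multiplication (Hz)
t = np.arange(int(fs * T)) / fs
phase = np.pi * (B / T) * t**2  # linear FM: instantaneous frequency f(t) = (B/T) * t

# An ideal quadrature modulator would output I + jQ = exp(j*phase).
# Model gain/phase imbalance and DC offset (carrier leakage) instead.
g_err, ph_err, dc = 1.05, np.deg2rad(3.0), 0.01   # assumed impairments
i_ch = np.cos(phase) + dc
q_ch = g_err * np.sin(phase + ph_err) + dc
x = i_ch + 1j * q_ch

# Frequency multiplication by 4 (150 MHz chirp -> 600 MHz chirp): x**4 scales the
# instantaneous frequency by 4 and raises the image/carrier terms into in-band spurs.
y = x**4

spec = 20 * np.log10(np.abs(np.fft.fftshift(np.fft.fft(y * np.hanning(len(y))))) + 1e-12)
freqs = np.fft.fftshift(np.fft.fftfreq(len(y), 1 / fs))
print("strongest spectral line: %.1f dB at %.1f MHz"
      % (spec.max(), freqs[spec.argmax()] / 1e6))
```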

Implementation of a TCP/IP Offload Engine Using High Performance Lightweight TCP/IP (고성능 경량 TCP/IP를 이용한 소프트웨어 기반 TCP/IP 오프로드 엔진 구현)

  • Jun, Yong-Tae;Chung, Sang-Hwa;Yoon, In-Su
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.4
    • /
    • pp.369-377
    • /
    • 2008
  • Today, Ethernet technology is rapidly advancing from 1 Gbps toward a bandwidth of 10 Gbps. In such high-speed networks, the conventional approach, in which the host CPU processes TCP/IP in the operating system, incurs substantial overhead, so user applications cannot obtain enough computing power from the host CPU. TCP/IP Offload Engine (TOE) technology emerged to solve this problem: a TOE is a specialized NIC that processes TCP/IP instead of the host CPU. In this paper, we implemented a high-performance lightweight TCP/IP (HL-TCP) for a TOE and applied it to an embedded system. The HL-TCP supports the fundamental TCP/IP functions: flow control, congestion control, retransmission, delayed ACK, and processing of out-of-order packets. It is also implemented to utilize the Ethernet MAC's hardware features, such as TCP segmentation offload (TSO), checksum offload (CSO), and interrupt coalescing. In addition, we eliminated the copy overhead from host memory to NIC memory when sending data and implemented an efficient DMA mechanism for TCP retransmission. The TOE using the HL-TCP achieves a CPU utilization of less than 6% and a bandwidth of 453 Mbps.
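As a small illustration of one of the stack features listed above, the sketch below implements a generic delayed-ACK policy in Python (acknowledge every second full segment or when a timer expires). The class name, the two-segment threshold, and the 200 ms budget are conventional TCP behavior used here as assumptions; they are not details taken from the HL-TCP implementation.

```python
# Hedged sketch of a delayed-ACK policy of the kind a lightweight TCP/IP stack
# typically implements; illustrative only.
import time

class DelayedAck:
    def __init__(self, timeout_s=0.2):
        self.timeout_s = timeout_s
        self.unacked_segments = 0
        self.first_unacked_at = None

    def on_segment_received(self, now=None):
        """Return True if an ACK should be sent immediately."""
        now = time.monotonic() if now is None else now
        self.unacked_segments += 1
        if self.first_unacked_at is None:
            self.first_unacked_at = now
        # ACK every second segment (RFC 1122 style) to roughly halve ACK traffic.
        if self.unacked_segments >= 2:
            self._reset()
            return True
        return False

    def on_timer(self, now=None):
        """Poll from the stack's timer loop; ACK if the delay budget is exhausted."""
        now = time.monotonic() if now is None else now
        if self.first_unacked_at is not None and now - self.first_unacked_at >= self.timeout_s:
            self._reset()
            return True
        return False

    def _reset(self):
        self.unacked_segments = 0
        self.first_unacked_at = None

# Example: two back-to-back segments trigger a single ACK.
ack = DelayedAck()
print(ack.on_segment_received(now=0.0), ack.on_segment_received(now=0.001))  # False True
```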

Trace-Back Viterbi Decoder with Sequential State Transition Control (순서적 역방향 상태천이 제어에 의한 역추적 비터비 디코더)

  • 정차근
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.40 no.11
    • /
    • pp.51-62
    • /
    • 2003
  • This paper presents a novel survivor memory management and decoding technique based on sequential backward state transition control in a trace-back Viterbi decoder. The Viterbi algorithm is a maximum likelihood decoding scheme that estimates the most likely encoder state sequence for channel error detection and correction, and it is applied to a broad range of digital communication problems such as intersymbol interference removal and channel equalization. To achieve an area-efficient, high-throughput VLSI design of the Viterbi decoder, whose operation is inherently recursive, further research is required on simple and systematic parallel ACS architectures and on survivor memory management. As a solution to this problem, this paper presents a progressive decoding algorithm with sequential backward state transition control in the trace-back Viterbi decoder. Compared with conventional trace-back decoding techniques, the proposed method greatly reduces the total required memory. Furthermore, it can be implemented with a simple pipelined, systolic-array-type architecture; no peripheral logic circuit is needed to control memory access, and the memory access bandwidth is reduced. Therefore, the proposed method offers high area efficiency and low power consumption with high throughput. Finally, examples of decoding results for received data with channel noise and application results are provided to evaluate the efficiency of the proposed method.
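For readers unfamiliar with the baseline operation the paper restructures, the short Python sketch below walks a survivor memory backwards in the conventional trace-back fashion. The table layout, the toy 4-state code, and the bit-extraction rule are illustrative assumptions; the paper's contribution, sequential backward state transition control, is not reproduced here.

```python
# Hedged sketch of conventional survivor-memory trace-back. `survivors[t][s]` holds
# the predecessor state chosen for state s at trellis step t, and `decode_bit` maps
# a state transition to the decoded input bit.

def traceback(survivors, final_state, decode_bit):
    """Walk the survivor memory backwards and return the decoded bit sequence."""
    state = final_state
    bits = []
    for step in reversed(survivors):          # latest trellis step first
        prev_state = step[state]
        bits.append(decode_bit(prev_state, state))
        state = prev_state
    bits.reverse()                            # bits come out newest-first, so flip them
    return bits

# Toy example for a rate-1/2, constraint-length-3 code (4 states): the input bit
# that caused a transition into `state` is that state's most significant bit.
decode_bit = lambda prev, cur: (cur >> 1) & 1
survivors = [
    {0: 0, 1: 0, 2: 1, 3: 1},   # step 1: predecessor of each state
    {0: 0, 1: 2, 2: 1, 3: 3},   # step 2
    {0: 1, 1: 0, 2: 3, 3: 2},   # step 3
]
print(traceback(survivors, final_state=0, decode_bit=decode_bit))  # [1, 0, 0]
```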

Hierarchical Ring Extension of NUMA Systems using Snooping Protocol (스누핑 프로토콜을 사용하는 NUMA 시스템의 계층적 링 구조로의 확장)

  • Seong, Hyeon-Jung;Kim, Hyeong-Ho;Jang, Seong-Tae;Jeon, Ju-Sik
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.26 no.11
    • /
    • pp.1305-1317
    • /
    • 1999
  • Because a NUMA architecture must access remote memory, the performance of the interconnection network largely determines overall system performance. The bus, which has been the most common interconnection network for NUMA systems, limits the construction of large-scale systems because of its restricted physical scalability and bandwidth. A ring interconnection network composed of high-speed point-to-point links overcomes the scalability and bandwidth limitations of the bus, but its transfer delay increases as more clusters are connected. In this paper, we propose a hierarchical ring extension of a snooping-protocol-based ring architecture to mitigate the delay that grows with the number of clusters, and we design an efficient cache coherence protocol for this architecture. The bridge that connects a local ring to the global ring maintains the cache coherence protocol and performs snoop filtering, which reduces local ring and cluster bus utilization and thereby improves system performance. Using a probability-driven simulator, we analyze the effects of the hierarchical architecture on system performance and on the utilization of the point-to-point links.
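The bridge's filtering role described above can be pictured with a minimal sketch. The Python class below forwards a global-ring snoop onto the local ring only when a local copy may exist; the interface and the set-based bookkeeping are illustrative assumptions rather than the paper's actual protocol state.

```python
# Hedged sketch of a bridge snoop filter between a global ring and a local ring.
# A coherence transaction from the global ring enters the local ring only if some
# local cache may hold the line, saving local-ring and cluster-bus bandwidth.

class BridgeSnoopFilter:
    def __init__(self):
        self.lines_cached_locally = set()   # addresses possibly cached inside this local ring

    def on_local_fill(self, addr):
        """A local cache fetched the line: future global snoops must be forwarded."""
        self.lines_cached_locally.add(addr)

    def on_local_eviction(self, addr):
        """Last local copy evicted/written back: global snoops need not enter anymore."""
        self.lines_cached_locally.discard(addr)

    def forward_global_snoop(self, addr):
        """Return True if a snoop from the global ring must enter the local ring."""
        return addr in self.lines_cached_locally

# Example: only the first address consumes local-ring bandwidth.
bridge = BridgeSnoopFilter()
bridge.on_local_fill(0x1000)
print(bridge.forward_global_snoop(0x1000))  # True  -> forwarded to local ring
print(bridge.forward_global_snoop(0x2000))  # False -> filtered, local ring stays idle
```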

Design of Low Power H.264 Decoder Using Adaptive Pipeline (적응적 파이프라인을 적용한 저전력 H.264 복호기 설계)

  • Lee, Chan-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.47 no.9
    • /
    • pp.1-6
    • /
    • 2010
  • The H.264 video coding standard is widely used because of its high compression ratio and quality. H.264 decoders usually adopt a pipeline architecture that operates on a macroblock or a $4{\times}4$ sub-block. The pipeline period is usually fixed to guarantee operation in the worst case, which results in many idle cycles and requires high data bandwidth and high-performance processing units. We propose an adaptive pipeline architecture for H.264 decoders that decodes efficiently and lowers the bandwidth requirement on the memory bus. Parameters and coefficients are delivered by handshaking communication over dedicated interconnections, and frame pixel data are transferred over an AMBA AHB network. The processing time of each block varies with the characteristics of the image, and the processing units start working whenever they are ready. An H.264 decoder was designed with the proposed architecture and implemented on an FPGA to verify its operation.
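A rough way to see where the savings come from is to compare the cycles one processing unit spends per frame under a fixed worst-case slot versus a handshake-driven start. The sketch below does exactly that with made-up per-macroblock latencies; it is a simplification of the idea, not a model of the proposed architecture.

```python
# Hedged sketch: cycles occupied by one processing unit across a frame under a
# fixed worst-case pipeline period versus an adaptive, handshake-driven start.
# Latencies are assumed per-macroblock cycle counts, not measurements from the paper.

def fixed_pipeline_cycles(per_block_latencies, worst_case):
    # Every macroblock occupies one fixed-length slot regardless of its real work.
    return len(per_block_latencies) * worst_case

def adaptive_pipeline_cycles(per_block_latencies):
    # With ready/valid handshaking, the unit takes the next block as soon as it finishes.
    return sum(per_block_latencies)

latencies = [120, 300, 80, 250, 60, 400, 150]   # assumed cycles per macroblock
worst = max(latencies)
print("fixed   :", fixed_pipeline_cycles(latencies, worst), "cycles")
print("adaptive:", adaptive_pipeline_cycles(latencies), "cycles")
```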

DRAM Buffer Data Management Techniques to Enhance SSD Performance (SSD 성능 향상을 위한 DRAM 버퍼 데이터 처리 기법)

  • Im, Kwang-Seok;Han, Tae-Hee
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.48 no.7
    • /
    • pp.57-64
    • /
    • 2011
  • To bridge the bandwidth gap between the host interface and NAND flash memory, DRAM is adopted as a buffer in SSDs (solid-state disks). In this paper, we propose cost-effective techniques that enhance SSD performance without resorting to expensive high-bandwidth DRAM. SSD data can be classified into three groups: user data, metadata for handling the user data, and FEC (Forward Error Correction) parity/CRC (Cyclic Redundancy Check) for error control. To improve performance by exploiting the characteristics of each data type, we devise a flexible burst control method based on system monitoring and a page-based FEC parity/CRC scheme. Experimental results show that the proposed methods improve SSD performance by up to 25.9% with a negligible 0.07% increase in chip size.
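The per-class treatment described above can be sketched as a simple classification step that picks a DRAM burst length per data type. The three classes come from the abstract; the classification rule, request tags, and burst lengths below are illustrative assumptions.

```python
# Hedged sketch: classify SSD buffer traffic and choose a DRAM burst length per class.
from enum import Enum

class DataClass(Enum):
    USER = "user data"
    META = "metadata"
    FEC_CRC = "FEC parity / CRC"

# Assumed burst lengths (DRAM words) per class; a real controller would adapt these
# from monitored traffic, as the paper's flexible burst control does.
BURST_WORDS = {DataClass.USER: 64, DataClass.META: 8, DataClass.FEC_CRC: 16}

def classify(request):
    """Very rough classification by request tag; a placeholder for controller logic."""
    if request.get("is_parity"):
        return DataClass.FEC_CRC
    if request.get("is_mapping_table"):
        return DataClass.META
    return DataClass.USER

def burst_length(request):
    return BURST_WORDS[classify(request)]

print(burst_length({"lba": 0x100}))                  # 64: user data
print(burst_length({"is_mapping_table": True}))      # 8:  metadata
print(burst_length({"is_parity": True}))             # 16: FEC parity / CRC
```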

A Low Memory Bandwidth Motion Estimation Core for H.264/AVC Encoder Based on Parallel Current MB Processing (병렬처리 기반의 H.264/AVC 인코더를 위한 저 메모리 대역폭 움직임 예측 코어설계)

  • Kim, Shi-Hye;Choi, Jun-Rim
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.48 no.2
    • /
    • pp.28-34
    • /
    • 2011
  • In this paper, we present integer and fractional motion estimation IP for an H.264/AVC encoder based on a hardware-oriented algorithm. In the integer motion engine, the reference block is shared by consecutive current macroblocks processed in parallel, which exploits data reusability and reduces off-chip memory bandwidth. In the fractional motion engine, half-pel and quarter-pel positions are processed in parallel instead of in a two-step sequential refinement, discarding unnecessary candidate positions and doubling the throughput. The H.264/AVC motion estimation chip was fabricated on an MPW (Multi-Project Wafer) run using Chartered $0.18{\mu}m$ standard CMOS 1P5M technology and achieves a throughput that supports HDTV 720p at 30 fps.
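A back-of-the-envelope calculation shows why sharing the reference window between consecutive current macroblocks reduces off-chip traffic. The window and search-range sizes below are textbook values, not figures reported in the paper.

```python
# Hedged sketch: reference pixels fetched per pair of adjacent 16x16 macroblocks,
# with and without sharing the overlapping search window.

MB = 16                 # macroblock width/height in pixels
SR = 16                 # search range of +/-16 pixels
window_w = MB + 2 * SR  # 48 pixels
window_h = MB + 2 * SR  # 48 pixels

# Independent fetch: each macroblock loads its full search window.
independent = 2 * window_w * window_h

# Shared fetch: the second macroblock's window is shifted right by MB pixels,
# so only an MB-wide strip of new reference pixels must be loaded.
shared = window_w * window_h + MB * window_h

print("pixels fetched, independent:", independent)   # 4608
print("pixels fetched, shared     :", shared)        # 3072
print("reduction: %.0f%%" % (100 * (1 - shared / independent)))
```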

FlashEDF: An EDF-style Scheduling Scheme for Serving Real-time I/O Requests in Flash Storage

  • Lim, Seong-Chae
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.10 no.3
    • /
    • pp.26-34
    • /
    • 2018
  • In this paper, we propose a scheduling scheme that can efficiently serve I/O requests with deadlines in flash storage. The I/O requests with deadlines, namely real-time requests, are assumed to be issued for streaming services of continuous media. Since a Web-based streaming server commonly also serves downloads of HTML files or images, we aim to process non-real-time I/O requests quickly as well, together with the real-time ones. For this purpose, we adopt the well-known rate-reservation EDF (RR-EDF) algorithm to determine scheduling priorities among the mixed I/O requests. For an EDF-style algorithm to be usable, the overhead of task switching must be low and predictable, as in its original application to CPU scheduling; the EDF algorithm is therefore inherently unsuitable for scheduling I/O requests on HDD storage because of the HDD's highly variable latency. Unlike an HDD, flash storage reads a block in an almost uniform time regardless of its physical location, because it has no mechanical components. By capitalizing on this uniform block read time, we compute the bandwidth utilization rates of the real-time requests from the streams, and the RR-EDF algorithm then determines how much storage bandwidth can be assigned to non-real-time requests while still meeting the deadlines of the real-time requests. This improves the service times of the non-real-time requests, which are issued for downloads of static files. Because the proposed scheme can flexibly expand the scheduling periods of the streams, it can make full use of slack times, thereby significantly improving the overall throughput of flash storage.
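The utilization bookkeeping that the scheme relies on is easy to sketch: with a near-uniform block read time, a stream's utilization is simply its block rate times that read time, and whatever remains is slack available to non-real-time requests. The Python snippet below illustrates this together with an EDF-style pick of the next real-time request; the rates, read time, and names are assumptions, not values from the paper.

```python
# Hedged sketch of bandwidth-utilization bookkeeping for mixed real-time and
# non-real-time flash I/O, plus an EDF-style choice of the next real-time request.

BLOCK_READ_MS = 0.1      # assumed uniform time to read one flash block (ms)

def stream_utilization(blocks_per_second):
    """Fraction of storage time a stream consumes at its reserved rate."""
    return blocks_per_second * BLOCK_READ_MS / 1000.0

streams = {"video_a": 900, "video_b": 1200, "audio_c": 150}   # blocks/s per stream
rt_util = sum(stream_utilization(r) for r in streams.values())
slack = max(0.0, 1.0 - rt_util)

print("real-time utilization: %.2f" % rt_util)        # well below 1.0, so slack remains
print("bandwidth left for non-real-time I/O: %.2f" % slack)

# EDF-style pick: serve the pending real-time request with the earliest deadline.
pending_rt = [("video_a", 5.0), ("video_b", 2.5)]      # (stream, deadline in ms)
next_rt = min(pending_rt, key=lambda x: x[1])
print("next real-time request:", next_rt[0])
```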

Cross-Layer Architecture for QoS Provisioning in Wireless Multimedia Sensor Networks

  • Farooq, Muhammad Omer;St-Hilaire, Marc;Kunz, Thomas
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.6 no.1
    • /
    • pp.178-202
    • /
    • 2012
  • In this paper, we first survey cross-layer architectures for Wireless Sensor Networks (WSNs) and Wireless Multimedia Sensor Networks (WMSNs). We then propose a novel cross-layer architecture for QoS provisioning in clustered, multi-hop WMSNs. The proposed architecture supports multiple network-based applications on a single sensor node: an area in memory is reserved where each application can store its network protocol settings. Furthermore, the architecture supports heterogeneous flows by classifying WMSN traffic into six traffic classes and incorporates a service differentiation module for QoS provisioning. The service differentiation module defines the forwarding behavior of each traffic class, determined primarily by the class's priority, and allocates bandwidth to each traffic class with the goals of maximizing network utilization and avoiding starvation of low-priority flows. The proposal also incorporates a congestion detection and control algorithm. Upon detecting congestion, the congested node estimates the data rate that should be used by the node itself and by its one-hop upstream nodes, taking into account the characteristics of the different traffic classes along with their total bandwidth usage. The architecture uses a shared database to enable cross-layer interactions; applications' network protocol settings and the interactions with the shared database are handled through a cross-layer optimization middleware.
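The bandwidth-allocation goal of the service differentiation module (maximize utilization while avoiding starvation of low-priority flows) can be sketched as a weighted split with a guaranteed floor per class. The six class names, weights, and the 5% floor in the Python snippet below are illustrative assumptions; the paper defines its own classes and policy.

```python
# Hedged sketch: priority-weighted bandwidth sharing with a per-class starvation guard.

CLASSES = {  # traffic class -> priority weight (higher = more important)
    "real_time_video": 6, "real_time_audio": 5, "event_alerts": 4,
    "snapshot_images": 3, "monitoring_scalars": 2, "best_effort": 1,
}
MIN_SHARE = 0.05   # guaranteed floor per class so low-priority flows are never starved

def allocate(total_bandwidth_kbps):
    floor = MIN_SHARE * total_bandwidth_kbps
    remaining = total_bandwidth_kbps - floor * len(CLASSES)
    weight_sum = sum(CLASSES.values())
    # Each class gets its floor plus a priority-weighted share of the remainder.
    return {name: floor + remaining * w / weight_sum for name, w in CLASSES.items()}

for name, kbps in allocate(250.0).items():   # e.g. a 250 kbps IEEE 802.15.4 link
    print(f"{name:20s} {kbps:6.1f} kbps")
```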