• Title/Summary/Keyword: low-latency processing

Integer-Pel Motion Estimation for HEVC on Compute Unified Device Architecture (CUDA)

  • Lee, Dongkyu;Sim, Donggyu;Oh, Seoung-Jun
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.3 no.6
    • /
    • pp.397-403
    • /
    • 2014
  • A new video compression standard, High Efficiency Video Coding (HEVC), has recently been released. HEVC provides higher coding performance than previous standards, but at the cost of a significant increase in encoding complexity, particularly in motion estimation (ME). At the same time, Graphics Processing Units (GPUs) have become more powerful. This paper proposes a parallel integer-pel ME (IME) algorithm for HEVC on GPUs using the Compute Unified Device Architecture (CUDA). The proposed IME introduces concurrent parallel reduction (CPR), which performs several parallel reduction (PR) operations concurrently to solve two problems in conventional PR: low thread utilization and high thread synchronization latency. The proposed encoder reduces the portion of IME in the encoder to almost zero with a 2.3% increase in bitrate. The proposed IME is up to 172.6 times faster than the IME in the HEVC reference model.
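
The idea of running several reductions in lockstep can be illustrated on a CPU. The following is a minimal Python sketch, not the paper's CUDA kernel: the function name `concurrent_tree_reduce` and the equal power-of-two row length are assumptions made for illustration. Each pass of the outer loop stands in for one synchronization step, and all rows share it, so several reductions cost the same number of steps as one.

```python
from typing import Callable, List


def concurrent_tree_reduce(rows: List[List[int]],
                           op: Callable[[int, int], int]) -> List[int]:
    """Reduce every row with the same tree-shaped schedule, in lockstep.

    Assumes all rows share one power-of-two length (a simplification).
    """
    width = len(rows[0])
    assert width > 0 and width & (width - 1) == 0, "power-of-two length assumed"
    assert all(len(r) == width for r in rows)
    work = [list(r) for r in rows]  # copy so the inputs are not modified
    stride = width // 2
    while stride > 0:
        # On a GPU, each (row, i) pair below would map to one thread, so
        # batching rows keeps more threads busy per synchronization step.
        for row in work:
            for i in range(stride):
                row[i] = op(row[i], row[i + stride])
        stride //= 2  # one barrier shared by all rows ends here
    return [row[0] for row in work]


# Two independent minimum searches finish in the same two steps.
best = concurrent_tree_reduce([[4, 2, 9, 1], [7, 3, 8, 5]], min)
```

Passing `min` models a search for the best candidate cost, and passing an addition function models summing partial costs; which of these the paper batches is not stated in the abstract.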

Delay Improvement Greedy Forwarding in Low-Duty-Cycle Wireless Sensor Networks (로우듀티사이클 환경을 고려한 무선센서네트워크에서 데이터 전송지연을 향상한 그리디 포워딩)

  • Choe, Junseong;Le, Huu Nghia;Shon, Minhan;Choo, Hyunseung
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2012.04a
    • /
    • pp.609-611
    • /
    • 2012
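  • This paper proposes DIGF (Delay Improvement Greedy Forwarding), a scheme for low-duty-cycle environments that guarantees not only reliable data delivery to the destination but also low data latency. Early greedy forwarding schemes aimed to raise the delivery success rate and energy efficiency in order to address the unreliability and asymmetry of wireless links. However, many greedy forwarding schemes do not account for the large amount of energy that nodes consume in the idle listening state, in which they wait to send or receive data, and this shortens network lifetime. To address this problem, DIGF considers the low-duty-cycle environment in addition to the unreliability and asymmetry of wireless links. It also proposes an algorithm to resolve the high sleep latency that arises in low-duty-cycle environments, guaranteeing low transmission delay and reliable data delivery.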
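
Sleep latency can be made concrete with a small calculation. The sketch below is not the DIGF algorithm, whose details the abstract does not give. It assumes a slotted schedule in which each neighbor wakes once per period at a fixed offset, and it ignores link reliability. The names `sleep_latency` and `pick_next_hop` are hypothetical.

```python
from typing import List, Optional, Tuple

# (name, progress toward the destination, wake-up slot offset): assumed layout
Neighbor = Tuple[str, float, int]


def sleep_latency(now: int, wake_offset: int, period: int) -> int:
    """Slots a sender must wait until the neighbor's next wake-up slot."""
    return (wake_offset - now) % period


def pick_next_hop(now: int, neighbors: List[Neighbor],
                  period: int) -> Optional[str]:
    """Among neighbors that make geographic progress, pick the one that
    wakes soonest; ties go to the neighbor with the larger progress."""
    candidates = [n for n in neighbors if n[1] > 0]
    if not candidates:
        return None
    best = min(candidates,
               key=lambda n: (sleep_latency(now, n[2], period), -n[1]))
    return best[0]


# At slot 7 with a 10-slot period, "a" wakes in 6 slots and "b" in 1 slot;
# "c" is awake now but leads away from the destination, so it is skipped.
choice = pick_next_hop(7, [("a", 5.0, 3), ("b", 2.0, 8), ("c", -1.0, 7)], 10)
```

A purely distance-greedy rule would pick "a" here and wait six slots, which is the kind of delay a duty-cycle-aware rule tries to avoid.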

A Routing Algorithm for Low Latency Time in Wireless Sensor Networks (무선센서네트워크의 지연시간을 고려한 라우팅 알고리즘)

  • Park, Jae-Yeon;Park, Myong-Soon
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2010.04a
    • /
    • pp.619-622
    • /
    • 2010
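  • Because wireless sensor networks must make limited energy resources last for long periods, research has focused mainly on energy saving, and latency has received comparatively little attention. However, when a wireless sensor network is built to monitor facilities laid out in a line, such as power transmission and distribution lines, traffic must pass through a cluster-head relay at every fixed interval. As a result, clustering itself is difficult or causes excessive transmission delay, and the failure of a single node affects all nodes. To improve on this, this paper uses an effective clustering method for small clusters arranged in a line together with an Every Other Hop (EOH) relay transmission technique, which reduces transmission latency and makes it possible to resolve the single point of failure problem.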
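
One way to picture an every-other-hop relay on a line of nodes is sketched below. This is an illustration under stated assumptions, not the paper's protocol: it assumes each node's radio reaches two positions down the line, that node 0 is the sink, and that a sender falls back to its adjacent node when the preferred relay has failed. The function name `eoh_route` is hypothetical.

```python
from typing import List, Optional


def eoh_route(source: int, alive: List[bool]) -> Optional[List[int]]:
    """Route from `source` toward node 0 along a line of nodes.

    Prefers skipping one node per hop; falls back to the adjacent node
    when the preferred relay is down. Returns None if both are down.
    """
    path = [source]
    cur = source
    while cur > 0:
        for step in (2, 1):
            nxt = cur - step
            if nxt >= 0 and alive[nxt]:
                break
        else:
            return None  # neither reachable relay is alive
        path.append(nxt)
        cur = nxt
    return path


healthy = eoh_route(6, [True] * 7)         # 3 hops instead of 6
one_failure = eoh_route(6, [True, True, True, True, False, True, True])
```

With every node alive, the route from node 6 uses three transmissions rather than six. With node 4 down, the route shifts to the other set of nodes instead of breaking, which is the single-point-of-failure behavior the abstract refers to.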

Low Latency Handover Scheme Based on Optical Buffering at LMA and Simplified Authentication Procedure in PMIPv6 Networks (PMIPv6 네트워크에서 LMA 광 버퍼링 및 간소화한 인증절차 기반의 핸드오버 지연시간 단축 기법)

  • Oh, Seungtak;Choo, Hyunseung
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.04a
    • /
    • pp.1206-1209
    • /
    • 2009
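  • Host-based protocols that support the mobility of mobile terminals, such as MIPv6, HMIPv6, and FMIPv6, have been developed, but they carry the burden of implementing mobility functions in the terminal. To solve this problem, the network-based PMIPv6 protocol has recently emerged. However, challenges such as route optimization and reducing handover latency remain. This paper therefore proposes a scheme that reduces latency by simplifying the user authentication procedure, and that solves the packet disordering problem by storing packets in a separate optical buffering space at the LMA during handover and then retransmitting them. Performance is evaluated with an analytical model, and the handover latency of the proposed scheme is 33% better than that of standard PMIPv6.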

Global Mobility Management Scheme for Seamless Mobile Multicasting Service Support in PMIPv6 Networks

  • Song, Myungseok;Cho, Jun-Dong;Jeong, Jong-Pil
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.9 no.2
    • /
    • pp.637-658
    • /
    • 2015
  • The development of multimedia applications has followed the development of high-speed networks. By improving the performance of mobile devices, it is possible to provide high-transfer-speed broadband and seamless mobile multicasting services between indoor and outdoor environments. Multicasting services support efficient group communications. However, mobile multicasting services have two constraints: tunnel convergence and handoff latency. Many protocols and handoff methods have been studied to solve these problems. In this paper, we propose an inter-local mobility anchor (inter-LMA) optimized handoff model for mobile multicasting services in proxy mobile IPv6-based (PMIPv6-based) networks. The proposed model removes the tunnel convergence issue and reduces router processing costs. Further, the proposed model allows fast handoff operations to be executed with adaptive transmission mechanisms. In addition, the proposed scheme exhibits lower packet delivery costs and handoff latency than existing schemes and ensures fast handoff when moving between LMA domains.

Design of 6-bit 800 Msample/s DSDA A/D Converter for HDD Read Channel (HDD 읽기 채널용 6-bit 800 Msample/s DSDA 아날로그/디지털 변환기의 설계)

  • Jeong, Dae-Yeong;Jeong, Gang-Min
    • The KIPS Transactions:PartA
    • /
    • v.9A no.1
    • /
    • pp.93-98
    • /
    • 2002
  • This paper introduces the design of a high-speed analog-to-digital converter (ADC) for hard disk drive (HDD) read channel applications. The circuit is based on a fast regenerative autozero comparator for high-speed, low-error-rate comparison, and on the Double Speed Dual ADC (DSDA) architecture for efficiently increasing the overall conversion speed of the ADC. A new type of thermometer-to-binary decoder suited to the autozero architecture is employed for glitch-free decoding, which simplifies the conventional structure significantly. The ADC is designed for 6-bit resolution, a maximum conversion rate of 800 Msample/s, 390 mW power dissipation, and one clock cycle of latency in 0.65 μm CMOS technology.
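
For readers unfamiliar with thermometer codes, the sketch below models an ideal 6-bit flash converter and a textbook ones-counting decoder. It is not the decoder proposed in the paper, whose structure the abstract does not describe. The function names and the uniform threshold spacing are assumptions.

```python
from typing import List


def flash_thermometer(vin: float, vref: float, bits: int = 6) -> List[int]:
    """Outputs of the 2**bits - 1 comparators, lowest threshold first.

    Assumes ideal comparators with thresholds at k * vref / 2**bits.
    """
    levels = 1 << bits
    return [1 if vin >= k * vref / levels else 0 for k in range(1, levels)]


def thermometer_to_binary(code: List[int]) -> int:
    """Decode by counting ones, which also tolerates isolated bubbles."""
    return sum(code)


code = flash_thermometer(0.5, 1.0)       # 63 comparator outputs
value = thermometer_to_binary(code)      # mid-scale input
```

Counting ones is only one of several decoding strategies. Hardware decoders usually trade this simplicity against speed and glitch behavior, which is the design space the abstract refers to.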

A Design of RS Decoder for MB-OFDM UWB (MB-OFDM UWB 를 위한 RS 복호기 설계)

  • Choi, Sung-Woo;Shin, Cheol-Ho;Choi, Sang-Sung
    • Proceedings of the Korea Electromagnetic Engineering Society Conference
    • /
    • 2005.11a
    • /
    • pp.131-136
    • /
    • 2005
  • UWB is a prominent wireless technology that transmits data at very high rates using low power over a wide frequency band. UWB makes it possible to transmit data at rates over 100 Mbps within 10 meters. To protect important header information, MB-OFDM UWB adopts a Reed-Solomon (23,17) code. The RS decoder in the receiver must achieve high speed and low latency with efficient hardware. In this paper, we present an RS decoder architecture for MB-OFDM UWB. We adopt the modified Euclidean algorithm for the key equation solver block, which is the most complex block in terms of area. We propose a pipelined processing cell for this block and show the detailed architecture of the syndrome, Chien search, and Forney algorithm blocks. Finally, we present the hardware implementation results of the RS decoder for ASIC implementation.
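
The first decoder stage named above, syndrome computation, can be sketched in software. The following Python sketch assumes GF(2^8) with the field polynomial x^8 + x^4 + x^3 + x^2 + 1 and generator roots α^1 through α^6; the abstract states neither, so both are assumptions for illustration, as are all function names. An (n, k) = (23, 17) code has six parity symbols and can therefore correct up to three symbol errors.

```python
PRIM_POLY = 0x11D      # assumed field polynomial: x^8 + x^4 + x^3 + x^2 + 1
N, K = 23, 17
NPAR = N - K           # 6 parity symbols, so up to NPAR // 2 = 3 correctable errors

# Exponent and logarithm tables for GF(2^8)
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a: int, b: int) -> int:
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def poly_eval(poly, x: int) -> int:
    """Horner evaluation; poly[0] is the highest-degree coefficient."""
    y = 0
    for c in poly:
        y = gf_mul(y, x) ^ c
    return y


def generator_poly():
    """Product of (x + alpha^i) for i = 1..NPAR (assumed root set)."""
    g = [1]
    for i in range(1, NPAR + 1):
        nxt = g + [0]                      # g(x) * x
        for j, coef in enumerate(g):
            nxt[j + 1] ^= gf_mul(coef, EXP[i])
        g = nxt
    return g


def encode(msg):
    """Systematic encoding: message symbols followed by NPAR parity symbols."""
    assert len(msg) == K
    g = generator_poly()
    rem = list(msg) + [0] * NPAR
    for i in range(K):                     # polynomial division by monic g(x)
        coef = rem[i]
        if coef:
            for j in range(1, len(g)):
                rem[i + j] ^= gf_mul(g[j], coef)
    return list(msg) + rem[K:]


def syndromes(word):
    """All zero for a valid codeword; nonzero entries signal errors."""
    return [poly_eval(word, EXP[i]) for i in range(1, NPAR + 1)]


codeword = encode(list(range(1, K + 1)))
corrupted = list(codeword)
corrupted[5] ^= 0x5A                       # inject a single symbol error
```

In a hardware decoder, the nonzero syndromes would feed the key equation solver, followed by the Chien search and Forney stages; those stages are omitted here.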

Architecture of RS decoder for MB-OFDM UWB

  • Choi, Sung-Woo;Choi, Sang-Sung;Lee, Han-Ho
    • Proceedings of the Korea Society of Information Technology Applications Conference
    • /
    • 2005.11a
    • /
    • pp.195-198
    • /
    • 2005
  • UWB is a prominent wireless technology that transmits data at very high rates using low power over a wide frequency band. UWB makes it possible to transmit data at rates over 100 Mbps within 10 meters. To protect important header information, MB-OFDM UWB adopts a Reed-Solomon (23,17) code. The RS decoder in the receiver must achieve high speed and low latency with efficient hardware. In this paper, we present an RS decoder architecture for MB-OFDM UWB. We adopt the modified Euclidean algorithm for the key equation solver block, which is the most complex block in terms of area. We propose a pipelined processing cell for this block and show the detailed architecture of the syndrome, Chien search, and Forney algorithm blocks. Finally, we present the hardware implementation results of the RS decoder for ASIC implementation.

A Model-based Methodology for Application Specific Energy Efficient Data path Design Using FPGAs (FPGA에서 에너지 효율이 높은 데이터 경로 구성을 위한 계층적 설계 방법)

  • Jang Ju-Wook;Lee Mi-Sook;Mohanty Sumit;Choi Seonil;Prasanna Viktor K.
    • The KIPS Transactions:PartA
    • /
    • v.12A no.5 s.95
    • /
    • pp.451-460
    • /
    • 2005
  • We present a methodology to design energy-efficient data paths using FPGAs. Our methodology integrates domain-specific modeling, coarse-grained performance evaluation, design space exploration, and low-level simulation to understand the tradeoffs between energy, latency, and area. The domain-specific modeling technique defines a high-level model by identifying various components and parameters specific to a domain that affect the system-wide energy dissipation. A domain is a family of architectures and corresponding algorithms for a given application kernel. The high-level model also includes functions for estimating energy, latency, and area that facilitate tradeoff analysis. Design space exploration (DSE) analyzes the design space defined by the domain and selects a set of designs. Low-level simulations are used for accurate performance estimation of the designs selected by the DSE and for final design selection. We illustrate our methodology using a family of architectures and algorithms for matrix multiplication. The designs identified by our methodology demonstrate tradeoffs among energy, latency, and area. We compare our designs with a vendor-specified matrix multiplication kernel to demonstrate the effectiveness of our methodology, using average power density (E/AT), defined as energy/(area × latency), as the metric for comparison. For various problem sizes, designs obtained using our methodology are on average 25% superior with respect to the E/AT metric compared with the state-of-the-art designs by Xilinx. We also discuss the implementation of our methodology using the MILAN framework.
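
The E/AT metric is a simple ratio, and a short calculation shows how two designs would be compared with it. In the sketch below, the function names, the units, and all numeric values are hypothetical placeholders rather than figures from the paper.

```python
def e_at(energy: float, area: float, latency: float) -> float:
    """Average power density: energy / (area * latency)."""
    return energy / (area * latency)


def percent_difference(baseline: float, candidate: float) -> float:
    """Relative change of `candidate` with respect to `baseline`, in percent."""
    return (baseline - candidate) / baseline * 100.0


# Hypothetical designs: (energy in nJ, area in slices, latency in microseconds)
vendor = e_at(energy=100.0, area=50.0, latency=2.0)
explored = e_at(energy=60.0, area=40.0, latency=2.0)
gap = percent_difference(vendor, explored)
```

Because the metric divides by both area and latency, a design can change its E/AT value by trading any of the three quantities, which is why the methodology reports energy, latency, and area together.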

Performance Comparison of Timestamp based Fair Packet Schedulers in Server Resource Utilization (서버자원 이용도 측면에서 타임스탬프 기반 공평 패킷 스케줄러의 성능 비교 분석)

  • Kim Tae-Joon;Ahn Hyo-Beom
    • The KIPS Transactions:PartC
    • /
    • v.13C no.2 s.105
    • /
    • pp.203-210
    • /
    • 2006
  • Fair packet scheduling algorithms that support the quality of service of real-time multimedia applications can be classified into two design schemes according to the reference time used to calculate the timestamp of an arriving packet: the Finish-time Design (FD) and Start-time Design (SD) schemes. Since the former can adjust the latency of a flow by raising the flow's reserved rate, it has been applied to routers for the guaranteed service of the IETF (Internet Engineering Task Force) IntServ model. However, the FD scheme may incur severe bandwidth loss for traffic flows that require a low rate but a tight delay bound, such as Internet telephony. To verify the usefulness of an SD-based router for the IETF guaranteed service, this paper analyzes and compares the two design schemes in terms of bandwidth and payload utilization. It is proved analytically that the SD scheme achieves better bandwidth utilization than the FD scheme, and the simulation results show that the SD scheme gives up to 20% better payload utilization.
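
The difference between the two schemes lies in which tag orders the packets. The sketch below uses the common textbook formulation of start and finish tags; the abstract does not give the paper's exact equations, so the formulas, the caller-supplied virtual time, and the names `Flow` and `stamp` are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Flow:
    rate: float                 # reserved rate in bits per second
    last_finish: float = 0.0    # finish tag of the flow's previous packet


def stamp(flow: Flow, length_bits: float,
          virtual_time: float) -> Tuple[float, float]:
    """Return (start_tag, finish_tag) for an arriving packet.

    An FD scheduler would serve packets in finish-tag order and an SD
    scheduler in start-tag order. How virtual time advances is left to
    the caller in this sketch.
    """
    start = max(virtual_time, flow.last_finish)
    finish = start + length_bits / flow.rate
    flow.last_finish = finish
    return start, finish


voice = Flow(rate=1000.0)
first = stamp(voice, 500, virtual_time=0.0)    # flow was idle
second = stamp(voice, 500, virtual_time=0.2)   # flow is backlogged
```

The finish tag grows by `length_bits / rate`, so under finish-tag ordering a low-rate flow waits longer unless its reserved rate is raised. That is the bandwidth cost the abstract attributes to the FD scheme for low-rate, delay-sensitive traffic.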