• Title/Summary/Keyword: MPEG-4 Visual


An Efficient Motion Search Algorithm for a Media Processor (미디어 프로세서에 적합한 효율적인 움직임 탐색 알고리즘)

  • Noh Dae-Young;Kim Seang-Hoon;Sohn Chae-Bong;Oh Seoung-Jun;Ahn Chang-Beam
    • Journal of Broadcast Engineering / v.9 no.4 s.25 / pp.434-445 / 2004
  • Motion estimation is an essential module in video encoders based on international standards such as H.263 and MPEG. Many fast motion estimation algorithms have been proposed to reduce the computational complexity of the well-known full search algorithm (FS). However, these fast algorithms cannot work efficiently on DSP processors recently developed for video processing. To solve this, we propose an efficient motion estimation scheme optimized for a DSP processor such as the Philips TM1300. In the proposed scheme, a motion vector predictor is pre-estimated and a small search range is chosen, exploiting the strong motion vector correlation between a current macroblock (MB) and its neighboring MBs to reduce computation time. An MPEG-4 SP@L3 (Simple Profile at Level 3) encoding system is implemented on the Philips TM1300 to verify the effectiveness of the proposed method. On that processor, our method achieves better performance than other conventional ones while keeping visual quality as good as that of the FS.
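The core idea in the abstract above, predicting a motion vector from neighboring macroblocks and then refining it within a small window, can be sketched as follows. This is a minimal illustration only; the function names, the median predictor, and the SAD refinement are assumptions, not the paper's exact algorithm or its TM1300 optimizations.

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum())

def predict_and_refine_mv(cur, ref, mb_x, mb_y, neighbor_mvs, mb_size=16, search_range=2):
    """Estimate one macroblock's motion vector: start from a predictor derived
    from already-estimated neighboring MVs (median, a common heuristic) and
    refine it with a SAD search over a small +/- search_range window."""
    cur_mb = cur[mb_y:mb_y + mb_size, mb_x:mb_x + mb_size]
    pred = (np.median(np.asarray(neighbor_mvs), axis=0).astype(int)
            if neighbor_mvs else np.array([0, 0]))
    h, w = ref.shape
    best_mv, best_cost = pred.copy(), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = mb_x + pred[0] + dx, mb_y + pred[1] + dy
            if 0 <= x <= w - mb_size and 0 <= y <= h - mb_size:
                cost = sad(cur_mb, ref[y:y + mb_size, x:x + mb_size])
                if best_cost is None or cost < best_cost:
                    best_cost, best_mv = cost, np.array([pred[0] + dx, pred[1] + dy])
    return best_mv, best_cost
```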

A 3D Audio-Visual Animated Agent for Expressive Conversational Question Answering

  • Martin, J.C.;Jacquemin, C.;Pointal, L.;Katz, B.
    • Korea Information Convergence Society Conference Proceedings (한국정보컨버전스학회 학술대회논문집) / 2008.06a / pp.53-56 / 2008
  • This paper reports on the ACQA (Animated agent for Conversational Question Answering) project conducted at LIMSI. The aim is to design an expressive animated conversational agent (ACA) for conducting research along two main lines: (1) perceptual experiments (e.g., perception of expressivity and 3D movements in both the audio and visual channels); (2) the design of human-computer interfaces requiring head models at different resolutions and the integration of the talking head in virtual scenes. The target application of this expressive ACA is a real-time, speech-based question answering system developed at LIMSI (RITEL). The architecture of the system is based on distributed modules exchanging messages through a network protocol. The main components of the system are: RITEL, a question answering system searching raw text, which produces a text (the answer) and attitudinal information; the attitudinal information is then processed to deliver expressive tags; the text is converted into phoneme, viseme, and prosodic descriptions. Audio speech is generated by the LIMSI selection-concatenation text-to-speech engine. Visual speech uses MPEG-4 keypoint-based animation and is rendered in real time by Virtual Choreographer (VirChor), a GPU-based 3D engine. Finally, visual and audio speech is played in a 3D audio and visual scene. The project also puts considerable effort into realistic visual and audio 3D rendering. A new model of phoneme-dependent human radiation patterns is included in the speech synthesis system, so that the ACA can move in the virtual scene with realistic 3D visual and audio rendering.


Sensible Media Simulation in an Automobile Application and Human Responses to Sensory Effects

  • Kim, Sang-Kyun;Joo, Yong-Soo;Lee, YoungMi
    • ETRI Journal / v.35 no.6 / pp.1001-1010 / 2013
  • A sensible media simulation system for automobiles is introduced to open up new possibilities for an in-car entertainment system. In this paper, the system architecture is presented, which includes a virtuality-to-reality adaptation scheme. Standard data schemes for context and control information from the International Standard MPEG-V (ISO/IEC 23005) are introduced to explain the details of data formats, which are interchangeable in the system. A sensible media simulator and the implementation of a sensory device are presented to prove the effectiveness of the proposed system. Finally, a correlation between learning styles and sensory effects (that is, wind and vibration effects) is statistically analyzed using the proposed system. The experiment results show that the level of satisfaction with the sensory effects is unaffected overall by the learning styles of the test subjects. Stimulations by vibration effects, however, generate more satisfaction in people with a high tactile perception level or a low visual perception level.

Scene Change Adaptive Bit Rate Control Using Local Variance (국부 분산을 이용한 장면 전환 적응 비트율 제어)

  • 이호영;김기석;박영식;송근원;남재열;하영호
    • The Journal of Korean Institute of Communications and Information Sciences / v.22 no.4 / pp.675-684 / 1997
  • A bit rate control algorithm capable of handling scene changes is proposed. In MPEG-2 TM5, block variance is used to measure block activity, but block variance is not consistent with the human visual system and does not reflect the distribution of pixel values within the block. In the target bit allocation process of TM5, a global complexity measure, obtained from previously coded pictures, is used. Since I pictures are spaced relatively far apart, their complexity estimate is not very accurate. In the proposed algorithm, local variance is used to measure block activity and to detect scene changes. Local variance, computed from the deviation of each pixel from the mean of its neighboring pixels, represents the distribution of pixel values within the block well. If a scene change is detected, the local variance information is used in the target bit allocation process. When allocating target bits for an I picture, the average local variance difference between the previous and current I pictures is considered. The experimental results show that the proposed algorithm detects scene changes precisely and gives better picture quality and higher PSNR values than MPEG-2 TM5.
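As a rough illustration of the activity measure described above, the sketch below computes a per-pixel local variance (squared deviation from the mean of a small neighborhood), averages it per block, and flags a scene change when the average activity jumps. The window size, threshold, and decision rule are assumptions for illustration, not the paper's actual parameters.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_variance_map(frame, window=3):
    """Per-pixel local variance over a small window: E[x^2] - (E[x])^2,
    i.e. the mean squared deviation from the neighborhood mean."""
    f = frame.astype(np.float64)
    mean = uniform_filter(f, size=window)
    mean_sq = uniform_filter(f * f, size=window)
    return np.maximum(mean_sq - mean * mean, 0.0)

def block_activity(frame, block=16, window=3):
    """Average local variance per block, used here as the block activity measure."""
    lv = local_variance_map(frame, window)
    h, w = (lv.shape[0] // block) * block, (lv.shape[1] // block) * block
    return lv[:h, :w].reshape(h // block, block, w // block, block).mean(axis=(1, 3))

def is_scene_change(prev_frame, cur_frame, threshold=50.0):
    """Flag a scene change when mean block activity changes by more than an
    (assumed) threshold between consecutive pictures."""
    return abs(block_activity(cur_frame).mean() - block_activity(prev_frame).mean()) > threshold
```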


A new approach for content-based video retrieval

  • Kim, Nac-Woo;Lee, Byung-Tak;Koh, Jai-Sang;Song, Ho-Young
    • International Journal of Contents / v.4 no.2 / pp.24-28 / 2008
  • In this paper, we propose a new approach for content-based video retrieval using non-parametric motion classification in a shot-based video indexing structure. The proposed system supports real-time video retrieval through spatio-temporal feature comparison, measuring the similarity between visual features and between motion features after extracting a representative frame and non-parametric motion information from shot-based video clips segmented by a scene change detection method. Non-parametric motion features are extracted, after normalized motion vectors are created from an MPEG-compressed stream, by discretizing each normalized motion vector into angle bins and considering the mean, variance, and direction of the motion vectors in these bins. To obtain the visual feature of a representative frame, we use an edge-based spatial descriptor. Experimental results show that our approach is superior to conventional methods in video indexing and retrieval performance.
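A minimal sketch of the kind of non-parametric motion descriptor described above is given below: normalized motion vectors are quantized into angle bins and simple per-bin statistics are kept. The bin count, the chosen statistics, and the L1 shot comparison are illustrative assumptions rather than the paper's exact feature definition.

```python
import numpy as np

def motion_descriptor(motion_vectors, n_bins=8):
    """Quantize each motion vector's direction into n_bins angle bins and keep,
    per bin, the vector count plus the mean and variance of the magnitudes."""
    mv = np.asarray(motion_vectors, dtype=np.float64).reshape(-1, 2)  # (dx, dy) pairs
    mags = np.hypot(mv[:, 0], mv[:, 1])
    mv, mags = mv[mags > 0], mags[mags > 0]                # drop zero vectors
    angles = np.arctan2(mv[:, 1], mv[:, 0])                # direction in [-pi, pi)
    bins = ((angles + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    feats = []
    for b in range(n_bins):
        m = mags[bins == b]
        feats += [m.size, m.mean() if m.size else 0.0, m.var() if m.size else 0.0]
    return np.asarray(feats)

def shot_distance(desc_a, desc_b):
    """Compare two shots' motion descriptors with a simple L1 distance."""
    return float(np.abs(desc_a - desc_b).sum())
```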

Multipoint multimedia communication service in broadband ISDN, part I: conversational communication in the DAVIC STB environment (광대역ISDN상의 다지점 멀티미디어 통신서비스 I부:DAVIC 표준 STB에서의 대화형 멀티미디어통신)

  • 황대환;이종형;박영덕;조규섭
    • The Journal of Korean Institute of Communications and Information Sciences / v.23 no.4 / pp.821-835 / 1998
  • The Digital Audio-Visual Council (DAVIC), which was established to develop useful multimedia communication services, has completed specifications for providing on-demand services such as Movie on Demand (MoD) and teleshopping and for accommodating Internet service, and is now proceeding with work to support conversational communication services such as Plain Old Telephone Service (POTS), video telephony, and video teleconferencing. In this paper, we propose an efficient terminal architecture that can provide conversational multimedia communication services in DAVIC Set-Top Box (STB) environments. To apply the implemented conversational terminal to the multipoint communication environment, we considered the Quality of Service (QoS) factors that determine the grade of conversational communication service. We also present an inter-working scheme and the system structure needed to satisfy QoS, using a new MPEG video bridge that guarantees end-to-end delay requirements, a major element of QoS for real-time communication, without visual quality degradation.


Semantic Event Detection and Summary for TV Golf Program Using MPEG-7 Descriptors (MPEG-7 기술자를 이용한 TV 골프 프로그램의 이벤트검출 및 요약)

  • 김천석;이희경;남제호;강경옥;노용만
    • Journal of Broadcast Engineering / v.7 no.2 / pp.96-106 / 2002
  • We introduce a novel scheme to characterize and index events in TV golf programs using MPEG-7 descriptors. Our goal is to identify and localize the golf events of interest to facilitate highlight-based video indexing and summarization. In particular, we analyze multiple (low-level) visual features using a domain-specific model to create a perceptual relation for semantically meaningful (high-level) event identification. Furthermore, we summarize a TV golf program with TV-Anytime segmentation metadata, a standard XML-based metadata description, in which the golf events are represented by temporally localized segments and segment groups of highlights. Experimental results show that our proposed technique provides reasonable performance for identifying a variety of golf events.
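To illustrate the flavor of the segment-based summary mentioned above, the sketch below emits a small XML description of highlight segments using Python's ElementTree. The element names are simplified placeholders, not the actual TV-Anytime segmentation metadata schema, and the segment values are invented for the example.

```python
import xml.etree.ElementTree as ET

def build_highlight_summary(program_id, segments):
    """Build a simplified, TV-Anytime-like segment description: one group of
    highlight segments, each with a start time and duration in seconds.
    (Element names are illustrative, not the real TV-Anytime schema.)"""
    root = ET.Element("SegmentInformationTable", programId=program_id)
    group = ET.SubElement(root, "SegmentGroup", groupType="highlights")
    for seg_id, title, start_s, dur_s in segments:
        seg = ET.SubElement(group, "Segment", segmentId=seg_id)
        ET.SubElement(seg, "Title").text = title
        ET.SubElement(seg, "MediaTime", start=str(start_s), duration=str(dur_s))
    return ET.tostring(root, encoding="unicode")

# Example with two hypothetical golf events (values made up for illustration).
print(build_highlight_summary("golf_program_1",
                              [("seg1", "Tee shot", 125, 12),
                               ("seg2", "Putting", 940, 18)]))
```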

Multi-format motion picture storage subsystem using DirectShow filters for a multichannel visual monitoring system (다채널 영상 감시 시스템을 위한 다중 포맷 동영상 저장 DirectShow Filter설계 및 구현)

  • 정연권;하상석;정선태
    • Proceedings of the IEEK Conference / 2002.06d / pp.113-116 / 2002
  • Windows provides DirectShow for efficient multimedia streaming processing such as multimedia capture, storage, and display. Many motion picture and audio codecs are now available within the DirectShow framework, and Windows also supports many codecs (MPEG-4, H.263, WMV, WMA, ASF, etc.) in addition to a number of useful tools for multimedia streaming processing. DirectShow can therefore be used effectively to develop Windows-based multimedia streaming applications such as visual monitoring systems, which need to store real-time video data for later retrieval. In this paper, we present our work on developing a DirectShow filter system that supports the storage of motion pictures in various motion picture codecs. Our DirectShow filter system also provides an additional motion detection function.


Video Segmentation using the Level Set Method (Level Set 방법을 이용한 영상분할 알고리즘)

  • 김대희;호요성
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.5 / pp.303-311 / 2003
  • Since the MPEG-4 visual standard enables content-based functionalities, it is necessary to extract video objects from natural video sequences. Segmentation algorithms can largely be classified into automatic segmentation and user-assisted segmentation. In this paper, we propose a user-assisted VOP generation method based on the geometric active contour. Since the geometric active contour, unlike the parametric active contour, employs the level set method to evolve the curve, we can draw the initial curve independently of the shape of the object. In order to generate the edge function from a smoothed image, we propose a vector-valued diffusion process in the LUV color space. We also present a discrete 3-D diffusion model for easy implementation. By combining curve shrinkage in the vector field space with curve expansion in the empty vector space, we can accurately extract visual objects from video sequences.
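The level-set curve evolution mentioned above can be illustrated with a bare-bones geodesic active contour update, in which the embedding function phi moves according to an edge-stopping function, curvature, and a balloon force. This is a generic textbook-style sketch under assumed parameters, not the paper's vector-valued diffusion or its exact evolution equation.

```python
import numpy as np

def edge_stopping_function(image):
    """g = 1 / (1 + |grad I|^2): close to zero on strong edges so the curve stops there."""
    gy, gx = np.gradient(image.astype(np.float64))
    return 1.0 / (1.0 + gx ** 2 + gy ** 2)

def level_set_step(phi, g, balloon=1.0, dt=0.1, eps=1e-8):
    """One explicit geodesic-active-contour update:
    phi <- phi + dt * g * |grad phi| * (curvature + balloon).
    The balloon term's sign controls expansion vs. shrinkage of the zero level set."""
    gy, gx = np.gradient(phi)
    norm = np.sqrt(gx ** 2 + gy ** 2) + eps
    d_gy_dy, _ = np.gradient(gy / norm)
    _, d_gx_dx = np.gradient(gx / norm)
    curvature = d_gx_dx + d_gy_dy          # divergence of the unit normal field
    return phi + dt * g * norm * (curvature + balloon)

# Usage sketch: initialize phi as a signed distance to a rough initial curve
# (e.g., a circle), then iterate level_set_step until the change is small.
```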

A Study on Parallel Processing System for Automatic Segmentation of Moving Object in Image Sequences

  • Lee, Hyung;Park, Jong-Won
    • Proceedings of the IEEK Conference / 2000.07a / pp.429-432 / 2000
  • The new MPEG-4 video coding standard enables content-based functionalities. To support the philosophy of the MPEG-4 visual standard, each frame of a video sequence should be represented in terms of video object planes (VOPs). In other words, the video objects to be encoded in still pictures or video sequences should be prepared before the encoding process starts, which requires a prior decomposition of the sequences into VOPs so that each VOP represents a moving object. Because automatic segmentation is time-consuming, a parallel processing system is required for it to run in real time. This paper addresses a parallel processing system for the automatic segmentation that separates moving objects from the background in image sequences. The proposed parallel processing system comprises processing elements (PEs) and a multi-access memory system (MAMS). The multi-access memory system is a memory controller that performs parallel memory access in a variety of patterns: horizontal, vertical, and block access. To realize these access patterns, the multi-access memory system consists of a memory module selection module, data routing modules, and an address calculation and routing module. The proposed system is simulated and evaluated with the CADENCE Verilog-XL hardware simulation package.
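To give a feel for what a multi-access memory system has to achieve, the sketch below models a simple skewed storage scheme (an illustrative assumption, not the paper's MAMS design) and checks that horizontal, vertical, and block accesses each hit distinct memory modules and can therefore be served in parallel.

```python
# Illustrative skewed-storage model: with 5 memory modules and the mapping
# module(x, y) = (x + 2*y) mod 5, a 4-pixel horizontal run, a 4-pixel vertical
# run, and a 2x2 block all touch distinct modules (conflict-free access).
MODULES = 5

def module_of(x, y):
    """Memory module that stores pixel (x, y) under the assumed skewing."""
    return (x + 2 * y) % MODULES

def conflict_free(pixels):
    """True if all pixels in one access fall into different memory modules."""
    modules = [module_of(x, y) for x, y in pixels]
    return len(set(modules)) == len(modules)

def access_patterns(x, y):
    """Horizontal, vertical, and 2x2 block access patterns anchored at (x, y)."""
    return {
        "horizontal": [(x + i, y) for i in range(4)],
        "vertical":   [(x, y + i) for i in range(4)],
        "block_2x2":  [(x + i, y + j) for j in range(2) for i in range(2)],
    }

if __name__ == "__main__":
    # Verify conflict-freedom for every anchor position in a small image.
    for y in range(32):
        for x in range(32):
            for name, pixels in access_patterns(x, y).items():
                assert conflict_free(pixels), (name, x, y)
    print("horizontal, vertical, and 2x2 block accesses are conflict-free")
```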
