• Title/Summary/Keyword: short video


ViStoryNet: Neural Networks with Successive Event Order Embedding and BiLSTMs for Video Story Regeneration (ViStoryNet: 비디오 스토리 재현을 위한 연속 이벤트 임베딩 및 BiLSTM 기반 신경망)

  • Heo, Min-Oh;Kim, Kyung-Min;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.24 no.3 / pp.138-144 / 2018
  • A video is a vivid medium that resembles humans' visual-linguistic experience, since it can convey a sequence of situations, actions, or dialogues that can be told as a story. In this study, we propose story learning and regeneration frameworks for videos that use successive event order supervision for contextual coherence. The supervision induces each episode to form a trajectory in the latent space, which yields a composite representation of ordering and semantics. We use kids' videos as training data. Their advantages include an omnibus style, short, simple, and explicit storylines, chronological narrative order, and a relatively limited number of characters and spatial environments. We build an encoder-decoder structure with successive event order embedding (SEOE) and train bi-directional LSTMs as sequence models for multi-step sequence prediction. Using approximately 200 episodes of the kids' video series 'Pororo the Little Penguin', we report empirical results for story regeneration tasks and SEOE. In addition, each episode shows a trajectory-like shape in the latent space of the model, which provides geometric information for the sequence models.

Estimating the Optimal Buffer Size on Mobile Devices for Increasing the Quality of Video Streaming Services (동영상 재생 품질 향상을 위한 최적 버퍼 수준 결정)

  • Park, Hyun Min
    • The Journal of the Korea Contents Association / v.18 no.3 / pp.34-40 / 2018
  • In this study, the optimal buffer size is calculated for seamless video playback on a mobile device. The buffer is the memory space that holds multimedia packets arriving at the mobile device for video playback, as in a VOD service. If the buffer size is too large, the latency before video playback becomes longer. However, if it is too small, playback can pause because too few packets have arrived. Hence, the optimal buffer size ensures the QoS of video playback on mobile devices. We model the buffering process as a discrete-time queueing model. The mean busy period length and mean waiting time of the Geo/G/1 queue with N-policy are analyzed. We then use these main performance measures to present numerical examples for deciding the optimal buffer size on mobile devices. Our results enhance user satisfaction by ensuring seamless playback and minimizing the initial delay in the VOD streaming process.
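
The trade-off described in this abstract can be illustrated with a small slot-by-slot simulation. This is only a sketch under stated assumptions, not the paper's analytical model: arrivals are Bernoulli per slot (so inter-arrival times are geometric), playback consumes exactly one packet per slot (a simplification of the general service time in Geo/G/1), and playback starts or resumes only once N packets are buffered. The function name `simulate_playback` and all of its parameters are hypothetical.

```python
import random


def simulate_playback(n_threshold, arrival_prob, total_packets, seed=0):
    """Simulate buffering with an N-policy start threshold.

    Assumptions (not from the paper): one packet arrives per slot with
    probability arrival_prob, and playback consumes one packet per slot.
    Returns (initial_delay_in_slots, number_of_stalls).
    """
    rng = random.Random(seed)
    buffered = played = arrived = 0
    playing = False
    slot = 0
    initial_delay = None
    stalls = 0
    while played < total_packets:
        slot += 1
        # Bernoulli arrival in this slot, until the whole video has arrived.
        if arrived < total_packets and rng.random() < arrival_prob:
            arrived += 1
            buffered += 1
        # N-policy: stay idle until N packets wait (or nothing more will come).
        everything_arrived = arrived == total_packets and buffered > 0
        if not playing and (buffered >= n_threshold or everything_arrived):
            playing = True
            if initial_delay is None:
                initial_delay = slot
        if playing:
            if buffered > 0:
                buffered -= 1
                played += 1
            else:
                # Buffer underrun: playback pauses and rebuffers up to N.
                playing = False
                stalls += 1
    return initial_delay, stalls


if __name__ == "__main__":
    for n in (1, 5, 20):
        delay, stalls = simulate_playback(n, arrival_prob=0.9, total_packets=500)
        print(f"N={n:2d}  initial delay={delay} slots  stalls={stalls}")
```

Raising the threshold always lengthens the initial delay in this sketch, because playback cannot start before the N-th packet has arrived. How much it reduces stalls depends on the arrival probability.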

Bandwidth Efficient Harmonic Staggered Broadcasting Method for Multimedia on-Demand Services (주문형 멀티미디어 서비스를 위한 대역폭 효율적인 하모닉 스태거드 전송 기술)

  • Kim, Hong-Ik;Park, Sung-Kwon
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.12B / pp.1076-1086 / 2006
  • In providing video-on-demand (VoD) services to many clients over networks, the bandwidth requirements of video transmission restrict the service. For this reason, many broadcasting-based VoD schemes have been proposed to support these services efficiently. However, broadcasting-based VoD schemes require frequency channel hopping, the use of many channels at the same time, and the management of many video segments, which makes them difficult to implement. In this paper, we propose a Harmonic Staggered broadcasting scheme that has a simple structure and substantially improved VoD efficiency. The numerical results demonstrate that the viewer's waiting time under the Harmonic Staggered broadcasting scheme is close to that of the harmonic broadcasting scheme, and that its maximum buffer requirement can be adapted to the demand rate by adjusting the size of the short front part of the video.
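
For context, the two baseline schemes that the name combines can be compared with a short calculation. This sketch uses the commonly cited textbook forms and is not the proposed Harmonic Staggered scheme, whose details the abstract does not give. It assumes that staggered broadcasting repeats the whole video on k channels at the playback rate with start times offset by D/k, and that harmonic broadcasting splits the video into n equal segments and sends segment i at 1/i of the playback rate. The function names are hypothetical, and bandwidth is expressed in multiples of the playback rate.

```python
from fractions import Fraction


def harmonic_number(n):
    """Exact value of H_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


def staggered_broadcast(video_minutes, channels):
    """Return (server bandwidth, worst-case wait in minutes).

    Each channel carries the full video at the playback rate, with start
    times staggered by video_minutes / channels.
    """
    return float(channels), video_minutes / channels


def harmonic_broadcast(video_minutes, segments):
    """Return (server bandwidth, worst-case wait in minutes).

    Segment i is broadcast at 1/i of the playback rate, so the total
    bandwidth is H_n; a viewer waits at most one segment length.
    """
    return float(harmonic_number(segments)), video_minutes / segments


if __name__ == "__main__":
    print(staggered_broadcast(120, 4))
    print(harmonic_broadcast(120, 30))
```

For a 120-minute video, 30 harmonic segments need a little under four playback-rate channels of bandwidth and cap the wait at 4 minutes, while four staggered channels cap it at 30 minutes. The price of the harmonic scheme is the segment and channel management that the abstract cites as an implementation barrier.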

Fast Staggered Data Broadcasting and Receiving Scheme for Simple and Efficient Video on Demand Services (주문형 비디오 서비스의 복잡성와 대역폭 효율을 개선한 Fast Staggered 방식)

  • Kim, Hong-Ik;Park, Sung-Kwon
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.7B / pp.668-678 / 2006
  • In designing a video-on-demand (VoD) system, the major challenges are how to reduce the viewer's waiting time under a given bandwidth allocation and how to reduce the client's buffer requirements. Many VoD schemes have been proposed to solve these problems. However, most of them require managing many video segments, frequency channel hopping, and the use of many channels at the same time, so their complexity is a barrier to implementation. In this paper, we propose a fast staggered broadcasting scheme that has a simple structure and substantially improved VoD efficiency. The numerical results demonstrate that the viewer's waiting time under the fast staggered broadcasting scheme is close to that of the fast broadcasting scheme, and that its buffer requirement can be adapted to the demand rate by adjusting the size of the short front part of the video.

The Implementation of DSP-Based Real-Time Video Transmission System using In-Vehicle Multimedia Network (차량 내 멀티미디어 네트워크를 이용한 DSP 기반 실시간 영상 전송 시스템의 구현)

  • Jeon, Young-Joon;Kim, Jin-Il
    • Journal of the Institute of Convergence Signal Processing / v.14 no.1 / pp.62-69 / 2013
  • This paper proposes a real-time video transmission system for car-mounted cameras based on the MOST network. Existing vehicles transmit video from car-mounted cameras over analog connections. However, the increase in the number of car-mounted cameras has driven the development of networks to connect them. In this paper, a DSP performs the MPEG-2 encoding/decoding so that video can be processed quickly enough for real-time transmission. MediaLB is employed to transfer the data stream between the DSP and the MOST network controller. However, the DSP cannot receive the data stream directly from MediaLB, so an FPGA is used to deliver the data stream from MediaLB to the DSP. MediaLB is designed to streamline hardware/software application development for the MOST network and to support all MOST network data transport methods. The test results verify that real-time video transmission using the proposed system operates normally.

Feasibility Study on Audio-Tactile Display via Spectral Modulation (스펙트럼 변조를 이용한 청각정보의 촉감재현 가능성 연구)

  • Kwak, Hyun-Koo;Kim, Whee-Kuk;Chung, Ju-No;Kang, Dae-Im;Park, Yon-Kyu;Koo, Min-Mo
    • Journal of the Korean Society for Precision Engineering / v.28 no.5 / pp.638-647 / 2011
  • Various approaches that directly use speaker vibrations have been suggested to convey aural information such as music to people who are deaf or hearing-impaired. However, with these approaches a person cannot sense frequency information above the maximum perceivable vibro-tactile frequency (around 1 kHz). Therefore, this study proposes a spectral modulation approach that compresses high-frequency audio information into the perceivable vibro-tactile frequency range and outputs the modulated signals through designated speakers. Simulations using the Short-Time Fourier Transform (STFT) with Hanning windows, and preliminary experiments on a vibro-tactile display testbed built and interfaced with a notebook PC, show that the modulated signal of a natural sound composed of a frog, a bird, and a water stream yields a noise-free signal suitable for vibro-tactile speakers without causing significant interfering disturbances. Lastly, the effects on a subject's sense of reality and emotion in response to audio-video clips containing various sounds and images are investigated and compared for three combinations of information: i) video only, ii) video with the proposed modulated vibro-tactile stimuli applied to the subject's forearm, and iii) video with full audio. The experimental results show that providing modulated vibro-tactile stimuli along with video is highly feasible as a way to transmit a pseudo-aural sense.

WebCam : A Web-based Remote Recordable Surveillance System using Index Search Algorithm (웹캠 : 새로운 인데스검색 알고리듬을 이용한 웹기반 원격 녹화 보안 시스템)

  • Lee, Myeong-Ok;Lee, Eun-Mi
    • The KIPS Transactions:PartC / v.9C no.1 / pp.9-16 / 2002
  • Because existing analog video surveillance systems could save and retrieve data only within a limited space over short distances, they faced many constraints in being developed into diverse application systems. However, thanks to the development of the Internet and computer technologies, digital video surveillance systems can be controlled from a remote location through a web browser without spatial limits. Moreover, data compression and management technologies with an Index Search algorithm make it possible to handle, store, and retrieve large amounts of data efficiently, and a motion detection algorithm improves recording speed and efficiency in practical use. Using these algorithms, we present a practical remote recordable video surveillance system called WebCam. The WebCam server system can intelligently record and save digitized video images through efficient database management, monitor and control remote cameras with user authentication, and search logs.

Video Compression Standard Prediction using Attention-based Bidirectional LSTM (어텐션 알고리듬 기반 양방향성 LSTM을 이용한 동영상의 압축 표준 예측)

  • Kim, Sangmin;Park, Bumjun;Jeong, Jechang
    • Journal of Broadcast Engineering / v.24 no.5 / pp.870-878 / 2019
  • In this paper, we propose an attention-based BLSTM for predicting the compression standard of a video. Recently, in NLP, much research has used RNN structures to predict the next word of a sentence and to classify and translate sentences by their semantics, and the results have been commercialized as chatbots, AI speakers, translation applications, and so on. LSTM is designed to solve the vanishing gradient problem in RNNs and is used in NLP. The proposed algorithm makes video compression standard prediction possible by applying a BLSTM and an attention algorithm, which focuses on the most important word in a sentence, to the bitstream of a video rather than to a natural-language sentence.

Video Representation via Fusion of Static and Motion Features Applied to Human Activity Recognition

  • Arif, Sheeraz;Wang, Jing;Fei, Zesong;Hussain, Fida
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.7 / pp.3599-3619 / 2019
  • In human activity recognition systems, both static and motion information play a crucial role in achieving efficient and competitive results. Most existing methods extract video features inadequately and cannot examine how much each component (static and motion) contributes. Our work highlights this problem and proposes a Static-Motion fused features descriptor (SMFD), which leverages both static and motion features in the form of a descriptor. First, static features are learned by a two-stream 3D convolutional neural network. Second, trajectories are extracted by tracking key points, and only trajectories located in the central region of the original video frame are selected, in order to reduce irrelevant background trajectories as well as computational complexity. Then, shape and motion descriptors are obtained along with key points by using SIFT flow. Next, a Cholesky transformation is introduced to fuse the static and motion feature vectors and guarantee the equal contribution of all descriptors. Finally, a Long Short-Term Memory (LSTM) network is used to discover long-term temporal dependencies and make the final prediction. To confirm the effectiveness of the proposed approach, extensive experiments were conducted on three well-known datasets: UCF101, HMDB51, and YouTube. The findings show that the resulting recognition system is on par with state-of-the-art methods.

Development of a Portable RPV for Short-range Operations (근거리 원격탐색용 휴대용 무인기의 구성에 관한 연구)

  • 박주원
    • Journal of the Korea Institute of Military Science and Technology / v.4 no.2 / pp.227-232 / 2001
  • Presented is a small and handy remotely piloted vehicle (RPV) that can be used for military and non-military surveillance operations. The RPV is equipped with an on-board high-resolution color camera that transmits analog video images and on-board electronics that provide real-time flight information to the pilot, enabling remote piloting within a 5 km radius. This paper describes the RPV system, including its design, manufacturing, and flight test results, which demonstrate the stability of the on-board mission and flight equipment as well as the remote piloting capability. A future plan for the necessary improvements identified from the flight tests is also discussed.
