• Title/Summary/Keyword: Video modeling

A Study for properties of Spline to 3D game modeling (3D 게임 모델링을 위한 Spline 특성 연구)

  • Cho, Hyung-Ik
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2012.05a
    • /
    • pp.433-436
    • /
    • 2012
  • Today, owing to advances in technology, 3D graphics have become an essential element of game graphics. When game companies build a game with 3D graphics, players can enjoy photo-realistic visuals that 2D graphics cannot match, and developers gain many advantages because basic and special effects are easier to implement; as a result, 3D games have become the mainstream of the video game business. This paper examines the characteristics of the 2D splines that underlie the various kinds of 3D modeling needed for 3D game graphics, compares and analyzes the merits and demerits of each kind of spline, and traces their development history.
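
The trade-offs such a comparison covers can be seen in two common cubic splines: a Catmull-Rom spline interpolates its control points (it passes through them), while a uniform cubic B-spline only approximates them in exchange for smoother (C2) joins. A minimal sketch using the standard basis matrices (generic formulas, not taken from the paper):

```python
import numpy as np

# Basis matrices for two common cubic splines. A segment uses four control
# points p0..p3 and a parameter t in [0, 1]. Catmull-Rom passes through
# p1 (t=0) and p2 (t=1); the uniform cubic B-spline blends them instead.
CATMULL_ROM = 0.5 * np.array([[-1, 3, -3, 1],
                              [2, -5, 4, -1],
                              [-1, 0, 1, 0],
                              [0, 2, 0, 0]])
B_SPLINE = (1 / 6) * np.array([[-1, 3, -3, 1],
                               [3, -6, 3, 0],
                               [-3, 0, 3, 0],
                               [1, 4, 1, 0]])

def eval_segment(basis, pts, t):
    """Evaluate one cubic segment at parameter t given 4 control points."""
    T = np.array([t ** 3, t ** 2, t, 1.0])
    return T @ basis @ np.asarray(pts, dtype=float)

pts = [(0, 0), (1, 2), (3, 2), (4, 0)]
start_cr = eval_segment(CATMULL_ROM, pts, 0.0)  # lands exactly on p1
end_cr = eval_segment(CATMULL_ROM, pts, 1.0)    # lands exactly on p2
start_bs = eval_segment(B_SPLINE, pts, 0.0)     # a blend near, not on, p1
```

The interpolating spline is convenient for modeling through known points, while the approximating one avoids the overshoot that interpolation can introduce.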

Efficient Multimodal Background Modeling and Motion Detection (효과적인 다봉 배경 모델링 및 물체 검출)

  • Park, Dae-Yong;Byun, Hae-Ran
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.6
    • /
    • pp.459-463
    • /
    • 2009
  • Background modeling and motion detection are among the most significant real-time video processing techniques. Many studies have addressed the topic, but robust methods still demand considerable processing time, which matters all the more when other algorithms such as object tracking, classification, or behavior understanding must run alongside them. In this paper, we propose an efficient multimodal background modeling method that can be understood as a simplified learning scheme for a Gaussian mixture model. We demonstrate its validity with numerical methods and show its detection performance experimentally.
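
As a rough illustration of the kind of simplified mixture-model learning the abstract describes, the sketch below keeps K Gaussian modes per pixel and updates only the modes matching the incoming value; the parameter names and update rules are generic assumptions, not the paper's method:

```python
import numpy as np

class MixtureBackground:
    """Per-pixel background model with K Gaussian modes, in the spirit of a
    simplified mixture-of-Gaussians scheme (illustrative, generic rules)."""

    def __init__(self, shape, k=3, lr=0.05, thresh=2.5, init_var=100.0):
        self.lr, self.thresh, self.init_var = lr, thresh, init_var
        self.mean = np.zeros(shape + (k,))
        self.var = np.full(shape + (k,), init_var)
        self.weight = np.full(shape + (k,), 1.0 / k)

    def apply(self, frame):
        """Update the model with one grayscale frame; return a foreground mask."""
        f = frame.astype(float)[..., None]
        match = np.abs(f - self.mean) < self.thresh * np.sqrt(self.var)
        # Pull matched modes toward the new value; grow their weights.
        rho = self.lr * match
        self.mean += rho * (f - self.mean)
        self.var += rho * ((f - self.mean) ** 2 - self.var)
        self.weight += self.lr * (match - self.weight)
        self.weight /= self.weight.sum(-1, keepdims=True)
        # Pixels matching no mode: recycle their weakest mode.
        r, c = np.nonzero(~match.any(-1))
        m = self.weight[r, c].argmin(-1)
        self.mean[r, c, m] = frame[r, c]
        self.var[r, c, m] = self.init_var
        self.weight[r, c, m] = self.lr
        # Foreground = no sufficiently weighted mode matched this pixel.
        k = self.mean.shape[-1]
        return ~(match & (self.weight > 0.5 / k)).any(-1)
```

Keeping several modes per pixel is what lets the model absorb multimodal backgrounds such as flickering screens or swaying trees, which a single Gaussian cannot.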

Development of 3D Stereoscopic Image Generation System Using Real-time Preview Function in 3D Modeling Tools

  • Yun, Chang-Ok;Yun, Tae-Soo;Lee, Dong-Hoon
    • Journal of Korea Multimedia Society
    • /
    • v.11 no.6
    • /
    • pp.746-754
    • /
    • 2008
  • A 3D stereoscopic image is generated by interleaving, in a video editing tool, the scenes rendered from two camera views in 3D modeling tools such as Autodesk MAX(R) and Autodesk MAYA(R). However, the depth of objects in a static scene and the continuity of the stereo effect under view transformations are not reproduced naturally, because the user must choose an arbitrary convergence angle and distance between the model and the two cameras, render both cameras' views, and repeat this process of adjusting the camera interval and re-rendering, which takes too much time. In this paper, we therefore propose a 3D stereoscopic image editing system that solves these problems, and we discuss its inherent limitations. The system generates the two cameras' views and lets the user confirm the stereo effect in real time inside the 3D modeling tool, so that the immersiveness of the 3D stereoscopic image can be judged intuitively through the real-time stereoscopic preview function.
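
The adjust-and-render cycle such a preview eliminates comes down to how interaxial distance and convergence distance map to on-screen parallax, which a real-time preview can display directly. A back-of-the-envelope sketch for a parallel-axis rig with sensor shift (an illustrative camera model; the paper's actual setup may differ):

```python
def screen_parallax(focal_mm, interaxial_mm, convergence_mm, depth_mm):
    """Horizontal parallax on the sensor (mm) for a parallel-axis stereo rig
    whose zero-parallax plane is placed at `convergence_mm` via sensor shift.
    Positive values read as behind the screen, negative as in front of it."""
    return focal_mm * interaxial_mm * (1.0 / convergence_mm - 1.0 / depth_mm)

# 35 mm lens, 65 mm interaxial, zero-parallax plane at 2 m (example numbers).
p_screen = screen_parallax(35, 65, 2000, 2000)  # on the screen plane
p_front = screen_parallax(35, 65, 2000, 1000)   # closer: negative parallax
p_back = screen_parallax(35, 65, 2000, 8000)    # farther: positive parallax
```

Showing this value per object in the viewport is what replaces the repeated render-and-inspect loop.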

Analysis of Research Trends in Deep Learning-Based Video Captioning (딥러닝 기반 비디오 캡셔닝의 연구동향 분석)

  • Lyu Zhi;Eunju Lee;Youngsoo Kim
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.13 no.1
    • /
    • pp.35-49
    • /
    • 2024
  • Video captioning, a significant outcome of the integration of computer vision and natural language processing, has emerged as a key research direction in artificial intelligence. The technology aims at automatic understanding and linguistic expression of video content, enabling computers to transform the visual information in a video into text. This paper analyzes research trends in deep learning-based video captioning, categorizing the work into four main groups: CNN-RNN-based, RNN-RNN-based, multimodal-based, and Transformer-based models, and explains the concept, features, and pros and cons of each type of model. The paper also lists the datasets and performance evaluation methods commonly used in the field: the datasets cover diverse domains and scenarios, offering extensive resources for training and validating captioning models, and the evaluation methods cover the major metrics, giving researchers practical references for assessing model performance from various angles. Finally, the paper presents future research tasks for video captioning: persistent challenges that need continued improvement, such as maintaining temporal consistency and accurately describing dynamic scenes, which add complexity in real-world applications, as well as new problems to be studied, such as temporal relationship modeling and multimodal data integration.

VHDL modeling of a real-time system for image enhancement (향상된 영상 획득을 위한 실시간 시스템의 VHDL 모델링)

  • Oh, Se-Jin;Kim, Young-Mo
    • Proceedings of the IEEK Conference
    • /
    • 2005.11a
    • /
    • pp.509-512
    • /
    • 2005
  • The aim of this work is to design a real-time, reusable image enhancement architecture for video signals based on spatial processing of the video sequence. The VHDL hardware description language is used to enable a top-down design methodology. By adding the proposed algorithms to an LPR (License Plate Recognition) system, the system operates reliably and safely even on rainy days. A Spartan-2E XC2S300E FPGA serves as the implementation platform for the real-time system.
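
The spatial processing such an architecture pipelines is, per pixel, a small window operation. The Python sketch below shows the arithmetic one 3x3 enhancement stage performs; the sharpening kernel is an illustrative choice, since the abstract does not specify the actual operator, and the real design is VHDL hardware rather than software:

```python
import numpy as np

# A 3x3 sharpening mask: the kind of per-window arithmetic a real-time
# VHDL stage computes one pixel per clock using line buffers.
SHARPEN = np.array([[0, -1, 0],
                    [-1, 5, -1],
                    [0, -1, 0]], dtype=float)

def window3x3(img, kernel):
    """Naive 3x3 window filter with edge clamping (the kernel is symmetric,
    so correlation and convolution coincide). Output clipped to 8-bit range."""
    padded = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.clip(out, 0, 255)
```

In hardware the same computation unrolls into nine multiply-accumulate units fed by two line buffers, which is what makes one-pixel-per-clock throughput possible.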

Traffic Modeling of Video Source and Performance Analysis of ATM Multiplexer (비디오 원의 트래픽 모형화와 다중화 장치의 성능분석)

  • Yoon, Young-Ha;Hong, Jung-Sik;Lie, Chang-Hoon
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.23 no.1
    • /
    • pp.235-247
    • /
    • 1997
  • In this study, the performance of an ATM multiplexer carrying MPEG (Moving Picture Experts Group) video is analyzed, taking into account the effect of the MPEG GOP (Group of Pictures) structure. Assuming that frame starting times are synchronized, the aggregated traffic is taken to be transmitted at the beginning of each frame time unit, so the aggregate number of cells generated during a frame time unit is derived from the distributions of the individual sources. The stationary probability of buffer occupancy is then easily obtained using the periodicity of the aggregated traffic. A simulation approach is also used to determine the traffic load that meets a given probability of satisfying QoS (Quality of Service) requirements.
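
The frame-synchronized aggregation described above is straightforward to simulate: every source emits its cells at each frame boundary and the link drains a fixed number of cells per frame. In this sketch, Poisson cell counts stand in for the paper's GOP-derived distributions (an assumption), so the numbers are illustrative only:

```python
import math
import random

def simulate_mux(n_sources, mean_cells, service_per_frame, buffer_size,
                 n_frames, seed=7):
    """Frame-synchronized ATM multiplexer: all sources emit their cells at
    the start of each frame time unit; the output link drains a fixed
    number of cells per frame. Returns (peak occupancy, cell loss ratio)."""
    rng = random.Random(seed)

    def poisson(lam):
        # Knuth's algorithm: multiply uniforms until falling below exp(-lam).
        limit, k, p = math.exp(-lam), 0, 1.0
        while True:
            p *= rng.random()
            if p <= limit:
                return k
            k += 1

    buf = peak = lost = total = 0
    for _ in range(n_frames):
        arrivals = sum(poisson(mean_cells) for _ in range(n_sources))
        total += arrivals
        buf += arrivals
        if buf > buffer_size:          # cells beyond the buffer are dropped
            lost += buf - buffer_size
            buf = buffer_size
        peak = max(peak, buf)
        buf = max(0, buf - service_per_frame)
    return peak, lost / total if total else 0.0
```

Sweeping the number of sources at a fixed loss target is the simulation-based load-dimensioning exercise the abstract refers to.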

Collaborative Place and Object Recognition in Video using Bidirectional Context Information (비디오에서 양방향 문맥 정보를 이용한 상호 협력적인 위치 및 물체 인식)

  • Kim, Sung-Ho;Kweon, In-So
    • The Journal of Korea Robotics Society
    • /
    • v.1 no.2
    • /
    • pp.172-179
    • /
    • 2006
  • In this paper, we present a practical place and object recognition method for guiding visitors in building environments. Recognizing places or objects in the real world is difficult because of motion blur and camera noise. We present a modeling method based on bidirectional interaction between places and objects, in which each mutually reinforces the other for robust recognition. The unification of visual context, including scene context, object context, and temporal context, is also addressed. The proposed system has been tested while guiding visitors in a large-scale building environment (10 topological places, 80 3D objects).

Real-Time Surveillance of People on an Embedded DSP-Platform

  • Qiao, Qifeng;Peng, Yu;Zhang, Dali
    • Journal of Ubiquitous Convergence Technology
    • /
    • v.1 no.1
    • /
    • pp.3-8
    • /
    • 2007
  • This paper presents a set of techniques used in a real-time visual surveillance system. The system is implemented on a low-cost embedded DSP platform designed to work with stationary video sources, and it consists of a detection, a tracking, and a classification module. The detector uses a statistical method to build the background model and extract foreground pixels. These pixels are grouped into blobs, which are classified as a single person, a group of people, or other objects by dynamic periodicity analysis. The tracking module uses the mean shift algorithm to locate the target position. The system aims to monitor the density of people in the surveilled scene and to detect abnormal events. Its major advantage is its real-time capability using only a video stream, with no additional sensors. We evaluated the system in real applications, such as monitoring a subway entrance and a building hall, and the results confirm its strong performance.
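
The tracking module's localization step can be sketched independently of the DSP details: mean shift repeatedly moves a window to the weighted centroid of a per-pixel target-likelihood map (the back-projection step that produces that map from color histograms is omitted here):

```python
import numpy as np

def mean_shift(likelihood, start, half_win=5, iters=20, eps=0.5):
    """Shift a (2*half_win+1)-square window to the weighted centroid of a
    per-pixel likelihood map until the move is smaller than eps -- the core
    of mean shift localization. Returns the converged (row, col)."""
    y, x = start
    h, w = likelihood.shape
    for _ in range(iters):
        y0, y1 = max(0, y - half_win), min(h, y + half_win + 1)
        x0, x1 = max(0, x - half_win), min(w, x + half_win + 1)
        win = likelihood[y0:y1, x0:x1]
        total = win.sum()
        if total == 0:
            break                       # no target evidence under the window
        ys, xs = np.mgrid[y0:y1, x0:x1]
        ny = (ys * win).sum() / total   # weighted centroid of the window
        nx = (xs * win).sum() / total
        if abs(ny - y) < eps and abs(nx - x) < eps:
            return int(round(ny)), int(round(nx))
        y, x = int(round(ny)), int(round(nx))
    return y, x
```

Because each iteration touches only one small window, the method suits a DSP with limited memory bandwidth, which is part of why it is a common choice on embedded platforms.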

A Study on Story propose model based on Machine Learning - Focused on YouTube

  • CHUN, Sanghun;SHIN, Seung-Jung
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.13 no.2
    • /
    • pp.224-230
    • /
    • 2021
  • YouTube is an OTT service leading the stay-at-home economy that emerged from the 2020 COVID-19 pandemic. With the growth of OTT-based individual media, creators must establish attractive storytelling strategies that viewers prefer and that are selected by YouTube's recommendation algorithms. In this study, we investigated a model that proposes a content storyline to creators. As a capability creators need in order to make content that viewers prefer, we present data literacy: the ability to find patterns in complex, massive data. We also studied the importance of compelling storytelling structures that viewers prefer and that can be picked up by YouTube's recommendation algorithms. This study is significant in that it departs from viewer-oriented recommendation systems and proposes a story suggestion model for individual creators. Incorporating this story proposal model into the production of a video for the YouTube channel Tiger Love showed a certain effectiveness. The model is a machine learning, text-based story suggestion system that excludes the handling of photographs or video footage.

Intelligent Activity Recognition based on Improved Convolutional Neural Network

  • Park, Jin-Ho;Lee, Eung-Joo
    • Journal of Korea Multimedia Society
    • /
    • v.25 no.6
    • /
    • pp.807-818
    • /
    • 2022
  • To further improve the accuracy and time efficiency of behavior recognition in intelligent monitoring scenarios, a human behavior recognition algorithm based on YOLO combined with LSTM and CNN is proposed. Exploiting the real-time nature of YOLO object detection, the specific behavior in the surveillance video is first detected in real time, and deep features are extracted after obtaining the target's size, location, and other information. Noise from irrelevant areas of the image is then removed. Finally, an LSTM models the time series and makes the final behavior discrimination for the action sequence in the surveillance video. Experiments on the MSR and KTH datasets show average recognition rates of 98.42% and 96.6%, respectively, with average recognition times of 210 ms and 220 ms. The proposed method performs well on intelligent behavior recognition.
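
The temporal half of such a pipeline, an LSTM run over per-frame deep features, can be sketched in a few lines. The feature vectors below are random placeholders for the YOLO-cropped CNN activations, and the weights are untrained; this illustrates the data flow only, not the paper's trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: input/forget/output gates and a candidate state are
    computed from the current feature x and the previous state (h, c)."""
    z = W @ x + U @ h + b
    n = h.size
    i, f, o = (1.0 / (1.0 + np.exp(-z[k * n:(k + 1) * n])) for k in range(3))
    g = np.tanh(z[3 * n:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def classify_sequence(frame_feats, n_classes, hidden=16):
    """Run a randomly initialized LSTM over per-frame feature vectors and
    map the final hidden state to softmax class probabilities."""
    d = frame_feats.shape[1]
    W = rng.normal(0, 0.1, (4 * hidden, d))
    U = rng.normal(0, 0.1, (4 * hidden, hidden))
    b = np.zeros(4 * hidden)
    Wy = rng.normal(0, 0.1, (n_classes, hidden))
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x in frame_feats:              # one step per video frame
        h, c = lstm_step(x, h, c, W, U, b)
    scores = Wy @ h
    e = np.exp(scores - scores.max())  # numerically stable softmax
    return e / e.sum()
```

Carrying the cell state across frames is what lets the classifier use the order of the detected poses, not just their per-frame appearance.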