• Title/Summary/Keyword: Digital video data

Implementation of Character and Object Metadata Generation System for Media Archive Construction (미디어 아카이브 구축을 위한 등장인물, 사물 메타데이터 생성 시스템 구현)

  • Cho, Sungman;Lee, Seungju;Lee, Jaehyeon;Park, Gooman
    • Journal of Broadcast Engineering / v.24 no.6 / pp.1076-1084 / 2019
  • This paper introduces a system that extracts metadata by recognizing characters and objects in media using deep learning. In broadcasting, multimedia content such as video, audio, images, and text has long been converted to digital form, but a vast amount of unconverted material remains. Building media archives requires extensive manual work, which is time-consuming and costly; a deep learning-based metadata generation system can therefore save time and cost in constructing them. The system consists of four elements: a training data generation module, an object recognition module, a character recognition module, and an API server. The deep learning network module and the face recognition module recognize characters and objects in the media and describe them as metadata. The training data generation module was designed separately to facilitate building training data for the neural networks, and the face recognition and object recognition functions were exposed through an API server. We trained the two neural networks on data for 1,500 persons and 80 kinds of objects and confirmed an accuracy of 98% on the character test data and 42% on the object data.

The Influence of Altering Mobile Phone Interface on the Generation of Mental Model (모바일 폰의 인터페이스 변경이 멘탈모델 형성에 미치는 영향)

  • Park, Ye-Jin;Kim, Bon-Han
    • Science of Emotion and Sensibility / v.11 no.4 / pp.575-588 / 2008
  • This study examines the patterns of mental models that arise from the operating errors a user accustomed to a keypad-based mobile phone makes when starting to use a touch screen mobile phone, and identifies the features of the logical process by which the user corrects such errors into a new mental model. It also reviews design improvements that would ease the formation of a mental model for touch screen mobile phones. We tested subjects on the seven most frequently used, high-priority functions of touch screen phones and carried out the subject assessment together with interview surveys after the video observation experiment. The results show that subjects accustomed to keypad-based mobile phones tend to draw on operating knowledge from a computer operating system (Windows) or a web browser, including navigation by Tap or Double Tap, to correct the mental model after an operating error. In addition, comparing the subject assessment with the video observation data shows that operating errors on touch screen mobile phones occurred mostly in the 'information feedback' and 'navigation' components of the mobile phone.

Geocoding of the Free Stereo Mosaic Image Generated from Video Sequences (비디오 프레임 영상으로부터 제작된 자유 입체 모자이크 영상의 실좌표 등록)

  • Noh, Myoung-Jong;Cho, Woo-Sug;Park, Jun-Ku;Kim, Jung-Sub;Koh, Jin-Woo
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.29 no.3 / pp.249-255 / 2011
  • A free-stereo mosaic image can be generated without GPS/INS or ground control data by using relative orientation parameters in a 3D model coordinate system whose origin is located in one reference frame image. A 3D coordinate computed from conjugate points on the free-stereo mosaic images is expressed in this 3D model coordinate system. To determine 3D coordinates in the absolute coordinate system from conjugate points on the free-stereo mosaic images, a method for transforming 3D model coordinates into 3D absolute coordinates is required. Generally, the 3D similarity transformation is used to transform between 3D coordinate systems. However, the error of the 3D model coordinates used in the free-stereo mosaic images increases non-linearly with distance from the origin of the model coordinate system. For this reason, these 3D model coordinates are difficult to transform into 3D absolute coordinates with a linear transformation, and a method for non-linearly transforming 3D model coordinates into 3D absolute coordinates is needed. A method for resampling the free-stereo mosaic image into a geo-stereo mosaic image is also needed in order to overlay a digital map in absolute coordinates on the stereo mosaic images. In this paper, we propose a 3D non-linear transformation that converts 3D model coordinates in the free-stereo mosaic image to 3D absolute coordinates, and a 2D non-linear transformation, based on the 3D non-linear transformation, that converts the free-stereo mosaic image to the geo-stereo mosaic image.
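
The 3D similarity transformation mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes the usual form X_abs = s * R * X_model + T, and for brevity restricts R to a single rotation about the z-axis (a full similarity transformation has three rotation angles). The function name and the sample parameters are hypothetical.

```python
import math

def similarity_transform(point, scale, yaw_rad, translation):
    """Apply X_abs = scale * R * X_model + T, with R a rotation about the z-axis.

    Assumption for illustration: only one rotation angle is modelled here.
    """
    x, y, z = point
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    # Rotate about the z-axis.
    rx, ry, rz = c * x - s * y, s * x + c * y, z
    tx, ty, tz = translation
    # Scale uniformly, then translate.
    return (scale * rx + tx, scale * ry + ty, scale * rz + tz)

# Hypothetical parameters: scale 2, 90-degree rotation, shift by (10, 20, 30).
model_point = (1.0, 0.0, 2.0)
absolute_point = similarity_transform(
    model_point, scale=2.0, yaw_rad=math.pi / 2, translation=(10.0, 20.0, 30.0)
)
print(absolute_point)
```

Because the same scale, rotation, and translation are applied everywhere, a transformation of this kind cannot absorb an error that grows non-linearly with distance from the model origin, which is the motivation the abstract gives for a non-linear transformation.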

Design of UWB/WiFi Module based Wireless Transmission for Endoscopic Camera (UWB/WiFi 모듈 기반의 내시경 카메라용 무선전송 설계)

  • Shim, Dongha;Lee, Jaegon;Yi, Jaeson;Cha, Jaesang;Kang, Mingoo
    • Journal of Internet Computing and Services / v.16 no.1 / pp.1-8 / 2015
  • Ultra-wide-angle wireless endoscopes are demonstrated in this paper. The endoscope is composed of an ultra-wide-angle camera module and a wireless transmission module. A lens unit with an ultra-wide FOV of 162 degrees is designed and manufactured. The lens, image sensor, and camera processor unit are packaged together in a $3{\times}3{\times}9$ cm$^3$ case. Two wireless transmission modules are implemented, one on a UWB-based platform and one on a WiFi-based platform. The UWB-based module can transmit HD video to a computer at a resolution of $2048{\times}1536$ (QXGA) and a frame rate of 15 fps in MJPEG compression mode. The maximum data transfer rate reaches 41.2 Mbps. The FOV and resolution of the endoscope are comparable to those of a medical-grade endoscope, and are ~3X and 16X higher, respectively, than those of a commercial high-performance WiFi endoscope. The WiFi-based module streams video to a smart device with a maximum data transfer rate of 1.5 Mbps at a resolution of $640{\times}480$ (VGA) and a frame rate of 30 fps in MJPEG compression mode. The implemented components show the feasibility of cheap medical-grade wireless electronic endoscopes, which can be used effectively in u-healthcare, emergency treatment, home healthcare, remote diagnosis, and similar applications.
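
To put the reported figures in perspective, the short calculation below estimates how much compression the MJPEG stream must achieve for each module. It is a back-of-the-envelope sketch under two assumptions that are not stated in the abstract: the uncompressed video uses 24 bits per pixel, and the reported maximum data rate is sustained at the stated frame rate. Under those assumptions the results are lower bounds on the compression ratio. The function names are hypothetical.

```python
def compressed_bits_per_frame(rate_mbps, fps):
    """Bits available per frame if the link runs at rate_mbps (megabits per second)."""
    return rate_mbps * 1_000_000 / fps

def raw_bits_per_frame(width, height, bits_per_pixel=24):
    """Uncompressed frame size; 24 bits per pixel is an assumption."""
    return width * height * bits_per_pixel

def implied_compression_ratio(width, height, rate_mbps, fps, bits_per_pixel=24):
    """Ratio of raw frame size to the per-frame bit budget of the link."""
    raw = raw_bits_per_frame(width, height, bits_per_pixel)
    return raw / compressed_bits_per_frame(rate_mbps, fps)

# Figures from the abstract: QXGA at 15 fps over UWB, VGA at 30 fps over WiFi.
uwb_ratio = implied_compression_ratio(2048, 1536, rate_mbps=41.2, fps=15)
wifi_ratio = implied_compression_ratio(640, 480, rate_mbps=1.5, fps=30)
print(f"UWB module:  at least {uwb_ratio:.1f}:1")
print(f"WiFi module: at least {wifi_ratio:.1f}:1")
```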

Mobile Presentation using Transcoding Method of Region of Interest (관심 영역의 트랜스코딩 기법을 이용한 모바일 프리젠테이션)

  • Seo, Jung-Hee;Park, Hung-Bog
    • The KIPS Transactions:PartC / v.17C no.2 / pp.197-204 / 2010
  • Effectively integrating web-based learning environments with mobile device technology is a new challenge for developers. The screen of a mobile device, however, is small, and its performance is limited. Because of these limits, displaying bulk data on a mobile screen, such as a cyber lecture accompanied by real-time image transmission on the web, raises many problems. Users have difficulty recognizing learning content accurately on a mobile device, and continuously transmitting a video stream carrying bulky information places a heavy load on the mobile system. An application developed for the PC is therefore unsuitable for use on a mobile device as it is, and a player suited to the mobile device should be developed. Accordingly, this paper proposes mobile presentation using transcoding techniques for the region of interest. To display continuous video frames of learning images, such as a cyber lecture or remote lecture, on a mobile device, the gap between the high-resolution digital image and the performance of the mobile device must be overcome. Because the transcoding techniques that bridge this gap degrade image quality, high-quality images may be secured through trial and error between transcoding and the selected learning resources.

Comparative Analysis of CNN Deep Learning Model Performance Based on Quantification Application for High-Speed Marine Object Classification (고속 해상 객체 분류를 위한 양자화 적용 기반 CNN 딥러닝 모델 성능 비교 분석)

  • Lee, Seong-Ju;Lee, Hyo-Chan;Song, Hyun-Hak;Jeon, Ho-Seok;Im, Tae-ho
    • Journal of Internet Computing and Services / v.22 no.2 / pp.59-68 / 2021
  • As artificial intelligence (AI) technologies, which have grown rapidly in recent years, have begun to be applied to marine environments such as ships, research on applying CNN-based models specialized for digital video has been active. In E-Navigation services, which combine various technologies to detect floating objects that pose a collision risk, reduce human error, and prevent fires inside ships, real-time processing is of great importance. Adding more functions, however, requires high-performance processors, which raises prices and places a cost burden on shipowners. This study therefore proposes a method that processes information at high speed while maintaining accuracy by applying quantization to a deep learning model. First, videos were pre-processed for the detection of floating matter at sea to ensure efficient transmission of video data to the input of the deep learning model. Second, quantization, one of the lightweighting techniques for deep learning models, was applied to reduce memory usage and increase processing speed. Finally, the proposed deep learning model with video pre-processing and quantization was deployed on various embedded boards to measure its accuracy and processing speed and to test its performance. The proposed method reduced memory usage by a factor of four and improved processing speed about four to five times while maintaining the original recognition accuracy.
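
The abstract above does not say which quantization scheme was used. As a hedged illustration of the general idea, the sketch below applies 8-bit affine quantization to a list of floating-point weights; the bit width, the function names, and the sample weights are all assumptions. Storing 8-bit integers instead of 32-bit floats would be consistent with the fourfold memory reduction reported, although the paper may have used a different scheme.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers with an affine (scale + zero point) scheme."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # Include 0.0 in the range so that zero is representable.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)
    # Round to the nearest integer level and clamp to the representable range.
    quantized = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point

def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from the integer representation."""
    return [(q - zero_point) * scale for q in quantized]

# Hypothetical weights, for illustration only.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q_weights, scale, zero_point = quantize(weights)
restored = dequantize(q_weights, scale, zero_point)
print(q_weights)
print(restored)
print("bits per weight: 32 ->", 8, "(ratio", 32 // 8, ")")
```

The round trip loses at most about one quantization step per weight, which is the accuracy cost that is traded for the smaller memory footprint.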

A Mobile Landmarks Guide : Outdoor Augmented Reality based on LOD and Contextual Device (모바일 랜드마크 가이드 : LOD와 문맥적 장치 기반의 실외 증강현실)

  • Zhao, Bi-Cheng;Rosli, Ahmad Nurzid;Jang, Chol-Hee;Lee, Kee-Sung;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.18 no.1 / pp.1-21 / 2012
  • In recent years, mobile phones have evolved extremely fast. They are equipped with high-quality color displays, high-resolution cameras, and real-time accelerated 3D graphics, along with features such as a GPS sensor and a digital compass. This evolution helps application developers use the power of smartphones to create rich environments that offer a wide range of services and exciting possibilities. In outdoor mobile AR research to date, there are many popular location-based AR services, such as Layar and Wikitude. These systems have a major limitation: the AR content is hardly ever overlaid precisely on the real target. Another line of research is context-based AR services that use image recognition and tracking. Here the AR content is overlaid precisely on the real target, but real-time performance is restricted by the retrieval time, and the approach is hard to implement over a large area. In our work, we combine the advantages of location-based AR with those of context-based AR. The system first finds the surrounding landmarks and then performs recognition and tracking on them. The proposed system consists of two major parts: a landmark browsing module and an annotation module. In the landmark browsing module, users can view augmented virtual information (information media) such as text, pictures, and video on their smartphone viewfinder when they point the smartphone at a certain building or landmark. For this, a landmark recognition technique is applied. SURF point-based features are used in the matching process because of their robustness. To ensure that image retrieval and matching are fast enough for real-time tracking, we exploit contextual device (GPS and digital compass) information to select from the database the landmarks that are nearest and lie in the pointed direction. The query image is matched only against this selected data, so matching speed increases significantly. The second part is the annotation module. Instead of only viewing the augmented information media, users can create virtual annotations based on linked data. Full knowledge of the landmark is not required; users can simply look for the appropriate topic by searching linked data with a keyword. This helps the system find the target URI in order to generate correct AR content. In order to recognize target landmarks, images of the selected building or landmark are captured from different angles and distances. This procedure resembles building a connection between the real building and the virtual information that exists in the Linked Open Data. In our experiments, the search range in the database is reduced by clustering images into groups according to their coordinates. A grid-based clustering method and user location information are used to restrict the retrieval range. In existing research using clustering and GPS information, the retrieval time is around 70~80ms. Experimental results show that our approach reduces the retrieval time to around 18~20ms on average, so the total processing time is reduced from 490~540ms to 438~480ms. The performance improvement will become more obvious as the database grows. This demonstrates that the proposed system is efficient and robust in many cases.
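
The grid-based restriction of the retrieval range described above can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the cell size, the choice to search the user's cell plus its eight neighbours, the function names, and the sample coordinates are all assumptions. The compass-based filtering mentioned in the abstract is omitted.

```python
import math
from collections import defaultdict

def cell_of(lat, lon, cell_deg):
    """Return the grid cell (row, column) containing a coordinate."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def build_grid_index(landmarks, cell_deg=0.001):
    """Group landmark names by grid cell; landmarks are (name, lat, lon) tuples."""
    index = defaultdict(list)
    for name, lat, lon in landmarks:
        index[cell_of(lat, lon, cell_deg)].append(name)
    return index

def candidates_near(index, lat, lon, cell_deg=0.001):
    """Collect landmarks in the user's cell and the eight surrounding cells."""
    row, col = cell_of(lat, lon, cell_deg)
    found = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            found.extend(index.get((row + d_row, col + d_col), []))
    return found

# Hypothetical landmark database and user position.
landmarks = [
    ("gate", 37.4500, 126.6530),
    ("tower", 37.4505, 126.6532),
    ("harbor", 37.4700, 126.6200),
]
index = build_grid_index(landmarks)
print(sorted(candidates_near(index, 37.4501, 126.6531)))
```

Only the images of the returned candidates would then be passed to feature matching, which is how restricting the search range can shorten retrieval time as the database grows.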

Depth Image Distortion Correction Method according to the Position and Angle of Depth Sensor and Its Hardware Implementation (거리 측정 센서의 위치와 각도에 따른 깊이 영상 왜곡 보정 방법 및 하드웨어 구현)

  • Jang, Kyounghoon;Cho, Hosang;Kim, Geun-Jun;Kang, Bongsoon
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.5 / pp.1103-1109 / 2014
  • Motion recognition systems have been broadly studied in the digital image and video processing fields. Recently, methods using depth images have proven very useful. However, the recognition accuracy of depth image-based methods degrades because the size and shape of an object are distorted by the angle of the depth sensor. Therefore, distortion correction for the depth sensor is necessary for good performance of the recognition system. In this paper, we propose a pre-processing algorithm to improve the motion recognition system. Depth data from the depth sensor are converted to real-world coordinates, corrected for the sensor angle, and then converted back to projective coordinates. The proposed system was developed using OpenCV in a Windows program and tested in real time with the Kinect. In addition, it was designed in Verilog-HDL and verified on a Xilinx Zynq-7000 FPGA board.
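
The three-step correction outlined above (projective to real-world coordinates, angle correction, back to projective coordinates) can be sketched for a single depth pixel as follows. This is a minimal sketch under assumptions the abstract does not confirm: a pinhole camera model with intrinsics `fx`, `fy`, `cx`, `cy`, and a sensor tilt modelled as a rotation about the camera's x-axis. The function name and the sample values are hypothetical.

```python
import math

def correct_depth_pixel(u, v, depth, tilt_rad, fx, fy, cx, cy):
    """Back-project a depth pixel, undo the sensor tilt, and project it again.

    Assumptions for illustration: pinhole model, tilt about the x-axis only.
    Returns the corrected pixel position and depth as (u, v, depth).
    """
    # 1. Projective -> real-world (camera) coordinates.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    z = depth
    # 2. Rotate about the x-axis by the tilt angle.
    c, s = math.cos(tilt_rad), math.sin(tilt_rad)
    y_rot = c * y - s * z
    z_rot = s * y + c * z
    # 3. Real-world -> projective coordinates.
    return (fx * x / z_rot + cx, fy * y_rot / z_rot + cy, z_rot)

# Hypothetical intrinsics and a 10-degree tilt.
intrinsics = dict(fx=580.0, fy=580.0, cx=320.0, cy=240.0)
corrected = correct_depth_pixel(400.0, 300.0, 2000.0, math.radians(10.0), **intrinsics)
print(corrected)
```

Applying the function again with the opposite angle returns the original pixel, which is a convenient sanity check for an implementation of this kind.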

Kinematical Analysis of the Back Somersault in Floor Exercise (마루운동 제자리 뒤공중돌기 동작의 운동학적 분석)

  • Chung, Nam-Ju
    • Korean Journal of Applied Biomechanics / v.17 no.2 / pp.157-166 / 2007
  • This study compared the major kinematic factors between the success and failure groups in performing the back somersault in floor exercise. Three gymnasts (height: $167.3{\pm}2.88cm$, age: $22.0{\pm}1.0years$, body weight: $64.4{\pm}2.3kg$) participated in this study. The kinematic data were recorded at 60Hz with four digital video cameras. Two successful motions and failure motions for each subject were selected for three-dimensional analysis. 1. Success Trial: The success trials showed a larger projection velocity than the failure trials but a smaller projection angle, and took longer in the time required. Hand segment velocity and maximum velocity were larger in the success trials than in the failure trials, which increased the projection velocity and ultimately the vertical height of the center of mass. At take-off (event 2), the amount of hip and knee joint flexion contributed to the optimal condition for take-off, and at the peak point the hip and knee joints were maximally flexed to reduce the moment of inertia. At this point, the upper extremities were also extended more in the success trials than in the failure trials. in this base, success trial in upward phase(p3) 2. Failure Trial: The failure trials showed a smaller projection velocity than the success trials but a larger projection angle, and took less time. Hand segment velocity and maximum velocity were smaller in the failure trials than in the success trials, which reduced the projection velocity and ultimately the vertical height of the center of mass. At take-off (event 2), the amount of hip and knee joint flexion did not contribute to the optimal condition for take-off, and at the peak point the hip and knee joints were not maximally flexed to reduce the moment of inertia. At this point, the upper extremities were also not extended more in the failure trials than in the success trials.

The Design of Repeated Motion on Adaptive Block Matching Algorithm in Real-Time Image (실시간 영상에서 반복적인 움직임에 적응한 블록정합 알고리즘 설계)

  • Kim, Jang-Hyung;Kang, Jin-Suk
    • Journal of Korea Multimedia Society / v.8 no.3 / pp.345-354 / 2005
  • Because motion estimation and motion compensation remove redundant data by exploiting the temporal redundancy in images, they play an important role in digital video compression. Because of their high computational complexity, however, they are difficult to apply to high-resolution applications in real-time environments. If a priori knowledge about the motion of an image block is available before motion estimation, a better starting point for the search for the exact motion vector can be determined, which speeds up the search. This paper presents a motion detection algorithm that runs robustly on repeated motion. The motion detection step compares two frames with each other and judges whether motion has occurred. Through experiments, we show a significant reduction in computational time, measured as the number of search steps, without much quality degradation in the predicted image.
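
The idea of starting the motion vector search from a predicted position can be illustrated with a small block matching sketch. This is not the algorithm from the paper: it assumes a sum of absolute differences (SAD) cost and an exhaustive search inside a window centred on the predicted vector, and all names and the synthetic frames are hypothetical. It only shows why a good starting point reduces the number of search steps.

```python
def sad(prev, curr, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in curr and a shifted block in prev."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(curr[by + y][bx + x] - prev[by + y + dy][bx + x + dx])
    return total

def search_motion_vector(prev, curr, bx, by, size, start=(0, 0), radius=1):
    """Search a (2*radius+1)^2 window centred on the predicted vector `start`.

    Returns (dx, dy, steps), where steps counts the candidates evaluated.
    """
    height, width = len(prev), len(prev[0])
    best = None
    steps = 0
    for dy in range(start[1] - radius, start[1] + radius + 1):
        for dx in range(start[0] - radius, start[0] + radius + 1):
            # Skip candidates whose reference block falls outside the frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            steps += 1
            cost = sad(prev, curr, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    if best is None:
        raise ValueError("no valid candidate inside the frame")
    return best[1], best[2], steps

# Synthetic frames: curr is prev shifted two pixels to the right.
prev = [[(x * 7 + y * 13) % 31 for x in range(12)] for y in range(12)]
curr = [[prev[y][x - 2] if x >= 2 else 0 for x in range(12)] for y in range(12)]

# Full search around (0, 0) versus a narrow search around a predicted vector.
print(search_motion_vector(prev, curr, bx=5, by=4, size=4, start=(0, 0), radius=3))
print(search_motion_vector(prev, curr, bx=5, by=4, size=4, start=(-2, 0), radius=1))
```

Both calls find the same vector, but the second evaluates far fewer candidates because the prediction lets the window be smaller.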
