Search | Korea Science

Analysis of Research Trends in Deep Learning-Based Video Captioning (딥러닝 기반 비디오 캡셔닝의 연구동향 분석)

Lyu Zhi;Eunju Lee;Youngsoo Kim
- KIPS Transactions on Software and Data Engineering / v.13 no.1 / pp.35-49 / 2024
Video captioning, a significant outcome of the integration of computer vision and natural language processing, has emerged as a key research direction in artificial intelligence. The technology aims to automatically understand video content and express it in language, enabling computers to transform the visual information in videos into text. This paper provides an initial analysis of research trends in deep learning-based video captioning and categorizes the models into four main groups: CNN-RNN-based, RNN-RNN-based, multimodal-based, and Transformer-based. It explains the concept of each model type and discusses its features, advantages, and disadvantages. The paper also lists the datasets and performance evaluation methods commonly used in the field. The datasets span diverse domains and scenarios, offering extensive resources for training and validating video captioning models. The section on performance evaluation covers the major evaluation metrics and gives researchers practical references for evaluating model performance from various angles. Finally, the paper identifies open challenges for future research, such as maintaining temporal consistency and accurately describing dynamic scenes, both of which add complexity in real-world applications, and presents new tasks to be studied, such as temporal relationship modeling and multimodal data integration.
https://doi.org/10.3745/KTSDE.2024.13.1.35

The Effect of Recorded Video Monitoring on Students' Self Reflection of Patient-Physician Interaction (녹화영상 활용 학습법이 학생들의 '환자-의사관계'에서의 자기성찰에 미치는 영향)

Ju, Misun;Hwang, Jiyeong;Kim, Jaemyung;Kang, Jeaku
- Korean Medical Education Review / v.19 no.2 / pp.83-89 / 2017
The aim of this study is to examine the effect of recorded video monitoring on students' self-reflection after completing their clinical performance examination. Taking into account the particular cases involved in the examination, the present study utilized history-taking, physical examination, and patient education as bases for evaluating information-establishment ability, and asking, listening, understanding, explaining, and connectedness as the bases for evaluating patient-physician interaction ability. Student self-monitoring through recorded video feedback was carried out three days after completion of their clinical performance examination. Students self-evaluated their performance with a 10-point scale before and after self-monitoring. The results of this study show that students have a general tendency to lower their own self-evaluation scores after self-monitoring. Although there was not a statistically significant change of interrelationship in the information-establishment ability evaluation, there was a meaningful change of interrelationship in the patient-physician interaction ability evaluation after self-monitoring; specifically, in the case of acute lower abdominal pain, a high correlation was found (r=0.31, p=0.02) between the evaluation scores of standardized patients and students related to patient-physician interaction ability. This implies that self-monitoring enables the students to acquire a reflective viewpoint from which to evaluate their own performance. Therefore, it can be said that self-monitoring through recorded video feedback is a valuable method for students to use in reviewing their performance in patient-physician interactions.
https://doi.org/10.17496/kmer.2017.19.2.83

Implementation of Video Processing Module for Integrated Modular Avionics System (모듈통합형 항공전자시스템을 위한 Video Processing Module 구현)

Jeon, Eun-Seon;Kang, Dae-Il;Ban, Chang-Bong;Yang, Seong-Yul
- Journal of Advanced Navigation Technology / v.18 no.5 / pp.437-444 / 2014
The integrated modular avionics (IMA) system houses a number of line replaceable modules (LRMs) in a cabinet. An LRM performs functions like those of line replaceable units (LRUs) in a federated architecture. The video processing module (VPM), an LRM in the IMA core system, acts as a video bus bridge and gateway for the ARINC 818 avionics digital video bus (ADVB). The ARINC 818 video interface and protocol standard was developed for high-bandwidth, low-latency, uncompressed digital video transmission. The FPGAs of the VPM include video processing functions such as ARINC 818-to-DVI and DVI-to-ARINC 818 converters, a video decoder, and overlay. In this paper, we explain how the VPM hardware is implemented and present verification results for the VPM functions and IP core performance.
https://doi.org/10.12673/jant.2014.18.5.437

A Dynamic Bandwidth Allocation Scheme based on Playback Buffer Level in a Distributed Mobile Multimedia System (분산 모바일 멀티미디어 시스템에서 재생 버퍼 수준에 기반한 동적 대역폭 할당 기법)

Kim, Jin-Hwan
- The KIPS Transactions:PartB / v.17B no.6 / pp.413-420 / 2010
In this paper, we propose a scheme for dynamically allocating network bandwidth based on the playback buffer levels of the clients in a distributed mobile multimedia system. In this scheme, the amount of bandwidth allocated to serve a video request depends on the buffer level of the requesting client. If a client's buffer level is temporarily low, more bandwidth is allocated to it; if it is temporarily high, less is allocated, so that the allocation adapts to the client's playback situation. The buffer-level-based allocation policy also provides fair service to the clients. To support high-quality video playback, video frames must be delivered to the client before their playback times. The main objectives of this bandwidth allocation scheme are to enhance the quality of service and performance of individual video playback, for example by minimizing the number of dropped video frames, while providing fair service to all concurrent video requests. Extensive simulation experiments compare the proposed scheme with a static bandwidth allocation scheme and show a 4-9% lower ratio of dropped frames, depending on the buffer level.
https://doi.org/10.3745/KIPSTB.2010.17B.6.413
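
The abstract above describes giving more bandwidth to clients whose playback buffers are low and less to those whose buffers are high. A minimal sketch of that idea follows. It is an illustration only and not the authors' scheme: the linear weighting `1 - level`, the `floor` parameter, and the function name `allocate_bandwidth` are all assumptions introduced here.

```python
def allocate_bandwidth(total_kbps, buffer_levels, floor=0.1):
    """Split total_kbps among clients so that emptier buffers get a larger share.

    buffer_levels maps a client id to its buffer fill fraction in [0, 1].
    floor is a hypothetical minimum weight so that a client with a full
    buffer still receives some bandwidth (an assumption, not from the paper).
    """
    if not buffer_levels:
        return {}

    weights = {}
    for client, level in buffer_levels.items():
        if not 0.0 <= level <= 1.0:
            raise ValueError(f"buffer level for {client!r} must be in [0, 1]")
        # Lower buffer level -> larger weight -> larger bandwidth share.
        weights[client] = max(1.0 - level, floor)

    total_weight = sum(weights.values())
    return {c: total_kbps * w / total_weight for c, w in weights.items()}


if __name__ == "__main__":
    shares = allocate_bandwidth(1000, {"phone": 0.2, "tablet": 0.8})
    for client, kbps in shares.items():
        print(client, round(kbps, 1))
```

A static scheme, by contrast, would hand every client the same share regardless of its buffer state.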

A Hadoop-based Multimedia Transcoding System for Processing Social Media in the PaaS Platform of SMCCSE

Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku;Jeong, Changsung
- KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.11 / pp.2827-2848 / 2012
Previously, we described a social media cloud computing service environment (SMCCSE). The SMCCSE supports the development of social networking services (SNSs) that include audio, image, and video formats. A social media cloud computing PaaS platform, a core component of the SMCCSE, processes large amounts of social media in a parallel and distributed manner to support a reliable SNS. Here, we propose a Hadoop-based multimedia system for image and video transcoding, which are necessary functions of our PaaS platform. Our system consists of two modules: an image transcoding module and a video transcoding module. We design and implement the system using a MapReduce framework running on the Hadoop Distributed File System (HDFS) and the media processing libraries Xuggler and JAI. In this way, our system exponentially reduces the encoding time for transcoding large amounts of image and video files into specific formats depending on user-requested options (such as resolution, bit rate, and frame rate). To evaluate system performance, we measure the total image and video transcoding time for image and video data sets, respectively, under various experimental conditions. In addition, we compare the video transcoding performance of our cloud-based approach with that of the traditional frame-level parallel processing-based approach. Based on experiments performed on a 28-node cluster, the proposed Hadoop-based multimedia transcoding system delivers excellent speed and quality.
https://doi.org/10.3837/tiis.2012.10.005

A "GAP-Model" based Framework for Online VVoIP QoE Measurement

Calyam, Prasad;Ekici, Eylem;Lee, Chang-Gun;Haffner, Mark;Howes, Nathan
- Journal of Communications and Networks / v.9 no.4 / pp.446-456 / 2007
Increased access to broadband networks has led to a fast-growing demand for voice and video over IP (VVoIP) applications such as Internet telephony (VoIP), videoconferencing, and IP television (IPTV). For proactive troubleshooting of VVoIP performance bottlenecks that manifest to end-users as performance impairments such as video frame freezing and voice dropouts, network operators cannot rely on actual end-users to report their subjective quality of experience (QoE). Hence, automated and objective techniques that provide real-time or online VVoIP QoE estimates are vital. Objective techniques developed to date estimate VVoIP QoE by performing frame-to-frame peak signal-to-noise ratio (PSNR) comparisons of the original video sequence and the reconstructed video sequence obtained from the sender side and receiver side, respectively. Since processing such video sequences is time-consuming and computationally intensive, existing objective techniques cannot provide online VVoIP QoE. In this paper, we present a novel framework that can provide online estimates of VVoIP QoE on network paths without end-user involvement and without requiring any video sequences. The framework features the "GAP-model", which is an offline model of QoE expressed as a function of measurable network factors such as bandwidth, delay, jitter, and loss. Using the GAP-model, our online framework can produce VVoIP QoE estimates in terms of "Good", "Acceptable", or "Poor" (GAP) grades of perceptual quality solely from the online measured network conditions.
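
The frame-to-frame PSNR comparison that the abstract attributes to existing objective techniques can be sketched as follows. This shows only the standard PSNR formula for a single pair of frames, not the GAP-model itself. Representing a frame as a flat list of 8-bit pixel intensities and the function name `psnr` are assumptions made for illustration.

```python
import math


def psnr(original, reconstructed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Each frame is assumed to be a flat sequence of pixel intensities
    in the range [0, max_value].
    """
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("frames must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    sent = [10, 20, 30, 40]
    received = [11, 21, 31, 41]  # every pixel is off by one
    print(round(psnr(sent, received), 2))
```

Because this has to run over every frame pair of the sender-side and receiver-side sequences, the cost grows with video length and resolution, which is the limitation the abstract cites as motivation for estimating QoE from network measurements instead.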

Study on Scalable Video Coding Signals Transmission Scheme using LED-ID System (LED-ID 시스템을 이용한 SVC 신호의 전송 기법에 관한 연구)

Lee, Kyu-Jin;Cha, Dong-Ho;Hwang, Sun-Ha;Lee, Kye-San
- The Journal of Korean Institute of Communications and Information Sciences / v.36 no.10B / pp.1258-1267 / 2011
This paper studies how to transmit video signals over an indoor LED-ID communication system. LED-ID communication is an effective approach because the LEDs provide lighting and communication at the same time. The proposed system transmits signals over visible light (RGB); the RGB mixture ratio determines the color of the light, and each color channel has its own transmission performance. However, if the video signals are assigned to the R, G, and B channels in a fixed manner, as in the original system, video quality is limited because the signals differ in importance, as SVC signals do. To address this problem, we analyze the performance of a white LED according to its RGB mixture ratio and, based on that analysis, study how to allocate the SVC signals for transmission so as to improve video quality.
https://doi.org/10.7840/KICS.2011.36B.10.1258

Adaptive Video Enhancement Algorithm for Military Surveillance Camera Systems (국방용 감시카메라를 위한 적응적 영상화질 개선 알고리즘)

Shin, Seung-Ho;Park, Youn-Sun;Kim, Yong-Sung
- The Journal of Korean Institute of Communications and Information Sciences / v.39C no.1 / pp.28-35 / 2014
Surveillance cameras along national borders and coastlines often suffer from video distortion caused by rapidly changing weather and lighting conditions. Enhancing the distorted video is necessary to maintain surveillance. In this paper, we propose a video enhancement algorithm that adapts to various environmental changes. To address the unstable performance of the existing method, the proposed method is based on the Retinex algorithm and uses enhancement curves adapted to foggy and low-light conditions. In addition, we incorporate a weighted HSV color model to preserve color constancy and reduce noise to obtain clear images. As a result, the proposed algorithm achieves well-balanced contrast enhancement and effective color restoration without any quality loss compared with the existing algorithm. We expect this method to be used in surveillance camera systems and to contribute to reliable national defence.
https://doi.org/10.7840/kics.2014.39C.1.28

A Study of CCTV Video Tracking Technique to The Object Monitoring in The Automation Manufacturing Facilities (자동화 생산 시설물의 객체모니터링을 위한 CCTV 영상추적 기술에 관한 연구)

Seo, Won-Gi;Lee, Ju-Young;Park, Goo-Man;Shin, Jae-Kwon;Lee, Seung-Youn
- Journal of Satellite, Information and Communications / v.7 no.1 / pp.134-138 / 2012
In this paper, we implement a real-time status monitoring system for surveilling objects in automated manufacturing facilities and propose a CCTV video tracking system that uses a video tracking filter to improve efficiency. Rather than following the general approach to video monitoring, we implement the monitoring software on the basis of the video tracking filter, which makes reliable and efficient PC-based monitoring possible. In addition, a real-time status confirmation function improves accessibility and convenience for administrators. We also confirm the performance improvement through a performance analysis of the proposed monitoring system using the video tracking filter.

A Study on the Performance Enhancements of Video Streaming Service in MPLS Networks (MPLS 망을 통한 Video Streaming Service의 성능 개선에 관한 연구)

Kim Choong-Hyun;Kim Young-Beom
- Journal of the Institute of Convergence Signal Processing / v.7 no.2 / pp.60-64 / 2006
In typical video streaming services such as MPEG-encoded video, the size of the transmitted data changes with the frame type, and the bandwidth required for QoS support also changes over time. Accordingly, video streaming services over the Internet may stall occasionally because of momentary shortages of the required bandwidth under unexpected congestion, even if network administration assigns the highest priority to the service. In this paper, we investigate several methods for bandwidth allocation and traffic engineering to support MPEG video traffic and propose a new method that reduces transmission delay and enhances throughput, thereby meeting the QoS requirements. In the proposed scheme, LSPs are established based on temporal variation patterns of Internet traffic, and the CQ_LLQ policing scheme is applied for traffic shaping at the ingress routers. Finally, we verify the performance of the proposed scheme through computer simulations using OPNET.

