• Title/Summary/Keyword: Video Resources

MF sampler: Sampling method for improving the performance of a video based fashion retrieval model (MF sampler: 동영상 기반 패션 검색 모델의 성능 향상을 위한 샘플링 방법)

  • Baek, Sanghun;Park, Jonghyuk
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.329-346 / 2022
  • As the market for short-form videos on social media (Instagram, TikTok, YouTube) has grown, research using them has become active in the artificial intelligence field. A representative research area is Video to Shop, which detects fashion products in videos and retrieves matching product images. Such video-based models extract product features using convolution operations. However, because computational resources are limited, extracting features from every frame of a video is practically impossible. For this reason, existing studies have improved model performance by sampling only a subset of the frames or by developing sampling methods that exploit the subject's characteristics. Existing Video to Shop studies sample frames either randomly or at even intervals. However, these methods also sample noise frames in which the product does not appear, which degrades the performance of the fashion product retrieval model. This paper therefore proposes the MF (Missing Fashion items on frame) sampler, a sampling method that removes noise frames and improves retrieval performance. The MF sampler addresses the resource limitation with a keyframe mechanism and improves retrieval performance by removing noise frames with a noise detection model. Experiments confirmed that the proposed method improves the model's performance and makes model training more effective.
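
The contrast between even-interval sampling and sampling after noise removal can be sketched as follows. This is a minimal illustration, not the authors' MF sampler: the abstract does not describe the keyframe mechanism or the noise detection model, so `has_product` is a hypothetical stand-in for any detector that reports whether a frame shows a product, and both function names are assumptions.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def uniform_indices(num_frames: int, num_samples: int) -> List[int]:
    """Return up to num_samples frame indices spaced at even intervals."""
    if num_frames <= 0 or num_samples <= 0:
        return []
    num_samples = min(num_samples, num_frames)
    step = num_frames / num_samples
    return [int(i * step) for i in range(num_samples)]


def sample_without_noise(
    frames: Sequence[Frame],
    num_samples: int,
    has_product: Callable[[Frame], bool],  # hypothetical noise detector
) -> List[Frame]:
    """Drop frames without a product, then sample evenly from the rest."""
    kept = [f for f in frames if has_product(f)]
    return [kept[i] for i in uniform_indices(len(kept), num_samples)]
```

Filtering before sampling means the fixed frame budget is spent only on frames that can contribute product features, which is the motivation the abstract gives for removing noise frames.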

Analysis of Research Trends in Deep Learning-Based Video Captioning (딥러닝 기반 비디오 캡셔닝의 연구동향 분석)

  • Lyu Zhi;Eunju Lee;Youngsoo Kim
    • KIPS Transactions on Software and Data Engineering / v.13 no.1 / pp.35-49 / 2024
  • Video captioning, a significant outcome of integrating computer vision and natural language processing, has emerged as a key research direction in artificial intelligence. The technology aims at automatic understanding and linguistic expression of video content, enabling computers to transform the visual information in videos into text. This paper provides an initial analysis of research trends in deep learning-based video captioning, categorizes the models into four main groups (CNN-RNN-based, RNN-RNN-based, multimodal-based, and Transformer-based), explains the concept of each, and discusses their features, strengths, and weaknesses. It also lists the datasets and performance evaluation methods commonly used in the field. The datasets cover diverse domains and scenarios and offer extensive resources for training and validating video captioning models. The section on evaluation covers the major evaluation metrics and gives researchers practical references for evaluating model performance from various angles. Finally, the paper identifies open challenges for future research, such as maintaining temporal consistency and accurately describing dynamic scenes, which add complexity in real-world applications, and presents new tasks for study, such as temporal relationship modeling and multimodal data integration.

Estimation of the distribution density of snow crab, Chionoecetes opilio using a deep-sea underwater camera system attached on a towing sledge (예인식 심해용 비디오카메라를 이용한 대게의 서식밀도 추정)

  • An, Heui-Chun;Lee, Kyoung-Hoon;Bae, Jae-Hyun;Bae, Bong-Seong;Shin, Jong-Keun
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.45 no.3 / pp.151-156 / 2009
  • This study estimated the distribution density of snow crab, Chionoecetes opilio, using an underwater video monitoring system attached to a towing sledge. The field experiments were carried out in the coastal waters around Chuksan, East Sea, at depths of 110 to 130 m, during September and October 2007. The sledge was towed for 40 minutes at a speed held between 1.5 and 1.7 knots. Each survey area was calculated by multiplying the towed distance by the detection width of the video monitoring system (1.2 m), and the distribution density of snow crab in each observation was then estimated as the number of crabs counted per 1,000 $m^2$. The surveys in the two months gave similar results: the maximum and mean distribution densities were estimated at 77.0 and 19.9 crabs per 1,000 $m^2$ in September, and at 36.0 and 21.8 in October.
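
The density calculation described above (survey area = towed distance × 1.2 m detection width, density = crabs counted per 1,000 $m^2$) can be worked through in a short sketch. Two assumptions are made here: the towed distance is approximated as speed × time, although the abstract does not say how the distance was measured, and any crab count passed in is hypothetical, since the abstract reports only the resulting densities. The function names are illustrative.

```python
KNOT_M_PER_S = 1852.0 / 3600.0  # 1 knot = 1,852 m per hour


def swept_area_m2(speed_knots: float, minutes: float, width_m: float = 1.2) -> float:
    """Survey area = towed distance (assumed speed x time) x detection width."""
    distance_m = speed_knots * KNOT_M_PER_S * minutes * 60.0
    return distance_m * width_m


def density_per_1000_m2(crab_count: int, area_m2: float) -> float:
    """Crabs counted per 1,000 square metres of surveyed seabed."""
    return crab_count / area_m2 * 1000.0
```

Under these assumptions, a 40-minute tow covers roughly 2,222 $m^2$ at 1.5 knots and roughly 2,519 $m^2$ at 1.7 knots.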

Detection of an Impact Flash Candidate on the Moon with an Educational Telescope System

  • Kim, Eunsol;Kim, Yong Ha;Hong, Ik-Seon;Yu, Jaehyung;Lee, Eungseok;Kim, Kyoungja
    • Journal of Astronomy and Space Sciences / v.32 no.2 / pp.121-125 / 2015
  • At the suggestion of the NASA Meteoroid Environment Office (NASA/MEO), which promotes lunar impact monitoring worldwide during NASA's Lunar Atmosphere and Dust Environment Explorer (LADEE) mission period (launched Sept. 2013), we set up a video observation system for lunar impact flashes using a 16-inch educational telescope at Chungnam National University. From Oct. 2013 through Apr. 2014, we recorded 80 hours of video observation of the unilluminated part of the crescent moon in the evening hours. We found a plausible candidate impact flash on Feb. 3, 2014 at selenographic longitude $2.1^{\circ}$ and latitude $25.4^{\circ}$. The flash lasted for 0.2 s and the light curve was asymmetric with a slow decrease after a peak brightness of $8.7 \pm 0.3$ mag. Based on a star-like distribution of pixel brightness and asymmetric light curve, we conclude that the observed flash was due to a meteoroid impact on the lunar surface. Since unequivocal detection of an impact flash requires simultaneous observation from at least two sites, we strongly recommend that other institutes and universities in Korea set up similar inexpensive monitoring systems involving educational or amateur telescopes, and that they collaborate in the near future.

A Selective Layer Discard Algorithm for Stored Video Delivery over Resource Constrained Networks (자원 제약이 있는 네트워크에서 저장 비디오 데이터의 효율적인 전송을 위한 선택적 계층삭제 알고리즘)

  • No, Ji-Won;Lee, Mi-Jeong
    • The KIPS Transactions:PartC / v.8C no.5 / pp.647-656 / 2001
  • Video delivery from a server to a client across a network is an important part of many multimedia applications. The network is usually constrained in both its bandwidth and the buffer size at the client. When a video stream is delivered across such a constrained network, loss of frames may be unavoidable. The system resources consumed by the dropped frames are wasted, and the lost frames result in discontinuous display at the client. In this paper, for the delivery of hierarchically encoded video streams, we introduce a selective layer discard algorithm that not only discards data preemptively at the server but also drops the less important part of a frame instead of the entire frame. Through simulation, we compare the proposed selective layer discard algorithm with the existing selective frame discard algorithm. The simulation results show that the proposed algorithm may improve the quality of the decoded video and decrease replay discontinuity at the client.
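
The difference between discarding a whole frame and discarding only its less important layers can be illustrated with a per-frame sketch. This is a simplified assumption, not the paper's algorithm: the abstract does not give the discard policy, the buffer model, or the layer structure, so the per-frame byte budget and both function names are hypothetical. Layers are assumed to be ordered from most important (base layer) to least important.

```python
from typing import List


def layers_sent_with_layer_discard(layer_sizes: List[int], budget: int) -> int:
    """Send layers in priority order until the budget is exhausted."""
    used = 0
    sent = 0
    for size in layer_sizes:
        if used + size > budget:
            break  # drop this layer and every less important one
        used += size
        sent += 1
    return sent


def layers_sent_with_frame_discard(layer_sizes: List[int], budget: int) -> int:
    """Send the frame only if all of its layers fit; otherwise drop it whole."""
    return len(layer_sizes) if sum(layer_sizes) <= budget else 0
```

For a frame with layers of 40, 30, and 30 units and a budget of 80, layer discard still delivers the two most important layers, while frame discard delivers nothing and leaves a gap in playback.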

An Intelligent Media Player for Guaranteeing QoS Streaming Media on Thin-Client Computing (씬클라이언트 컴퓨팅에서 스트리밍 미디어의 QoS를 보장하는 지능형 미디어 플레이어)

  • Kim, Byeong-Gil;Lee, Joa-Hyoung;Jung, In-Bum
    • The KIPS Transactions:PartB / v.12B no.5 s.101 / pp.607-616 / 2005
  • Because thin clients have limited resources and decoding MPEG media requires a large amount of computation, it is not easy to deliver QoS-guaranteed streaming media to clients. To solve this problem, terminal servers take charge of decoding the MPEG media, and thin clients only update the changed areas of their screens. However, these previous approaches cause severely low video quality. In addition, since the servers perform every step of MPEG decoding, they are easily saturated even with a small number of clients. In this paper, the sources of the low video quality in previous thin-client solutions are investigated in both wireless and wired environments. Based on detailed experiments, an intelligent media player is proposed that achieves QoS streams by supporting both enhanced video quality and audio synchronized with the video frames.

A Dual Transcoding Method for Retaining QoS of Video Streaming Services under Restricted Computing Resources (동영상 스트리밍 서비스의 QoS유지를 위한 듀얼 트랜스코딩 기법)

  • Oh, Doohwan;Ro, Won Woo
    • KIPS Transactions on Computer and Communication Systems / v.3 no.7 / pp.231-240 / 2014
  • Video transcoding techniques provide an efficient mechanism for adapting video content to the capabilities of a variety of clients. However, it is hard to provide appropriate quality of service (QoS) to the clients because of the heavy workload of transcoding operations. This paper therefore presents a dual transcoding method that guarantees QoS in streaming services by maximizing resource usage in a transcoding server equipped with both CPU and GPU computing units, which have different architectural features. The proposed method estimates the workload of incoming transcoding requests and then schedules each request to either the CPU or the GPU accordingly. In the performance evaluation, the proposed dual transcoding method achieved a speedup of 1.84 over the traditional transcoding approach.
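
The idea of estimating each request's workload and dispatching it to the CPU or the GPU can be sketched with a simple greedy rule. This is an assumption for illustration, not the paper's scheduler: the abstract does not describe the workload estimator or the dispatch policy, and the processing rates `cpu_rate` and `gpu_rate` are hypothetical values, not measurements from the paper.

```python
from typing import List, Tuple


def schedule(
    workloads: List[float],
    cpu_rate: float = 1.0,  # hypothetical work units per second on the CPU
    gpu_rate: float = 3.0,  # hypothetical work units per second on the GPU
) -> Tuple[List[str], float]:
    """Greedily send each request to the unit that would finish it first."""
    cpu_busy_until = 0.0
    gpu_busy_until = 0.0
    plan: List[str] = []
    for work in workloads:
        cpu_finish = cpu_busy_until + work / cpu_rate
        gpu_finish = gpu_busy_until + work / gpu_rate
        if gpu_finish <= cpu_finish:
            gpu_busy_until = gpu_finish
            plan.append("gpu")
        else:
            cpu_busy_until = cpu_finish
            plan.append("cpu")
    # Makespan: the time at which the last unit becomes idle.
    return plan, max(cpu_busy_until, gpu_busy_until)
```

With four requests of 6 work units each, this rule sends the first three to the GPU and the fourth to the CPU, so both units finish at the same time instead of leaving the CPU idle.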

CLO (Cross Layer Optimization) Technique for Multi-view Video Streaming Service over WiBro Network (WiBro망에서의 다시점 비디오 스트리밍 서비스를 위한 계층 간 최적화 방식)

  • Son, Jung-Hyun;Cho, Ye-Jin;Suh, Doug-Young;Park, Gwang-Hoon;Kim, Kyu-Heon
    • Journal of Broadcast Engineering / v.13 no.5 / pp.719-731 / 2008
  • This paper defines QoE (Quality of Experience) for a multi-view video streaming service over WiBro and proposes a CLO (Cross-Layer Optimization) algorithm that maximizes it. The proposed CLO algorithm spans the layers from the physical layer to the video layer. Under time-varying wireless channel conditions, the CLO technique takes the view-wise and temporal priorities of the multi-view video into account to decide which frames to transmit and at what FEC level. Computer simulation shows that, in a handover situation, applying the proposed CLO technique achieves the optimal multi-view video quality with the minimum amount of resources.

Distortion Minimization Resource Allocation Scheme for Multiuser Video Transmission Over OFDM Network with Proportional Rates (다수 사용자 OFDM 시스템에서의 비디오 전송을 위한 비례 율 적용 왜곡 최소화 자원 할당 방법)

  • Ha, Ho-Jin;Yim, Chang-Hoon;Kim, Young-Yong
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.7B / pp.583-591 / 2008
  • This paper proposes a resource allocation algorithm that minimizes the overall distortion of multiple users in orthogonal frequency division multiplexing (OFDM) systems. In a system with limited resources, the proposed algorithm exploits multiuser diversity and a rate-distortion function based on a packet distortion model. We first derive a rate-distortion function that accounts for the error concealment and error propagation properties of H.264 video structures. We then perform adaptive resource allocation that uses multiuser diversity to minimize the overall video quality degradation. We also take into account the proportional rate that is predetermined for each user. Simulation results show that, compared with the previous time division multiple access method and with the resource allocation method that maximizes data rate, the proposed algorithm substantially improves the received video quality.

Distributed video coding complexity balancing method by phase motion estimation algorithm (단계적 움직임 예측을 이용한 분산비디오코딩(DVC)의 복잡도 분배 방법)

  • Kim, Chul-Keun;Kim, Min-Geon;Suh, Doug-Young;Park, Jong-Bin;Jeon, Byeung-Woo
    • Journal of Broadcast Engineering / v.15 no.1 / pp.112-121 / 2010
  • Distributed video coding is a coding paradigm that, in contrast with conventional video coding, allows complexity to be shared between the encoder and the decoder. We propose a method for balancing complexity between the encoder and the decoder using a phase motion estimation algorithm. The encoder performs partial motion estimation and transfers the result to the decoder, which then performs motion estimation within a narrow range. Complexity balancing is possible when the encoder can afford some complexity. The proposed method reveals the relationship between complexity balancing and coding efficiency: coding efficiency increases faster with added encoder complexity than with added decoder complexity. The proposed method can control complexity and coding efficiency according to the devices' resources and the channel conditions.