Search | Korea Science

Salient Video Frames Sampling Method Using the Mean of Deep Features for Efficient Model Training (효율적인 모델 학습을 위한 심층 특징의 평균값을 활용한 의미 있는 비디오 프레임 추출 기법)

Yoon, Hyeok;Kim, Young-Gi;Han, Ji-Hyeong
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.318-321 / 2021
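With recent advances in information and communication technology, both the number of Internet users and the resulting volume of transmitted video data are growing. Deep learning techniques are now widely used to manage and analyze this growing body of video data. When training a deep learning model on video data, limited computing resources usually force frames to be selected from the full video at uniform intervals or at random. However, the videos used for training cannot always be assumed to be trimmed videos that carry the same context along the time axis. If frames are selected uniformly or randomly from untrimmed videos whose context varies, frames unrelated to the video's category may be sampled, and such frames do not help model training and optimization at all. To address this, we propose a method that extracts deep features from each video frame, computes their mean, and computes the cosine similarity between this mean and each extracted deep feature; the resulting similarity scores are used to extract salient frames from untrimmed videos. On the ActivityNet dataset, a well-known dataset of untrimmed videos, we compare our method with two representative frame sampling schemes (uniform interval and random) and show that it effectively extracts salient frames that correspond to the video's category. The code used in our experiments is available at https://github.com/titania7777/VideoFrameSampler.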
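The scoring step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes the per-frame deep features are already available as plain lists of floats, and it assumes that the frames are chosen by keeping the `k` highest-scoring ones; the abstract only states that similarity scores guide the selection. The function names and the toy data are hypothetical.

```python
import math


def mean_vector(features):
    """Element-wise mean of equally sized feature vectors."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def sample_salient_frames(features, k):
    """Indices of the k frames closest to the mean feature, in temporal order.

    Keeping the top-k scores is an assumption made for this sketch.
    """
    mean = mean_vector(features)
    scores = [cosine_similarity(f, mean) for f in features]
    top = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


# Toy 2-D "deep features" (hypothetical): frames 0-3 share one context,
# frames 4-5 belong to an unrelated segment of the untrimmed video.
toy_features = [
    [1.0, 0.1], [0.9, 0.2], [1.0, 0.0], [0.8, 0.1],
    [0.0, 1.0], [0.1, 0.9],
]
selected = sample_salient_frames(toy_features, k=3)
```

In this toy case the mean feature is dominated by the four on-topic frames, so the two off-topic frames receive the lowest scores and are not selected.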

Indexing and Retrieving of Video Data (비디오 데이터의 색인과 검색)

Heo, Jin-Yong;Park, Dong-Won;An, Syung-Og
- The Journal of Engineering Research / v.3 no.1 / pp.107-116 / 1998
Video data are retrieved and stored in various compressed forms according to their characteristics. In this paper, we present a generic data model that captures the structure of a video document and provides a means for indexing a video stream. Using this model, we design and implement CVIMS (the MPEG-2 Compressed Video Information Management System) to store and retrieve video documents. CVIMS extracts I-frames from MPEG-2 TS files, selects key-frames from the I-frames, and stores index information such as thumbnails, captions, and picture descriptors of the key-frames in a database. CVIMS also retrieves MPEG-2 video data using the thumbnails of key-frames and various labels of queries. The system is also accessible through a web interface.

Design and Implementation of MPEG-2 Compressed Video Information Management System (MPEG-2 압축 동영상 정보 관리 시스템의 설계 및 구현)

Heo, Jin-Yong;Kim, In-Hong;Bae, Jong-Min;Kang, Hyun-Syug
- The Transactions of the Korea Information Processing Society / v.5 no.6 / pp.1431-1440 / 1998
Video data are retrieved and stored in various compressed forms according to their characteristics. In this paper, we present a generic data model that captures the structure of a video document and provides a means for indexing a video stream. Using this model, we design and implement CVIMS (the MPEG-2 Compressed Video Information Management System) to store and retrieve video documents. CVIMS extracts I-frames from MPEG-2 files, selects key-frames from the I-frames, and stores index information such as thumbnails, captions, and picture descriptors of the key-frames in a database. CVIMS also retrieves MPEG-2 video data using the thumbnails of key-frames and various labels of queries.

Fast construction of motion graph using PCA (PCA를 이용한 효율적 모션 그래프 생성)

Seong, Hye-Young;Kyung, Min-Ho
- Journal of the Korea Computer Graphics Society / v.10 no.2 / pp.51-56 / 2004
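Existing approaches that store motion data as a graph and use it for motion synthesis take a long time to build the graph, because they compute connection costs between all pairs of motion frames. This paper presents a method that addresses this drawback and builds the graph quickly and effectively. First, the motions are projected onto two dimensions using PCA, and a simple distance computation in 2D finds the frame pairs that are likely to have a transition edge. Next, connection costs are computed only for these frame pairs to build the graph. As a result, the proposed method builds the graph more efficiently than computing costs for all frame pairs.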
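The two-stage idea in this abstract (a cheap 2D filter, then the full cost only on the surviving pairs) can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: frames are plain pose vectors, the connection cost is replaced by Euclidean distance between pose vectors, the same threshold is used in both stages, and PCA is computed with a simple power iteration so the block needs only the standard library. All names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def center(rows):
    """Subtract the per-dimension mean from every row."""
    n = len(rows)
    mean = [sum(column) / n for column in zip(*rows)]
    return [[x - m for x, m in zip(row, mean)] for row in rows]


def covariance(rows):
    """Covariance matrix of already centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]


def power_iteration(matrix, iterations=200):
    """Approximate dominant eigenvector and eigenvalue of a symmetric matrix."""
    d = len(matrix)
    v = [1.0 + 0.1 * i for i in range(d)]  # fixed, slightly asymmetric start
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            break
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in matrix])  # Rayleigh quotient
    return v, eigenvalue


def pca_basis_2d(centered):
    """Two orthonormal principal directions (assumes the data spans >= 2 dims)."""
    cov = covariance(centered)
    d = len(cov)
    v1, lam1 = power_iteration(cov)
    deflated = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(d)] for i in range(d)]
    v2, _ = power_iteration(deflated)
    overlap = dot(v1, v2)  # Gram-Schmidt step keeps v2 orthogonal to v1
    v2 = [b - overlap * a for a, b in zip(v1, v2)]
    norm = math.sqrt(dot(v2, v2))
    return v1, [x / norm for x in v2]


def candidate_pairs(frames, threshold):
    """Frame pairs whose 2D PCA projections lie within the threshold."""
    centered = center(frames)
    v1, v2 = pca_basis_2d(centered)
    projected = [(dot(r, v1), dot(r, v2)) for r in centered]
    return [(i, j) for i, j in combinations(range(len(frames)), 2)
            if distance(projected[i], projected[j]) <= threshold]


def build_edges(frames, threshold):
    """Evaluate the full (stand-in) cost only on the candidate pairs."""
    edges = []
    for i, j in candidate_pairs(frames, threshold):
        cost = distance(frames[i], frames[j])  # stand-in for the connection cost
        if cost <= threshold:
            edges.append((i, j, cost))
    return edges


# Toy pose vectors (hypothetical): frames 0, 1, 2, 5 are similar; 3 and 4 differ.
toy_frames = [
    [0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.2, 0.1, 0.0],
    [5.0, 5.0, 1.0], [5.1, 5.0, 1.0], [0.1, 0.1, 0.05],
]
candidates = candidate_pairs(toy_frames, threshold=0.5)
edges = build_edges(toy_frames, threshold=0.5)
```

With Euclidean distance as the cost, projecting onto an orthonormal basis can only shrink distances, so the 2D filter never discards a pair whose full-dimensional distance is within the threshold. Whether a similar guarantee holds for the connection cost used in the paper is not stated in the abstract.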

Non-Reference P Frame Coding for Low-Delay Encoding in Internet Video Coding (IVC의 저지연 부호화 모드를 위한 비참조 P 프레임의 부호화 기법)

Kim, Dong-Hyun;Kim, Jin-Soo;Kim, Jae-Gon
- Journal of Broadcast Engineering / v.19 no.2 / pp.250-256 / 2014
Non-reference P frame coding is used to enhance coding efficiency in the low-delay encoding configuration of Internet Video Coding (IVC), which is being standardized as a royalty-free video codec in MPEG. The existing method, adopted in the reference Test Model of IVC (ITM) 4.0, adaptively applies a non-reference P frame with a fixed coding structure based on the magnitude of motion vectors (MVs); however, it unexpectedly degrades the coding efficiency for some sequences. In this paper, the existing non-reference P frame coding is improved by changing the non-reference P frame coding structure and applying a new adaptive method that uses, in addition to MVs, the ratio of the number of bits generated for non-reference frames to that for reference frames. Experimental results show that the proposed non-reference P frame coding gives a 6.6% BD-rate bit saving on average over ITM 7.0.
https://doi.org/10.5909/JBE.2014.19.2.250

Convert 2D Video Frames into 3D Video Frames (2차원 동영상의 3차원 동영상 변화)

Lee, Hee-Man
- Journal of the Korea Society of Computer and Information / v.14 no.6 / pp.117-123 / 2009
In this paper, an algorithm is proposed that converts 2D video frames into the 3D video frames of a parallel-looking stereo camera. The proposed algorithm finds the disparity information between two consecutive video frames and generates 3D video frames from the obtained disparity maps. The disparity information is obtained with a modified iterative convergence algorithm. A method of generating 3D video frames from the disparity information is also proposed. The proposed algorithm uses a coherence method that overcomes the shortcomings of video-pattern-based algorithms.
https://doi.org/10.9708/jksci.2009.14.6.117

A Study on Iterative Turbo Decoding Using Three Cascade MAP Decoder (3개의 직렬 MAP 복호기를 이용한 반복 터보 복호화에 관한 연구)

Kim Dong-Won;Kang Chul-Ho
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.343-346 / 1999
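Turbo codes are generally known to perform better as the interleaver size increases. When voice signals are transmitted, as in mobile communication systems, the frame size, that is, the interleaver size, is too small, so some performance degradation is to be expected. In this paper, we propose using three cascaded MAP decoders for turbo decoding and examine, through the proposed algorithm, a BER of $10^{-3}$ at an S/N of 2.0 [dB], the criterion for voice, while reducing the amount of memory compared with the conventional method. In simulations with a code rate of 1/3, 5 decoding iterations, and generator polynomial G=(7, 5), where the interleaver size equals the frame size used in IS-95[9], we obtained a BER of about $10^{-2}$ for a frame size of 24 and about $10^{-3}$ for a frame size of 192.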

Implementation of Integrated Receiver for Terrestrial/Cable/Satellite HD Broadcasting Services (유럽형 지상파/케이블/위성 멀티모드 HD 방송 수신이 가능한 통합 수신기 구현)

Lee, Youn-Sung;Kwon, Ki Won;Kim, Dong Ku
- The Journal of Korean Institute of Communications and Information Sciences / v.40 no.11 / pp.2113-2120 / 2015
This paper presents an integrated receiver to support multimode broadcasting standards such as DVB-T2, DVB-C2, and DVB-S2 in a single platform. The integrated receiver consists of a tuner block, a receiver engine, a frame processor, and an A/V decoder. The receiver engine includes a channel decoding engine and a demodulation engine to perform OFDM and APSK demodulations. The frame processor performs deinterleaving and BB frame decoding functions. The demodulator engine and the frame processor are implemented in two FPGA devices and DSP-based embedded software, respectively. To verify the functionality of the integrated receiver, it is tested in the laboratory. Commercial PC-based modulators are used to generate the DVB-T2, DVB-C2, and DVB-S2 modulated signals. The integrated receiver was tested under various operation modes as specified in the standards such as DVB-T2, DVB-C2, and DVB-S2 and showed successful operation in all the scenarios tested.
https://doi.org/10.7840/kics.2015.40.11.2113

DCT-domain Intra Prediction Scheme for MPEG-2/H.264 Transcoder (MPEG-2/H.264 트랜스코더를 위한 DCT 기반 Intra 예측기법)

Lee, Joo-Kyong;Chung, Ki-Dong
- Proceedings of the Korea Information Processing Society Conference / 2006.11a / pp.231-234 / 2006
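In H.264 coding, the Intra mode is chosen by computing the error of many candidate modes, namely nine $4{\times}4$ modes for each of the 16 $4{\times}4$ blocks in a macroblock (MB) and four $16{\times}16$ modes, and selecting the best one. Applying this pixel-domain prediction to a DCT-domain MPEG-2/H.264 transcoder increases the computation required for mode prediction because of the properties of the DCT. To solve this problem, this paper proposes a scheme that converts the MPEG-2 DCT to the H.264 integer DCT and then determines the Intra mode from the characteristics of the DCT coefficients. When the coefficient characteristics are ambiguous and the mode is hard to determine, a few modes are selected and their error values are computed to decide the mode. Detailed experiments are still in progress; in experiments on the first Intra frame of several video sequences, the accuracy of the MB mode decision depended strongly on the characteristics of the image in the frame. For example, for frames with many edges between pixels, such as Mobile, the accuracy was about 93% without the additional mode decision, whereas for sequences with relatively high similarity between neighboring pixels, such as Akiyo and Foreman, the raw accuracy was about 80%. However, if modes are also decided for the ambiguous cases, an accuracy above 90% is expected. In future work, we will study a scheme that adaptively sets the mode decision values according to image characteristics in order to improve accuracy.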

3D Human Reconstruction from Video using Quantile Regression (분위 회귀 분석을 이용한 비디오로부터의 3차원 인체 복원)

Han, Jisoo;Park, In Kyu
- Journal of Broadcast Engineering / v.24 no.2 / pp.264-272 / 2019
In this paper, we propose a method for reconstructing and refining a 3D human body from frames extracted from a video, so as to obtain natural and smooth motion in the temporal domain. Individual frames extracted from the video are fed into a convolutional neural network to estimate the joint locations and the silhouette of the human body. This is done by projecting a parameter-based 3D deformable model onto the 2D image and estimating the optimal parameter values. If the reconstruction for each frame is performed independently, temporal consistency of human pose and shape cannot be guaranteed, yielding inaccurate results. To alleviate this problem, the proposed method analyzes and interpolates the principal component parameters of the 3D morphable model reconstructed from each individual frame. Experimental results show that erroneous frames are corrected and refined by using the relation between the previous and next frames, giving an improved 3D human reconstruction.
https://doi.org/10.5909/JBE.2019.24.2.264
