Search | Korea Science

Decoder Adaptive Tile Clustering Algorithm for Viewport-Dependent Virtual Reality Video Decoding System (시점 기반 가상 현실 영상 복호화 시스템을 위한 복호기 적응적 타일 클러스터링 알고리즘)

Park, Jun-Ho;Jeong, Jong-Beom;Jeong, Se-Hoon;Ryu, Eun-Seok
- Proceedings of the Korean Society of Broadcast Engineers Conference / fall / pp.197-200 / 2021
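Viewport-dependent tile streaming is being actively studied as one of the 360-degree video coding and transmission technologies for immersive, high-quality virtual reality video streaming. Because 360-degree video is large, it is efficient to transmit only the user's viewport using a per-tile streaming method. This paper proposes a decoder-adaptive tile clustering algorithm for a viewport-dependent virtual reality video decoding system. The proposed method first searches for the maximum resolution that the client's decoder can decode. It then uses the user viewport data and the decoder-adaptive tile clustering algorithm to generate a list of the viewport tiles to be clustered, and merges those tiles with a tile merger to produce cluster bitstreams. The client then decodes the merged cluster bitstreams and generates the user viewport. The proposed method enables decoding that is not constrained by the client's decoder environment. One of the proposed variants, 4K_clustering, improves decoding speed by 8%, which enables real-time tile streaming for immersive, high-quality virtual reality video.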
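The clustering step described above can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes that all tiles share one size, that the decoder limit is a sample budget per merged picture, and that tiles are packed greedily in list order. The names `cluster_tiles`, `tile_w`, `tile_h`, and `max_pixels` are hypothetical.

```python
def cluster_tiles(viewport_tiles, tile_w, tile_h, max_pixels):
    """Greedily group viewport tile indices into clusters that fit the decoder.

    Assumption (not from the paper): the decoder limit is a pixel budget per
    merged picture, and every tile has the same width and height.
    """
    tile_pixels = tile_w * tile_h
    if tile_pixels > max_pixels:
        raise ValueError("a single tile exceeds the decoder limit")
    per_cluster = max_pixels // tile_pixels  # tiles that fit in one cluster
    return [viewport_tiles[i:i + per_cluster]
            for i in range(0, len(viewport_tiles), per_cluster)]


# Example: ten 640x640 viewport tiles, decoder budget of 1920x1080 samples.
clusters = cluster_tiles(list(range(10)), 640, 640, 1920 * 1080)
print(clusters)
```

Each inner list stands for the tiles that would be merged into one cluster bitstream. A real tile merger also has to respect picture-dimension and level constraints, which this sketch ignores.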

Intelligent Face Mosaicing Method in Video for Personal Information Protection (개인정보 보호를 위한 비디오에서의 지능형 얼굴 모자이킹 방법)

Lim, Hyuk;Choi, Minseok;Choi, Seungbi;Choi, Haechul
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.11a / pp.338-339 / 2020
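With the spread of personal broadcasting, the faces of ordinary people are frequently exposed in videos distributed over the internet or on air, and broadcasting a face without consent can cause social problems such as infringement of portrait rights. To address this problem, this paper proposes a method that detects the faces of ordinary people in video and masks them. The proposed method first trains a deep-learning-based Faster R-CNN on many face images that include both specific persons, who are not to be mosaicked, and non-specific persons, who are. The trained network detects faces in the input video and picks out the specific persons among the detections. Finally, mosaicking is applied to every detected face other than those of the specific persons, so that the faces of non-specific persons are hidden intelligently. In experiments, face detection covering both specific and non-specific persons reached 99% accuracy, and identifying the specific persons among the detected faces reached 86% accuracy. The proposed method is expected to be useful for protecting personal information in internet video services and broadcasting.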
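The final masking step can be illustrated with a short sketch. It assumes the detector has already produced labelled boxes; the detection format `(label, (x0, y0, x1, y1))`, the grayscale list-of-rows frame, and the function names are all hypothetical, and block averaging is only one common way to produce a mosaic.

```python
def mosaic_region(frame, box, block=8):
    """Pixelate a box in place by averaging each block x block cell.

    frame is a list of rows of grayscale ints; box is (x0, y0, x1, y1)
    with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = box
    for by in range(y0, y1, block):
        for bx in range(x0, x1, block):
            ys = range(by, min(by + block, y1))
            xs = range(bx, min(bx + block, x1))
            mean = sum(frame[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    frame[y][x] = mean
    return frame


def mosaic_unlisted_faces(frame, detections, keep_labels, block=8):
    """Mosaic every detected face whose label is not in keep_labels."""
    for label, box in detections:
        if label not in keep_labels:
            mosaic_region(frame, box, block)
    return frame


# Example: a 4x4 frame with one face to keep and one face to hide.
frame = [[x + 4 * y for x in range(4)] for y in range(4)]
detections = [("host", (0, 0, 2, 2)), ("unknown", (2, 2, 4, 4))]
mosaic_unlisted_faces(frame, detections, keep_labels={"host"}, block=2)
```

After the call, the pixels inside the "unknown" box all hold the block mean, while the "host" box is unchanged.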

Feature map reordering for Neural Network feature map coding (신경망 특징맵 부호화를 위한 특징맵 재배열 방법)

Han, Heeji;Kwak, Sangwoon;Yun, Joungil;Cheong, Won-Sik;Seo, Jeongil;Choi, Haechul
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.11a / pp.180-182 / 2020
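As IoT technology has become widespread, machine-to-machine applications such as connected cars and smart cities are diversifying. Accordingly, industry and academia are paying increasing attention to research on machine-oriented video processing and coding. Reflecting this trend, the international standardization body MPEG has formed the Video Coding for Machines (VCM) group to establish a new standard that improves on existing video coding standards, and is standardizing video coding targeted at machine consumption. This paper proposes a method that reorders feature maps temporally and spatially to improve the coding efficiency of the feature map coding that VCM is pursuing for machine consumption. In experiments, temporal reordering applied to a subset of images in the CityScapes validation set improved coding efficiency by up to 1.48% under the random access configuration.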

Temporally adaptive and region-selective signaling of applying multiple neural network models

Ki, Sehwan;Kim, Munchurl
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.11a / pp.237-240 / 2020
A neural network (NN) model fine-tuned for a whole temporal portion of a video does not always yield the best quality (e.g., PSNR) over all regions of each frame in that period. For certain regions of a frame (usually homogeneous ones) in super-resolution (SR), even simple bicubic interpolation may yield better PSNR than the fine-tuned NN model. When multiple NN models are available at the receiver, each trained for a group of images that share a specific category of image characteristics, quality enhancement can be improved by selectively applying to each image region the NN model trained for that region's characteristic category. In this case, it is necessary to signal which NN model is applied to each region. This is very advantageous for image restoration and quality enhancement (IRQE) applications at user terminals with limited computing capabilities.
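The per-region signalling idea can be sketched as follows. The abstract does not give the selection rule, so this sketch assumes an encoder-side choice by lowest mean squared error against the original region (equivalent to highest PSNR for a fixed peak value). The names `select_models` and `candidate_outputs`, and the use of index 0 for bicubic interpolation, are hypothetical.

```python
def mse(a, b):
    """Mean squared error between two equally sized flat pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def select_models(original_regions, candidate_outputs):
    """Return, per region, the index of the candidate with the lowest MSE.

    candidate_outputs[m][r] is candidate m's reconstruction of region r.
    The returned list is what would be signalled to the receiver.
    """
    flags = []
    for r, ref in enumerate(original_regions):
        errors = [mse(ref, outputs[r]) for outputs in candidate_outputs]
        flags.append(errors.index(min(errors)))
    return flags


# Example: region 0 is flat, region 1 is textured.
original = [[10, 10, 10, 10], [0, 50, 100, 150]]
bicubic = [[10, 10, 10, 10], [10, 40, 90, 140]]   # candidate 0
nn_model = [[12, 9, 11, 10], [0, 52, 99, 150]]    # candidate 1
flags = select_models(original, [bicubic, nn_model])
print(flags)
```

Here the flat region selects bicubic interpolation and the textured region selects the NN model, which mirrors the observation in the abstract.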

An Emergency Rescue System based on Real-time Video Processing (실시간 영상 전송 기술을 활용한 응급 구조 시스템)

Lee, Hyeonggeon;Park, Junho;Cheon, Jaeyoon;Lim, Jeonghoon;Oh, Myeongseong;Moon, Dongjin;Jang, Hyunsu;Kim, Jeongseok;Koh, Seokjoo
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.11a / pp.277-279 / 2020
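Recent advances in wireless communication have made it possible to transmit not only small amounts of data such as text and images but also large amounts of data such as video. This paper proposes a wireless-network situation reporting system that uses GPS and various sensors to transmit GPS data and video in real time, so that the circumstances of an accident can be conveyed effectively to rescue agencies. The camera video and GPS data from a Raspberry Pi module are streamed in real time to a server and to the rescue agency using ffmpeg and ffserver. The proposed system was implemented as a working prototype, and experiments show that it sends video and GPS coordinates to the rescue agency immediately, which helps identify the accident situation early and contributes to a faster rescue.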

Feature map channel reordering and compression for Neural Network feature map coding (신경망 특징맵 부호화를 위한 특징맵 재배열 및 압축 방법)

Han, Heeji;Kwak, Sangwoon;Yun, Joungil;Cheong, Won-Sik;Seo, Jeongil;Choi, Haechul
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.39-42 / 2021
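Neural-network-based technologies that use images or video are being widely applied, and the tasks that neural networks handle are becoming more diverse and complex. Because this diversity and complexity demand ever more video data, a way to transmit video data efficiently is needed. Accordingly, the international standardization body MPEG is carrying out Video Coding for Machines (VCM) standardization to develop a video coding standard suited to consumption by neural-network machines. To improve the coding efficiency of neural network feature maps, this paper proposes a method that reorders feature map channels so that the similarity between channels is high and then compresses them. Coding efficiency was evaluated on 360 images randomly selected from the 5,000 validation images of the VCM OpenImages dataset. Bits per pixel decreased for all quantization values while object detection accuracy was maintained, and a coding gain of 2.07% was obtained in terms of BD-rate.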
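One simple way to realize similarity-driven channel reordering is a greedy nearest-neighbour ordering, sketched below. The abstract does not specify the similarity measure or the ordering strategy, so the sum of absolute differences and the greedy chain starting at channel 0 are assumptions, and the function names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two flattened channels."""
    return sum(abs(x - y) for x, y in zip(a, b))


def reorder_channels(channels):
    """Return a channel order in which each channel is followed by the most
    similar channel not yet placed (ties broken by lower index)."""
    if not channels:
        return []
    order = [0]
    remaining = set(range(1, len(channels)))
    while remaining:
        last = channels[order[-1]]
        nxt = min(remaining, key=lambda i: (sad(last, channels[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Example: four flattened 2x2 channels with two similar pairs.
channels = [[0, 0, 0, 0], [9, 9, 9, 9], [1, 1, 1, 1], [8, 8, 8, 8]]
order = reorder_channels(channels)
print(order)
```

The resulting order would need to be sent alongside the bitstream so that the decoder can restore the original channel positions before the feature map is fed to the network.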

Non-manner parking enforcement system (비매너 주차 단속시스템)

Park, Sang-min;Son, Byung-Soo;Kim, Myung-sik;Choe, Byeong-Yun
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.603-604 / 2021
This paper presents an enforcement system that prevents conflicts caused by inconsiderate ("non-manner") parking in parking lots. Examples include vehicles parked in spaces reserved for disabled people, ordinary vehicles parked in electric vehicle parking areas, and vehicles parked across two spaces. Such vehicles are detected and reported using deep-learning object recognition. Pictures or videos of inconsiderate parking are used to produce training data so that these situations can be recognized, and the recognition result determines whether inconsiderate parking is present. The aim is to reduce conflicts between parking lot users by making the parking lot environment more comfortable.

Video System for Real-time Criminal Activity Detection (실시간 범죄행위 감지를 위한 영상시스템)

Shin, Kwang-seong;Shin, Seong-yoon
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.357-358 / 2021
Even when many people monitor scenes from multiple surveillance cameras, it is difficult to ensure that immediate action is taken when a crime occurs. There is therefore a need for a "crime behavior detection system" that analyzes images in real time from multiple surveillance cameras installed in elevators, raises immediate crime alerts, and effectively tracks where and when a crime occurred. This paper studies the detection of violent scenes in elevators using scene change detection. For effective detection, a χ²-color histogram, which combines the color histogram with the χ² histogram, was applied.
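Histogram-based scene change detection can be sketched as below. The abstract does not define its combined measure, so this sketch uses one common symmetric form of the chi-squared (χ²) histogram distance on single-channel frames. The bin count, the threshold, and the function names are hypothetical.

```python
def histogram(pixels, bins=4, max_value=256):
    """Count pixels per intensity bin for one flattened frame."""
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // max_value] += 1
    return hist


def chi_square(h1, h2):
    """Symmetric chi-squared distance: sum of (a - b)^2 / (a + b)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def scene_changes(frames, threshold):
    """Return indices of frames whose histogram differs sharply from the
    previous frame's histogram."""
    hists = [histogram(f) for f in frames]
    return [i for i in range(1, len(hists))
            if chi_square(hists[i - 1], hists[i]) > threshold]


# Example: two similar dark frames followed by a bright frame.
frames = [[10, 20, 30, 40], [12, 22, 28, 41], [200, 210, 220, 250]]
changes = scene_changes(frames, threshold=1.0)
print(changes)
```

For color video, the same distance could be computed per color channel and summed, which is one plausible reading of combining a color histogram with a χ² test.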

Loss Compression and Loss Correction Technique of 3D Point Cloud Data (3차원 데이터의 손실압축과 손실보정기법 연구)

Shin, Kwang-seong;Shin, Seong-yoon
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.351-352 / 2021
The recent rapid change in the social environment caused by COVID-19 has sharply increased the need for non-face-to-face, contactless information exchange technology. As a result, alternative systems that provide a sense of immersion and presence are urgently required. In this study, to implement a video conferencing system, we implemented a technology for transmitting large-capacity 3D data in real time without delay. To this end, we used an applied variant of the GAN, a recent unsupervised deep learning algorithm.

Aural-visual two-stream based infant cry recognition (Aural-visual two-stream 기반의 아기 울음소리 식별)

Bo, Zhao;Lee, Jonguk;Atif, Othmane;Park, Daihee;Chung, Yongwha
- Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.354-357 / 2021
Infants communicate their feelings and needs to the outside world through non-verbal means such as crying and displaying diverse facial expressions. However, inexperienced parents tend to decode these non-verbal messages incorrectly and take inappropriate actions, which might affect the bond they build with their babies and the cognitive development of the newborns. In this paper, we propose an aural-visual two-stream infant cry recognition system to help parents understand the feelings and needs of crying babies. The proposed system first extracts features from the pre-processed audio and video data using the VGGish model and a 3D-CNN model, respectively. It then fuses the extracted features using a fully connected layer, and finally applies a softmax function to classify the fused features and recognize the corresponding type of cry. The experimental results show that the proposed system achieves an F1-score above 0.92, which is 0.08 and 0.10 higher than the single-stream aural model and the single-stream visual model, respectively.
https://doi.org/10.3745/PKIPS.y2021m05a.354
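The fusion and classification stage of the two-stream system can be sketched in a few lines. This is an illustration only: it assumes the two feature vectors are concatenated before a single fully connected layer, and the tiny feature sizes, weights, and function names are hypothetical stand-ins for trained VGGish and 3D-CNN outputs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(audio_feat, video_feat, weights, bias):
    """Concatenate both feature vectors, apply one fully connected layer,
    and return (predicted class index, class probabilities)."""
    fused = audio_feat + video_feat  # list concatenation
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs


# Example: 2-d audio and 2-d video features, three cry classes.
weights = [[1, 0, 0, 0],
           [0, 0, 0, 1],
           [0, 1, 1, 0]]
bias = [0, 0, 0]
label, probs = fuse_and_classify([1.0, 0.0], [0.0, 2.0], weights, bias)
print(label, probs)
```

In this toy case the video feature dominates, so the second class receives the highest probability.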
