Search | Korea Science

Self-Supervised Spatiotemporal Learning For Video Using Variable Rotate Angle And Speed Prediction (비디오에서의 다양한 회전 각도와 회전 속도를 사용한 시 공간 자기 지도학습)

Kim, Taehoon;Hwang, Wonjun
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.732-735 / 2020
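Existing supervised learning methods perform well, but training requires video data together with ground-truth labels. Those labels must be attached manually, which costs considerable time and money. Among the various approaches to this problem, this study applies rotation, one of the self-supervised learning methods, to video data. We propose two methods. In the first, rather than simply rotating the whole input video, each frame of the input video is rotated at a constant speed as time passes, and the network classifies the rotation among four angles [0, 90, 180, 270]. In the second, as the frames change over time, each frame is rotated by a fixed angle, and the network classifies the rotation speed among four speeds [1x, 0.5x, 0.25x, 0.125x]. After training the network with these proposed pretext tasks, we fine-tuned the trained model, ran video classification experiments, and report the results.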
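The abstract does not spell out how the rotated clips are built, so the following is only a minimal sketch of one possible reading of the two pretext tasks. It assumes that in the angle task frame t is rotated by t times the class angle, and that in the speed task the rotation advances by a fixed 90 degrees once every 1/speed frames. The names `rot90`, `angle_task`, and `speed_task` are hypothetical and not taken from the paper.

```python
# Sketch of the two rotation pretext tasks (assumed interpretation, not the authors' code).
ANGLES = [0, 90, 180, 270]          # classes for the angle task (from the abstract)
SPEEDS = [1.0, 0.5, 0.25, 0.125]    # classes for the speed task (from the abstract)


def rot90(frame, k):
    """Rotate a 2D frame (list of rows) counterclockwise by k * 90 degrees."""
    for _ in range(k % 4):
        frame = [list(row) for row in zip(*frame)][::-1]
    return frame


def angle_task(clip, label):
    """Assumption: frame t is rotated by t * ANGLES[label] degrees."""
    step = ANGLES[label] // 90
    return [rot90(f, t * step) for t, f in enumerate(clip)]


def speed_task(clip, label, fixed_angle=90):
    """Assumption: rotation advances by fixed_angle once every 1/SPEEDS[label] frames."""
    period = int(1 / SPEEDS[label])
    step = fixed_angle // 90
    return [rot90(f, (t // period) * step) for t, f in enumerate(clip)]


# A toy 4-frame clip of identical 2x2 frames.
clip = [[[1, 2], [3, 4]] for _ in range(4)]
angle_clip = angle_task(clip, label=1)   # 90 degrees per frame
speed_clip = speed_task(clip, label=1)   # 0.5x: rotate once every 2 frames
```

In this reading, the network would receive `angle_clip` or `speed_clip` and be trained to predict `label`, so no manual annotation is needed.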

GAN-based avatar generation and animation for video conferencing service (화상회의 서비스를 위한 GAN 기반 아바타 생성 및 애니메이션 구현 기술)

Moon, Ji-Eun;Kim, Ji-Yun;Park, Ji-Hye;Ahn, Hyo-Won;Lee, Kyoung-Mi
- Annual Conference of KIPS / 2022.11a / pp.761-763 / 2022
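As video conferencing has become more frequent since COVID-19, meeting others face to face at close range has raised people's fatigue to the point that the new term "Zoom fatigue" has appeared. This paper proposes a system that lets users join video conferences through an avatar created with face synthesis and image animation. A distinctive character that resembles the user can reflect the user's facial expressions and movements in real time in the video conference, and the character's emoticons can be used to express emotions in chats and communities.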
https://doi.org/10.3745/PKIPS.y2022m11a.761

Using Ensemble Learning Algorithm and AI Facial Expression Recognition, Healing Service Tailored to User's Emotion (앙상블 학습 알고리즘과 인공지능 표정 인식 기술을 활용한 사용자 감정 맞춤 힐링 서비스)

Yang, seong-yeon;Hong, Dahye;Moon, Jaehyun
- Annual Conference of KIPS / 2022.11a / pp.818-820 / 2022
The keyword 'healing' is essential to the competitive society and culture of Koreans. In addition, as time spent at home has increased due to COVID-19, the demand for indoor healing services has increased. Therefore, this paper analyzes the user's facial expression so that people can receive various 'customized' healing services indoors and, based on this analysis, provides lighting, ASMR, a video recommendation service, and a facial expression recording service. To analyze the user's expression, only the face is first extracted through object detection from the image taken by the user, and an ensemble algorithm is then applied to the expression predictions of various CNN models.
https://doi.org/10.3745/PKIPS.y2022m11a.818
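The abstract above does not say which ensemble algorithm is used. As an illustration only, the sketch below assumes soft voting: the class probabilities from several CNN models are averaged and the highest average wins. The function name `soft_vote`, the emotion labels, and the probability values are all hypothetical.

```python
# Soft-voting sketch (an assumed ensemble rule; the paper's actual algorithm is not stated).
EMOTIONS = ["happy", "sad", "neutral"]  # hypothetical label set


def soft_vote(prob_lists):
    """Average per-class probabilities across models and return (best_index, averages)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=avg.__getitem__)
    return best, avg


# Made-up outputs of three models for one detected face.
model_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
]
best, avg = soft_vote(model_probs)
predicted_emotion = EMOTIONS[best]
```

Averaging lets two moderately confident models outvote one confident outlier, which is the usual motivation for combining several classifiers.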

A Study on Dialect Expression in Korean-Based Speech Recognition (한국어 기반 음성 인식에서 사투리 표현에 관한 연구)

Lee, Sin-hyup
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.333-335 / 2022
Speech recognition technology has been applied in various video and streaming services alongside STT and TTS technologies. However, when recognizing actual conversations, the use of dialects and the overlap of stop words, exclamations, and similar words are high barriers to clear written output. In this study, to handle ambiguous dialects in speech recognition, we propose a speech recognition technology that applies category-based dialect keyword dictionary processing and dialect prosody as properties of the speech recognition network model.

Ankle Flexion Information in Healthcare (헬스케어에서 발목의 굴곡 정보)

Shin, Seong-Yoon;Lee, Min-Hye;Shin, Kwang-Seong;Lee, Hyun-Chang
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.678-679 / 2022
In this paper, among the many areas of healthcare systems, we focus on ankle flexion and acquire and process information on ankle instability. This information is used to prevent re-injury of the patient's ankle. We also present a system that automatically measures and manages the ankle's dorsiflexion and plantarflexion angles using video.
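The abstract does not describe how the angles are measured from video. One common approach, shown here purely as a sketch, is to take 2D keypoints for the knee, ankle, and toe from a frame and compute the angle at the ankle. The keypoint coordinates, the function names, and the choice of 90 degrees as the neutral position are assumptions for illustration.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))


def classify_flexion(knee, ankle, toe, neutral=90.0):
    """Assumption: shank-foot angle below neutral is dorsiflexion, above is plantarflexion."""
    angle = joint_angle(knee, ankle, toe)
    offset = angle - neutral
    if offset > 0:
        return "plantarflexion", offset
    if offset < 0:
        return "dorsiflexion", -offset
    return "neutral", 0.0


# Hypothetical image coordinates (y grows downward, as in most video frames).
neutral_angle = joint_angle((0, 0), (0, 10), (5, 10))
kind, degrees = classify_flexion((0, 0), (0, 10), (5, 12))
```

Tracking this value frame by frame would give the kind of angle history that a system for managing ankle flexion could store and monitor.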

Design of Calibration Circuit for LCOS Microdisplay (LCOS 마이크로디스플레이 구동용 보정회로 설계)

Lee, Youn-Sung;Wee, Jung-Wook;Han, Chung-Woo;Song, Nam-Chol
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.469-471 / 2022
This paper presents an implementation of a calibration circuit that corrects the gain error, DC offset, and sampling clock phase error generated when converting digital pixels to analog pixels to drive an analog-driven 4K UHD LCOS panel. The proposed calibration circuit consists of a gain and DC offset adjustment circuit and a sampling clock phase adjustment circuit. The calibration circuit is implemented with an FPGA device and video amplifiers.

Fast Grid-Based Refine Segmentation on V-PCC encoder (V-PCC 부호화기의 그리드 기반 세그먼트 정제 고속화)

Kim, Yura;Kim, Yong-Hwan
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.265-268 / 2022
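The refining segmentation step of the Video-based Point Cloud Compression (V-PCC) encoder is a core part of the encoder for efficiently converting 3D segments into 2D patch data, but it is a module that requires a large amount of computation. For this reason, a fast grid-based refine segmentation process is already implemented in TMC2, yet the computational cost of segment refinement remains very high. This paper examines the Fast Grid-based Refine Segmentation currently implemented in TMC2 and proposes a speed-up algorithm that adds two conditions matched to the characteristics of each voxel type. In experiments, compression performance (BD-BR) showed almost no difference from TMC2, while computation was reduced by 10% on average at the module level.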

Efficient Signaling of Extended GPM Modes in ECM (ECM 의 효율적인 GPM 확장 모드 시그널링 기법)

Moon, Gihwa;Lee, Jiwon;Park, Dohyeon;Kim, Jae-Gon
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.1236-1238 / 2022
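After completing standardization of VVC (Versatile Video Coding), the latest video coding standard, JVET has been exploring new standard technologies with higher compression performance than VVC and is developing the reference software ECM (Enhanced Compression Model) for this purpose. ECM4.0 currently adopts GPM-MMVD (GPM with merge MV differences), GPM-TM (GPM with template matching), and others, which extend the GPM (Geometric Partitioning Mode) of VVC by adding various candidate constructions and prediction improvement techniques. This paper analyzes the selection frequency of each extended GPM technique adopted in ECM and, based on this analysis, proposes a more efficient signaling method for extended GPM modes. It also presents a complexity reduction technique that simplifies the candidate search algorithm. In experiments, the proposed signaling method showed BD-rate coding gains of 0.02%, 0.16%, and 0.09% for Y, Cb, and Cr, respectively, compared with ECM4.0, and the simplified GPM index search showed BD-rate coding gains of 0.02% and 0.18% for Y and Cr, respectively, compared with ECM4.0.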

Enhanced video frame interpolation based on NAFNet (NAFNet 기반 개선된 비디오 프레임 보간 기법)

Yoon, Kihwan;Jeong, Jinwoo;Kim, Sungjei;Huh, Jingang
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.1333-1335 / 2022
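Deep learning has recently been applied across computer vision with strong results, and accordingly it is also being applied to video frame interpolation, which generates intermediate frames. Many deep-learning-based video frame interpolation methods consist broadly of a flow estimation network that estimates optical flow and a synthesis network; this paper addresses a network for improving the performance of the synthesis network. We examine the strengths and weaknesses of the UNet and GridNet structures commonly used for synthesis networks and how interpolation results differ by network, and we show the difference in interpolation results when NAFNet, proposed for image restoration, is adapted to video interpolation and applied to the synthesis network. Experimental results show that PSNR improves by 0.63 dB on the Vimeo90K dataset compared with the existing network.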
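For readers unfamiliar with the metric reported above, PSNR compares an interpolated frame with the ground-truth frame as 10 * log10(MAX^2 / MSE). The sketch below is a plain-Python illustration on tiny made-up grayscale frames; the function name `psnr` and the 8-bit peak value of 255 are assumptions, and the paper's evaluation code is not shown in the abstract.

```python
import math


def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D frames."""
    ref_flat = [p for row in reference for p in row]
    test_flat = [p for row in test for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, test_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_val ** 2 / mse)


# Made-up 2x2 grayscale frames (values 0..255).
ground_truth = [[100, 120], [130, 140]]
interpolated = [[101, 119], [131, 139]]   # every pixel off by 1, so MSE = 1
score = psnr(ground_truth, interpolated)
```

Because the scale is logarithmic, a gain such as 0.63 dB corresponds to a reduction in mean squared error of roughly 13.5 percent (10^(-0.063) is about 0.865).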

FPGA-based Object Recognition System (FPGA기반 객체인식 시스템)

Shin, Seong-Yoon;Cho, Gwang-Hyun;Cho, Seung-Pyo;Shin, Kwang-Seong
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.407-408 / 2022
In this paper, we examine the components of an FPGA-based object recognition system one by one. We describe the function of each component: the camera, DLM, service system, video output monitor, deep trainer software, and external deep learning software.

Search Result 2,926, Processing Time 0.027 seconds
