• Title/Summary/Keyword: generative adversarial network (생성적 적대적 신경망)

A Study on Audio Watermarking based on Deep Learning (딥러닝 기반의 오디오 워터마킹 기술 연구)

  • Hur, Jung-Hoe;Woo, Seongmi;Lee, Daewon
    • Proceedings of the Korea Information Processing Society Conference / 2022.05a / pp.153-156 / 2022
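  • As the use of digital media such as images and audio grows rapidly, watermarking techniques for protecting the copyright of digital content are becoming increasingly important. While a variety of results on deep learning-based image watermarking have been published recently, research on audio watermarking using deep learning remains scarce. In this paper, we propose an autoencoder model and a generative adversarial network model for developing deep learning-based audio watermarking.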

360 RGBD Image Synthesis from a Sparse Set of Images with Narrow Field-of-View (소수의 협소화각 RGBD 영상으로부터 360 RGBD 영상 합성)

  • Kim, Soojie;Park, In Kyu
    • Journal of Broadcast Engineering / v.27 no.4 / pp.487-498 / 2022
  • A depth map is an image that encodes distance information in 3D space on a 2D plane and is used in various 3D vision tasks. Most existing depth estimation studies use narrow FoV images, in which a significant portion of the scene is lost. In this paper, we propose a technique for generating 360° omnidirectional RGBD images from a sparse set of narrow FoV images. The proposed generative adversarial network-based image generation model estimates the relative FoV with respect to the entire panoramic image from a small number of non-overlapping images and produces a 360° RGB image and depth image simultaneously. In addition, it shows improved performance by using a network that reflects the spherical characteristics of the 360° image.

Optimization of Abdominal X-ray Images using Generative Adversarial Network to Realize Minimized Radiation Dose (방사선 조사선량의 최소화를 위한 생성적 적대 신경망을 활용한 복부 엑스선 영상 최적화 연구)

  • Sangwoo Kim;Jae-Dong Rhim
    • Journal of the Korean Society of Radiology / v.17 no.2 / pp.191-199 / 2023
  • This study aimed to minimize radiation dose while optimizing abdominal X-ray images using the Deep Blind Image Super-Resolution Generative Adversarial Network (BSRGAN) technique. Entrance surface doses (ESD) were measured under varying exposure conditions. Abdominal images were acquired under the same exposures and processed with BSRGAN. The images reconstructed by BSRGAN were compared with a reference image acquired at 80 kVp and 320 mA using mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM). In addition, signal profile analysis was used to validate the effect of the BSRGAN reconstruction. The lowest MSE (about 0.285) was obtained at 90 kVp, 125 mA and at 100 kVp, 100 mA, which reduced the ESD by about 52 to 53%, with PSNR = 37.694 and SSIM = 0.999. The signal intensity variations under the optimized conditions were smaller than those of the reference image. This means that the optimized exposure conditions can yield reasonable image quality with a substantial decrease in radiation dose, consistent with the As Low As Reasonably Achievable (ALARA) principle of radiation protection.
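
The MSE and PSNR comparison against a reference image mentioned in this abstract can be illustrated with a minimal sketch. The pixel values and the 8-bit peak value (`max_value=255.0`) are assumptions for illustration only; the abstract does not state the intensity scale the authors used, and SSIM is omitted here because it needs a windowed computation.

```python
import math

def mse(reference, test):
    """Mean squared error between two equally sized images (nested lists)."""
    ref_flat = [p for row in reference for p in row]
    test_flat = [p for row in test for p in row]
    if len(ref_flat) != len(test_flat):
        raise ValueError("images must have the same number of pixels")
    return sum((r - t) ** 2 for r, t in zip(ref_flat, test_flat)) / len(ref_flat)

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(reference, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)

# Hypothetical 2x2 8-bit images, not data from the study.
reference_image = [[100, 110], [120, 130]]
reconstructed_image = [[101, 109], [120, 132]]

print(mse(reference_image, reconstructed_image))
print(psnr(reference_image, reconstructed_image))
```

A lower MSE and a higher PSNR both indicate that the reconstructed image is closer to the reference.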

Anomaly detection performance improvement technique through weight matrix-based optical flow equalization (가중치 행렬 기반 광학 흐름 평활화를 통한 이상 행동 탐지 성능 향상 기법)

  • Lim, Hyun-seok;Kim, In-ki;Kang, Jaeyong;Gwak, Jeong-hwan
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.145-146 / 2021
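  • This study examines how the perspective produced by the camera's viewpoint affects optical flow generation. To improve optical flow-based anomalous behavior detection, it investigates a technique that computes a vanishing point-based weight matrix from existing optical flow images and equalizes the optical flow magnitude according to perspective. The degree of perspective, the size of objects, and the extent of their motion vary with the camera viewpoint, which can hinder determining the range of normal behavior patterns when training a generative adversarial network whose image translation network represents original video frames as optical flow magnitude and direction. To address this, we developed a technique that extracts the vanishing point from the dataset background and equalizes the optical flow magnitude determined by perspective. Compared with the existing model, frame-level accuracy improved by 5.75%.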
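
The idea of weighting optical flow by distance from a vanishing point can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the weight formula, so the linear weighting, the `gain` parameter, and the function names below are hypothetical. The sketch assumes that objects near the vanishing point appear smaller and produce smaller flow magnitudes, so their flow is amplified.

```python
import math

def vanishing_point_weights(height, width, vanishing_point, gain=1.0):
    """Build a weight matrix that is largest at the vanishing point.

    ASSUMPTION: weight = 1 + gain * (1 - d / d_max), where d is the pixel's
    distance to the vanishing point. The paper's actual formula is not given.
    """
    vy, vx = vanishing_point
    distances = [[math.hypot(y - vy, x - vx) for x in range(width)]
                 for y in range(height)]
    # Guard against a 1x1 image, where every distance is zero.
    d_max = max(max(row) for row in distances) or 1.0
    return [[1.0 + gain * (1.0 - d / d_max) for d in row] for row in distances]

def equalize_flow(flow_magnitude, weights):
    """Scale each optical flow magnitude by its per-pixel weight."""
    return [[m * w for m, w in zip(m_row, w_row)]
            for m_row, w_row in zip(flow_magnitude, weights)]

# Hypothetical 3x3 frame with uniform flow and a vanishing point at the top center.
weights = vanishing_point_weights(3, 3, vanishing_point=(0, 1))
flow = [[1.0] * 3 for _ in range(3)]
equalized = equalize_flow(flow, weights)
print(equalized)
```

With these assumptions, flow at the vanishing point is doubled and flow at the farthest pixels is left unchanged.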

A research on the possibility of restoring cultural assets of artificial intelligence through the application of artificial neural networks to roof tile(Wadang)

  • Kim, JunO;Lee, Byong-Kwon
    • Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.19-26 / 2021
  • Cultural assets excavated in historical areas have their own characteristics that reflect the period in which they were made, and their patterns and characteristics change gradually with history and with the regions to which they spread. Cultural assets excavated in some areas represent the culture of their time and some remain intact, but most are damaged, lost, or broken into parts, and many experts are mobilized to study their composition and repair the damaged parts. The purpose of this research is to learn patterns and characteristics of the past with artificial neural networks in support of such restoration research, and to restore the lost parts of excavated cultural assets using a Generative Adversarial Network (GAN)[1]. In this process, the damaged or lost parts are restored from the surviving parts of the excavated cultural asset using the GAN. The network is trained on 2D images of complete cultural assets so that it can recover the damaged parts. The research focuses on how well the network not only recovers damaged parts but also reproduces colors and materials. Finally, by applying the trained network to actual damaged cultural assets, we confirm the extent of the recovered area and the limitations of the approach.

Development of Autonomous Vehicle Learning Data Generation System (자율주행 차량의 학습 데이터 자동 생성 시스템 개발)

  • Yoon, Seungje;Jung, Jiwon;Hong, June;Lim, Kyungil;Kim, Jaehwan;Kim, Hyungjoo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.19 no.5 / pp.162-177 / 2020
  • The perception of the traffic environment based on various sensors in an autonomous driving system is directly related to driving safety. Recently, as perception models based on deep neural networks have come into use with advances in machine learning and deep neural network technology, both perception model training and high-quality training datasets are required. However, it is practically difficult to collect data on every situation that may occur in autonomous driving. The performance of a perception model may deteriorate because of differences between overseas and domestic traffic environments, and the quality of data collected in bad weather, when sensors cannot operate normally, cannot be guaranteed. Therefore, it is necessary to collect training data in a virtual road environment built in a simulator rather than on actual roads. In this paper, we propose a training dataset collection process that diversifies the weather, illumination, sensor position, and type and number of vehicles in a simulator environment modeled on domestic road conditions. To achieve better performance, the authors translated the image domain to be closer to real-world imagery and diversified the images. The performance was evaluated on test data collected in an actual road environment and was similar to that of a model trained only on real-world data.

A Case Study on an Educational Model of Medical AI Using Chest X-ray Synthetized by GAN (GAN 으로 합성된 흉부 X-ray 를 활용한 의료 인공지능 교육 모델에 관한 사례 연구)

  • Lee, Gyubin;Yoon, Yebin;Ham, Sojin;Bae, Hyun-Jin;You, Wonsang
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.887-890 / 2021
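  • As the market for AI-based medical diagnosis solutions has grown substantially, demand for university education in medical AI is increasing, but medical data are difficult to use in university education because of risks such as personal information leakage. This paper presents a case of a medical AI educational model that uses chest X-ray images synthesized by a generative adversarial network (GAN) instead of real medical data. Using synthetic chest X-ray images provided by Promedius Inc., we built an educational model in which a VGG-16 model is trained, its performance is validated and evaluated, and the performance is then improved through fine-tuning. We also quantitatively evaluated how much the educational model contributed to improving students' understanding of medical AI.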

Application of transfer learning to develop radar-based rainfall prediction model with GAN(Generative Adversarial Network) for multiple dam domains (다중 댐 유역에 대한 강우예측모델 개발을 위한 전이학습 기법의 적용)

  • Choi, Suyeon;Kim, Yeonjoo
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.61-61 / 2022
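  • Recent advances in machine learning have spurred active development of radar data-based rainfall prediction methods. Existing studies on machine learning-based rainfall prediction models are mostly conducted for a single region, and because machine learning methods are trained from data, the resulting model performs well only for the region on which it was trained. To overcome this limitation, we applied transfer learning, in which a pre-trained model is trained on new data, to develop rainfall prediction models for multiple basins. As the pre-trained model, this study used a future rainfall prediction model based on a Generative Adversarial Network (GAN). The model was trained on summer radar image data from 2014 to 2017 provided by the Korea Meteorological Administration to perform very-short-term and short-term rainfall prediction, and it performed well in short-term rainfall prediction simulations using 2018 radar image data. Using the trained model, we applied several transfer learning methods to develop rainfall prediction models for new dam basins (Andong Dam and Chungju Dam) and compared the results. The results show that transfer learning performs better than a model trained from scratch on the new data, confirming that transfer learning can be applied efficiently when developing models for multiple dam basins.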

A study on speech disentanglement framework based on adversarial learning for speaker recognition (화자 인식을 위한 적대학습 기반 음성 분리 프레임워크에 대한 연구)

  • Kwon, Yoohwan;Chung, Soo-Whan;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.447-453 / 2020
  • In this paper, we propose a system that extracts effective speaker representations from a speech signal using deep learning. Because a speech signal contains information unrelated to identity, such as text content, emotion, and background noise, we train the model so that the extracted features represent only speaker-related information and not speaker-unrelated information. Specifically, we propose an auto-encoder-based disentanglement method that outputs both speaker-related and speaker-unrelated embeddings using effective loss functions. To further improve reconstruction performance in the decoding process, we also introduce a discriminator of the kind widely used in Generative Adversarial Network (GAN) structures. Since improving the decoding capability helps preserve speaker information and disentanglement, it improves speaker verification performance. Experimental results demonstrate the effectiveness of the proposed method through an improved Equal Error Rate (EER) on the VoxCeleb1 benchmark dataset.
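
The Equal Error Rate used as the evaluation metric here is the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how it can be approximated from verification scores follows. The scores are hypothetical and the threshold sweep is a simple approximation, not the authors' evaluation code.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate EER by sweeping a threshold over all observed scores."""
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), None
    for threshold in thresholds:
        # False rejection: a genuine trial scored below the threshold.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        # False acceptance: an impostor trial scored at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2.0
    return eer

# Hypothetical similarity scores, not results from the paper.
genuine = [0.9, 0.8, 0.7, 0.4]    # same-speaker trials
impostor = [0.6, 0.3, 0.2, 0.1]   # different-speaker trials
print(equal_error_rate(genuine, impostor))
```

A lower EER means the speaker embeddings separate same-speaker and different-speaker trials more cleanly.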

Voice Conversion using Generative Adversarial Nets conditioned by Phonetic Posterior Grams (Phonetic Posterior Grams에 의해 조건화된 적대적 생성 신경망을 사용한 음성 변환 시스템)

  • Lim, Jin-su;Kang, Cheon-seong;Kim, Dong-Ha;Kim, Kyung-sup
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.369-372 / 2018
  • This paper proposes a non-parallel voice conversion network that converts voice between unpaired source and target voices. Conventional voice conversion studies used learning methods that minimize the distance error between spectrograms. These studies not only lose spectrogram resolution, because such methods average pixels, but also rely on parallel data, which are hard to collect. This research uses PPGs, which carry the phonetic information of the input voice, together with a GAN learning method to generate clearer voices. To evaluate the proposed method, we conducted an MOS test against a GMM-based model and found that performance improved over the conventional methods.
