• Title/Summary/Keyword: adversarial training

Land Use and Land Cover Mapping from Kompsat-5 X-band Co-polarized Data Using Conditional Generative Adversarial Network

  • Jang, Jae-Cheol; Park, Kyung-Ae
    • Korean Journal of Remote Sensing / v.38 no.1 / pp.111-126 / 2022
  • Land use and land cover (LULC) mapping is an important factor in geospatial analysis. Although highly precise ground-based LULC monitoring is possible, it is time-consuming and costly. In contrast, because the synthetic aperture radar (SAR) sensor is an all-weather, high-resolution sensor, it could replace field-based LULC monitoring systems at lower cost and in less time. Thus, LULC mapping is one of the major areas of SAR applications. We developed an LULC model using only KOMPSAT-5 single co-polarized data and digital elevation model (DEM) data. Twelve HH-polarized images and 18 VV-polarized images were collected, and two HH-polarized images and four VV-polarized images were selected for model testing. To train the LULC model, we applied the conditional generative adversarial network (cGAN) method. We built the cGAN around a U-Net combined with residual units (ResUNet). When analyzing the training history at 1732 epochs, the ResUNet model showed a maximum overall accuracy (OA) of 93.89 and a Kappa coefficient of 0.91. The model performed well on the test datasets, with an OA greater than 90. The model accurately distinguished water bodies and showed lower accuracy for wetlands than for the other LULC types. The effect of the DEM on LULC accuracy was also analyzed. When accuracy was assessed with respect to the incidence angle, the OA tended to decrease as the incidence angle increased, owing to the radar shadow caused by the side-looking geometry of the SAR sensor. This study is the first to use only KOMPSAT-5 single co-polarized data and deep learning methods to demonstrate the possibility of high-performance LULC monitoring. It contributes to Earth surface monitoring and to the development of deep learning approaches using KOMPSAT-5 data.
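
The overall accuracy (OA) and Kappa coefficient reported in this abstract are both derived from a confusion matrix. The following is a minimal Python sketch of that calculation, not the authors' code; the function name `overall_accuracy_and_kappa` and the three-class matrix are hypothetical and are not data from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (OA, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: share of samples on the diagonal (this is the OA).
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement: sum over classes of (row share) * (column share).
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical 3-class example (for instance water, urban, wetland).
example = [
    [50, 2, 3],
    [4, 40, 6],
    [5, 5, 35],
]
oa, kappa = overall_accuracy_and_kappa(example)
print(f"OA = {oa:.4f}, kappa = {kappa:.4f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than the OA whenever the classifier is imperfect.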

A New Image Processing Scheme For Face Swapping Using CycleGAN (순환 적대적 생성 신경망을 이용한 안면 교체를 위한 새로운 이미지 처리 기법)

  • Ban, Tae-Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.9 / pp.1305-1311 / 2022
  • With the recent rapid development of mobile terminals and personal computers and the advent of neural network technology, real-time face swapping using images has become possible. In particular, the cycle generative adversarial network has made it possible to replace faces using uncorrelated image data. In this paper, we propose an input data processing scheme that can improve the quality of face swapping with less training data and time. The proposed scheme improves image quality while preserving facial structure and expression information by combining facial landmarks extracted through a pre-trained neural network with the major information that affects the structure and expression of the face. Using the blind/referenceless image spatial quality evaluator (BRISQUE) score, one of the AI-based no-reference quality metrics, we quantitatively analyze the performance of the proposed scheme and compare it with conventional schemes. According to the numerical results, the proposed scheme improved BRISQUE scores by about 4.6% to 14.6% compared with the conventional schemes.

Synthesis of T2-weighted images from proton density images using a generative adversarial network in a temporomandibular joint magnetic resonance imaging protocol

  • Chena, Lee; Eun-Gyu, Ha; Yoon Joo, Choi; Kug Jin, Jeon; Sang-Sun, Han
    • Imaging Science in Dentistry / v.52 no.4 / pp.393-398 / 2022
  • Purpose: This study proposed a generative adversarial network (GAN) model for T2-weighted image (WI) synthesis from proton density (PD)-WI in a temporomandibular joint (TMJ) magnetic resonance imaging (MRI) protocol. Materials and Methods: From January to November 2019, MRI scans for TMJ were reviewed and 308 imaging sets were collected. For training, 277 pairs of PD- and T2-WI sagittal TMJ images were used. Transfer learning of the pix2pix GAN model was utilized to generate T2-WI from PD-WI. Model performance was evaluated with the structural similarity index map (SSIM) and peak signal-to-noise ratio (PSNR) indices for 31 predicted T2-WI (pT2). The disc position was clinically diagnosed as anterior disc displacement with or without reduction, and joint effusion as present or absent. The true T2-WI-based diagnosis was regarded as the gold standard, to which pT2-based diagnoses were compared using Cohen's κ coefficient. Results: The mean SSIM and PSNR values were 0.4781 (±0.0522) and 21.30 (±1.51) dB, respectively. The pT2 protocol showed almost perfect agreement (κ=0.81) with the gold standard for disc position. The number of discordant cases was higher for normal disc position (17%) than for anterior displacement with reduction (2%) or without reduction (10%). The effusion diagnosis also showed almost perfect agreement (κ=0.88), with higher concordance for the presence (85%) than for the absence (77%) of effusion. Conclusion: The application of pT2 images in a TMJ MRI protocol was useful for diagnosis, although the image quality of pT2 was not fully satisfactory. Further research is expected to enhance pT2 quality.
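
For readers unfamiliar with the PSNR figure quoted above, the following is a minimal Python sketch of how a peak signal-to-noise ratio in dB is commonly computed from the mean squared error between two images. It is an illustration only, not the evaluation code used in the study; the function name `psnr`, the 8-bit peak value of 255, and the tiny 2x2 images are assumptions.

```python
import math


def psnr(reference, predicted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images.

    Images are given as lists of rows of pixel intensities.
    max_value is the assumed maximum possible intensity (255 for 8-bit data).
    """
    flat_ref = [p for row in reference for p in row]
    flat_pred = [p for row in predicted for p in row]
    if not flat_ref or len(flat_ref) != len(flat_pred):
        raise ValueError("images must be non-empty and have the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_pred)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 images, not data from the paper.
true_image = [[0, 255], [128, 64]]
synthetic_image = [[0, 250], [130, 60]]
value = psnr(true_image, synthetic_image)
print(f"PSNR = {value:.2f} dB")
```

A higher PSNR means a smaller pixel-wise error; unlike SSIM, it does not account for structural similarity.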

GENERATION OF FUTURE MAGNETOGRAMS FROM PREVIOUS SDO/HMI DATA USING DEEP LEARNING

  • Jeon, Seonggyeong; Moon, Yong-Jae; Park, Eunsu; Shin, Kyungin; Kim, Taeyoung
    • The Bulletin of The Korean Astronomical Society / v.44 no.1 / pp.82.3-82.3 / 2019
  • In this study, we generate future full-disk magnetograms 12, 24, 36, and 48 hours in advance from SDO/HMI images using deep learning. To do so, we apply the convolutional generative adversarial network (cGAN) algorithm to a series of SDO/HMI magnetograms. We use SDO/HMI data from 2011 to 2016 to train four models. The models generate images for the 2017 HMI data, which are compared with the actual HMI magnetograms for evaluation. The AI-generated images from each model are very similar to the actual images. The average correlation coefficient between the generated and actual images, over about 600 data sets, is about 0.85 for the four models. We are examining hundreds of active regions for a more detailed comparison. In the future, we will use pix2pix HD and video2video translation networks for image prediction.

Generation of global coronal field extrapolation from frontside and AI-generated farside magnetograms

  • Jeong, Hyunjin; Moon, Yong-Jae; Park, Eunsu; Lee, Harim; Kim, Taeyoung
    • The Bulletin of The Korean Astronomical Society / v.44 no.1 / pp.52.2-52.2 / 2019
  • Global maps of the solar surface magnetic field, such as synoptic maps or daily synchronic frames, do not provide real-time information about the far side of the Sun. A deep-learning technique based on a Conditional Generative Adversarial Network (cGAN) is used to generate farside magnetograms from STEREO EUVI 304 Å images, after training on SDO data pairs of HMI magnetograms and AIA 304 Å images. The farside (or backside) data of the daily synchronic frames are replaced by the AI-generated magnetograms. This new type of data is used to compute the Potential Field Source Surface (PFSS) model. We compare the resulting global field with observations as well as with the results of the conventional method. We will discuss the advantages and disadvantages of the new method and future work.

Design of Image Generation System for DCGAN-Based Kids' Book Text

  • Cho, Jaehyeon; Moon, Nammee
    • Journal of Information Processing Systems / v.16 no.6 / pp.1437-1446 / 2020
  • For the last few years, smart devices have begun to occupy an essential place in children's lives by giving them access to a variety of language activities and books. Various studies are being conducted on using smart devices for education. Our study extracts images and text from kids' books with smart devices and matches the extracted images and text to create new images that are not present in these books. The proposed system will enable the use of smart devices as educational media for children. A deep convolutional generative adversarial network (DCGAN) is used to generate the new images. Training the DCGAN involves three steps. First, 1,164 ImageNet images with 11 titles are learned. Second, Tesseract, an optical character recognition engine, is used to extract images and text from kids' books, and the text is classified using a morpheme analyzer. Third, the classified word class is matched with the latent vector of the image. The trained DCGAN then creates an image associated with the text.

A multi-label Classification of Attributes on Face Images

  • Le, Giang H.; Lee, Yeejin
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.105-108 / 2021
  • Generative adversarial networks (GANs) have achieved strong results in image synthesis, especially in face generation. Unlike in other deep learning tasks, the input to a GAN is usually a random vector sampled from a probability distribution, which leads to unstable training and unpredictable output. One way to address these problems is to apply a label condition in both the generator and the discriminator. CelebA and FFHQ are the two best-known datasets for face image generation. While CelebA contains attribute annotations for more than 200,000 images, FFHQ has no attribute annotations. Thus, in this work, we introduce a method that learns the attributes from CelebA and then predicts both soft and hard labels for FFHQ. Our model achieves an area under the receiver operating characteristic curve (AUC) of 0.7611.

Adversarial Training for Grammatical Error Correction (문법 오류 교정을 위한 적대적 학습 방법)

  • Kwon, Soonchoul; Lee, Gary Geunbae
    • Annual Conference on Human and Language Technology / 2020.10a / pp.446-449 / 2020
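  • Recent successful work on grammatical error correction relies on complex neural network models. However, the public data available for training such models is scarce relative to what is needed, which causes overfitting. This paper explores the use of adversarial training to address the overfitting problem in grammatical error correction. Two methods were tested: the fast gradient sign method (FGSM), which uses the gradient that increases the model's cost, and the learned perturbation method (LPM), which uses a neural network to learn perturbations that increase the model's cost. In the experiments, LPM was not effective for model training, whereas FGSM achieved a higher F0.5 score than the model trained without adversarial training.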
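
The fast gradient sign method (FGSM) named in this abstract perturbs an input in the direction of the sign of the loss gradient. The sketch below illustrates that step on a toy logistic regression model in plain Python. It is not the authors' grammatical error correction setup; the model, the weights, the input, and `epsilon` are all hypothetical values chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_loss(weights, bias, x, y):
    """Binary cross-entropy for one example with label y in {0, 1}."""
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def fgsm_perturb(weights, bias, x, y, epsilon):
    """Return x + epsilon * sign(d loss / d x) for logistic regression."""
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    # For logistic regression, d loss / d x_i = (p - y) * w_i.
    grad = [(p - y) * w for w in weights]
    # Sign of each gradient component: -1, 0, or +1.
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + epsilon * s for xi, s in zip(x, sign)]


# Hypothetical toy model and input.
weights, bias = [2.0, -1.0, 0.5], 0.1
x, y = [0.5, 0.2, -0.3], 1

x_adv = fgsm_perturb(weights, bias, x, y, epsilon=0.1)
clean_loss = logistic_loss(weights, bias, x, y)
adv_loss = logistic_loss(weights, bias, x_adv, y)
print(f"clean loss = {clean_loss:.4f}, adversarial loss = {adv_loss:.4f}")
```

In adversarial training, perturbed inputs such as `x_adv` are typically added to the training objective alongside the clean inputs, which acts as a regularizer against overfitting.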

Adversarial Training Method for Handling Class Imbalance Problems in Dialog Datasets (대화 데이터셋의 클래스 불균형 문제 보정을 위한 적대적 학습 기법)

  • Cho, Su-Phil; Choi, Yong Suk
    • Annual Conference on Human and Language Technology / 2019.10a / pp.434-439 / 2019
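  • In deep learning-based classification models, class imbalance in the data greatly degrades classification performance on minority classes. In this paper, we propose an adversarial training method to compensate for this class imbalance problem. To verify whether adversarial training improves performance, we defined four deep learning-based classification models and compared their classification performance. The experiments confirmed that applying adversarial training when training a model on a dialog dataset maintains classification performance on majority classes while greatly improving it on minority classes.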

Generative optical flow based abnormal object detection method using a spatio-temporal translation network

  • Lim, Hyunseok; Gwak, Jeonghwan
    • Journal of the Korea Society of Computer and Information / v.26 no.4 / pp.11-19 / 2021
  • An abnormal object refers to a person, an object, or a mechanical device that exhibits abnormal or unusual behavior and requires observation or supervision. To detect such objects with an artificial intelligence algorithm without continuous human intervention, methods that observe the specificity of temporal features using optical flow are widely used. In this study, abnormal situations are identified by training a Generative Adversarial Network (GAN) to translate an input image frame into an optical flow image. In particular, to improve the model's abnormal behavior identification, we propose a technique that improves the pre-processing step to exclude unnecessary outliers and the post-processing step, applied after training, to increase identification accuracy on the test dataset. The UCSD Pedestrian and UMN Unusual Crowd Activity datasets were used for training to detect abnormal behavior. On the UCSD Ped2 dataset, the proposed method achieved a frame-level AUC of 0.9450 and an EER of 0.1317, an improvement over the models in previous studies.
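
The frame-level AUC and equal error rate (EER) reported above are both read off a ROC curve built from per-frame anomaly scores. The following is a minimal Python sketch of that computation under simple assumptions; it is not the authors' evaluation code. The function names, the scores, and the labels (1 for an abnormal frame, 0 for a normal one) are hypothetical, and the EER is approximated at the ROC point where the false positive and false negative rates are closest.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, sweeping the threshold from high to low."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def equal_error_rate(points):
    """Approximate EER at the point where FPR and FNR (1 - TPR) are closest."""
    fpr, tpr = min(points, key=lambda p: abs(p[0] - (1.0 - p[1])))
    return (fpr + (1.0 - tpr)) / 2.0


# Hypothetical per-frame anomaly scores and ground-truth labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc_value = auc(points)
eer_value = equal_error_rate(points)
print(f"AUC = {auc_value:.4f}, EER = {eer_value:.4f}")
```

A higher AUC and a lower EER both indicate better separation between normal and abnormal frames.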