Image-to-image Translation

Facial Feature Based Image-to-Image Translation Method

  • Kang, Shinjin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.12 / pp.4835-4848 / 2020
  • The recent expansion of the digital content market is increasing the technical demand for various facial image transformations within virtual environments. Recent image translation technology enables changes between various domains; however, current image-to-image translation techniques do not provide stable performance under unsupervised learning, especially for shape learning in the face transition field. This is because the face is a highly sensitive feature: the quality of the resulting image suffers significantly if the transitions of the eyes, nose, and mouth are not performed effectively. We herein propose a new unsupervised method that can transform an in-the-wild face image into another face style through radical transformation. Specifically, the proposed method applies two face-specific feature loss functions to a generative adversarial network. The proposed technique shows that stable conversion to other domains is possible while maintaining the image characteristics of the eyes, nose, and mouth.
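
A minimal sketch of what a face-specific feature loss could look like follows: the translated output is compared against the input only inside the eye, nose, and mouth regions, and the penalty is added to the usual adversarial objective. The crop-box interface, weight, and function names are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch: a feature-region loss added to a GAN generator
# objective to preserve eyes, nose, and mouth during translation.
import torch
import torch.nn.functional as F

def facial_feature_loss(source, translated, boxes):
    """L1 penalty restricted to facial feature regions.

    source, translated: (N, C, H, W) batches; boxes: list of
    (y0, y1, x0, x1) crops covering the eyes, nose, and mouth."""
    loss = 0.0
    for y0, y1, x0, x1 in boxes:
        loss = loss + F.l1_loss(translated[:, :, y0:y1, x0:x1],
                                source[:, :, y0:y1, x0:x1])
    return loss / len(boxes)

def generator_loss(d_fake_logits, source, translated, boxes, lam=10.0):
    # Adversarial term plus the feature-region penalty.
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))
    return adv + lam * facial_feature_loss(source, translated, boxes)
```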

Multi Cycle Consistent Adversarial Networks for Multi Attribute Image to Image Translation

  • Jo, Seok Hee;Cho, Kyu Cheol
    • Journal of the Korea Society of Computer and Information / v.25 no.9 / pp.63-69 / 2020
  • Image-to-image translation is a technique that creates a target image from input images; it has recently shown high performance in creating more realistic images by utilizing GANs, an unsupervised learning structure, and various studies on GAN-based image-to-image translation have followed. Most image-to-image translation methods, however, target the translation of a single attribute, whereas the data obtainable in real life exhibit a variety of attributes that are hard to describe with only one. A method that changes multiple attributes, dividing the image generation process by attribute, can therefore exploit this variety and serve image-to-image translation better. In this paper, we propose Multi CycleGAN, a dual-attribute translation structure, building on CycleGAN, which has shown high performance among GAN-based image-to-image translation structures. The structure implements a dual transformation in which three domains conduct two-way learning so that the two attributes of an input domain are learned. Experiments show that images produced by the new structure maintain the properties of the input domain while exhibiting high performance with the target attributes applied. With this structure it becomes possible to create more diverse images, so we can expect image generation to be utilized in more diverse areas.
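
The dual-attribute idea can be summarized as two CycleGAN-style cycles sharing the same input domain, one per target attribute. The sketch below shows only the cycle-consistency part of such an objective; the generator names and weighting are hypothetical.

```python
# Illustrative sketch: cycle-consistency terms for one input domain A and
# two attribute domains B and C, each with its own forward/backward pair.
import torch.nn.functional as F

def cycle_loss(G_ab, G_ba, a):
    # Reconstruction penalty for the cycle a -> G_ab(a) -> G_ba(G_ab(a)).
    return F.l1_loss(G_ba(G_ab(a)), a)

def multi_cycle_loss(G_ab, G_ba, G_ac, G_ca, a, lam=10.0):
    # Two cycles share the input domain A, one per target attribute.
    return lam * (cycle_loss(G_ab, G_ba, a) + cycle_loss(G_ac, G_ca, a))
```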

Performance Improvement of Image-to-Image Translation with RAPGAN and RRDB

  • Dongsik Yoon;Noyoon Kwak
    • Journal of Internet of Things and Convergence / v.9 no.1 / pp.131-138 / 2023
  • This paper concerns improving the performance of image-to-image translation using a Relativistic Average Patch GAN and Residual in Residual Dense Blocks. The purpose is to compensate for the shortcomings of pix2pix, a type of image-to-image translation, through technical improvements in three respects. First, unlike the original pix2pix generator, the part that encodes the input image uses Residual in Residual Dense Blocks, enabling deeper learning. Second, a loss function based on the Relativistic Average Patch GAN predicts how much more realistic the original image is than the generated image, so both images influence adversarial learning. Finally, the generator is pre-trained to prevent the discriminator from overpowering it prematurely. With the proposed method, it was possible to generate images that outperform those of the original pix2pix, improving FID by more than 13% on average.
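
The relativistic average loss can be sketched as follows: each patch logit for a real image is judged relative to the average fake patch logit, and vice versa, so both real and generated images drive the adversarial gradient. This is the standard relativistic average GAN formulation applied to PatchGAN outputs, written as a minimal illustration rather than the paper's exact code.

```python
# Relativistic Average (Patch) GAN losses over PatchGAN logit maps.
import torch
import torch.nn.functional as F

def ragan_d_loss(real_logits, fake_logits):
    # real_logits, fake_logits: (N, 1, H', W') patch outputs.
    real_rel = real_logits - fake_logits.mean()
    fake_rel = fake_logits - real_logits.mean()
    return 0.5 * (
        F.binary_cross_entropy_with_logits(real_rel, torch.ones_like(real_rel))
        + F.binary_cross_entropy_with_logits(fake_rel, torch.zeros_like(fake_rel)))

def ragan_g_loss(real_logits, fake_logits):
    # The generator swaps the target labels of the two relativistic terms.
    real_rel = real_logits - fake_logits.mean()
    fake_rel = fake_logits - real_logits.mean()
    return 0.5 * (
        F.binary_cross_entropy_with_logits(real_rel, torch.zeros_like(real_rel))
        + F.binary_cross_entropy_with_logits(fake_rel, torch.ones_like(fake_rel)))
```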

A Study on the Translation Invariant Matching Algorithm for Fingerprint Recognition

  • Kim, Eun-Hee;Cho, Seong-Won;Kim, Jae-Min
    • The Transactions of the Korean Institute of Electrical Engineers D / v.51 no.2 / pp.61-68 / 2002
  • This paper presents a new matching algorithm for fingerprint recognition that is robust to image translation. The basic idea is to estimate the translation vector of an input fingerprint image using N minutiae at which the gradient of the ridge direction field is large. Using the estimated translation vector, we select corresponding minutiae independently of the translation. We experimentally show that the presented algorithm performs well even in the presence of large translations and pseudo-minutiae.
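
The core alignment step can be illustrated in a few lines of NumPy: estimate the global shift from a handful of corresponding minutiae, then compare positions after compensating for it. The median estimator, tolerance, and given correspondences below are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: translation-invariant minutiae matching via an estimated shift.
import numpy as np

def estimate_translation(ref_pts, qry_pts):
    """ref_pts, qry_pts: (N, 2) corresponding minutiae positions.
    The median displacement is robust to a few pseudo-minutiae."""
    return np.median(qry_pts - ref_pts, axis=0)

def match_score(ref_pts, qry_pts, tol=8.0):
    t = estimate_translation(ref_pts, qry_pts)
    aligned = qry_pts - t                       # undo the estimated shift
    d = np.linalg.norm(ref_pts - aligned, axis=1)
    return float(np.mean(d < tol))              # fraction of matched minutiae
```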

Pseudo-RGB-based Place Recognition through Thermal-to-RGB Image Translation

  • Seunghyeon Lee;Taejoo Kim;Yukyung Choi
    • The Journal of Korea Robotics Society / v.18 no.1 / pp.48-52 / 2023
  • Many studies have sought to make Visual Place Recognition reliable in various environments, including edge cases. However, existing approaches use visible imaging sensors (RGB cameras), which are, as is widely known, strongly affected by illumination changes. In this paper we therefore use an invisible imaging sensor, a long-wavelength infrared (LWIR) camera, instead of RGB, which is shown to be more reliable in low-light and highly noisy conditions. In addition, although the sensor is an LWIR camera, the thermal image is converted into an RGB image, so the proposed method remains highly compatible with existing algorithms and databases. We demonstrate that the proposed method outperforms the baseline by about 0.19 in recall.
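
At a high level the pipeline has two stages: translate the LWIR frame into a pseudo-RGB image, then hand it to an unmodified RGB place-recognition backend. Both callables in this sketch are placeholders standing in for the paper's trained translator and an existing VPR system.

```python
# Hypothetical two-stage pipeline: thermal-to-RGB translation, then
# retrieval against an RGB database with an existing VPR method.
def recognize_place(lwir_frame, database, thermal_to_rgb, vpr_retrieve):
    pseudo_rgb = thermal_to_rgb(lwir_frame)    # image-to-image translation
    return vpr_retrieve(pseudo_rgb, database)  # reuse RGB-trained retrieval
```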

GAN-based Image-to-image Translation using Multi-scale Images

  • Chung, Soyoung;Chung, Min Gyo
    • The Journal of the Convergence on Culture Technology / v.6 no.4 / pp.767-776 / 2020
  • GcGAN is a deep learning model that translates styles between images under a geometric consistency constraint. However, GcGAN has the disadvantage that it does not properly maintain the detailed content of an image, since it preserves content only through limited geometric transformations such as rotation or flip. In this study we therefore propose MSGcGAN (Multi-Scale GcGAN), a new image-to-image translation method that remedies this disadvantage. MSGcGAN, an extended model of GcGAN, performs style translation in a direction that reduces semantic distortion and maintains detailed content by learning multi-scale images simultaneously and extracting scale-invariant features. Experimental results show that MSGcGAN outperforms GcGAN both quantitatively and qualitatively, translating style more naturally while maintaining the overall content of the image.
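
One way to read "learning multi-scale images simultaneously and extracting scale-invariant features" is sketched below: build an image pyramid and penalize differences between pooled features across scales. The pyramid scales and pooling choice are assumptions for illustration, not MSGcGAN's exact design.

```python
# Sketch: a scale-invariance penalty computed over an image pyramid.
import torch.nn.functional as F

def pyramid(x, scales=(1.0, 0.5, 0.25)):
    return [x if s == 1.0 else
            F.interpolate(x, scale_factor=s, mode='bilinear',
                          align_corners=False)
            for s in scales]

def scale_invariance_loss(encoder, x):
    # Compare globally pooled features of each scale with the full scale.
    feats = [encoder(xs).mean(dim=(2, 3)) for xs in pyramid(x)]
    return sum(F.l1_loss(f, feats[0]) for f in feats[1:])
```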

SkelGAN: A Font Image Skeletonization Method

  • Ko, Debbie Honghee;Hassan, Ammar Ul;Majeed, Saima;Choi, Jaeyoung
    • Journal of Information Processing Systems / v.17 no.1 / pp.1-13 / 2021
  • In this research, we study the problem of font image skeletonization using an end-to-end deep adversarial network, SkelGAN, in contrast with state-of-the-art methods that use mathematical algorithms. Several studies have been concerned with skeletonization, but few have utilized deep learning, and none has considered generative models based on deep neural networks for font character skeletonization, where characters are more delicate than natural objects. In this work we take a step closer to producing realistic synthesized skeletons of font characters. The proposed skeleton generator proves superior to the well-known mathematical skeletonization methods in terms of character structure, including delicate strokes, serifs, and even special styles. Experimental results also demonstrate the dominance of our method over a state-of-the-art supervised image-to-image translation method in the font character skeletonization task.
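
Since each glyph is paired with its ground-truth skeleton, training can follow the supervised pix2pix recipe this line of work builds on: an adversarial term from a conditional discriminator plus a per-pixel L1 term. The sketch below shows a single generator step; the network interfaces and weighting are assumed, not SkelGAN's published code.

```python
# Sketch: one pix2pix-style generator step for glyph -> skeleton.
import torch
import torch.nn.functional as F

def skel_generator_step(G, D, glyph, skeleton_gt, lam=100.0):
    skeleton_pred = G(glyph)
    d_logits = D(glyph, skeleton_pred)         # conditional discriminator
    adv = F.binary_cross_entropy_with_logits(
        d_logits, torch.ones_like(d_logits))
    return adv + lam * F.l1_loss(skeleton_pred, skeleton_gt)
```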

U-net and Residual-based Cycle-GAN for Improving Object Transfiguration Performance

  • Kim, Sewoon;Park, Kwang-Hyun
    • The Journal of Korea Robotics Society / v.13 no.1 / pp.1-7 / 2018
  • Image-to-image translation is one of the deep learning applications that use image data. In this paper we aim at improving the performance of object transfiguration, which transforms a specific object in an image into another specific object. Object transfiguration should transform only the target object and leave the background unchanged; in existing results, however, other parts of the image are also transformed. We focus on the artificial neural network structures frequently used in existing methods and improve performance by adding constraints to the existing structure. We also propose an advanced structure that combines the existing structures to retain their advantages and complement their drawbacks. The effectiveness of the proposed methods is shown in experimental results.
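
A generator combining the two ingredients named in the title might look like the sketch below: U-Net-style skip connections carry the background through largely unchanged, while residual blocks refine the object being transfigured. The layer sizes and depth are illustrative, not the paper's exact architecture.

```python
# Sketch: a U-Net/residual hybrid generator for object transfiguration.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)               # residual refinement

class UNetResGenerator(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.down = nn.Conv2d(3, ch, 4, stride=2, padding=1)
        self.res = nn.Sequential(*[ResBlock(ch) for _ in range(4)])
        self.up = nn.ConvTranspose2d(ch * 2, 3, 4, stride=2, padding=1)

    def forward(self, x):
        d = torch.relu(self.down(x))
        r = self.res(d)
        skip = torch.cat([d, r], dim=1)       # U-Net-style skip connection
        return torch.tanh(self.up(skip))
```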

Sign Language Image Recognition System Using Artificial Neural Network

  • Kim, Hyung-Hoon;Cho, Jeong-Ran
    • Journal of the Korea Society of Computer and Information / v.24 no.2 / pp.193-200 / 2019
  • Hearing-impaired people live in a voice-oriented culture, but because communicating with hearing people through sign language is difficult, many experience discomfort and various disadvantages in daily and social life. In this paper we therefore study a sign language translation system for communication between hearing people and hearing-impaired sign language users, and implement a prototype system. Previous sign language translation systems for this purpose fall into two types: those using video image systems and those using shape-input devices. Existing systems, however, fail to recognize the varied sign language expressions of individual users and require special devices. In this paper we use an artificial neural network, a machine learning method, to recognize varied sign language expressions, and by using ordinary smartphones and common video equipment for sign language image recognition we intend to improve the usability of the sign language translation system.
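
The recognition stage described above can be reduced to a small feed-forward classifier over features extracted from video frames (for example, hand keypoints). The input dimensionality and number of sign classes below are illustrative assumptions, not the paper's configuration.

```python
# Sketch: a minimal neural-network classifier for sign recognition.
import torch.nn as nn

class SignClassifier(nn.Module):
    def __init__(self, n_features=42, n_signs=50):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(),
            nn.Linear(128, n_signs))

    def forward(self, x):                     # x: (N, n_features)
        return self.net(x)                    # logits over sign classes
```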

Facial Image Synthesis by Controlling Skin Microelements

  • Kim, Yujin;Park, In Kyu
    • Journal of Broadcast Engineering / v.27 no.3 / pp.369-377 / 2022
  • Recent deep learning-based face synthesis research can generate realistic faces, including overall style and elements such as hair, glasses, and makeup. However, previous methods cannot create a face at a very detailed level, such as the microstructure of the skin. In this paper, to overcome this limitation, we propose a technique for synthesizing a more realistic facial image from a single face label image by controlling the types and intensity of skin microelements. The proposed technique uses Pix2PixHD, an image-to-image translation method, to convert a label image describing the facial region and skin elements such as wrinkles, pores, and redness into a facial image with the microelements added. Experimental results show that various realistic face images reflecting fine skin elements can be created by generating label images with correspondingly adjusted skin element regions.
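
The control mechanism can be pictured as editing the semantic label map before synthesis: paint wrinkle, pore, or redness classes into the desired regions, then run a Pix2PixHD-style label-to-image generator. The label ids and function names here are hypothetical placeholders.

```python
# Sketch: stamping skin-microelement classes into a semantic label map.
import numpy as np

WRINKLE, PORE, REDNESS = 10, 11, 12            # hypothetical label ids

def add_microelement(label_map, mask, element_id):
    """Return a copy of the label map with an element painted in."""
    edited = label_map.copy()
    edited[mask] = element_id
    return edited

# face = pix2pixhd_generator(add_microelement(labels, wrinkle_mask, WRINKLE))
```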