• Title/Summary/Keyword: Image-to-image Translation

Application of Artificial Neural Network For Sign Language Translation

  • Cho, Jeong-Ran;Kim, Hyung-Hoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.2
    • /
    • pp.185-192
    • /
    • 2019
  • A hearing-impaired person who uses sign language faces many difficulties communicating with hearing people who do not understand it. A sign language translation system enables communication between the two in this situation. Previous studies on such systems fall into two types: those using video images and those using shape-input devices. However, existing systems do not resolve these difficulties because they fail to recognize the varied sign expressions of individual signers and require special devices. In this paper, we therefore devise a sign language translation system based on an artificial neural network to overcome the problems of existing systems.
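
The abstract does not publish the network itself, so the following is a minimal, hypothetical sketch of the general idea only: a small feed-forward neural network classifying hand-feature vectors into sign labels. The feature layout (21 hand keypoints), class count, and layer sizes are all illustrative assumptions, not the authors' design.

```python
# Hypothetical sketch: classify sign gestures with a small feed-forward net.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 42))        # stand-in features: 21 keypoints x (x, y)
y = rng.integers(0, 10, size=200)     # stand-in labels: 10 sign classes

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X, y)                         # learn the gesture -> sign mapping
print(clf.predict(X[:5]))             # translate five sample gestures
```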

Interpretability on Deep Retinal Image Understanding Network

  • Manal AlGhamdi
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.10
    • /
    • pp.206-212
    • /
    • 2024
  • In the last 10 years, artificial intelligence (AI) has shown higher predictive accuracy than humans in many fields. Its strong performance, and the promising future built on it, has increased concern about its black-box mechanism. In fields such as medicine, mistakes that lack explanations are hardly acceptable, so research on interpretable AI is of great significance. Although interpretability methods are common for classification tasks, little work has focused on segmentation. In this paper, we explore the interpretability of the Deep Retinal Image Understanding (DRIU) network, which segments vessels from retinal images. We combine Gradient-weighted Class Activation Mapping (Grad-CAM), commonly used in image classification, with the segmentation network to generate saliency maps. From these maps, we obtain the contribution of each layer of the network to the vessel prediction. We then manually adjust the weights of the last convolutional layer to verify the accuracy of the Grad-CAM saliency maps. According to the results, the layer 'upsample2' is the most important during segmentation, and we improve the mIoU score (an evaluation metric) to some extent.
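
A rough sketch of the Grad-CAM procedure the paper applies to a segmentation task, using a toy stand-in network rather than DRIU; the hooked layer is merely analogous to the 'upsample2' layer the paper identifies.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, 1),                            # per-pixel vessel logits
)

acts, grads = {}, {}
target = model[2]                                  # toy analogue of 'upsample2'

def save_act(module, inp, out):                    # keep the feature maps
    acts['a'] = out.detach()

def save_grad(module, grad_in, grad_out):          # keep d(score)/d(maps)
    grads['g'] = grad_out[0].detach()

target.register_forward_hook(save_act)
target.register_full_backward_hook(save_grad)

x = torch.randn(1, 3, 64, 64)
model(x).sum().backward()                          # score: sum of seg. logits

w = grads['g'].mean(dim=(2, 3), keepdim=True)      # channel weights (GAP of grads)
cam = torch.relu((w * acts['a']).sum(dim=1))       # Grad-CAM saliency map
print(cam.shape)                                   # torch.Size([1, 64, 64])
```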

A Study on Improving the Accuracy of Medical Images Classification Using Data Augmentation

  • Cheon-Ho Park;Min-Guan Kim;Seung-Zoon Lee;Jeongil Choi
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.12
    • /
    • pp.167-174
    • /
    • 2023
  • This paper attempts to improve the accuracy of a colorectal cancer diagnosis model by applying image data augmentation in a convolutional neural network. Augmentation was performed with basic image manipulations: flipping, rotation, translation, shearing, and zooming. Of the 5,000 images held, 4,000 were split off for training and 1,000 for testing, and further models were trained with 4,000 and 8,000 augmented images added to the 4,000 training images. The evaluation showed classification accuracies of 85.1%, 87.0%, and 90.2% for 4,000, 8,000, and 12,000 training images respectively, confirming the improvement gained by increasing the amount of image data.
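
The five manipulations named above map directly onto Keras' ImageDataGenerator; a sketch follows, with parameter values that are illustrative assumptions rather than the paper's settings.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    horizontal_flip=True, vertical_flip=True,   # flipping
    rotation_range=20,                          # rotation (degrees)
    width_shift_range=0.1,                      # translation, fraction of width
    height_shift_range=0.1,                     # translation, fraction of height
    shear_range=10,                             # shear angle (degrees)
    zoom_range=0.1,                             # zooming
)

images = np.random.rand(8, 64, 64, 3)           # stand-in medical images
batch = next(augmenter.flow(images, batch_size=8))
print(batch.shape)                              # (8, 64, 64, 3)
```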

Watermarking Algorithm that is Adaptive on Geometric Distortion in consequence of Restoration Pattern Matching (복구패턴 정합을 통한 기하학적 왜곡에 적응적인 워터마킹)

  • Jun Young-Min;Ko Il-Ju;Kim Dongho
    • The KIPS Transactions:PartB
    • /
    • v.12B no.3 s.99
    • /
    • pp.283-290
    • /
    • 2005
  • The misalignment of watermark positions caused by translation, rotation, and scaling distortion is an open problem in watermarking. In this paper, we propose a watermarking method that is robust to geometric distortion through restoration pattern matching. The proposed method defines a restoration pattern and inserts it into the watermarked image before distribution. Geometric distortion is detected by comparing the restoration pattern extracted from the distributed image with the original restoration pattern inserted into it. If distortion is found, the inverse transformation is applied to synchronize the watermark insertion and extraction positions. To evaluate the performance of the proposed method, we conduct experiments with translation, rotation, and scaling attacks.
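
A toy sketch of the synchronization step for the translation-only case: locate the known restoration pattern in the received image and apply the inverse shift before watermark extraction. The paper also recovers rotation and scaling; the synthetic pattern and offsets here are invented for illustration.

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)
pattern = (rng.random((32, 32)) * 255).astype(np.uint8)   # restoration pattern

received = np.zeros((128, 128), dtype=np.uint8)
received[20:52, 30:62] = pattern                  # pattern shifted to (30, 20)

res = cv2.matchTemplate(received, pattern, cv2.TM_SQDIFF)
_, _, (x, y), _ = cv2.minMaxLoc(res)              # best (lowest SQDIFF) match

M = np.float32([[1, 0, -x], [0, 1, -y]])          # inverse translation
restored = cv2.warpAffine(received, M, received.shape[::-1])
print(x, y)                                       # 30 20
```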

A Data Hiding Scheme for Grayscale Images Using a Square Function

  • Kwon, Hyejin;Kim, Haemun;Kim, Soonja
    • Journal of Korea Multimedia Society
    • /
    • v.17 no.4
    • /
    • pp.466-477
    • /
    • 2014
  • Many image hiding schemes based on least-significant-bit (LSB) transformation have been proposed. One LSB-based scheme employing diamond encoding was proposed in 2008; it converts the binary secret data into a base-n representation and conceals the converted data in the cover image. Here, we show that this scheme has two weaknesses: noticeable spots in the stego-image, i.e., a non-smooth embedding result, and inefficiency caused by a rough re-adjustment of falling-off-boundary values and an impractical base conversion. Moreover, we propose a new scheme that is efficient and produces a smooth, high-quality embedding result by restricting n to powers of 2 and using a refined re-adjustment procedure. Our experimental results show that the scheme yields high-quality stego-images and is secure against the RS detection attack.
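
A minimal sketch of the restriction the authors motivate: with n limited to 2^k, each pixel hides exactly k secret bits as one base-2^k digit in its k least significant bits. The falling-off-boundary re-adjustment is omitted, and names are illustrative.

```python
import numpy as np

def embed(cover: np.ndarray, bits: str, k: int = 2) -> np.ndarray:
    """Hide the bit string in the k LSBs of the first len(bits)//k pixels."""
    stego = cover.astype(np.int32).ravel().copy()
    for i in range(0, len(bits), k):
        digit = int(bits[i:i + k], 2)                    # one base-2**k digit
        stego[i // k] = (stego[i // k] & ~(2**k - 1)) | digit
    return np.clip(stego, 0, 255).astype(np.uint8).reshape(cover.shape)

cover = np.full((4, 4), 200, dtype=np.uint8)
stego = embed(cover, '1101', k=2)                        # hides digits 3 and 1
print(stego[0, :2])                                      # [203 201]
```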

Imaging a scene from experience given verbal expressions

  • Sakai, Y.;Kitazawa, M.;Takahashi, S.
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 1995.10a
    • /
    • pp.307-310
    • /
    • 1995
  • In conventional systems, a human must have knowledge of machines and of their special language in order to communicate with them. On one side this is desirable for a human, but on the other it takes considerable effort to achieve and is also a significant cause of human error. To reduce this load, an intelligent man-machine interface should sit between the human operator and the machines being operated. Ordinary human communication uses not only linguistic information but also visual information, each compensating for the other's shortcomings. From this viewpoint, this paper discusses the problem of translating verbal expressions into a visual image. The location relation between any two objects in a visual scene is the key to translating verbal information into visual information, as in Fig. 1. The present translation system grows its knowledge with experience. It consists of Japanese language processing, image processing, and Japanese-to-scene translation functions.
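
A toy sketch of the key idea: mapping a verbal spatial relation between two objects to scene coordinates. The relation vocabulary and offsets are invented for illustration; the actual system processes Japanese input.

```python
# Displacement of the subject relative to the anchor object (dx, dy),
# in grid units; an invented vocabulary, not the paper's.
offsets = {
    'left of':  (-1, 0),
    'right of': (1, 0),
    'above':    (0, -1),
    'below':    (0, 1),
}

def place(relation: str, anchor: tuple[int, int]) -> tuple[int, int]:
    dx, dy = offsets[relation]
    return (anchor[0] + dx, anchor[1] + dy)

desk = (5, 5)
lamp = place('left of', desk)   # "the lamp is left of the desk"
print(lamp)                     # (4, 5)
```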

Image-to-Image Translation Based on U-Net with R2 and Attention (R2와 어텐션을 적용한 유넷 기반의 영상 간 변환에 관한 연구)

  • Lim, So-hyun;Chun, Jun-chul
    • Journal of Internet Computing and Services
    • /
    • v.21 no.4
    • /
    • pp.9-16
    • /
    • 2020
  • In image processing and computer vision, the problems of reconstructing one image from another and of generating new images have drawn steady attention as hardware advances. However, computer-generated images still often look unnatural to the human eye. With the recent surge of deep learning research, image generation and enhancement using it are being actively studied, and among these methods the Generative Adversarial Network (GAN) performs well at image generation. Many GAN variants have been presented since the original, enabling more natural images than earlier generative approaches. Among them, pix2pix is a conditional GAN model and a general-purpose network that performs well on various datasets. pix2pix is based on U-Net, but several U-Net-based networks show better performance. In this study, we therefore generate images by applying various networks to the U-Net of pix2pix and compare and evaluate the results. The generated images confirm that pix2pix models using the Attention, R2, and Attention-R2 networks outperform the original U-Net-based pix2pix; we also examine the limitations of the best-performing network and suggest them for future study.
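
A hedged sketch of an attention gate of the kind grafted onto pix2pix's U-Net skip connections (after Attention U-Net); layer sizes are illustrative, and this is not the authors' exact network.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, g_ch: int, x_ch: int, mid_ch: int):
        super().__init__()
        self.wg = nn.Conv2d(g_ch, mid_ch, 1)     # gating signal (decoder side)
        self.wx = nn.Conv2d(x_ch, mid_ch, 1)     # skip connection (encoder side)
        self.psi = nn.Conv2d(mid_ch, 1, 1)

    def forward(self, g, x):
        a = torch.relu(self.wg(g) + self.wx(x))  # additive attention
        alpha = torch.sigmoid(self.psi(a))       # per-pixel gate in [0, 1]
        return x * alpha                         # suppress irrelevant features

gate = AttentionGate(64, 32, 16)
g = torch.randn(1, 64, 32, 32)                   # decoder features
x = torch.randn(1, 32, 32, 32)                   # encoder skip features
print(gate(g, x).shape)                          # torch.Size([1, 32, 32, 32])
```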

Robust PCB Image Alignment using SIFT (잡음과 회전에 강인한 SIFT 기반 PCB 영상 정렬 알고리즘 개발)

  • Kim, Jun-Chul;Cui, Xue-Nan;Park, Eun-Soo;Choi, Hyo-Hoon;Kim, Hak-Il
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.16 no.7
    • /
    • pp.695-702
    • /
    • 2010
  • This paper presents a SIFT-based image alignment algorithm for AOI (Automatic Optical Inspection). Since the correspondences obtained with the SIFT descriptor contain many false matches for alignment, this paper filters and classifies them with five measures, called the CCFMR (Cascade Classifier for False Matching Reduction). After the false matches are reduced, rotation and translation are estimated by a point selection method. Experimental results show that the proposed method produces fewer matching failures than the commercial software MIL 8.0, notably less than half as many on data sets from a well-controlled environment (such as an AOI system). Its rotation and translation accuracy is more robust than MIL's on noisy data sets; the errors are higher on data sets with rotation variation, although those results are still meaningful for a practical system. In addition, the computation time of the proposed method is a quarter of that of MIL, whose time increases linearly with noise.
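
A sketch of the pipeline described above, with Lowe's ratio test standing in for the paper's CCFMR and synthetic images standing in for PCB data; the rotation and shift applied to the test image are known, so the recovered values can be checked.

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)
noise = (rng.random((256, 256)) * 255).astype(np.uint8)
ref = cv2.GaussianBlur(noise, (0, 0), 3)              # blob-like test texture
R = cv2.getRotationMatrix2D((128, 128), 10, 1.0)      # known 10-degree rotation
R[:, 2] += (5, -3)                                    # plus a known shift
test = cv2.warpAffine(ref, R, (256, 256))

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(ref, None)
kp2, des2 = sift.detectAndCompute(test, None)

matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.7 * n.distance]  # ratio test

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
M, inliers = cv2.estimateAffinePartial2D(src, dst)    # RANSAC drops outliers

angle = np.degrees(np.arctan2(M[1, 0], M[0, 0]))      # recovered rotation
print(f'rotation {angle:.1f} deg, shift ({M[0, 2]:.1f}, {M[1, 2]:.1f})')
```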

A Region Depth Estimation Algorithm using Motion Vector from Monocular Video Sequence (단안영상에서 움직임 벡터를 이용한 영역의 깊이추정)

  • 손정만;박영민;윤영우
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.5 no.2
    • /
    • pp.96-105
    • /
    • 2004
  • Recovering a 3D image from 2D requires depth information for each picture element, and manually creating such 3D models is time-consuming and expensive. The goal of this paper is to estimate the relative depth of every region from a single-view image taken under camera translation. It builds on the fact that the motion of each point in an image captured under camera translation depends on its depth. Motion vectors obtained by full-search motion estimation are compensated for camera rotation and zooming. We develop a framework that estimates the average depth of a frame by analyzing the motion vectors and then computes each region's depth relative to that average. Simulation results show that the estimated depths of regions belonging to near and far objects are consistent with the relative depths a human perceives.
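
A toy illustration of the premise: under pure camera translation, image motion is inversely proportional to depth, so relative region depth can be read off normalized motion magnitudes. The motion vectors here are synthetic stand-ins, not estimated from video.

```python
import numpy as np

mv = np.array([[8.0, 8.0, 2.0],        # per-block motion magnitudes (pixels)
               [8.0, 4.0, 2.0],
               [4.0, 4.0, 1.0]])

rel_depth = 1.0 / np.maximum(mv, 1e-6)  # depth ~ 1 / motion under translation
rel_depth /= rel_depth.mean()           # normalize to the frame-average depth
print(np.round(rel_depth, 2))           # > 1: farther region, < 1: nearer
```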

Real-time Multiple Stereo Image Synthesis using Depth Information (깊이 정보를 이용한 실시간 다시점 스테레오 영상 합성)

  • Jang Se hoon;Han Chung shin;Bae Jin woo;Yoo Ji sang
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.4C
    • /
    • pp.239-246
    • /
    • 2005
  • In this paper, we generate a virtual right image corresponding to the input left image using the given RGB texture data and 8-bit grayscale depth data. We first transform the depth data into disparity data and then produce the virtual right image from this disparity. We also propose a stereo image synthesis algorithm that adapts to the viewer's position, together with a real-time processing algorithm based on a fast LUT (look-up table) method. Finally, we could synthesize a total of eleven stereo images with different viewpoints in real time from an SD-quality texture image with 8-bit depth information.
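
A minimal sketch of the two steps named above: an 8-bit depth-to-disparity LUT followed by pixel shifting to synthesize the virtual right view. The maximum disparity is an illustrative assumption, and hole filling is omitted.

```python
import numpy as np

max_disp = 16
lut = (np.arange(256) / 255.0 * max_disp).astype(np.int32)  # depth -> disparity

def synthesize_right(left: np.ndarray, depth: np.ndarray) -> np.ndarray:
    h, w = depth.shape
    right = np.zeros_like(left)
    for y in range(h):
        for x in range(w):
            nx = x - lut[depth[y, x]]         # nearer pixels shift farther
            if 0 <= nx < w:
                right[y, nx] = left[y, x]
    return right                              # occlusion holes left unfilled

left = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
depth = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
print(synthesize_right(left, depth).shape)    # (64, 64, 3)
```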