• Title/Summary/Keyword: Learned images


Design of Image Generation System for DCGAN-Based Kids' Book Text

  • Cho, Jaehyeon;Moon, Nammee
    • Journal of Information Processing Systems / v.16 no.6 / pp.1437-1446 / 2020
  • For the last few years, smart devices have occupied an essential place in children's lives by giving them access to a variety of language activities and books, and various studies are being conducted on using smart devices for education. Our study extracts images and text from kids' books with smart devices and matches the extracted images and text to create new images that are not represented in these books. The proposed system will enable the use of smart devices as educational media for children. A deep convolutional generative adversarial network (DCGAN) is used to generate a new image. Training the DCGAN involves three steps. First, 1,164 images across 11 titles from ImageNet are learned. Second, Tesseract, an optical character recognition engine, is used to extract images and text from kids' books, and the text is classified using a morpheme analyzer. Third, the classified word class is matched with the latent vector of the image. The trained DCGAN creates an image associated with the text.
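The third training step, matching a classified word class to the generator's latent vector, could be sketched as follows. The class list, dimensions, and one-hot concatenation scheme are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical word classes produced by the morpheme analyzer (illustrative only).
WORD_CLASSES = ["noun", "verb", "adjective"]

def conditional_latent(word_class: str, noise_dim: int = 100) -> np.ndarray:
    """Build a DCGAN latent vector conditioned on a word class by
    concatenating a one-hot class code with Gaussian noise."""
    one_hot = np.zeros(len(WORD_CLASSES))
    one_hot[WORD_CLASSES.index(word_class)] = 1.0
    return np.concatenate([one_hot, rng.standard_normal(noise_dim)])

z = conditional_latent("noun")
print(z.shape)  # (103,)
```

The generator would then map such a vector to an image whose content reflects the text class.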

Learning Discriminative Fisher Kernel for Image Retrieval

  • Wang, Bin;Li, Xiong;Liu, Yuncai
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.3 / pp.522-538 / 2013
  • Content-based image retrieval has become an increasingly important research topic because of its wide application. It is highly challenging when facing a large-scale database with large variance. Retrieval systems rely on a key component, the predefined or learned similarity measure over images. We note that the similarity measure can potentially be improved if the data distribution information is exploited in a more sophisticated way. In this paper, we propose a similarity measure learning approach for image retrieval. The similarity measure, the so-called Fisher kernel, is derived from the probabilistic distribution of images and is a function of the observed data, hidden variables, and model parameters, where the hidden variables encode high-level information that is powerful for discrimination but has not been exploited by previous methods. We further propose a discriminative learning method for the similarity measure, i.e., encouraging the learned similarity to take a large value for a pair of images with the same label and a small value for a pair of images with distinct labels. The learned similarity measure, fully exploiting the data distribution, is well adapted to the dataset and improves the retrieval system. We evaluate the proposed method on the Corel-1000, Corel5k, Caltech101, and MIRFlickr 25,000 databases. The results show the competitive performance of the proposed method.
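As a rough illustration of the Fisher-kernel idea, here is the Fisher score and kernel under a simple diagonal-Gaussian model; the Gaussian stands in for the paper's richer latent-variable distribution, so this is a sketch of the construction, not the authors' method:

```python
import numpy as np

def fisher_score(x, mu, var):
    """Gradient of log N(x; mu, diag(var)) with respect to the mean mu."""
    return (x - mu) / var

def fisher_kernel(x, y, mu, var):
    """Fisher kernel: inner product of Fisher scores, weighted by the
    inverse Fisher information (which is var, for the mean of a Gaussian)."""
    return float(np.sum(fisher_score(x, mu, var) * fisher_score(y, mu, var) * var))

mu, var = np.zeros(2), np.array([1.0, 2.0])
a, b = np.array([1.0, 1.0]), np.array([1.0, -1.0])
print(fisher_kernel(a, a, mu, var))  # 1.5
```

A discriminative variant would then tune the model parameters so that same-label pairs score higher than distinct-label pairs.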

Model Verification Algorithm for ATM Security System (ATM 보안 시스템을 위한 모델 인증 알고리즘)

  • Jeong, Heon;Lim, Chun-Hwan;Pyeon, Suk-Bum
    • Journal of the Institute of Electronics Engineers of Korea TE / v.37 no.3 / pp.72-78 / 2000
  • In this study, we propose a model verification algorithm based on the DCT and a neural network for an ATM security system. We construct a database of facial images after capturing thirty persons' facial images under the same illumination and at the same distance. To simulate model verification, we capture four learning images and test images per person. After detecting edges in the facial images, we detect a square characteristic area using the edge distribution; the characteristic area contains the eyebrows, eyes, nose, mouth, and cheeks. We extract characteristic vectors by summing the diagonal coefficients after obtaining the DCT coefficients of the characteristic area. The characteristic vectors are normalized between +1 and -1 and then used as input vectors for the neural network. Without considering passwords, simulation results showed a 100% verification rate for learned facial images and a 92% verification rate for unlearned facial images. When passwords were considered, the proposed algorithm showed a 100% verification rate in both simulations.

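The feature-extraction step described above, a 2-D DCT of the characteristic area followed by summing diagonal coefficients and normalizing to [-1, +1], might be sketched as follows; the block size and number of diagonals are illustrative choices:

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block via the DCT basis matrix."""
    n = block.shape[0]
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0] /= np.sqrt(2.0)
    return C @ block @ C.T

def diagonal_feature(block, n_diags=4):
    """Sum the DCT coefficients along the first few anti-diagonals
    (the low frequencies) and normalize the vector to [-1, +1]."""
    coeffs = dct2(block.astype(float))
    n = coeffs.shape[0]
    feats = np.array([np.trace(np.fliplr(coeffs), offset=n - 1 - d)
                      for d in range(n_diags)])
    peak = np.max(np.abs(feats))
    return feats / peak if peak > 0 else feats

f = diagonal_feature(np.ones((8, 8)), n_diags=2)
```

The normalized vector would then serve as the neural network's input, as the abstract describes.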

Multistage Transfer Learning for Breast Cancer Early Diagnosis via Ultrasound (유방암 조기 진단을 위한 초음파 영상의 다단계 전이 학습)

  • Ayana, Gelan;Park, Jinhyung;Choe, Se-woon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.134-136 / 2021
  • Research on the early diagnosis of breast cancer using artificial intelligence algorithms has been actively conducted in recent years. Although various algorithms that classify breast cancer based on a few publicly available ultrasound breast cancer images have been published, these methods show limitations such as processing speed and accuracy unsuited to the user's purpose. To solve this problem, in this paper we propose multi-stage transfer learning, in which a ResNet model trained on ImageNet is transfer-learned on microscopic cancer cell line images and then transfer-learned again to classify ultrasound breast cancer images as benign or malignant. The images for the experiment consisted of 250 breast cancer ultrasound images, including benign and malignant images, and 27,200 cancer cell line images. The proposed multi-stage transfer learning algorithm showed more than 96% accuracy when classifying ultrasound breast cancer images, and it is expected to achieve higher utility and accuracy through the addition of more cancer cell lines and real-time image processing in the future.

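The multi-stage idea, keeping the pretrained backbone and fitting a new head at each stage, can be sketched in toy form. Random matrices stand in for ResNet and for the cell-line and ultrasound datasets; this is not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def transfer_stage(backbone, X, y):
    """One transfer stage: push data through the frozen backbone,
    then fit a fresh linear head by least squares."""
    feats = np.tanh(X @ backbone)
    head, *_ = np.linalg.lstsq(feats, y, rcond=None)
    return head

# Stage 1: 'ImageNet-pretrained' backbone (random stand-in here).
backbone = rng.standard_normal((8, 4))

# Stage 2: adapt to microscopic cancer cell line images.
X_cell = rng.standard_normal((50, 8))
y_cell = rng.integers(0, 2, (50, 1)).astype(float)
head_cell = transfer_stage(backbone, X_cell, y_cell)

# Stage 3: adapt again to ultrasound images (benign vs. malignant).
X_us = rng.standard_normal((20, 8))
y_us = rng.integers(0, 2, (20, 1)).astype(float)
head_us = transfer_stage(backbone, X_us, y_us)
print(head_us.shape)  # (4, 1)
```

In practice each stage would also fine-tune some backbone layers; only the staged reuse of learned weights is shown here.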

Classification of Anteroposterior/Lateral Images and Segmentation of the Radius Using Deep Learning in Wrist X-rays Images (손목 관절 단순 방사선 영상에서 딥 러닝을 이용한 전후방 및 측면 영상 분류와 요골 영역 분할)

  • Lee, Gi Pyo;Kim, Young Jae;Lee, Sanglim;Kim, Kwang Gi
    • Journal of Biomedical Engineering Research / v.41 no.2 / pp.94-100 / 2020
  • The purpose of this study was to present deep learning models for classifying wrist X-ray images by type and for automatically segmenting the radius in each image, and to verify the trained models. The data comprised a total of 904 wrist X-rays with distal radius fractures, consisting of 472 anteroposterior (AP) and 432 lateral images. The ResNet50 model was used for AP/lateral image classification, and the U-Net model was used for segmentation of the radius. The classification model achieved 100.0% precision, recall, and F1 score, with an area under the curve (AUC) of 1.0. The segmentation model showed an accuracy of 99.46%, a sensitivity of 89.68%, a specificity of 99.72%, and a Dice similarity coefficient of 90.05% on AP images, and an accuracy of 99.37%, a sensitivity of 88.65%, a specificity of 99.69%, and a Dice similarity coefficient of 86.05% on lateral images. The classification and segmentation models trained through deep learning showed performance favorable enough to expect clinical application.
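The metrics reported above are standard and can be computed directly from counts and binary masks; a minimal sketch:

```python
import numpy as np

def precision_recall_f1(tp, fp, fn):
    """Classification metrics from true-positive, false-positive,
    and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, 2 * precision * recall / (precision + recall)

def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / (pred.sum() + target.sum())

mask = np.zeros((4, 4), dtype=int)
mask[1:3, 1:3] = 1
print(dice_coefficient(mask, mask))  # 1.0
```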

Classification and Safety Score Evaluation of Street Images Using CNN (CNN을 이용한 거리 사진의 분류와 안전도 평가)

  • Bae, Kyu Ho;Yun, Jung Un;Park, In Kyu
    • Journal of Broadcast Engineering / v.23 no.3 / pp.345-350 / 2018
  • The convolutional neural network (CNN) has become the most popular artificial intelligence technique and shows remarkable performance in image classification tasks. In this paper, we propose a CNN-based classification method for various street images as well as a method of evaluating a safety score for the street. The proposed method consists of learning four types of street images using a CNN, classifying input street images using the learned CNN model, and then evaluating the safety score. During the learning process, four types of street images are collected and augmented, and then CNN training is performed. We show that the learned CNN model classifies input images correctly and that safety scores can be evaluated quantitatively by combining the probabilities of the different street types.
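Combining class probabilities into a single safety score could look like the following; the category names and per-category weights are invented for illustration, since the abstract does not list them:

```python
import numpy as np

# Hypothetical street categories and safety weights (0 = unsafe, 1 = safe).
CATEGORIES = ["residential", "commercial", "alley", "industrial"]
SAFETY_WEIGHTS = np.array([0.9, 0.7, 0.3, 0.5])

def softmax(logits):
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def safety_score(logits):
    """Weight the CNN's class probabilities by per-class safety values."""
    return float(softmax(logits) @ SAFETY_WEIGHTS)

s = safety_score(np.array([4.0, 1.0, 0.5, 0.2]))
```

A street the CNN confidently calls "residential" thus scores near 0.9, while a confident "alley" classification scores near 0.3.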

Asymmetric Semi-Supervised Boosting Scheme for Interactive Image Retrieval

  • Wu, Jun;Lu, Ming-Yu
    • ETRI Journal / v.32 no.5 / pp.766-773 / 2010
  • Support vector machine (SVM) active learning plays a key role in the interactive content-based image retrieval (CBIR) community. However, regular SVM active learning is challenged by what we call "the small example problem" and "the asymmetric distribution problem." This paper attempts to integrate the merits of semi-supervised learning, ensemble learning, and active learning into interactive CBIR. Concretely, unlabeled images are exploited to facilitate boosting by helping augment the diversity among base SVM classifiers, and the learned ensemble model is then used to identify the most informative images for active learning. In particular, a bias-weighting mechanism is developed to guide the ensemble model to pay more attention to positive images than to negative images. Experiments on 5,000 Corel images show that the proposed method improves retrieval performance by 0.16 in mean average precision compared to regular SVM active learning, and it is more effective than some existing improved variants of SVM active learning.
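The two ingredients of such a scheme, querying the unlabeled images whose ensemble margin is closest to the decision boundary and biasing example weights toward positives, might be sketched as follows (the bias factor is an illustrative parameter, not the paper's value):

```python
import numpy as np

def most_informative(margins, k=3):
    """Active-learning query: pick the k unlabeled images whose ensemble
    margin is closest to the decision boundary (smallest |margin|)."""
    return np.argsort(np.abs(margins))[:k]

def asymmetric_weights(labels, pos_bias=2.0):
    """Bias-weighting: positive examples receive pos_bias times the
    weight of negatives, then weights are normalized to sum to 1."""
    w = np.where(labels == 1, pos_bias, 1.0)
    return w / w.sum()

margins = np.array([0.9, -0.05, 0.4, 0.01])
print(most_informative(margins, k=2))  # [3 1]
```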

Face Detection Tracking in Sequential Images using Backpropagation (역전파 신경망을 이용한 동영상에서의 얼굴 검출 및 트래킹)

  • 지승환;김용주;김정환;박민용
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1997.11a / pp.124-127 / 1997
  • In this paper, we propose a new face detection and tracking algorithm for sequential images with complex backgrounds. To apply the face detection algorithm efficiently, we convert the conventional RGB coordinates into CIE coordinates, making the input images insensitive to luminance, and human face shapes and colors are learned using a neural network trained with backpropagation. To handle variable face sizes, we vary the mosaic size of the input images and obtain face locations at various scales through the neural network. In addition, for sequential images, we suggest a face motion tracking algorithm based on image subtraction and thresholding; for accurate tracking, the face location from the previous image is used. Finally, we verify the real-time applicability of the proposed algorithm through a simple simulation.

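The subtraction-and-thresholding step for motion tracking is simple frame differencing; a minimal sketch (the threshold value is illustrative):

```python
import numpy as np

def motion_mask(prev, curr, thresh=30):
    """Frame differencing: threshold the absolute difference between
    two consecutive grayscale frames to find moving pixels."""
    diff = np.abs(curr.astype(int) - prev.astype(int))
    return diff > thresh

def bounding_box(mask):
    """Bounding box (top, left, bottom, right) of the moving region,
    or None if nothing moved."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return ys.min(), xs.min(), ys.max(), xs.max()
```

A tracker would seed the next frame's search from this box, as the abstract notes when it reuses the previous image's face location.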

Enhancing Single Thermal Image Depth Estimation via Multi-Channel Remapping for Thermal Images (열화상 이미지 다중 채널 재매핑을 통한 단일 열화상 이미지 깊이 추정 향상)

  • Kim, Jeongyun;Jeon, Myung-Hwan;Kim, Ayoung
    • The Journal of Korea Robotics Society / v.17 no.3 / pp.314-321 / 2022
  • Depth information, used in SLAM and visual odometry, is essential in robotics. It is often obtained from sensors or learned by networks. While learning-based methods have gained popularity, they are mostly limited to RGB images, which fail in visually degraded environments. Thermal cameras are in the spotlight as a way to solve these problems. Unlike RGB images, thermal images reliably perceive the environment regardless of illumination variance, but they lack contrast and texture. This low contrast prevents an algorithm from effectively learning the underlying scene details. To tackle these challenges, we propose multi-channel remapping for contrast enhancement. Our method allows a learning-based depth prediction model to make accurate depth predictions even in low-light conditions. We validate its feasibility and show that our multi-channel remapping method outperforms existing methods both visually and quantitatively on our dataset.
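One plausible form of multi-channel remapping is stacking several contrast transforms of the single thermal channel into a 3-channel input; the specific transforms below (min-max, histogram equalization, square root) are assumptions for illustration, not necessarily the paper's choices:

```python
import numpy as np

def histogram_equalize(img, bins=256):
    """Spread intensities via the cumulative histogram (contrast enhancement).
    Expects values in [0, 1]."""
    hist, edges = np.histogram(img.ravel(), bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum() / img.size
    return np.interp(img.ravel(), edges[:-1], cdf).reshape(img.shape)

def multi_channel_remap(thermal):
    """Stack several contrast remappings of one thermal image into a
    3-channel input suitable for an RGB-trained depth network."""
    t = (thermal - thermal.min()) / (np.ptp(thermal) + 1e-8)  # min-max normalize
    return np.stack([t, histogram_equalize(t), np.sqrt(t)], axis=-1)
```

Each channel emphasizes different intensity ranges, giving the depth network more usable texture than the raw low-contrast image.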

Far Distance Face Detection from The Interest Areas Expansion based on User Eye-tracking Information (시선 응시 점 기반의 관심영역 확장을 통한 원 거리 얼굴 검출)

  • Park, Heesun;Hong, Jangpyo;Kim, Sangyeol;Jang, Young-Min;Kim, Cheol-Su;Lee, Minho
    • Journal of the Institute of Electronics and Information Engineers / v.49 no.9 / pp.113-127 / 2012
  • Face detection methods using image processing have been proposed in many different forms. The most widely used method is the AdaBoost approach proposed by Viola and Jones, which uses Haar-like features for image learning, so its detection performance depends on the learned images. It performs well for face images within a certain distance range, but if the subject is far from the camera, the face images become so small that they may not be detected with the pre-learned Haar-like features. In this paper, we propose a far-distance face detection method that combines the Viola-Jones AdaBoost with a saliency map and the user's attention information. The saliency map is used to select candidate face regions in the input image, and faces are finally detected among the candidate regions using AdaBoost with Haar-like features learned in advance. The user's eye-tracking information is used to select the regions of interest. When a subject is so far from the camera that detecting the face is difficult, we expand the small eye-gaze spot region using linear interpolation, reuse it as the input image, and thereby increase face detection performance. We confirmed that the proposed model gives better results than the conventional AdaBoost in terms of both face detection performance and computation time.
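The region-expansion step, upscaling the small gaze-region patch by linear interpolation before rerunning the detector, might look like this; the scale factor is an illustrative parameter:

```python
import numpy as np

def expand_roi(patch, scale=4):
    """Upscale a small gaze-region patch by separable linear interpolation
    so a pre-learned Haar-feature detector can be rerun on the enlarged
    region: interpolate along rows first, then along columns."""
    h, w = patch.shape
    ys = np.linspace(0, h - 1, h * scale)
    xs = np.linspace(0, w - 1, w * scale)
    rows = np.array([np.interp(xs, np.arange(w), patch[i]) for i in range(h)])
    out = np.array([np.interp(ys, np.arange(h), rows[:, j])
                    for j in range(w * scale)]).T
    return out
```

The enlarged patch is then fed back into the AdaBoost detector as a normal-sized input image.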