• Title/Summary/Keyword: end-to-end learning

Urinary Stones Segmentation Model and AI Web Application Development in Abdominal CT Images Through Machine Learning (기계학습을 통한 복부 CT영상에서 요로결석 분할 모델 및 AI 웹 애플리케이션 개발)

  • Lee, Chung-Sub;Lim, Dong-Wook;Noh, Si-Hyeong;Kim, Tae-Hoon;Park, Sung-Bin;Yoon, Kwon-Ha;Jeong, Chang-Won
    • KIPS Transactions on Computer and Communication Systems / v.10 no.11 / pp.305-310 / 2021
  • Artificial intelligence technology in the medical field initially focused on analysis and algorithm development, but it is gradually shifting toward web application development for delivery as a product or service. This paper describes a urinary stone segmentation model for abdominal CT images and an artificial intelligence web application based on it. The model was developed using U-Net, an end-to-end, fully convolutional network proposed for image segmentation in the medical imaging field. The web service was developed on the AWS cloud using Flask, a Python-based micro web framework. Finally, the predictions of the urolithiasis segmentation model, obtained through model serving, are presented as the output of the AI web application service. We expect that the proposed AI web application service will be used for screening tests.

Image classification and captioning model considering a CAM-based disagreement loss

  • Yoon, Yeo Chan;Park, So Young;Park, Soo Myoung;Lim, Heuiseok
    • ETRI Journal / v.42 no.1 / pp.67-77 / 2020
  • Image captioning has received significant interest in recent years, and notable results have been achieved. Most previous approaches have focused on generating visual descriptions from images, whereas a few approaches have exploited visual descriptions for image classification. This study demonstrates that a good performance can be achieved for both description generation and image classification through an end-to-end joint learning approach with a loss function, which encourages each task to reach a consensus. When given images and visual descriptions, the proposed model learns a multimodal intermediate embedding, which can represent both the textual and visual characteristics of an object. The performance can be improved for both tasks by sharing the multimodal embedding. Through a novel loss function based on class activation mapping, which localizes the discriminative image region of a model, we achieve a higher score when the captioning and classification model reaches a consensus on the key parts of the object. Using the proposed model, we established a substantially improved performance for each task on the UCSD Birds and Oxford Flowers datasets.
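The class activation mapping (CAM) step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard CAM formulation, in which the map for a class is the weighted sum of the final convolutional feature maps, using that class's classifier weights. The function names, the toy feature maps, and the weights are hypothetical, and the paper's disagreement loss itself is not reproduced here because the abstract does not specify it.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W, nested lists) for one class."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def normalize(cam):
    """Rescale a map to [0, 1] so the strongest region has value 1."""
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in cam]
    return [[(value - lo) / span for value in row] for row in cam]


# Toy example (hypothetical values): two 2x2 feature maps and their class weights.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [0.5, 1.0]

cam = class_activation_map(feature_maps, class_weights)
heatmap = normalize(cam)
```

In this toy case the bottom-right cell receives the largest activation, so a model using this map would treat that region as the most discriminative one.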

Visual servo control of robots using fuzzy-neural-network (퍼지신경망을 이용한 로보트의 비쥬얼서보제어)

  • 서은택;정진현
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 1994.10a / pp.566-571 / 1994
  • This paper presents an image-based visual servo control scheme for tracking a workpiece with a hand-eye coordinated robotic system using a fuzzy-neural-network. The goal is to control the relative position and orientation between the end-effector and a moving workpiece using a single camera mounted on the end-effector of a robot manipulator. We developed a fuzzy-neural-network that consists of a network-model fuzzy system and supervised learning rules. The fuzzy-neural-network is applied to approximate the nonlinear mapping that transforms the features and their changes into the desired camera motion. In addition, a control strategy for real-time relative motion control based on this approximation is presented. Computer simulation results are presented to show the effectiveness of the fuzzy-neural-network method for visual servoing of a robot manipulator.

Multi-focus Image Fusion using Fully Convolutional Two-stream Network for Visual Sensors

  • Xu, Kaiping;Qin, Zheng;Wang, Guolong;Zhang, Huidi;Huang, Kai;Ye, Shuxiong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.5 / pp.2253-2272 / 2018
  • We propose a deep learning method for multi-focus image fusion. Unlike most existing pixel-level fusion methods, which operate in either the spatial domain or the transform domain, our method directly learns an end-to-end fully convolutional two-stream network. The framework maps a pair of differently focused images to a clean version through a chain of convolutional layers, a fusion layer, and deconvolutional layers. Our deep fusion model offers efficiency and robustness while demonstrating state-of-the-art fusion quality. We explore different parameter settings to achieve trade-offs between performance and speed. Moreover, the experimental results on our training dataset show that our network achieves good performance in terms of both subjective visual perception and objective assessment metrics.

Perceptual Photo Enhancement with Generative Adversarial Networks (GAN 신경망을 통한 자각적 사진 향상)

  • Que, Yue;Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.522-524 / 2019
  • Despite rapid improvement in the quality of built-in mobile cameras, their physical restrictions still prevent them from matching the results of digital single-lens reflex (DSLR) cameras. In this work, we propose an end-to-end deep learning method to translate ordinary images taken by mobile cameras into DSLR-quality photos. The method is based on the framework of generative adversarial networks (GANs) with several improvements. First, we combined U-Net with DenseNet, connecting dense blocks (DBs) in the U-Net layout. The resulting Dense U-Net acts as the generator in our GAN model. Then, we improved the perceptual loss by using VGG features and pixel-wise content, which provides stronger supervision for contrast enhancement and texture recovery.

Super-resolution of compressed image by deep residual network

  • Jin, Yan;Park, Bumjun;Jeong, Jechang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2018.11a / pp.59-61 / 2018
  • Highly compressed images typically not only have low resolution but are also affected by compression artifacts. Performing image super-resolution (SR) directly on a highly compressed image would simultaneously magnify the blocking artifacts. In this paper, an SR method based on deep learning is proposed. The method is an end-to-end trainable deep convolutional neural network that performs SR on compressed images so as to reduce compression artifacts and improve image resolution. The proposed network is divided into a compression artifacts removal (CAR) part and an SR reconstruction part, and the network is trained with a three-step training method to optimize the training procedure. Experiments on JPEG compressed images with quality factors of 10, 20, and 30 demonstrate the effectiveness of the proposed method on commonly used test images and image sets.

Variational autoencoder for prosody-based speaker recognition

  • Starlet Ben Alex;Leena Mary
    • ETRI Journal / v.45 no.4 / pp.678-689 / 2023
  • This paper describes a novel end-to-end deep generative model-based speaker recognition system using prosodic features. The usefulness of variational autoencoders (VAE) in learning the speaker-specific prosody representations for the speaker recognition task is examined herein for the first time. The speech signal is first automatically segmented into syllable-like units using vowel onset points (VOP) and energy valleys. Prosodic features, such as the dynamics of duration, energy, and fundamental frequency (F0), are then extracted at the syllable level and used to train/adapt a speaker-dependent VAE from a universal VAE. The initial comparative studies on VAEs and traditional autoencoders (AE) suggest that the former can efficiently learn speaker representations. Investigations on the impact of gender information in speaker recognition also point out that gender-dependent impostor banks lead to higher accuracies. Finally, the evaluation on the NIST SRE 2010 dataset demonstrates the usefulness of the proposed approach for speaker recognition.
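As a rough illustration of what distinguishes a VAE from a traditional autoencoder, the sketch below shows the two ingredients of the standard VAE formulation: sampling a latent vector with the reparameterization trick, and the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior. It assumes that textbook formulation only. The abstract does not give the paper's architecture, feature dimensions, or exact loss, so the function names and values here are hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps ~ N(0, 1) and sigma = exp(0.5 * log_var)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


# Hypothetical encoder outputs for one syllable-level prosodic feature vector.
rng = random.Random(0)
mu = [1.0, 0.0]
log_var = [0.0, 0.0]

z = reparameterize(mu, log_var, rng)        # stochastic latent sample
kl = kl_to_standard_normal(mu, log_var)     # regularization term of the VAE loss
```

The KL term is zero only when the posterior equals the prior, which is what pushes the latent space toward a smooth, shared structure that a plain autoencoder does not enforce.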

Effects of Ongoing Feedback on Students' Attitudes towards Writing

  • Yang, Tae-Sun
    • English Language & Literature Teaching / v.16 no.1 / pp.171-188 / 2009
  • The purpose of this study was to investigate the role of ongoing feedback from the professor in students' processes of learning and developing writing skills. Specifically, the researcher was concerned with how ongoing feedback affected students' attitudes towards writing because, in EFL contexts, motivating students to write is a first step in engaging them in the challenging journey of academic writing. Twenty freshmen taking a writing course, "Paragraph & Essay Writing", at A university participated in this study, and they were asked to complete a questionnaire at the end of the spring semester of 2009. The results revealed that receiving ongoing feedback from the professor had a positive influence on the affective domain, helped students develop learning strategies, and was valuable for learning outcomes. However, the students also expressed negative opinions: feeling burdened, focusing on form, and feeling confused. To reflect their opinions, the following four suggestions were made to create a more effective learning environment: promoting learner autonomy, facilitating individual writing conferences, giving balanced feedback between form and content, and using judicious feedback through careful streaming.

Seamless Mobile Learning: Possibilities and Challenges Arising from the Singapore Experience

  • SO, Hyo-Jeong;KIM, Insu;LOOI, Chee-Kit
    • Educational Technology International / v.9 no.2 / pp.97-121 / 2008
  • The purposes of the present study are to describe the design of mobile learning scenarios based on learning sciences theories and to discuss implications for future research in this area. To move beyond mere speculation about the abundant possibilities of mobile learning and to make a real impact in K-12 school settings, it is critical to conduct school-based research grounded in learning sciences theories. Towards this end, this paper describes school-based mobile learning projects conducted by a research team at the Learning Sciences Lab in Singapore, and then discusses the possibilities and challenges of mobile learning to further inform future research. Specifically, this paper explores the affordances of mobile technology, such as portability, connectivity and context-sensitivity, to design seamless learning scenarios that bridge formal and informal learning experiences. The authors present a framework for re-conceptualizing different types of learning based on physical settings and intentionality, and then describe two seamless learning scenarios, namely 3Rs and Chinatown Trail, which were implemented in one primary school in Singapore. In conclusion, the authors discuss the affordances of seamless mobile learning for enhancing one's lived experiences to build a living ecological relationship between the person and the environment, and how mobile technology can play a critical role in enabling such lived experiences.

A Study on Effects of AR and VR Assisted Lessons on Immersion in Learning and Academic Stress

  • Han, Ji-Woo
    • International Journal of Internet, Broadcasting and Communication / v.10 no.2 / pp.19-24 / 2018
  • This study investigated academic stress and immersion in learning under AR and VR assisted instruction compared with traditional approaches. To that end, 78 eighth graders in T and S cities in Gangwon-do were assigned to experimental and control groups. The experimental group received the VR and AR lessons. Academic stress was measured with pre- and post-test scores, while immersion in learning was measured with post-test scores. In brief, AR and VR assisted lessons produced statistically significant differences in academic stress and immersion in learning in comparison with the traditional approaches.