• Title/Summary/Keyword: Deep Learning

Search Results: 5,795, Processing Time: 0.037 seconds

Interaction art using Video Synthesis Technology

  • Kim, Sung-Soo;Eom, Hyun-Young;Lim, Chan
    • International Journal of Advanced Culture Technology
    • /
    • v.7 no.2
    • /
    • pp.195-200
    • /
    • 2019
  • Media art, which combines media technology and art, is making great progress in combination with AI, IoT, and VR. This paper aims to meet people's needs by creating a video that simulates the dance moves of an object that users admire, using media art that features interactive interaction between users and works. The project proposes a universal image synthesis system that minimizes equipment constraints by utilizing a deep learning-based skeleton estimation system and a deep neural network structure, rather than a Kinect-based skeleton image. The experimental results showed that the videos produced by the deep learning system successfully generated the same results as if the user had actually danced, through inference and synthesis of motions the user never actually performed.
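
The core of such skeleton-driven synthesis is mapping pose keypoints from a reference performer onto the user. A minimal sketch of that retargeting step, assuming keypoints are `(x, y)` tuples from some pose estimator (the function names and the simple scale-and-translate mapping are illustrative, not the paper's actual pipeline):

```python
# Hypothetical sketch: retarget 2D skeleton keypoints from a reference
# dancer into the user's body frame by matching bounding boxes.

def bbox(points):
    """Axis-aligned bounding box of a list of (x, y) keypoints."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

def retarget(src_points, dst_points):
    """Map source keypoints into the destination skeleton's bounding box."""
    sx0, sy0, sx1, sy1 = bbox(src_points)
    dx0, dy0, dx1, dy1 = bbox(dst_points)
    sw = max(sx1 - sx0, 1e-9)  # avoid division by zero
    sh = max(sy1 - sy0, 1e-9)
    return [
        (dx0 + (x - sx0) / sw * (dx1 - dx0),
         dy0 + (y - sy0) / sh * (dy1 - dy0))
        for x, y in src_points
    ]
```

In the paper, the synthesis network then renders the user's appearance on top of the retargeted pose sequence.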

Video Saliency Detection Using Bi-directional LSTM

  • Chi, Yang;Li, Jinjiang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.6
    • /
    • pp.2444-2463
    • /
    • 2020
  • Saliency detection in video can allocate computing resources more rationally and reduce the amount of computation while improving accuracy. Deep learning can extract the edge features of images, providing technical support for video saliency. This paper proposes a new detection method. We combine a Convolutional Neural Network (CNN) and a Deep Bidirectional LSTM network (DB-LSTM) to learn spatio-temporal features by exploring object motion information, generating a continuous sequence of saliency maps for the video frames. We also analyzed the sample database and found that human attention and saliency transitions are time-dependent, so we also considered cross-frame saliency detection in video. Finally, experiments show that our method is superior to other advanced methods.
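
The bidirectional part of the design can be illustrated with a toy sketch: each frame's feature is fused with context accumulated from both temporal directions. Here a real LSTM cell is replaced by an exponential moving average, so this only shows the forward/backward fusion idea, not the paper's model:

```python
# Toy bidirectional context fusion: run an accumulator forward and
# backward over the frame sequence, then fuse both directions per frame
# (a bidirectional LSTM concatenates hidden states in the same spirit).

def ema(seq, alpha=0.5):
    """Exponential moving average standing in for an LSTM's hidden state."""
    out, h = [], 0.0
    for x in seq:
        h = alpha * x + (1 - alpha) * h
        out.append(h)
    return out

def bidirectional_context(frame_features, alpha=0.5):
    fwd = ema(frame_features, alpha)
    bwd = list(reversed(ema(list(reversed(frame_features)), alpha)))
    return [(f + b) / 2 for f, b in zip(fwd, bwd)]
```

A symmetric input sequence yields a symmetric output, which a forward-only pass cannot guarantee; that is exactly what a backward pass adds.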

Joint Demosaicing and Super-resolution of Color Filter Array Image based on Deep Image Prior Network

  • Kurniawan, Edwin;Lee, Suk-Ho
    • International journal of advanced smart convergence
    • /
    • v.11 no.2
    • /
    • pp.13-21
    • /
    • 2022
  • In this paper, we propose a learning-based joint demosaicing and super-resolution framework which uses only the mosaiced color filter array (CFA) image as the input. As the proposed method works only on the mosaiced CFA image itself, there is no need for a large dataset. Based on our framework, we propose two different structures, where the first structure uses one deep image prior network, while the second uses two. Experimental results show that even though we use only the CFA image as the training image, the proposed method can achieve better visual quality than other demosaicing methods combined with bilinear interpolation, and therefore opens up a new research area for joint demosaicing and super-resolution on raw images.
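
To make the problem setup concrete, the following sketch samples an RGB image into a Bayer (RGGB) mosaic, which is the only observation the framework above needs. The deep-image-prior network itself is omitted; this just shows what a CFA input looks like:

```python
# Sketch: simulate an RGGB Bayer color filter array from a full RGB image.
# Each pixel keeps only one of its three channels, according to its
# position in the repeating 2x2 RGGB pattern.

def to_bayer_rggb(rgb):
    """rgb: H x W nested list of (r, g, b); returns H x W mosaic."""
    h, w = len(rgb), len(rgb[0])
    mosaic = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            r, g, b = rgb[y][x]
            if y % 2 == 0 and x % 2 == 0:
                mosaic[y][x] = r          # red site
            elif y % 2 == 1 and x % 2 == 1:
                mosaic[y][x] = b          # blue site
            else:
                mosaic[y][x] = g          # green sites (two per 2x2 block)
    return mosaic
```

Joint demosaicing and super-resolution then amounts to recovering a full-resolution RGB image from this single-channel observation alone.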

Building Change Detection Using Deep Learning for Remote Sensing Images

  • Wang, Chang;Han, Shijing;Zhang, Wen;Miao, Shufeng
    • Journal of Information Processing Systems
    • /
    • v.18 no.4
    • /
    • pp.587-598
    • /
    • 2022
  • To increase building change recognition accuracy, we present a deep learning-based building change detection method using remote sensing images. In the proposed approach, by merging pixel-level and object-level information of multitemporal remote sensing images, we create the difference image (DI), and a frequency-domain saliency technique is used to generate the DI saliency map. The fuzzy C-means clustering technique pre-classifies a coarse change detection map by thresholding the DI saliency map. We then extract the neighborhood features of the unchanged and changed (building) pixels from pixel-level and object-level feature images, which are then used as valid deep neural network (DNN) training samples. The trained DNNs are then utilized to identify changes in the DI. The suggested strategy was evaluated and compared to current detection methods using two datasets. The results suggest that our proposed technique can detect more building change information and improve change detection accuracy.
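
The difference-image step at the front of this pipeline is simple to sketch, assuming two co-registered single-channel images as nested lists. The saliency map and fuzzy C-means stages are replaced here by a plain threshold, purely for illustration:

```python
# Sketch: difference image (DI) between two acquisition dates, followed by
# a coarse threshold-based pre-classification into changed / unchanged.

def difference_image(img_t1, img_t2):
    """Per-pixel absolute difference of two same-sized images."""
    return [[abs(a - b) for a, b in zip(r1, r2)]
            for r1, r2 in zip(img_t1, img_t2)]

def pre_classify(di, threshold):
    """Coarse change map: 1 = changed, 0 = unchanged."""
    return [[1 if v > threshold else 0 for v in row] for row in di]
```

In the paper, pixels from this coarse map supply the training samples from which the DNN learns to refine the final change map.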

Development of AI Systems for Counting Visitors and Check of Wearing Masks Using Deep Learning Algorithms (딥러닝 알고리즘을 활용한 출입자 통계와 마스크 착용 판별 인공지능 시스템)

  • Cho, Won-Young;Park, Sung-Leol;Kim, Hyun-Soo;Yun, Tae-Jin
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2020.07a
    • /
    • pp.285-286
    • /
    • 2020
  • Due to the worldwide spread of COVID-19, people have come to avoid face-to-face contact, wearing a mask has become mandatory because of the virus's high infectivity, and the work of checking compliance is increasing. Therefore, a system is needed that can assist with this work through artificial intelligence by counting visitors and checking whether they are wearing masks. To this end, this paper presents a visitor-counting and mask-wearing detection system using deep learning algorithms. In addition, the YOLO-v3, YOLO-v4, and YOLO-Tiny algorithms, which are widely used for real-time image recognition, were run on a desktop PC and on Nvidia's Jetson Nano, and a suitable method was selected and applied based on a performance comparison of the algorithms.
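
Downstream of the YOLO detector, the statistics step reduces to counting detections per class. A minimal sketch, assuming each detection is a `(class_name, confidence)` pair; the class names `"mask"`/`"no_mask"` and the confidence cutoff are illustrative assumptions, not the paper's actual labels:

```python
# Sketch: tally visitor statistics from per-frame detector output.
# Detections below the confidence cutoff are discarded, mirroring the
# usual post-processing applied to YOLO predictions.

def summarize(detections, min_conf=0.5):
    counts = {"visitors": 0, "masked": 0, "unmasked": 0}
    for cls, conf in detections:
        if conf < min_conf:
            continue                      # drop low-confidence boxes
        counts["visitors"] += 1
        if cls == "mask":
            counts["masked"] += 1
        elif cls == "no_mask":
            counts["unmasked"] += 1
    return counts
```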


Enhanced deep soft interference cancellation for multiuser symbol detection

  • Jihyung Kim;Junghyun Kim;Moon-Sik Lee
    • ETRI Journal
    • /
    • v.45 no.6
    • /
    • pp.929-938
    • /
    • 2023
  • The detection of all the symbols transmitted simultaneously in multiuser systems using limited wireless resources is challenging. Traditional model-based methods show high performance with perfect channel state information (CSI); however, severe performance degradation will occur if perfect CSI cannot be acquired. In contrast, data-driven methods perform slightly worse than model-based methods in terms of symbol error ratio performance in perfect CSI states; however, they are also able to overcome extreme performance degradation in imperfect CSI states. This study proposes a novel deep learning-based method by improving a state-of-the-art data-driven technique called deep soft interference cancellation (DSIC). The enhanced DSIC (EDSIC) method detects multiuser symbols in a fully sequential manner and uses an efficient neural network structure to ensure high performance. Additionally, error-propagation mitigation techniques are used to ensure robustness against channel uncertainty. The EDSIC guarantees a performance that is very close to the optimal performance of the existing model-based methods in perfect CSI environments and the best performance in imperfect CSI environments.
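
The interference-cancellation idea underlying DSIC can be illustrated with classical successive interference cancellation: detect users one at a time and subtract each detected contribution from the received signal. Real-valued BPSK symbols and known scalar channel gains are simplifying assumptions here; the paper's EDSIC replaces the hard decisions below with neural soft estimates:

```python
# Toy successive interference cancellation for a superposed scalar signal.
# gains: per-user channel gains, ordered strongest first, as SIC detects
# the strongest user before cancelling it from the residual.

def sic_detect(received, gains):
    """Return detected BPSK symbols (+1.0 / -1.0), one per user."""
    residual = received
    symbols = []
    for g in gains:
        s = 1.0 if residual * g >= 0 else -1.0   # hard symbol decision
        symbols.append(s)
        residual -= g * s                         # cancel this user's part
    return symbols
```

Error propagation is visible in this sketch: a wrong early decision corrupts the residual for every later user, which is exactly the failure mode the EDSIC's mitigation techniques target.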

Enhancing Underwater Images through Deep Curve Estimation (깊은 곡선 추정을 이용한 수중 영상 개선)

  • Muhammad Tariq Mahmood;Young Kyu Choi
    • Journal of the Semiconductor & Display Technology
    • /
    • v.23 no.2
    • /
    • pp.23-27
    • /
    • 2024
  • Underwater images are typically degraded due to color distortion, light absorption, scattering, and noise from artificial light sources. Restoration of these images is an essential task in many underwater applications. In this paper, we propose a two-phase deep learning-based method, Underwater Deep Curve Estimation (UWDCE), designed to effectively enhance the quality of underwater images. The first phase involves a white balancing and color correction technique to compensate for color imbalances. The second phase introduces a novel deep learning model, UWDCE, to learn the mapping between the color-corrected image and its best-fitting curve parameter maps. The model operates iteratively, applying light-enhancement curves to achieve better contrast and maintain pixel values within a normalized range. The results demonstrate the effectiveness of our method, producing higher-quality images compared to state-of-the-art methods.
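
The iterative light-enhancement curve can be sketched with the quadratic form used in deep curve estimation, LE(x) = x + a·x·(1 − x), applied repeatedly to a normalized pixel value. In UWDCE the per-pixel curve parameters are predicted by the network; a fixed parameter is assumed here for illustration:

```python
# Sketch: iterated light-enhancement curve on a normalized pixel value.
# For a in [-1, 1] and x in [0, 1], each application keeps the result
# inside [0, 1], which is why the curve is safe to iterate.

def enhance_pixel(x, a=0.6, iterations=4):
    """Brighten (a > 0) or darken (a < 0) a pixel value x in [0, 1]."""
    for _ in range(iterations):
        x = x + a * x * (1.0 - x)
    return x
```

Note the fixed points: pure black and pure white are unchanged, while mid-range values are pushed toward brighter contrast when `a > 0`.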


Crack growth prediction on a concrete structure using deep ConvLSTM

  • Man-Sung Kang;Yun-Kyu An
    • Smart Structures and Systems
    • /
    • v.33 no.4
    • /
    • pp.301-311
    • /
    • 2024
  • This paper proposes a deep convolutional long short-term memory (ConvLSTM)-based crack growth prediction technique for predictive maintenance of structures. Since cracks are one of the critical damage types in a structure, their regular inspection is mandatory for structural safety and serviceability. To effectively establish a structural maintenance plan using the inspection results, crack propagation or growth prediction is essential. However, conventional crack prediction techniques based on mathematical models are typically unsuitable for tracking the complex nonlinear crack propagation mechanisms of civil structures under harsh environmental conditions. To address this technical issue, a field data-driven crack growth prediction technique using ConvLSTM is newly proposed in this study. The proposed technique consists of four steps: (1) time-series crack image acquisition, (2) target image stabilization, (3) deep learning-based crack detection and quantification, and (4) crack growth prediction. The performance of the proposed technique is experimentally validated using a concrete mock-up specimen by applying step-wise bending loads to generate crack growth. The validation test results reveal a prediction accuracy of 94% on average compared with the ground truth obtained by field measurement.
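
Steps (3) and (4) can be sketched under strong simplifying assumptions: crack size quantified as the pixel count of a binary detection mask, and growth extrapolated linearly from the measured time series. The paper's ConvLSTM learns this temporal evolution from image sequences instead of assuming linearity:

```python
# Sketch: crack quantification from a binary mask plus naive linear
# extrapolation of the next measurement. Stand-in for the ConvLSTM's
# learned sequence-to-future prediction.

def crack_area(mask):
    """Number of crack pixels in a binary (0/1) mask."""
    return sum(sum(row) for row in mask)

def predict_next(areas):
    """Linear extrapolation from the last two area measurements."""
    if len(areas) < 2:
        return areas[-1]
    return areas[-1] + (areas[-1] - areas[-2])
```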

Two person Interaction Recognition Based on Effective Hybrid Learning

  • Ahmed, Minhaz Uddin;Kim, Yeong Hyeon;Kim, Jin Woo;Bashar, Md Rezaul;Rhee, Phill Kyu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.2
    • /
    • pp.751-770
    • /
    • 2019
  • Action recognition is an essential task in computer vision due to the variety of prospective applications, such as security surveillance, machine learning, and human-computer interaction. The availability of more video data than ever before and the high performance of deep convolutional neural networks also make it essential for action recognition in video. Unfortunately, limited hand-crafted video features and the scarcity of benchmark datasets make it challenging to address the multi-person action recognition task in video data. In this work, we propose a deep convolutional neural network-based Effective Hybrid Learning (EHL) framework for two-person interaction classification in video data. Our approach exploits a pre-trained network model (VGG16 from the University of Oxford Visual Geometry Group) and extends Faster R-CNN (a region-based convolutional neural network and state-of-the-art object detector). We extend a semi-supervised learning method combined with an active learning method to improve overall performance. Numerous types of two-person interactions exist in the real world, which makes this a challenging task. In our experiments, we consider a limited number of actions, such as hugging, fighting, linking arms, talking, and kidnapping, in two environments: simple and complex. We show that our trained model with an active semi-supervised learning architecture gradually improves performance. In a simple environment using an Intelligent Technology Laboratory (ITLab) dataset from Inha University, performance increased to 95.6% accuracy, and in a complex environment, performance reached 81% accuracy. Our method reduces data-labeling time for the ITLab dataset compared to supervised learning methods. We also conduct extensive experiments on human action recognition benchmarks such as the UT-Interaction and HMDB51 datasets and obtain better performance than state-of-the-art approaches.
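
The active semi-supervised loop that reduces labeling time can be sketched as a pool-splitting step: send the model's least-confident clips to a human annotator, and accept high-confidence predictions as pseudo-labels. The confidence values and thresholds below are illustrative assumptions, not the paper's settings:

```python
# Sketch: one round of active + semi-supervised sample selection.
# predictions: list of (clip_id, model_confidence) over the unlabeled pool.

def split_pool(predictions, query_size=2, pseudo_threshold=0.95):
    """Return (clips to hand-label, clips to pseudo-label)."""
    ranked = sorted(predictions, key=lambda p: p[1])     # least confident first
    to_label = [cid for cid, _ in ranked[:query_size]]   # active-learning query
    pseudo = [cid for cid, c in predictions if c >= pseudo_threshold]
    return to_label, pseudo
```

Iterating this loop (retrain, re-score the pool, split again) is what lets accuracy improve gradually while human labeling effort stays small.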

A Study on Residual U-Net for Semantic Segmentation based on Deep Learning (딥러닝 기반의 Semantic Segmentation을 위한 Residual U-Net에 관한 연구)

  • Shin, Seokyong;Lee, SangHun;Han, HyunHo
    • Journal of Digital Convergence
    • /
    • v.19 no.6
    • /
    • pp.251-258
    • /
    • 2021
  • In this paper, we propose an encoder-decoder model utilizing residual learning to improve the accuracy of the U-Net-based semantic segmentation method. U-Net is a deep learning-based semantic segmentation method and is mainly used in applications such as autonomous vehicles and medical image analysis. The conventional U-Net loses features during the compression process due to the shallow structure of its encoder. This loss of features causes a lack of the context information necessary for classifying objects and reduces segmentation accuracy. To address this, the proposed method efficiently extracts context information through an encoder using residual learning, which is effective in preventing the feature loss and gradient vanishing problems of the conventional U-Net. Furthermore, we reduced the down-sampling operations in the encoder to reduce the loss of spatial information contained in the feature maps. The proposed method showed a segmentation improvement of about 12% over the conventional U-Net in the Cityscapes dataset experiment.
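
The residual connection the encoder relies on is simply y = F(x) + x: the block learns a correction F on top of an identity shortcut, which eases gradient flow. A minimal sketch with F stubbed as an arbitrary elementwise function, standing in for the convolutional layers of a real residual U-Net block:

```python
# Sketch: the residual (identity shortcut) connection, y = F(x) + x.
# `transform` stands in for the block's learned convolutional mapping F.

def residual_block(x, transform):
    """x: list of floats; transform: the learned mapping F."""
    fx = transform(x)
    return [a + b for a, b in zip(fx, x)]
```

With a zero transform, the block reduces to the identity, which is why residual learning prevents degradation when extra layers contribute nothing; any nonzero F is learned as a refinement of the input features.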