• Title/Summary/Keyword: deep transfer learning

Humming: Image Based Automatic Music Composition Using DeepJ Architecture (허밍: DeepJ 구조를 이용한 이미지 기반 자동 작곡 기법 연구)

  • Kim, Taehun;Jung, Keechul;Lee, Insung
    • Journal of Korea Multimedia Society / v.25 no.5 / pp.748-756 / 2022
  • Thanks to the match between AlphaGo and Lee Sedol, machine learning has received worldwide attention and huge investments. The performance improvement of computing devices greatly contributed to big-data processing and the development of neural networks. Artificial intelligence not only imitates human beings in many fields, but in some cases appears to exceed human capabilities. Although human creations are still generally considered superior, artificial intelligence continues to challenge human creativity. The quality of some creative outcomes produced by AI is as good as that of works produced by human beings, and sometimes the two are indistinguishable, because a neural network can learn the common features contained in big data and reproduce them. In order to confirm whether artificial intelligence can express the inherent characteristics of different arts, this paper proposes a new neural network model called Humming. It is an experimental model that combines VGG16, which extracts image features, and DeepJ's architecture, which excels at creating music of various genres. The dataset produced in our experiment shows meaningful and valid results; however, different results are produced when the amount of data is increased. The neural network produced similar patterns of music even for different classes of images, which was not what we were aiming for. Nevertheless, these new attempts may hold clear significance as a starting point for feature transfer, which will be studied further.
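
As a rough illustration only, the sketch below shows one way VGG16 image features could condition a sequence model for note generation, in the spirit of combining VGG16 with a DeepJ-style generator. The ImageConditionedComposer class, its LSTM decoder, and all sizes are hypothetical placeholders, not the Humming architecture itself.

```python
# Minimal sketch (assumptions noted): VGG16 features condition a recurrent note generator.
import torch
import torch.nn as nn
from torchvision import models

class ImageConditionedComposer(nn.Module):
    """Hypothetical composer: VGG16 image features set the LSTM's initial state."""
    def __init__(self, note_vocab=128, hidden=512):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
        self.encoder = vgg.features          # pretrained convolutional feature extractor
        self.pool = nn.AdaptiveAvgPool2d(1)  # collapse spatial dims -> (B, 512)
        self.to_hidden = nn.Linear(512, hidden)
        self.lstm = nn.LSTM(note_vocab, hidden, batch_first=True)
        self.head = nn.Linear(hidden, note_vocab)

    def forward(self, image, note_seq):
        feat = self.pool(self.encoder(image)).flatten(1)    # (B, 512) image descriptor
        h0 = torch.tanh(self.to_hidden(feat)).unsqueeze(0)  # image-dependent initial state
        c0 = torch.zeros_like(h0)
        out, _ = self.lstm(note_seq, (h0, c0))              # generation conditioned on the image
        return self.head(out)                               # per-step note logits

# usage: logits = ImageConditionedComposer()(torch.randn(2, 3, 224, 224), torch.randn(2, 16, 128))
```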

Applications of a Deep Neural Network to Illustration Art Style Design of City Architectural

  • Yue Wang;Jia-Wei Zhao;Ming-Yue Zheng;Ming-Yu Li;Xue Sun;Hao Liu;Zhen Liu
    • Journal of Information Processing Systems / v.20 no.1 / pp.53-66 / 2024
  • With the continuous advancement of computer technology, deep learning models have emerged as innovative tools in shaping various aspects of architectural design. Recognizing the distinctive perspective of children, which differs significantly from that of adults, this paper contends that conventional standards may not always be the most suitable approach in designing urban structures tailored for children. The primary objective of this study is to leverage neural style networks within the design process, specifically adopting the artistic viewpoint found in children's illustrations. By combining the aesthetic paradigm of urban architecture with inspiration drawn from children's aesthetic preferences, the aim is to unearth more creative and subversive aesthetics that challenge traditional norms. The selected context for exploration is the landmark buildings of Qingdao City, Shandong Province, China. Employing the neural style network, the study uses architectural elements of the chosen buildings as content images while preserving their inherent characteristics. The process involves artistic stylization inspired by classic children's illustrations and images from children's picture books. With deep learning technology as the conduit, the research explores the prospect of seamlessly integrating architectural design styles with the imaginative world of children's illustrations. The outcomes aim to provide fresh perspectives and effective support for the artistic design of contemporary urban buildings.
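
For readers unfamiliar with neural style networks, the following is a minimal Gatys-style style-transfer sketch in PyTorch; the VGG19 backbone, layer taps, step count, and loss weights are illustrative assumptions, not the settings used in this study.

```python
# Minimal Gatys-style style-transfer sketch (layer choices and weights are illustrative).
import torch
import torch.nn.functional as F
from torchvision import models

device = "cuda" if torch.cuda.is_available() else "cpu"
vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.to(device).eval()
for p in vgg.parameters():
    p.requires_grad_(False)

LAYERS = {1: "s", 6: "s", 11: "s", 20: "s", 22: "c"}  # style ("s") and content ("c") taps

def features(x):
    out = {}
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in LAYERS:
            out[i] = x
    return out

def gram(f):
    _, c, h, w = f.shape            # assumes batch size 1
    f = f.view(c, h * w)
    return f @ f.t() / (c * h * w)

def stylize(content, style, steps=300, style_w=1e6, content_w=1.0):
    target = content.clone().requires_grad_(True)
    opt = torch.optim.Adam([target], lr=0.02)
    c_feat, s_feat = features(content), features(style)
    for _ in range(steps):
        t_feat = features(target)
        c_loss = F.mse_loss(t_feat[22], c_feat[22])
        s_loss = sum(F.mse_loss(gram(t_feat[i]), gram(s_feat[i]))
                     for i, kind in LAYERS.items() if kind == "s")
        loss = content_w * c_loss + style_w * s_loss
        opt.zero_grad(); loss.backward(); opt.step()
    return target.detach()

# usage: stylized = stylize(content_img.to(device), style_img.to(device))  # both (1, 3, H, W)
```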

A Study on Leakage Detection Technique Using Transfer Learning-Based Feature Fusion (전이학습 기반 특징융합을 이용한 누출판별 기법 연구)

  • YuJin Han;Tae-Jin Park;Jonghyuk Lee;Ji-Hoon Bae
    • The Transactions of the Korea Information Processing Society / v.13 no.2 / pp.41-47 / 2024
  • When there were disparities in performance between models trained in the time and frequency domains, we observed that even after ensembling, the performance of the ensemble was compromised by the imbalance between the individual models. Therefore, this paper proposes a leakage detection technique that enhances the accuracy of pipeline leakage detection through a step-wise learning approach that extracts features from both the time and frequency domains and integrates them. The method involves a two-step learning process. In Stage 1, independent models are trained in the time and frequency domains to effectively extract the crucial features of the data in each domain. In Stage 2, the pre-trained models are reused with their classifiers removed; the features from both domains are then fused, and a new classifier is added for retraining. The proposed transfer learning-based feature fusion technique thus performs model training by integrating the features extracted from the time and frequency domains. This integration exploits the complementary nature of the features from the two domains, allowing the model to leverage diverse information. As a result, it achieved a high accuracy of 99.88%, demonstrating outstanding performance in pipeline leakage detection.
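
A minimal sketch of the two-stage feature-fusion idea described above follows; the 1D-CNN branches, feature sizes, and input lengths are assumptions rather than the authors' actual models.

```python
# Two-stage sketch: Stage 1 trains independent branches per domain; Stage 2 strips their
# classifiers, fuses the features, and retrains a new head.
import torch
import torch.nn as nn

def make_branch(num_classes=2):
    """Stage 1: an independent 1D-CNN trained on one domain (time or frequency)."""
    return nn.Sequential(
        nn.Conv1d(1, 16, 7, padding=3), nn.ReLU(), nn.AdaptiveAvgPool1d(32),
        nn.Flatten(), nn.Linear(16 * 32, 128), nn.ReLU(),
        nn.Linear(128, num_classes),       # branch classifier, removed in Stage 2
    )

class FusionModel(nn.Module):
    """Stage 2: drop each branch's classifier, concatenate features, retrain a new head."""
    def __init__(self, time_branch, freq_branch, num_classes=2):
        super().__init__()
        self.time_feat = nn.Sequential(*list(time_branch.children())[:-1])  # strip classifier
        self.freq_feat = nn.Sequential(*list(freq_branch.children())[:-1])
        self.classifier = nn.Linear(128 + 128, num_classes)                 # new fused head

    def forward(self, x_time, x_freq):
        fused = torch.cat([self.time_feat(x_time), self.freq_feat(x_freq)], dim=1)
        return self.classifier(fused)

# usage (after Stage 1 training of each branch):
# model = FusionModel(time_branch, freq_branch)
# logits = model(torch.randn(4, 1, 1024), torch.randn(4, 1, 513))
```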

Sex determination from lateral cephalometric radiographs using an automated deep learning convolutional neural network

  • Khazaei, Maryam;Mollabashi, Vahid;Khotanlou, Hassan;Farhadian, Maryam
    • Imaging Science in Dentistry / v.52 no.3 / pp.239-244 / 2022
  • Purpose: Despite the proliferation of numerous morphometric and anthropometric methods for sex identification based on linear, angular, and regional measurements of various parts of the body, these methods are subject to error arising from the observer's knowledge and expertise. This study aimed to explore the possibility of automated sex determination using convolutional neural networks (CNNs) based on lateral cephalometric radiographs. Materials and Methods: Lateral cephalometric radiographs of 1,476 Iranian subjects (794 women and 682 men) from 18 to 49 years of age were included. The radiographs served as the network input, with an output layer comprising 2 classes (male and female). Eighty percent of the data was used as a training set and the rest as a test set. Hyperparameter tuning of each network was done after the preprocessing and data augmentation steps. The predictive performance of different architectures (DenseNet, ResNet, and VGG) was evaluated based on their accuracy on the test set. Results: The CNN based on the DenseNet121 architecture, with an overall accuracy of 90%, had the best predictive power for sex determination. The prediction accuracy of this model was almost equal for men and women. Furthermore, with all architectures, the use of transfer learning improved predictive performance. Conclusion: The results confirmed that a CNN could predict a person's sex with high accuracy. This prediction was independent of human bias because feature extraction was done automatically. However, for more accurate sex determination on a wider scale, further studies with larger sample sizes are desirable.
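
A minimal transfer-learning sketch in the spirit of this study (an ImageNet-pretrained DenseNet121 with a 2-class head) is shown below; the optimizer and learning rate are illustrative, and the authors' preprocessing, augmentation, and hyperparameter tuning are not reproduced.

```python
# Sketch: replace DenseNet121's classifier with a 2-class head and fine-tune.
import torch
import torch.nn as nn
from torchvision import models

model = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)  # ImageNet-pretrained
model.classifier = nn.Linear(model.classifier.in_features, 2)           # 2 classes: male / female

# A typical fine-tuning setup over the reported 80/20 split (optimizer choice is assumed):
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```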

Deep learning for the classification of cervical maturation degree and pubertal growth spurts: A pilot study

  • Mohammad-Rahimi, Hossein;Motamadian, Saeed Reza;Nadimi, Mohadeseh;Hassanzadeh-Samani, Sahel;Minabi, Mohammad A. S.;Mahmoudinia, Erfan;Lee, Victor Y.;Rohban, Mohammad Hossein
    • The Korean Journal of Orthodontics / v.52 no.2 / pp.112-122 / 2022
  • Objective: This study aimed to present and evaluate a new deep learning model for determining cervical vertebral maturation (CVM) degree and growth spurts by analyzing lateral cephalometric radiographs. Methods: The study sample included 890 cephalograms. The images were classified into six cervical stages independently by two orthodontists. The images were also categorized into three classes on the basis of the growth spurt: pre-pubertal, growth spurt, and post-pubertal. Subsequently, the samples were fed to a transfer learning model implemented using the Python programming language and the PyTorch library. In the last step, the test set of cephalograms was randomly coded and provided to two new orthodontists in order to compare their diagnoses with the artificial intelligence (AI) model's performance using weighted kappa and Cohen's kappa statistical analyses. Results: The model's validation and test accuracy for the six-class CVM diagnosis were 62.63% and 61.62%, respectively. Moreover, the model's validation and test accuracy for the three-class classification were 75.76% and 82.83%, respectively. Furthermore, substantial agreement was observed between the two orthodontists, as well as between one of them and the AI model. Conclusions: The newly developed AI model had reasonable accuracy in detecting the CVM stage and high reliability in detecting the pubertal stage. However, its accuracy was still lower than that of human observers. With further improvements in data quality, this model should be able to provide practical assistance to practicing dentists in the future.
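
Because the abstract mentions a PyTorch transfer-learning model for a 6-class and a 3-class task, here is a minimal sketch of such a setup; the ResNet18 backbone is an assumed stand-in, since the paper's actual architecture is not specified in the abstract.

```python
# Sketch: two transfer-learning classifiers sharing the same recipe, one per task.
import torch.nn as nn
from torchvision import models

def build_classifier(num_classes):
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # pretrained backbone (assumed)
    model.fc = nn.Linear(model.fc.in_features, num_classes)           # replace the head
    return model

cvm_model = build_classifier(num_classes=6)     # six cervical stages
spurt_model = build_classifier(num_classes=3)   # pre-pubertal / growth spurt / post-pubertal
```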

A Study of Lightening SRGAN Using Knowledge Distillation (지식증류 기법을 사용한 SRGAN 경량화 연구)

  • Lee, Yeojin;Park, Hanhoon
    • Journal of Korea Multimedia Society / v.24 no.12 / pp.1598-1605 / 2021
  • Recently, convolutional neural networks (CNNs) have been widely used with excellent performance in various computer vision fields, including super-resolution (SR). However, CNNs are computationally intensive and require a lot of memory, making them difficult to apply on limited hardware resources such as mobile or Internet of Things devices. To address these limitations, network lightening studies have been actively conducted to reduce the depth or size of pre-trained deep CNN models while maintaining their performance as much as possible. This paper aims to lighten the SR CNN model SRGAN using knowledge distillation, one of the network lightening technologies. It proposes four techniques that differ in how the teacher network's knowledge is transferred to the student network and presents experiments that compare and analyze the performance of each technique. In our experimental results, quantitative and qualitative evaluation indicators confirmed that student networks with knowledge transfer performed better than those without it, and that among the four knowledge transfer techniques, conducting adversarial learning after transferring knowledge from the teacher generator to the student generator showed the best performance.
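
As a hedged sketch of the best-performing technique described above (knowledge transfer from the teacher generator to the student generator, followed by adversarial learning), the snippet below shows one plausible distillation step; the generators, optimizer, and loss weighting are placeholders, not the paper's implementation.

```python
# One distillation step: the student generator mimics both the ground truth and the
# teacher generator's super-resolved output before adversarial fine-tuning.
import torch
import torch.nn.functional as F

def distillation_step(teacher_g, student_g, lr_batch, hr_batch, optimizer, alpha=0.5):
    """Match the HR ground truth and the teacher's SR output (alpha balances the two)."""
    teacher_g.eval()
    with torch.no_grad():
        teacher_sr = teacher_g(lr_batch)          # teacher's "knowledge"
    student_sr = student_g(lr_batch)
    loss = (1 - alpha) * F.mse_loss(student_sr, hr_batch) \
           + alpha * F.mse_loss(student_sr, teacher_sr)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# After this distillation phase, adversarial training with a discriminator would proceed
# as in standard SRGAN training.
```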

Research on Intelligent Anomaly Detection System Based on Real-Time Unstructured Object Recognition Technique (실시간 비정형객체 인식 기법 기반 지능형 이상 탐지 시스템에 관한 연구)

  • Lee, Seok Chang;Kim, Young Hyun;Kang, Soo Kyung;Park, Myung Hye
    • Journal of Korea Multimedia Society / v.25 no.3 / pp.546-557 / 2022
  • Recently, the demand for interpreting image data with artificial intelligence is rapidly increasing in various fields. Object recognition and detection techniques using deep learning are mainly used, and integrated video analysis for recognizing unstructured objects is a particularly important problem. In the case of natural or social disasters, object recognition alone is limited because such objects have unstructured shapes. In this paper, we propose an intelligent integrated video analysis system that can recognize unstructured objects based on video turning points and object detection. We also introduce a method to apply and evaluate object recognition using virtual augmented images generated from 2D to 3D through a GAN.

No-Reference Image Quality Assessment based on Quality Awareness Feature and Multi-task Training

  • Lai, Lijing;Chu, Jun;Leng, Lu
    • Journal of Multimedia Information System / v.9 no.2 / pp.75-86 / 2022
  • The existing image quality assessment (IQA) datasets contain a small number of samples, and some methods based on transfer learning or data augmentation cannot make good use of image quality-related features. A No-Reference (NR)-IQA method based on multi-task training and quality awareness is therefore proposed. First, single or multiple distortion types and levels are imposed on the original image, and different strategies are used to augment the different types of distortion datasets. With the idea of weak supervision, we use Full-Reference (FR)-IQA methods to obtain pseudo-score labels for the generated images. Then, we combine the classification information of the distortion type and level with the image quality score. A ResNet50 network is trained in the pre-training stage on the augmented dataset to obtain more quality-aware pre-training weights. Finally, fine-tuning is performed on the target IQA dataset using the quality-aware weights to predict the final quality score. Various experiments designed on synthetic-distortion and authentic-distortion datasets (LIVE, CSIQ, TID2013, LIVEC, KonIQ-10K) prove that the proposed method can utilize image quality-related features better than methods using only single-task training. The extracted quality-aware features improve the accuracy of the model.
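
A minimal sketch of the multi-task pre-training idea follows: a shared ResNet50 backbone with heads for distortion type, distortion level, and a pseudo-labelled quality score. The head sizes and class counts are illustrative assumptions, not the paper's configuration.

```python
# Multi-task quality-awareness sketch on a shared ResNet50 backbone.
import torch
import torch.nn as nn
from torchvision import models

class MultiTaskIQA(nn.Module):
    def __init__(self, num_types=5, num_levels=5):   # counts are illustrative
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        self.backbone = nn.Sequential(*list(backbone.children())[:-1])  # keep global pooling
        feat_dim = backbone.fc.in_features                              # 2048
        self.type_head = nn.Linear(feat_dim, num_types)    # distortion type
        self.level_head = nn.Linear(feat_dim, num_levels)  # distortion level
        self.score_head = nn.Linear(feat_dim, 1)           # pseudo quality score (FR-IQA label)

    def forward(self, x):
        f = self.backbone(x).flatten(1)
        return self.type_head(f), self.level_head(f), self.score_head(f).squeeze(1)

# Pre-train with a weighted sum of the three task losses on the augmented data, then
# fine-tune on the target IQA dataset using these quality-aware weights.
```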

A Study on Combine Artificial Intelligence Models for multi-classification for an Abnormal Behaviors in CCTV images (CCTV 영상의 이상행동 다중 분류를 위한 결합 인공지능 모델에 관한 연구)

  • Lee, Hongrae;Kim, Youngtae;Seo, Byung-suk
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.498-500 / 2022
  • CCTV protects people and assets by identifying dangerous situations and enabling prompt responses. However, it is difficult to continuously monitor the ever-increasing number of CCTV feeds. For this reason, there is a need for a device that continuously monitors CCTV images and issues a notification when abnormal behavior occurs. Recently, many studies using artificial intelligence models for image data analysis have been conducted. This study simultaneously learns the spatial and temporal characteristics of image data to classify various abnormal behaviors that can be observed in CCTV images. As the artificial intelligence model used for learning, we propose a multi-classification deep learning model that combines an end-to-end 3D convolutional neural network (CNN) and ResNet.
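
The abstract does not detail how the 3D CNN and ResNet are combined, so the sketch below shows one plausible reading: a small 3D CNN for spatio-temporal cues fused with pretrained 2D ResNet features from a key frame. All names, sizes, and the fusion scheme are hypothetical.

```python
# Hypothetical fusion of a 3D CNN (motion) and a 2D ResNet (appearance) for multi-class
# abnormal-behavior classification on video clips.
import torch
import torch.nn as nn
from torchvision import models

class AbnormalBehaviorClassifier(nn.Module):
    def __init__(self, num_classes=5):   # number of abnormal-behavior classes is illustrative
        super().__init__()
        self.cnn3d = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),                      # -> (B, 32)
        )
        resnet = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.cnn2d = nn.Sequential(*list(resnet.children())[:-1])       # -> (B, 512, 1, 1)
        self.classifier = nn.Linear(32 + 512, num_classes)

    def forward(self, clip):              # clip: (B, 3, T, H, W)
        motion = self.cnn3d(clip)
        key_frame = clip[:, :, clip.shape[2] // 2]                      # middle frame, (B, 3, H, W)
        appearance = self.cnn2d(key_frame).flatten(1)
        return self.classifier(torch.cat([motion, appearance], dim=1))

# usage: logits = AbnormalBehaviorClassifier()(torch.randn(2, 3, 16, 112, 112))
```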

Ensemble Knowledge Distillation for Classification of 14 Thorax Diseases using Chest X-ray Images (흉부 X-선 영상을 이용한 14 가지 흉부 질환 분류를 위한 Ensemble Knowledge Distillation)

  • Ho, Thi Kieu Khanh;Jeon, Younghoon;Gwak, Jeonghwan
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.313-315 / 2021
  • Timely and accurate diagnosis of lung diseases using chest X-ray images has gained much attention from the computer vision and medical imaging communities. Although previous studies have demonstrated the capability of deep convolutional neural networks by achieving competitive binary classification results, their models seemed unreliable for effectively distinguishing multiple disease groups across a large number of X-ray images. In this paper, we aim to build an advanced approach, called Ensemble Knowledge Distillation (EKD), to significantly boost classification accuracy compared to traditional KD methods by distilling knowledge from a cumbersome teacher model into an ensemble of lightweight student models with parallel branches trained with ground-truth labels. Learning features at the different branches of the student models could therefore enable the network to learn diverse patterns and improve the quality of the final predictions through an ensemble learning solution. Although our experiments on the well-established ChestX-ray14 dataset showed the classification improvements of traditional KD compared to the base transfer learning approach, EKD would be expected to further enhance classification accuracy and model generalization, especially given the imbalanced dataset and the interdependency of the 14 weakly annotated thorax diseases.
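
A minimal sketch of the EKD objective as described (teacher soft predictions plus ground-truth labels supervising several lightweight student branches, with an averaged ensemble prediction) is given below; the loss weighting and the multi-label formulation details are assumptions.

```python
# EKD sketch for multi-label classification over 14 thorax diseases.
import torch
import torch.nn.functional as F

def ekd_loss(student_logits_list, teacher_logits, labels, alpha=0.5):
    """labels: float tensor of shape (B, 14); one logits tensor per student branch."""
    soft_targets = torch.sigmoid(teacher_logits).detach()              # teacher "knowledge"
    loss = 0.0
    for logits in student_logits_list:
        hard = F.binary_cross_entropy_with_logits(logits, labels)        # ground-truth term
        soft = F.binary_cross_entropy_with_logits(logits, soft_targets)  # distillation term
        loss = loss + (1 - alpha) * hard + alpha * soft
    return loss / len(student_logits_list)

def ensemble_predict(student_logits_list):
    """Final prediction: average the branches' per-disease probabilities."""
    return torch.stack([torch.sigmoid(l) for l in student_logits_list]).mean(dim=0)
```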
