• Title/Summary/Keyword: Deep Learning Dataset

Automated Facial Wrinkle Segmentation Scheme Using UNet++

  • Hyeonwoo Kim;Junsuk Lee;Jehyeok Rew;Eenjun Hwang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.8 / pp.2333-2345 / 2024
  • Facial wrinkles are widely used to evaluate skin condition or aging in various fields such as skin diagnosis, plastic surgery consultations, and cosmetic recommendations. To process facial wrinkles effectively in facial image analysis, accurate wrinkle segmentation is required to identify wrinkled regions. Existing deep learning-based methods have difficulty segmenting fine wrinkles due to insufficient wrinkle data and the imbalance between wrinkle and non-wrinkle data. Therefore, in this paper, we propose a new facial wrinkle segmentation method based on a UNet++ model. Specifically, we construct a new facial wrinkle dataset by manually annotating fine wrinkles across the entire face. We then extract only the skin region from the facial image using a facial landmark point extractor. Lastly, we train the UNet++ model using both dice loss and focal loss to alleviate the class imbalance problem. To validate the effectiveness of the proposed method, we conduct comprehensive experiments using our facial wrinkle dataset. The experimental results show that the proposed method outperforms the latest wrinkle segmentation method by 9.77%p in IoU and 10.04%p in F1 score.
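As a rough illustration of training with both dice loss and focal loss, the sketch below computes the two terms on flattened per-pixel probabilities and sums them. The abstract does not say how the two losses are combined, so the weighted sum and the values of `dice_weight`, `focal_weight`, `gamma`, and `alpha` are assumptions chosen for illustration, not the authors' settings.

```python
import math


def dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss over flattened pixel probabilities and 0/1 labels."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss averaged over pixels (gamma and alpha are assumed values)."""
    loss = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
        p_t = p if t == 1 else 1.0 - p            # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        # (1 - p_t)^gamma down-weights pixels that are already classified well
        loss += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return loss / len(probs)


def combined_loss(probs, targets, dice_weight=1.0, focal_weight=1.0):
    """Weighted sum of the two losses (the weighting scheme is an assumption)."""
    return (dice_weight * dice_loss(probs, targets)
            + focal_weight * focal_loss(probs, targets))


# Toy example: one wrinkle pixel among three background pixels.
targets = [0, 0, 0, 1]
good = [0.05, 0.05, 0.05, 0.90]   # wrinkle pixel detected
bad = [0.05, 0.05, 0.05, 0.10]    # wrinkle pixel missed
print(combined_loss(good, targets), combined_loss(bad, targets))
```

Dice loss depends on the overlap with the foreground mask, so the rare wrinkle pixels are not swamped by the background, while the focal term reduces the contribution of easy pixels. This is one common reason the two are paired for imbalanced segmentation.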

Efficient Recognition of Easily-confused Chinese Herbal Slices Images Using Enhanced ResNeSt

  • Qi Zhang;Jinfeng Ou;Huaying Zhou
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.8 / pp.2103-2118 / 2024
  • Automated recognition of Chinese herbal slices (CHS) based on computer vision plays a critical role in the practical application of intelligent Chinese medicine. Due to the complexity and similarity of herbal images, identifying Chinese herbal slices is still a challenging task. In particular, easily-confused CHS exhibit greater inter-class and intra-class complexity and similarity, and existing deep learning models do not adapt well enough to identify them efficiently. To address these problems, we first build a novel tiny easily-confused CHS dataset, which includes six pairs (twelve categories) with about 2395 samples. Furthermore, we propose a ResNeSt-CHS model that combines multilevel perception fusion (MPF) and perceptive sparse fusion (PSF) blocks to efficiently recognize easily-confused CHS images. To verify the superiority of ResNeSt-CHS and the effectiveness of our dataset, we conduct experiments showing that ResNeSt-CHS is the best-performing model for easily-confused CHS recognition, with a 2.1% improvement over the original ResNeSt model. Additionally, the results indicate that ResNeSt-CHS achieves high accuracy even on a relatively small-scale dataset. The model obtains state-of-the-art easily-confused CHS classification performance, with an accuracy of 90.8%, far beyond other models (EfficientNet, Transformer, ResNeSt, etc.) in terms of the evaluation criteria.

Pretext Task Analysis for Self-Supervised Learning Application of Medical Data (의료 데이터의 자기지도학습 적용을 위한 pretext task 분석)

  • Kong, Heesan;Park, Jaehun;Kim, Kwangsu
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.38-40 / 2021
  • The medical domain has a massive number of data records without response values (labels). Self-supervised learning is well suited to medical data because the model learns a pretext task whose supervision comes from the data itself, so it can learn semantic representations of the data without response values. However, since self-supervised learning performance depends on the representation learned by the pretext task, it is necessary to define an appropriate pretext task that takes the features of the data into account. In this paper, to actively exploit unlabeled medical data in artificial intelligence research, we experimentally search for pretext tasks that are suitable for medical data and analyze the results. We use an X-ray image dataset, which is readily usable in the medical domain.

Hybrid All-Reduce Strategy with Layer Overlapping for Reducing Communication Overhead in Distributed Deep Learning (분산 딥러닝에서 통신 오버헤드를 줄이기 위해 레이어를 오버래핑하는 하이브리드 올-리듀스 기법)

  • Kim, Daehyun;Yeo, Sangho;Oh, Sangyoon
    • KIPS Transactions on Computer and Communication Systems / v.10 no.7 / pp.191-198 / 2021
  • As training datasets grow larger and models grow deeper to achieve high accuracy in deep learning, deep neural network training requires a large amount of computation and takes too long on a single node. Distributed deep learning has therefore been proposed to reduce training time by distributing computation across multiple nodes. In this study, we propose a hybrid all-reduce strategy that considers the characteristics of each layer, together with a technique that overlaps communication and computation, for synchronization in distributed deep learning. Because the convolution layers have fewer parameters than the fully-connected layers and are located in the upper part of the network, only a short overlapping time is available for them. Thus, butterfly all-reduce is used to synchronize the convolution layers, while the fully-connected layers are synchronized using ring all-reduce. Empirical results on PyTorch show that the proposed scheme reduces training time by up to 33% compared to baseline PyTorch.
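To make the two synchronization patterns concrete, the following single-process sketch simulates ring all-reduce and a butterfly-style all-reduce on plain Python lists. It assumes that "butterfly all-reduce" refers to the usual recursive-doubling exchange, in which each worker swaps its whole buffer with the partner whose index differs in one bit. The function names are hypothetical, and the overlap of communication with computation described in the abstract is not modeled.

```python
def ring_allreduce(worker_data):
    """Simulate ring all-reduce: n-1 reduce-scatter steps, then n-1 all-gather steps."""
    n = len(worker_data)
    length = len(worker_data[0])
    assert length % n == 0, "sketch assumes the buffer splits evenly into n chunks"
    size = length // n
    bufs = [list(v) for v in worker_data]

    def chunk(c):
        return range(c * size, (c + 1) * size)

    # Reduce-scatter: each worker passes one chunk to its right neighbour, which adds it.
    for step in range(n - 1):
        msgs = [((i - step) % n, None) for i in range(n)]
        msgs = [(c, [bufs[i][k] for k in chunk(c)]) for i, (c, _) in enumerate(msgs)]
        for i, (c, payload) in enumerate(msgs):
            dst = (i + 1) % n
            for k, v in zip(chunk(c), payload):
                bufs[dst][k] += v

    # Worker i now owns the fully reduced chunk (i + 1) % n.
    # All-gather: circulate the finished chunks around the ring, overwriting stale values.
    for step in range(n - 1):
        msgs = [((i + 1 - step) % n, None) for i in range(n)]
        msgs = [(c, [bufs[i][k] for k in chunk(c)]) for i, (c, _) in enumerate(msgs)]
        for i, (c, payload) in enumerate(msgs):
            dst = (i + 1) % n
            for k, v in zip(chunk(c), payload):
                bufs[dst][k] = v
    return bufs


def butterfly_allreduce(worker_data):
    """Simulate recursive-doubling all-reduce: log2(n) full-buffer pairwise exchanges."""
    n = len(worker_data)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two worker count"
    bufs = [list(v) for v in worker_data]
    distance = 1
    while distance < n:
        # Every worker i exchanges with partner i XOR distance and sums both buffers.
        bufs = [[a + b for a, b in zip(bufs[i], bufs[i ^ distance])] for i in range(n)]
        distance *= 2
    return bufs


grads = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400], [1000, 2000, 3000, 4000]]
print(ring_allreduce(grads)[0])       # every worker ends with the element-wise sum
print(butterfly_allreduce(grads)[0])
```

The ring variant takes 2(n-1) steps but sends only 1/n of the buffer per step, which suits large buffers. The butterfly variant finishes in log2(n) steps but sends the full buffer each time, which suits small buffers that must be synchronized quickly. This matches the abstract's pairing of butterfly with convolution layers and ring with fully-connected layers.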

SIFT Image Feature Extraction based on Deep Learning (딥 러닝 기반의 SIFT 이미지 특징 추출)

  • Lee, Jae-Eun;Moon, Won-Jun;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of Broadcast Engineering / v.24 no.2 / pp.234-242 / 2019
  • In this paper, we propose a deep neural network that extracts SIFT feature points by determining whether the center pixel of a cropped image is a SIFT feature point. The dataset for this network consists of the DIV2K dataset cut into $33 \times 33$ patches and uses RGB images, unlike SIFT, which uses grayscale images. The ground truth consists of the RobHess SIFT features extracted by setting the octave (scale) to 0, the sigma to 1.6, and the intervals to 3. Based on VGG-16, we construct progressively deeper networks with 13, 23, and 33 convolution layers, and experiment with different methods of increasing the image scale. The result of using the sigmoid function as the activation function of the output layer is compared with the result of using the softmax function. Experimental results show that the proposed network not only achieves more than 99% extraction accuracy but also has high extraction repeatability for distorted images.
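The center-pixel labeling described above can be sketched as follows: slide a 33×33 window over an image and mark a patch positive when its center coincides with a known keypoint location. The `make_patches` helper, the `stride` parameter, and the representation of keypoints as a set of `(row, col)` tuples are assumptions for illustration. In the paper the keypoints come from RobHess SIFT, which is not reproduced here.

```python
PATCH = 33
HALF = PATCH // 2  # 16 pixels on each side of the center


def make_patches(image, keypoints, stride=1):
    """Cut an image into 33x33 patches labeled by whether the center is a keypoint.

    image:     list of rows, each row a list of pixels (e.g. RGB tuples)
    keypoints: set of (row, col) positions assumed to come from a SIFT detector
    """
    height, width = len(image), len(image[0])
    samples = []
    for r in range(HALF, height - HALF, stride):
        for c in range(HALF, width - HALF, stride):
            # Slice a window centered on (r, c); borders are skipped so every patch is full size.
            patch = [row[c - HALF:c + HALF + 1] for row in image[r - HALF:r + HALF + 1]]
            label = 1 if (r, c) in keypoints else 0
            samples.append((patch, label))
    return samples


# Toy 35x35 black RGB image with a single assumed keypoint at its center.
image = [[(0, 0, 0) for _ in range(35)] for _ in range(35)]
samples = make_patches(image, keypoints={(17, 17)})
print(len(samples), sum(label for _, label in samples))
```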

Training a semantic segmentation model for cracks in the concrete lining of tunnel (터널 콘크리트 라이닝 균열 분석을 위한 의미론적 분할 모델 학습)

  • Ham, Sangwoo;Bae, Soohyeon;Kim, Hwiyoung;Lee, Impyeong;Lee, Gyu-Phil;Kim, Donggyou
    • Journal of Korean Tunnelling and Underground Space Association / v.23 no.6 / pp.549-558 / 2021
  • To keep infrastructure such as tunnels and underground facilities safe, cracks in the concrete lining of tunnels should be detected through regular inspections. Because regular inspections are carried out manually using maintenance lift vehicles, they cause traffic jams, expose workers to dangerous conditions, and reduce the consistency of crack inspection data. This study aims to provide a methodology for automatically extracting cracks from tunnel concrete lining images generated by an existing tunnel image acquisition system. Specifically, we train a deep learning-based semantic segmentation model on an open dataset and evaluate its performance on the dataset from the existing tunnel image acquisition system. In particular, we compare model performance when training on the entire public dataset, on the subset of the public dataset related to tunnel surfaces, and on the tunnel-related subset with negative examples. As a result, the model trained on the tunnel-related subset with negative examples achieved the best performance. We expect that this research can be used to plan efficient model training strategies for crack detection.

Development of a Type 2 Diabetes Prediction Algorithm Based on Big Data (빅데이터 기반 2형 당뇨 예측 알고리즘 개발)

  • Hyun Sim;HyunWook Kim
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.18 no.5 / pp.999-1008 / 2023
  • Early prediction of chronic diseases such as diabetes is an important issue, and improving the accuracy of diabetes prediction is especially important. Various machine learning and deep learning-based methodologies are being introduced for diabetes prediction, but these technologies require large amounts of data to outperform other methodologies, and their learning cost is high due to complex data models. In this study, we aim to verify the claim that a DNN using the Pima dataset and k-fold cross-validation reduces the efficiency of diabetes diagnosis models. Machine learning classification methods such as decision trees, SVM, random forests, logistic regression, KNN, and various ensemble techniques were used to determine which algorithm produces the best prediction results. After training and testing all classification models, the proposed system achieved the best results with the XGBoost classifier combined with the ADASYN method, with an accuracy of 81%, an F1 score of 0.81, and an AUC of 0.84. Additionally, a domain adaptation method was implemented to demonstrate the versatility of the proposed system. An explainable AI approach using the LIME and SHAP frameworks was implemented to understand how the model predicts the final outcome.
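For readers unfamiliar with the evaluation protocol, the sketch below shows how k-fold cross-validation partitions sample indices and how an F1 score is computed from binary predictions. It is a generic illustration, not the paper's pipeline: the helper names are hypothetical, the folds are contiguous rather than shuffled or stratified, and no classifier or ADASYN resampling is included.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In practice each fold's model is trained on `train_idx` and scored on `test_idx`, and the per-fold scores are averaged. Any oversampling step would normally be applied to the training indices only, so that synthetic samples do not leak into the test fold.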

Multi-focus Image Fusion using Fully Convolutional Two-stream Network for Visual Sensors

  • Xu, Kaiping;Qin, Zheng;Wang, Guolong;Zhang, Huidi;Huang, Kai;Ye, Shuxiong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.5 / pp.2253-2272 / 2018
  • We propose a deep learning method for multi-focus image fusion. Unlike most existing pixel-level fusion methods, which operate either in the spatial domain or in a transform domain, our method directly learns an end-to-end fully convolutional two-stream network. The framework maps a pair of differently focused images to a clean version through a chain of convolutional layers, a fusion layer, and deconvolutional layers. Our deep fusion model offers efficiency and robustness while demonstrating state-of-the-art fusion quality. We explore different parameter settings to achieve trade-offs between performance and speed. Moreover, the experimental results on our training dataset show that our network achieves good performance in terms of both subjective visual perception and objective assessment metrics.

Road Damage Detection and Classification based on Multi-level Feature Pyramids

  • Yin, Junru;Qu, Jiantao;Huang, Wei;Chen, Qiqiang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.2 / pp.786-799 / 2021
  • Road damage detection is important for road maintenance. With the development of deep learning, more and more road damage detection methods have been proposed, such as Fast R-CNN, Faster R-CNN, Mask R-CNN, and RetinaNet. However, because shallow and deep layers cannot be extracted at the same time, the existing methods do not perform well in detecting objects with fewer samples. In addition, these methods cannot obtain highly accurate detection bounding boxes. This paper presents a Multi-level Feature Pyramids method based on M2Det. Because the feature layer has a multi-scale and multi-level architecture, feature layers containing more information and more distinct features can be extracted. Moreover, an attention mechanism is used to improve the accuracy of local bounding boxes in the dataset. Experimental results show that the proposed method outperforms the current state-of-the-art methods.

TVM-based Performance Optimization for Image Classification in Embedded Systems (임베디드 시스템에서의 객체 분류를 위한 TVM기반의 성능 최적화 연구)

  • Cheonghwan Hur;Minhae Ye;Ikhee Shin;Daewoo Lee
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.3 / pp.101-108 / 2023
  • Optimizing the performance of deep neural networks on embedded systems is a challenging task that requires efficient compilers and runtime systems. We propose a TVM-based approach that consists of three steps: quantization, auto-scheduling, and ahead-of-time compilation. Our approach reduces the computational complexity of models without significant loss of accuracy and generates optimized code for various hardware platforms. We evaluate our approach on three representative CNNs using the ImageNet dataset on the NVIDIA Jetson AGX Xavier board and show that it outperforms baseline methods in terms of processing speed.
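The quantization step can be illustrated independently of TVM. The sketch below applies symmetric per-tensor int8 quantization to a list of weights and measures the reconstruction error. It is a generic textbook scheme with hypothetical helper names, not TVM's quantization pass or the authors' configuration, which the abstract does not specify.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0   # float value of one integer step
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [-1.0, 0.0, 0.5, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Storing weights as 8-bit integers with a single float scale reduces memory traffic and allows integer arithmetic on hardware that supports it, at the cost of a rounding error of at most about half a quantization step per value.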