• Title/Summary/Keyword: 스킵 연결

Search Result 14, Processing Time 0.022 seconds

2D and 3D Hand Pose Estimation Based on Skip Connection Form (스킵 연결 형태 기반의 손 관절 2D 및 3D 검출 기법)

  • Ku, Jong-Hoe;Kim, Mi-Kyung;Cha, Eui-Young
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.12
    • /
    • pp.1574-1580
    • /
    • 2020
  • Traditional pose estimation methods include using special devices or images through image processing. The disadvantage of using a device is that the environment in which the device can be used is limited and costly. The use of cameras and image processing has the advantage of reducing environmental constraints and costs, but the performance is lower. CNN(Convolutional Neural Networks) were studied for pose estimation just using only camera without these disadvantage. Various techniques were proposed to increase cognitive performance. In this paper, the effect of the skip connection on the network was experimented by using various skip connections on the joint recognition of the hand. Experiments have confirmed that the presence of additional skip connections other than the basic skip connections has a better effect on performance, but the network with downward skip connections is the best performance.

Clustering Performance Analysis of Autoencoder with Skip Connection (스킵연결이 적용된 오토인코더 모델의 클러스터링 성능 분석)

  • Jo, In-su;Kang, Yunhee;Choi, Dong-bin;Park, Young B.
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.9 no.12
    • /
    • pp.403-410
    • /
    • 2020
  • In addition to the research on noise removal and super-resolution using the data restoration (Output result) function of Autoencoder, research on the performance improvement of clustering using the dimension reduction function of autoencoder are actively being conducted. The clustering function and data restoration function using Autoencoder have common points that both improve performance through the same learning. Based on these characteristics, this study conducted an experiment to see if the autoencoder model designed to have excellent data recovery performance is superior in clustering performance. Skip connection technique was used to design autoencoder with excellent data recovery performance. The output result performance and clustering performance of both autoencoder model with Skip connection and model without Skip connection were shown as graph and visual extract. The output result performance was increased, but the clustering performance was decreased. This result indicates that the neural network models such as autoencoders are not sure that each layer has learned the characteristics of the data well if the output result is good. Lastly, the performance degradation of clustering was compensated by using both latent code and skip connection. This study is a prior study to solve the Hanja Unicode problem by clustering.

Single Image Super-resolution using Recursive Residual Architecture Via Dense Skip Connections (고밀도 스킵 연결을 통한 재귀 잔차 구조를 이용한 단일 이미지 초해상도 기법)

  • Chen, Jian;Jeong, Jechang
    • Journal of Broadcast Engineering
    • /
    • v.24 no.4
    • /
    • pp.633-642
    • /
    • 2019
  • Recently, the convolution neural network (CNN) model at a single image super-resolution (SISR) have been very successful. The residual learning method can improve training stability and network performance in CNN. In this paper, we propose a SISR using recursive residual network architecture by introducing dense skip connections for learning nonlinear mapping from low-resolution input image to high-resolution target image. The proposed SISR method adopts a method of the recursive residual learning to mitigate the difficulty of the deep network training and remove unnecessary modules for easier to optimize in CNN layers because of the concise and compact recursive network via dense skip connection method. The proposed method not only alleviates the vanishing-gradient problem of a very deep network, but also get the outstanding performance with low complexity of neural network, which allows the neural network to perform training, thereby exhibiting improved performance of SISR method.

Fine-tuning of Attention-based BART Model for Text Summarization (텍스트 요약을 위한 어텐션 기반 BART 모델 미세조정)

  • Ahn, Young-Pill;Park, Hyun-Jun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.12
    • /
    • pp.1769-1776
    • /
    • 2022
  • Automatically summarizing long sentences is an important technique. The BART model is one of the widely used models in the summarization task. In general, in order to generate a summarization model of a specific domain, fine-tuning is performed by re-training a language model trained on a large dataset to fit the domain. The fine-tuning is usually done by changing the number of nodes in the last fully connected layer. However, in this paper, we propose a fine-tuning method by adding an attention layer, which has been recently applied to various models and shows good performance. In order to evaluate the performance of the proposed method, various experiments were conducted, such as accumulating layers deeper, fine-tuning without skip connections during the fine tuning process, and so on. As a result, the BART model using two attention layers with skip connection shows the best score.

A Study on the Age Prediction Model Using ResNet (ResNet을 이용한 나이 예측 모델 연구)

  • Ji-Hun Kim;Young-Tae Shin
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2024.05a
    • /
    • pp.803-806
    • /
    • 2024
  • 본 연구는 디지털 기술과 인공지능의 발전을 배경으로, ResNet 모델을 활용하여 얼굴 인식 및 나이 예측 시스템을 개발하고 평가한다. ResNet의 잔차 학습과 스킵 연결 기능은 깊은 신경망에서 발생할 수 있는 기울기 소실 문제를 해결하여 모델의 학습 효율을 높이는 데 중요한 역할을 한다. 또한 All-Age-Faces Dataset을 이용하여 나이 예측에서 아시아 인종에 대한 편향 없이 고르게 좋은 성능을 보여주는 것을 목표로 한다.

Label Embedding for Improving Classification Accuracy UsingAutoEncoderwithSkip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.175-197
    • /
    • 2021
  • Recently, with the development of deep learning technology, research on unstructured data analysis is being actively conducted, and it is showing remarkable results in various fields such as classification, summary, and generation. Among various text analysis fields, text classification is the most widely used technology in academia and industry. Text classification includes binary class classification with one label among two classes, multi-class classification with one label among several classes, and multi-label classification with multiple labels among several classes. In particular, multi-label classification requires a different training method from binary class classification and multi-class classification because of the characteristic of having multiple labels. In addition, since the number of labels to be predicted increases as the number of labels and classes increases, there is a limitation in that performance improvement is difficult due to an increase in prediction difficulty. To overcome these limitations, (i) compressing the initially given high-dimensional label space into a low-dimensional latent label space, (ii) after performing training to predict the compressed label, (iii) restoring the predicted label to the high-dimensional original label space, research on label embedding is being actively conducted. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, since these techniques consider only the linear relationship between labels or compress the labels by random transformation, it is difficult to understand the non-linear relationship between labels, so there is a limitation in that it is not possible to create a latent label space sufficiently containing the information of the original label. Recently, there have been increasing attempts to improve performance by applying deep learning technology to label embedding. Label embedding using an autoencoder, a deep learning model that is effective for data compression and restoration, is representative. However, the traditional autoencoder-based label embedding has a limitation in that a large amount of information loss occurs when compressing a high-dimensional label space having a myriad of classes into a low-dimensional latent label space. This can be found in the gradient loss problem that occurs in the backpropagation process of learning. To solve this problem, skip connection was devised, and by adding the input of the layer to the output to prevent gradient loss during backpropagation, efficient learning is possible even when the layer is deep. Skip connection is mainly used for image feature extraction in convolutional neural networks, but studies using skip connection in autoencoder or label embedding process are still lacking. Therefore, in this study, we propose an autoencoder-based label embedding methodology in which skip connections are added to each of the encoder and decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. In addition, the proposed methodology was applied to actual paper keywords to derive the high-dimensional keyword label space and the low-dimensional latent label space. Using this, we conducted an experiment to predict the compressed keyword vector existing in the latent label space from the paper abstract and to evaluate the multi-label classification by restoring the predicted keyword vector back to the original label space. As a result, the accuracy, precision, recall, and F1 score used as performance indicators showed far superior performance in multi-label classification based on the proposed methodology compared to traditional multi-label classification methods. This can be seen that the low-dimensional latent label space derived through the proposed methodology well reflected the information of the high-dimensional label space, which ultimately led to the improvement of the performance of the multi-label classification itself. In addition, the utility of the proposed methodology was identified by comparing the performance of the proposed methodology according to the domain characteristics and the number of dimensions of the latent label space.

Scattered X-ray Correction Using a Modified Auto-Encoder (수정된 구조의 AE 모델을 이용한 X-ray 산란선 보정 기법)

  • Seo, Hyogyeong;Jeong, Jihoon;Lee, Donggyu;Han, Seunghwa;Kim, Hojoon
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2021.11a
    • /
    • pp.708-710
    • /
    • 2021
  • 본 논문에서는 X-ray 진단에서 산란선으로 인한 영상의 왜곡을 보정하는 방법으로서 수정된 구조의 AE(Auto-Encoder) 모델에 기반한 방법론을 제안한다. 기존 AE 모델의 계층에 따라 특징지도의 크기가 축소되고 팽창되는 과정에서 영상 복원에 필요한 정보가 소실될 가능성을 보완하기 위하여 동일 레벨 계층 간에 스킵 연결을 추가하였다. 또한 X-ray 영상에서 피사체 세부 부위의 두께와 밀도에 따라 산란선의 영향이 서로 다른 형태로 나타난다는 특성을 학습 과정에 효과적으로 반영하기 위하여 어텐션 모듈을 추가한 네트워크 구조를 도입하였다. 총 80 쌍의 흉부 X-ray 영상 데이터에 대하여 기존의 AE 모델을 사용한 방법 및 U-Net 과 FFA-Net 모델을 사용한 영상 복원 기법의 실험 결과를 상호 비교함으로써 제안된 방법의 타당성을 평가하였다.

Image Segmentation of Fuzzy Deep Learning using Fuzzy Logic (퍼지 논리를 이용한 퍼지 딥러닝 영상 분할)

  • Jongjin Park
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.23 no.5
    • /
    • pp.71-76
    • /
    • 2023
  • In this paper, we propose a fuzzy U-Net, a fuzzy deep learning model that applies fuzzy logic to improve performance in image segmentation using deep learning. Fuzzy modules using fuzzy logic were combined with U-Net, a deep learning model that showed excellent performance in image segmentation, and various types of fuzzy modules were simulated. The fuzzy module of the proposed deep learning model learns intrinsic and complex rules between feature maps of images and corresponding segmentation results. To this end, the superiority of the proposed method was demonstrated by applying it to dental CBCT data. As a result of the simulation, it can be seen that the performance of the ADD-RELU fuzzy module structure of the model using the addition skip connection in the proposed fuzzy U-Net is 0.7928 for the test dataset and the best.

Attention-based deep learning framework for skin lesion segmentation (피부 병변 분할을 위한 어텐션 기반 딥러닝 프레임워크)

  • Afnan Ghafoor;Bumshik Lee
    • Smart Media Journal
    • /
    • v.13 no.3
    • /
    • pp.53-61
    • /
    • 2024
  • This paper presents a novel M-shaped encoder-decoder architecture for skin lesion segmentation, achieving better performance than existing approaches. The proposed architecture utilizes the left and right legs to enable multi-scale feature extraction and is further enhanced by integrating an attention module within the skip connection. The image is partitioned into four distinct patches, facilitating enhanced processing within the encoder-decoder framework. A pivotal aspect of the proposed method is to focus more on critical image features through an attention mechanism, leading to refined segmentation. Experimental results highlight the effectiveness of the proposed approach, demonstrating superior accuracy, precision, and Jaccard Index compared to existing methods

A study on training DenseNet-Recurrent Neural Network for sound event detection (음향 이벤트 검출을 위한 DenseNet-Recurrent Neural Network 학습 방법에 관한 연구)

  • Hyeonjin Cha;Sangwook Park
    • The Journal of the Acoustical Society of Korea
    • /
    • v.42 no.5
    • /
    • pp.395-401
    • /
    • 2023
  • Sound Event Detection (SED) aims to identify not only sound category but also time interval for target sounds in an audio waveform. It is a critical technique in field of acoustic surveillance system and monitoring system. Recently, various models have introduced through Detection and Classification of Acoustic Scenes and Events (DCASE) Task 4. This paper explored how to design optimal parameters of DenseNet based model, which has led to outstanding performance in other recognition system. In experiment, DenseRNN as an SED model consists of DensNet-BC and bi-directional Gated Recurrent Units (GRU). This model is trained with Mean teacher model. With an event-based f-score, evaluation is performed depending on parameters, related to model architecture as well as model training, under the assessment protocol of DCASE task4. Experimental result shows that the performance goes up and has been saturated to near the best. Also, DenseRNN would be trained more effectively without dropout technique.