• Title/Abstract/Keywords: Deep Learning Models

Search results: 1,262 items (processing time: 0.031 seconds)

Korean Semantic Role Labeling using Stacked Bidirectional LSTM-CRFs

  • 배장성;이창기
    • 정보과학회 논문지 / Vol. 44, No. 1 / pp. 36-43 / 2017
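  • In semantic role labeling (SRL), syntactic parse information is very helpful for improving performance because it captures the dependency relations between predicates and their arguments. However, parsing must be performed before SRL, which adds overhead, and errors made at the parsing stage propagate to the SRL output. To address these problems, this paper proposes an end-to-end Korean SRL system that uses only morphological analysis information, without syntactic parse information. It applies a Stacked Bidirectional LSTM-CRFs model, an extension of the LSTM RNN that is well suited to sequential data modeling, and shows that it achieves higher performance than previous work without syntactic parse information.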
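A CRF output layer on top of a BiLSTM tagger is commonly decoded with the Viterbi algorithm. The sketch below illustrates that decoding step only; it is not the authors' implementation. The function name `viterbi_decode` and the layout of the emission and transition score tables are assumptions made for illustration.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][j]   : score of tag j at position t (e.g. BiLSTM outputs).
    transitions[i][j] : score of moving from tag i to tag j (CRF parameters).
    Both layouts are assumptions of this sketch.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] = best score of any path that ends in tag j at the current position
    best = list(emissions[0])
    backpointers: List[List[int]] = []

    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(num_tags):
            # Choose the previous tag that maximises the path score into tag j.
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)

    # Follow the back-pointers from the best final tag to recover the path.
    tag = max(range(num_tags), key=lambda j: best[j])
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    path.reverse()
    return path
```

With all-zero transition scores the decoder simply follows the per-position emission scores; a strongly negative `transitions[i][j]` discourages the tag pair (i, j), which is how a CRF layer can rule out invalid role-label sequences.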

Deep Learning Algorithm to Identify Cancer Pictures

  • 서영민;한종기
    • 방송공학회논문지 / Vol. 23, No. 5 / pp. 669-681 / 2018
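  • This paper introduces a method for effectively recognizing and classifying high-resolution cervical cancer cell images using a CNN (Convolutional Neural Network). Four cell types are considered: Ascus, Inflammation, RCC, and Normal. The paper first introduces existing algorithms for classifying high-resolution images, analyzes what information is lost when they are used to classify high-resolution cell images, and then presents a method to address this loss. To this end, the proposed learning model uses dilated convolution to minimize the loss of information in high-resolution images while also speeding up training. In addition, thresholding is applied as an image preprocessing step to remove regions that could confuse the cancer cell decision, which improves the recognition rate. The experimental results presented in the paper confirm that the proposed algorithm provides a higher recognition rate than existing techniques.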
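Dilated convolution widens the input span covered by each kernel without adding weights, which is why it can take in more of a high-resolution image per layer. The following 1-D sketch illustrates the general technique only; the function names are hypothetical and the paper's actual network configuration is not given in the abstract.

```python
from typing import List, Sequence


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(
    signal: Sequence[float],
    kernel: Sequence[float],
    dilation: int = 1,
) -> List[float]:
    """'Valid' 1-D dilated convolution (cross-correlation, as in CNN layers)."""
    span = effective_kernel_size(len(kernel), dilation)
    output = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart.
        output.append(
            sum(w * signal[start + k * dilation] for k, w in enumerate(kernel))
        )
    return output
```

A 3-tap kernel spans 3 inputs at dilation 1, 5 at dilation 2, and 9 at dilation 4, while the number of weights stays at 3.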

Coreference Resolution using Hierarchical Pointer Networks

  • 박천음;이창기
    • 정보과학회 컴퓨팅의 실제 논문지 / Vol. 23, No. 9 / pp. 542-549 / 2017
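  • Sequence-to-sequence models and the similar pointer networks suffer degraded performance when the input consists of multiple sentences or when the input sentence is long. To address this problem, this paper proposes a hierarchical pointer network model that encodes a multi-sentence input sequence at both the word level and the sentence level and uses both levels of information in decoding. Using this model, it proposes hierarchical pointer network based coreference resolution that resolves coreference for all mentions. In experiments, the proposed model achieved a precision of 87.07%, a recall of 65.39%, and a CoNLL F1 of 74.61%, a 24.01% improvement over the existing rule-based model.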

Camera Source Identification of Digital Images Based on Sample Selection

  • Wang, Zhihui;Wang, Hong;Li, Haojie
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 7 / pp. 3268-3283 / 2018
  • With the advent of the Information Age, the source identification of digital images, as a part of digital image forensics, has attracted increasing attention. Therefore, an effective technique to identify the source of digital images is urgently needed. In this paper, first, we study and implement some previous work on image source identification based on sensor pattern noise, such as the Lukas method, the principal component analysis method and the random subspace method. Second, to extract a purer sensor pattern noise, we propose a sample selection method to improve the random subspace method. By analyzing the image texture feature, we select a patch with less complexity to extract more reliable sensor pattern noise, which improves the accuracy of identification. Finally, experimental results reveal that the proposed sample selection method can extract a purer sensor pattern noise, which further improves the accuracy of image source identification. At the same time, this approach is less complicated than deep learning models and comes close to state-of-the-art performance.
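The sample selection idea, picking a low-complexity patch before extracting sensor pattern noise, can be sketched as follows. The abstract does not say which texture feature the authors use, so intensity variance stands in for it here as an assumption; the function names and the non-overlapping patch grid are also hypothetical.

```python
from typing import List, Sequence, Tuple


def patch_variance(
    image: Sequence[Sequence[float]], top: int, left: int, size: int
) -> float:
    """Intensity variance of the size x size patch at (top, left)."""
    pixels = [
        image[r][c]
        for r in range(top, top + size)
        for c in range(left, left + size)
    ]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def select_smoothest_patch(
    image: Sequence[Sequence[float]], size: int
) -> Tuple[int, int]:
    """Return (top, left) of the non-overlapping patch with the lowest variance.

    Variance is an assumed proxy for texture complexity in this sketch.
    """
    height, width = len(image), len(image[0])
    candidates: List[Tuple[int, int]] = [
        (top, left)
        for top in range(0, height - size + 1, size)
        for left in range(0, width - size + 1, size)
    ]
    if not candidates:
        raise ValueError("patch size is larger than the image")
    return min(candidates, key=lambda pos: patch_variance(image, pos[0], pos[1], size))
```

Smooth regions carry less scene content, so the noise residual extracted from them is less contaminated by image detail, which is the intuition the abstract gives for the selection step.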

Understanding recurrent neural network for texts using English-Korean corpora

  • Lee, Hagyeong;Song, Jongwoo
    • Communications for Statistical Applications and Methods / Vol. 27, No. 3 / pp. 313-326 / 2020
  • Deep Learning is the most important key to the development of Artificial Intelligence (AI). There are several distinguishable architectures of neural networks such as MLP, CNN, and RNN. Among them, we try to understand one of the main architectures, the Recurrent Neural Network (RNN), which differs from other networks in handling sequential data, including time series and texts. As one of the main recent tasks in Natural Language Processing (NLP), we consider Neural Machine Translation (NMT) using RNNs. We also summarize fundamental structures of the recurrent networks, and some topics on representing natural-language words as numeric vectors. We organize topics to understand estimation procedures from representing input source sequences to predicting target translated sequences. In addition, we apply multiple translation models with Gated Recurrent Units (GRUs) in Keras on English-Korean sentences that contain about 26,000 pairwise sequences in total from two different corpora, colloquialism and news. We verified some crucial factors that influence the quality of training. We found that loss decreases with more recurrent dimensions and with a bidirectional RNN in the encoder when dealing with short sequences. We also computed BLEU scores, which are the main measures of translation performance, and compared them with the score from Google Translate using the same test sentences. We sum up some difficulties in training a proper translation model as well as in dealing with the Korean language. The use of Keras in Python for overall tasks from processing raw texts to evaluating the translation model also allows us to include some useful functions and vocabulary libraries.
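For readers unfamiliar with the BLEU score mentioned in this abstract, the sketch below computes a simplified sentence-level BLEU against a single reference: clipped n-gram precisions with uniform weights, multiplied by a brevity penalty. It is an illustration under stated simplifications (no smoothing, one reference, pre-tokenized input), not the evaluation code used by the authors, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def ngrams(tokens: Sequence[str], n: int) -> Counter:
    """Count the n-grams of a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(
    candidate: Sequence[str], reference: Sequence[str], max_n: int = 4
) -> float:
    """Simplified single-reference BLEU without smoothing."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(
            min(count, ref_counts[gram]) for gram, count in cand_counts.items()
        )
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precision_sum += math.log(matched / total) / max_n
    c, r = len(candidate), len(reference)
    # Penalise candidates that are not longer than the reference.
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_precision_sum)
```

Short sentences often have no matching 4-grams, so an unsmoothed sentence-level score frequently drops to zero; corpus-level aggregation or a smoothed variant is the usual remedy.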

Low-Quality Banknote Serial Number Recognition Based on Deep Neural Network

  • Jang, Unsoo;Suh, Kun Ha;Lee, Eui Chul
    • Journal of Information Processing Systems / Vol. 16, No. 1 / pp. 224-237 / 2020
  • Recognition of banknote serial numbers is one of the important functions of an intelligent banknote counter and can be used for various purposes. However, previous character recognition methods are of limited use because of the font type of the banknote serial number, variation caused by soiling, and recognition speed. In this paper, we propose an aspect ratio based character region segmentation and a convolutional neural network (CNN) based banknote serial number recognition method. To detect the character region, the character area is determined from the aspect ratio of each character in the serial number candidate area after banknote area detection and de-skewing are performed. Then, we designed and compared four types of CNN models and determined the best model for serial number recognition. Experimental results showed that the recognition accuracy of each character was 99.85%. In addition, it was confirmed that data augmentation improves recognition performance. The banknotes used in the experiment were Indian rupees, which were badly soiled and use an unusual character font, so the method can be regarded as performing well. Recognition speed was also sufficient to run in real time on a device that counts 800 banknotes per minute.
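The real-time requirement quoted above can be turned into a per-banknote time budget with simple arithmetic. The helper below is a hypothetical illustration that assumes banknotes are processed one at a time with no pipelining; the abstract does not state the measured recognition time.

```python
def per_item_budget_ms(items_per_minute: float) -> float:
    """Time available per item, in milliseconds, at a given throughput.

    Assumes strictly sequential processing (an assumption of this sketch).
    """
    if items_per_minute <= 0:
        raise ValueError("items_per_minute must be positive")
    return 60_000.0 / items_per_minute
```

At 800 banknotes per minute this gives 75 ms per banknote for detection, de-skewing, segmentation, and recognition combined.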

Camera-based Dog Unwanted Behavior Detection

  • 오스만;이종욱;박대희;정용화
    • 한국정보처리학회:학술대회논문집 / 한국정보처리학회 2019년도 춘계학술발표대회 / pp. 419-422 / 2019
  • The recent increase in single-person households and family income has led to an increase in the number of pet owners. However, because owners cannot communicate with them 24 hours a day, pets, and especially dogs, tend to display unwanted behavior that can be harmful to themselves and their environment when left alone. Therefore, detecting those behaviors when the owner is absent is necessary to suppress them and prevent any damage. In this paper, we propose a camera-based system that detects a set of normal and unwanted behaviors using deep learning algorithms to monitor dogs when left alone at home. The frames collected from the camera are arranged into sequences of RGB frames and their corresponding optical flow sequences, and then features are extracted from each data flow using pre-trained VGG-16 models. The extracted features from each sequence are concatenated and input to a bi-directional LSTM network that classifies the dog action into one of the targeted classes. The experimental results show that our method achieves good performance, exceeding 0.9 in precision, recall and F1 score.

Comparison of Student Churning Prediction Models based on Deep Learning Algorithms

  • 고영상;임희석
    • 한국정보처리학회:학술대회논문집 / 한국정보처리학회 2019년도 추계학술발표대회 / pp. 833-835 / 2019
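  • In Korea, where enthusiasm for education is strong, private education is always a hot-button issue. The school-age population began to decline rapidly in 1990, and the decline in the number of elementary school students has accelerated since around 2005. According to Statistics Korea data, the number of births fell from about 406,000 in 2016 to about 357,000 in 2017, and the downward trend is expected to continue. Although the number of students decreases every year, total private education spending in 2018 was about 19.5 trillion won, 800 billion won more than the 18.7 trillion won of 2017. The number of students fell 2.5% from the previous year, while private education spending rose 4.4%. As the private education market intensifies in this way, competition will inevitably become fiercer; surviving it requires a variety of business strategies, and reducing student attrition in particular can be seen as the most important point for the business. We analyze the influence of the reasons why students leave private academies and aim to use that analysis to prevent student withdrawals. The main subject of this paper is the advance prediction of withdrawing students by analyzing a variety of academy data, such as grades, attendance, and counseling records from domestic private academies that represent private education, with the best-suited deep learning algorithm.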

Survey on Deep Learning-based Panoptic Segmentation Methods

  • 권정은;조성인
    • 대한임베디드공학회논문지 / Vol. 16, No. 5 / pp. 209-214 / 2021
  • Panoptic segmentation, which is now widely used in computer vision applications such as medical image analysis and autonomous driving, helps in understanding an image holistically. It identifies each pixel by assigning a class ID and an instance ID. Specifically, it can distinguish 'thing' from 'stuff' and provide pixel-wise results of semantic prediction and object detection. As a result, it can solve both semantic segmentation and instance segmentation tasks through a single unified model, producing two different contexts for the two segmentation tasks. The semantic segmentation task focuses on how to obtain multi-scale features from a large receptive field without losing low-level features. The instance segmentation task, on the other hand, focuses on how to separate 'thing' from 'stuff' and how to produce the representation of detected objects. With the advances in both segmentation techniques, several panoptic segmentation models have been proposed. Many researchers try to solve the discrepancies between the results of the two segmentation branches that can arise at object boundaries. In this survey paper, we introduce the concept of panoptic segmentation, categorize the existing methods into two representative approaches, top-down and bottom-up, and explain how each operates. Then, we analyze the performance of various methods with experimental results.
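The per-pixel labelling described in this abstract, a class ID plus an instance ID, can be illustrated by merging a semantic map and an instance map into a single panoptic map. The encoding `class_id * 1000 + instance_id` and the function names below are assumptions chosen for illustration; the survey abstract does not prescribe an encoding.

```python
from typing import List, Sequence, Set, Tuple

# Assumed encoding for this sketch: panoptic_id = class_id * 1000 + instance_id
LABEL_DIVISOR = 1000


def merge_panoptic(
    semantic: Sequence[Sequence[int]],
    instance: Sequence[Sequence[int]],
    thing_classes: Set[int],
) -> List[List[int]]:
    """Combine per-pixel class IDs and instance IDs into one panoptic map.

    Pixels of 'stuff' classes (those not in thing_classes) get instance ID 0,
    because stuff regions are not split into separate objects.
    """
    panoptic = []
    for sem_row, inst_row in zip(semantic, instance):
        row = []
        for class_id, instance_id in zip(sem_row, inst_row):
            if class_id not in thing_classes:
                instance_id = 0
            row.append(class_id * LABEL_DIVISOR + instance_id)
        panoptic.append(row)
    return panoptic


def decode_panoptic_id(panoptic_id: int) -> Tuple[int, int]:
    """Split a panoptic ID back into (class_id, instance_id)."""
    return divmod(panoptic_id, LABEL_DIVISOR)
```

In this sketch two pixels belong to the same object only if their panoptic IDs are equal, while all pixels of one stuff class share a single ID.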

Stage-GAN with Semantic Maps for Large-scale Image Super-resolution

  • Wei, Zhensong;Bai, Huihui;Zhao, Yao
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 13, No. 8 / pp. 3942-3961 / 2019
  • Recently, deep super-resolution networks have been able to learn the non-linear mapping from low-resolution inputs to high-resolution outputs. However, for large scaling factors, this approach has difficulty learning the relation between low-resolution and high-resolution images, which leads to poor restoration. In this paper, we propose Stage Generative Adversarial Networks (Stage-GAN) with semantic maps for image super-resolution (SR) at large scaling factors. We decompose the task of image super-resolution into a novel semantic map based reconstruction and refinement process. In the initial stage, semantic maps based on the given low-resolution images are generated by Stage-0 GAN. In the next stage, the generated semantic maps from Stage-0 and the corresponding low-resolution images are used by Stage-1 GAN to yield high-resolution images. To remove reconstruction artifacts and blur from the high-resolution images, a Stage-2 GAN based post-processing module is proposed for the last stage, which can reconstruct high-resolution images with photo-realistic details. Extensive experiments and comparisons with other SR methods demonstrate that our proposed method can restore photo-realistic images with visual improvements. For scale factor $\times 8$, our method performs favorably against other methods in terms of gradient similarity.