• Title/Summary/Keyword: Dropout layer


Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae; Lee, Bomi; Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.95-108 / 2017
  • Recently, AlphaGo, the Baduk (Go) artificial intelligence program developed by Google DeepMind, won a decisive victory against Lee Sedol. Many people thought that machines would not be able to beat a human at Go because, unlike chess, the number of possible move sequences exceeds the number of atoms in the universe, but the result was the opposite of what people predicted. After the match, artificial intelligence was highlighted as a core technology of the fourth industrial revolution and attracted attention from various application domains. Deep learning in particular has drawn attention as the core artificial intelligence technology used in the AlphaGo algorithm. Deep learning is already being applied to many problems and performs especially well in image recognition. It also performs well on high-dimensional data such as voice, images, and natural language, where existing machine learning techniques struggled to perform well. In contrast, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we investigated whether the deep learning techniques studied so far can be used not only for recognition of high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction. We also compared the performance of the deep learning techniques with that of traditional artificial neural network models. The experimental data in the paper are the telemarketing response data of a bank in Portugal. The data have input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, and a binary target variable that records whether the customer intends to open an account.
In this study, to evaluate the applicability of deep learning algorithms and techniques to binary classification problems, we compared the performance of various models using CNN, LSTM, and dropout, which are widely used in deep learning, with that of MLP models, a traditional artificial neural network model. However, since not all network design alternatives can be tested, given the nature of artificial neural networks, the experiment was conducted under restricted settings for the number of hidden layers, the number of neurons in each hidden layer, the number of output data (filters), and the conditions for applying dropout. The F1 score was used instead of overall accuracy to evaluate the models, to show how well they classify the class of interest. The detailed methods for applying each deep learning technique in the experiment are as follows. The CNN algorithm reads values adjacent to a specific value and recognizes features, but the distance between fields in business data does not matter because the fields are usually independent. In this experiment, we set the filter size of the CNN to the number of fields so that it learns the characteristics of the whole record at once, and added a hidden layer to make decisions based on the additional features. For the model with two LSTM layers, the input direction of the second layer is reversed relative to the first layer to reduce the influence of each field's position. For dropout, we set the neurons of each hidden layer to be dropped with a probability of 0.5. The experimental results show that the prediction model with the highest F1 score was the CNN model using dropout, and the next best was the MLP model with two hidden layers using dropout.
In this study, we obtained several findings as the experiment proceeded. First, models using dropout make slightly more conservative predictions than those without it and generally show better classification performance. Second, CNN models show better classification performance than MLP models. This is interesting because CNN has performed well on binary classification problems, to which it has rarely been applied, as well as in the fields where its effectiveness has been proven. Third, the LSTM algorithm seems unsuitable for binary classification problems because the training time is too long relative to the performance improvement. From these results, we can confirm that some deep learning algorithms can be applied to solve business binary classification problems.
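
The abstract above evaluates models by F1 score rather than overall accuracy. A minimal sketch of why that matters on imbalanced response data follows; the labels and predictions are hypothetical illustrations, not data from the paper, and the helper names are assumptions.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the class of interest: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced sample: 2 responders among 10 customers.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_no = [0] * 10                       # never predicts a responder
one_hit = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]   # finds one responder, one false alarm

for name, pred in [("always_no", always_no), ("one_hit", one_hit)]:
    print(name, accuracy(y_true, pred), f1_score(y_true, pred))
```

Both hypothetical predictors reach the same accuracy, but only the second one has a non-zero F1, because only it identifies any member of the positive class.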

A Deep Neural Network Model Based on a Mutation Operator (돌연변이 연산 기반 효율적 심층 신경망 모델)

  • Jeon, Seung Ho; Moon, Jong Sub
    • KIPS Transactions on Software and Data Engineering / v.6 no.12 / pp.573-580 / 2017
  • A deep neural network (DNN) is a large layered neural network consisting of many layers of non-linear units. Deep learning, as represented by DNNs, has been applied very successfully in various applications. However, past research has identified many issues with DNNs, of which generalization is the best known. A recent technique, Dropout, successfully addressed this problem. Dropout also acts as noise, so it helps a DNN learn robust features during training, as in a denoising autoencoder. However, because Dropout requires a large amount of computation, training takes a long time. Since Dropout keeps changing the inter-layer representation during training, the learning rate must be small, which lengthens training further. In this paper, we use a mutation operation to reduce computation and improve generalization performance compared with Dropout. We also compared the proposed method with Dropout experimentally and showed that our method is superior.
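
To make concrete how Dropout injects noise and changes the inter-layer representation on every pass, here is a minimal sketch of the commonly used "inverted dropout" formulation. It illustrates standard dropout only, not the mutation operator proposed in the paper; the function name and the rate of 0.5 are assumptions for illustration.

```python
import random


def dropout(activations, rate=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability `rate`, rescale survivors.

    Rescaling by 1 / (1 - rate) keeps the expected activation unchanged,
    so nothing needs to be adjusted at inference time.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time the layer is the identity.
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    # A fresh random mask is drawn on every call, so the representation
    # passed to the next layer differs from one training step to the next.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


hidden = [1.0] * 8
print(dropout(hidden, rate=0.5, rng=random.Random(0)))   # mix of 0.0 and 2.0
print(dropout(hidden, rate=0.5, training=False))         # unchanged
```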

LSTM based sequence-to-sequence Model for Korean Automatic Word-spacing (LSTM 기반의 sequence-to-sequence 모델을 이용한 한글 자동 띄어쓰기)

  • Lee, Tae Seok; Kang, Seung Shik
    • Smart Media Journal / v.7 no.4 / pp.17-23 / 2018
  • We propose an LSTM-based RNN model that effectively performs automatic word spacing. For long or noisy sentences, which are known to be difficult for neural network learning, we defined suitable input and decoding data formats and added dropout, bidirectional multi-layer LSTM, layer normalization, and an attention mechanism to improve performance. Although the Sejong corpus contains some spacing errors, the noise-robust model developed in this study, which avoided overfitting through dropout, trained well and returned meaningful results on Korean word spacing and its patterns. The experimental results showed that the LSTM sequence-to-sequence model achieves an F1-measure of 0.94, which is better than the rule-based deep-learning method of GRU-CRF.
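
The abstract does not spell out its input and decoding data formats. As one plausible framing, and purely as an assumption rather than the authors' format, word spacing can be cast as a per-character labeling task: the input is the sentence with spaces removed, and the target marks which characters are followed by a space. The helper names below are hypothetical.

```python
def to_spacing_example(sentence):
    """Split a correctly spaced sentence into (characters, labels).

    labels[i] == 1 means a space follows characters[i] (assumed encoding).
    """
    chars, labels = [], []
    for ch in sentence.strip():
        if ch == " ":
            if labels:
                labels[-1] = 1   # mark the preceding character
        else:
            chars.append(ch)
            labels.append(0)
    return chars, labels


def apply_spacing(chars, labels):
    """Rebuild a spaced sentence from characters and predicted labels."""
    return "".join(c + (" " if lab else "") for c, lab in zip(chars, labels)).rstrip()


chars, labels = to_spacing_example("나는 학교에 간다")
print("".join(chars))   # unspaced model input
print(labels)           # per-character spacing targets
print(apply_spacing(chars, labels))
```

With this encoding, an F1-measure like the one reported can be computed over the label positions, treating "space follows" as the positive class.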

Research on a handwritten character recognition algorithm based on an extended nonlinear kernel residual network

  • Rao, Zheheng; Zeng, Chunyan; Wu, Minghu; Wang, Zhifeng; Zhao, Nan; Liu, Min; Wan, Xiangkui
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.413-435 / 2018
  • Although the accuracy of handwritten character recognition based on deep networks has been shown to be superior to that of traditional methods, the use of an overly deep network significantly increases the time spent on parameter training. For this reason, this paper takes both training time and recognition accuracy into consideration and proposes a novel handwritten character recognition algorithm with a newly designed network structure based on an extended nonlinear kernel residual network. The network is not extremely deep, and its main design is as follows: (1) design of an unsupervised apriori algorithm for intra-class clustering, making the subsequent network training more pertinent; (2) presentation of an intermediate convolution model with a pre-processed width level of 2; (3) presentation of a composite residual structure that designs a multi-level quick link; and (4) addition of a Dropout layer after the parameter optimization. The algorithm shows superior results on the MNIST and SVHN datasets, two benchmark character recognition datasets, and achieves better recognition accuracy and higher recognition efficiency than other deep structures with the same number of layers.

Accident Detection System in Tunnel using CCTV (CCTV를 이용한 터널내 사고감지 시스템)

  • Lee, Se-Hoon; Lee, Seung-Yeob; Noh, Yeong-Hun
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.3-4 / 2021
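  • When an accident occurs inside an enclosed tunnel, the situation inside cannot be seen from outside, so even a minor accident is likely to lead to a major secondary accident. To reduce false detections of accident situations in video-based detection, this study selected VGG16, which showed the best performance among the many existing CNN models when choosing the one best suited to the available data, trained it by transfer learning, and applied Dropout to some of the fully connected layers to partially prevent overfitting. We then propose a system that detects accidents while reducing false detections by recognizing objects in the video with Yolo, extracting each object's ROI from the video frames with OpenCV, and comparing these against the CNN model.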


A Study on the Hyper-parameter Optimization of Bitcoin Price Prediction LSTM Model (비트코인 가격 예측을 위한 LSTM 모델의 Hyper-parameter 최적화 연구)

  • Kim, Jun-Ho; Sung, Hanul
    • Journal of the Korea Convergence Society / v.13 no.4 / pp.17-24 / 2022
  • Bitcoin is a peer-to-peer cryptocurrency designed for electronic transactions that do not depend on governments or financial institutions. Since Bitcoin was first issued, a huge blockchain financial market has been created, and as a result, research on predicting Bitcoin price data using machine learning has been increasing. However, the inefficient hyper-parameter optimization process in machine learning research hinders the progress of that research. In this paper, we analyze and present a direction for hyper-parameter optimization through experiments that cover every combination of the timesteps, the number of LSTM units, and the dropout ratio, which are among the most representative hyper-parameters, and measure the predictive performance of each combination using a Bitcoin price prediction model built on an LSTM layer.
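
Evaluating every combination of timesteps, LSTM units, and dropout ratio amounts to an exhaustive grid search. The sketch below shows only the enumeration; the candidate values are hypothetical, and `toy_error` is a placeholder standing in for training an LSTM and measuring its validation error, which the abstract does not detail.

```python
from itertools import product


def grid_search(evaluate, timesteps, lstm_units, dropout_rates):
    """Score every (timesteps, units, dropout) combination, best (lowest) first."""
    results = []
    for t, u, d in product(timesteps, lstm_units, dropout_rates):
        results.append(((t, u, d), evaluate(t, u, d)))
    return sorted(results, key=lambda item: item[1])


def toy_error(timesteps, units, dropout):
    """Placeholder objective (assumption): NOT a real model, just a smooth bowl."""
    return abs(timesteps - 30) / 30 + abs(units - 64) / 64 + abs(dropout - 0.2)


# Hypothetical candidate values for illustration only.
results = grid_search(
    toy_error,
    timesteps=[10, 30, 60],
    lstm_units=[32, 64, 128],
    dropout_rates=[0.0, 0.2, 0.5],
)
print(len(results), "combinations evaluated")
print("best:", results[0])
```

The number of runs is the product of the list lengths, which is why an exhaustive grid becomes expensive quickly as more hyper-parameters or candidate values are added.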

A Study on the Fishing Efficiency of the Jigging Gear Neon Flying Squid, Ommastrephes Bartrami in the North Pacific (북태평양 빨강오징어 채낚기의 조획성능에 관한 연구)

  • 오희국
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.30 no.3 / pp.150-160 / 1994
  • The drift gillnet fishery for neon flying squid in the North Pacific was one of Korea's major pelagic fisheries until 1992, with an average annual catch of 79,000 M/T during 1988-1992, but it has been under a moratorium since 1993 following a UN decision. Therefore, to develop new fishing gear for the squid, seven types of rip hook were tested with an automatic squid jigging machine by the Korean research vessel Pusan 851 (G/T 1.126, 2.600 PS) in the North Pacific (38°30'-43°N, 152°E-178°W) from July 6, 1993 to August 31, 1993. Catch rate, dropout rate, and catch condition of the rip hooks were investigated in relation to the power of the fishing lamps used to aggregate the squid. The results are as follows: The catch by the automatic squid jigging machine consisted of 83.9% neon flying squid, 15.5% boreopacific gonate squid, 0.6% boreal clubhook squid, and 0.01% luminous flying squid. Of the neon flying squid catch, 94.6% was taken at surface water temperatures of 13.6-18.3°C and 5.4% at other temperatures. The higher catch rate of neon flying squid occurred at surface temperatures of 13.6-18.3°C and about 10°C at the 100 m layer. The CPUE of neon flying squid at surface water temperatures of 13.6-18.3°C ranged from 0.8 to 11.8 kg (8.7 kg on average). The mantle length and body weight of the neon flying squid caught in the experiment ranged over 18.3-51.3 cm and 140-3,980 g, with means of 29.4 cm and 972 g respectively. The catch rate of neon flying squid was highest at dawn, at 25.0% of the total catch. The body weight of neon flying squid caught by the D type hooks was 1.7 times more than that of the A type hooks.
The dropout rate of neon flying squid caught by the seven types of hooks was 7.9-57.5% (19.0% on average), and the dropout rate of the D type hooks was 7.9%, 2.7 times lower than that of the A type hooks. When the fishing lamps were switched on and off at 15-minute intervals, the catch efficiency for small neon flying squid was 2.6 times higher than with the lamps left on continuously at the same lamp power.


Electrooculography Filtering Model Based on Machine Learning (머신러닝 기반의 안전도 데이터 필터링 모델)

  • Hong, Ki Hyeon; Lee, Byung Mun
    • Journal of Korea Multimedia Society / v.24 no.2 / pp.274-284 / 2021
  • Customized sleep-induction services for better sleep care are more effective because satisfaction levels differ among users. EOG data measured at the frontal lobe when a person blinks can be used as biometric data because its values differ from person to person. Measurement accuracy is degraded by noise sources such as tossing and turning. Therefore, the noisy data must be analyzed and removed from the normal EOG signal by filtering. Low-pass and high-pass filtering are available as frequency-band filtering methods. However, since filtering within a frequency band is also required for more effective performance, in this paper we propose a machine learning model for filtering EOG data as a second filtering method. In addition, optimal values of parameters such as the depth of the hidden layers, the number of nodes per hidden layer, the activation function, and the dropout were found through experiments to improve the performance of the machine learning filtering model, and a filtering performance of 95.7% was obtained. The filtering model for EOG data is expected to be usable for effective user identification services.

A comparison of methods to reduce overfitting in neural networks

  • Kim, Ho-Chan; Kang, Min-Jae
    • International journal of advanced smart convergence / v.9 no.2 / pp.173-178 / 2020
  • A common problem in neural network learning is that the network fits the specifics of the training data too closely. In this paper, various methods for avoiding overfitting were compared: regularization, dropout, different amounts of data, and different types of neural networks. Comparative studies of these methods are provided to evaluate test accuracy. We found that using more data is more effective than the regularization and dropout methods. Moreover, deep convolutional neural networks outperform multi-layer neural networks and simple convolutional neural networks.

Prediction of Asphalt Pavement Service Life using Deep Learning (딥러닝을 활용한 일반국도 아스팔트포장의 공용수명 예측)

  • Choi, Seunghyun; Do, Myungsik
    • International Journal of Highway Engineering / v.20 no.2 / pp.57-65 / 2018
  • PURPOSES: This study aims to predict the service life of national highway asphalt pavements through deep learning methods, using maintenance history data from the National Highway Pavement Management System. METHODS: To configure the deep learning network, this study used TensorFlow 1.5, an open-source framework with excellent usability among deep learning frameworks. For the analysis, nine variables (cumulative annual average daily traffic, cumulative equivalent single axle loads, maintenance layer, surface, base, subbase, anti-frost layer, structural number of pavement, and region) were selected as input data and service life as output data to construct the input and output layers. Additionally, for scenario analysis, models were built with 1, 2, 4, and 8 hidden layers, and a simulation analysis was performed according to whether an overfitting resolution algorithm was applied. RESULTS: The analysis showed that, regardless of the number of hidden layers, applying an overfitting resolution algorithm such as dropout improves prediction capability, as the coefficient of determination ($R^2$) on the test data increases. Furthermore, the sensitivity analysis on the region variable demonstrates that estimating service life requires sufficient consideration of regional characteristics, as the maximum $R^2$ was between 0.73 and 0.84 when regional variables were taken into consideration. CONCLUSIONS: This study proposes that the service life of national highway pavement sections can be predicted precisely by considering traffic, pavement thickness, and regional factors, and concludes that the predicted service life can serve as fundamental data for decision making within pavement management systems.
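
The results above are reported as the coefficient of determination ($R^2$) on test data. A minimal sketch of the standard definition, one minus the ratio of residual to total sum of squares, is given below; the service-life values are hypothetical and not taken from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted service lives, in years.
observed = [6.0, 8.0, 10.0, 12.0]
predicted = [7.0, 8.0, 10.0, 11.0]

print(r_squared(observed, predicted))
print(r_squared(observed, [9.0] * 4))   # always predicting the mean scores 0
```

A model that always predicts the mean scores 0 and a perfect model scores 1, so a higher test-set value indicates that more of the variation in service life is explained.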