• Title/Summary/Keyword: deep learning models


Topic Modeling Insomnia Social Media Corpus using BERTopic and Building Automatic Deep Learning Classification Model (BERTopic을 활용한 불면증 소셜 데이터 토픽 모델링 및 불면증 경향 문헌 딥러닝 자동분류 모델 구축)

  • Ko, Young Soo;Lee, Soobin;Cha, Minjung;Kim, Seongdeok;Lee, Juhee;Han, Ji Yeong;Song, Min
    • Journal of the Korean Society for Information Management / v.39 no.2 / pp.111-129 / 2022
  • Insomnia is a chronic disease in modern society, with the number of new patients increasing by more than 20% in the last 5 years. Insomnia is a serious disease that requires diagnosis and treatment, because the individual and social problems that arise from a lack of sleep are severe and the triggers of insomnia are complex. This study collected 5,699 posts from 'insomnia', a community on the social media platform 'Reddit' where users freely express their opinions. Based on the International Classification of Sleep Disorders (ICSD-3) standard and guidelines prepared with the help of experts, an insomnia corpus was constructed by tagging each document as an insomnia-tendency document or a non-insomnia-tendency document. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained using the constructed insomnia corpus as training data. In the performance evaluation, RoBERTa showed the highest performance with an accuracy of 81.33%. For an in-depth analysis of the insomnia social data, topic modeling was performed using BERTopic, a recently introduced method that supplements the weaknesses of LDA, which has been widely used in the past. The analysis identified eight topic groups ('Negative emotions', 'Advice and help and gratitude', 'Insomnia-related diseases', 'Sleeping pills', 'Exercise and eating habits', 'Physical characteristics', 'Activity characteristics', 'Environmental characteristics'). Users expressed negative emotions and sought help and advice from the Reddit insomnia community. In addition, they mentioned diseases related to insomnia, shared discourse on the use of sleeping pills, and expressed interest in exercise and eating habits. Among insomnia-related characteristics, we found physical characteristics such as breathing, pregnancy, and the heart; activity characteristics such as feeling like a zombie, hypnic jerks, and grogginess; and environmental characteristics such as sunlight, blankets, temperature, and naps.
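
    The abstract above combines BERTopic topic modeling with fine-tuned transformer classifiers. A minimal sketch of that kind of pipeline, assuming the bertopic and Hugging Face transformers libraries and a placeholder list `posts` of collected Reddit texts (not the authors' exact configuration), might look like this:

```python
from bertopic import BERTopic
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# posts: list of Reddit post texts (placeholder for the collected corpus)
posts = ["I haven't slept more than 3 hours a night this week...", "..."]

# Topic modeling with BERTopic (document embeddings + clustering + c-TF-IDF)
topic_model = BERTopic(language="english")
topics, probs = topic_model.fit_transform(posts)
print(topic_model.get_topic_info().head())        # inspect the discovered topic groups

# Binary classifier: insomnia-tendency vs. non-insomnia-tendency documents,
# to be fine-tuned on the tagged insomnia corpus
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
classifier = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)
```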

Application of deep learning method for decision making support of dam release operation (댐 방류 의사결정지원을 위한 딥러닝 기법의 적용성 평가)

  • Jung, Sungho;Le, Xuan Hien;Kim, Yeonsu;Choi, Hyungu;Lee, Giha
    • Journal of Korea Water Resources Association / v.54 no.spc1 / pp.1095-1105 / 2021
  • More advanced dam operation is increasingly required to cope with the rainy season, typhoons, and torrential rains. In addition, physical models based on specific rules may have limitations in controlling the release discharge of a dam due to inherent uncertainty and complex factors. This study aims to forecast the water level of the station nearest to the dam multiple timesteps ahead using an LSTM (Long Short-Term Memory) deep learning model, and to evaluate its applicability to decisions on the dam's release discharge. The LSTM model was trained and tested on eight data sets with a 1-hour temporal resolution, including the primary data used in dam operation and downstream water level station data covering about 13 years (2009~2021). The trained model forecasted the water level time series for six lead times (1, 3, 6, 9, 12, and 18 hours), and the forecasts were compared and analyzed against the observed data. The 1-hour-ahead predictions exhibited the best performance for all cases, with an average MAE of 0.01 m, RMSE of 0.015 m, and NSE of 0.99. As the lead time increases, the predictive performance of the model tends to decrease slightly. The model reliably reproduces the temporal pattern of the observed water level. Thus, it is judged that the LSTM model can extract the characteristics of complex, non-linear hydrological data and can be used to determine the release discharge of the dam when simulating its operation.
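
    As a rough illustration of the lead-time setup described above (not the authors' code), the sketch below builds hourly sliding-window sequences that pair a fixed history with the water level a chosen number of hours ahead and fits a small Keras LSTM; the 24-hour input window and the single input variable are assumptions.

```python
import numpy as np
import tensorflow as tf

def make_sequences(series, window=24, lead=1):
    """Pair each `window`-hour history with the value `lead` hours ahead."""
    X, y = [], []
    for i in range(len(series) - window - lead + 1):
        X.append(series[i:i + window])
        y.append(series[i + window + lead - 1])
    return np.array(X)[..., np.newaxis], np.array(y)

water_level = np.random.rand(10_000)                   # placeholder for the hourly record
X, y = make_sequences(water_level, window=24, lead=6)  # e.g. a 6-hour lead time

model = tf.keras.Sequential([
    tf.keras.Input(shape=(24, 1)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=64, verbose=0)
```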

Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.1-19 / 2018
  • A large amount of data is now available for the research and business sectors to extract knowledge from. These data can take the form of unstructured data such as audio, text, and images, and can be analyzed with deep learning methods. Deep learning is now widely used for various estimation, classification, and prediction problems. In particular, the fashion business adopts deep learning techniques for apparel recognition, apparel search and retrieval engines, and automatic product recommendation. The core model of these applications is image classification using Convolutional Neural Networks (CNN). A CNN is made up of neurons that learn parameters such as weights as inputs pass through the network to the outputs. Its layer structure is well suited to image classification, as it comprises convolutional layers for generating feature maps, pooling layers for reducing the dimensionality of the feature maps, and fully connected layers for classifying the extracted features. However, most classification models have been trained on online product images, which are taken under controlled conditions, such as images of the apparel itself or of professional models wearing the apparel. Such images may not be effective for training a classification model when one wants to classify street fashion or walking images, which are taken in uncontrolled situations and involve people's movement and unexpected poses. Therefore, we propose training the model with a runway apparel image dataset, which captures mobility. This allows the classification model to be trained on far more variable data and enhances its adaptation to diverse query images. To achieve both convergence and generalization of the model, we apply transfer learning to our training network. As transfer learning in CNNs consists of pre-training and fine-tuning stages, we divide training into two steps. First, we pre-train our architecture on the large-scale ImageNet dataset, which consists of 1.2 million images in 1,000 categories including animals, plants, activities, materials, instruments, scenes, and foods. We use GoogLeNet as our main architecture, as it achieved high accuracy with efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network on our own runway image dataset. Because no publicly available runway dataset existed, we collected the dataset from Google Image Search, obtaining 2,426 images of 32 major fashion brands including Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We perform 10-fold experiments to account for the random generation of training data, and our proposed model achieves an accuracy of 67.2% on the final test. Our research offers several advantages over previous related studies: to the best of our knowledge, no previous study has trained a network for apparel image classification on a runway image dataset. We suggest training the model with images that capture all possible postures, which we denote as mobility, by using our own runway apparel image dataset.
Moreover, by applying transfer learning and using the checkpoint and parameters provided by TensorFlow-Slim, we could save time spent training the classification model, taking about 6 minutes per experiment to train the classifier. This model can be used in many business applications where the query image may be a runway image, a product image, or a street fashion image. Specifically, runway query images can support a mobile application service during fashion week to facilitate brand search, street-style query images can be classified and labeled by brand or style during fashion editorial tasks, and website query images can be processed by e-commerce services to provide item information or recommend similar items.
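
    A compact sketch of the pre-train/fine-tune split described above, using InceptionV3 from tf.keras.applications as a stand-in for the GoogLeNet/TF-Slim checkpoint the paper used (the 32-way output head matches the brand count; everything else, including the learning rates and the `runway_train_ds` dataset name, is an assumption):

```python
import tensorflow as tf

# Stage 1: reuse an ImageNet-pre-trained backbone with its weights frozen
base = tf.keras.applications.InceptionV3(weights="imagenet",
                                          include_top=False, pooling="avg")
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(32, activation="softmax"),   # 32 fashion brands
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(runway_train_ds, ...)   # runway_train_ds: the collected runway images

# Stage 2: fine-tune the whole network at a lower learning rate
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(runway_train_ds, ...)
```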

Visual Classification of Wood Knots Using k-Nearest Neighbor and Convolutional Neural Network (k-Nearest Neighbor와 Convolutional Neural Network에 의한 제재목 표면 옹이 종류의 화상 분류)

  • Kim, Hyunbin;Kim, Mingyu;Park, Yonggun;Yang, Sang-Yun;Chung, Hyunwoo;Kwon, Ohkyung;Yeo, Hwanmyeong
    • Journal of the Korean Wood Science and Technology / v.47 no.2 / pp.229-238 / 2019
  • Various wood defects occur during tree growth or wood processing. Thus, to use wood practically, it is necessary to objectively assess its quality according to usage requirements by accurately classifying its defects. However, manual visual grading and species classification may yield inconsistent results due to subjective decisions; therefore, computer-vision-based image analysis is required for the objective evaluation of wood quality and for speeding up wood production. In this study, SIFT+k-NN and CNN models were used to implement automatic knot classification, and their accuracies were analyzed. Toward this end, a total of 1,172 knot images of various shapes from five domestic conifers were used for training and validation. For the SIFT+k-NN model, SIFT was used to extract features from the knot images and k-NN was used for classification, achieving an accuracy of up to 60.53% when k was 17. The CNN model comprised 8 convolution layers and 3 hidden layers, and its maximum accuracy was 88.09% after 1,205 epochs, which was higher than that of the SIFT+k-NN model. Moreover, when the number of images differed greatly between knot types, the SIFT+k-NN model tended to learn a bias toward the knot types with more images, whereas the CNN model did not show a drastic bias regardless of the difference in image counts. Therefore, the CNN model showed better performance in knot classification, and its accuracy is judged to be sufficient for practical application.
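
    The CNN described above (8 convolution layers and 3 hidden layers) could be sketched roughly as follows in Keras; the filter counts, dense widths, input size, pooling placement, and number of knot classes are assumptions, not the authors' exact architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_knot_cnn(num_classes=5, input_shape=(128, 128, 3)):
    """CNN with 8 convolution layers and 3 hidden (dense) layers."""
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=input_shape))
    for filters in (32, 64, 128, 256):            # 4 blocks x 2 convs = 8 conv layers
        for _ in range(2):
            model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D())
    model.add(layers.Flatten())
    for units in (512, 256, 128):                 # 3 hidden dense layers
        model.add(layers.Dense(units, activation="relu"))
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

model = build_knot_cnn(num_classes=5)   # num_classes is a placeholder for the knot types
model.summary()
```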

LSTM Prediction of Streamflow during Peak Rainfall of Piney River (LSTM을 이용한 Piney River유역의 최대강우시 유량예측)

  • Kareem, Kola Yusuff;Seong, Yeonjeong;Jung, Younghun
    • Journal of Korean Society of Disaster and Security / v.14 no.4 / pp.17-27 / 2021
  • Streamflow prediction is a vital disaster mitigation approach for effective flood management and water resources planning. Lately, torrential rainfall driven by climate change has been reported to have increased globally, causing enormous losses of infrastructure, property, and lives. This study evaluates the contribution of rainfall to streamflow prediction in normal and peak rainfall scenarios, typical of the recent flood at Piney Resort in Vernon, Hickman County, Tennessee, United States. Daily streamflow, water level, and rainfall data for 20 years (2000-2019) from two USGS gage stations (03602500 upstream and 03599500 downstream) of the Piney River watershed were obtained, preprocessed, and fitted with a Long Short-Term Memory (LSTM) model. The TensorFlow and Keras machine learning frameworks were used with Python to predict streamflow values with a sequence size of 14 days, to determine whether the model could have predicted the flooding event of August 21, 2021. Model skill analysis showed that the LSTM model with the full data (water level, streamflow, and rainfall) performed better than the naive model, except for some rainfall-only models, indicating that rainfall alone is insufficient for streamflow prediction. The final LSTM model recorded optimal NSE and RMSE values of 0.68 and 13.84 m3/s and predicted the peak flow with the lowest prediction error of 11.6%, indicating that the final model could have predicted the flood on August 24, 2021 given a peak rainfall scenario. Adequate knowledge of rainfall patterns will guide hydrologists and disaster prevention managers in designing efficient early warning systems and policies aimed at mitigating flood risks.
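
    For reference, the two skill scores reported above can be computed directly; a small helper (assuming NumPy arrays of observed and simulated daily flows) is sketched below:

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1.0 is a perfect fit; 0.0 means the model is
    no better than always predicting the mean of the observations."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    return 1.0 - np.sum((observed - simulated) ** 2) / np.sum((observed - observed.mean()) ** 2)

def rmse(observed, simulated):
    """Root-mean-square error in the units of the flow series (here m3/s)."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    return float(np.sqrt(np.mean((observed - simulated) ** 2)))
```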

Conformer with lexicon transducer for Korean end-to-end speech recognition (Lexicon transducer를 적용한 conformer 기반 한국어 end-to-end 음성인식)

  • Son, Hyunsoo;Park, Hosung;Kim, Gyujin;Cho, Eunsoo;Kim, Ji-Hwan
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.530-536 / 2021
  • Recently, owing to the development of deep learning, end-to-end speech recognition, which directly maps speech signals to graphemes, shows good performance. In particular, among end-to-end models, the conformer shows the best performance. However, end-to-end models focus only on the probability of which grapheme will appear at each time step. The decoding process uses greedy search or beam search, which is easily affected by the final probabilities output by the model. In addition, end-to-end models cannot use external pronunciation and language information due to structural limitations. Therefore, this paper proposes a conformer with a lexicon transducer. We compare a phoneme-based model with a lexicon transducer against a grapheme-based model with beam search. The test set consists of words that do not appear in the training data. The grapheme-based conformer with beam search shows a CER of 3.8%, while the phoneme-based conformer with the lexicon transducer shows a CER of 3.4%.
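
    To make the decoding discussion concrete, here is a minimal greedy-search decode over per-frame grapheme probabilities, the simplest of the strategies mentioned above; the CTC-style blank/repeat collapsing is my assumption and not necessarily the paper's exact decoding setup.

```python
import numpy as np

def greedy_decode(log_probs, blank_id=0):
    """Greedy search over per-frame grapheme log-probabilities.
    log_probs: (time, vocab) array. Collapses repeats and drops blanks (CTC-style)."""
    best = np.argmax(log_probs, axis=-1)
    collapsed = [t for i, t in enumerate(best) if i == 0 or t != best[i - 1]]
    return [int(t) for t in collapsed if t != blank_id]

# Example: 5 frames over a 4-symbol vocabulary (symbol 0 = blank)
frames = np.log(np.array([[0.7, 0.1, 0.1, 0.1],
                          [0.1, 0.8, 0.05, 0.05],
                          [0.1, 0.8, 0.05, 0.05],
                          [0.6, 0.2, 0.1, 0.1],
                          [0.1, 0.1, 0.1, 0.7]]))
print(greedy_decode(frames))   # -> [1, 3]
```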

Comparative Analysis of Anomaly Detection Models using AE and Suggestion of Criteria for Determining Outliers

  • Kang, Gun-Ha;Sohn, Jung-Mo;Sim, Gun-Wu
    • Journal of the Korea Society of Computer and Information / v.26 no.8 / pp.23-30 / 2021
  • In this study, we present a comparative analysis of major autoencoder (AE)-based anomaly detection methods for quality determination in manufacturing processes, together with a new criterion for discriminating anomalies. Due to the characteristics of manufacturing sites, anomalous instances are few and their types vary greatly. These properties degrade the performance of AI-based anomaly detection models that require datasets of both normal and anomalous cases, and obtaining additional data to improve performance incurs considerable time and cost. To solve this problem, studies are underway on AE-based models such as the AE and VAE, which perform anomaly detection using only normal data. In this work, based on convolutional AE, VAE, and dilated VAE models, statistics on residual images, MSE, and information entropy were selected as outlier discrimination criteria to compare and analyze the performance of each model. In particular, the range statistic applied to the convolutional AE model showed the best performance, with an AUC-PRC of 0.9570, F1 score of 0.8812, AUC-ROC of 0.9548, and accuracy of 87.60%. This is an accuracy improvement of about 20 percentage points over MSE, which has frequently been used as the criterion for determining outliers, and confirms that model performance can be improved by the choice of outlier discrimination criterion.
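
    A minimal sketch of the convolutional AE variant and the residual-based scores discussed above (the layer sizes, input shape, and the exact form of the "range" statistic are assumptions on my part, not the authors' implementation):

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_conv_ae(input_shape=(64, 64, 1)):
    """Convolutional autoencoder trained only on images of normal products."""
    inputs = tf.keras.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
    outputs = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)
    ae = tf.keras.Model(inputs, outputs)
    ae.compile(optimizer="adam", loss="mse")
    return ae

def anomaly_scores(ae, images):
    """Per-image scores on the residual (input minus reconstruction):
    mean squared error and the range (max - min) of the residual image."""
    residual = tf.abs(images - ae.predict(images, verbose=0))
    mse = tf.reduce_mean(tf.square(residual), axis=[1, 2, 3])
    rng = tf.reduce_max(residual, axis=[1, 2, 3]) - tf.reduce_min(residual, axis=[1, 2, 3])
    return mse.numpy(), rng.numpy()
```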

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.163-177 / 2019
  • As smartphones become widely used, human activity recognition (HAR) tasks for recognizing the personal activities of smartphone users with multimodal data have been actively studied. The research area is expanding from the recognition of simple body movements of an individual user to the recognition of low-level and high-level behaviors. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have received less attention so far. Previous research on recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. In contrast, physical sensors, including accelerometer, magnetic field, and gyroscope sensors, are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, a deep learning-based method for detecting accompanying status using only multimodal physical sensor data, such as accelerometer, magnetic field, and gyroscope data, is proposed. The accompanying status is defined as a subset of user interaction behavior, covering whether the user is accompanying an acquaintance at a close distance and whether the user is actively communicating with that acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation is proposed. First, a data preprocessing method is introduced, consisting of time synchronization of multimodal data from different physical sensors, data normalization, and sequence data generation. We applied nearest-neighbor interpolation to synchronize the timestamps of data collected from different sensors. Normalization was performed for each x, y, and z axis value of the sensor data, and the sequence data were generated with a sliding window. The sequence data then become the input of the CNN, where feature maps representing local dependencies of the original sequence are extracted. The CNN consists of 3 convolutional layers and has no pooling layer, in order to maintain the temporal information of the sequence data. Next, LSTM recurrent networks receive the feature maps, learn long-term dependencies from them, and extract features. The LSTM recurrent networks consist of two layers, each with 128 cells. Finally, the extracted features are used for classification by a softmax classifier. The loss function of the model is the cross-entropy function, and the weights are randomly initialized from a normal distribution with a mean of 0 and a standard deviation of 0.1. The model is trained using the adaptive moment estimation (ADAM) optimization algorithm with a mini-batch size of 128. We apply dropout to the input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate is set to 0.001 and decreases exponentially by a factor of 0.99 at the end of each training epoch. An Android smartphone application was developed and released to collect data, and smartphone data were collected from a total of 18 subjects. Using these data, the model classified accompanying and conversation with accuracies of 98.74% and 98.83%, respectively. Both the F1 score and accuracy of the model were higher than those of a majority-vote classifier, a support vector machine, and a deep recurrent neural network.
In future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize timestamp differences. In addition, we will further study transfer learning methods that enable models trained on the training data to be transferred to evaluation data that follows a different distribution. We expect to obtain a model that exhibits robust recognition performance against changes in the data that were not considered in the training stage.
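
    Most hyperparameters of the described framework are stated in the abstract; a rough Keras reconstruction (the window length, channel count, convolution filter sizes, dropout rate, and decay_steps value are assumptions) is sketched below:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_cnn_lstm(seq_len=128, n_channels=9, n_classes=2):
    """3 conv layers with no pooling -> dropout -> 2 LSTM layers (128 cells) -> softmax."""
    inputs = tf.keras.Input(shape=(seq_len, n_channels))   # x/y/z axes of 3 physical sensors
    x = inputs
    for filters in (64, 64, 64):                    # temporal length is preserved (no pooling)
        x = layers.Conv1D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Dropout(0.5)(x)                      # dropout applied to the LSTM inputs
    x = layers.LSTM(128, return_sequences=True)(x)
    x = layers.LSTM(128)(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)

    lr = tf.keras.optimizers.schedules.ExponentialDecay(
        1e-3, decay_steps=1000, decay_rate=0.99)    # decay_steps: placeholder for batches per epoch
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

model = build_cnn_lstm()
model.summary()
```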

Warning Classification Method Based On Artificial Neural Network Using Topics of Source Code (소스코드 주제를 이용한 인공신경망 기반 경고 분류 방법)

  • Lee, Jung-Been
    • KIPS Transactions on Computer and Communication Systems / v.9 no.11 / pp.273-280 / 2020
  • Automatic static analysis tools help developers quickly find potential defects in source code with less effort. However, these tools report a large number of false positive warnings that do not need to be fixed. In our study, we propose an artificial neural network-based warning classification method using topic models of source code blocks. We collect bug-fixing revisions from a software change management (SCM) system and extract the code blocks modified by developers. In the deep learning stage, the topic distribution values of the code blocks and binary labels indicating whether the warnings in those blocks were removed are used as the input and target data of a simple artificial neural network, respectively. In our experiments, the neural network-based warning classification model shows very high performance in predicting whether a warning is a true or false positive.
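
    Since the classifier's inputs are just topic-distribution vectors with binary targets, a minimal version of such a network (the topic count and hidden width are assumptions, as are the placeholder array names in the usage comment) might be:

```python
import tensorflow as tf

def build_warning_classifier(n_topics=50):
    """Simple ANN: topic distribution of a code block -> probability that the
    warning is actionable (i.e., was removed by a bug-fixing revision)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_topics,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# model = build_warning_classifier(n_topics=50)
# model.fit(topic_distributions, warning_removed_labels, epochs=20, batch_size=32)
```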

Expansion of Word Representation for Named Entity Recognition Based on Bidirectional LSTM CRFs (Bidirectional LSTM CRF 기반의 개체명 인식을 위한 단어 표상의 확장)

  • Yu, Hongyeon;Ko, Youngjoong
    • Journal of KIISE / v.44 no.3 / pp.306-313 / 2017
  • Named entity recognition (NER) seeks to locate and classify named entities in text into pre-defined categories such as names of persons, organizations, locations, expressions of time, etc. Recently, many state-of-the-art NER systems have been implemented with bidirectional LSTM CRFs. Deep learning models based on long short-term memory (LSTM) generally depend on word representations as input. In this paper, we propose an approach to expand the word representation by using pre-trained word embeddings, part-of-speech (POS) tag embeddings, syllable embeddings, and named entity dictionary feature vectors. Our experiments show that the proposed approach creates useful word representations as input to bidirectional LSTM CRFs. Our final model performs 8.05%p better than baseline NER systems that use only the pre-trained word embedding vectors.
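
    A toy sketch of the word-representation expansion itself, i.e. concatenating the four feature vectors per word before they enter the bidirectional LSTM CRF (mean-pooling the syllable embeddings is my assumption, and the dimensions in the example are illustrative only; the LSTM CRF layers are omitted):

```python
import numpy as np

def expand_word_representation(word_vec, pos_vec, syllable_vecs, dict_feature):
    """Concatenate the pre-trained word embedding, POS tag embedding, a pooled
    syllable embedding, and the named-entity dictionary feature vector."""
    syllable_vec = np.mean(np.asarray(syllable_vecs), axis=0)   # pool per-syllable vectors
    return np.concatenate([word_vec, pos_vec, syllable_vec, dict_feature])

# Example shapes: 100-d word embedding, 10-d POS, 50-d syllable, 4-d dictionary feature
x = expand_word_representation(np.zeros(100), np.zeros(10),
                               [np.zeros(50), np.zeros(50)], np.zeros(4))
print(x.shape)   # (164,)
```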