• Title/Summary/Keyword: deep machine learning

Search Result 1,085, Processing Time 0.028 seconds

Dynamic forecasts of bankruptcy with Recurrent Neural Network model (RNN(Recurrent Neural Network)을 이용한 기업부도예측모형에서 회계정보의 동적 변화 연구)

  • Kwon, Hyukkun;Lee, Dongkyu;Shin, Minsoo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.139-153
    • /
    • 2017
  • Corporate bankruptcy can cause great losses not only to stakeholders but also to many related sectors in society. Through the economic crises, bankruptcy have increased and bankruptcy prediction models have become more and more important. Therefore, corporate bankruptcy has been regarded as one of the major topics of research in business management. Also, many studies in the industry are in progress and important. Previous studies attempted to utilize various methodologies to improve the bankruptcy prediction accuracy and to resolve the overfitting problem, such as Multivariate Discriminant Analysis (MDA), Generalized Linear Model (GLM). These methods are based on statistics. Recently, researchers have used machine learning methodologies such as Support Vector Machine (SVM), Artificial Neural Network (ANN). Furthermore, fuzzy theory and genetic algorithms were used. Because of this change, many of bankruptcy models are developed. Also, performance has been improved. In general, the company's financial and accounting information will change over time. Likewise, the market situation also changes, so there are many difficulties in predicting bankruptcy only with information at a certain point in time. However, even though traditional research has problems that don't take into account the time effect, dynamic model has not been studied much. When we ignore the time effect, we get the biased results. So the static model may not be suitable for predicting bankruptcy. Thus, using the dynamic model, there is a possibility that bankruptcy prediction model is improved. In this paper, we propose RNN (Recurrent Neural Network) which is one of the deep learning methodologies. The RNN learns time series data and the performance is known to be good. Prior to experiment, we selected non-financial firms listed on the KOSPI, KOSDAQ and KONEX markets from 2010 to 2016 for the estimation of the bankruptcy prediction model and the comparison of forecasting performance. In order to prevent a mistake of predicting bankruptcy by using the financial information already reflected in the deterioration of the financial condition of the company, the financial information was collected with a lag of two years, and the default period was defined from January to December of the year. Then we defined the bankruptcy. The bankruptcy we defined is the abolition of the listing due to sluggish earnings. We confirmed abolition of the list at KIND that is corporate stock information website. Then we selected variables at previous papers. The first set of variables are Z-score variables. These variables have become traditional variables in predicting bankruptcy. The second set of variables are dynamic variable set. Finally we selected 240 normal companies and 226 bankrupt companies at the first variable set. Likewise, we selected 229 normal companies and 226 bankrupt companies at the second variable set. We created a model that reflects dynamic changes in time-series financial data and by comparing the suggested model with the analysis of existing bankruptcy predictive models, we found that the suggested model could help to improve the accuracy of bankruptcy predictions. We used financial data in KIS Value (Financial database) and selected Multivariate Discriminant Analysis (MDA), Generalized Linear Model called logistic regression (GLM), Support Vector Machine (SVM), Artificial Neural Network (ANN) model as benchmark. The result of the experiment proved that RNN's performance was better than comparative model. The accuracy of RNN was high in both sets of variables and the Area Under the Curve (AUC) value was also high. Also when we saw the hit-ratio table, the ratio of RNNs that predicted a poor company to be bankrupt was higher than that of other comparative models. However the limitation of this paper is that an overfitting problem occurs during RNN learning. But we expect to be able to solve the overfitting problem by selecting more learning data and appropriate variables. From these result, it is expected that this research will contribute to the development of a bankruptcy prediction by proposing a new dynamic model.

Real data-based active sonar signal synthesis method (실데이터 기반 능동 소나 신호 합성 방법론)

  • Yunsu Kim;Juho Kim;Jongwon Seok;Jungpyo Hong
    • The Journal of the Acoustical Society of Korea
    • /
    • v.43 no.1
    • /
    • pp.9-18
    • /
    • 2024
  • The importance of active sonar systems is emerging due to the quietness of underwater targets and the increase in ambient noise due to the increase in maritime traffic. However, the low signal-to-noise ratio of the echo signal due to multipath propagation of the signal, various clutter, ambient noise and reverberation makes it difficult to identify underwater targets using active sonar. Attempts have been made to apply data-based methods such as machine learning or deep learning to improve the performance of underwater target recognition systems, but it is difficult to collect enough data for training due to the nature of sonar datasets. Methods based on mathematical modeling have been mainly used to compensate for insufficient active sonar data. However, methodologies based on mathematical modeling have limitations in accurately simulating complex underwater phenomena. Therefore, in this paper, we propose a sonar signal synthesis method based on a deep neural network. In order to apply the neural network model to the field of sonar signal synthesis, the proposed method appropriately corrects the attention-based encoder and decoder to the sonar signal, which is the main module of the Tacotron model mainly used in the field of speech synthesis. It is possible to synthesize a signal more similar to the actual signal by training the proposed model using the dataset collected by arranging a simulated target in an actual marine environment. In order to verify the performance of the proposed method, Perceptual evaluation of audio quality test was conducted and within score difference -2.3 was shown compared to actual signal in a total of four different environments. These results prove that the active sonar signal generated by the proposed method approximates the actual signal.

Development of Autonomous Vehicle Learning Data Generation System (자율주행 차량의 학습 데이터 자동 생성 시스템 개발)

  • Yoon, Seungje;Jung, Jiwon;Hong, June;Lim, Kyungil;Kim, Jaehwan;Kim, Hyungjoo
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.19 no.5
    • /
    • pp.162-177
    • /
    • 2020
  • The perception of traffic environment based on various sensors in autonomous driving system has a direct relationship with driving safety. Recently, as the perception model based on deep neural network is used due to the development of machine learning/in-depth neural network technology, a the perception model training and high quality of a training dataset are required. However, there are several realistic difficulties to collect data on all situations that may occur in self-driving. The performance of the perception model may be deteriorated due to the difference between the overseas and domestic traffic environments, and data on bad weather where the sensors can not operate normally can not guarantee the qualitative part. Therefore, it is necessary to build a virtual road environment in the simulator rather than the actual road to collect the traning data. In this paper, a training dataset collection process is suggested by diversifying the weather, illumination, sensor position, type and counts of vehicles in the simulator environment that simulates the domestic road situation according to the domestic situation. In order to achieve better performance, the authors changed the domain of image to be closer to due diligence and diversified. And the performance evaluation was conducted on the test data collected in the actual road environment, and the performance was similar to that of the model learned only by the actual environmental data.

Development of an intelligent skin condition diagnosis information system based on social media

  • Kim, Hyung-Hoon;Ohk, Seung-Ho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.8
    • /
    • pp.241-251
    • /
    • 2022
  • Diagnosis and management of customer's skin condition is an important essential function in the cosmetics and beauty industry. As the social media environment spreads and generalizes to all fields of society, the interaction of questions and answers to various and delicate concerns and requirements regarding the diagnosis and management of skin conditions is being actively dealt with in the social media community. However, since social media information is very diverse and atypical big data, an intelligent skin condition diagnosis system that combines appropriate skin condition information analysis and artificial intelligence technology is necessary. In this paper, we developed the skin condition diagnosis system SCDIS to intelligently diagnose and manage the skin condition of customers by processing the text analysis information of social media into learning data. In SCDIS, an artificial neural network model, AnnTFIDF, that automatically diagnoses skin condition types using artificial neural network technology, a deep learning machine learning method, was built up and used. The performance of the artificial neural network model AnnTFIDF was analyzed using test sample data, and the accuracy of the skin condition type diagnosis prediction value showed a high performance of about 95%. Through the experimental and performance analysis results of this paper, SCDIS can be evaluated as an intelligent tool that can be used efficiently in the skin condition analysis and diagnosis management process in the cosmetic and beauty industry. And this study can be used as a basic research to solve the new technology trend, customized cosmetics manufacturing and consumer-oriented beauty industry technology demand.

Automatic Extraction of Hangul Stroke Element Using Faster R-CNN for Font Similarity (글꼴 유사도 판단을 위한 Faster R-CNN 기반 한글 글꼴 획 요소 자동 추출)

  • Jeon, Ja-Yeon;Park, Dong-Yeon;Lim, Seo-Young;Ji, Yeong-Seo;Lim, Soon-Bum
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.8
    • /
    • pp.953-964
    • /
    • 2020
  • Ever since media contents took over the world, the importance of typography has increased, and the influence of fonts has be n recognized. Nevertheless, the current Hangul font system is very poor and is provided passively, so it is practically impossible to understand and utilize all the shape characteristics of more than six thousand Hangul fonts. In this paper, the characteristics of Hangul font shapes were selected based on the Hangul structure of similar fonts. The stroke element detection training was performed by fine tuning Faster R-CNN Inception v2, one of the deep learning object detection models. We also propose a system that automatically extracts the stroke element characteristics from characters by introducing an automatic extraction algorithm. In comparison to the previous research which showed poor accuracy while using SVM(Support Vector Machine) and Sliding Window Algorithm, the proposed system in this paper has shown the result of 10 % accuracy to properly detect and extract stroke elements from various fonts. In conclusion, if the stroke element characteristics based on the Hangul structural information extracted through the system are used for similar classification, problems such as copyright will be solved in an era when typography's competitiveness becomes stronger, and an automated process will be provided to users for more convenience.

Performance Improvement of Convolutional Neural Network for Pulmonary Nodule Detection (폐 결절 검출을 위한 합성곱 신경망의 성능 개선)

  • Kim, HanWoong;Kim, Byeongnam;Lee, JeeEun;Jang, Won Seuk;Yoo, Sun K.
    • Journal of Biomedical Engineering Research
    • /
    • v.38 no.5
    • /
    • pp.237-241
    • /
    • 2017
  • Early detection of the pulmonary nodule is important for diagnosis and treatment of lung cancer. Recently, CT has been used as a screening tool for lung nodule detection. And, it has been reported that computer aided detection(CAD) systems can improve the accuracy of the radiologist in detection nodules on CT scan. The previous study has been proposed a method using Convolutional Neural Network(CNN) in Lung CAD system. But the proposed model has a limitation in accuracy due to its sparse layer structure. Therefore, we propose a Deep Convolutional Neural Network to overcome this limitation. The model proposed in this work is consist of 14 layers including 8 convolutional layers and 4 fully connected layers. The CNN model is trained and tested with 61,404 regions-of-interest (ROIs) patches of lung image including 39,760 nodules and 21,644 non-nodules extracted from the Lung Image Database Consortium(LIDC) dataset. We could obtain the classification accuracy of 91.79% with the CNN model presented in this work. To prevent overfitting, we trained the model with Augmented Dataset and regularization term in the cost function. With L1, L2 regularization at Training process, we obtained 92.39%, 92.52% of accuracy respectively. And we obtained 93.52% with data augmentation. In conclusion, we could obtain the accuracy of 93.75% with L2 Regularization and Data Augmentation.

Hybrid dropout (하이브리드 드롭아웃)

  • Park, Chongsun;Lee, MyeongGyu
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.6
    • /
    • pp.899-908
    • /
    • 2019
  • Massive in-depth neural networks with numerous parameters are powerful machine learning methods, but they have overfitting problems due to the excessive flexibility of the models. Dropout is one methods to overcome the problem of oversized neural networks. It is also an effective method that randomly drops input and hidden nodes from the neural network during training. Every sample is fed to a thinned network from an exponential number of different networks. In this study, instead of feeding one sample for each thinned network, two or more samples are used in fitting for one thinned network known as a Hybrid Dropout. Simulation results using real data show that the new method improves the stability of estimates and reduces the minimum error for the verification data.

Impurity profiling and chemometric analysis of methamphetamine seizures in Korea

  • Shin, Dong Won;Ko, Beom Jun;Cheong, Jae Chul;Lee, Wonho;Kim, Suhkmann;Kim, Jin Young
    • Analytical Science and Technology
    • /
    • v.33 no.2
    • /
    • pp.98-107
    • /
    • 2020
  • Methamphetamine (MA) is currently the most abused illicit drug in Korea. MA is produced by chemical synthesis, and the final target drug that is produced contains small amounts of the precursor chemicals, intermediates, and by-products. To identify and quantify these trace compounds in MA seizures, a practical and feasible approach for conducting chromatographic fingerprinting with a suite of traditional chemometric methods and recently introduced machine learning approaches was examined. This was achieved using gas chromatography (GC) coupled with a flame ionization detector (FID) and mass spectrometry (MS). Following appropriate examination of all the peaks in 71 samples, 166 impurities were selected as the characteristic components. Unsupervised (principal component analysis (PCA), hierarchical cluster analysis (HCA), and K-means clustering) and supervised (partial least squares-discriminant analysis (PLS-DA), orthogonal partial least squares-discriminant analysis (OPLS-DA), support vector machines (SVM), and deep neural network (DNN) with Keras) chemometric techniques were employed for classifying the 71 MA seizures. The results of the PCA, HCA, K-means clustering, PLS-DA, OPLS-DA, SVM, and DNN methods for quality evaluation were in good agreement. However, the tested MA seizures possessed distinct features, such as chirality, cutting agents, and boiling points. The study indicated that the established qualitative and semi-quantitative methods will be practical and useful analytical tools for characterizing trace compounds in illicit MA seizures. Moreover, they will provide a statistical basis for identifying the synthesis route, sources of supply, trafficking routes, and connections between seizures, which will support drug law enforcement agencies in their effort to eliminate organized MA crime.

Mortality Prediction of Older Adults Admitted to the Emergency Department (응급실 방문 노인 환자의 사망률 예측)

  • Park, Junhyeok;Lee, Songwook
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.7 no.7
    • /
    • pp.275-280
    • /
    • 2018
  • As the global population becomes aging, the demand for health services for the elderly is expected to increase. In particular, The elderly visiting the emergency department sometimes have complex medical, social, and physical problems, such as having a variety of illnesses or complaints of unusual symptoms. The proposed system is designed to predict the mortality of the elderly patients who are over 65 years old and have admitted the emergency department. For mortality prediction, we compare the support vector machines and Feed Forward Neural Network (FFNN) trained with medical data such as age, sex, blood pressure, body temperature, etc. The results of the FFNN with a hidden layer are best in the mortality prediction, and F1 score and the AUC is 52.0%, 88.6% respectively. If we improve the performance of the proposed system by extracting better medical features, we will be able to provide better medical services through an effective and quick allocation of medical resources for the elderly patients visiting the emergency department.

DCNN Optimization Using Multi-Resolution Image Fusion

  • Alshehri, Abdullah A.;Lutz, Adam;Ezekiel, Soundararajan;Pearlstein, Larry;Conlen, John
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.11
    • /
    • pp.4290-4309
    • /
    • 2020
  • In recent years, advancements in machine learning capabilities have allowed it to see widespread adoption for tasks such as object detection, image classification, and anomaly detection. However, despite their promise, a limitation lies in the fact that a network's performance quality is based on the data which it receives. A well-trained network will still have poor performance if the subsequent data supplied to it contains artifacts, out of focus regions, or other visual distortions. Under normal circumstances, images of the same scene captured from differing points of focus, angles, or modalities must be separately analysed by the network, despite possibly containing overlapping information such as in the case of images of the same scene captured from different angles, or irrelevant information such as images captured from infrared sensors which can capture thermal information well but not topographical details. This factor can potentially add significantly to the computational time and resources required to utilize the network without providing any additional benefit. In this study, we plan to explore using image fusion techniques to assemble multiple images of the same scene into a single image that retains the most salient key features of the individual source images while discarding overlapping or irrelevant data that does not provide any benefit to the network. Utilizing this image fusion step before inputting a dataset into the network, the number of images would be significantly reduced with the potential to improve the classification performance accuracy by enhancing images while discarding irrelevant and overlapping regions.