• Title/Summary/Keyword: deep machine learning


Experiment and Implementation of a Machine-Learning Based k-Value Prediction Scheme in a k-Anonymity Algorithm (k-익명화 알고리즘에서 기계학습 기반의 k값 예측 기법 실험 및 구현)

  • Muh, Kumbayoni Lalu; Jang, Sung-Bong
    • KIPS Transactions on Computer and Communication Systems / v.9 no.1 / pp.9-16 / 2020
  • The k-anonymity scheme has been widely used to protect private information when big data are distributed to a third party for research purposes. When the scheme is applied, determining an optimal k value is one of the difficult problems to resolve because many factors must be considered. Currently, the determination is done almost manually by human experts relying on intuition, which degrades the performance of the anonymization and makes the task time-consuming and costly. To overcome this problem, a simple idea based on machine learning has been proposed. This paper describes implementations and experiments to realize the proposed idea. In this work, a deep neural network (DNN) is implemented using TensorFlow libraries and is trained and tested on an input dataset. The experimental results show that the trend of training errors follows a typical DNN pattern, but the validation errors of our model show a pattern different from the one seen in a typical training process. The advantage of the proposed approach is that it can reduce the time and cost required for experts to determine the k value, because the determination can be done semi-automatically.
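
The abstract does not give the network details; as a rough illustration of the kind of DNN regressor described (TensorFlow-based, mapping dataset characteristics to a k value), a minimal sketch might look like the following. The feature set, layer sizes, and training settings are assumptions for illustration, not the authors' configuration.

```python
# Minimal sketch of a DNN regressor for predicting a k value from dataset
# characteristics, assuming TensorFlow/Keras. Feature names, layer sizes and
# hyperparameters are illustrative assumptions, not the authors' configuration.
import numpy as np
import tensorflow as tf

# Hypothetical input features describing the dataset to be anonymized,
# e.g. number of records, number of quasi-identifiers, attribute entropy, ...
n_features = 8

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_features,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),          # predicted k value (regression output)
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

# Dummy data stands in for the paper's input dataset.
X = np.random.rand(1000, n_features).astype("float32")
y = np.random.randint(2, 50, size=(1000, 1)).astype("float32")

history = model.fit(X, y, validation_split=0.2, epochs=20, batch_size=32, verbose=0)
print("final training MAE:", history.history["mae"][-1])
```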

Machine Learning-based Classification of Hyperspectral Imagery

  • Haq, Mohd Anul; Rehman, Ziaur; Ahmed, Ahsan; Khan, Mohd Abdul Rahim
    • International Journal of Computer Science & Network Security / v.22 no.4 / pp.193-202 / 2022
  • The classification of hyperspectral imagery (HSI) is essential for Earth surface observation. Due to the large number of contiguous bands, HSI data provide rich information about the object of study; however, they suffer from the curse of dimensionality. Dimensionality reduction is an essential aspect of machine learning classification. Algorithms based on feature extraction can overcome the dimensionality issue, allowing the classifiers to use comprehensive models with reduced computational cost. This paper assesses and compares three HSI classification techniques. The first is based on the Joint Spatial-Spectral Stacked Autoencoder (JSSSA) method, the second on a shallow Artificial Neural Network (SNN), and the third on the SVM model. The performance of the JSSSA technique is better than that of the SNN classification technique in terms of overall accuracy and Kappa coefficient values. We observed that the JSSSA based method surpasses the SNN technique with an overall accuracy of 96.13% and a Kappa coefficient of 0.95. The SNN also achieved a good accuracy of 92.40% and a Kappa coefficient of 0.90, and the SVM achieved an accuracy of 82.87%. The current study suggests that both the JSSSA and SNN based techniques are efficient methods for hyperspectral classification of snow features. This work classified the labeled/ground-truth datasets of snow into multiple classes. The labeled/ground-truth data can be valuable for applying deep neural networks such as CNN, hybrid CNN, and RNN to glaciology and snow-related hazard applications.
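
As an illustration of the per-pixel baseline comparison described above (shallow neural network versus SVM on band vectors), the sketch below uses scikit-learn with placeholder data; the JSSSA autoencoder pipeline itself is not reproduced, and the band count, class count, and hyperparameters are assumptions.

```python
# Illustrative per-pixel baseline comparing a shallow neural network (SNN)
# and an SVM on hyperspectral band vectors. Band count, class count and
# hyperparameters are assumptions; the paper's JSSSA pipeline is not shown.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, cohen_kappa_score

n_pixels, n_bands, n_classes = 5000, 200, 5   # placeholder HSI dimensions
X = np.random.rand(n_pixels, n_bands)
y = np.random.randint(0, n_classes, n_pixels) # placeholder ground-truth labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

snn = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300).fit(X_tr, y_tr)
svm = SVC(kernel="rbf").fit(X_tr, y_tr)

for name, clf in [("SNN", snn), ("SVM", svm)]:
    pred = clf.predict(X_te)
    print(name, "OA:", accuracy_score(y_te, pred),
          "Kappa:", cohen_kappa_score(y_te, pred))
```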

Application of POD reduced-order algorithm on data-driven modeling of rod bundle

  • Kang, Huilun; Tian, Zhaofei; Chen, Guangliang; Li, Lei; Wang, Tianyu
    • Nuclear Engineering and Technology / v.54 no.1 / pp.36-48 / 2022
  • As a valid numerical method for obtaining high-resolution results of a flow field, computational fluid dynamics (CFD) has been widely used to study coolant flow and heat transfer characteristics in fuel rod bundles. However, the time-consuming, iterative calculation of the Navier-Stokes equations makes CFD unsuitable for scenarios that require efficient simulation, such as sensitivity analysis and uncertainty quantification. To solve this problem, a reduced-order model (ROM) based on proper orthogonal decomposition (POD) and machine learning (ML) is proposed to simulate the flow field efficiently. Firstly, a validated CFD model is established to produce the flow field dataset of the rod bundle. Secondly, based on the POD method, the modes and corresponding coefficients of the flow field are extracted. Then, a deep feed-forward neural network, chosen for its efficiency in approximating arbitrary functions and its ability to handle high-dimensional, strongly nonlinear problems, is selected to model the nonlinear relationship between the boundary conditions and the mode coefficients. A trained surrogate model for mode coefficient prediction is obtained after a certain number of training iterations. Finally, the flow field is reconstructed by combining the product of the POD basis and the coefficients. Based on the test dataset, an evaluation of the ROM is carried out. The evaluation results show that the proposed POD-ROM accurately describes the flow status of the fluid field in rod bundles with high resolution in only a few milliseconds.
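
The POD-plus-network workflow described above can be summarized in a short sketch: decompose snapshot data into modes, train a network to map boundary conditions to mode coefficients, and reconstruct the field from the predicted coefficients. The sketch below uses an MLPRegressor as a stand-in for the deep feed-forward network, and all dimensions and data are placeholders.

```python
# Sketch of a POD + neural-network reduced-order model: extract POD modes from
# snapshot data, learn a mapping from boundary conditions to mode coefficients,
# then reconstruct the field. Snapshot counts, mode count and the regressor
# are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

n_cells, n_snapshots, n_modes = 2000, 100, 10
snapshots = np.random.rand(n_cells, n_snapshots)       # placeholder CFD fields
bcs = np.random.rand(n_snapshots, 3)                   # placeholder boundary conditions

# 1. POD via singular value decomposition of the snapshot matrix.
mean_field = snapshots.mean(axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(snapshots - mean_field, full_matrices=False)
modes = U[:, :n_modes]                                 # POD basis
coeffs = modes.T @ (snapshots - mean_field)            # shape (n_modes, n_snapshots)

# 2. Learn boundary conditions -> mode coefficients with a feed-forward network.
surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
surrogate.fit(bcs, coeffs.T)

# 3. Reconstruct the flow field for a new boundary condition.
new_bc = np.random.rand(1, 3)
predicted_coeffs = surrogate.predict(new_bc)           # shape (1, n_modes)
reconstructed = mean_field[:, 0] + modes @ predicted_coeffs[0]
print("reconstructed field shape:", reconstructed.shape)
```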

Prediction of the shear capacity of reinforced concrete slender beams without stirrups by applying artificial intelligence algorithms in a big database of beams generated by 3D nonlinear finite element analysis

  • Markou, George; Bakas, Nikolaos P.
    • Computers and Concrete / v.28 no.6 / pp.533-547 / 2021
  • Calculating the shear capacity of slender reinforced concrete beams without shear reinforcement has been the subject of numerous studies, yet the long-standing problem of developing a single relationship able to predict the expected shear capacity remains unsolved. Extrapolating formulae from experimental results has so far been the main approach to this problem, while in the last two decades various studies have attempted to use artificial intelligence algorithms and the available datasets of experimentally tested beams to develop new models with improved prediction capabilities. Given the limited number of available experimental databases, these studies were numerically constrained and unable to address this problem holistically. In this manuscript, a new approach is proposed in which a numerically generated database is used to train machine-learning algorithms and develop an improved model for predicting the shear capacity of slender concrete beams reinforced only with longitudinal rebars. The proposed predictive model was then validated against an available ACI database developed from experimental results on physical reinforced concrete beam specimens without shear and compressive reinforcement. For the first time, a numerically generated database was used to train a model for computing the shear capacity of slender concrete beams without stirrups, and the resulting model was found to have improved predictive abilities compared to the corresponding ACI equations. According to the analysis performed in this research work, it is deemed necessary to enrich the current numerically generated database with additional data to further improve the dataset used for training and extrapolation. Finally, future research work foresees the study of beams with stirrups and deep beams for the development of improved predictive models.
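
A hedged sketch of the general workflow, training a regressor on a numerically generated beam database and checking it against an experimental validation set, is shown below. A gradient-boosting regressor stands in for the machine-learning algorithms (not specified here), and the beam features, value ranges, and data are illustrative placeholders.

```python
# Hedged sketch of the workflow: train a regressor on a numerically generated
# (FEA-based) beam database and validate on an experimental database. Feature
# names (b, d, f'c, rho, a/d) and the chosen regressor are assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_percentage_error

rng = np.random.default_rng(0)
# Placeholder "FEA-generated" database: width, depth, concrete strength,
# longitudinal reinforcement ratio, shear span-to-depth ratio -> shear capacity.
X_fea = rng.uniform([150, 200, 20, 0.005, 2.5],
                    [600, 900, 60, 0.030, 6.0], size=(5000, 5))
y_fea = rng.uniform(50, 500, size=5000)        # placeholder capacities (kN)

model = GradientBoostingRegressor().fit(X_fea, y_fea)

# Placeholder experimental (ACI-type) validation set.
X_exp = rng.uniform([150, 200, 20, 0.005, 2.5],
                    [600, 900, 60, 0.030, 6.0], size=(200, 5))
y_exp = rng.uniform(50, 500, size=200)
print("validation MAPE:", mean_absolute_percentage_error(y_exp, model.predict(X_exp)))
```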

A comparison of ATR-FTIR and Raman spectroscopy for the non-destructive examination of terpenoids in medicinal plants essential oils

  • Rahul Joshi; Sushma Kholiya; Himanshu Pandey; Ritu Joshi; Omia Emmanuel; Ameeta Tewari; Taehyun Kim; Byoung-Kwan Cho
    • Korean Journal of Agricultural Science / v.50 no.4 / pp.675-696 / 2023
  • Terpenoids, also referred to as terpenes, are a large family of naturally occurring chemical compounds present in the essential oils extracted from medicinal plants. In this study, a nondestructive methodology was developed by combining ATR-FT-IR (attenuated total reflectance-Fourier transform infrared) and Raman spectroscopy for terpenoid assessment in medicinal plant essential oils from ten different geographical locations. Partial least squares regression (PLSR) and support vector regression (SVR) were used as the machine learning methodologies, and a deep learning based model, a one-dimensional convolutional neural network (1D CNN), was also developed for model comparison. With a correlation coefficient (R2) of 0.999 and the lowest RMSEP (root mean squared error of prediction) of 0.006% on the prediction datasets, the SVR model built for the FT-IR spectral data outperformed both the PLSR and 1D CNN models. On the other hand, for the classification of essential oils derived from plants collected in various geographical regions, the SVM (support vector machine) classification model built for the Raman spectroscopic data obtained an overall classification accuracy of 0.997, which was superior to that of the FT-IR data (0.986). Based on the results, we propose that FT-IR spectroscopy, when coupled with the SVR model, has significant potential for the non-destructive identification of terpenoids in essential oils compared with destructive chemical analysis methods.
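
As a rough illustration of the SVR calibration step described above (spectra in, terpenoid content out), the sketch below builds a scaled RBF-kernel SVR on placeholder spectra; the wavenumber range, preprocessing, and hyperparameters are assumptions rather than the paper's settings.

```python
# Minimal sketch of an SVR calibration model on spectral data, in the spirit
# of the FT-IR/SVR pipeline described above. Spectra dimensions, preprocessing
# and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.metrics import r2_score, mean_squared_error

n_samples, n_wavenumbers = 120, 1800          # placeholder spectra dimensions
X = np.random.rand(n_samples, n_wavenumbers)  # placeholder absorbance spectra
y = np.random.rand(n_samples) * 10            # placeholder terpenoid content (%)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
svr.fit(X_tr, y_tr)

pred = svr.predict(X_te)
print("R2:", r2_score(y_te, pred))
print("RMSEP:", np.sqrt(mean_squared_error(y_te, pred)))
```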

Sensor Fault Detection Scheme based on Deep Learning and Support Vector Machine (딥 러닝 및 서포트 벡터 머신기반 센서 고장 검출 기법)

  • Yang, Jae-Wan; Lee, Young-Doo; Koo, In-Soo
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.18 no.2 / pp.185-195 / 2018
  • As machines in industry have become increasingly automated in recent years, managing and maintaining these automated machines is of paramount importance. When a fault occurs in the sensors attached to a machine, the machine may malfunction and, furthermore, serious damage may be caused to the process line. To prevent this, sensor faults should be properly monitored, diagnosed, and classified. In this paper, we propose a sensor fault detection scheme based on SVM and CNN to detect and classify typical sensor faults such as erratic, drift, hard-over, spike, and stuck faults. Time-domain statistical features are used for training and testing in the proposed scheme, and a genetic algorithm is used to select an optimal subset of features. To classify multiple sensor faults, a multi-layer SVM is used, and an ensemble technique is applied to the CNN. As a result, the SVM that uses the subset of features selected by the genetic algorithm provides better performance than the SVM that uses all the features. However, the performance of the CNN is superior to that of the SVM.
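
The feature-based path of such a scheme (time-domain statistical features fed to a multi-class SVM) can be illustrated with the sketch below. The fault simulation and feature set are simplified assumptions, and the genetic-algorithm feature selection and the CNN ensemble are omitted for brevity.

```python
# Illustrative sketch: extract time-domain statistical features from sensor
# windows and classify fault types with a multi-class SVM. Fault simulation
# and features are simplified assumptions; GA selection and CNN are omitted.
import numpy as np
from scipy import stats
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

FAULTS = ["normal", "erratic", "drift", "hard-over", "spike", "stuck"]

def make_window(fault, n=256):
    """Generate a toy sensor window containing the given fault type."""
    t = np.arange(n)
    x = np.sin(0.05 * t) + 0.05 * np.random.randn(n)
    if fault == "erratic":   x += 0.5 * np.random.randn(n)
    if fault == "drift":     x += 0.01 * t
    if fault == "hard-over": x += 5.0
    if fault == "spike":     x[np.random.randint(n, size=5)] += 3.0
    if fault == "stuck":     x[:] = x[0]
    return x

def features(x):
    """Time-domain statistics (NaNs from constant 'stuck' windows set to 0)."""
    return np.nan_to_num([x.mean(), x.std(), stats.skew(x),
                          stats.kurtosis(x), x.max() - x.min()])

X, y = [], []
for label, fault in enumerate(FAULTS):
    for _ in range(200):
        X.append(features(make_window(fault)))
        y.append(label)

X_tr, X_te, y_tr, y_te = train_test_split(np.array(X), np.array(y), test_size=0.3)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te),
                            labels=list(range(len(FAULTS))), target_names=FAULTS))
```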

Development of an AI Analysis Service System based on OpenFaaS (OpenFaaS 기반 AI 분석 서비스 시스템 구축)

  • Jang, Rae-young; Lee, Ryong; Park, Min-woo; Lee, Sang-hwan
    • The Journal of the Korea Contents Association / v.20 no.7 / pp.97-106 / 2020
  • Due to the rapid development and dissemination of 5G communication and IoT technologies, there are increasing demands for big data analysis techniques and service systems. In particular, the explosively growing demand for AI technology adoption is fueling intense competition to take advantage of machine/deep-learning models to extract novel value from the enormous amounts of collected data. To apply AI technology to various research and application domains, it is necessary to prepare high-performance GPU-equipped systems and perform complicated configuration to utilize deep learning models. To relieve this effort and lower the barrier to using AI techniques, the AIaaS (AI as a Service) platform is attracting a great deal of attention as a promising online service, in which the complexity of preparation and operation is hidden on the cloud side and service developers only need to use high-level AI services. In this paper, we propose an AIaaS system that can support the creation of AI services based on Docker and OpenFaaS, from the registration of models to their online operation. We also describe a case study to show how AI services can be easily generated by the proposed system.
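
As one possible shape of such a service function, the sketch below shows a Python handler assuming the classic OpenFaaS python3 template (a handle(req) entry point); the model path, file name, and input format are hypothetical and not taken from the system described above.

```python
# handler.py -- minimal sketch of an inference function for OpenFaaS, assuming
# the classic python3 function template (def handle(req)). The model path and
# the JSON input format are hypothetical assumptions for illustration.
import json
import pickle

# Load the registered model once at cold start so repeated calls stay fast.
with open("/home/app/function/model.pkl", "rb") as f:   # hypothetical path
    MODEL = pickle.load(f)

def handle(req):
    """Accept a JSON body with a 'features' list and return the prediction."""
    payload = json.loads(req)
    prediction = MODEL.predict([payload["features"]])
    return json.dumps({"prediction": prediction.tolist()})
```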

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae; Oh, Wonseok; Lim, Geunwon; Cha, Eunwoo; Shin, Minyoung; Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.1-23 / 2018
  • Since the beginning of the 21st century, various high-quality services have emerged with the growth of the Internet and information and communication technologies. In particular, the e-commerce industry, in which Amazon and eBay stand out, has grown explosively. As e-commerce grows, customers can easily find what they want to buy while comparing various products, because more products are registered at online shopping malls. However, a problem has arisen with this growth: as too many products are registered, it has become difficult for customers to find what they really need in the flood of products. When customers search with a general keyword, too many products are returned; conversely, few products are found when customers type in product details, because concrete product attributes are rarely registered. In this situation, automatically recognizing text in images can be a solution. Because the bulk of product details are written in catalogs in image format, most product information cannot be found with text input in the current text-based search system. If the information in images can be converted to text, customers can search for products by product details, which makes shopping more convenient. Various existing OCR (Optical Character Recognition) programs can recognize text in images, but they are hard to apply to catalogs because they have trouble recognizing text in certain circumstances, such as when the text is too small or the fonts are inconsistent. Therefore, this research suggests a way to recognize keywords in catalogs with deep learning algorithms, which have been state of the art in image recognition since the 2010s. The Single Shot MultiBox Detector (SSD), a well-regarded model for object detection, can be used with its structure redesigned to account for the differences between text and objects. However, the SSD model needs a large amount of labeled training data, because deep learning algorithms of this kind must be trained by supervised learning. To collect data, location and classification information could be labeled manually for the text in catalogs, but manual collection raises many problems. Some keywords would be missed because humans make mistakes while labeling training data, and collecting training data becomes too time-consuming given the scale of data needed, or too costly if many workers are hired to shorten the time. Furthermore, if specific keywords need to be trained, finding images that contain those words is also difficult. To solve this data issue, this research developed a program that creates training data automatically. The program generates catalog-like images containing various keywords and pictures while saving the location information of the keywords at the same time. With this program, not only can data be collected efficiently, but the performance of the SSD model also improves. The SSD model recorded a recognition rate of 81.99% with 20,000 samples created by the program. Moreover, this research tested the SSD model on different data variations to analyze which data characteristics influence the performance of text recognition in images. As a result, it was found that the number of labeled keywords, the addition of overlapping keyword labels, the existence of unlabeled keywords, the spacing among keywords, and the differences in background images are related to the performance of the SSD model. This test can lead to performance improvements of the SSD model, or of other deep-learning-based text recognizers, through high-quality data. The SSD model redesigned to recognize text in images and the program developed for creating training data are expected to contribute to improving search systems in e-commerce: suppliers can spend less time registering product keywords, and customers can search for products by the product details written in the catalog.
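
The core idea of the data-generation program (render keywords onto catalog-like backgrounds and record their bounding boxes as labels) can be sketched as follows; the keyword list, image sizes, fonts, and label format are assumptions for illustration, not the authors' implementation.

```python
# Sketch of the idea behind the automatic training-data generator: paste
# keyword text onto catalog-like background images and record each keyword's
# bounding box as an SSD-style label. Keywords, sizes, fonts and the label
# format are illustrative assumptions.
import json
import random
from PIL import Image, ImageDraw, ImageFont

KEYWORDS = ["cotton", "waterproof", "free size", "machine wash"]  # hypothetical

def make_sample(path_img, path_label, size=(600, 800)):
    img = Image.new("RGB", size, "white")          # stand-in for a catalog background
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()
    labels = []
    for word in random.sample(KEYWORDS, k=2):
        x = random.randint(10, size[0] - 200)
        y = random.randint(10, size[1] - 40)
        draw.text((x, y), word, fill="black", font=font)
        x0, y0, x1, y1 = draw.textbbox((x, y), word, font=font)
        labels.append({"keyword": word, "bbox": [x0, y0, x1, y1]})
    img.save(path_img)
    with open(path_label, "w") as f:
        json.dump(labels, f)

make_sample("sample_000.png", "sample_000.json")
```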

Prediction of Traffic Congestion in Seoul by Deep Neural Network (심층인공신경망(DNN)과 다각도 상황 정보 기반의 서울시 도로 링크별 교통 혼잡도 예측)

  • Kim, Dong Hyun; Hwang, Kee Yeon; Yoon, Young
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.18 no.4 / pp.44-57 / 2019
  • Various studies have been conducted to relieve traffic congestion in many metropolitan cities through accurate traffic flow prediction. Most studies are based on the assumption that past traffic patterns repeat in the future, and models based on such an assumption fall short when irregular traffic patterns occur abruptly. Instead, approaches that predict traffic patterns through big data analytics and artificial intelligence have emerged. Specifically, deep learning algorithms such as RNNs have been prevalent for tackling the problem of predicting temporal traffic flow as a time series; however, these algorithms do not perform well for long-term prediction. In this paper, we take into account various external factors that may affect traffic flow. We model the correlation between multi-dimensional context information and temporal traffic speed patterns using deep neural networks. Our model, trained with traffic data from the TOPIS system operated by the City of Seoul, Korea, can predict traffic speed on a specific date with accuracy reaching nearly 90%. We expect that the accuracy can be improved further by taking additional factors such as accidents and construction work into account in the prediction.
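
As an illustration of the kind of context-to-speed model described above, the sketch below combines a link-ID embedding with a multi-dimensional context vector in a small feed-forward network; the feature layout, layer sizes, and training data are assumptions, not the paper's configuration.

```python
# Illustrative feed-forward model mapping a road link ID plus context features
# (time of day, day of week, weather, ...) to link speed. All dimensions,
# layer sizes and data are placeholder assumptions.
import numpy as np
import tensorflow as tf

n_links, n_context = 5000, 12                # hypothetical counts
link_in = tf.keras.Input(shape=(1,), dtype="int32", name="link_id")
ctx_in = tf.keras.Input(shape=(n_context,), name="context")
link_emb = tf.keras.layers.Flatten()(tf.keras.layers.Embedding(n_links, 16)(link_in))
x = tf.keras.layers.Concatenate()([link_emb, ctx_in])
x = tf.keras.layers.Dense(128, activation="relu")(x)
x = tf.keras.layers.Dense(64, activation="relu")(x)
out = tf.keras.layers.Dense(1, name="speed")(x)   # predicted link speed (km/h)
model = tf.keras.Model([link_in, ctx_in], out)
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

# Dummy arrays stand in for TOPIS-derived training data.
link_ids = np.random.randint(0, n_links, size=(10000, 1))
context = np.random.rand(10000, n_context).astype("float32")
speed = (20 + 60 * np.random.rand(10000, 1)).astype("float32")
model.fit([link_ids, context], speed, validation_split=0.2,
          epochs=3, batch_size=256, verbose=0)
```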

Predicting Future ESG Performance using Past Corporate Financial Information: Application of Deep Neural Networks (심층신경망을 활용한 데이터 기반 ESG 성과 예측에 관한 연구: 기업 재무 정보를 중심으로)

  • Min-Seung Kim; Seung-Hwan Moon; Sungwon Choi
    • Journal of Intelligence and Information Systems / v.29 no.2 / pp.85-100 / 2023
  • Corporate ESG (environmental, social, and corporate governance) performance, which reflects a company's strategic sustainability, has emerged as one of the main factors in today's investment decisions. The traditional ESG rating process is largely performed in a qualitative and subjective manner based on institution-specific criteria, which entails limitations in reliability, predictability, and timeliness when making investment decisions. This study attempted to predict corporate ESG ratings through automated machine learning based on quantitative, disclosed corporate financial information. Using 12 types (21,360 cases) of market-disclosed financial information and 1,780 ESG measures available through the Korea Institute of Corporate Governance and Sustainability from 2019 to 2021, we proposed a deep neural network prediction model. Our model achieved about 86% classification accuracy in predicting ESG ratings, outperforming the other comparative models. This study contributes to the literature in that the model achieved relatively accurate ESG rating predictions through an automated process using quantitative, publicly available corporate financial information. In terms of practical implications, general investors can benefit from the prediction accuracy and time efficiency of our proposed model at nominal cost. In addition, this study can be expanded by accumulating more Korean and international data and by developing a more robust and complex model in the future.
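
A hedged sketch of the kind of DNN classifier described above (disclosed financial indicators in, ESG rating class out) follows; the number of rating classes, the architecture, and the dummy data are assumptions for illustration rather than the study's setup.

```python
# Hedged sketch of a DNN classifier mapping disclosed financial indicators to
# an ESG rating class. Class count, architecture and data are assumptions.
import numpy as np
import tensorflow as tf

n_financial_features, n_rating_classes = 12, 4   # 12 indicator types; class count assumed
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_financial_features,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(n_rating_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy arrays stand in for the disclosed financial data and ESG rating labels.
X = np.random.rand(1780, n_financial_features).astype("float32")
y = np.random.randint(0, n_rating_classes, size=1780)
model.fit(X, y, validation_split=0.2, epochs=10, batch_size=64, verbose=0)
```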