• Title/Summary/Keyword: class imbalance


Automatic Augmentation Technique of an Autoencoder-based Numerical Training Data (오토인코더 기반 수치형 학습데이터의 자동 증강 기법)

  • Jeong, Ju-Eun;Kim, Han-Joon;Chun, Jong-Hoon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.5 / pp.75-86 / 2022
  • This study aims to solve the problem of class imbalance in numerical data by using a deep learning-based Variational AutoEncoder (VAE) and to improve the performance of the learning model by augmenting the training data. We propose 'D-VAE' to artificially increase the number of records in a given tabular dataset. The proposed technique optimizes the data through discretization and feature selection during preprocessing. In the discretization step, values are grouped by K-means clustering and then converted into one-hot vectors by one-hot encoding. Subsequently, for memory efficiency, sample data are generated with the Variational AutoEncoder using only the features that RFECV, a feature selection technique, identifies as helpful for prediction. To verify the performance of the proposed model, we demonstrate its validity through experiments at different data augmentation ratios.
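
The discretization step described in this abstract (K-means grouping followed by one-hot encoding) can be illustrated with a minimal sketch. It is not the authors' D-VAE code: the function names `kmeans_1d` and `one_hot`, the evenly spread initial centers, and the sample column are all assumptions made for illustration, and the RFECV and VAE stages are left out.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on one numeric column; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Assumed deterministic init: spread k centers evenly over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Recompute each center as the mean of its members (keep empty clusters in place).
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def one_hot(labels, k):
    """Turn cluster indices into one-hot vectors of length k."""
    return [[1 if lab == c else 0 for c in range(k)] for lab in labels]


# Hypothetical numeric column with three visible groups.
column = [1.0, 1.2, 0.9, 5.0, 5.3, 9.8, 10.1]
centers, labels = kmeans_1d(column, k=3)
encoded = one_hot(labels, k=3)
```

Each row of `encoded` has exactly one active position, so a continuous column becomes a categorical input of the kind a VAE with a discrete output layer could model.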

Structural health monitoring data anomaly detection by transformer enhanced densely connected neural networks

  • Jun, Li;Wupeng, Chen;Gao, Fan
    • Smart Structures and Systems / v.30 no.6 / pp.613-626 / 2022
  • Guaranteeing the quality and integrity of structural health monitoring (SHM) data is very important for an effective assessment of structural condition. However, the sensor system may malfunction due to sensor faults or a harsh operational environment, resulting in multiple types of data anomalies in the measured data. Efficiently and automatically identifying anomalies from the vast amounts of measured data is significant for assessing structural conditions and for early warning of structural failure in SHM. A major challenge for current automated data anomaly detection methods is the imbalance of dataset categories. Considering the features of actual anomalous data, this paper proposes a data anomaly detection method based on a data-level technique and deep learning for SHM of civil engineering structures. The proposed method consists of a data balancing phase, which prepares a comprehensive training dataset using the data-level technique, and an anomaly detection phase based on a carefully designed network. The advanced densely connected convolutional network (DenseNet) and Transformer encoder are embedded in the network to facilitate extraction of both detailed and global features of the response data, and to establish the mapping between the highest-level abstract features and the data anomaly class. Numerical studies on a steel frame model are conducted to evaluate the performance and noise immunity of the proposed network for data anomaly detection. The applicability of the proposed method for data anomaly classification is validated with measured data from a practical supertall structure. The proposed method performs remarkably well on data anomaly detection, reaching 95.7% overall accuracy on practical engineering structural monitoring data, which demonstrates the effectiveness of data balancing and the robust classification capability of the proposed network.

Development of a Deep Learning Algorithm for Anomaly Detection of Manufacturing Facility (설비 이상탐지를 위한 딥러닝 알고리즘 개발)

  • Kim, Min-Hee;Jin, Kyo-Hong
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.2 / pp.199-206 / 2022
  • A malfunction or breakdown of a manufacturing facility leads to product defects and the suspension of production lines, resulting in huge financial losses for manufacturers. With the spread of smart factory services, a large amount of data is being collected in factories, and AI-based research is being conducted to predict and diagnose manufacturing facility breakdowns or manufacturing site efficiency. However, the characteristics of manufacturing data, such as severe class imbalance for abnormalities and ambiguous label information for distinguishing abnormalities, make it highly difficult to develop classification or anomaly detection models. In this paper, we present a deep learning algorithm for anomaly detection in a manufacturing facility that uses the reconstruction loss of a CNN-based model, and we analyze its performance. The algorithm detects anomalies by relying solely on normal data from the facility's manufacturing data, excluding abnormal data.
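
The idea of detecting anomalies from reconstruction loss after training on normal data only can be sketched as below. This is a deliberately simplified stand-in, not the paper's CNN-based model: the "reconstruction" here is just the per-position mean of the normal windows, and the names `fit_mean_profile`, `fit_threshold`, `is_anomaly`, the three-sigma threshold, and the sample windows are all assumptions.

```python
from statistics import mean, pstdev


def fit_mean_profile(normal_windows):
    """'Train' on normal data only: the per-position mean serves as the reconstruction."""
    return [mean(col) for col in zip(*normal_windows)]


def reconstruction_error(window, profile):
    """Mean squared error between a window and its reconstruction."""
    return mean((x - r) ** 2 for x, r in zip(window, profile))


def fit_threshold(normal_windows, profile, n_sigmas=3.0):
    """Assumed rule: mean training error plus n_sigmas standard deviations."""
    errors = [reconstruction_error(w, profile) for w in normal_windows]
    return mean(errors) + n_sigmas * pstdev(errors)


# Hypothetical sensor windows recorded during normal operation.
normal = [
    [1.0, 2.0, 3.0, 2.0],
    [1.1, 2.1, 2.9, 2.0],
    [0.9, 1.9, 3.1, 2.1],
    [1.0, 2.0, 3.0, 1.9],
]
profile = fit_mean_profile(normal)
threshold = fit_threshold(normal, profile)


def is_anomaly(window):
    """Flag a window whose reconstruction error exceeds the learned threshold."""
    return reconstruction_error(window, profile) > threshold
```

With a real autoencoder, `fit_mean_profile` would be replaced by model training and `reconstruction_error` by a forward pass plus loss. The thresholding logic stays the same, and no abnormal labels are needed at any point.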

Comparison of Anomaly Detection Performance Based on GRU Model Applying Various Data Preprocessing Techniques and Data Oversampling (다양한 데이터 전처리 기법과 데이터 오버샘플링을 적용한 GRU 모델 기반 이상 탐지 성능 비교)

  • Yoo, Seung-Tae;Kim, Kangseok
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.2 / pp.201-211 / 2022
  • Following the recent change in the cybersecurity paradigm, research on anomaly detection methods using machine learning and deep learning techniques, which are AI implementation technologies, is increasing. This study compares data preprocessing techniques that can improve the anomaly detection performance of a GRU (Gated Recurrent Unit) neural network-based intrusion detection model, using the open dataset NGIDS-DS (Next Generation IDS Dataset). In addition, to address the class imbalance between normal data and attack data, detection performance was compared and analyzed across oversampling ratios using an oversampling technique based on DCGAN (Deep Convolutional Generative Adversarial Networks). In the experiments, preprocessing the system call and process execution path features with the Doc2Vec algorithm showed good performance, and oversampling with DCGAN improved detection performance.

Ensemble-based deep learning for autonomous bridge component and damage segmentation leveraging Nested Reg-UNet

  • Abhishek Subedi;Wen Tang;Tarutal Ghosh Mondal;Rih-Teng Wu;Mohammad R. Jahanshahi
    • Smart Structures and Systems / v.31 no.4 / pp.335-349 / 2023
  • Bridges constantly undergo deterioration and damage, the most common ones being concrete damage and exposed rebar. Periodic inspection of bridges to identify damages can aid in their quick remediation. Likewise, identifying components can provide context for damage assessment and help gauge a bridge's state of interaction with its surroundings. Current inspection techniques rely on manual site visits, which can be time-consuming and costly. More recently, robotic inspection assisted by autonomous data analytics based on Computer Vision (CV) and Artificial Intelligence (AI) has been viewed as a suitable alternative to manual inspection because of its efficiency and accuracy. To aid research in this avenue, this study performs a comparative assessment of different architectures, loss functions, and ensembling strategies for the autonomous segmentation of bridge components and damages. The experiments lead to several interesting discoveries. Nested Reg-UNet architecture is found to outperform five other state-of-the-art architectures in both damage and component segmentation tasks. The architecture is built by combining a Nested UNet style dense configuration with a pretrained RegNet encoder. In terms of the mean Intersection over Union (mIoU) metric, the Nested Reg-UNet architecture provides an improvement of 2.86% on the damage segmentation task and 1.66% on the component segmentation task compared to the state-of-the-art UNet architecture. Furthermore, it is demonstrated that incorporating the Lovasz-Softmax loss function to counter class imbalance can boost performance by 3.44% in the component segmentation task over the most employed alternative, weighted Cross Entropy (wCE). Finally, weighted softmax ensembling is found to be quite effective when used synchronously with the Nested Reg-UNet architecture by providing mIoU improvement of 0.74% in the component segmentation task and 1.14% in the damage segmentation task over a single-architecture baseline. Overall, the best mIoU of 92.50% for the component segmentation task and 84.19% for the damage segmentation task validate the feasibility of these techniques for autonomous bridge component and damage segmentation using RGB images.
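
Two quantities named in this abstract, the mIoU metric and weighted softmax ensembling, can be made concrete with a short sketch. The abstract does not say how the ensemble weights were chosen or how classes absent from an image are handled, so the normalized weighted average of per-model probabilities and the rule of skipping classes with an empty union are assumptions, as are the function names and toy inputs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_softmax_ensemble(per_model_logits, weights):
    """Weighted average of per-model class probabilities for one pixel."""
    total_w = sum(weights)
    n_classes = len(per_model_logits[0])
    probs = [softmax(logits) for logits in per_model_logits]
    return [
        sum(w * p[c] for w, p in zip(weights, probs)) / total_w
        for c in range(n_classes)
    ]


def mean_iou(pred, target, n_classes):
    """Mean Intersection over Union, skipping classes absent from both inputs."""
    ious = []
    for c in range(n_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy flattened label maps with three classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, n_classes=3)

# Two hypothetical models disagree on one pixel; the first gets three times the weight.
pixel_probs = weighted_softmax_ensemble([[2.0, 0.0], [0.0, 1.0]], weights=[3.0, 1.0])
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is 0.5. The ensemble favors class 0 because the more heavily weighted model prefers it.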

Research on data augmentation algorithm for time series based on deep learning

  • Shiyu Liu;Hongyan Qiao;Lianhong Yuan;Yuan Yuan;Jun Liu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.6 / pp.1530-1544 / 2023
  • Data monitoring is an important foundation of modern science. In most cases, the monitoring data are time-series data, which have high application value. Deep learning algorithms have a strong nonlinear fitting capability, which enables the recognition of time series by capturing anomalous information in them. At present, research on time series recognition based on deep learning is especially important for data monitoring. Deep learning algorithms require a large amount of data for training. However, abnormal samples make up only a small share of a time series dataset, so the resulting class imbalance can seriously affect the accuracy of the recognition algorithm. To increase the number of abnormal samples, a data augmentation method called GANBATS (GAN-based Bi-LSTM and Attention for Time Series) is proposed. In GANBATS, Bi-LSTM is introduced to extract the timing features and then transfer them to the generator network. GANBATS also modifies the discriminator network by adding an attention mechanism to achieve global attention for time series. At the end of the discriminator, GANBATS adds an average pooling layer, which merges temporal features to boost operational efficiency. In this paper, four time series datasets and five data augmentation algorithms are used for comparison experiments. The generated data are measured by PRD (Percent Root Mean Square Difference) and DTW (Dynamic Time Warping). The experimental results show that GANBATS reduces the PRD metric by up to 26.22 and the DTW metric by up to 9.45. In addition, this paper uses different algorithms to reconstruct the datasets and compares them by classification accuracy. The classification accuracy is improved by 6.44%-12.96% on the four time series datasets.
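
The two evaluation metrics named in this abstract can be sketched as follows. The abstract does not give the exact formulas used in the paper, so this assumes the common PRD definition, 100 * sqrt(sum((x - y)^2) / sum(x^2)), and a classic DTW with absolute difference as the local cost. The function names and sample series are illustrative only.

```python
import math


def prd(original, generated):
    """Percent Root Mean Square Difference between two equal-length series."""
    num = sum((o - g) ** 2 for o, g in zip(original, generated))
    den = sum(o ** 2 for o in original)
    return 100.0 * math.sqrt(num / den)


def dtw(a, b):
    """Dynamic Time Warping distance using absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical real and generated series.
real = [3.0, 4.0]
print(prd(real, real))                    # identical series give 0.0
print(dtw([1, 2, 3], [1, 2, 2, 3]))       # a stretched copy aligns at zero cost
```

Lower values of both metrics mean the generated series is closer to the real one. DTW tolerates shifts and stretching in time, while PRD compares samples position by position.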

Family and Society Revealed from the Film <Parasite> (영화 <기생충>을 통해 본 가족과 사회)

  • Yook, Jung-Hak
    • Journal of Korea Entertainment Industry Association / v.14 no.5 / pp.37-48 / 2020
  • The film handles events based on the story of a poor family that sponges off a rich family. Junho Bong, the film's director, won the Palme d'Or at the Cannes Film Festival and received Academy Awards for best picture, best original screenplay, best international film, and best director. The film has achieved cinematic success, but the implications it aims to show may not have been seriously appreciated by the public. The film has an unusual synopsis, in which a deprived family is parasitic on a wealthy family. The storyline shows how great the gap between the rich and the poor in Korea is. Accordingly, this article investigates the implications of the house, family, and society in the film. The three families (Kitaek's house, Park's luxury house, and the maid's hidden basement) explicitly reveal a distinctive social hierarchy. The common features found in two of the families are as follows: the lower classes are willing to help one another but lack conscience and morality. The social implications of the film are closely associated with the class system based on the gap between the rich and the poor, the symbolism of the stone, and the tragic ending. From the ending of the film, it is expected that extreme social imbalance precedes the gap between the wealthy and the poor.

A Case of Pleural Effusion in a Patient with Heart Failure with Preserved Ejection Fraction Improved by A Combined Korean-Western Medicine Approach (좌심실 수축 기능 보전 심부전증으로 인한 흉수에 대한 한양방 복합치료 치험 1례)

  • Ha, Won Jung;Seo, Yuna;Lee, Young seon;Cho, Ki-Ho;Mun, Sang-Kwan;Jung, Woo-Sang;Kwon, Seungwon
    • The Journal of the Society of Stroke on Korean Medicine / v.22 no.1 / pp.45-56 / 2021
  • ■ Background Heart Failure with Preserved Ejection Fraction (HFpEF) is a form of heart failure in which contraction function appears normal. No pharmacological therapy has been found to improve the clinical prognosis of HFpEF, so it is managed with symptomatic treatment, and alternatives are needed because of concerns over adverse effects of medication such as electrolyte imbalance. ■ Case report An 81-year-old female patient with HFpEF complained of dyspnea. The herbal prescriptions Mokbanggi-tang and Oryeongsan were administered starting on the 6th and 8th day after symptom onset, respectively. The NYHA Classification and chest X-ray were evaluated during the treatment period. Until the 7th day, the patient was classified as Class II, and when discharged from the hospital on the 28th day, it gradually improved and was classified as Class II. A chest X-ray taken on the 2nd day showed pleural effusion, which was aggravated until the 13th day. Follow-up chest X-rays showed the pleural effusion improving from the 20th day and gradually getting better. Mokbanggi-tang treatment continued for 52 days and stopped on the 58th day. After Mokbanggi-tang treatment ended, only Oryeongsan treatment was maintained. ■ Conclusion The present case report suggests that a Korean-Western medicine approach with Mokbanggi-tang and Oryeongsan might be effective for pleural effusion and heart failure symptoms such as poor physical activity as shown in the NYHA Classification. This shows that Mokbanggi-tang and Oryeongsan can be a therapeutic option for patients with HFpEF.

Sorghum Field Segmentation with U-Net from UAV RGB (무인기 기반 RGB 영상 활용 U-Net을 이용한 수수 재배지 분할)

  • Kisu Park;Chanseok Ryu;Yeseong Kang;Eunri Kim;Jongchan Jeong;Jinki Park
    • Korean Journal of Remote Sensing / v.39 no.5_1 / pp.521-535 / 2023
  • When rice fields are converted into fields, sorghum (Sorghum bicolor L. Moench) has excellent moisture resistance, enabling stable production along with soybeans. Therefore, it is a crop that is expected to improve the self-sufficiency rate of domestic food crops and solve the rice supply-demand imbalance problem. However, fundamental statistics needed for estimating yields, such as cultivation field area, are lacking because the traditional survey method takes a long time even with a large workforce. In this study, U-Net was applied to RGB images acquired by unmanned aerial vehicle to confirm the possibility of non-destructive segmentation of sorghum cultivation fields. RGB images were acquired on July 28, August 13, and August 25, 2022. For each image acquisition date, the data were divided into 6,000 training images and 1,000 validation images with a size of 512 × 512. Classification models were developed based on three classes consisting of sorghum fields (sorghum), rice and soybean fields (others), and non-agricultural fields (background), and on two classes consisting of sorghum and non-sorghum (others+background). The classification accuracy of sorghum cultivation fields was higher than 0.91 in the three-class models at all acquisition dates, but learning confusion occurred in the other classes in the August datasets. In contrast, the two-class model showed an accuracy of 0.95 or better in all classes, with stable learning on the August datasets. As a result, two-class models built on August imagery will be advantageous for calculating the cultivation fields of sorghum.

A Study on Improvement of the Pilot Certification System for stabilizing Supply and Demand of Harbour Pilots (도선사수급안정화를 위한 도선사 자격제도 개선에 관한 연구)

  • Jeon, Yeong-Woo;Kim, Tae-goun;Ji, Sangwon;Kim, JinKwan
    • Journal of the Korean Society of Marine Environment & Safety / v.23 no.7 / pp.834-846 / 2017
  • An increase in the number of retiring experienced pilots, together with the rapid aging of new pilots, will deepen the imbalance between the supply of and demand for pilots over the next 7 years, which could entail fatal problems for the safety of harbour pilotage. This study proposes a plan to improve the legal system in order to ease the imbalance between supply and demand of pilots and help secure more experienced pilots. A current state survey and analysis, statistical analysis, a questionnaire survey on foreign countries, and in-depth consultation with experts were carried out to support this research. The conclusions of this study are as follows. Firstly, it is proposed that the minimum requirement of 5 years of seagoing service as a master to sit for the pilot exam be relaxed to 2 years (which must include at least 1 year of master's seagoing service within the most recent 5 years), but that the minimum requirement of 1 year of pilotage service to obtain a higher class of pilot certificate be increased to 1 year and 6 months. Secondly, to rationalize the additional incentive point system, it is proposed that an additional 1 point be awarded per year of seagoing service as a master beyond the minimum period of 2 years, up to a maximum of 10 points. To secure experienced pilots and resolve the legal conflict between the certificate revalidation system and the retirement system, it is also proposed that an amendment be passed revoking the retirement system and limiting the validity of any new certificate to 68 years of age when a certificate is issued or revalidated for an applicant over a certain age. Promotional work, such as collecting opinions from interested parties and generating positive public awareness, should be carried out in the future. It will also be necessary to conduct a study on the training pilot exam system.