• Title/Summary/Keyword: Oversampling Technique

Experimental Analysis of Bankruptcy Prediction with SHAP framework on Polish Companies

  • Tuguldur Enkhtuya;Dae-Ki Kang
    • International journal of advanced smart convergence
    • /
    • v.12 no.1
    • /
    • pp.53-58
    • /
    • 2023
  • With the rapid development of artificial intelligence, users increasingly demand explanations of algorithmic results and want to know which parameters influence them. In this paper, we propose a model for bankruptcy prediction with interpretability using the SHAP framework. SHAP (SHapley Additive exPlanations) is a framework that produces visualized results for explaining and interpreting machine learning models. With it, we can describe which features are important to the output of our deep learning model. The SHAP force plot shows the top features that mainly drive the overall model score. Even though Fully Connected Neural Networks (FCNNs) are a "black box" model, Shapley values help alleviate the "black box" problem. FCNNs perform well on a complex dataset with more than 60 financial ratios. Combined with the SHAP framework, we create an effective model with an understandable interpretation. Because bankruptcy is a rare event, we address the imbalanced-dataset problem with SMOTE. SMOTE is an oversampling technique that generates synthetic samples for the minority class. It uses the k-nearest neighbors algorithm and produces new examples along the lines connecting neighboring samples. We expect our model's results to assist financial analysts who are interested in forecasting company bankruptcy in detail.
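
The interpolation idea behind SMOTE described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `smote`, its parameters, and the toy data are hypothetical, and only the Python standard library is used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points (hypothetical helper, not a library API).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Pick a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class: the four corners of the unit square.
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority_points, n_new=10, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region the minority class already occupies.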

LSTM-based fraud detection system framework using real-time data resampling techniques (실시간 리샘플링 기법을 활용한 LSTM 기반의 사기 거래 탐지 시스템)

  • Seo-Yi Kim;Yeon-Ji Lee;Il-Gu Lee
    • Annual Conference of KIPS
    • /
    • 2024.05a
    • /
    • pp.505-508
    • /
    • 2024
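  • The digital transformation of the financial industry offers convenience to users but has introduced security vulnerabilities that did not previously exist. To address this problem, fraud detection systems that apply machine learning are being actively studied. However, the data imbalance that arises during model training makes training slow and degrades detection performance. In this paper, we propose a new Fraud Detection System (FDS) that uses real-time data oversampling to resolve the data imbalance problem in anomalous transaction detection and to improve model training time. Compared with a conventional LSTM-based FDS model, the proposed FDS framework, which is based on the LSTM (Long Short-Term Memory) algorithm with SMOTE (Synthetic Minority Oversampling Technique) applied, reduced data size by 96.5% and improved precision, recall, and F1-score by 34.81%, 11.14%, and 22.51%, respectively.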

A Low-Voltage Low-Power Delta-Sigma Modulator for Cardiac Pacemaker Applications (심장박동 조절장치를 위한 저전압 저전력 델타 시그마 모듈레이터)

  • Chae, Young-Cheol;Lee, Jeong-Whan;Lee, In-Hee;Han, Gun-Hee
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.1
    • /
    • pp.52-58
    • /
    • 2009
  • A low-voltage, low-power delta-sigma modulator is proposed for cardiac pacemaker applications. A cascade of delta-sigma modulator stages that employ a feedforward topology is used to implement a high-resolution oversampling ADC under a low supply voltage. An inverter-based switched-capacitor circuit technique is used for low-voltage operation and ultra-low power consumption. An experimental prototype of the proposed circuit has been implemented in a 0.35-$\mu$m CMOS process, and it achieves 61-dB SNDR, 63-dB SNR, and 65-dB DR for a 120-Hz signal bandwidth at a 7.6-kHz sampling frequency. The power consumption is only 280 nW at a 1-V power supply.

Design of 4th Order ΣΔ modulator employing a low power reconfigurable operational amplifier (전력절감용 재구성 연산증폭기를 사용한 4차 델타-시그마 변조기 설계)

  • Lee, Dong-Hyun;Yoon, Kwang-Sub
    • Journal of IKEEE
    • /
    • v.22 no.4
    • /
    • pp.1025-1030
    • /
    • 2018
  • The proposed modulator uses a conventional structure with a time-division technique to realize a 4th-order delta-sigma modulator using one op-amp. To reduce the influence of kT/C noise, the capacitance of the reused first and second integrators was chosen to be 20 pF, and the capacitance of the third and fourth integrators was designed to be 1 pF. The stage variable technique in the low-power reconfigurable op-amp was used to solve the stability issue caused by the different capacitive loads introduced for kT/C noise reduction. This technique enabled the proposed modulator to reduce power consumption by 15% with respect to the conventional one. The proposed modulator was fabricated in a 0.18-μm CMOS N-well 1-poly 6-metal process and consumes 305 μW at a supply voltage of 1.8 V. The measurement results demonstrated that SNDR, ENOB, DR, FoM (Walden), and FoM (Schreier) were 66.3 dB, 10.6 bits, 83 dB, 98 pJ/step, and 142.8 dB, respectively, at a sampling frequency of 256 kHz, an oversampling ratio of 128, a clock frequency of 1.024 MHz, and an input frequency of 250 Hz.

Blind frequency offset estimation method in OFDM systems (OFDM에서 블라인드 주파수 옵셋 추정 방법)

  • Jeon, Hyoung-Goo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.4
    • /
    • pp.823-832
    • /
    • 2011
  • In this paper, an efficient blind carrier frequency offset (CFO) estimation method for orthogonal frequency division multiplexing (OFDM) systems is proposed. In the proposed method, we obtain two received OFDM symbols that differ in time by using both the cyclic prefix and an oversampling technique, and a cost function is defined using the two OFDM symbols. We show that the cost function can be approximately expressed as a cosine function. Using a property of the cosine function, a formula for estimating the CFO is derived. The CFO estimator requires three independent cost function values calculated at three different frequency offset points. The proposed method is very efficient in computational complexity since no search for the minimum cost value is required. The proposed method reduces the amount of FFT computation by 97% compared with the ML method. Unlike conventional methods such as the ML method and the MUSIC method, the accuracy of the proposed method is independent of the search resolution since a closed-form solution exists. Computer simulation shows that the performance of the proposed method is superior to that of the MUSIC and ML methods.
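
To illustrate why three cost values can be enough when a cost function is approximately a cosine, here is a minimal sketch. It is not the paper's estimator: the model J(θ) = B − A·cos(θ − θ0) with A > 0, the three equally spaced evaluation points, and the function name `cosine_minimum` are all assumptions made for illustration, and the mapping from θ to an actual frequency offset is left out.

```python
import cmath
import math

# Assumed evaluation points: three equally spaced phases on the unit circle.
THETAS = (0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0)


def cosine_minimum(j0, j1, j2):
    """Closed-form location of the minimum of J(theta) = B - A*cos(theta - theta0).

    Summing J(theta_k) * exp(1j * theta_k) over the three points cancels the
    offset B and the double-angle term, leaving -(3A/2) * exp(1j * theta0).
    The phase of the negated sum is therefore theta0, with no grid search.
    """
    s = sum(j * cmath.exp(1j * t) for j, t in zip((j0, j1, j2), THETAS))
    return cmath.phase(-s)


# Synthetic check with made-up values A = 2, B = 5, theta0 = 0.7 rad.
true_theta0 = 0.7
samples = [5.0 - 2.0 * math.cos(t - true_theta0) for t in THETAS]
estimate = cosine_minimum(*samples)
```

The result does not depend on any search step size, which mirrors the abstract's point that a closed-form solution makes accuracy independent of search resolution.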

Design of Optimal FIR Filters for Data Transmission (데이터 전송을 위한 최적 FIR 필터 설계)

  • 이상욱;이용환
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.8
    • /
    • pp.1226-1237
    • /
    • 1993
  • For data transmission over strictly band-limited non-ideal channels, different types of filters with arbitrary responses are needed. In this paper, we propose two efficient techniques for the design of such FIR filters whose response is specified in either the time or the frequency domain. In particular, when a fractionally-spaced structure is used for the transceiver, these filters can be efficiently designed by making use of the characteristics of oversampling. By using a minimum mean-squared error criterion, we design a fractionally-spaced FIR filter whose frequency response can be controlled without affecting the output error. With proper specification of the shape of the additive noise signals, for example, the design results in a receiver filter that can perform compromise equalization as well as phase-splitting filtering for QAM demodulation. The second method addresses the design of an FIR filter whose desired response can be arbitrarily specified in the frequency domain. For optimum design, we use an iterative optimization technique based on a weighted least mean square algorithm. A new adaptation algorithm for updating the weighting function is proposed for fast and stable convergence. It is shown that these two independent methods can be efficiently combined for more complex applications.

Second-order Sigma-Delta Modulator for Mobile BMIC Applications (모바일 기기용 BMIC를 위한 2차 시그마 델타 모듈레이터)

  • Park, Chulkyu;Jang, Kichang;Kim, Hyojae;Choi, Joongho
    • Journal of IKEEE
    • /
    • v.18 no.2
    • /
    • pp.263-271
    • /
    • 2014
  • This paper presents the design of a second-order sigma-delta modulator for converting voltage and temperature signals to digital ones in a Battery Management IC (BMIC) for mobile applications. A second-order single-loop switched-capacitor sigma-delta modulator with 1-bit quantization in 0.13-μm CMOS technology is proposed. The proposed modulator is designed using a switched-opamp technique to save power. With an oversampling ratio of 256 and a clock frequency of 256 kHz, the modulator achieves a measured 83-dB dynamic range and a peak signal-to-(noise+distortion) ratio (SNDR) of 81.7 dB. Power dissipation is about 0.66 mW at a 3.3-V power supply, and the occupied core area is $0.425\,\mathrm{mm}^2$.
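
As a rough consistency check on these figures, and assuming that the 256 kHz clock is also the sampling rate $f_s$ and that the usual definition $\mathrm{OSR} = f_s / (2 f_B)$ applies (neither is stated in the abstract), the implied signal bandwidth $f_B$ would be:

$$ f_B = \frac{f_s}{2\,\mathrm{OSR}} = \frac{256\ \mathrm{kHz}}{2 \times 256} = 500\ \mathrm{Hz} $$

A bandwidth of this order would be plausible for slowly varying battery voltage and temperature signals, but the abstract does not confirm it.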

Discriminant analysis for unbalanced data using HDBSCAN (불균형자료를 위한 판별분석에서 HDBSCAN의 활용)

  • Lee, Bo-Hui;Kim, Tae-Heon;Choi, Yong-Seok
    • The Korean Journal of Applied Statistics
    • /
    • v.34 no.4
    • /
    • pp.599-609
    • /
    • 2021
  • Data with a large difference in the number of objects between clusters are called unbalanced data. In discriminant analysis of unbalanced data, it is more important to classify objects in minority categories well than to classify objects in majority categories well. However, objects in minority categories are often misclassified into majority categories. In this study, we propose a method that combines hierarchical DBSCAN (HDBSCAN) and the synthetic minority oversampling technique (SMOTE) to solve this problem. The method first uses HDBSCAN to remove noise from both minority and majority categories, and then applies SMOTE to create new data. The area under the ROC curve (AUC) and F1 scores were used to compare performance with existing methods. In most cases, the method combining HDBSCAN and SMOTE showed a high performance index, and it was found to be an excellent method for classifying unbalanced data.

Arrhythmia Classification using GAN-based Over-Sampling Method and Combination Model of CNN-BLSTM (GAN 오버샘플링 기법과 CNN-BLSTM 결합 모델을 이용한 부정맥 분류)

  • Cho, Ik-Sung;Kwon, Hyeog-Soong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.10
    • /
    • pp.1490-1499
    • /
    • 2022
  • Arrhythmia is a condition in which the heart has an irregular rhythm or abnormal heart rate. Early diagnosis and management are very important because it can cause stroke, cardiac arrest, or even death. In this paper, we propose arrhythmia classification using a hybrid CNN-BLSTM combination model. For this purpose, QRS features are detected from the noise-removed signal through pre-processing, and a single-beat segment is extracted. The GAN oversampling technique is then applied to solve the data imbalance problem. The model consists of CNN layers that extract arrhythmia patterns precisely, and their outputs are used as the input to the BLSTM. The weights were learned through deep learning, and the learned model was evaluated on the validation data. To evaluate the performance of the proposed method, classification accuracy, precision, recall, and F1-score were compared using the MIT-BIH arrhythmia database. The achieved accuracy, precision, recall, and F1-score are 99.30%, 98.70%, 97.50%, and 98.06%, respectively.

Boosting the Performance of the Predictive Model on the Imbalanced Dataset Using SVM Based Bagging and Out-of-Distribution Detection (SVM 기반 Bagging과 OoD 탐색을 활용한 제조공정의 불균형 Dataset에 대한 예측모델의 성능향상)

  • Kim, Jong Hoon;Oh, Hayoung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.11
    • /
    • pp.455-464
    • /
    • 2022
  • Datasets from a manufacturing process have two distinctive characteristics: severe class imbalance and many Out-of-Distribution samples. Strategies such as oversampling the minority class and down-sampling the majority class are well known for handling class imbalance. In addition, SMOTE has recently been chosen to address the issue. However, Out-of-Distribution samples have been studied only with neural networks. Out-of-Distribution detection has rarely been applied to predictive models that use conventional machine learning algorithms such as SVM, Random Forest, and KNN. Conventional machine learning algorithms are known to achieve much better prediction performance than neural networks, because neural networks are vulnerable to over-fitting and require much bigger datasets than conventional machine learning algorithms do. We therefore suggest a new approach that utilizes Out-of-Distribution detection based on the SVM algorithm. In addition, a bagging technique is adopted to improve the precision of the model.