Integrated Search | Korea Science

Improvement of Quantization Noise through Oversampling of Magnetic Resonance Imaging Data

김휴정;안창범
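- Korean Society of Magnetic Resonance in Medicine: Conference Proceedings (대한자기공명의과학회:학술대회논문집) / Abstracts of the 7th Annual Meeting of the Korean Society of Magnetic Resonance in Medicine, 2002 / pp.96-96 / 2002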
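Purpose: With the rapid advancement of MRI systems, the noise generated by the system has decreased considerably. Consequently, not only the random noise generated by the system but also the quantization noise introduced during sampling has become an important factor to consider. In particular, because MRI signals have a large dynamic range, data must be acquired with an ADC that has a large number of bits. However, an ADC with both a large number of bits and a high sampling rate is expensive, and adopting one means replacing existing equipment. This study examines the relationship between oversampling and quantization noise through computer simulation, and aims to improve image quality by reducing quantization noise in MRI images through oversampling.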
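The relationship described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation: it assumes a uniform quantizer, Gaussian system noise of about half a quantization step (which keeps successive quantization errors roughly uncorrelated), and plain averaging of the oversampled readings. The function names and parameter values are illustrative assumptions.

```python
import math
import random


def quantize(x, step):
    """Round x to the nearest multiple of the quantizer step size."""
    return step * round(x / step)


def rms_error(oversampling, step=1.0, noise_std=0.5, trials=2000, seed=0):
    """RMS error of the mean of `oversampling` quantized readings of one value.

    Assumed model (not from the paper): each reading is the true value plus
    Gaussian noise, passed through a uniform quantizer, then averaged.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        true_value = rng.uniform(-10.0, 10.0)
        readings = [
            quantize(true_value + rng.gauss(0.0, noise_std), step)
            for _ in range(oversampling)
        ]
        estimate = sum(readings) / oversampling
        total += (estimate - true_value) ** 2
    return math.sqrt(total / trials)


if __name__ == "__main__":
    # Compare no oversampling with 4x and 16x oversampling.
    for factor in (1, 4, 16):
        print(factor, round(rms_error(factor), 4))
```

Under these assumptions the combined noise and quantization error should shrink roughly with the square root of the oversampling factor, which is the effect the study sets out to exploit.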

Experimental Analysis of Equilibrization in Binary Classification for Non-Image Imbalanced Data Using Wasserstein GAN

Wang, Zhi-Yong;Kang, Dae-Ki
- International Journal of Internet, Broadcasting and Communication / Vol. 11, No. 4 / pp.37-42 / 2019
In this paper, we explore the details of three classic data augmentation methods and two generative-model-based oversampling methods. The three classic data augmentation methods are random sampling (RANDOM), Synthetic Minority Over-sampling Technique (SMOTE), and Adaptive Synthetic Sampling (ADASYN). The two generative-model-based oversampling methods are Conditional Generative Adversarial Network (CGAN) and Wasserstein Generative Adversarial Network (WGAN). In imbalanced data, the instances are divided into a majority class and a minority class, where the majority class accounts for most of the instances in the training set and the minority class includes only a few. Generative models have an advantage in that they can generate more plausible samples that follow the distribution of the minority class. We also adopt CGAN to compare its data augmentation performance with the other methods. The experimental results show that the WGAN-based oversampling technique is more stable than the other approaches (RANDOM, SMOTE, ADASYN, and CGAN), even with very limited training data. However, when the imbalance ratio is too small, the generative-model-based approaches do not achieve satisfactory performance compared with the conventional data augmentation techniques. These results suggest a direction for future research.
https://doi.org/10.7236/IJIBC.2019.11.4.37

Imbalanced Data Improvement Techniques Based on SMOTE and Light GBM

한영진;조인휘
- KIPS Transactions on Computer and Communication Systems (정보처리학회논문지:컴퓨터 및 통신 시스템) / Vol. 11, No. 12 / pp.445-452 / 2022
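In the digital world, the class distribution of imbalanced data is an important issue and carries great significance for cybersecurity. Abnormal activity in imbalanced data must be found and the problem resolved. A system that can track the patterns of all transactions is needed, but when machine learning is performed on imbalanced data with abnormal patterns, performance on the minority class is generally neglected and degraded, and the prediction model can become inaccurately biased. In this paper, as an approach to handling imbalanced datasets, we used the Synthetic Minority Oversampling Technique (SMOTE) and the Light GBM algorithm, combining estimates to predict the target variable and improve accuracy. The experimental results were compared with the Logistic Regression, Decision Tree, KNN, Random Forest, and XGBoost algorithms. Accuracy and recall were similar across all algorithms, but for precision two algorithms stood out, Random Forest at 80.76% and Light GBM at 97.16%, and for F1-score Random Forest reached 84.67% and Light GBM 91.96%. These results confirm that with this approach Light GBM performs similarly to the five other algorithms without deviation, or improves on them by up to 16%.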
https://doi.org/10.3745/KTCCS.2022.11.12.445
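Several of the listed papers rely on SMOTE, whose core idea is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The following is a minimal, simplified sketch of that interpolation step only, not the implementation used in any of these papers. The function name `smote_like`, its parameters, and the toy data are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by SMOTE-style interpolation.

    Simplified sketch: a base point is drawn at random, one of its k nearest
    minority neighbours is chosen, and a new point is placed at a random
    position on the line segment between the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


if __name__ == "__main__":
    minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
    for point in smote_like(minority_points, n_new=3):
        print(point)
```

Because every synthetic point lies on a segment between two existing minority points, the new samples stay inside the region the minority class already occupies, which is the property that distinguishes this approach from simple duplication.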

Simulated Annealing for Overcoming Data Imbalance in Mold Injection Process

이동주
- Journal of the Society of Korea Industrial and Systems Engineering (산업경영시스템학회지) / Vol. 45, No. 4 / pp.233-239 / 2022
Injection molding is a process in which thermoplastic resin is heated into a fluid state, injected under pressure into the cavity of a mold, and then cooled in the mold to produce a product identical in shape to the cavity. It enables mass production and complex shapes, and various factors such as resin temperature, mold temperature, injection speed, and pressure affect product quality. In the data collected at the manufacturing site, there is a lot of data on good products but little data on defective products, resulting in serious data imbalance. To resolve this imbalance efficiently, undersampling, oversampling, and composite sampling are usually applied. In this study, oversampling techniques such as random oversampling (ROS), the synthetic minority oversampling technique (SMOTE), and adaptive synthetic sampling (ADASYN), which augment the data of the minority class to approach that of the majority class, are applied together with composite sampling, which uses both undersampling and oversampling. For composite sampling, SMOTE+ENN and SMOTE+Tomek were used. Artificial neural network techniques are used to predict product quality. Specifically, MLP and RNN are applied, and various parameters of MLP and RNN need to be optimized. In this study, we proposed an SA technique that optimizes the choice of sampling method, the minority-class ratio for the sampling method, and the batch size and number of hidden-layer units of the MLP and RNN. The existing sampling methods and the proposed SA method were compared using accuracy, precision, recall, and F1 score to demonstrate the superiority of the proposed method.
https://doi.org/10.11627/jksie.2022.45.4.233

Study of oversampling algorithms for soil classifications by field velocity resistivity probe

Lee, Jong-Sub;Park, Junghee;Kim, Jongchan;Yoon, Hyung-Koo
- Geomechanics and Engineering / Vol. 30, No. 3 / pp.247-258 / 2022
A field velocity resistivity probe (FVRP) can measure compressional waves, shear waves, and electrical resistivity in boreholes. The objective of this study is to classify soils with machine learning techniques using the elastic wave velocities and electrical resistivity measured by the FVRP. Field and laboratory tests are performed, and the measured values are used as input variables to classify silt sand, sand, silty clay, and clay-sand mixture layers. The accuracy of k-nearest neighbors (KNN), naive Bayes (NB), random forest (RF), and support vector machine (SVM) classifiers, selected to perform the classification with optimized hyperparameters, is evaluated. The accuracies are 0.76, 0.91, 0.94, and 0.88 for the KNN, NB, RF, and SVM algorithms, respectively. To increase the amount of data for each soil layer, the synthetic minority oversampling technique (SMOTE) and the conditional tabular generative adversarial network (CTGAN) are applied to overcome imbalance in the dataset. The CTGAN provides improved accuracy for the KNN, NB, RF, and SVM algorithms. The results demonstrate that the values measured by the FVRP can be used to classify soil layers from three kinds of data with machine learning algorithms.
https://doi.org/10.12989/gae.2022.30.3.247

Prediction Model for Gastric Cancer via Class Balancing Techniques

Danish, Jamil;Sellappan, Palaniappan;Sanjoy Kumar, Debnath;Muhammad, Naseem;Susama, Bagchi;Asiah, Lokman
- International Journal of Computer Science & Network Security / Vol. 23, No. 1 / pp.53-63 / 2023
Many researchers are trying hard to minimize the incidence of cancers, mainly Gastric Cancer (GC). For GC, the five-year survival rate is generally 5-25%, but for Early Gastric Cancer (EGC), it is almost 90%. Predicting the onset of stomach cancer based on risk factors will allow for an early diagnosis and more effective treatment. Although there are several models for predicting stomach cancer, most of these models are based on unbalanced datasets, which favour the majority class. However, it is imperative to correctly identify cancer patients, who are in the minority class. This research aims to apply three class-balancing approaches to the NHS dataset before developing supervised learning strategies: Oversampling (Synthetic Minority Oversampling Technique or SMOTE), Undersampling (SpreadSubsample), and Hybrid System (SMOTE + SpreadSubsample). This study uses Naive Bayes, Bayesian Network, Random Forest, and Decision Tree (C4.5) methods. We measured these classifiers' efficacy using their Receiver Operating Characteristic (ROC) curves, sensitivity, and specificity. The validation data was used to test several ways of balancing the classifiers. The final prediction model was built on the one that did the best overall.
https://doi.org/10.22937/IJCSNS.2023.23.1.8
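Sensitivity and specificity, the measures named in this abstract, are computed from the confusion-matrix counts: sensitivity is the fraction of actual positives that are detected, and specificity is the fraction of actual negatives that are correctly rejected. The sketch below shows the calculation on made-up labels; the function name and the data are illustrative assumptions and have nothing to do with the NHS dataset.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    A ratio is reported as NaN when its denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical labels: 1 = patient (minority class), 0 = non-patient.
    actual = [1, 1, 1, 0, 0, 0, 0, 0]
    predicted = [1, 1, 0, 0, 0, 0, 0, 1]
    print(sensitivity_specificity(actual, predicted))
```

On imbalanced data these two measures are more informative than overall accuracy, because a classifier that always predicts the majority class can score high accuracy while its sensitivity is zero.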

A Single-ended Simultaneous Bidirectional Transceiver in 65-nm CMOS Technology

Jeon, Min-Ki;Yoo, Changsik
- JSTS: Journal of Semiconductor Technology and Science / Vol. 16, No. 6 / pp.817-824 / 2016
A simultaneous bidirectional transceiver over a single wire has been developed in a 65 nm CMOS technology for a command and control bus. The echo signals of the simultaneous bidirectional link are cancelled by controlling the decision level of receiver comparators without power-hungry operational amplifier (op-amp) based circuits. With the clock information embedded in the rising edges of the signals sent from the source side to the sink side, the data is recovered by an open-loop digital circuit with 20 times blind oversampling. The data rate of the simultaneous bidirectional transceiver in each direction is 75 Mbps and therefore the overall signaling bandwidth is 150 Mbps. The measured energy efficiency of the transceiver is 56.7 pJ/b and the bit-error-rate (BER) is less than $10^{-12}$ with $2^7-1$ pseudo-random binary sequence (PRBS) pattern for both signaling directions.
https://doi.org/10.5573/JSTS.2016.16.6.817

Experimental Comparison of Network Intrusion Detection Models Solving Imbalanced Data Problem

이종화;방지원;김종욱;최미정
- KNOM Review / Vol. 23, No. 2 / pp.18-28 / 2020
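As computing environments advance, the benefits that IT provides to people in fields such as medicine, industry, communications, and culture are growing, and quality of life is improving. At the same time, there are various malicious attacks that target these advanced network environments. Firewalls, intrusion detection systems, and similar tools exist to detect such attacks in advance, but they have limits in detecting malicious attacks that evolve day by day. To address this, research on intrusion detection using machine learning is actively under way, but false positives and false negatives occur because of imbalance in the training datasets. In this paper, we used random oversampling to address the imbalance problem of the UNSW-NB15 dataset, which is used for network intrusion detection. Through experiments, we compared and analyzed the models' accuracy, precision, recall, F1-score, training and prediction time, and hardware resource consumption. Building on this work, we intend to move toward more efficient network intrusion detection models by using other methods for handling imbalanced data besides random oversampling, together with higher-performing models.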
https://doi.org/10.22670/knom.2020.23.2.18
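Random oversampling, the method this paper applies, balances a dataset by duplicating randomly chosen minority-class samples until every class is as large as the largest one. The following is a minimal sketch of that idea under stated assumptions, not the authors' code. The function name `random_oversample` and the toy records are hypothetical and unrelated to UNSW-NB15.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples until all classes have equal size."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw with replacement to fill the gap up to the largest class size.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


if __name__ == "__main__":
    # Hypothetical records: four "normal" flows and two "attack" flows.
    records = ["n1", "n2", "n3", "n4", "a1", "a2"]
    classes = ["normal", "normal", "normal", "normal", "attack", "attack"]
    balanced_x, balanced_y = random_oversample(records, classes)
    print(Counter(balanced_y))
```

Since the duplicates carry no new information, oversampling of this kind is normally applied to the training split only, so that copies of the same record do not appear in both training and test data.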

Quality Prediction Model for Manufacturing Process of Free-Machining 303-series Stainless Steel Small Rolling Wire Rods

서석준;김흥섭
- Journal of the Society of Korea Industrial and Systems Engineering (산업경영시스템학회지) / Vol. 44, No. 4 / pp.12-22 / 2021
This article proposes a machine learning model, i.e., a classifier, for predicting the production quality of free-machining 303-series stainless steel (STS303) small rolling wire rods according to the operating conditions of the manufacturing process. To develop the classifier, manufacturing data for 37 operating variables were collected from the manufacturing execution system (MES) of Company S, and 12 derived variables were generated based on a literature review and interviews with field experts. The research consisted of data preprocessing, exploratory data analysis, feature selection, machine learning modeling, and the evaluation of alternative models. In the preprocessing stage, missing values and outliers are removed, and oversampling with SMOTE (Synthetic Minority Oversampling Technique) is applied to resolve data imbalance. Features are selected by the variable importance of LASSO (Least Absolute Shrinkage and Selection Operator) regression, extreme gradient boosting (XGBoost), and random forest models. Finally, logistic regression, support vector machine (SVM), random forest, and XGBoost models are developed as classifiers to predict whether products will be adequate or defective under new operating conditions. The optimal hyperparameters for each model are investigated with grid search and random search based on k-fold cross-validation. In the experiment, XGBoost showed relatively high predictive performance compared to the other models, with an accuracy of 0.9929, specificity of 0.9372, F₁-score of 0.9963, and logarithmic loss of 0.0209. The classifier developed in this study is expected to improve productivity by enabling effective management of the manufacturing process for STS303 small rolling wire rods.
https://doi.org/10.11627/jksie.2021.44.4.012

Selecting Machine Learning Model Based on Natural Language Processing for Shanghanlun Diagnostic System Classification

김영남
- Journal of Korean Medical Association of Clinical Sanghan-Geumgwe (대한상한금궤의학회지) / Vol. 14, No. 1 / pp.41-50 / 2022
Objective: The purpose of this study is to explore the most suitable machine learning algorithm for Shanghanlun diagnostic system classification using natural language processing (NLP). Methods: A total of 201 data items were collected from 『Shanghanlun』 and 『Clinical Shanghanlun』. 'Taeyangbyeong-gyeolhyung' and 'Eumyangyeokchahunobokbyeong' were excluded to prevent oversampling or undersampling. Data were preprocessed using a Twitter Korean tokenizer and used to train logistic regression, ridge regression, lasso regression, naive Bayes classifier, decision tree, and random forest algorithms. The accuracies of the models were compared. Results: Ridge regression and the naive Bayes classifier showed an accuracy of 0.843, logistic regression and random forest showed an accuracy of 0.804, decision tree showed an accuracy of 0.745, and lasso regression showed an accuracy of 0.608. Conclusions: Ridge regression and the naive Bayes classifier are suitable NLP machine learning models for Shanghanlun diagnostic system classification.
