• Title/Abstract/Keywords: bayesian decision

204 search results (processing time: 0.028 s)

A Review of Machine Learning Algorithms for Fraud Detection in Credit Card Transaction

  • Lim, Kha Shing;Lee, Lam Hong;Sim, Yee-Wai
    • International Journal of Computer Science & Network Security / Vol. 21, No. 9 / pp.31-40 / 2021
  • The increasing number of credit card fraud cases has become a considerable problem over the past decades. This phenomenon is due to the expansion of new technologies, including the increased popularity and volume of online banking transactions and e-commerce. To address the problem of credit card fraud detection, a rule-based approach has been widely used to detect and guard against fraudulent activities. However, it requires huge computational power and involves high complexity in defining and building the rule base for pattern matching in order to precisely identify fraud patterns. In addition, it lacks the intelligence and ability to predict or analyse transaction data in search of new fraud patterns and strategies. As such, Data Mining and Machine Learning algorithms are proposed in this paper to overcome these shortcomings. The aim of this paper is to highlight the important techniques and methodologies that are employed in fraud detection, while at the same time focusing on the existing literature. Methods such as Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), naïve Bayesian, k-Nearest Neighbour (k-NN), Decision Tree and Frequent Pattern Mining algorithms are reviewed and evaluated for their performance in detecting fraudulent transactions.

Analysis of causality of Baltic Drybulk index (BDI) and maritime trade volume

  • 배성훈;박근식
    • 무역학회지 / Vol. 44, No. 2 / pp.127-141 / 2019
  • In this study, the relationship between the Baltic Dry Index (BDI) and maritime trade volume in the dry cargo market was examined using a vector autoregressive (VAR) model. Data from 1992 to 2018 were analyzed for the BDI and for the maritime trade volumes of iron ore, steam coal, coking coal, grain, and minor bulks. Granger causality analysis showed that the BDI affects the trade volumes of coking coal and minor bulks, whereas the trade volumes of iron ore, steam coal and grain show no causal relationship with the BDI. Impulse response analysis showed that a BDI shock had its greatest impact on coking coal at a two-year lag and that the impact was negligible at a ten-year lag. In addition, the effect of a BDI shock on minor bulks was strongest at a three-year lag and was negligible at a ten-year lag. This study examined the relationship between maritime trade volume and the BDI in the dry bulk shipping market, in which uncertainty is high. The results are economically relevant in that they can support the risk management of shipping companies. In addition, it is significant from an academic point of view that the long-term relationship between the two time series was analyzed through causality tests between variables. However, it is necessary to develop a forecasting model that will help decision makers in maritime markets using more sophisticated methods such as the Bayesian VAR model.

Forecasting tunnel path geology using Gaussian process regression

  • Mahmoodzadeh, Arsalan;Mohammadi, Mokhtar;Abdulhamid, Sazan Nariman;Ali, Hunar Farid Hama;Ibrahim, Hawkar Hashim;Rashidi, Shima
    • Geomechanics and Engineering / Vol. 28, No. 4 / pp.359-374 / 2022
  • Geological conditions are crucial in decision-making during the planning and design phase of a tunnel project. Estimation of the geological conditions of road tunnels is subject to significant uncertainties. In this work, the effectiveness of a novel regression method in estimating geological or geotechnical parameters of road tunnel projects was explored. This method, called Gaussian process regression (GPR), formulates the learning of the regressor within a Bayesian framework. The GPR model was trained with data from earlier tunnel projects. To verify its feasibility, the GPR technique was applied to a road tunnel to predict three geological/geomechanical parameters: Rock Mass Rating (RMR), Rock Structure Rating (RSR) and Q-value. Finally, to validate the GPR approach, the forecasted results were compared to the field-observed results. From this comparison, it was concluded that the GPR produced very good predictions. The R-squared values between the GPR predictions and the field-observed results for the RMR, RSR and Q-value were 0.8581, 0.8148 and 0.8788, respectively.

Differentiation among stability regimes of alumina-water nanofluids using smart classifiers

  • Daryayehsalameh, Bahador;Ayari, Mohamed Arselene;Tounsi, Abdelouahed;Khandakar, Amith;Vaferi, Behzad
    • Advances in nano research / Vol. 12, No. 5 / pp.489-499 / 2022
  • Nanofluids have recently attracted substantial scientific interest as cooling media. However, their stability remains an obstacle to successful use in industrial applications. Different factors, including temperature, nanoparticle and base fluid characteristics, pH, ultrasonic power and frequency, agitation time, and surfactant type and concentration, determine the nanofluid stability regime. Indeed, it is often too complicated or even impossible to accurately find the conditions resulting in a stabilized nanofluid. Furthermore, there are no empirical, semi-empirical, or even intelligent approaches for anticipating the stability of nanofluids. Therefore, this study introduces a straightforward and reliable intelligent classifier for discriminating among the stability regimes of alumina-water nanofluids based on the Zeta potential margins. In this regard, various intelligent classifiers (i.e., deep learning and multilayer perceptron neural network, decision tree, GoogleNet, and multi-output least squares support vector regression) have been designed, and their classification accuracy was compared. This comparison showed that the multilayer perceptron neural network (MLPNN) with the SoftMax activation function trained by the Bayesian regularization algorithm is the best classifier for the considered task. This intelligent classifier accurately detects the stability regimes of more than 90% of 345 different nanofluid samples. The model achieved an overall classification accuracy of 90.1% and a misclassification rate of 9.9%. This research is the first attempt at anticipating the stability of alumina-water nanofluids from some easily measured independent variables.

Prediction Model for Gastric Cancer via Class Balancing Techniques

  • Danish, Jamil;Sellappan, Palaniappan;Sanjoy Kumar, Debnath;Muhammad, Naseem;Susama, Bagchi;Asiah, Lokman
    • International Journal of Computer Science & Network Security / Vol. 23, No. 1 / pp.53-63 / 2023
  • Many researchers are trying hard to minimize the incidence of cancers, mainly Gastric Cancer (GC). For GC, the five-year survival rate is generally 5-25%, but for Early Gastric Cancer (EGC), it is almost 90%. Predicting the onset of stomach cancer based on risk factors will allow for an early diagnosis and more effective treatment. Although there are several models for predicting stomach cancer, most of these models are based on unbalanced datasets, which favour the majority class. However, it is imperative to correctly identify cancer patients, who are in the minority class. This research aims to apply three class-balancing approaches to the NHS dataset before developing supervised learning strategies: Oversampling (Synthetic Minority Oversampling Technique or SMOTE), Undersampling (SpreadSubsample), and Hybrid System (SMOTE + SpreadSubsample). This study uses Naive Bayes, Bayesian Network, Random Forest, and Decision Tree (C4.5) methods. We measured these classifiers' efficacy using their Receiver Operating Characteristic (ROC) curves, sensitivity, and specificity. The validation data were used to test the classifiers under the different balancing approaches. The final prediction model was built on the one that performed best overall.
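
As a rough illustration of why sensitivity and specificity are reported alongside accuracy on unbalanced data, the sketch below computes both from confusion-matrix counts. The function name and the patient counts are hypothetical and are not taken from the paper.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of actual positives that were detected
    specificity = tn / (tn + fp)  # share of actual negatives correctly rejected
    return sensitivity, specificity


# Hypothetical counts: 1000 patients, 50 of them with cancer (minority class).
# A classifier that labels everyone "no cancer" never detects a cancer patient.
tp, fn, tn, fp = 0, 50, 950, 0
accuracy = (tp + tn) / (tp + fn + tn + fp)
sens, spec = sensitivity_specificity(tp, fn, tn, fp)
print(f"accuracy={accuracy:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this made-up case the accuracy looks high even though no minority-class patient is identified, which is the failure mode that class balancing is meant to reduce.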

Analysis System for Traffic Accident based on WEB

  • 홍유식;한창평
    • 한국인터넷방송통신학회논문지 / Vol. 22, No. 6 / pp.13-20 / 2022
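  • Road conditions and weather conditions are very important factors in the fatality rate of traffic accidents that occur in foggy and icy road sections in winter. In this paper, we assumed traffic accident prediction data and carried out a simulation that predicts traffic accident risk. In addition, to reduce and prevent traffic accidents, we applied WEKA data mining techniques and the open-source TensorFlow library to the traffic fatality data provided by the traffic authority (교통공단) in order to perform factor analysis and predict traffic accident fatality rates. As additional functions, we describe the results of a simulation in which a web-based map display provides drivers with information on foggy and icy sections, and of a simulation in which traffic accident photographs are transmitted in real time.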

Committee Learning Classifier based on Attribute Value Frequency

  • 이창환;정인철;권영식
    • 한국정보과학회논문지:데이타베이스 / Vol. 37, No. 4 / pp.177-184 / 2010
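  • Sensor data, logistics and distribution data, credit data, stock data, and similar sources now generate continuous data streams that are more diverse and larger in volume than in the past. Such data are difficult to learn from because they are large in volume and change rapidly. To address this problem, existing methods continuously train on the most recent data within a fixed-size window, either rebuilding the whole model or replacing part of it. However, because these methods must keep building new models, training on large continuous data takes considerable time and cost. A learning technique is therefore needed that can learn incrementally and continuously whenever additional training data arrive. The Bayesian classifier, a representative incremental learning method, can be used to classify quickly without changing the learning model, but it starts from the assumption that the prior probabilities are known and therefore requires at least a certain amount of training data. This study therefore proposes a new incremental learning algorithm that, like the Bayesian classifier, learns incrementally but can do so even when the prior probabilities are unknown. The basic idea of the proposed algorithm is to aggregate the opinions of several experts. Each attribute value is treated as one expert, and learning proceeds by awarding points when the decision of the expert committee is correct and deducting points when it is wrong. In experiments, the method performed comparably to decision trees and Bayesian classifiers and showed potential for future use in stream data analysis.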
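
The abstract gives only the outline of the algorithm (each attribute value acts as an expert, with points awarded or deducted depending on whether the committee was right), so the following is a minimal sketch under assumed details rather than the authors' method. The class name `AttributeValueCommittee`, the unit reward and penalty, the summed-score vote, and the toy data stream are all hypothetical.

```python
from collections import defaultdict


class AttributeValueCommittee:
    """Toy incremental classifier: each (attribute index, value) pair is one expert."""

    def __init__(self, reward=1.0, penalty=1.0):
        # scores[(attr_index, value)][label] -> accumulated points (assumed scheme)
        self.scores = defaultdict(lambda: defaultdict(float))
        self.labels = set()
        self.reward = reward
        self.penalty = penalty

    def predict(self, row):
        """Sum the experts' points per label; return the top label (None if untrained)."""
        if not self.labels:
            return None
        totals = {label: 0.0 for label in self.labels}
        for expert in enumerate(row):
            for label, points in self.scores[expert].items():
                totals[label] += points
        # Iterating over sorted labels keeps tie-breaking deterministic.
        return max(sorted(totals), key=lambda label: totals[label])

    def learn(self, row, true_label):
        """Update on a single example, so no retraining over past data is needed."""
        predicted = self.predict(row)
        self.labels.add(true_label)
        for expert in enumerate(row):
            # Assumption: the true label always gains points ...
            self.scores[expert][true_label] += self.reward
            # ... and a wrongly chosen label loses points.
            if predicted is not None and predicted != true_label:
                self.scores[expert][predicted] -= self.penalty


# Hypothetical two-attribute stream, processed one example at a time.
stream = [
    (("sunny", "hot"), "no"),
    (("rainy", "mild"), "yes"),
    (("sunny", "mild"), "no"),
    (("rainy", "hot"), "yes"),
]
model = AttributeValueCommittee()
for row, label in stream:
    model.learn(row, label)
print(model.predict(("sunny", "cool")))
```

Unlike a naive Bayes classifier, this sketch keeps no class priors; it only stores per-expert scores, which is one possible reading of learning without knowing the prior probabilities.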

A Study on the Determination of the Risk-Loaded Premium using Risk Measures in the Credibility Theory

  • 김현태;전용호
    • 응용통계연구 / Vol. 27, No. 1 / pp.71-87 / 2014
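  • The Bayes premium, which is used as the net premium in the credibility theory of non-life insurance, has the limitation that it does not reflect tail risk. This paper addresses two topics considered important in determining the risk-loaded premium using tail risk measures. First, we show that the safety loading derived from a risk measure reflects the risk of the underlying coverage more accurately and, at the same time, avoids the wrong decisions that may result from using the Bayes premium alone. Second, using parametric models, we show that even under the same prior distribution, the tail-risk ordering of different conditional loss distributions can in general differ from the tail-risk ordering of the corresponding predictive distributions. The safety loading should therefore be based on the risk measure of the predictive distribution rather than that of the conditional loss distribution.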

Comparison of the fit of automatic milking system and test-day records with the use of lactation curves

  • Sitkowska, B.;Kolenda, M.;Piwczynski, D.
    • Asian-Australasian Journal of Animal Sciences / Vol. 33, No. 3 / pp.408-415 / 2020
  • Objective: The aim of the paper was to compare the fit of data derived from daily automatic milking systems (AMS) and monthly test-day records with the use of lactation curves; data were analysed separately for primiparas and multiparas. Methods: The study was carried out on three Polish Holstein-Friesian (PHF) dairy herds. The farms were equipped with an automatic milking system which provided information on milking performance throughout lactation. Once a month cows were also subjected to test-day milkings (method A4). Most studies described in the literature are based on test-day data; therefore, we aimed to compare models based on both test-day and AMS data to determine which mathematical model (Wood or Wilmink) would be the better fit. Results: Results show that lactation curves constructed from data derived from the AMS were better adjusted to the actual milk yield (MY) data regardless of the lactation number and model. Also, we found that the Wilmink model may be a better fit for modelling the lactation curve of PHF cows milked by an AMS, as it had the lowest values of the Akaike information criterion, Bayesian information criterion, and mean square error, had the highest coefficient of determination values, and was more accurate in estimating MY than the Wood model. Although both models underestimated peak, mean, and total MY, the Wilmink model was closer to the real values. Conclusion: Models of lactation curves may have an economic impact and may be helpful in terms of herd management and decision-making as they assist in forecasting MY at any moment of lactation. Also, data obtained from modelling can help with monitoring milk performance of each cow, diet planning, as well as monitoring the health of the cow.
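
The model comparison described above can be sketched as follows. The Wood and Wilmink curves are written in their commonly cited forms, and the AIC and BIC are computed in the least-squares form that assumes Gaussian errors and drops additive constants. The milk yields, the parameter values, and the fixed decay rate `k=0.05` are hypothetical and are not taken from the paper.

```python
import math


def wood(t, a, b, c):
    """Wood lactation curve: y = a * t**b * exp(-c * t)."""
    return a * t**b * math.exp(-c * t)


def wilmink(t, a, b, c, k=0.05):
    """Wilmink lactation curve: y = a + b * exp(-k * t) + c * t."""
    return a + b * math.exp(-k * t) + c * t


def information_criteria(observed, predicted, n_params):
    """Return (AIC, BIC) for a least-squares fit, assuming Gaussian errors."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    aic = n * math.log(rss / n) + 2 * n_params
    bic = n * math.log(rss / n) + n_params * math.log(n)
    return aic, bic


# Hypothetical daily milk yields (kg) at selected days in milk.
days = [10, 40, 70, 100, 150, 200, 250, 300]
milk = [28.0, 34.0, 33.0, 31.5, 28.5, 25.5, 23.0, 20.5]

# Hypothetical parameter values standing in for already-fitted models.
wood_pred = [wood(t, 20.0, 0.2, 0.004) for t in days]
wilmink_pred = [wilmink(t, 36.0, -12.0, -0.05) for t in days]

# Both curves are treated as having three free parameters (k is held fixed).
for name, pred in [("Wood", wood_pred), ("Wilmink", wilmink_pred)]:
    aic, bic = information_criteria(milk, pred, n_params=3)
    print(f"{name}: AIC={aic:.2f} BIC={bic:.2f}")
```

For both criteria a lower value indicates a better trade-off between fit and model complexity, which is how the abstract's "lowest values" should be read.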

A Method of Identifying Ownership of Personal Information exposed in Social Network Service

  • 김석현;조진만;진승헌;최대선
    • 정보보호학회논문지 / Vol. 23, No. 6 / pp.1103-1110 / 2013
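  • This paper proposes a method for identifying the owner of personal information disclosed on social network services. Specifically, the method automatically determines whether location information mentioned on Twitter refers to the poster's place of residence. Identifying the owner of personal information is an essential part of the process of determining how much of a particular person's personal information is exposed online and assessing the associated risk. The proposed method makes this determination through owner-identification rules that use 13 lexical and structural characteristics of tweet sentences as the feature set. In experiments with real Twitter data, the proposed method outperformed traditional document classification models such as naive Bayes with n-gram features, achieving an F1 score of 0.876.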