• Title/Abstract/Keywords: prediction algorithms

994 search results (processing time 0.03 s)

A Study on the Prediction Performance of the Correspondence Mean Algorithm in Collaborative Filtering Recommendation

  • 이석준;이희춘
    • 경영정보학연구 / Vol. 9, No. 1 / pp. 85-103 / 2007
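  • The purpose of this study is to evaluate the prediction performance of collaborative filtering algorithms for more accurate customer preference prediction. To compare prediction accuracy, we compared the MAE of customer preference predictions made by a neighborhood-based collaborative filtering algorithm and by the correspondence mean algorithm. The experiments used the MovieLens 1 Million dataset. The similarity weights used in each prediction algorithm were the commonly used Pearson correlation coefficient and vector similarity, and the analysis showed that the correspondence mean algorithm predicts more accurately than the neighborhood-based collaborative filtering algorithm. Both similarity weights are computed from the preference ratings that two customers have given to the same items. When the number of co-rated items is small, the computed similarity weight is overestimated. To correct overestimated similarity weights and improve prediction accuracy, we applied a wider range for the number of co-rated movies than previous studies considered, and found that the improvement trend differs by prediction method.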
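
The similarity and evaluation steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the linear devaluation rule and the `cutoff=50` default are assumptions chosen for the example, and the correspondence mean algorithm itself is not reproduced because the abstract does not define it.

```python
from math import sqrt

def pearson_corated(a, b):
    """Pearson correlation over the items both users rated.

    a and b map item id -> rating. Returns (similarity, number of co-rated items).
    """
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return 0.0, n
    mean_a = sum(a[i] for i in common) / n
    mean_b = sum(b[i] for i in common) / n
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0, n
    return cov / sqrt(var_a * var_b), n

def devalue(sim, n_common, cutoff=50):
    """Shrink a similarity that rests on few co-rated items.

    The linear rule and the cutoff value are assumptions for illustration.
    """
    return sim * min(n_common, cutoff) / cutoff

def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(actual)

# Two users who share only two movies look perfectly correlated
# until the small overlap is taken into account.
u = {"m1": 5, "m2": 3}
v = {"m1": 4, "m2": 2, "m3": 5}
sim, n = pearson_corated(u, v)
print(sim, n, devalue(sim, n))
```

Widening `cutoff` corresponds to requiring more co-rated movies before a similarity weight is trusted at full strength.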

Method of Analyzing Important Variables using Machine Learning-based Golf Putting Direction Prediction Model

  • Kim, Yeon Ho;Cho, Seung Hyun;Jung, Hae Ryun;Lee, Ki Kwang
    • 한국운동역학회지 / Vol. 32, No. 1 / pp. 1-8 / 2022
  • Objective: This study proposes a methodology for identifying the variables that most affect putting direction prediction, using a machine learning-based putting direction prediction model trained on IMU sensor data. Method: Putting data were collected with an IMU sensor measuring 12 variables from 6 adult males in their 20s at K University who had no golf experience. The data were preprocessed for machine learning, and models were built using five machine learning algorithms. The best-performing model was selected as the proposed model, and the 12 IMU variables were then applied one by one to analyze which variables affect learning performance. Results: Among the five machine learning algorithms compared (K-NN, Naive Bayes, Decision Tree, Random Forest, and LightGBM), the LightGBM-based model had the highest prediction accuracy. Using the LightGBM algorithm, an experiment was performed to rank the importance of the variables that affect the model's direction prediction. Conclusion: Among the five machine learning algorithms, LightGBM best predicted the putting direction. The variable with the greatest influence on the predicted putting direction was the left-right inclination (Roll).

Combining genetic algorithms and support vector machines for bankruptcy prediction

  • Min, Sung-Hwan;Lee, Ju-Min;Han, In-Goo
    • 한국지능정보시스템학회:학술대회논문집 / 한국지능정보시스템학회 2004 Fall Conference / pp. 179-188 / 2004
  • Bankruptcy prediction is an important and widely studied topic because it can have a significant impact on bank lending decisions and profitability. Recently, support vector machines (SVMs) have been applied to the problem of bankruptcy prediction. SVM-based methods have been compared with other methods such as neural networks and logistic regression, and have shown good results. Genetic algorithms (GAs) have been increasingly applied in conjunction with other AI techniques such as neural networks and CBR. However, few studies have dealt with the integration of GA and SVM, though there is great potential for useful applications in this area. This study proposes methods for improving SVM performance in two respects: feature subset selection and parameter optimization. GA is used to optimize both the feature subset and the parameters of the SVM simultaneously for bankruptcy prediction.

Design and Implementation of Malicious URL Prediction System based on Multiple Machine Learning Algorithms

  • 강홍구;신삼신;김대엽;박순태
    • 한국멀티미디어학회논문지 / Vol. 23, No. 11 / pp. 1396-1405 / 2020
  • Cyber threats such as forced collection of personal information and distribution of malicious code through malicious URLs continue to occur. To cope with such threats, security technologies that quickly detect malicious URLs and prevent damage are required. In a web environment, malicious URLs take various forms and are created and deleted frequently, so detection or filtering by signature matching has limits. Recently, research on detecting and predicting malicious URLs using machine learning techniques has been actively conducted. Existing studies have proposed various features and machine learning algorithms for predicting malicious URLs, but most of them only suggest specialized algorithms with supplemented features and preprocessing, so they do not sufficiently reflect the strengths of different machine learning algorithms. This paper proposes a system for predicting malicious URLs using multiple machine learning algorithms, and reports an experiment that combines the prediction results of multiple machine learning models to increase the accuracy of malicious URL prediction. The experiments showed that combining multiple models improves prediction performance compared with a single model.
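
The abstract does not say how the models' outputs are combined, so the following is only a sketch of two common combination rules, soft voting (averaging probabilities) and majority voting. The function names, the 0.5 threshold and the example scores are hypothetical.

```python
def soft_vote(probabilities, threshold=0.5):
    """Average per-model probabilities that a URL is malicious.

    Returns (is_malicious, averaged_probability). The threshold is an
    assumed default, not a value from the paper.
    """
    avg = sum(probabilities) / len(probabilities)
    return avg >= threshold, avg

def majority_vote(labels):
    """Return True when more than half of the models flag the URL (1 = malicious)."""
    return sum(labels) * 2 > len(labels)

# Three hypothetical models score the same URL.
scores = [0.9, 0.4, 0.7]
print(soft_vote(scores))
print(majority_vote([int(s >= 0.5) for s in scores]))
```

Soft voting lets a confident model outweigh an uncertain one, while majority voting treats every model equally; which behaves better depends on how well calibrated the individual models are.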

Partial AUC maximization for essential gene prediction using genetic algorithms

  • Hwang, Kyu-Baek;Ha, Beom-Yong;Ju, Sanghun;Kim, Sangsoo
    • BMB Reports / Vol. 46, No. 1 / pp. 41-46 / 2013
  • Identifying genes indispensable for an organism's life and their characteristics is one of the central questions in current biological research, and hence it would be helpful to develop computational approaches towards the prediction of essential genes. The performance of a predictor is usually measured by the area under the receiver operating characteristic curve (AUC). We propose a novel method by implementing genetic algorithms to maximize the partial AUC that is restricted to a specific interval of lower false positive rate (FPR), the region relevant to follow-up experimental validation. Our predictor uses various features based on sequence information, protein-protein interaction network topology, and gene expression profiles. A feature selection wrapper was developed to alleviate the over-fitting problem and to weigh each feature's relevance to prediction. We evaluated our method using the proteome of budding yeast. Our implementation of genetic algorithms maximizing the partial AUC below 0.05 or 0.10 of FPR outperformed other popular classification methods.
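
To make the optimization target concrete, here is a minimal sketch of the partial AUC restricted to a low-FPR interval, the quantity a genetic algorithm could use as its fitness value. It is an unnormalized trapezoidal area; the function names are assumptions, and the genetic algorithm and feature selection wrapper from the paper are not reproduced.

```python
def roc_points(labels, scores):
    """ROC points (fpr, tpr) from (0, 0) to (1, 1), grouping tied scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        # Consume every example sharing this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def partial_auc(labels, scores, max_fpr):
    """Area under the ROC curve for FPR in [0, max_fpr] (not normalized)."""
    pts = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Cut the segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area

labels = [1, 0, 1, 0]            # 1 = essential gene (toy data)
scores = [0.9, 0.8, 0.7, 0.1]
print(partial_auc(labels, scores, 1.0), partial_auc(labels, scores, 0.25))
```

Setting `max_fpr` to 0.05 or 0.10 gives the intervals named in the abstract; with `max_fpr=1.0` the function reduces to the ordinary AUC.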

Machine learning-based prediction of wind forces on CAARC standard tall buildings

  • Yi Li;Jie-Ting Yin;Fu-Bin Chen;Qiu-Sheng Li
    • Wind and Structures / Vol. 36, No. 6 / pp. 355-366 / 2023
  • Although machine learning (ML) techniques have been widely used in various fields of engineering practice, their application in wind engineering is still at an initial stage. To evaluate the feasibility of machine learning algorithms for predicting wind loads on high-rise buildings, this study took the exposure category type, wind direction and the height of local wind force as the input features and adopted four machine learning algorithms, k-nearest neighbor (KNN), support vector machine (SVM), gradient boosting regression tree (GBRT) and extreme gradient (XG) boosting, to predict wind force coefficients of the CAARC standard tall building model. All the hyper-parameters of the four ML algorithms are optimized by the tree-structured Parzen estimator (TPE). The results show that mean drag force coefficients and RMS lift force coefficients are predicted well by the GBRT model, while RMS drag force coefficients are predicted better by the XG boosting model. The proposed machine learning-based algorithms for wind load prediction can be an alternative to traditional wind tunnel tests and computational fluid dynamics simulations.

Fast Motion Estimation Algorithm Based on Thresholds with Controllable Computation

  • 김종남
    • 융합신호처리학회논문지 / Vol. 20, No. 2 / pp. 84-90 / 2019
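  • The heavy computational cost of full search and lossless methods for motion estimation in video compression has driven the development of fast motion estimation algorithms. Appropriate control over computation and prediction quality is still needed. This paper proposes a fast motion estimation method that, compared with full-search-based methods, reduces computation efficiently while nearly preserving prediction quality, and that allows both quality and computation to be controlled. The proposed algorithm uses thresholds on the partial block error sum and on changes in the minimum-error position at each step. For each candidate point it computes a partial block error sum and applies a series of thresholds to eliminate impossible candidates early; it also checks how many times the best candidate at the minimum-error point has remained unchanged across steps, which yields fast motion estimation. Adjusting the thresholds makes it easy to control image quality and computation. The proposed algorithm can be used on its own or combined with existing fast algorithms, and in both cases achieves a strong reduction in computation relative to prediction quality, which the experimental results verify.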
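
The early-rejection idea in this abstract can be illustrated with a much simpler scheme than the paper's: a full search that accumulates the block error row by row and drops a candidate as soon as its partial sum reaches a threshold tied to the best error so far. This sketch is not the proposed algorithm. The single `alpha` knob, the row-wise partial sums and the function name are assumptions, and the paper's check on how often the best candidate stays unchanged is omitted.

```python
def partial_sad_search(block, ref, top, left, search_range, alpha=1.0):
    """Block matching with row-wise early rejection of candidates.

    block: n x n list of pixel rows; ref: reference frame as a list of rows.
    alpha (assumed knob, must be > 0): 1.0 gives the same minimum error as a
    plain full search; smaller values reject candidates sooner, trading
    quality for fewer computed rows.
    Returns (motion_vector, best_error, rows_computed).
    """
    n = len(block)
    best_cost, best_mv = float("inf"), (0, 0)
    rows_computed = 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y0, x0 = top + dy, left + dx
            if y0 < 0 or x0 < 0 or y0 + n > len(ref) or x0 + n > len(ref[0]):
                continue  # candidate falls outside the reference frame
            partial = 0
            for r in range(n):
                partial += sum(abs(block[r][c] - ref[y0 + r][x0 + c])
                               for c in range(n))
                rows_computed += 1
                if partial >= alpha * best_cost:
                    break  # cannot beat the threshold: drop this candidate
            else:
                # Every row was summed and the total stayed under the threshold.
                best_cost, best_mv = partial, (dy, dx)
    return best_mv, best_cost, rows_computed

# Toy frame; the 2x2 block is copied from position (row 2, column 3).
ref = [[10 * y * y + x * x for x in range(6)] for y in range(6)]
block = [row[3:5] for row in ref[2:4]]
print(partial_sad_search(block, ref, top=1, left=1, search_range=2))
```

Lowering `alpha` below 1.0 plays the role of tightening the thresholds: fewer rows are summed, at the risk of missing the true minimum.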

Comparative characteristic of ensemble machine learning and deep learning models for turbidity prediction in a river

  • 박정수
    • 상하수도학회지 / Vol. 35, No. 1 / pp. 83-91 / 2021
  • The increased turbidity in rivers during flood events has various effects on water environmental management, including drinking water supply systems. Thus, prediction of turbid water is essential for water environmental management. Recently, advanced machine learning algorithms have been increasingly used in water environmental management. Ensemble machine learning algorithms such as random forest (RF) and gradient boosting decision tree (GBDT) are among the most popular for this purpose, along with deep learning algorithms such as recurrent neural networks. In this study, GBDT, an ensemble machine learning algorithm, and the gated recurrent unit (GRU), a recurrent neural network algorithm, are used to develop models that predict turbidity in a river. The observation intervals of the input data were 2, 4, 8, 24, 48, 120 and 168 h. The root-mean-square error-observations standard deviation ratio (RSR) ranges over 0.182~0.766 for GRU and 0.400~0.683 for GBDT. Both models show similar prediction accuracy with RSR of 0.682 for GRU and 0.683 for GBDT. GRU shows better prediction accuracy when the observation interval is relatively short (i.e., 2, 4, and 8 h), whereas GBDT shows better prediction accuracy when the observation interval is relatively long (i.e., 48, 120, and 168 h). The results suggest that the characteristics of the input data should be considered when developing a model to predict turbidity.
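
For readers unfamiliar with the RSR metric reported here, a minimal sketch of its usual definition follows: the RMSE divided by the standard deviation of the observations, so 0 is a perfect fit and 1 matches a model that always predicts the observed mean. The function name and the toy values are illustrative only.

```python
from math import sqrt

def rsr(observed, predicted):
    """RMSE-observations standard deviation ratio (lower is better)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("observations have zero variance")
    # The 1/n factors of RMSE and the standard deviation cancel.
    return sqrt(sse / sst)

turbidity = [1.0, 2.0, 3.0]              # toy observations
print(rsr(turbidity, [1.0, 2.0, 3.0]))   # perfect prediction
print(rsr(turbidity, [2.0, 2.0, 2.0]))   # always predicting the mean
```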

Comparative Analysis for Real-Estate Price Index Prediction Models using Machine Learning Algorithms: LIME's Interpretability Evaluation

  • 조보근;박경배;하성호
    • 한국정보시스템학회지:정보시스템연구 / Vol. 29, No. 3 / pp. 119-144 / 2020
  • Purpose: Real estate usually accounts for the largest share of the physical assets held by individuals, organizations, and governments, and instability in the real estate market seriously affects the economic condition of each economic subject. Consequently, predicting the real estate market has attracted attention for various reasons, such as financial investment, administrative convenience, and wealth management. Additionally, the development of machine learning algorithms and computing hardware raises expectations for more precise and useful prediction models for the real estate market. Design/methodology/approach: In response to this demand, this paper aims to provide a framework for forecasting the real estate market with machine learning algorithms. The framework consists of demonstrating the prediction efficiency of each machine learning algorithm, interpreting the internal feature effects of the prediction model with a state-of-the-art algorithm, LIME (Local Interpretable Model-agnostic Explanation), and comparing the results across cities. Findings: This research could not only strengthen the academic base of the information systems and real estate fields, but also help resolve information asymmetry in the real estate market among economic subjects. This research revealed that macroeconomic indicators, real estate-related indicators, and Google Trends search indexes can predict real-estate prices quite well.

Response prediction of laced steel-concrete composite beams using machine learning algorithms

  • Thirumalaiselvi, A.;Verma, Mohit;Anandavalli, N.;Rajasankar, J.
    • Structural Engineering and Mechanics / Vol. 66, No. 3 / pp. 399-409 / 2018
  • This paper demonstrates the potential application of machine learning algorithms for approximate prediction of the load and deflection capacities of the novel type of Laced Steel-Concrete Composite (LSCC) beams proposed by Anandavalli et al. (Engineering Structures 2012). Initially, global and local responses measured on an LSCC beam specimen in an experiment are used to validate a nonlinear FE model of the LSCC beams. The data for the machine learning algorithms are then generated using the validated FE model for a range of values of the identified sensitive parameters. The performance of four well-known machine learning algorithms, viz., Support Vector Regression (SVR), Minimax Probability Machine Regression (MPMR), Relevance Vector Machine (RVM) and Multigene Genetic Programming (MGGP), in approximately estimating the load and deflection capacities is compared in terms of well-defined error indices. Through relative comparison of the estimated values, it is demonstrated that the algorithms explored in the present study provide a good alternative to expensive experimental testing and sophisticated numerical simulation of the response of LSCC beams. The load carrying and displacement capacities of the LSCC beams were predicted well by MGGP and MPMR, respectively.