• Title/Abstract/Keywords: random forest model

Search results: 532 items (processing time: 0.032 s)

Estimating Population Density of Leopard Cat (Prionailurus bengalensis) from Camera Traps in Maekdo Riparian Park, South Korea

  • Park, Heebok;Lim, Anya;Choi, Tae-Young;Lim, Sang-Jin;Park, Yung-Chul
    • Journal of Forest and Environmental Science / Vol. 33, No. 3 / pp. 239-242 / 2017
  • Although camera traps have been widely used in recent decades to estimate wildlife abundance, the effort has been restricted to the small subset of species whose individuals can be recognized for mark-recapture analysis. The Random Encounter Model offers an alternative approach that estimates absolute abundance from camera-trap detection rates for any animal, without the need for individual recognition. Our study examines the feasibility and validity of the Random Encounter Model for estimating the density of endangered leopard cats (Prionailurus bengalensis) in Maekdo Riparian Park, Busan, South Korea. According to the model, the estimated leopard cat density was $1.76\,\mathrm{km}^{-2}$ (95% CI, 0.74-3.49), which corresponds to 2.46 leopard cats in the $1.4\,\mathrm{km}^2$ study area. This estimate was not statistically different from the previous leopard cat population count ($2.33{\pm}0.58$) in the same area. Our research thus demonstrates the applicability and usefulness of the Random Encounter Model for estimating the density of unmarked wildlife, which supports the management and protection of target species through a better understanding of their status.
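
The arithmetic behind the abstract's estimate can be sketched as follows. The density function uses the commonly cited form of the Random Encounter Model, D = (y/t) * pi / (v * r * (2 + theta)). The abstract does not report the survey inputs, so the argument names are assumptions and no input values are supplied for them. Only the last step, density times area, uses figures taken from the abstract.

```python
import math

def rem_density(detections, camera_days, speed_km_per_day, radius_km, angle_rad):
    """Density (animals per km^2) under the commonly cited Random Encounter Model form.

    All argument names are illustrative assumptions; the abstract does not
    report the survey inputs.
    """
    trap_rate = detections / camera_days  # y / t
    return trap_rate * math.pi / (speed_km_per_day * radius_km * (2 + angle_rad))

def abundance(density_per_km2, area_km2):
    """Expected number of animals in an area of the given size."""
    return density_per_km2 * area_km2

# Figures reported in the abstract: 1.76 animals per km^2 over a 1.4 km^2 study area.
print(round(abundance(1.76, 1.4), 2))  # 2.46
```

The product reproduces the 2.46 leopard cats quoted for the study area.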

Ensemble Deep Learning Model using Random Forest for Patient Shock Detection

  • Minsu Jeong;Namhwa Lee;Byuk Sung Ko;Inwhee Joe
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 4 / pp. 1080-1099 / 2023
  • Digital healthcare, which combines telemedicine services with digital technology and AI, is developing rapidly. Digital healthcare research is being conducted on many conditions, including shock. However, the causes of shock are diverse and its treatment is very complicated, requiring a high level of medical knowledge. In this paper, we propose a shock detection method based on the correlation between shock and data extracted from hemodynamic monitoring equipment. From the various parameters reported by this equipment, four parameters closely related to patient shock were used as the input data (feature values) for a random forest-based ensemble machine learning model. The value of the mean arterial pressure was used as the ground-truth (label) value to detect the patient's shock state. The performance was then compared with decision tree and logistic regression models using a confusion matrix. The average accuracy of the random forest model was 92.80%, which is superior to that of the other models. We hope our work will help medical staff by providing recommendations for the diagnosis and treatment of complex and difficult cases of shock.
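
As a rough illustration of the evaluation step, the sketch below builds a binary confusion matrix and derives accuracy from it. The labels and predictions are made up for the example and are not the paper's data; the function names are likewise hypothetical.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    counts = Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == positive:
            counts["tp" if guess == positive else "fn"] += 1
        else:
            counts["fp" if guess == positive else "tn"] += 1
    return {key: counts[key] for key in ("tp", "fp", "fn", "tn")}

def accuracy(cm):
    """Share of all cases that were classified correctly."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())

# Made-up labels (1 = shock, 0 = no shock) and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

cm = confusion_matrix(y_true, y_pred)
print(cm)            # {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 3}
print(accuracy(cm))  # 0.75
```

Computing the same matrix for each candidate model gives a like-for-like basis for comparing them.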

Construction of an Internet of Things Industry Chain Classification Model Based on IRFA and Text Analysis

  • Zhimin Wang
    • Journal of Information Processing Systems / Vol. 20, No. 2 / pp. 215-225 / 2024
  • With the rapid development of the Internet of Things (IoT) and big data technology, a large amount of data is generated during the operation of related industries. Classifying this data accurately has become central to research on data mining and processing in the IoT industry chain. This study constructs a classification model for the IoT industry chain based on an improved random forest algorithm and text analysis, aiming to classify IoT industry chain big data efficiently and accurately by improving on traditional algorithms. The accuracy, precision, recall, and AUC of the traditional random forest algorithm and of the proposed algorithm are compared on different datasets. The experimental results show that the proposed model performs better across the datasets: its accuracy and recall on four datasets are better than those of the traditional algorithm, its accuracy on two datasets, P-I Diabetes and Loan Default, is better than that of the random forest model, and its final classification results are better. The model can accurately classify the massive data generated in the IoT industry chain, which adds research value for data mining and processing technology in this field.
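
The metrics named in the abstract can be sketched in a few lines. The scores and labels below are invented for illustration, and the 0.5 decision threshold is an assumption. AUC is computed here as the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels where 1 is the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp), tp / (tp + fn)

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented example: true labels and classifier scores.
y_true = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(precision_recall(y_true, y_pred))  # (1.0, 0.5)
print(auc(y_true, scores))               # 0.75
```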

Stress Assessment Based on Bio-Signals Using Random Forest Algorithm

  • 임태균;허정헌;정규원;김혜리
    • 한국안전학회지 / Vol. 35, No. 1 / pp. 62-69 / 2020
  • Most people suffer from stress in daily life because modern society is very complex and changes fast. Because stress can affect many kinds of physiological phenomena, it is even considered a disease. It should therefore be detected early and then relieved. When a person is stressed, several bio-signals, such as heart rate, change, and these changes can be detected using medical electronics techniques. In this paper, a stress assessment system is studied that applies the random forest algorithm to heart rate, RR interval, and galvanic skin response. The random forest model was trained and tested using a data set obtained from these bio-signals. The stress assessment procedure developed in this paper is found to be very useful.
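
To make the random forest idea concrete, here is a toy sketch: each "tree" is a one-split decision stump trained on a bootstrap sample and a randomly chosen feature, and the forest predicts by majority vote. This is a simplification, not the paper's model, and the bio-signal values (heart rate, RR interval, galvanic skin response) are invented for illustration.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Fit a one-split stump on one feature by minimising training errors."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(
        left if row[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]

# Invented samples: (heart rate in bpm, RR interval in s, skin response in microsiemens).
rows = [(62, 0.95, 2.0), (65, 0.92, 2.2), (70, 0.86, 2.5), (68, 0.88, 2.1),
        (95, 0.63, 6.0), (102, 0.59, 7.1), (98, 0.61, 6.5), (110, 0.55, 8.0)]
labels = ["calm"] * 4 + ["stressed"] * 4

forest = fit_forest(rows, labels)
print(predict(forest, (60, 0.97, 1.8)))
print(predict(forest, (115, 0.50, 9.0)))
```

A production model would grow full decision trees and consider a random subset of features at every split; the stump version only shows how bootstrap sampling and voting fit together.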

Prediction of New Confirmed Cases of COVID-19 based on Multiple Linear Regression and Random Forest

  • 김준수;최병재
    • 대한임베디드공학회논문지 / Vol. 17, No. 4 / pp. 249-255 / 2022
  • The COVID-19 virus appeared in 2019 and is extremely contagious. Because it is very infectious, it has a huge impact on people's mobility. In this paper, multiple linear regression and random forest models are used to predict the number of COVID-19 cases using COVID-19 infection status data (open data provided by the Ministry of Health and Welfare) and Google Mobility Data, which tracks movement across various categories of places. The data were divided into two sets. The first dataset combines the COVID-19 infection status data with all six variables of the Google Mobility Data. The second dataset combines the COVID-19 infection status data with only two variables of the Google Mobility Data: (1) retail stores and leisure facilities and (2) grocery stores and pharmacies. The models' performance was compared using the mean absolute error. We also performed a correlation analysis of the random forest model and the multiple linear regression model.
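
The comparison metric mentioned above, mean absolute error, is simple enough to sketch directly. The case counts below are invented and are not the paper's data.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Invented daily case counts and two hypothetical models' predictions.
actual = [100, 150, 200, 250]
model_a = [110, 140, 230, 240]
model_b = [90, 170, 180, 290]

print(mean_absolute_error(actual, model_a))  # 15.0
print(mean_absolute_error(actual, model_b))  # 22.5
```

The model with the lower value is, on average, closer to the observed counts.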

Basic Study on Safety Accident Prediction Model Using Random Forest in Construction Field

  • 강경수;류한국
    • 한국건축시공학회: Conference Proceedings / 한국건축시공학회 2018 Autumn Academic Paper Presentation Conference / pp. 59-60 / 2018
  • The purpose of this study is to predict and classify accident types based on KOSHA (Korea Occupational Safety & Health Agency) data and weather data. We also attempt to suggest management priorities for each accident type by deriving feature importance. We designed two models: model (a), based on accident data and weather data, and model (b), based on weather data only. With the random forest method, model (b) lacked prediction accuracy, whereas model (a) produced more accurate predictions. We therefore present a safety management plan based on these results. In the future, this study will be extended to real-time prediction of accident types, supplemented with real-time accident and weather data, in order to prevent safety accidents.

Classification Model and Crime Occurrence City Forecasting Based on Random Forest Algorithm

  • KANG, Sea-Am;CHOI, Jeong-Hyun;KANG, Min-soo
    • 한국인공지능학회지 / Vol. 10, No. 1 / pp. 21-25 / 2022
  • Korea has relatively less crime than other countries. However, the crime rate is steadily increasing. Many people think the crime rate is decreasing, but the crime arrest rate has increased. The goal is to examine the relationship between CCTV and the crime rate as a way to lower the crime rate, and to identify the correlation between areas with CCTV and areas without CCTV. Because crime can happen at any time, we consider a random forest algorithm appropriate. We also plan to use machine learning random forest algorithms to reduce the risk of overfitting, reduce the required training time, and verify high accuracy. The goal is to identify the relationship between CCTV and crime occurrence by creating a crime prevention algorithm using machine learning random forest techniques. Assuming that no crime occurs without CCTV, it compares the crime rate between the areas where the most crimes occur and the areas where there are no crimes, and predicts areas where there are many crimes. The impact of CCTV on crime prevention and arrest can be interpreted in part as a comprehensive effect, and the purpose is to identify the areas and frequency of frequent crimes by comparing periods with and without CCTV.

Study on the Prediction Model for Employment of University Graduates Using Machine Learning Classification

  • 이동훈;김태형
    • 한국정보시스템학회지:정보시스템연구 / Vol. 29, No. 2 / pp. 287-306 / 2020
  • Purpose: Youth unemployment is a social problem that continues to emerge in Korea. In this study, we create models that predict the employment of college graduates using decision tree, random forest, and artificial neural network techniques, and compare the performance of the models through their prediction results. Design/methodology/approach: Data processing included first acquiring the college graduates' vocational path survey data, then selecting the independent variables and setting up the dependent variables. We used R to create decision tree, random forest, and artificial neural network models and predicted with each model whether college graduates were employed. Finally, the performance of each model was compared and evaluated. Findings: The random forest model had the highest performance, and the artificial neural network model differed only narrowly in performance from the decision tree model. In the decision tree model, the key nodes were whether graduates receive economic support from their families, major affiliation, the route of obtaining job information at universities, the importance of earned income when choosing a job, and the location of the graduating university. In the random forest model, the variables identified as important were whether graduates receive economic support from their families, major, the route of obtaining job information, the degree of irritable feelings over a month, and the location of the graduating university.

Comparison of CT Exposure Dose Prediction Models Using Machine Learning-based Body Measurement Information

  • 홍동희
    • 대한방사선기술학회지:방사선기술과학 / Vol. 43, No. 6 / pp. 503-509 / 2020
  • This study aims to develop a patient-specific radiation exposure dose prediction model based on anthropometric data that can be easily measured during CT examination, to serve as basic data for DRL setting and radiation dose management systems in the future. In addition, the machine learning algorithm most suitable for predicting exposure doses is identified. The data used in this study were chest CT scan data, and a data set was constructed that includes the patients' anthropometric data. In pre-processing and sample selection, from a total of 250 samples, only chest CT scans performed without a contrast agent were retained, and 110 samples that included height and weight variables were extracted. Of the 110 samples extracted, 66 were used as a training set, and the remaining 44 were used as a test set for verification. The exposure dose was predicted with random forest, linear regression analysis, and SVM algorithms using Orange version 3.26.0, an open-source machine learning software package. Results: The prediction accuracy of the models was R^2 0.840 for random forest, R^2 0.969 for linear regression analysis, and R^2 0.189 for SVM. In verification of the prediction rate, random forest was the highest with R^2 0.986, compared with R^2 0.973 for linear regression analysis and R^2 0.204 for SVM, indicating that the random forest model has the best predictive power.
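
The R^2 values quoted above are coefficients of determination. A minimal sketch of how such a value is computed follows; the dose values are invented for illustration and are not the paper's data.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Invented measured doses and model predictions (arbitrary units).
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(r_squared(actual, predicted))  # 0.95
```

A value near 1 means the predictions explain most of the variance in the measured doses; a value near 0 means they do little better than predicting the mean.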

Analysis of Debris Flow Hazard Zone by the Optimal Parameters Extraction of Random Walk Model - Case on Debris Flow Area of Bonghwa County in Gyeongbuk Province -

  • 이창우;우충식;윤호중
    • 한국산림과학회지 / Vol. 100, No. 4 / pp. 664-671 / 2011
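  • The Random Walk Model can predict the area damaged by debris flows, but applying it requires extracting three parameters suited to each set of terrain conditions. In this study, we developed a technique for extracting optimal values of these three parameters (sediment volume per run, stopping condition, and inertia weight) and verified it by applying it to an area where debris flows had occurred. To extract the optimal parameters, we developed an accuracy measure called the matching rate, bounded the ranges of the three parameters, and ran the model with randomly chosen values. The parameter combination with the highest accuracy and consistency was selected as optimal. Applied to the study area in Bonghwa County, the optimal parameters were determined, at a matching rate of -0.2, to be a sediment volume per run of $1.0m^3$, a stopping condition of $4.2^{\circ}$, and an inertia weight of 2; the verification results also showed an average matching rate close to -0.2.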