• Title/Abstract/Keywords: Classification Variables

Finding a plan to improve recognition rate using classification analysis

  • Kim, SeungJae;Kim, SungHwan
    • International journal of advanced smart convergence
    • /
    • Vol. 9, No. 4
    • /
    • pp.184-191
    • /
    • 2020
  • With the emergence of the 4th Industrial Revolution, the core technologies expected to lead it, such as artificial intelligence (AI), big data, and the Internet of Things (IoT), have become a focus of public attention. In particular, there are growing attempts to present future visions by building new models from big data collected in a specific field and then using those models to infer and predict new values. To obtain reliable and refined statistics from big data analysis, it is necessary to analyze the meaning of each variable, the correlations between variables, and multicollinearity. If the data are classified in a way that does not match the hypothesis being tested, the results will be unreliable even if the analysis itself is performed well. In other words, before big data analysis, it is necessary to ensure that the data are well classified according to the purpose of the analysis. Therefore, in this study, data are classified using the decision tree and random forest techniques, two classification methods from machine learning. By evaluating how well the data are classified, we try to find a way to improve the classification and analysis rate of the data.
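A decision tree of the kind mentioned in this abstract chooses each split by how cleanly it separates the classes. The sketch below is a minimal illustration, assuming Gini impurity as the split criterion and a single numeric feature; the abstract does not state which criterion or data the authors used, and the names `gini`, `best_threshold`, `feature`, and `target` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split of the form x <= threshold.

    Returns (None, impurity_of_unsplit_node) when no split improves on the parent.
    """
    n = len(values)
    best = (None, gini(labels))
    # The largest value is skipped: splitting there leaves the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: one numeric feature and a two-class label.
feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
target = ["a", "a", "a", "b", "b", "b"]
threshold, impurity = best_threshold(feature, target)
print(threshold, impurity)
```

On this toy data the split at 3.0 separates the two classes completely, so the weighted impurity drops to 0. A random forest repeats this kind of split search over many trees, each grown on resampled data with random feature subsets.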

Spatial Distribution of Major Soil Types in Korea and an Assessment of Soil Predictability Using Soil Forming Factors

  • 박수진;손연규;홍석영;박찬원;장용선
    • 대한지리학회지
    • /
    • Vol. 45, No. 1
    • /
    • pp.95-118
    • /
    • 2010
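  • This study examines the spatial distribution of the Great Soil Groups, which are based on the old classification system currently used in Korea, and of the Great Groups, which are based on the new classification system, and analyzes their correlations with various environmental factors related to soil formation. Based on the results, the predictability of soil distribution was assessed using a decision tree technique. The Great Soil Groups have the advantage that their distribution can be understood more intuitively, but they were less predictable from environmental factors than the Great Groups. Among the factors that determine soil distribution, the most important was Korea's topography, in which mountainous areas and flat land are clearly distinguished, followed by climatic characteristics and by soil continuity along slopes. In the decision tree assessment of soil predictability, classification accuracy ranged from 35% to 75% depending on the number and type of predictor variables. For the new classification system, topographic factors were the most important predictors, whereas for the old classification system climatic variables were the important predictors, a clear contrast.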

Analysis on the Forest Community of Daewon Valley in Mt. Chiri by the Classification and Ordination Techniques

  • 이경재;구관효;최재식;조현서
    • 한국환경생태학회지
    • /
    • Vol. 5, No. 1
    • /
    • pp.54-67
    • /
    • 1991
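  • To analyze forest community structure along the site environmental gradient in the Daewon Valley of Mt. Chiri, 89 survey plots (500 $\mathrm{m}^2$ each) were established, and the vegetation survey data were analyzed by classification with TWINSPAN and by DCA, an ordination technique. The classification divided dry sites at low elevation into a Pinus densiflora community, a Quercus variabilis community, and a Quercus serrata-Quercus variabilis community, and moist sites at low elevation into a Carpinus laxiflora community. At high elevation, dry sites were classified as a Quercus mongolica community and moist sites as a Cornus controversa-Quercus mongolica community. The DCA ordination showed the same trend as the classification. The successional sequences inferred from the two techniques applied to species were, in the upper tree layer, Pinus densiflora $\rightarrow$ Quercus variabilis, Quercus serrata $\rightarrow$ Carpinus laxiflora at low elevation and Quercus mongolica $\rightarrow$ Cornus controversa at high elevation. In the lower tree layer the sequence was Rhododendron mucronulatum, Rhus trichocarpa, Styrax japonica $\rightarrow$ Sapium japonicum, Meliosma myriantha, Ilex macropoda $\rightarrow$ Corylus sieboldiana, Acer pseudosieboldianum, Magnolia sieboldii. In the ordination of environmental factors, the Pinus densiflora and Quercus variabilis communities occurred on sites low in soil moisture, organic matter, total nitrogen, available phosphate, exchangeable capacity, and other soil nutrients; the Quercus mongolica and Cornus controversa communities occurred on sites high in soil nutrients; and the Carpinus laxiflora community occurred mainly on intermediate sites.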

Estimation and Classification of Flow Regimes for South Korean Streams and River

  • Park, Kyug Seo;Choi, Ji-Woong;Park, Chan-Seo;An, Kwang-Guk;Wiley, Michael J.
    • 한국수자원학회:학술대회논문집
    • /
    • 한국수자원학회 2015년도 학술발표회
    • /
    • pp.106-106
    • /
    • 2015
  • Information on flow regimes remains central to water resource and watershed management, because the stream flow regime is a crucial factor influencing water quality, geomorphology, and the community structure of stream biota. The objectives of this study were to estimate Korean stream flows from landscape variables, classify stream flow gages using hydraulic characteristics, and then apply these methods to ungaged biological monitoring sites for effective ecological assessment. Here I used a linear modeling approach (MLR, PCA, and PCR) to describe and predict seasonal flow statistics from landscape variables. MLR models were successfully built for a range of exceedance discharges and time frames (annual, January, May, July, and October), and these models explained a high degree of the observed variation, with r-squared values ranging from 0.555 (Q95 in January) to 0.899 (Q05 in July). In validation testing, predicted and observed exceedance discharges were all significantly correlated (p<0.01), and for most models no significant difference was found between predicted and observed values (paired-samples t-test; p>0.05). I classified Korean stream flow regimes with respect to hydraulic and hydrologic regime into four categories: flashier and higher-powered (F-HP), flashier and lower-powered (F-LP), more stable and higher-powered (S-HP), and more stable and lower-powered (S-LP). These four categories of Korean streams were related to environmental variables such as catchment size, site slope, stream order, and land use patterns. I then applied the models at 684 ungaged biological sampling sites used in the National Aquatic Ecological Monitoring Program in order to classify them with respect to basic hydrologic characteristics and similarity to the government's array of hydrologic gauging stations. Flashier, lower-powered sites appeared to be relatively over-represented, and more stable, higher-powered sites under-represented, in the bioassessment data sets.
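The exceedance discharges (such as Q95 and Q05) and the four regime labels in this abstract can be illustrated with a short sketch. It assumes that Qp is the discharge equalled or exceeded p percent of the time, estimated here as the (100 - p)th percentile of a flow record with linear interpolation; the abstract does not say which estimator was used. The flashiness and power measures, their cutoffs, and all function names are hypothetical.

```python
def exceedance_discharge(flows, pct):
    """Discharge equalled or exceeded `pct` percent of the time.

    Assumption: estimated as the (100 - pct)th percentile of the record,
    using linear interpolation between sorted observations.
    """
    if not flows:
        raise ValueError("flows must not be empty")
    ordered = sorted(flows)
    pos = (100 - pct) * (len(ordered) - 1) / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def regime_class(flashiness, power, flash_cut, power_cut):
    """Label a site F-HP, F-LP, S-HP, or S-LP from two hypothetical indices."""
    stability = "F" if flashiness >= flash_cut else "S"
    strength = "HP" if power >= power_cut else "LP"
    return f"{stability}-{strength}"


# Hypothetical flow record: 101 daily discharges of 1, 2, ..., 101 (arbitrary units).
record = list(range(1, 102))
q95 = exceedance_discharge(record, 95)  # low-flow statistic
q05 = exceedance_discharge(record, 5)   # high-flow statistic
label = regime_class(flashiness=0.8, power=10.0, flash_cut=0.5, power_cut=50.0)
print(q95, q05, label)
```

Q95 is a low-flow statistic and Q05 a high-flow statistic, so Q95 is always the smaller of the two for a given record.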

Decision based uncertainty model to predict rockburst in underground engineering structures using gradient boosting algorithms

  • Kidega, Richard;Ondiaka, Mary Nelima;Maina, Duncan;Jonah, Kiptanui Arap Too;Kamran, Muhammad
    • Geomechanics and Engineering
    • /
    • Vol. 30, No. 3
    • /
    • pp.259-272
    • /
    • 2022
  • Rockburst is a dynamic, multivariate, and non-linear phenomenon that occurs in underground mining and civil engineering structures. Predicting rockburst is challenging since conventional models are not standardized. Hence, machine learning techniques could improve prediction accuracy. This study describes decision-based uncertainty models to predict rockburst in underground engineering structures using gradient boosting algorithms (GBM). The model input variables were uniaxial compressive strength (UCS), uniaxial tensile strength (UTS), maximum tangential stress (MTS), excavation depth (D), stress ratio (SR), and brittleness coefficient (BC). Several models were trained using different combinations of the input variables and a 3-fold cross-validation resampling procedure. The hyperparameters, comprising learning rate, number of boosting iterations, tree depth, and minimum number of observations, were tuned to attain the optimum models. The performance of the models was tested using classification accuracy, Cohen's kappa coefficient (k), sensitivity, and specificity. The best-performing model, obtained by optimizing model ROC metrics, achieved classification accuracy, k, sensitivity, and specificity values of 98%, 93%, 1.00, and 0.957, respectively. The most and least influential input variables were MTS and BC, respectively. The partial dependence plots revealed the relationship between changes in the input variables and model predictions. The findings reveal that GBM can be used to anticipate rockburst and guide decisions about support requirements before mining development.
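The four performance measures named in this abstract can all be derived from a confusion matrix. The sketch below is a minimal illustration for the two-class case, assuming both classes occur in the test data; the abstract does not report the confusion matrix, so the counts and the function name `binary_metrics` are hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, Cohen's kappa, sensitivity, and specificity from binary counts.

    tp/fn: positives predicted correctly/incorrectly.
    tn/fp: negatives predicted correctly/incorrectly.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Hypothetical counts, not taken from the paper.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Kappa discounts the agreement that would be expected by chance, which is why it is lower than raw accuracy for the same predictions.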

Multiple Relationships Between Impairment, Activity and Participation-based Clinical Outcome Measures in 200 Low Back Pain

  • Chanhee Park
    • 한국전문물리치료학회지
    • /
    • Vol. 30, No. 2
    • /
    • pp.136-143
    • /
    • 2023
  • Background: The International Classification of Functioning, Disability and Health (ICF) model, created by the World Health Organization, provides a theoretical framework that can be applied in the diagnosis and treatment of various disorders. Objectives: This study aimed to ascertain the relationship between structure/function, activity, and participation domain variables of the ICF and pain, pain-associated disability, activities of daily living (ADL), and quality of life in patients with chronic low back pain (LBP). Methods: Two hundred patients with chronic LBP (mean age: 35.5 ± 8.8 years; females, n = 40) were recruited from hospital and community settings. We evaluated the body structure/function domain variable using the Numeric Pain Rating Scale (NPRS) and Roland-Morris disability (RMD) questionnaire. To evaluate the activity domain variable, we used the Oswestry Disability Index (ODI) and Quebec Back Pain Disability Scale (QBDS). For clinical outcome measures, we used Short-form 12 (SF-12). Pearson's correlation coefficient was used to ascertain the relationships among the variables (p < 0.05). All the participants with LBP received 30 minutes of conventional physical therapy 3 days/week for 4 weeks. Results: There were significant correlations between the body structure/function domain (NPRS and RMD questionnaire), activity domain (ODI and QBDS), and participation domain variables (SF-12), ranging from r = -0.723 to 0.783 pre-intervention and from r = -0.742 to 0.757 post-intervention (p < 0.05). Conclusion: The identification of a significant difference between these domain variables points to important relationships between pain, disability, performance of ADL, and quality of life in participants with LBP.
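Pearson's correlation coefficient, the statistic this abstract relies on, is the covariance of two variables divided by the product of their standard deviations. The sketch below is a minimal illustration; the scores are invented for the example and are not the study's data, and the names `pearson_r`, `pain_scores`, and `quality_scores` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Invented scores: higher pain paired with lower quality of life.
pain_scores = [2, 4, 5, 7, 8]
quality_scores = [80, 70, 65, 50, 45]
r = pearson_r(pain_scores, quality_scores)
print(f"r = {r:.3f}")
```

A negative r, as in this toy example, means that one measure tends to fall as the other rises, which is how a range such as -0.723 to 0.783 can mix negative and positive values across pairs of scales.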

Study on the combustion performance's classification system for large scale fire tests

  • 박계원;임홍순;정재군
    • 한국화재소방학회:학술대회논문집
    • /
    • 한국화재소방학회 2008년도 추계학술논문발표회 논문집
    • /
    • pp.99-104
    • /
    • 2008
  • The combustion properties of sandwich panels were tested and analyzed according to the ISO 13784-1 test method (room corner test for sandwich panel building systems) in order to establish a classification of reaction-to-fire performance. Several variables, including heat release rate, smoke production rate, FIGRA, and SMOGRA, were analyzed for four specific sandwich panel materials, each tested 5 times for a total of 20 tests. Finally, elements for a classification system were suggested, and those elements were evaluated.

Classification Framework for Services and Service Products in Electronic Commerce

  • 조성의;박광태
    • Asia pacific journal of information systems
    • /
    • Vol. 12, No. 3
    • /
    • pp.169-189
    • /
    • 2002
  • In this study, we propose a new classification of services and service products in electronic commerce (EC) into Mass Service, Interactive Service, Ordering Service, and Professional Service. The classification uses five selected variables and two factorized dimensions: 1) ease of online interaction and 2) need for online interaction. Based on the classification results, we investigate the relationships with customer purchase intentions in EC. For this purpose, customer surveys were conducted on respondent groups known to be frequent purchasers in EC and to have advanced knowledge of services and service products. The survey data were analyzed by statistical methods such as factor score plots, cluster analysis, and analysis of variance.

Development of a mid-term preceding observation model for radish

  • 조재환;이한성
    • 농업과학연구
    • /
    • Vol. 38, No. 3
    • /
    • pp.571-581
    • /
    • 2011
  • This study develops a mid-term preceding observation model for radish to complement the existing short-term agricultural observation model. The first purpose of the study is to extend the three-season classification (spring, summer, fall) of fruit-vegetables to a four-season classification that also includes winter. This allows us to identify the causes of the supply-demand imbalance and the unstable price of radish. The second purpose is to construct a mid-term preceding observation model for forecasting planted area, output, monthly shipments, and price. To achieve these purposes, several multiple regression models are estimated. The system consists of a planted area equation, a yield equation, a monthly shipment distribution equation, and a monthly price equation. An auxiliary equation is included in the system to calculate output, and variables such as the consumer price index are treated as exogenous.

Selection of Important Variables in the Classification Model for Successful Flight Training

  • 이상헌;이선두
    • 산업공학
    • /
    • Vol. 20, No. 1
    • /
    • pp.41-48
    • /
    • 2007
  • The main purpose of this paper is to reduce unnecessary pilot training costs and to prevent human-factor accidents that originate in the pilot selection process. We apply classification models (logistic regression, decision tree, and neural network) to the aptitude test results of 505 ROK Air Force applicants from 2001 to 2004. First, we assess the reliability and validity of the improved aptitude test system. On this basis, the flight simulator test item is compared with the new aptitude test items to decide whether it should be added, using the different models and comparing them in terms of classification accuracy, ROC, and response threshold. The decision tree was selected as the most efficient model for each sequential flight training result, and it predicted the final flight training result well. Therefore, we propose that the pilot selection standard be based on the decision tree and that the flight simulator test be included as a new aptitude test item.