• Title/Abstract/Keywords: model tree technique

198 search results (processing time: 0.031 s)

Wage Determinants Analysis by Quantile Regression Tree

  • Chang, Young-Jae
    • Communications for Statistical Applications and Methods / Vol. 19, No. 2 / pp.293-301 / 2012
  • Quantile regression, proposed by Koenker and Bassett (1978), is a statistical technique that estimates conditional quantiles. Its advantage over ordinary least squares (OLS) regression is robustness to large outliers. A regression tree approach has been applied to OLS problems to fit flexible models. Loh (2002) proposed the GUIDE algorithm, which has negligible selection bias and relatively low computational cost. Quantile regression can be regarded as an analogue of OLS, so it can also be applied to the GUIDE regression tree method. Chaudhuri and Loh (2002) proposed a nonparametric quantile regression method that blends key features of piecewise polynomial quantile regression and tree-structured regression based on adaptive recursive partitioning. Lee and Lee (2006) investigated wage determinants in the Korean labor market using the Korean Labor and Income Panel Study (KLIPS). Following Lee and Lee, we fit three kinds of quantile regression tree models to KLIPS data at the quantiles 0.05, 0.2, 0.5, 0.8, and 0.95. Among the three models, the multiple linear piecewise quantile regression model forms the shortest tree structure, while the piecewise constant quantile regression model generally has a deeper tree structure with more terminal nodes. Age, gender, marriage status, and education seem to be the determinants of the wage level throughout the quantiles; in addition, education experience appears as the important determinant of the wage level in the highly paid group.
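The robustness claim in the abstract above can be illustrated with the check (pinball) loss that quantile regression minimizes. The following is a minimal sketch, not the GUIDE algorithm or the paper's models: the function names and the wage values are hypothetical, and the fit is a single constant, as in one leaf of a piecewise constant tree.

```python
def pinball_loss(y, pred, tau):
    """Average check (pinball) loss of a constant prediction at quantile level tau."""
    total = 0.0
    for value in y:
        residual = value - pred
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(y)


def best_constant(y, tau):
    """Return the observed value that minimizes the pinball loss.

    This corresponds to a piecewise constant fit within a single tree node.
    """
    return min(sorted(y), key=lambda c: pinball_loss(y, c, tau))


# Hypothetical wage data with one large outlier (not from KLIPS).
wages = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = best_constant(wages, 0.5)   # quantile fit at tau = 0.5
mean_fit = sum(wages) / len(wages)       # OLS fit of a constant is the mean

print(median_fit, mean_fit)
```

With these made-up values the median fit stays near the bulk of the data, while the mean is pulled toward the outlier.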

Hybridized Decision Tree methods for Detecting Generic Attack on Ciphertext

  • Alsariera, Yazan Ahmad
    • International Journal of Computer Science & Network Security / Vol. 21, No. 7 / pp.56-62 / 2021
  • The surge in generic attacks against ciphertext on computer networks has led to the continuous advancement of mechanisms to protect information integrity and confidentiality. An explicit decision tree machine learning algorithm is reported to classify generic attacks more accurately than some multi-classification algorithms, as the multi-classification method suffers from detection oversight. However, there is a need to improve the accuracy and reduce the false alarm rate. Therefore, this study aims to improve generic attack classification by implementing two hybridized decision tree algorithms, namely the Naïve Bayes Decision tree (NBTree) and the Logistic Model tree (LMT). The proposed hybridized methods were developed using the 10-fold cross-validation technique to avoid overfitting. The generic attack detector produced a 99.8% accuracy, an FPR score of 0.002 and an MCC score of 0.995. The performance of the proposed methods was better than that of the existing decision tree method. Similarly, the proposed method outperformed multi-classification methods for detecting generic attacks. Hence, it is recommended to implement the hybridized decision tree method for detecting generic attacks on a computer network.
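For reference, the three metrics quoted in the abstract above (accuracy, FPR, MCC) can all be derived from a binary confusion matrix. The sketch below uses the standard definitions; the function name and the counts are hypothetical and are not taken from the paper.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return (accuracy, false positive rate, MCC) from binary confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # FPR: share of actual negatives that were flagged as attacks.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # MCC: correlation between predicted and actual labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, fpr, mcc


# Hypothetical counts for illustration only.
accuracy, fpr, mcc = confusion_metrics(tp=45, fp=5, tn=45, fn=5)
print(accuracy, fpr, mcc)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside accuracy on imbalanced attack data.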

의사결정나무모형을 이용한 편마암 지역에서의 급경사지재해 예측기법 개발 (Development to Prediction Technique of Slope Hazards in Gneiss Area using Decision Tree Model)

  • 송영석;채병곤
    • 지질공학 / Vol. 18, No. 1 / pp.45-54 / 2008
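  • In this study, a technique for predicting slope hazards in gneiss areas was developed using a decision tree model, a statistical analysis method, based on field survey data and soil test data previously collected from areas where slope hazards did and did not occur. The slope hazard data surveyed in gneiss areas cover 104 sites in the Seoul and Gyeonggi regions affected by the heavy rainfall of 1998. Of these, 61 sites, excluding those with missing values, were used to develop the prediction model: 34 sites where slope hazards occurred and 27 where they did not. The statistical analysis with the decision tree model was carried out using the chi-square statistic, the Gini index, and the entropy index. The analysis selected slope angle, degree of saturation, and slope elevation as the splitting criteria, and the decision tree prediction model based on the entropy index showed the highest accuracy. In the selected prediction model, the splitting criteria were chosen, from the top down, in the order of slope angle, degree of saturation, and slope elevation, with split values of $17.9^{\circ}$ for slope angle, 52.1% for degree of saturation, and 320 m for slope elevation.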
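Two of the split criteria named in the abstract above, the Gini index and the entropy index, can be sketched as follows. Only the root class counts (34 occurrence sites, 27 non-occurrence sites) come from the abstract; the child counts of the candidate split and all function names are hypothetical, and this is not the authors' implementation.

```python
import math


def gini(counts):
    """Gini impurity of a node given its per-class counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)


def entropy(counts):
    """Shannon entropy (in bits) of a node given its per-class counts."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def split_gain(parent, children, impurity):
    """Impurity reduction obtained by splitting the parent into the given children."""
    n = sum(parent)
    weighted = sum(sum(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


root = [34, 27]                  # occurred / did not occur, as reported in the abstract
left, right = [30, 5], [4, 22]   # hypothetical split, e.g. on slope angle

gini_gain = split_gain(root, [left, right], gini)
entropy_gain = split_gain(root, [left, right], entropy)
print(gini_gain, entropy_gain)
```

A tree builder evaluates such gains for every candidate threshold and keeps the split with the largest reduction; the choice of impurity function can change which split wins.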

데이터마이닝 기법을 활용한 맞춤형 고혈압 사후관리 모형 개발 (A Development of a Tailored Follow up Management Model Using the Data Mining Technique on Hypertension)

  • 박일수;용왕식;김유미;강성홍;한준태
    • 응용통계연구 / Vol. 21, No. 4 / pp.639-647 / 2008
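  • This study aimed to develop a tailored hypertension follow-up management model (a hypertension treatment prediction model and a hypertension treatment compliance segmentation model) for hypertension management, using the health screening data, eligibility and insurance premium data, and medical expense data of the National Health Insurance Corporation. Logistic regression, decision tree, and ensemble models from data mining were used for model development. For the hypertension treatment prediction model, logistic regression was adopted as the best of the three models, while the hypertension treatment compliance segmentation model was developed with a decision tree model. By applying data mining to nationwide data accumulated over several years, this study derives results covering the entire hypertension follow-up management process, from treatment to treatment compliance, and is expected to contribute to building a hypertension follow-up management system in Korea.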

의사결정나무모형을 이용한 급경사지재해 예측프로그램 개발 및 적용 (Development and its Application of Computer Program for Slope Hazards Prediction using Decision Tree Model)

  • 송영석;조용찬;서용석;안상로
    • 대한토목학회논문집 / Vol. 29, No. 2C / pp.59-69 / 2009
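  • In this study, a slope hazard prediction model based on a decision tree model was developed from field survey data and soil test data for areas where slope hazards did and did not occur in crystalline rock regions such as granite and gneiss. The splitting criteria of the selected prediction model were, from the top down, slope angle, coefficient of permeability, and void ratio. Based on this model, SHAPP ver 1.0, a GIS-based program for predicting slope hazards around major national facilities, was developed. To verify the prediction model and program, field survey results from the Jumunjin-eup area of Gangneung were compared with the predictions for the same site. The comparison showed that the sections where slope hazards actually occurred closely matched the sections predicted to be hazardous. Through continued research, the authors plan to improve the accuracy of the slope hazard predictions and to make the program practical for general use.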

A formal approach to support the identification of unsafe control actions of STPA for nuclear protection systems

  • Jung, Sejin;Heo, Yoona;Yoo, Junbeom
    • Nuclear Engineering and Technology / Vol. 54, No. 5 / pp.1635-1643 / 2022
  • STPA (System-Theoretic Process Analysis) is a widely used safety analysis technique to identify UCAs (Unsafe Control Actions) resulting in potential losses. It depends entirely on the experience and ability of analysts to construct an information model called Control Structures, upon which analysts try to identify unsafe controls between system components. This paper proposes a formal approach to support the manual identification of UCAs effectively and systematically. It allows analysts to mechanically extract the Process Model, an important element that makes up the Control Structures, from a formal requirements specification for a software controller. It then concisely constructs the contents of Context Tables, from which analysts can identify all relevant UCAs effectively, using a software fault tree analysis technique. The case study with a preliminary version of a Korean nuclear reactor protection system shows the proposed approach's effectiveness and applicability.

확률적 프로세스 트리 생성을 위한 타부 검색 -유전자 프로세스 마이닝 알고리즘 (Tabu Search-Genetic Process Mining Algorithm for Discovering Stochastic Process Tree)

  • 주우민;최진영
    • 산업경영시스템학회지 / Vol. 42, No. 4 / pp.183-193 / 2019
  • Process mining is an analytical technique aimed at obtaining useful information about a process by extracting a process model from event logs. However, most existing process models are deterministic because they do not include stochastic elements such as the occurrence probabilities or execution times of activities. The available information is therefore limited, which restricts how well the process can be analyzed and understood. Furthermore, it is also important to develop an efficient methodology to discover the process model. Although the genetic process mining algorithm is one of the methods that can handle noisy data, it requires a large computation time when applied to large data sets. To resolve these issues, in this paper, we define a stochastic process tree and propose a tabu search-genetic process mining (TS-GPM) algorithm for a stochastic process tree. Specifically, we define a two-dimensional array as a chromosome to represent a stochastic process tree, a fitness function, a procedure for generating a stochastic process tree, and a model trace as a string of activities generated from the process tree. Furthermore, by storing and comparing model traces with low fitness values in the tabu list, we can prevent duplicated searches for process trees with low fitness values. To verify the performance of the proposed algorithm, we performed a numerical experiment using two kinds of event log data used in previous research. The results showed that the suggested TS-GPM algorithm outperformed the GPM algorithm in terms of fitness and computation time.

의사결정나무기법을 이용한 건설재해 사전 예측모델 개발 (Prediction Model of Construction Safety Accidents using Decision Tree Technique)

  • 조예림;김연철;신윤석
    • 한국건축시공학회지 / Vol. 17, No. 3 / pp.295-303 / 2017
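  • Despite research and efforts to prevent construction industry accidents, the number of accident victims in the Korean construction industry has risen steadily over the last seven years. Because accidents at construction sites are more likely to be severe than those in other industries, a method that can prevent them at the root is needed. This study therefore proposes a construction accident prediction model using the decision tree technique, which yields a model that is easy to interpret and makes interaction effects between variables easy to interpret. To evaluate the field applicability of the proposed model, its prediction accuracy for construction accidents was compared with that of a model based on discriminant analysis. The comparison showed that the cumulative prediction accuracy of the decision tree model was higher than that of the discriminant analysis model. Because the data grow over time, the prediction accuracy of the model using the decision tree technique increases further. Therefore, if the proposed construction accident prediction model is used at construction sites, it is expected to enable effective safety management and contribute to reducing the accident rate in the construction industry.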

계층구조의 속성을 가지는 의사결정 문제의 선호순위도출을 위한 수리계획모형 (Mathematical Programming Models for Establishing Dominance with Hierarchically Structured Attribute Tree)

  • 한창희
    • 한국국방경영분석학회지 / Vol. 28, No. 2 / pp.34-55 / 2002
  • This paper deals with the multiple attribute decision making problem when a decision maker incompletely articulates his/her preferences about the attribute weight and alternative value. Furthermore, we consider an attribute tree that is structured hierarchically. Techniques for establishing dominance with linear partial information are proposed for a hierarchically structured attribute tree. The linear additive value function under certainty is used in the model. The incompletely specified information constructs a feasible region of linear constraints, and therefore the pairwise dominance relationship between alternatives leads to intractable non-linear programming. Hence, we propose solution techniques to handle this difficulty. Also, to handle the tree structure, we break down the attribute tree into sub-trees. Due to the recursive structure of the solution technique, the optimization results from sub-trees can be utilized in computing the value interval on the topmost attribute. The value intervals computed by the proposed solution techniques can be used to establish the pairwise dominance relation between alternatives. In this paper, the pairwise dominance relation is represented as strict dominance and weak dominance, which were already defined in earlier research.

의사결정나무분석을 활용한 코로나19 이후 농촌관광객의 선호 특성 세분화 연구 (A Study on Segmentation of Preferred Characteristics of Rural Tourists after COVID-19 Using Decision Tree Analysis)

  • 이승훈
    • 아태비즈니스연구 / Vol. 14, No. 1 / pp.411-426 / 2023
  • Purpose - The purpose of this study was to explore and diagnose the characteristics and behavioural patterns of rural tourists after COVID-19 using decision tree analysis to classify and identify key segmentation groups. Design/methodology/approach - The CHAID algorithm was used as the analysis technique for the decision tree. The explanatory variables used in the analysis of each decision tree model were demographic variables and rural tourism usage behaviour and perception variables, and the target variables were the preferences of rural tourists' activities after COVID-19. From the Rural Tourism 2020 survey data, 614 samples with rural tourism experience were extracted and used in the analysis. Findings - The variables that significantly explained the preference for each type of rural tourism activity after COVID-19 were rural tourism safety perception, repeated visits to the region, rural tourism priority activity, rural tourism accommodation experience, gender, age group, marital status, occupation, and education level. Among them, rural tourism safety perception was the most important explanatory variable in each analysis model. Research implications or Originality - Overall, to promote rural tourism, it is necessary to enhance the safety image of rural tourism, strengthen loyalty programs for repeat visitors, and develop customized products that reflect the preferred trends of rural tourism.