Search | Korea Science

A simple diagnostic statistic for determining the size of random forest (랜덤포레스트의 크기 결정을 위한 간편 진단통계량)

Park, Cheolyong
Journal of the Korean Data and Information Science Society / v.27 no.4 / pp.855-863 / 2016
In this study, a simple diagnostic statistic for determining the size of a random forest is proposed. The method is based on MV (margin of victory), a scaled difference in the infinite-forest votes between the first and second most popular categories of the current random forest. A negative MV indicates a discrepancy between the current and infinite forests. More precisely, the method is based on the proportion of cases in which -MV is greater than a fixed small positive number (say, 0.03). We derive an appropriate diagnostic statistic for this method and then calculate its distribution. A simulation study compares the method with a recently proposed diagnostic statistic.
https://doi.org/10.7465/jkdi.2016.27.4.855
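The MV idea in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's statistic: the infinite-forest vote proportions `p_inf` are treated as known (in practice they are not), the abstract does not give the scaling of MV, so the raw difference of vote proportions is used, and the function names are hypothetical.

```python
import numpy as np

def margin_of_victory(p_inf, votes):
    """Unscaled MV per case (assumption: the paper's scaling is not reproduced).

    p_inf: (n_cases, n_classes) infinite-forest vote proportions, assumed known here.
    votes: (n_cases, n_classes) vote counts from the current finite forest.
    """
    # Rank classes by current-forest votes, most popular first.
    order = np.argsort(-votes, axis=1, kind="stable")
    first, second = order[:, 0], order[:, 1]
    rows = np.arange(p_inf.shape[0])
    # Difference of infinite-forest votes between the current top two classes.
    return p_inf[rows, first] - p_inf[rows, second]

def discrepancy_proportion(mv, eps=0.03):
    """Proportion of cases in which -MV exceeds the small positive threshold eps."""
    return float(np.mean(-mv > eps))

# Hypothetical setup: 500 cases, 3 classes, simulated infinite-forest proportions.
rng = np.random.default_rng(0)
p_inf = rng.dirichlet(np.ones(3), size=500)
for n_trees in (10, 100, 1000):
    # Each tree votes independently according to the infinite-forest proportions.
    votes = np.vstack([rng.multinomial(n_trees, p) for p in p_inf])
    mv = margin_of_victory(p_inf, votes)
    print(n_trees, discrepancy_proportion(mv))
```

A negative MV means the current forest ranks its top two classes in the opposite order to the infinite forest, so the printed proportion gives a rough sense of how often the current forest size still disagrees with the limit.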

Multivariate Autoregressive Moving Average(ARMA) process Control in Computer Integrated Manufacturing Systems (CIMS) (CIMS에서 다변량 ARMA 공정제어)

최성운
Journal of Korean Society of Industrial and Systems Engineering / v.15 no.26 / pp.181-187 / 1992
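This paper proposes a new three-step procedure for ARMA process control applied in CIMS. The first step identifies a multivariate ARMA model, estimates its parameters, and computes multivariate control statistics (namely the multivariate Hotelling $T^2$ statistic, multivariate CUSUM, multivariate EWMA, and multivariate MA statistics) on the residual series diagnosed as white noise. Finally, the eight multivariate control statistics proposed in this paper are compared with one another to detect outliers.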
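As a rough illustration of one of the control statistics named in the abstract above, the following sketch computes a multivariate EWMA statistic on a residual series. It uses the standard textbook MEWMA form with a known residual covariance, which is an assumption; the paper's own definitions are not given in the abstract, and the function name and parameter values are hypothetical.

```python
import numpy as np

def mewma_t2(residuals, cov, lam=0.2):
    """Multivariate EWMA statistic for each time step.

    residuals: (n_steps, n_vars) residuals of a fitted model, assumed white noise.
    cov: (n_vars, n_vars) in-control residual covariance, assumed known.
    lam: smoothing weight in (0, 1]; lam = 1 gives r' inv(cov) r at each step.
    """
    inv_cov = np.linalg.inv(cov)
    z = np.zeros(residuals.shape[1])
    stats = np.empty(residuals.shape[0])
    for t, r in enumerate(residuals, start=1):
        # Exponentially weighted average of the residual vectors.
        z = lam * r + (1.0 - lam) * z
        # Exact variance factor of z at step t (z has covariance scale * cov).
        scale = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        stats[t - 1] = z @ inv_cov @ z / scale
    return stats

# Hypothetical residuals: in control for 50 steps, then a mean shift.
rng = np.random.default_rng(1)
cov = np.array([[1.0, 0.3], [0.3, 1.0]])
res = rng.multivariate_normal(np.zeros(2), cov, size=100)
res[50:] += np.array([1.0, 1.0])
stat = mewma_t2(res, cov, lam=0.2)
print(stat[:50].mean(), stat[50:].mean())
```

With `lam=1.0` the statistic reduces to the Hotelling-type quadratic form on each residual vector, so the two charts can be compared using the same function.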

A study of a new statistic for detection of outliers and/or influential observations in regression diagnostics (회귀진단에서 이상치와 영향관측치를 동시에 발견하는 새로운 통계량에 관한 연구)

강은미
The Korean Journal of Applied Statistics / v.6 no.1 / pp.67-78 / 1993
This paper proposes and studies a new diagnostic statistic for detecting outliers and influential observations in linear models. The proposed statistic is a weighted sum of two measures: one for detecting outliers and the other for detecting influential observations. Its merit is that it can distinguish outliers from influential observations. A Monte Carlo simulation was carried out to find the probability distribution of the statistic.

A measure of discrepancy based on margin of victory useful for the determination of random forest size (랜덤포레스트의 크기 결정에 유용한 승리표차에 기반한 불일치 측도)

Park, Cheolyong
Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.515-524 / 2017
This study proposes a measure of discrepancy based on MV (margin of victory) that may be useful in determining the size of a random forest for classification. Here MV is a scaled difference in the infinite-forest votes between the two most popular classes of the current random forest. More specifically, max(-MV,0) is proposed as a reasonable measure of discrepancy, since a negative MV value means that the current and infinite random forests disagree on the two most popular classes. We propose a diagnostic statistic based on this measure for determining random forest size and derive its asymptotic distribution. Finally, a simulation study compares the finite-sample performance of the proposed statistic with that of other recently proposed diagnostic statistics.
https://doi.org/10.7465/jkdi.2017.28.3.515

Influence in Testing the Equality of Two Covariance Matrices (두개의 공분산 행렬의 동질성 검정에서의 영향치 분석)

Myung Geun Kim
The Korean Journal of Applied Statistics / v.7 no.2 / pp.213-224 / 1994
A diagnostic method for detecting outliers when testing the equality of two covariance matrices is developed using the influence curve approach. The method generalizes easily to more than two covariance matrices. A sample version of the influence measure for detecting outliers is considered, based on the empirical distribution functions. The sample version includes as component terms the well-known test statistic for detecting one outlier at a time introduced by Wilks and its generalization to the two-group case.

Data-based On-line Diagnosis Using Multivariate Statistical Techniques (다변량 통계기법을 활용한 데이터기반 실시간 진단)

Cho, Hyun-Woo
Journal of the Korea Academia-Industrial cooperation Society / v.17 no.1 / pp.538-543 / 2016
To ensure good product quality and plant safety, on-line monitoring and diagnosis schemes must be implemented for industrial processes. Combined with monitoring systems, reliable diagnosis schemes seek to find the assignable causes among the process variables responsible for faults or special events in processes. This study deals with the real-time diagnosis of complicated industrial processes through the use of multivariate statistical techniques. The presented diagnosis scheme consists of a classification-based diagnosis using nonlinear representation and filtering of process data. A case study based on simulation data was conducted, and diagnosis results were obtained using different diagnosis schemes. In addition, the choice of future estimation methods was evaluated. The results showed that the presented scheme outperformed the other schemes.
https://doi.org/10.5762/KAIS.2016.17.1.538

A Study on Testing for Lunar Holiday Effects Using the RegARIMA Model (RegARIMA 모형을 이용한 음력 명절효과의 검정에 관한 연구)

Mun, Gwon-Sun
Proceedings of the Korean Statistical Society Conference / 2005.05a / pp.73-77 / 2005
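This paper presents a t-test statistic on the residuals of a RegARIMA model for testing whether lunar holiday effects, such as Seollal (Lunar New Year) and Chuseok, are present in a time series, and attempts a graphical diagnosis using box plots. The results of the proposed t-test are compared with the AICC pre-test of X-12-ARIMA and with the t-values of the holiday-effect regression coefficients estimated by the RegARIMA model. The holiday-effect variable used is that of Bell and Hillmer (1983).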

Fault diagnosis of wafer transfer robot based on time domain statistics (시간 영역 통계 기반 웨이퍼 이송 로봇의 고장 진단)

Hyejin Kim;Subin Hong;Youngdae Lee;Arum Park
The Journal of the Convergence on Culture Technology / v.10 no.4 / pp.663-668 / 2024
This paper applies statistical analysis methods in the time domain to the fault diagnosis of wafer transfer robots, and proposes a methodology to discern the critical characteristics of vibration and torque signals. Subsequently, principal component analysis (PCA) is applied to diminish the data's dimensionality, followed by the development of a fault diagnosis algorithm utilizing Euclidean distance and Hotelling's T-square statistics. The algorithm establishes decision boundaries to categorize failure states based on the observed data. Our findings indicate that data classification incorporating velocity parameters enhances diagnostic accuracy. This approach serves to enhance the precision and efficacy of fault diagnosis.
https://doi.org/10.17703/JCCT.2024.10.4.663
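The PCA plus Hotelling's T-square step described in the abstract above can be sketched as follows. This is a minimal sketch on synthetic data, not the authors' algorithm: the feature matrix, the number of retained components, the empirical 99% threshold, and the function names are all assumptions, and the Euclidean-distance part of the method is not shown.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on healthy-state data; returns mean, directions, score variances."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:n_components]
    variances = s[:n_components] ** 2 / (X.shape[0] - 1)
    return mean, components, variances

def hotelling_t2(X, mean, components, variances):
    """Hotelling's T-square of each row of X in the retained PCA subspace."""
    scores = (X - mean) @ components.T
    return np.sum(scores ** 2 / variances, axis=1)

# Hypothetical time-domain features (e.g. RMS, kurtosis) from healthy operation.
rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 6))
mean, comps, var = fit_pca(normal, n_components=2)

# Decision boundary taken as an empirical quantile of healthy T-square values.
threshold = np.quantile(hotelling_t2(normal, mean, comps, var), 0.99)

# Simulated faulty samples: healthy samples shifted along the first component.
faulty = normal[:5] + 10.0 * comps[0]
flags = hotelling_t2(faulty, mean, comps, var) > threshold
print(flags)
```

Dividing each squared score by its variance puts the retained components on a common scale, which is why a single threshold can serve as the decision boundary.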

Application and evaluation of PD diagnostic algorithm for 3-phase in one enclosure type GIS (3상 일괄형 GIS 부분방전 진단 알고리즘 적용 및 평가)

Kim, Seong-Il;Choi, Young-Chan;Jung, Seung-Wan;Baek, Byung-San;Kwon, Joong-Lok;Hong, Cheol-Yong
Proceedings of the KIEE Conference / 2008.07a / pp.1374-1375 / 2008
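This paper concerns a newly developed diagnostic algorithm for partial discharge (PD) diagnosis in three-phase-in-one-enclosure GIS. To develop the algorithm, real-time partial discharge data were first arranged into row vectors and column vectors, and statistical features and texture features were extracted from each vector. These features were then used to train a GA-NN (Genetic Algorithm - Neural Network) to build the diagnostic algorithm. The phase independence of the algorithm was verified by confirming that the diagnosis results agree regardless of phase shifts in the partial discharge signal. For field evaluation, the algorithm was applied to a domestic three-phase-in-one-enclosure GIS substation in which partial discharge was occurring. The results confirmed that the algorithm accurately diagnoses the partial discharge source regardless of phase, demonstrating the superiority of the developed algorithm.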

Standardized polytomous discrimination index using concordance (부합성을 이용한 표준화된 다항판별지수)

Choi, Jin Soo;Hong, Chong Sun
Journal of the Korean Data and Information Science Society / v.27 no.1 / pp.33-44 / 2016
In many situations, such as clinical decision making and credit assessment, the outcome to be predicted has more than two categories. Five statistics based on concordance have been proposed and used for such polytomous problems. However, these statistics are defined without a precise distinction between categories, which makes it difficult to apply both the pair and set approaches and to interpret the statistics. As a result, they cannot be compared or analyzed. In this paper, the polytomous confusion matrix is standardized, and the concordance statistic is expressed in terms of this confusion matrix. The five concordance-based statistics are then defined. With the proposed methods, their meanings can be explained and the statistics can be compared and analyzed. Properties of the five statistics are explored and explained using various data sets.
https://doi.org/10.7465/jkdi.2016.27.1.33

Search Result 259, Processing Time 0.028 seconds
