

Wage Determinants Analysis by Quantile Regression Tree

  • Chang, Young-Jae
    • Communications for Statistical Applications and Methods / v.19 no.2 / pp.293-301 / 2012
  • Quantile regression, proposed by Koenker and Bassett (1978), is a statistical technique that estimates conditional quantiles. Its advantage over ordinary least squares (OLS) regression is its robustness to large outliers. Regression tree approaches have been applied to OLS problems to fit flexible models. Loh (2002) proposed the GUIDE algorithm, which has negligible selection bias and relatively low computational cost. Because quantile regression can be regarded as an analogue of OLS, it can also be applied to the GUIDE regression tree method. Chaudhuri and Loh (2002) proposed a nonparametric quantile regression method that blends key features of piecewise polynomial quantile regression and tree-structured regression based on adaptive recursive partitioning. Lee and Lee (2006) investigated wage determinants in the Korean labor market using the Korean Labor and Income Panel Study (KLIPS). Following Lee and Lee, we fit three kinds of quantile regression tree models to KLIPS data at the quantiles 0.05, 0.2, 0.5, 0.8, and 0.95. Among the three models, the multiple linear piecewise quantile regression model forms the shortest tree structure, while the piecewise constant quantile regression model generally has a deeper tree structure with more terminal nodes. Age, gender, marriage status, and education seem to be determinants of the wage level throughout the quantiles; in addition, education experience appears to be an important determinant of the wage level in the highly paid group.
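
As a minimal sketch of the idea behind quantile regression (not the GUIDE algorithm or the paper's models), the snippet below minimizes the Koenker-Bassett check loss over a single constant, which is roughly what one terminal node of a piecewise constant quantile regression tree would do. The function names and the wage values are hypothetical and chosen only for illustration.

```python
def check_loss(residual, tau):
    """Koenker-Bassett check (pinball) loss for a single residual."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(values, candidate, tau):
    """Sum of check losses when every value is predicted by `candidate`."""
    return sum(check_loss(v - candidate, tau) for v in values)


def constant_quantile_fit(values, tau):
    """Return the observed value that minimizes the total check loss.

    Searching only over observed values is enough for this sketch,
    because a minimizer of the check loss always exists among them.
    """
    return min(sorted(values), key=lambda c: total_check_loss(values, c, tau))


# Hypothetical wages with one large outlier (illustrative numbers only).
wages = [12, 15, 18, 21, 25, 30, 400]

median_fit = constant_quantile_fit(wages, 0.5)   # robust to the outlier
upper_fit = constant_quantile_fit(wages, 0.8)    # a higher conditional quantile
mean_fit = sum(wages) / len(wages)               # OLS analogue, pulled upward
```

Comparing `median_fit` with `mean_fit` illustrates the robustness claim: the outlier moves the mean substantially but leaves the 0.5-quantile fit at a typical wage.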

A Statistical Approach to Examine the Impact of Various Meteorological Parameters on Pan Evaporation

  • Pandey, Swati;Kumar, Manoj;Chakraborty, Soubhik;Mahanti, N.C.
    • The Korean Journal of Applied Statistics / v.22 no.3 / pp.515-530 / 2009
  • Evaporation from surface water bodies is influenced by a number of meteorological parameters. The rate of evaporation is primarily controlled by incoming solar radiation, air and water temperature, wind speed, and relative humidity. In the present study, the influence of weekly meteorological variables such as air temperature, relative humidity, bright sunshine hours, wind speed, wind velocity, and rainfall on the rate of evaporation has been examined using 35 years (1971-2005) of meteorological data. Statistical analysis was carried out using linear regression models. The developed regression models were tested for goodness of fit and multicollinearity, along with a normality test and a constant variance test. These regression models were subsequently validated using the observed and predicted parameter estimates with the meteorological data of the year 2005. The models were further checked with time-ordered residual plots to identify any trend in the scatter, and new standardized regression models were then developed using standardized equations. The highest significant positive correlation was observed between pan evaporation and maximum air temperature. Mean air temperature and wind velocity have a highly significant influence on pan evaporation, whereas minimum air temperature, relative humidity, and wind direction have no such significant influence.

Statistical Characteristics of Fractal Dimension in Turbulent Premixed Flame (난류 예혼합 화염에서의 프랙탈 차원의 통계적 특성)

  • Lee, Dae-Hun;Gwon, Se-Jin
    • Transactions of the Korean Society of Mechanical Engineers B / v.26 no.1 / pp.18-26 / 2002
  • With the introduction of fractal notation, various fields of engineering adopted it to express the characteristics of the geometry involved, and one of the most frequent areas of application was turbulence. Following research that treats turbulent surfaces as fractal geometry, attempts to analyze the turbulent premixed flame as fractal geometry have also attracted attention as a modeling tool, because the flame surface can be viewed as fractal geometry. Researchers have conducted experiments aimed at revealing flame characteristics by measuring fractal parameters, but no robust principle or theory has been extracted. The only reported modeling effort using fractal dimension is the flame speed model by Gouldin. This model gives good predictions of flame speed in the unstrained case but not under highly strained flame conditions. This research identifies the practice of treating the fractal dimension of a flame as a single representative value as a reason for the absence of a robust model and, as an effort to establish robust modeling, presents methods that treat the fractal dimension as a statistical variable. With this approach, flame characteristics reported in experiments, such as the effect of Da on flame structure, can be seen quantitatively, which shows the possibility of flame modeling using fractal parameters with statistical methods. From this result, a more quantitative model can be derived.

Sample size calculation for comparing time-averaged responses in K-group repeated binary outcomes

  • Wang, Jijia;Zhang, Song;Ahn, Chul
    • Communications for Statistical Applications and Methods / v.25 no.3 / pp.321-328 / 2018
  • In clinical trials with repeated measurements, the time-averaged difference (TAD) may provide a more powerful evaluation of treatment efficacy than the rate of changes over time when the treatment effect has rapid onset and repeated measurements continue across an extended period after a maximum effect is achieved (Overall and Doyle, Controlled Clinical Trials, 15, 100-123, 1994). The sample size formula has been investigated by many researchers for the evaluation of TAD in two treatment groups. For the evaluation of TAD in multi-arm trials, Zhang and Ahn (Computational Statistics & Data Analysis, 58, 283-291, 2013) and Lou et al. (Communications in Statistics-Theory and Methods, 46, 11204-11213, 2017b) developed the sample size formulas for continuous outcomes and count outcomes, respectively. In this paper, we derive a sample size formula to evaluate the TAD of the repeated binary outcomes in multi-arm trials using the generalized estimating equation approach. This proposed sample size formula accounts for various correlation structures and missing patterns (including a mixture of independent missing and monotone missing patterns) that are frequently encountered by practitioners in clinical trials. We conduct simulation studies to assess the performance of the proposed sample size formula under a wide range of design parameters. The results show that the empirical powers and the empirical Type I errors are close to nominal levels. We illustrate our proposed method using a clinical trial example.

An Incremental Statistical Method for Daily Activity Pattern Extraction and User Intention Inference

  • Choi, Eu-Ri;Nam, Yun-Young;Kim, Bo-Ra;Cho, We-Duke
    • KSII Transactions on Internet and Information Systems (TIIS) / v.3 no.3 / pp.219-234 / 2009
  • This paper presents a novel approach for extracting simultaneous human daily activity patterns and discovering the temporal relations among these activity patterns. Such an approach is needed to resolve service conflicts and to satisfy a user who wants to use multiple services. To extract the simultaneous activity patterns, context has been collected from physical sensors and electronic devices. In addition, a context model is organized by the proposed incremental statistical method to determine conflicts and to infer user intentions by analyzing the daily human activity patterns. The context model is represented by the sets of simultaneous activity patterns and the temporal relations between the sets. To evaluate the method, experiments are carried out on a test-bed called the Ubiquitous Smart Space. Furthermore, a user-intention simulator based on the simultaneous activity patterns and the temporal relations obtained from the inferred intentions is demonstrated.

Signal Subspace-based Voice Activity Detection Using Generalized Gaussian Distribution (일반화된 가우시안 분포를 이용한 신호 준공간 기반의 음성검출기법)

  • Um, Yong-Sub;Chang, Joon-Hyuk;Kim, Dong Kook
    • The Journal of the Acoustical Society of Korea / v.32 no.2 / pp.131-137 / 2013
  • In this paper, we propose an improved voice activity detection (VAD) algorithm using statistical models in the signal subspace domain. An uncorrelated signal subspace is generated using an embedded prewhitening technique, and the statistical characteristics of the noisy speech and noise are investigated in this domain. Based on the characteristics of the signals in the signal subspace, a new statistical VAD method using the generalized Gaussian distribution (GGD) is proposed. Experimental results show that the proposed GGD-based approach outperforms the Gaussian-based signal subspace method under 0-15 dB SNR simulation conditions.
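
For readers unfamiliar with the GGD, the following is a small sketch of its density in one common parameterization (scale `alpha`, shape `beta`). The abstract does not give the paper's parameterization or how the model enters the VAD decision rule, so the function names and the parameterization here are assumptions. The sketch shows why the GGD generalizes the Gaussian model: a shape of 2 recovers the Gaussian density and a shape of 1 recovers the Laplacian.

```python
import math


def ggd_pdf(x, alpha, beta, mu=0.0):
    """Generalized Gaussian density with scale alpha and shape beta.

    f(x) = beta / (2 * alpha * Gamma(1 / beta)) * exp(-(|x - mu| / alpha) ** beta)
    """
    coeff = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return coeff * math.exp(-((abs(x - mu) / alpha) ** beta))


def gaussian_pdf(x, sigma, mu=0.0):
    """Ordinary Gaussian density, used here only for comparison."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def laplace_pdf(x, b, mu=0.0):
    """Laplacian density, used here only for comparison."""
    return math.exp(-abs(x - mu) / b) / (2.0 * b)


sigma = 1.5
# beta = 2 with alpha = sigma * sqrt(2) matches the Gaussian with std sigma.
gauss_like = ggd_pdf(0.7, alpha=sigma * math.sqrt(2.0), beta=2.0)
# beta = 1 matches the Laplacian with scale alpha.
laplace_like = ggd_pdf(0.7, alpha=1.2, beta=1.0)
```

A shape parameter below 2 gives heavier tails than the Gaussian, which is one plausible reason a GGD could fit speech-like data better than a Gaussian model.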

Fatigue Life Estimation of Welded Joints considering Statistical Characteristics of Multiple Surface Cracks (복수 표면균열의 확률적 특성을 고려한 용접부 피로수명 평가)

  • Han, Jeong Woo;Han, Seung Ho
    • Transactions of the Korean Society of Mechanical Engineers A / v.29 no.11 s.242 / pp.1472-1479 / 2005
  • Multiple surface cracks distributed randomly along a weld toe strongly influence the fatigue crack propagation life of a welded joint. This influence is investigated using statistical approaches based on a series of systematic experiments. The statistical results show that the number and locations of initial cracks follow the normal distribution, and that the probability of initial crack depths and lengths is described well by the Weibull distribution. These characteristics are used to calculate the fatigue crack propagation life, taking into account the mechanisms of mutual interaction and coalescence of the multiple cracks as well as the Mk-factors obtained from a parametric study on the crack depths and lengths. The automatic calculation is performed with NESUSS, where parameters such as the number, location, and size of the cracks are all treated as random variables. The random variables are handled by Monte Carlo simulation with 2,000 random samples. The simulation results show that multiple cracks lead to a much shorter crack propagation life than a single crack. The sum of the simulated crack propagation life and the fatigue crack initiation life derived by the notch strain approach agrees well with the experiments.

Statistical Prediction of Wake Fields on Propeller Plane by Neural Network using Back-Propagation

  • Hwangbo, Seungmyun;Shin, Hyunjoon
    • Journal of Ship and Ocean Technology / v.4 no.3 / pp.1-12 / 2000
  • A number of numerical methods such as Computational Fluid Dynamics (CFD) have been developed to predict the flow fields of a vessel, but the present study infers the wake fields on the propeller plane by a Statistical Fluid Dynamics (SFD) approach, which is emerging as a new technique across a wide range of industrial fields. The neural network is well known as a promising representative of SFD tools and is widely applied in engineering fields. In addition to its stable and effective system structure, the generalization of input training patterns into different classifications or categories during training can offer a more systematic treatment of the input and more reliable results. Because a neural network can learn from external information, logical programming is not necessary, and it can flexibly handle incomplete information that is difficult to define clearly. Three-dimensional stern hull forms and nominal wake values from a model test are structured as the processing elements of the input and output layers, respectively, and a neural network is trained by the back-propagation method. The inferred results are similar to the experimental wake distribution.


Statistical properties of the maximum elastoplastic story drift of steel frames subjected to earthquake load

  • Li, Gang
    • Steel and Composite Structures / v.3 no.3 / pp.185-198 / 2003
  • The concept of performance-based seismic design, in which the cost-effectiveness criterion is one of the most important principles and more attention is paid to structural performance at the inelastic stage, has recently been gradually accepted by the earthquake engineering profession. Since there are many uncertainties in seismic design, reliability analysis is a major task in performance-based seismic design. However, structural reliability analysis may be very costly and time consuming because the limit state function is usually a highly nonlinear implicit function of the basic design variables, especially for dynamic and nonlinear analysis of complex large-scale structures. Understanding the statistical properties of structural inelastic deformation, which is the aim of the present paper, helps in developing an efficient approximate approach to reliability analysis. The present paper studies the statistical properties of the maximum elastoplastic story drift of steel frames subjected to earthquake load. The randomness of earthquake load, dead load, live load, steel elastic modulus, yield strength, and structural member dimensions is considered. Possible probability distributions for the maximum story drift are evaluated using the K-S test. The results show that the choice of probability distribution for the maximum elastoplastic story drift of steel frames is related to the mean value of the maximum elastoplastic story drift. When the mean drift is small (less than 0.3%), an extreme value type I distribution is the best choice. However, for large drifts (more than 0.35%), an extreme value type II distribution is best.
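
The distribution comparison described above can be sketched with a hand-written Kolmogorov-Smirnov statistic. This is an illustration under stated assumptions, not the paper's procedure: the drift values are hypothetical, the method-of-moments fit for the type I (Gumbel) parameters is a choice made here, and all function names are invented for the example.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_cdf(x, loc, scale):
    """Extreme value type I (Gumbel) CDF."""
    return math.exp(-math.exp(-(x - loc) / scale))


def frechet_cdf(x, loc, scale, shape):
    """Extreme value type II (Frechet) CDF; zero at or below loc."""
    if x <= loc:
        return 0.0
    return math.exp(-(((x - loc) / scale) ** (-shape)))


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare the model CDF with the empirical step just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def gumbel_moment_fit(sample):
    """Method-of-moments estimates (loc, scale) for the Gumbel distribution."""
    scale = statistics.stdev(sample) * math.sqrt(6.0) / math.pi
    loc = statistics.mean(sample) - EULER_GAMMA * scale
    return loc, scale


# Hypothetical maximum story drifts in percent (illustrative numbers only).
drifts = [0.21, 0.24, 0.19, 0.27, 0.22, 0.31, 0.25, 0.20, 0.23, 0.28]
loc, scale = gumbel_moment_fit(drifts)
d_gumbel = ks_statistic(drifts, lambda x: gumbel_cdf(x, loc, scale))
```

A smaller statistic indicates a closer fit, so computing it for each candidate distribution gives a simple ranking. Note that when parameters are estimated from the same sample, the standard K-S critical values are only approximate.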

Analysis of Bioequivalence Study using a Log-transformed Model (로그변환 모델에 따른 생물학적 동등성 판정 연구)

  • 이영주;김윤균;이명걸;정석재;이민화;심창구
    • YAKHAK HOEJI / v.44 no.4 / pp.308-314 / 2000
  • Logarithmic transformation of pharmacokinetic parameters is routinely used in bioequivalence studies, on pharmacokinetic and statistical grounds, by the United States Food and Drug Administration (FDA), the European Committee for Proprietary Medicinal Products (CPMP), and the Japanese National Institute of Health and Science (NIHS). Although it has not yet been recommended by the Korea Food and Drug Administration (KFDA), its use is becoming increasingly necessary in order to harmonize with international standards. In the present study, statistical procedures for bioequivalence analysis based on the log transformation, together with a related SAS procedure, are demonstrated to aid understanding and application. The AUC parameters used in this demonstration were taken from a previous bioequivalence study of two aceclofenac tablets, which was performed in a single-dose crossover design. Analysis of variance (ANOVA), statistical power to detect a 20% difference between the tablets, minimum detectable difference, and confidence intervals were all assessed following log transformation of the data. Bioequivalence of the two aceclofenac tablets was then evaluated based on the FDA guideline. Considering the international effort toward harmonization of guidelines for bioequivalence tests, this approach may require further evaluation for future adoption in the Korea Guidelines of Bioequivalence Tests (KGBT).
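
To illustrate why the log transformation leads to a statement about the ratio of the two formulations, here is a deliberately simplified sketch. It is not the paper's SAS procedure or its crossover ANOVA: it treats the data as paired log differences and ignores period and sequence effects. The AUC values are hypothetical, the critical t value must be supplied by the caller, and the 0.80 to 1.25 limits are the commonly used log-scale acceptance range, assumed here rather than taken from the abstract.

```python
import math
import statistics


def ratio_confidence_interval(test_auc, ref_auc, t_crit):
    """Geometric mean ratio (test/reference) with a t-based interval.

    Works on paired log differences; `t_crit` is the critical t value
    for the desired confidence level and n - 1 degrees of freedom.
    """
    diffs = [math.log(t) - math.log(r) for t, r in zip(test_auc, ref_auc)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    lower = math.exp(mean_diff - t_crit * std_err)
    upper = math.exp(mean_diff + t_crit * std_err)
    return math.exp(mean_diff), lower, upper


def within_limits(lower, upper, low_limit=0.80, high_limit=1.25):
    """True when the whole interval lies inside the acceptance range."""
    return low_limit <= lower and upper <= high_limit


# Hypothetical AUC values for 8 subjects (illustrative numbers only).
reference = [20.1, 18.4, 25.3, 22.0, 19.7, 24.8, 21.5, 23.2]
test = [19.5, 19.0, 24.1, 23.2, 18.9, 25.5, 20.8, 22.6]

# 1.895 is the one-sided 5% critical t value for 7 degrees of freedom,
# which yields a two-sided 90% confidence interval.
ratio, lower, upper = ratio_confidence_interval(test, reference, t_crit=1.895)
equivalent = within_limits(lower, upper)
```

Because differences of logarithms are logarithms of ratios, exponentiating the interval on the log scale gives an interval for the test-to-reference ratio directly.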
