• Title/Summary/Keyword: quantiles

Variable selection with quantile regression tree (분위수 회귀나무를 이용한 변수선택 방법 연구)

  • Chang, Youngjae
    • The Korean Journal of Applied Statistics / v.29 no.6 / pp.1095-1106 / 2016
  • The quantile regression method proposed by Koenker and Bassett (1978) focuses on conditional quantiles given the independent variables and analyzes the relationship between the response variable and the independent variables at a given quantile. Because the quantile regression coefficients are estimated by linear programming, model fitting can become difficult for large data sets. Dimension reduction (or variable selection) is therefore a natural remedy for quantile regression on large data sets. In this paper, regression tree methods are applied to variable selection for quantile regression. Real data on Korea Baseball Organization (KBO) players are analyzed with this tree-based variable selection approach. The results show that a few important variables are selected, and that these variables are also meaningful for the given quantiles of the players' salary data.
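
The check (pinball) loss behind quantile regression can be illustrated with a short sketch. This is not the paper's procedure: it fits only a constant, searches the data points instead of solving a linear program, and the salary figures are made up for illustration.

```python
def pinball_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(values, q, tau):
    """Sum of check losses when every observation is predicted by q."""
    return sum(pinball_loss(y - q, tau) for y in values)


def best_constant(values, tau):
    """Constant minimizing the check loss.

    A minimizer can always be found among the observed values,
    so a search over the data points is enough for this toy case.
    """
    return min(sorted(values), key=lambda q: total_loss(values, q, tau))


# Hypothetical salary figures (arbitrary units), not data from the paper.
salaries = [30, 35, 40, 55, 60, 80, 120, 150, 300]
median_fit = best_constant(salaries, 0.5)
upper_fit = best_constant(salaries, 0.9)
```

With covariates, the same loss is minimized over regression coefficients, which is where the linear programming cost mentioned in the abstract comes from.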

Do Firm Characteristics Determine Capital Structure of Pakistan Listed Firms? A Quantile Regression Approach

  • KHAN, Karamat;QU, Jing;SHAH, Muhammad Haroon;BAH, Kebba;KHAN, Irfan Ullah
    • The Journal of Asian Finance, Economics and Business / v.7 no.5 / pp.61-72 / 2020
  • The purpose of this study is to investigate the determinants of the capital structure of firms operating in a developing economy, Pakistan. The quantile regression method is applied to a sample of 183 non-financial companies listed on the Pakistan Stock Exchange during the period 2008-2017. Specifically, the empirical analysis focuses on how the coefficients of the determinants change across the leverage ratio quantiles of the examined firms. The findings show that the capital structure of Pakistan listed firms differs between firms in different quantiles of leverage. These differences are significant, and the signs of the explanatory variables change with the level of leverage. Tangibility, profitability and age are found to be positively related to leverage among listed firms in Pakistan, whereas size, liquidity and non-debt tax shield (NDTS) are negatively related to leverage. A firm's growth and risk are found to be insignificant predictors of capital structure. Moreover, the study finds a significant impact of industry characteristics on leverage. The findings indicate that an individual firm's financing policy needs to be responsive to the firm's characteristics and should match the different borrowing requirements of listed firms.

Estimation for the Change of Daily Maxima Temperature (일일 최고기온의 변화에 대한 추정)

  • Ko, Wang-Kyung
    • The Korean Journal of Applied Statistics / v.20 no.1 / pp.1-9 / 2007
  • This investigation of the change in daily maximum temperature in Seoul, Daegu, Chunchen, and Youngchen was prompted by news reports that the earth is getting warmer and that Korea is warming as a result of this climatic change. A statistical analysis of the daily maxima for June over this period in Seoul revealed a positive trend of 1.1190 degrees centigrade over the 45 years, a change of 0.0249 degrees annually. Given the large variation in these maximum temperatures, one can question the significance of this increase. To check the goodness of fit of the proposed extreme value model, we show a Q-Q plot of the observed quantiles against the simulated quantiles, together with a probability plot. We also calculated statistics for each month and a tolerance limit. The significance of the increase is tested by simulating a large number of similar datasets from an extreme value distribution that describes the observed data very well. Only 0.02% of the simulated datasets showed an increase of this size or larger, meaning that such an event is very unlikely to occur by chance.
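
The simulation test described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the author's procedure: it uses one value per year, a Gumbel (type I extreme value) distribution, an ordinary least squares slope as the trend measure, and hypothetical `loc` and `scale` parameters.

```python
import math
import random


def ols_slope(values):
    """Least squares slope of values against time index 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def gumbel_sample(rng, loc, scale):
    """Inverse-transform draw from a Gumbel distribution."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep the logs finite
    return loc - scale * math.log(-math.log(u))


def trend_p_value(observed_slope, n_years, loc, scale, n_sims, seed=0):
    """Share of trend-free simulated series whose slope reaches the observed one."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        series = [gumbel_sample(rng, loc, scale) for _ in range(n_years)]
        if ols_slope(series) >= observed_slope:
            hits += 1
    return hits / n_sims


# 0.0249 degrees per year over 45 years comes from the abstract;
# loc=30.0 and scale=1.5 are placeholders, not fitted values.
p_value = trend_p_value(0.0249, 45, loc=30.0, scale=1.5, n_sims=500)
```

A small `p_value` would indicate that a trend of the observed size rarely arises from a stationary extreme value model.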

Relationship between the Sample Quantiles and Sample Quantile Ranks (표본분위수와 표본분위의 관계)

  • Ahn, Sung-Jin
    • Communications for Statistical Applications and Methods / v.18 no.6 / pp.707-716 / 2011
  • Quantiles and quantile ranks (or plotting positions) are widely used in academia and industry. At least seven sample quantile methods and at least seven sample quantile rank methods are implemented in major statistical software. Seemingly small differences between the methods can make big differences in the outcomes of decisions based on them. We discuss the characteristics and differences of the basic plotting position, which uses the empirical cumulative probability, and of the six plotting positions derived from the suggestion of Blom (1958). After discussing the characteristics and differences of the seven quantile methods used in some major statistical software, we suggest a general expression covering all seven quantile methods. Using the insight obtained from the general expression, we propose four propositions that make it possible to find the plotting position method that corresponds to each of the seven quantile methods. These correspondences may help us to understand and apply quantile methodology.
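
How a plotting position rule and a quantile rule correspond can be shown with the common one-parameter family p_k = (k - a) / (n + 1 - 2a). The sketch below assumes this family and linear interpolation; it is not the general expression proposed in the paper, and the function names are illustrative.

```python
import math


def plotting_positions(n, a):
    """Plotting positions p_k = (k - a) / (n + 1 - 2a) for k = 1..n."""
    return [(k - a) / (n + 1 - 2 * a) for k in range(1, n + 1)]


def sample_quantile(values, p, a):
    """Quantile obtained by inverting the plotting position rule.

    Solving p = (k - a) / (n + 1 - 2a) for k gives a fractional rank h;
    the quantile interpolates linearly between neighbouring order statistics.
    """
    xs = sorted(values)
    n = len(xs)
    h = a + p * (n + 1 - 2 * a)
    h = min(max(h, 1.0), float(n))  # clamp to the observed range
    lower = math.floor(h)
    upper = min(lower + 1, n)
    frac = h - lower
    return xs[lower - 1] + frac * (xs[upper - 1] - xs[lower - 1])


data = [1, 2, 3, 4, 5]
q_weibull = sample_quantile(data, 0.25, a=0.0)  # p_k = k / (n + 1)
q_hazen = sample_quantile(data, 0.25, a=0.5)    # p_k = (k - 0.5) / n
q_linear = sample_quantile(data, 0.25, a=1.0)   # p_k = (k - 1) / (n - 1)
```

On this five-point sample the three rules give three different first quartiles, which is the kind of discrepancy the abstract warns about.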

A Graphical Method to Assess Goodness-of-Fit for Inverse Gaussian Distribution (역가우스분포에 대한 적합도 평가를 위한 그래프 방법)

  • Choi, Byungjin
    • The Korean Journal of Applied Statistics / v.26 no.1 / pp.37-47 / 2013
  • A Q-Q plot is an effective and convenient graphical method for assessing a distributional assumption about data. The primary step in constructing a Q-Q plot is to obtain a closed-form expression for the relation between the observed quantiles and the theoretical quantiles, so that the plotted points fall near the line y = a + bx. In this paper, we introduce a Q-Q plot to assess goodness-of-fit for the inverse Gaussian distribution. The procedure is based on the distributional result that the transformed random variable $Y=\left|\sqrt{\lambda}\,(X-\mu)/(\mu\sqrt{X})\right|$ follows a half-normal distribution, that is, the distribution of the absolute value of a normal random variable with mean 0 and variance 1, when the random variable X has an inverse Gaussian distribution with location parameter $\mu$ and scale parameter $\lambda$. Simulations are performed to provide a guideline for interpreting the pattern of points on the proposed inverse Gaussian Q-Q plot. An illustrative example is provided to show the usefulness of the inverse Gaussian Q-Q plot.
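
The transformation above lends itself to a short sketch that produces the points of such a Q-Q plot. It assumes that `mu` and `lam` are supplied by the caller (the paper may estimate them differently) and uses the plotting position (k - 0.5) / n, which is also an assumption.

```python
import math
from statistics import NormalDist


def half_normal_qq_points(sample, mu, lam):
    """Pairs (theoretical half-normal quantile, ordered transformed value).

    Each x is mapped to y = |sqrt(lam) * (x - mu) / (mu * sqrt(x))|.
    If x is inverse Gaussian with parameters (mu, lam), y is half-normal,
    so the points should fall close to the line y = x.
    """
    ys = sorted(abs(math.sqrt(lam) * (x - mu) / (mu * math.sqrt(x))) for x in sample)
    n = len(ys)
    std_normal = NormalDist()
    # Half-normal quantile at p is the standard normal quantile at (1 + p) / 2.
    theoretical = [std_normal.inv_cdf((1 + (k - 0.5) / n) / 2) for k in range(1, n + 1)]
    return list(zip(theoretical, ys))


# Hypothetical positive observations, not data from the paper.
points = half_normal_qq_points([0.6, 1.1, 1.9, 2.0, 3.4, 5.2], mu=2.0, lam=3.0)
```

The returned pairs can be passed to any plotting tool; systematic curvature away from the 45-degree line would suggest a poor inverse Gaussian fit.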

Optimization of Data Recovery using Non-Linear Equalizer in Cellular Mobile Channel (셀룰라 이동통신 채널에서 비선형 등화기를 이용한 최적의 데이터 복원)

  • Choi, Sang-Ho;Ho, Kwang-Chun;Kim, Yung-Kwon
    • Journal of IKEEE / v.5 no.1 s.8 / pp.1-7 / 2001
  • In this paper, we investigate a CDMA (Code Division Multiple Access) cellular system with a non-linear equalizer in the reverse link channel. In general, because the channel characteristics in wireless communication are unknown, the distribution of the observables cannot be specified by a finite set of parameters; instead, we partition the m-dimensional sample space into a finite number of disjoint regions using quantiles and a vector quantizer based on training samples. The proposed algorithm is based on a piecewise approximation to the regression function, built from quantiles and conditional partition moments that are estimated by the Robbins-Monro Stochastic Approximation (RMSA) algorithm. The resulting equalizers and detectors are robust in the sense that they are insensitive to variations in the noise distribution. The main idea is that the robust equalizers and robust partition detectors perform better in an equiprobably partitioned observation subspace than the conventional equalizer does in the unpartitioned observation space, under any condition. We also apply this idea to the CDMA system and analyze the BER performance.
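
Estimating a quantile by Robbins-Monro stochastic approximation can be sketched as below. This shows only the textbook scalar recursion with a 1/n step size; the paper's equalizer, partitioning scheme, and step-size choices are not reproduced, and the uniform test stream is an assumption.

```python
import random


def robbins_monro_quantile(stream, tau, start=0.0, gain=1.0):
    """Recursive estimate of the tau-quantile of a data stream.

    Update rule: q <- q + (gain / n) * (tau - 1[x <= q]).
    The estimate moves up when too few samples fall below it
    and down when too many do.
    """
    q = start
    for n, x in enumerate(stream, start=1):
        indicator = 1.0 if x <= q else 0.0
        q += (gain / n) * (tau - indicator)
    return q


rng = random.Random(42)
samples = [rng.random() for _ in range(20000)]  # uniform(0, 1) training samples
median_estimate = robbins_monro_quantile(samples, 0.5)
upper_estimate = robbins_monro_quantile(samples, 0.9)
```

Because the recursion needs only the current sample and the current estimate, it suits training on streaming observations without storing or sorting them.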

Comparison of Methods of Selecting the Threshold of Partial Duration Series for GPD Model (GPD 모형 산정을 위한 부분시계열 자료의 임계값 산정방법 비교)

  • Um, Myoung-Jin;Cho, Won-Cheol;Heo, Jun-Haeng
    • Journal of Korea Water Resources Association / v.41 no.5 / pp.527-544 / 2008
  • The generalized Pareto distribution (GPD) is frequently applied in hydrologic extreme value analysis. The main objective of the statistics of extremes is the prediction of rare events, and the primary problem has been the estimation of the threshold and the exceedances, which is difficult without an accurate method of calculation. In this paper, four methods of obtaining the threshold or the exceedances are considered. For the comparison, a GPD model was used to estimate parameters and quantiles for seven durations (1, 2, 3, 6, 12, 18 and 24 hours) and ten return periods (2, 3, 5, 10, 20, 30, 50, 70, 80 and 100 years). The parameters and quantiles of the three-parameter generalized Pareto distribution were estimated with three methods (MOM, ML and PWM). To assess the degree of fit, three tests (K-S, CVM and A-D) were performed, and the relative root mean squared error (RRMSE) was calculated for Monte Carlo generated samples. The performance of the four methods was then compared with the objective of identifying the best among them.
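
The step from fitted GPD parameters to a quantile for a given return period can be sketched as follows. The sketch assumes the parameterization with survival function (1 + shape * (x - threshold) / scale) ** (-1 / shape) and a known mean number of exceedances per year; hydrology texts often use the opposite sign for the shape parameter, and all numeric values are hypothetical.

```python
import math


def gpd_return_level(threshold, scale, shape, rate_per_year, return_period_years):
    """Quantile exceeded on average once every return_period_years.

    Solves rate_per_year * P(X > x) = 1 / return_period_years for x,
    where X - threshold follows a generalized Pareto distribution.
    """
    m = rate_per_year * return_period_years  # expected exceedances in T years
    if m <= 1.0:
        raise ValueError("return level lies below the threshold")
    if abs(shape) < 1e-12:
        # Exponential limit of the GPD as the shape tends to zero.
        return threshold + scale * math.log(m)
    return threshold + (scale / shape) * (m ** shape - 1.0)


# Hypothetical parameters: threshold 10 mm, scale 2 mm, shape 0.5,
# one exceedance per year on average, 4-year return period.
level_4yr = gpd_return_level(10.0, 2.0, 0.5, 1.0, 4.0)
```

Raising the threshold lowers `rate_per_year` and changes the fitted scale and shape, which is why threshold selection affects the estimated quantiles.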

Frequency Analysis of Daily Rainfall in Han River Basin Based on Regional L-moments Algorithm (L-모멘트법을 이용한 한강유역 일강우량자료의 지역빈도해석)

  • Lee, Dong-Jin;Heo, Jun-Haeng
    • Journal of Korea Water Resources Association / v.34 no.2 / pp.119-130 / 2001
  • At-site and regional frequency analyses of annual maximum 1-, 2-, and 3-day rainfall in the Han River basin were performed and compared based on the regional L-moments algorithm. To perform the regional frequency analysis, the Han River basin was subdivided into three sub-basins: the South Han River, North Han River, and downstream regions. For each sub-basin, discordancy and homogeneity tests were performed. Based on the goodness-of-fit tests, the lognormal model was selected as an appropriate probability distribution for both the South Han River and downstream regions, and the gamma-3 model for the North Han River region. From Monte Carlo simulation, the RBIAS and RRMSE of the quantiles estimated by regional frequency analysis and by at-site frequency analysis were calculated and compared with each other. Regional frequency analysis shows a smaller RRMSE of the estimated quantiles than at-site frequency analysis across the return periods. The difference in RRMSE between the two approaches increases as the return period increases. As a result, it is shown that regional frequency analysis performs better than at-site analysis for annual maximum rainfall data in the Han River basin.
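
The sample L-moments on which the regional algorithm rests can be computed from probability weighted moments. The sketch below uses the standard unbiased estimators for the first three L-moments; it does not implement the discordancy or homogeneity tests, and the function name is illustrative.

```python
def sample_l_moments(values):
    """Return (l1, l2, t3): L-location, L-scale and L-skewness.

    Uses the unbiased probability weighted moments b0, b1, b2 of the
    ordered sample. Requires at least three observations.
    """
    xs = sorted(values)
    n = len(xs)
    # enumerate starts at 0, so i equals (rank - 1) of each order statistic
    b0 = sum(xs) / n
    b1 = sum((i / (n - 1)) * x for i, x in enumerate(xs)) / n
    b2 = sum((i * (i - 1)) / ((n - 1) * (n - 2)) * x for i, x in enumerate(xs)) / n
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return l1, l2, l3 / l2


l1, l2, t3 = sample_l_moments([1, 2, 3, 4, 5])
```

In a regional analysis, ratios such as l2 / l1 and t3 from each site are averaged (typically weighted by record length) before a regional distribution is fitted.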

Probabilistic Analysis of Independent Storm Events: 2. Return Periods of Storm Events (독립호우사상의 확률론적 해석 : 2. 호우사상의 재현기간)

  • Yoo, Chul-Sang;Park, Min-Kyu
    • Journal of the Korean Society of Hazard Mitigation / v.11 no.2 / pp.137-146 / 2011
  • In this study, annual maximum storm events are evaluated by applying the bivariate extremal distribution. Rainfall quantiles of probabilistic storm events are calculated using the OR-case joint return period, the AND-case joint return period, and the interval conditional joint return period. The differences among the three joint return periods are explained using the quadrants that illustrate how probabilities are calculated in bivariate frequency analysis. Rainfall quantiles under AND-case joint return periods are similar to the rainfall depths from univariate frequency analysis. The probabilistic storm events overcome the primary limitation of conventional univariate frequency analysis. The application of this storm event analysis provides a simple, statistically efficient means of characterizing the frequency of extreme storm events.
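
The OR-case and AND-case joint return periods follow directly from the marginal and joint non-exceedance probabilities. The sketch below takes those probabilities as inputs; the independence used in the example is an assumption for illustration only, and the interval conditional case is not covered.

```python
def joint_return_periods(f_x, f_y, f_xy, mean_interarrival_years=1.0):
    """Return (T_or, T_and) for a bivariate event.

    f_x, f_y : marginal non-exceedance probabilities P(X <= x), P(Y <= y)
    f_xy     : joint non-exceedance probability P(X <= x, Y <= y)
    OR case  : X > x or Y > y, with probability 1 - f_xy
    AND case : X > x and Y > y, with probability 1 - f_x - f_y + f_xy
    """
    p_or = 1.0 - f_xy
    p_and = 1.0 - f_x - f_y + f_xy
    return mean_interarrival_years / p_or, mean_interarrival_years / p_and


# Two 10-year marginal events, assumed independent for illustration.
t_or, t_and = joint_return_periods(0.9, 0.9, 0.9 * 0.9)
```

The OR-case period never exceeds the smaller marginal return period and the AND-case period never falls below the larger one, so the two cases bracket the univariate values.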

Application of EDA Techniques for Estimating Rainfall Quantiles (확률강우량 산정을 위한 EDA 기법의 적용)

  • Park, Hyunkeun;Oh, Sejeong;Yoo, Chulsang
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.4B / pp.319-328 / 2009
  • This study quantified the data by applying EDA techniques that take the data structure into account, and the results were then used for frequency analysis. Whereas traditional methods based on the method of moments yield statistics that are very sensitive to extreme values, the EDA techniques have the advantage of providing very stable statistics with small variation. To apply the EDA techniques to frequency analysis, a normalizing transformation and its inverse are needed to conserve the skewness of the raw data. That is, it is necessary to transform the raw data so that they follow the normal distribution, to estimate the statistics by applying the EDA techniques, and finally to inverse-transform the statistics of the transformed data. The resulting statistics are then used for frequency analysis with a given probability density function. This study analyzed the annual maximum one-hour rainfall data at the Seoul and Pohang stations. It was found that more stable rainfall quantiles, which were also less sensitive to extreme values, could be estimated by applying the EDA techniques. This methodology may be especially useful for the frequency analysis of rainfall at stations with high annual variation in rainfall due to climate change and similar factors.
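
The transform, estimate, and inverse-transform sequence described above can be sketched under strong assumptions. The abstract names neither the transformation nor the EDA statistics, so the log transform, the median, and the interquartile-range-based spread used here are illustrative choices, not the authors' method.

```python
import math
from statistics import NormalDist, median, quantiles


def robust_lognormal_quantile(data, non_exceedance_prob):
    """Quantile estimate from resistant statistics of log-transformed data.

    1. Transform: take logs so the data are closer to normal (assumed).
    2. Estimate: median as location, IQR / 1.349 as spread
       (1.349 is the interquartile range of a standard normal).
    3. Inverse-transform: map the normal quantile back with exp.
    """
    logs = [math.log(x) for x in data]
    q1, _, q3 = quantiles(logs, n=4)
    center = median(logs)
    spread = (q3 - q1) / 1.349
    z = NormalDist().inv_cdf(non_exceedance_prob)
    return math.exp(center + z * spread)


# Hypothetical annual maximum one-hour rainfall depths (mm).
rain = [22.0, 25.5, 31.0, 34.5, 40.0, 47.5, 58.0, 110.0]
q_median = robust_lognormal_quantile(rain, 0.5)
q_100yr = robust_lognormal_quantile(rain, 0.99)
```

Because the median and interquartile range ignore the magnitude of the largest value, replacing 110.0 with an even larger outlier leaves both estimates unchanged, which illustrates the stability the abstract refers to.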