• Title/Summary/Keyword: normality assumption


Statistical Methods for Multivariate Missing Data in Health Survey Research (보건조사연구에서 다변량결측치가 내포된 자료를 효율적으로 분석하기 위한 통계학적 방법)

  • Kim, Dong-Kee;Park, Eun-Cheol;Sohn, Myong-Sei;Kim, Han-Joong;Park, Hyung-Uk;Ahn, Chae-Hyung;Lim, Jong-Gun;Song, Ki-Jun
    • Journal of Preventive Medicine and Public Health
    • /
    • v.31 no.4 s.63
    • /
    • pp.875-884
    • /
    • 1998
  • Missing observations are common in medical and health survey research, and several statistical methods have been proposed to handle them. The EM (Expectation-Maximization) algorithm is one efficient way of handling missing data based on sufficient statistics. In this paper, we developed statistical models and methods for survey data with multivariate missing observations; in particular, we adopted the EM algorithm to handle them. We assume that the multivariate observations follow a multivariate normal distribution, where the mean vector and the covariance matrix are primarily of interest. We applied the proposed method to data from a physician survey on the Resource-Based Relative Value Scale (RBRVS). In addition to the EM algorithm, we applied complete-case analysis, which uses only completely observed cases, and available-case analysis, which uses all available information. Residual and normal probability plots were evaluated to assess the assumption of normality. We found that the residual sum of squares from the EM algorithm was smaller than those from the complete-case and available-case analyses.
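The EM idea described in this abstract can be illustrated on a much smaller problem than the paper's. The sketch below is not the authors' implementation: it assumes a bivariate normal pair (x, y) in which x is always observed and y is sometimes missing (marked `None`), and the function name `em_bivariate_normal` and its arguments are hypothetical. The E-step replaces the missing sufficient statistics with their conditional expectations, and the M-step recomputes the mean vector and covariance matrix from them.

```python
def em_bivariate_normal(xs, ys, n_iter=500):
    """EM estimates of the mean and covariance of a bivariate normal.

    Assumption for this sketch: xs is fully observed, ys uses None for
    missing values, and at least two ys are observed.
    """
    n = len(xs)
    observed = [y for y in ys if y is not None]
    # x is complete, so its ML estimates never change across iterations.
    mu_x = sum(xs) / n
    s_xx = sum((x - mu_x) ** 2 for x in xs) / n
    # Starting values: complete-case moments for y, zero covariance.
    mu_y = sum(observed) / len(observed)
    s_yy = sum((y - mu_y) ** 2 for y in observed) / len(observed)
    s_xy = 0.0
    for _ in range(n_iter):
        # E-step: expected y, y^2 and x*y given x under current parameters.
        beta = s_xy / s_xx
        resid_var = s_yy - s_xy ** 2 / s_xx
        sum_y = sum_yy = sum_xy = 0.0
        for x, y in zip(xs, ys):
            if y is None:
                ey = mu_y + beta * (x - mu_x)
                eyy = ey ** 2 + resid_var
            else:
                ey, eyy = y, y * y
            sum_y += ey
            sum_yy += eyy
            sum_xy += x * ey
        # M-step: update the parameters from the expected sufficient statistics.
        mu_y = sum_y / n
        s_yy = sum_yy / n - mu_y ** 2
        s_xy = sum_xy / n - mu_x * mu_y
    return (mu_x, mu_y), (s_xx, s_xy, s_yy)


# Made-up data: the last two y values are missing.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, None, None]
(mu_x, mu_y), (s_xx, s_xy, s_yy) = em_bivariate_normal(xs, ys)
```

Unlike complete-case analysis, the estimate of the mean of y here also uses the x values of the incomplete cases, through the regression of y on x.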


Analysis of relationship between K-CESA and creativity confluence competency and coaching skill of undergraduate students (대학생 핵심역량(K-CESA)이 창의융합역량에 미치는 영향과 코칭역량의 매개효과)

  • Park, Ji-Young
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.5
    • /
    • pp.206-215
    • /
    • 2020
  • This study analyzes the relationship between the K-CESA core competencies of undergraduate students and their coaching skill and creativity confluence competency. In total, 344 students attending private colleges in G province were evaluated. The collected data were analyzed by descriptive statistics, Pearson's correlation analysis, multiple regression analysis, and the Sobel test, using the SPSS/PC 22.0 computer program. Our results indicate that K-CESA, coaching skill, and creativity confluence competency fulfill the assumption of normality. Moreover, significant positive relationships were found between K-CESA and creativity confluence, as well as between coaching skill and creativity confluence. Furthermore, K-CESA was an important predictor of creativity confluence competency, and coaching skill had a mediating effect on the relationship between K-CESA and creativity confluence competency. In conclusion, our results indicate that K-CESA and coaching skill are important factors that help strengthen creativity confluence competency.
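The Sobel test named in this abstract checks whether an indirect (mediated) effect a·b differs from zero, where a is the predictor-to-mediator coefficient and b is the mediator-to-outcome coefficient. A minimal sketch follows, using the common first-order standard error; the function name and the numeric inputs are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist


def sobel_test(a, se_a, b, se_b):
    """Return (z, two-sided p-value) for the indirect effect a*b.

    Uses the first-order Sobel standard error
    sqrt(b^2 * se_a^2 + a^2 * se_b^2).
    """
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = a * b / se_ab
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical regression coefficients and standard errors.
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
```

The test relies on the product a·b being approximately normal, which is one reason studies of this kind report a normality check first.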

Applying a Forced Censoring Technique with Accelerated Modeling for Improving Estimation of Extremely Small Percentiles of Strengths

  • Chen Weiwei;Leon Ramon V.;Young Timothy M.;Guess Frank M.
    • International Journal of Reliability and Applications
    • /
    • v.7 no.1
    • /
    • pp.27-39
    • /
    • 2006
  • Many real-world cases in material failure analysis do not perfectly follow the normal distribution. Forcing the normality assumption may lead to inaccurate predictions and poor product quality. We examine the failure process of the internal bond (IB, or tensile strength) of medium density fiberboard (MDF). We propose a forced censoring technique that more closely fits the lower tails of strength distributions and better estimates extremely small percentiles, which may be valuable to continuous quality improvement initiatives. Further analyses are performed to build an accelerated common-shaped Weibull model for different product types using the JMP® Survival and Reliability platform. In this paper, a forced censoring technique is implemented for the first time as a software module, using the JMP® Scripting Language (JSL) to expedite data processing, which is crucial for real-time manufacturing settings. We also use JSL to automate the task of fitting an accelerated Weibull model and testing model homogeneity in the shape parameter. Finally, a package script is written to readily provide field engineers with customized reporting for model visualization, parameter estimation, and percentile forecasting. Our approach may be more accurate for product conformance evaluation, and may help reduce the cost of destructive testing and data management through reduced testing frequency. Even when destructive testing is not reduced, it may also be valuable for preventing field failures and improving product safety by yielding higher-precision intervals at the same confidence level.
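Once a Weibull model has been fitted, small percentiles of strength follow directly from the quantile function. The sketch below shows only that final step, in Python rather than JSL; it is not the authors' forced censoring procedure, and the shape and scale values are made up for illustration.

```python
import math


def weibull_percentile(p, shape, scale):
    """Strength t_p with F(t_p) = p for a two-parameter Weibull."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # log1p keeps precision when p is extremely small.
    return scale * (-math.log1p(-p)) ** (1.0 / shape)


def weibull_cdf(t, shape, scale):
    """Probability that strength is at most t."""
    return 1.0 - math.exp(-((t / scale) ** shape))


# Hypothetical fitted parameters; 0.1th percentile of strength.
t_001 = weibull_percentile(0.001, shape=12.0, scale=120.0)
```

Because such a percentile sits far in the lower tail, it is sensitive to how well the model fits the smallest observations, which is the motivation the abstract gives for fitting the lower tail more closely.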


Analysis of various statistical techniques used in the articles published during last 19 years in The Journal of Korean Acupuncture & Moxibusition Society (침구학회지 논문에 응용된 통계방식에 관한 연구 -1984 창간호부터 2002년 19권 6호까지 19년간-)

  • Lee, Seung-deok
    • Journal of Acupuncture Research
    • /
    • v.20 no.1
    • /
    • pp.144-158
    • /
    • 2003
  • This study was carried out to investigate what kinds of statistical techniques have been used to analyze data in oriental medicine research. For this study, 551 original articles that used statistical techniques in their data analysis were selected from the articles published in The Journal of Korean Acupuncture & Moxibustion Society (JKAMS) between 1984 and 2002. Among them, 122 articles used descriptive statistics, while 429 articles used inferential statistics. Among those 429 articles, the t-test (189 articles), analysis of variance (111 articles), chi-square test (14 articles), correlation (10 articles), regression analysis (4 articles), factor analysis (5 articles), or nonparametric tests (23 articles) were chosen to analyze the data. The nonparametric approach has substantial power when data do not meet the assumption of normality. This approach is not only easy to use but also provides measures of statistical variation for nominal and ordinal scales. This study shows that recent papers use nonparametric tests more often than older articles. Nine different statistical software packages (SAS, SPSS, Statview, Minitab, Sigma plot, ISP, Graphpad prism, Excel, Access) were used in the articles published in JKAMS. High-level statistical packages such as SAS, SPSS, and Statview are user friendly and were used most often in acupuncture and moxibustion research. Including tables and plots in an article facilitates understanding of family process data from a descriptive standpoint, minimizes erroneous statistical conclusions, and clarifies theoretically important relationships among variables. Tables and plots were used in 500 and 233 articles, respectively. A computer procedure is proposed and illustrated with the statistical packages SAS, SPSS, Statview and ISP.


The Effect of Childcare Teachers' Office Stress on Happiness and the Mediation Effect of Teacher Effectiveness (보육교사의 행정 업무 스트레스가 행복감에 미치는 영향에서 교사효능감의 매개효과)

  • Kim, Hye Youn
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.9
    • /
    • pp.331-337
    • /
    • 2020
  • This study analyzed the relationship between childcare teachers' office stress and happiness, and the mediating effect of teacher efficacy. A survey was conducted on 243 childcare teachers in J area. The collected data were analyzed using the JAMOVI program for descriptive statistics and mediation effects. First, the office stress, teacher efficacy, and happiness of childcare teachers were found to meet the normality assumption. Second, office stress showed a statistically significant negative correlation with both teacher efficacy and happiness, while happiness and teacher efficacy showed a significant positive correlation. Higher office stress led to lower teacher efficacy and happiness, and higher happiness led to higher teacher efficacy. Third, office stress was a significant predictor of happiness and teacher efficacy. In conclusion, office stress should be considered important in terms of happiness and teacher efficacy, and directions for improving the treatment of childcare teachers were suggested.

ROC curve and AUC for linear growth models (선형성장모형에 대한 ROC 곡선과 AUC)

  • Hong, Chong Sun;Yang, Dae Soon
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.6
    • /
    • pp.1367-1375
    • /
    • 2015
  • Consider linear growth models for longitudinal data analysis. Several kinds of linear growth models are selected, such as time-effect and random-effect models, as well as a model that includes a dummy variable. In this work, simulation data are generated under the normality assumption, and both the binormal ROC curve and the AUC are obtained and compared for various linear growth models. It is found that, for random-effect models, the ROC curves have different shapes and the AUC increases slowly as the covariance increases and time passes. On the other hand, the AUC increases very fast as the covariance decreases. When the covariance is positive, we found that the variances of random-effect models increase and the increment of the AUC is smaller than that for time-effect models.
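Under the normality assumption mentioned in this abstract, the binormal ROC curve and its AUC have closed forms. The sketch below states them for two generic normal score distributions, with group 0 distributed N(mu0, sigma0²) and group 1 distributed N(mu1, sigma1²), where larger scores indicate group 1. It does not reproduce the paper's growth models, and the function names and parameter values are illustrative assumptions.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def binormal_roc(fpr, mu0, sigma0, mu1, sigma1):
    """True positive rate at a given false positive rate, 0 < fpr < 1."""
    a = (mu1 - mu0) / sigma1
    b = sigma0 / sigma1
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpr))


def binormal_auc(mu0, sigma0, mu1, sigma1):
    """Area under the binormal ROC curve."""
    return _STD_NORMAL.cdf((mu1 - mu0) / math.sqrt(sigma0 ** 2 + sigma1 ** 2))


# Hypothetical parameters: unit variances, means one unit apart.
auc = binormal_auc(mu0=0.0, sigma0=1.0, mu1=1.0, sigma1=1.0)
```

In this formula the AUC depends on the mean difference relative to the combined spread, so for a fixed mean difference a larger variance in either group lowers the AUC.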

Finding optimal portfolio based on genetic algorithm with generalized Pareto distribution (GPD 기반의 유전자 알고리즘을 이용한 포트폴리오 최적화)

  • Kim, Hyundon;Kim, Hyun Tae
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.6
    • /
    • pp.1479-1494
    • /
    • 2015
  • Since Markowitz's mean-variance framework for portfolio analysis, portfolio optimization has been an important topic in finance. Traditional approaches focus on maximizing the expected return of the portfolio while minimizing its variance, assuming that risky asset returns are normally distributed. The normality assumption, however, has been widely criticized, as actual stock price distributions exhibit much heavier tails as well as asymmetry. To this end, in this paper we employ a genetic algorithm to find the optimal portfolio under a Value-at-Risk (VaR) constraint, where the tails of risky asset returns are modeled with the generalized Pareto distribution (GPD), the standard distribution for exceedances in extreme value theory. An empirical study using Korean stock prices shows that the proposed method is efficient and performs better than alternative methods.
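When exceedances over a threshold are modeled with a GPD, a tail quantile such as VaR can be computed with the usual peaks-over-threshold estimator from extreme value theory. The sketch below shows that formula only; it is not the paper's genetic algorithm, the abstract does not state which estimator the authors used, and all parameter values are made up. Here `u` is the threshold, `xi` and `beta` are the GPD shape and scale, `n` is the sample size, and `n_u` is the number of losses above `u`.

```python
import math


def gpd_var(q, u, xi, beta, n, n_u):
    """Peaks-over-threshold VaR estimate at confidence level q for losses.

    Valid for q at or above 1 - n_u / n, i.e. at or beyond the threshold.
    """
    tail = (n / n_u) * (1.0 - q)
    if abs(xi) < 1e-12:
        # Limit as xi -> 0: exceedances are exponential.
        return u - beta * math.log(tail)
    return u + (beta / xi) * (tail ** (-xi) - 1.0)


# Hypothetical fit: 100 of 1000 losses exceed the threshold u = 2.0.
var_99 = gpd_var(q=0.99, u=2.0, xi=0.2, beta=1.0, n=1000, n_u=100)
```

A positive shape parameter `xi` corresponds to a heavier tail than the normal distribution implies, so the resulting VaR at high confidence levels is larger than a normal-based figure fitted to the same losses would typically be.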

Small Area Estimation to Unemployment Statistics in Korea (시군 실업통계 작성을 위한 소지역 추정모형)

  • Kim, Jin;Kim, Jae-Kwang
    • Communications for Statistical Applications and Methods
    • /
    • v.17 no.3
    • /
    • pp.337-347
    • /
    • 2010
  • Most sample surveys are designed to produce reliable estimates for the whole population and for some large subpopulations. However, research on small area estimation has been increasing in recent years because users demand reliable estimates for smaller subpopulations, such as small areas or specific domains. In Korea, the Economically Active Population Survey (EAPS) is the main household survey that produces monthly unemployment rates for the nation and for 16 large areas (7 metropolitan cities and 9 provinces). For county-level estimation, direct estimators are not reliable because of the small sample sizes. We consider small area estimation of county-level unemployment rates from the sample observations in EAPS. To do this, we use an area-level model to "borrow strength" from auxiliary information, such as administrative data and census data. The proposed method is based on the assumption of normality of the model errors in the area-level model. The proposed method is compared with alternatives in terms of the estimated mean squared errors.

A new sample selection model for overdispersed count data (과대산포 가산자료의 새로운 표본선택모형)

  • Jo, Sung Eun;Zhao, Jun;Kim, Hyoung-Moon
    • The Korean Journal of Applied Statistics
    • /
    • v.31 no.6
    • /
    • pp.733-749
    • /
    • 2018
  • Sample selection arises as a result of the partial observability of the outcome of interest in a study. Heckman introduced a sample selection model to analyze such data and proposed a full maximum likelihood estimation method under the assumption of normality. Recently, sample selection models for binomial and Poisson response variables have been proposed. Based on the theory of symmetry-modulated distributions, we extend these to a model for overdispersed count data. Data of this type without sample selection are often modeled using the negative binomial distribution. Hence, we propose a sample selection model for overdispersed count data using the negative binomial distribution. A real data application is presented. Simulation studies reveal that our estimation method based on the profile log-likelihood is stable.

Analysis on the Area by Forest Function and the Reflection of Ecosystem Service Concepts in Korea's National Forest Management Plans (최근 국유림경영계획에서 산림기능별 면적구분과 생태계서비스 개념의 반영에 관한 분석)

  • Ko, Kiyeon;Choi, Jaeyong
    • Journal of Korean Society of Forest Science
    • /
    • v.109 no.2
    • /
    • pp.211-222
    • /
    • 2020
  • This study examined whether the functional classification of forests has changed over time in relation to human demand for forests. The extent to which the concept of ecosystem services has been considered in national forest management plans was also examined. A total of 98 current and previous national forest management plans were available for this study. The composition ratios of the six forest functions in both the current and previous national forest management plans were surveyed. We used a parametric t-test when the values of the two groups (current and previous) were normally distributed, and the nonparametric Wilcoxon signed-rank test when the assumption of normality was not met. Timber production forests were shown to follow a normal distribution, while the five others (water regeneration forests, disaster prevention forests, natural environment conservation forests, recreation forests, and living environment conservation forests) were not. Timber production forests and natural environment conservation forests showed significant changes in the proportion of forest area between the previous and current forest management plans. The concept of 'ecosystem services' began to appear actively in the 6th Basic Forest Plan, which started in 2018. However, the frequency with which ecosystem services were mentioned varied by Regional Forest Service.