• Title/Summary/Keyword: small sample size

A Resampling Method for Small Sample Size Problems in Face Recognition using LDA (LDA를 이용한 얼굴인식에서의 Small Sample Size문제 해결을 위한 Resampling 방법)

  • Oh, Jae-Hyun;Kwak, Jo-Jun
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.2 / pp.78-88 / 2009
  • In many face recognition problems, the number of available images is limited compared to the dimension of the input space, which is usually equal to the number of pixels. This is called the 'small sample size' problem, and regularization methods are typically used to solve it in feature extraction methods such as LDA. With regularization, the modified within-class scatter matrix becomes nonsingular and LDA can be performed in its original form. However, in the process of adding a scaled version of the identity matrix to the original within-class scatter matrix, the scale factor must be set heuristically, and the performance of the recognition system depends highly on its value. The proposed resampling method generates a set of images similar to but slightly different from the original images. With the increased number of images, the small sample size problem is alleviated and the classification performance increases. Unlike the regularization method, the resampling method does not suffer from the heuristic setting of a parameter and produces better performance.
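
The regularization step criticized in this abstract can be sketched as follows. This is a generic illustration under assumed toy data (random "images", four subjects), not the authors' implementation; `alpha` stands for the heuristic scale factor the abstract refers to.

```python
# A minimal sketch of regularized LDA for the small-sample-size case: the
# within-class scatter S_w is singular when there are fewer images than
# pixels, so a scaled identity alpha*I is added before inverting. The
# paper's resampling idea would instead augment X with perturbed copies
# of each image so that no such scale factor is needed.
import numpy as np

def regularized_lda(X, y, alpha=1e-3, n_components=None):
    """X: (n_samples, n_features) image vectors, y: class labels."""
    classes = np.unique(y)
    mean_total = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        S_w += (Xc - mean_c).T @ (Xc - mean_c)
        diff = (mean_c - mean_total).reshape(-1, 1)
        S_b += len(Xc) * (diff @ diff.T)
    S_w_reg = S_w + alpha * np.eye(d)          # heuristic regularization
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_w_reg) @ S_b)
    order = np.argsort(eigvals.real)[::-1]
    k = n_components or (len(classes) - 1)
    return eigvecs[:, order[:k]].real          # projection matrix

# Hypothetical usage with tiny random "images":
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 64))                  # 20 images, 64 pixels each
y = np.repeat(np.arange(4), 5)                 # 4 subjects, 5 images each
X_proj = X @ regularized_lda(X, y, alpha=1e-3)
```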

Consolidation characteristics of soft ground using huge sample (대형 sample을 이용한 해안 연약지반 압밀특성에 관한 연구)

  • Hong, Sung-Jin;Lee, Moon-Joo;Jung, Doo-Suk;Lee, Woo-Jin
    • Proceedings of the Korean Geotechnical Society Conference / 2008.10a / pp.1109-1114 / 2008
  • To investigate the effect of sample size on the coefficient of consolidation of non-homogeneous soil, the result of a large-scale consolidation test using a huge undisturbed sample of 1200 mm (D) × 2000 mm (H) is compared with that of an oedometer test using a small undisturbed sample. In addition, the test results are compared with those of the same tests using remolded samples. Experimental results show that, because lumps of sand/silt were mixed into the sample, the coefficients of consolidation of the undisturbed samples differ from test to test, whereas no difference in the coefficient of consolidation is found between the remolded large and small samples. Because sample size affects the test results, samples must be carefully selected for non-homogeneous soil.


Development of A Lot Quality Assurance and Inspection Cost Estimation System for Process-Centered Inspections (공정 중심의 검사 방식에 대한 로트 품질보증 및 검사비용 추정 시스템 개발)

  • Lee, Do-Kyung;Lee, Seung-Woo
    • Journal of Korean Society of Industrial and Systems Engineering / v.29 no.1 / pp.34-40 / 2006
  • Many producers operate their sampling inspection policies in whatever way is convenient for them; examples of this convenience are irregular lot sizes and overly small sample sizes. Because they do not use a standard sampling inspection policy, they cannot guarantee the quality level of their products. In this study, we developed a user-centered design program that can calculate the AOQL of their products for their buyers in the case of irregular lot sizes and overly small sample sizes. The program also proposes a linear inspection cost based on Hald's model.
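
The AOQL that such a program reports can be illustrated with the standard single-sampling formula for rectifying inspection. This is a textbook sketch, not the authors' program; the lot size N, sample size n, and acceptance number c below are hypothetical, and Hald's cost model is not reproduced.

```python
# Generic numerical computation of AOQL for a single sampling plan with
# rectifying inspection: AOQ(p) = p * Pa(p) * (N - n) / N, and AOQL is the
# worst-case AOQ over incoming quality levels p.
import numpy as np
from scipy.stats import binom

def aoq(p, N, n, c):
    """Average outgoing quality at incoming fraction defective p."""
    pa = binom.cdf(c, n, p)              # probability of accepting the lot
    return p * pa * (N - n) / N          # defectives that slip through

def aoql(N, n, c, grid=np.linspace(0.0, 0.2, 2001)):
    """AOQL = maximum AOQ over the scanned range of incoming quality."""
    return max(aoq(p, N, n, c) for p in grid)

# Hypothetical plan: lot size 500, sample 50, acceptance number 1
print(aoql(N=500, n=50, c=1))
```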

An Overview of Bootstrapping Method Applicable to Survey Researches in Rehabilitation Science

  • Choi, Bong-sam
    • Physical Therapy Korea / v.23 no.2 / pp.93-99 / 2016
  • Background: Parametric statistical procedures are typically conducted under the condition that the sample distribution is statistically identical with its population. In reality, investigators use inferential statistics to estimate parameters from the sample drawn because population distributions are unknown. The uncertainty of limited data from the sample, such as an insufficient sample size, is a challenge in most rehabilitation studies. Objects: The purpose of this study is to review the bootstrapping method as a way to overcome the shortcomings of limited sample size in rehabilitation studies. Methods: Articles were reviewed. Results: Bootstrapping is a statistical procedure that permits iterative re-sampling with replacement from a sample when the population distribution is unknown. The procedure enhances the representativeness of the population being studied and provides estimates of the parameters when the sample size is too limited to generalize the study outcome to the target population. The bootstrapping method can overcome limitations such as type II errors resulting from small sample sizes. An application to typical data from a study showed how to deal with the challenge of estimating a parameter from a small sample size and how to handle the uncertainty with appropriate confidence intervals and confidence levels. Conclusion: The bootstrapping method may be an effective statistical procedure for reducing the standard error of population parameters under conditions requiring both acceptable confidence intervals and an acceptable confidence level (i.e., p=.05).
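
A minimal sketch of the percentile bootstrap the article reviews, assuming a small hypothetical sample of survey scores; it is not the article's own application.

```python
# Percentile bootstrap: resample the observed data with replacement many
# times, recompute the statistic each time, and read the confidence
# interval off the empirical distribution of those replicates.
import numpy as np

rng = np.random.default_rng(42)
sample = np.array([3.1, 2.8, 3.6, 4.0, 2.5, 3.3, 3.9, 2.9])  # hypothetical small sample

def bootstrap_ci(data, stat=np.mean, n_boot=10_000, alpha=0.05):
    """Point estimate, percentile CI, and bootstrap standard error."""
    boot_stats = np.array([
        stat(rng.choice(data, size=len(data), replace=True))
        for _ in range(n_boot)
    ])
    lo, hi = np.quantile(boot_stats, [alpha / 2, 1 - alpha / 2])
    return stat(data), (lo, hi), boot_stats.std(ddof=1)

est, ci, se = bootstrap_ci(sample)
print(f"mean={est:.2f}, 95% CI=({ci[0]:.2f}, {ci[1]:.2f}), bootstrap SE={se:.2f}")
```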

Effect of Positively Skewed Distribution on the Two sample t-test: Based on Chi-square Distribution

  • Heo, Sunyeong
    • Journal of Integrative Natural Science / v.14 no.3 / pp.123-129 / 2021
  • This research examines the effect of a positively skewed population distribution on the two-sample t-test through simulation. For the simulation work, two independent samples were selected from the same chi-square distributions with 3, 5, 10, 15, 20, and 30 degrees of freedom and sample sizes of 3, 5, 10, 15, 20, and 30, respectively. The chi-square distribution is strongly skewed to the right at small degrees of freedom and becomes symmetric as the degrees of freedom increase. The simulation results show that when the sampled populations are positively skewed, like a chi-square distribution with small degrees of freedom, the F-test for the equality of variances performs poorly even at relatively large degrees of freedom and sample sizes such as 30 for both, so its use is not recommended. When the two population variances are equal, the skewness of the population distribution does not affect the t-test in terms of the confidence level. However, even though the t-test achieved the nominal confidence level for highly positively skewed distributions and small sample sizes such as three or five, the error limits are very large at small sample sizes. Therefore, if the sampled population is expected to be highly skewed to the right, a relatively large sample size of at least 20 is recommended.
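
The simulation design described here can be sketched as follows; the number of replications and the 5% significance level are assumptions, and this is not the author's code.

```python
# Monte Carlo check of the two-sample t-test under a skewed population:
# both samples come from the same chi-square distribution (no true
# difference), so the rejection rate estimates the actual type I error.
import numpy as np
from scipy import stats

def empirical_alpha(df, n, n_rep=5000, seed=1):
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_rep):
        x = rng.chisquare(df, size=n)
        y = rng.chisquare(df, size=n)
        _, p = stats.ttest_ind(x, y, equal_var=True)   # samples share one population
        rejections += (p < 0.05)
    return rejections / n_rep

for df in (3, 10, 30):
    for n in (3, 10, 30):
        print(f"df={df:2d}, n={n:2d}: empirical type I error = {empirical_alpha(df, n):.3f}")
```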

Sample Size Requirements in Diagnostic Test Performance Studies (진단검사의 특성 추정을 위한 표본크기)

  • Pak, Son-Il;Oh, Tae-Ho
    • Journal of Veterinary Clinics / v.32 no.1 / pp.73-77 / 2015
  • There has been increasing attention to sample size requirements in the peer-reviewed medical literature. Accordingly, statistically valid sample size determination has been described for a variety of medical situations, including diagnostic test accuracy studies. If the sample is too small, the estimate is too inaccurate to be useful. On the other hand, a very large sample size would yield an estimate more accurate than required but may be costly and inefficient. Choosing the optimal sample size depends on statistical considerations, such as the desired precision, statistical power, confidence level, and prevalence of disease, and on non-statistical considerations, such as resources, cost, and sample availability. In a previous paper (J Vet Clin 2012; 29: 68-77) we briefly described the statistical theory behind sample size calculations and provided practical methods of calculating sample size in different situations for different research purposes. This review describes how to calculate sample sizes when assessing diagnostic test performance, such as sensitivity and specificity alone. Also included are tables and formulae to help researchers design diagnostic test studies and calculate sample size in studies evaluating test performance. For complex studies, clinicians are encouraged to consult a statistician for help with the design and analysis to determine the sample size accurately.
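
A sketch of the widely used precision-based calculation for sensitivity and specificity that reviews of this kind tabulate; the inputs (sensitivity 0.90, specificity 0.85, prevalence 0.10, precision 0.05) are assumptions, and the paper's own tables and formulae should be consulted for definitive values.

```python
# n = z^2 * p * (1 - p) / d^2 gives the number of diseased (for sensitivity)
# or non-diseased (for specificity) subjects needed for precision d; the
# total recruitment is then inflated by the expected disease prevalence.
import math
from scipy.stats import norm

def n_for_test_accuracy(se, sp, prevalence, precision=0.05, conf=0.95):
    z = norm.ppf(1 - (1 - conf) / 2)                   # 1.96 for 95% confidence
    n_se = (z**2 * se * (1 - se)) / precision**2       # diseased subjects needed
    n_sp = (z**2 * sp * (1 - sp)) / precision**2       # non-diseased subjects needed
    total_se = n_se / prevalence                       # recruit enough to find the diseased
    total_sp = n_sp / (1 - prevalence)
    return math.ceil(max(total_se, total_sp))

print(n_for_test_accuracy(se=0.90, sp=0.85, prevalence=0.10))
```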

Sample Size Calculation for Cluster Randomized Trials (임상시험의 표본크기 계산)

  • Pak, Son-Il;Oh, Tae-Ho
    • Journal of Veterinary Clinics / v.31 no.4 / pp.288-292 / 2014
  • A critical assumption of the standard sample size calculation is that the response (outcome) for an individual patient is completely independent of that for any other patient. However, this assumption no longer holds when there is a lack of statistical independence across subjects, as seen in cluster randomized designs. In this setting, patients within a cluster are more likely to respond in a similar manner; patient outcomes may correlate strongly within clusters. Thus, direct use of standard sample size formulae for a cluster design, ignoring the clustering effect, may result in sample sizes that are too small and a study that is under-powered for detecting the desired difference between groups. This paper revisits the worked examples for sample size calculation provided in a previous paper, using a nomogram for easier access. We then present the concept of the cluster design, illustrated with worked examples, and introduce the design effect, a factor used to inflate the standard sample size estimates.
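
The design-effect inflation described here can be sketched as follows, assuming the standard two-mean comparison for the individually randomized sample size; all numeric inputs are illustrative.

```python
# Inflate the standard per-arm sample size by the design effect
# DEFF = 1 + (m - 1) * ICC, where m is the cluster size and ICC is the
# intracluster correlation coefficient, then convert to clusters per arm.
import math
from scipy.stats import norm

def n_per_arm_individual(delta, sd, alpha=0.05, power=0.80):
    """Per-arm sample size for comparing two means under individual randomization."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return ((z_a + z_b) ** 2 * 2 * sd**2) / delta**2

def n_per_arm_cluster(delta, sd, cluster_size, icc, alpha=0.05, power=0.80):
    n_ind = n_per_arm_individual(delta, sd, alpha, power)
    deff = 1 + (cluster_size - 1) * icc            # design effect
    n_adj = n_ind * deff
    clusters = math.ceil(n_adj / cluster_size)     # clusters needed per arm
    return math.ceil(n_adj), clusters

# Illustrative inputs: detect a 0.5 SD difference with 20 patients per cluster, ICC 0.05
print(n_per_arm_cluster(delta=0.5, sd=1.0, cluster_size=20, icc=0.05))
```

Even a modest intracluster correlation inflates the required sample size noticeably, which is the under-powering risk the abstract warns about when clustering is ignored.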

Effects of Particle Size and Gelatinization of Job's Tears Powder on the Instant Properties

  • Han, Sung-Hee;Park, Soo-Jea;Lee, Seog-Won;Rhee, Chul
    • Preventive Nutrition and Food Science / v.15 no.1 / pp.67-73 / 2010
  • The effects of particle size (small, medium, and large) and gelatinization treatment on the instant properties of Job's tears powder were investigated. The degree of gelatinization was highest for the small particle size and showed an increasing trend as the particle size decreased from large to small, whether or not the powder was pregelatinized. The water solubility index of the pregelatinized samples was high compared with that of the ungelatinized samples regardless of particle size and temperature. Water absorption and swelling power increased as particle size and temperature increased. The dispersibility and sinkability of the ungelatinized samples increased as particle size and temperature increased, and showed lower values regardless of particle size and temperature. However, the pregelatinized samples showed the opposite result, with the smallest particle size having the lowest sinkability (11.3%). The turbidity of the pregelatinized small particle size was the highest, by a factor of 1.08.

Small Sample Characteristics of Generalized Estimating Equations for Categorical Repeated Measurements (범주형 반복측정자료를 위한 일반화 추정방정식의 소표본 특성)

  • 김동욱;김재직
    • The Korean Journal of Applied Statistics / v.15 no.2 / pp.297-310 / 2002
  • Liang and Zeger proposed generalized estimating equations (GEE) for analyzing repeated data that are discrete or continuous. The GEE model can be extended to repeated categorical data, and its estimator has an asymptotic multivariate normal distribution for large sample sizes. However, GEE is based on large-sample asymptotic theory. In this paper, we study the properties of GEE estimators for repeated ordinal data in small sample sizes. We generate ordinal repeated measurements for two groups using two methods. Through Monte Carlo simulation studies we investigate the empirical type I error rates, powers, and relative efficiencies of the GEE estimators, the effect of unequal sample sizes in the two groups, and the performance of the variance estimators for polytomous ordinal response variables, especially in small sample sizes.
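
A sketch of the kind of small-sample check described here, using binary rather than ordinal repeated measures for brevity and the statsmodels GEE implementation; the data-generating setup is an assumption, not the authors' simulation design.

```python
# Estimate the empirical type I error of the GEE group-effect test with a
# small number of subjects per group: data are generated with no true group
# effect, so the rejection rate should match the nominal 0.05 if the
# large-sample approximation held.
import numpy as np
import statsmodels.api as sm

def one_rep(n_per_group=10, n_times=4, rho=0.3, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    n = 2 * n_per_group
    group = np.repeat([0, 1], n_per_group)
    # shared + independent latent normals give exchangeable within-subject correlation rho
    z = np.sqrt(rho) * rng.normal(size=(n, 1)) + np.sqrt(1 - rho) * rng.normal(size=(n, n_times))
    y = (z > 0).astype(int).ravel()                  # binary outcomes, no true group effect
    X = sm.add_constant(np.repeat(group, n_times).astype(float))
    ids = np.repeat(np.arange(n), n_times)
    res = sm.GEE(y, X, groups=ids,
                 family=sm.families.Binomial(),
                 cov_struct=sm.cov_struct.Exchangeable()).fit()
    return res.pvalues[1] < 0.05                     # false rejection of the group effect?

rng = np.random.default_rng(0)
n_rep = 200
rejections = sum(one_rep(rng=rng) for _ in range(n_rep))
print("empirical type I error:", rejections / n_rep)  # compare with the nominal 0.05
```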

Optimal designs for small Poisson regression experiments using second-order asymptotic

  • Mansour, S. Mehr;Niaparast, M.
    • Communications for Statistical Applications and Methods / v.26 no.6 / pp.527-538 / 2019
  • This paper considers the issue of obtaining the optimal design for the Poisson regression model when the sample size is small. The Poisson regression model is widely used for the analysis of count data, and asymptotic theory provides the basis for making inferences on its parameters. However, for small experiments, asymptotic approximations, such as unbiasedness, may not be valid. Therefore, we first employ the second-order expansion of the bias of the maximum likelihood estimator (MLE) and derive the mean square error (MSE) of the MLE to measure the quality of an estimator. We then define the DM-optimality criterion, which is based on a function of the MSE. This criterion is applied to obtain locally optimal designs for small experiments. The effect of sample size on the obtained designs is shown. We also obtain locally DM-optimal designs for some special cases of the model.
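
For background, a sketch of evaluating candidate designs under the classical local D-optimality criterion for a one-covariate Poisson model; the paper's DM-criterion is based on the second-order MSE of the MLE and is not reproduced here, and the model and parameter values below are assumptions.

```python
# Locally D-optimal design search for lambda(x) = exp(b0 + b1*x): the Fisher
# information of a weighted design is M = sum_i w_i * lambda(x_i) * f(x_i) f(x_i)^T
# with f(x) = (1, x)^T, and D-optimality maximizes log det M.
import numpy as np
from itertools import combinations

def info_matrix(points, weights, beta):
    M = np.zeros((2, 2))
    for x, w in zip(points, weights):
        f = np.array([1.0, x])
        lam = np.exp(beta @ f)             # Poisson intensity at x
        M += w * lam * np.outer(f, f)
    return M

def log_det_info(points, weights, beta):
    sign, logdet = np.linalg.slogdet(info_matrix(points, weights, beta))
    return logdet if sign > 0 else -np.inf

beta = np.array([1.0, -2.0])               # assumed local parameter values
grid = np.linspace(0.0, 1.0, 101)
# brute-force search over equally weighted two-point designs on [0, 1]
best = max(combinations(grid, 2), key=lambda pts: log_det_info(pts, (0.5, 0.5), beta))
print("locally D-optimal two-point design:", best)
```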