• Title/Summary/Keyword: sample statistics (샘플 통계)

Multi-level skip-lot sampling plan (다단계 스킵-로트 샘플링검사 계획)

  • 최병철
    • The Korean Journal of Applied Statistics
    • /
    • v.6 no.2
    • /
    • pp.277-287
    • /
    • 1993
  • This paper generalizes single- and two-level skip-lot sampling plans to n levels, which can considerably reduce inspection cost when the level of submitted quality is high. At every skipping inspection of the generalized sampling plan, not only the skipping parameters but also the inspection fractions can be freely chosen. The general formula of the operating characteristic function for the n-level skip-lot sampling plan is derived. The operating characteristic curves of a reference plan and of two-level and three-level skip-lot sampling plans are also compared.
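
As a point of reference for the operating characteristic (OC) function discussed in this abstract, the following is a minimal sketch of the single-level case only; it is not the paper's n-level formula. It assumes the commonly cited single-level skip-lot form (usually attributed to Perry's SkSP-2) and a hypothetical single-sampling reference plan. The names `f`, `i`, `n_ref`, and `c` are illustrative assumptions, not notation taken from the paper.

```python
from math import comb


def reference_pa(p, n_ref, c):
    """Acceptance probability of an assumed single-sampling reference plan:
    accept the lot if at most c defectives appear among n_ref sampled items
    (binomial model with defective fraction p)."""
    return sum(comb(n_ref, k) * p**k * (1 - p) ** (n_ref - k) for k in range(c + 1))


def skip_lot_pa(p, f, i, n_ref, c):
    """OC function of a single-level skip-lot plan (assumed SkSP-2 form).

    f: fraction of lots inspected while in the skipping state (0 < f <= 1)
    i: clearance number, i.e. consecutive accepted lots needed to start skipping
    """
    P = reference_pa(p, n_ref, c)
    return (f * P + (1 - f) * P**i) / (f + (1 - f) * P**i)


if __name__ == "__main__":
    # Compare the reference plan with the skip-lot plan at a few quality levels.
    for p in (0.005, 0.01, 0.02, 0.05):
        print(p, round(reference_pa(p, 50, 1), 4), round(skip_lot_pa(p, 0.25, 4, 50, 1), 4))
```

With `f = 1` every lot is inspected and the skip-lot curve collapses to the reference plan; for `f < 1` the curve lies on or above the reference curve, which is where the reduction in inspection effort at good quality levels comes from.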

Estimation of the Overload Probability in Parallel Networks with Simultaneous Arrivals (동시입력이 있는 병렬네트워크의 과부하 확률 추정)

  • 권민희;이지연
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2000.11a
    • /
    • pp.247-252
    • /
    • 2000
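  • We estimate the probability that an overload occurs in a parallel network with simultaneous arrivals, that is, the probability that the total number of customers exceeds a specified value. Applying large deviation theory, we find the optimal probability measure for the estimation and use it to obtain an importance sampling estimator of the overload probability.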
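
The idea of pairing a large-deviation change of measure with importance sampling can be sketched on a much simpler toy problem than the paper's network: estimating the small probability that the mean of `n` standard normal variables exceeds a level `a`. Everything here (the Gaussian model, the function names, the choice of tilt `theta = a`) is an assumption made for illustration and is not the paper's queueing model.

```python
import math
import random


def exact_tail(n, a):
    """Exact P(S_n / n > a) when S_n is a sum of n i.i.d. standard normals."""
    return 0.5 * math.erfc(a * math.sqrt(n) / math.sqrt(2))


def importance_sampling_tail(n, a, num_samples, seed=0):
    """Estimate P(S_n / n > a) by sampling under an exponentially tilted measure.

    Under the tilted measure each summand is N(theta, 1) with theta = a, so the
    rare event becomes typical. Each sample is reweighted by the likelihood
    ratio exp(-theta * S + n * theta^2 / 2) to keep the estimator unbiased.
    """
    rng = random.Random(seed)
    theta = a
    total = 0.0
    for _ in range(num_samples):
        s = sum(rng.gauss(theta, 1.0) for _ in range(n))
        if s > n * a:
            total += math.exp(-theta * s + n * theta**2 / 2)
    return total / num_samples


if __name__ == "__main__":
    print("exact:", exact_tail(20, 1.0))
    print("importance sampling:", importance_sampling_tail(20, 1.0, 20000))
```

Plain Monte Carlo with the same number of samples would almost never observe this event, whereas the tilted sampler observes it in roughly half of the draws.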

Three-level Skip-lot sampling plan split by two stages (2단으로 분할된 3단계 스킵-로트 샘플링 검사계획)

  • 최병철;이은주
    • The Korean Journal of Applied Statistics
    • /
    • v.8 no.2
    • /
    • pp.55-64
    • /
    • 1995
  • A three-level skip-lot sampling plan split into two stages (Split2-SkSP) is proposed by modifying the multi-level skip-lot sampling plan of Choi (1993); it has normal and terrace inspections on the first and second stages, respectively. The plan is designed to move to higher-level inspections when the quality of the submitted products is good and otherwise to return to the normal or the terrace inspection as quickly as possible. The formula of the operating characteristic function for the split skip-lot sampling plan is derived using the Markov chain approach. The operating characteristic properties of the proposed plans are also studied and graphically compared with those of the multi-level skip-lot sampling plans.
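
To illustrate what a Markov chain derivation of an operating characteristic function looks like, here is a minimal sketch for the simplest single-level skip-lot plan. It is not the Split2-SkSP chain from the paper, whose states and transitions are not given in the abstract. The state layout, the names `P`, `f`, and `i`, and the closed form used as a check are all assumptions.

```python
def skip_lot_transition_matrix(P, f, i):
    """Transition matrix of an assumed single-level skip-lot chain.

    States 0..i-1: normal inspection with that many consecutive accepted lots.
    State i: skipping inspection, where only a fraction f of lots is inspected.
    P is the acceptance probability of the reference plan.
    """
    size = i + 1
    T = [[0.0] * size for _ in range(size)]
    for k in range(i):
        T[k][k + 1] += P            # lot accepted: advance (state i-1 enters skipping)
        T[k][0] += 1.0 - P          # lot rejected: restart the count
    T[i][0] += f * (1.0 - P)        # inspected and rejected while skipping
    T[i][i] += 1.0 - f * (1.0 - P)  # skipped, or inspected and accepted
    return T


def stationary_distribution(T, iterations=5000):
    """Approximate the stationary distribution by repeated multiplication."""
    size = len(T)
    pi = [1.0 / size] * size
    for _ in range(iterations):
        pi = [sum(pi[r] * T[r][c] for r in range(size)) for c in range(size)]
    return pi


def markov_pa(P, f, i):
    """Long-run acceptance probability computed from the stationary distribution."""
    pi = stationary_distribution(skip_lot_transition_matrix(P, f, i))
    normal = sum(pi[:i])
    skipping = pi[i]
    # While skipping, uninspected lots are accepted outright.
    return normal * P + skipping * (f * P + (1.0 - f))


def closed_form_pa(P, f, i):
    """Commonly cited single-level closed form, used here only as a check."""
    return (f * P + (1 - f) * P**i) / (f + (1 - f) * P**i)


if __name__ == "__main__":
    print(markov_pa(0.9, 0.25, 4), closed_form_pa(0.9, 0.25, 4))
```

A multi-stage plan would add states and return transitions to the same construction; only the transition matrix changes, while the stationary-distribution step stays the same.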

On sampling algorithms for imbalanced binary data: performance comparison and some caveats (불균형적인 이항 자료 분석을 위한 샘플링 알고리즘들: 성능비교 및 주의점)

  • Kim, HanYong;Lee, Woojoo
    • The Korean Journal of Applied Statistics
    • /
    • v.30 no.5
    • /
    • pp.681-690
    • /
    • 2017
  • Imbalanced binary classification problems arise in many settings, such as fraud detection in banking operations, spam mail detection, and prediction of defective products. Several sampling methods, such as over-sampling, under-sampling, and SMOTE, have been developed to overcome the poor prediction performance of binary classifiers when one group is dominant. In this study, we investigate the prediction performance of logistic regression, Lasso, random forest, boosting, and support vector machines in combination with these sampling methods for imbalanced binary data. Four real data sets are analyzed to see whether there is a substantial improvement in prediction performance. We also emphasize some precautions to take when the sampling methods are implemented.
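
The two simplest sampling methods named in the abstract, random over-sampling and random under-sampling, can be sketched as follows. This is a minimal illustration under assumed conventions (labels coded 0/1, both classes present, data held in plain lists); it does not implement SMOTE and is not the authors' code.

```python
import random


def _split_indices(y):
    """Return (minority_indices, majority_indices) for 0/1 labels."""
    idx0 = [i for i, label in enumerate(y) if label == 0]
    idx1 = [i for i, label in enumerate(y) if label == 1]
    return (idx0, idx1) if len(idx0) < len(idx1) else (idx1, idx0)


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size."""
    rng = random.Random(seed)
    minority, majority = _split_indices(y)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = majority + minority + extra
    rng.shuffle(keep)
    return [X[i] for i in keep], [y[i] for i in keep]


def random_undersample(X, y, seed=0):
    """Keep all minority rows and an equally sized random subset of majority rows."""
    rng = random.Random(seed)
    minority, majority = _split_indices(y)
    keep = minority + rng.sample(majority, len(minority))
    rng.shuffle(keep)
    return [X[i] for i in keep], [y[i] for i in keep]


if __name__ == "__main__":
    X = [[i] for i in range(10)]
    y = [0] * 8 + [1] * 2
    print(random_oversample(X, y)[1])
    print(random_undersample(X, y)[1])
```

One common precaution, stated here as a general assumption rather than as the paper's specific advice, is to resample only the training portion of the data so that evaluation still reflects the original class proportions.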

Geostatistical Integration of Ground Survey Data and Secondary Data for Geological Thematic Mapping (지질 주제도 작성을 위한 지표 조사 자료와 부가 자료의 지구통계학적 통합)

  • Park, No-Wook;Jang, Dong-Ho;Chi, Kwang-Hoon
    • Korean Journal of Remote Sensing
    • /
    • v.22 no.6
    • /
    • pp.581-593
    • /
    • 2006
  • Various geological thematic maps have been generated by interpolating sparsely sampled ground survey data, and geostatistical kriging, which can account for spatial correlation between neighboring data, has been widely used for this purpose. This paper applies multivariate geostatistical algorithms to integrate secondary information with sparsely sampled ground survey data for geological thematic mapping. Among several multivariate geostatistical algorithms, simple kriging with local means and kriging with an external drift are applied. Two case studies, on spatial mapping of groundwater level and of grain size, were carried out to illustrate the effectiveness of the multivariate geostatistical algorithms. A digital elevation model and IKONOS remote sensing imagery were used as secondary information in the two case studies. The two multivariate geostatistical algorithms, which can account for both the spatial correlation of neighboring data and the secondary data, showed smaller prediction errors and more local variation than ordinary kriging and linear regression. The benefit of applying the multivariate geostatistical algorithms, however, depends on the sampling density, the magnitude of the correlation between primary and secondary data, and the spatial correlation of the primary data. As a result, in the grain size experiment, where the effects of those factors were dominant, the benefit of using the secondary data was relatively smaller than in the groundwater level experiment.
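
Simple kriging with local means, one of the two algorithms named in the abstract, can be sketched in a few lines: the local mean (derived from secondary data) is subtracted, the residuals are interpolated by simple kriging, and the local mean at the target is added back. The exponential covariance model, its `sill` and `range_` values, and all function names below are assumptions for illustration; the paper's actual variogram models and data are not given in the abstract.

```python
import math


def exp_cov(h, sill=1.0, range_=10.0):
    """Assumed exponential covariance model (no nugget)."""
    return sill * math.exp(-3.0 * h / range_)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def sklm_predict(coords, values, local_means, target, target_mean):
    """Simple kriging with local means at one target location.

    local_means / target_mean are assumed to come from secondary data,
    for example a regression of the primary variable on elevation.
    """
    residuals = [z - m for z, m in zip(values, local_means)]
    C = [[exp_cov(math.dist(a, b)) for b in coords] for a in coords]
    c0 = [exp_cov(math.dist(a, target)) for a in coords]
    weights = solve(C, c0)  # simple kriging weights for the residuals
    return target_mean + sum(w * r for w, r in zip(weights, residuals))


if __name__ == "__main__":
    coords = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0), (4.0, 4.0)]
    values = [10.0, 12.0, 9.0, 11.5]
    local_means = [9.5, 11.0, 9.5, 11.0]
    print(sklm_predict(coords, values, local_means, (2.0, 2.0), 10.2))
```

Far from any observation the kriging weights vanish and the prediction falls back to the local mean, which is how the secondary data carry information into sparsely sampled areas.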

A Multiplicity Estimator Considering Response Error in Network Sampling (네트웍 샘플링에서 응답오차를 고려한 중복수 추정량)

  • 김규성;이기재;박진우;김영원
    • Communications for Statistical Applications and Methods
    • /
    • v.3 no.1
    • /
    • pp.101-109
    • /
    • 1996
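  • Network sampling is a useful survey sampling method for populations with rare attributes. The conventional multiplicity estimator reflects the characteristics of network sampling and has been used when response error is not taken into account. In this paper, we propose a modified multiplicity estimator that can be used when response error is taken into account. We derive the expectation and the approximate expected variance of the proposed estimator, and we show through a hypothetical population that the proposed estimator is more effective than the existing estimator of the population total.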
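
The conventional multiplicity estimator mentioned in the abstract (the case without response error) can be sketched as follows. Each rare individual may be reported by several selection units, and each report is weighted by the inverse of that individual's multiplicity. The data layout (`links` as a dictionary from individual to the set of units that can report them), simple random sampling without replacement, and the function names are assumptions; the paper's modified estimator for response error is not reproduced here.

```python
from itertools import combinations


def multiplicity_estimate(sample, links, N):
    """Estimate the number of rare individuals from a sample of selection units.

    sample: iterable of sampled unit ids (simple random sample, size n)
    links:  dict mapping each individual to the set of unit ids that can report them
    N:      total number of selection units in the population
    """
    sampled = set(sample)
    n = len(sampled)
    total = 0.0
    for units in links.values():
        m = len(units)                     # multiplicity of this individual
        total += len(units & sampled) / m  # each report is weighted by 1/m
    return N / n * total


def mean_over_all_samples(links, N, n):
    """Average the estimator over every possible sample of size n."""
    estimates = [multiplicity_estimate(s, links, N) for s in combinations(range(N), n)]
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Hypothetical population: 6 units, 3 rare individuals with multiplicities 2, 1, 3.
    links = {"a": {0, 1}, "b": {2}, "c": {3, 4, 5}}
    print(mean_over_all_samples(links, N=6, n=2))
```

Averaging over all possible samples recovers the true count of individuals, which is the unbiasedness property that the 1/m weighting is designed to give.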

Permutation-Based Test with Small Samples for Detecting Differentially Expressed Genes (극소수 샘플에서 유의발현 유전자 탐색에 사용되는 순열에 근거한 검정법)

  • Lee, Ju-Hyoung;Song, Hae-Hiang
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.5
    • /
    • pp.1059-1072
    • /
    • 2009
  • In the analysis of microarray data with a small number of arrays, the most important task is the detection of differentially expressed genes by a significance test. For this purpose, one needs to construct a null distribution based on a large number of genes, and one of the best ways to construct the null distribution for a small number of arrays is by means of permutation methods. In this paper, we propose simple test statistics and permutation methods that are appropriate for constructing the null distribution. In a simulation study, we compare the null distributions generated by the proposed test statistics and permutation methods with those of previous ones. Using an example microarray data set, differentially expressed genes are determined by applying these methods.
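
A basic exact permutation test for a single gene, with the absolute difference in group means as the statistic, can be sketched as below. This is a generic illustration and not the test statistics or permutation schemes proposed in the paper; the function name and the choice of statistic are assumptions.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for the difference in group means.

    All ways of relabelling the pooled values into groups of the original
    sizes are enumerated, which is feasible only for small samples.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_diff(indices):
        sum_a = sum(pooled[i] for i in indices)
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_diff(range(n_a))
    splits = list(combinations(range(len(pooled)), n_a))
    # Small tolerance guards against floating-point ties.
    extreme = sum(1 for s in splits if abs_diff(s) >= observed - 1e-12)
    return extreme / len(splits)


if __name__ == "__main__":
    print(permutation_p_value([1, 2, 3], [4, 5, 6]))
```

With three arrays per group there are only 20 relabellings, so even perfectly separated groups cannot reach a per-gene p-value below 0.1. That limitation is one reason to build the null distribution from a large number of genes, as the abstract describes.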

A Study of Generalized Order Statistics Models in Nonhomogeneous Poisson Processes Using MCMC (MCMC를 이용한 비동질적 포아송과정에서 일반화 순서통계량 모형의 연구)

  • 최기헌;김희철
    • Communications for Statistical Applications and Methods
    • /
    • v.4 no.3
    • /
    • pp.753-763
    • /
    • 1997
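  • MCMC, made practical by advances in computing, is applied to nonhomogeneous Poisson processes. We consider the computational problem of determining the posterior distribution from conditional distributions in Bayesian inference. In particular, for general order statistics models whose distributions are double exponential, Gompertz, Rayleigh, gamma, and Gumbel, we present Bayesian computation and model selection using Gibbs sampling and the Metropolis algorithm.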
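
The Metropolis algorithm named in the abstract can be sketched on a toy problem with a known answer: the posterior of a constant Poisson rate under a gamma prior. This deliberately simple homogeneous model, the prior parameters, the step size, and the function names are all assumptions; it is not one of the paper's nonhomogeneous Poisson process models.

```python
import math
import random


def log_posterior(lam, counts, prior_shape=1.0, prior_rate=1.0):
    """Unnormalized log posterior of a Poisson rate under a Gamma prior."""
    if lam <= 0:
        return -math.inf
    shape = prior_shape + sum(counts)
    rate = prior_rate + len(counts)
    return (shape - 1.0) * math.log(lam) - rate * lam


def metropolis(counts, num_draws=20000, step=0.5, seed=0):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    lam = max(sum(counts) / len(counts), 0.1)  # start near the sample mean
    current = log_posterior(lam, counts)
    draws = []
    for _ in range(num_draws):
        proposal = lam + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, counts)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            lam, current = proposal, candidate
        draws.append(lam)
    return draws


if __name__ == "__main__":
    counts = [3, 2, 4, 1, 5, 3, 2, 4, 3, 3]
    draws = metropolis(counts)[2000:]  # discard burn-in
    print(sum(draws) / len(draws))     # compare with the exact mean 31/11
```

Because this toy posterior is conjugate, its exact mean is available for checking the sampler; for the non-conjugate models listed in the abstract no such closed form exists, which is when Gibbs or Metropolis steps become necessary.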

Exploratory Analysis of Gene Expression Data Using Biplot (행렬도를 이용한 유전자발현자료의 탐색적 분석)

  • Park, Mi-Ra
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.2
    • /
    • pp.355-369
    • /
    • 2005
  • Genome sequencing and microarray technology produce ever-increasing amounts of complex data that need statistical analysis. Visualization is an effective analytic technique that exploits the ability of the human brain to process large amounts of data. In this study, a biplot approach is applied to microarray data to examine the relationship between genes and samples. A supplementary data method for classifying a new sample into a known category is suggested. The methods are validated by applying them to well-known microarray data sets such as Golub et al. (1999), Alizadeh et al. (2000), and Ross et al. (2000). The results are compared with those of several clustering methods. A modified graph that combines a partitioning method with the biplot is also suggested.

Region Analysis of Takbon Images (탁본영상의 영역분석)

  • Hwang, Jae-Ho
    • Proceedings of the KIEE Conference
    • /
    • 2006.04a
    • /
    • pp.141-143
    • /
    • 2006
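  • Takbon (ink rubbings), an important medium for recognizing epigraphic information in Korea and elsewhere in East Asia, are converted into digital image data; their image characteristics are analyzed and a mathematical model is constructed. To this end, about 50 takbon image samples, including historically famous representative takbon, were deliberately selected, and a statistical analysis was attempted that focuses on the region characteristics inherent in the sample images. The original takbon image is a perfect binary image divided into two regions, black and white, whereas the observed image, having passed through the manual rubbing process, shows mixing of tones between regions, with blotches and patterns distributed over the whole image. The original two regions are the information region and the background region, but the blotches are treated as additional regions; scattered mainly over the background region, they hinder image recognition. By statistically classifying the geometric differences between the inherent region characteristics of the observed image and the regions newly created during the manual rubbing process, the trend of the region characteristics of observed takbon images can be inferred. The analysis showed that takbon images exhibit extreme probabilistic differences between regions, and this bipolarity means that the properties of the original takbon image are preserved even through the degrading processes of manual work and observation. On this basis, the region characteristics and the degradation process were modeled mathematically, and a preliminary feasibility of extracting the information region was presented.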
