• Title/Summary/Keyword: Sampling technique

A Comparison of Systematic Sampling Designs for Forest Inventory

  • Yim, Jong Su;Kleinn, Christoph;Kim, Sung Ho;Jeong, Jin-Hyun;Shin, Man Yong
    • Journal of Korean Society of Forest Science / v.98 no.2 / pp.133-141 / 2009
  • This study was conducted to help determine a statistically efficient sampling design for forest resource assessments in South Korea. To this end, different systematic sampling designs were simulated and compared on an artificial forest population built from field sample data and satellite data in Yang-Pyeong County, Korea. Using the k-NN technique, two thematic maps (growing stock and forest cover type per pixel unit) were generated across the test area, with field data (n=191) and Landsat ETM+ imagery as source data. Four sampling designs (systematic sampling, systematic sampling for post-stratification, systematic cluster sampling, and stratified systematic sampling) were considered as candidate designs. Error variance was computed by Monte Carlo simulation (k=1,000), and sampling error and relative efficiency were then compared. When the objective of an inventory was to obtain estimates for the entire population, systematic cluster sampling was superior to the other sampling designs. When the objective was to obtain estimates for each sub-population, post-stratification gave better estimates. For stratification to be efficient, this procedure requires clear definitions of the strata of interest per field observation unit.
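
The Monte Carlo comparison described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical one-dimensional population with a smooth trend, not the study's k-NN maps, and compares only simple random sampling with a basic systematic design. All function names are illustrative.

```python
import random
import statistics

def systematic_sample(population, n, rng):
    """Take every step-th unit after a random start (1-D systematic design)."""
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def simple_random_sample(population, n, rng):
    """Draw n units without replacement, each unit equally likely."""
    return rng.sample(population, n)

def error_variance(design, population, n, k, rng):
    """Monte Carlo estimate: variance of the sample mean over k repeated draws."""
    means = [statistics.fmean(design(population, n, rng)) for _ in range(k)]
    return statistics.pvariance(means)

def relative_efficiency(var_reference, var_candidate):
    """Values above 1 mean the candidate design has the smaller error variance."""
    return var_reference / var_candidate

rng = random.Random(0)
# Hypothetical population: a spatial trend plus noise (an assumption for illustration).
population = [100 + 0.05 * i + rng.gauss(0, 5) for i in range(2000)]
var_srs = error_variance(simple_random_sample, population, 50, 1000, rng)
var_sys = error_variance(systematic_sample, population, 50, 1000, rng)
print(relative_efficiency(var_srs, var_sys))
```

On a population with a trend like this one, the systematic design spreads the sample evenly, so its error variance is expected to be smaller than that of simple random sampling.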

Study on the Effect of Training Data Sampling Strategy on the Accuracy of the Landslide Susceptibility Analysis Using Random Forest Method (Random Forest 기법을 이용한 산사태 취약성 평가 시 훈련 데이터 선택이 결과 정확도에 미치는 영향)

  • Kang, Kyoung-Hee;Park, Hyuck-Jin
    • Economic and Environmental Geology / v.52 no.2 / pp.199-212 / 2019
  • In machine learning, the sampling strategy used for the training data affects the performance of the prediction model, including its generalization ability as well as its prediction accuracy. In landslide susceptibility analysis in particular, data sampling is an essential step in preparing the training data because non-landslide points far outnumber landslide points. However, previous studies did not consider different sampling methods for the training data; they selected the training data randomly. In this study, the authors therefore proposed several sampling methods and assessed the effect of the training-data sampling strategy on landslide susceptibility analysis. A total of six scenarios were set up based on the sampling strategies for landslide points and non-landslide points. A Random Forest model was then trained for each of the six scenarios, and the attribute importance of each input variable was evaluated. Subsequently, landslide susceptibility maps were produced using the input variables and their attribute importances. The AUC values of the landslide susceptibility maps obtained from the six sampling strategies showed high prediction rates, ranging from 70 % to 80 %. This indicates that the Random Forest technique shows appropriate predictive performance and that the attribute importances it yields can be used as weights for landslide conditioning factors in susceptibility analysis. In addition, the results obtained with specific sampling strategies for the training data show higher prediction accuracy than those obtained with the conventional random sampling method.
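
As a rough illustration of the random-sampling baseline and the AUC metric mentioned in this abstract, the sketch below draws a balanced set of non-landslide points and computes AUC from classifier scores. It does not reproduce the six scenarios of the paper; the function names and the 1:1 ratio are assumptions made for the example.

```python
import random

def sample_training_points(landslide, non_landslide, ratio=1.0, seed=0):
    """Keep all landslide points and randomly draw ratio * len(landslide)
    non-landslide points (the plain random-sampling baseline)."""
    rng = random.Random(seed)
    n_neg = min(len(non_landslide), int(ratio * len(landslide)))
    return list(landslide), rng.sample(non_landslide, n_neg)

def auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative;
    ties count as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical point IDs: 10 landslide cells against 1,000 non-landslide cells.
pos, neg = sample_training_points(list(range(10)), list(range(100, 1100)))
print(len(pos), len(neg))                 # 10 10
print(auc([0.9, 0.4], [0.5, 0.1]))        # 0.75
```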

An 8b 200 MHz 0.18 um CMOS ADC with 500 MHz Input Bandwidth (500 MHz의 입력 대역폭을 갖는 8b 200 MHz 0.18 um CMOS A/D 변환기)

  • 조영재;배우진;박희원;김세원;이승훈
    • Journal of the Institute of Electronics Engineers of Korea SD / v.40 no.5 / pp.312-320 / 2003
  • This work describes an 8b 200 MHz 0.18 um CMOS analog-to-digital converter (ADC) based on a pipelined architecture for flat panel display applications. The proposed ADC employs an improved bootstrapping technique to obtain an input bandwidth wider than the sampling rate of 200 MHz. The bootstrapping technique improves the accuracy of the input sample-and-hold amplifier (SHA); fast Fourier transform (FFT) analysis of the SHA outputs shows 7.2 effective bits with a 500 MHz sinusoidal input and a 200 MHz sampling clock at a 1.7 V supply voltage. A merged-capacitor switching (MCS) technique increases the sampling rate of the ADC by reducing the number of capacitors required in conventional ADCs by 50 % and simultaneously minimizes chip area. The ADC, simulated in a 0.18 um n-well single-poly quad-metal CMOS technology, shows 8b resolution and 73 mW power dissipation at a 200 MHz sampling clock and a 1.7 V supply voltage.
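
Two pieces of arithmetic behind the figures in this abstract can be sketched as follows. The code uses the standard folding rule for undersampled tones and the usual ENOB relation, SINAD = 6.02 × ENOB + 1.76 dB. The function names are illustrative, and the SINAD value is derived from the quoted ENOB rather than taken from the paper.

```python
def alias_frequency(f_in, f_s):
    """Frequency at which a tone at f_in appears after sampling at f_s,
    folded into the first Nyquist zone (0 .. f_s / 2)."""
    f = f_in % f_s
    return min(f, f_s - f)

def sinad_from_enob(enob):
    """Standard relation for an ideal quantizer with a full-scale sine input."""
    return 6.02 * enob + 1.76

def enob_from_sinad(sinad_db):
    """Inverse of sinad_from_enob."""
    return (sinad_db - 1.76) / 6.02

# Values in MHz: a 500 MHz input sampled at 200 MHz lands at 100 MHz.
print(alias_frequency(500, 200))   # 100
# SINAD implied by 7.2 effective bits, in dB.
print(sinad_from_enob(7.2))
```

Because the input lies above the sampling rate, the FFT of the SHA output shows the tone at its folded frequency, which is why input bandwidth rather than sampling rate limits such a measurement.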

A study on the improvement ransomware detection performance using combine sampling methods (혼합샘플링 기법을 사용한 랜섬웨어탐지 성능향상에 관한 연구)

  • Kim Soo Chul;Lee Hyung Dong;Byun Kyung Keun;Shin Yong Tae
    • Convergence Security Journal / v.23 no.1 / pp.69-77 / 2023
  • Ransomware damage has recently been increasing rapidly around the world, with victims including the Irish health authorities and U.S. oil pipelines, and is affecting all sectors of society. Research on ransomware detection and response increasingly uses machine learning alongside existing detection methods. However, traditional machine learning has difficulty producing accurate predictions on imbalanced data because the model tends to predict the class with more data. Accordingly, for an imbalanced class distribution consisting of a large number of non-ransomware samples (normal code or other malware) and a small number of ransomware samples, a technique for resolving the imbalance and improving ransomware detection performance is proposed. In the experiment, two scenarios (binary and multi-class classification) are used to confirm that the sampling technique improves detection performance on the minority class while maintaining detection performance on the majority class. In particular, the proposed mixed sampling technique (SMOTE+ENN) improved performance (G-mean, F1-score) by more than 10%.
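
The two metrics quoted in this abstract can be computed from confusion-matrix counts as in the sketch below. It uses the common binary definitions (G-mean as the geometric mean of sensitivity and specificity); the counts are invented for illustration and the resampling step itself (SMOTE+ENN) is not implemented here.

```python
import math

def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)

def f1_score(tp, fn, fp):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 10 ransomware samples, 100 non-ransomware samples.
print(g_mean(tp=8, fn=2, tn=90, fp=10))
print(f1_score(tp=8, fn=2, fp=10))
```

Both metrics drop sharply when the minority class is missed, which is why they are preferred over plain accuracy on imbalanced data.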

Reliability-Based Design Optimization Using Kriging Metamodel with Sequential Sampling Technique (순차적 샘플링과 크리깅 메타모델을 이용한 신뢰도 기반 최적설계)

  • Choi, Kyu-Seon;Lee, Gab-Seong;Choi, Dong-Hoon
    • Transactions of the Korean Society of Mechanical Engineers A / v.33 no.12 / pp.1464-1470 / 2009
  • A reliability-based design optimization (RBDO) approach is proposed that is based on a sampling method and combines the Kriging metamodel with Constraint Boundary Sampling (CBS), a sequential sampling method for generating metamodels. The major advantage of the proposed RBDO approach is that it does not require the Most Probable failure Point (MPP), which is essential for First-Order Reliability Method (FORM)-based RBDO approaches. Monte Carlo Sampling (MCS), the best-known sampling method for reliability analysis, is used to assess the reliability of the constraints. In addition, the Cumulative Distribution Function (CDF) of the constraints is approximated from the empirical distribution function using the Moving Least Square (MLS) method. The approximate CDF makes it possible to obtain the probability of failure and its analytic sensitivities. Moreover, the concept of an inactive design is adopted to improve the numerical efficiency of the proposed approach. The computational accuracy and efficiency of the proposed RBDO approach are demonstrated on numerical and engineering problems.
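
The Monte Carlo reliability assessment mentioned in this abstract can be sketched in its crudest form as below. The limit-state function, the input distribution, and the convention that failure means g(x) < 0 are assumptions for illustration; the Kriging metamodel, CBS, and the MLS approximation of the CDF are not shown.

```python
import random

def failure_probability(limit_state, sample_input, n, seed=0):
    """Crude Monte Carlo: fraction of sampled inputs with limit_state(x) < 0."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n):
        if limit_state(sample_input(rng)) < 0:
            failures += 1
    return failures / n

# Hypothetical constraint g(x) = 2 - x with x ~ N(0, 1).
# The exact failure probability is 1 - Phi(2), about 0.0228.
pf_estimate = failure_probability(
    lambda x: 2.0 - x,
    lambda rng: rng.gauss(0.0, 1.0),
    100_000,
)
print(pf_estimate)
```

In a metamodel-based approach, `limit_state` would typically be the cheap surrogate rather than the expensive simulation, which is what makes this many evaluations affordable.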

Effect of Sampling for Multi-set Cardinality Estimation (멀티셋의 크기 추정 기법에서 샘플링의 효과)

  • Dao, DinhNguyen;Nyang, DaeHun;Lee, KyungHee
    • KIPS Transactions on Computer and Communication Systems / v.4 no.1 / pp.15-22 / 2015
  • Estimating the number of distinct values is a well-known problem in network data measurement, and many effective algorithms have been suggested. Recent works build on a technique called Linear Counting to solve the estimation problem for massive sets or spreaders in small memory. Sampling is used to reduce the measurement data, and it is generally assumed that sampling degrades accuracy. In this paper, however, we show that for multi-set estimation, CSE with sampling sometimes gives better results, in terms of accuracy and estimation range, than MCSE, which examines all packets without sampling. To show this, we present a mathematical analysis, conduct experiments with real data, and compare the results of CSE, MCSE, and CSES.
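
The Linear Counting technique that these estimators build on can be sketched as follows: hash each item into an m-bit bitmap and estimate the number of distinct items as -m · ln(V), where V is the fraction of bits still zero. The hash choice and bitmap size are assumptions for the example; CSE, MCSE, and CSES themselves are not implemented.

```python
import hashlib
import math

def linear_counting(items, m):
    """Estimate the number of distinct items with an m-bit bitmap."""
    bitmap = [False] * m
    for item in items:
        digest = hashlib.sha256(str(item).encode()).digest()
        # Map the item to one bit; duplicates always hit the same bit.
        bitmap[int.from_bytes(digest[:8], "big") % m] = True
    zeros = bitmap.count(False)
    if zeros == 0:
        raise ValueError("bitmap saturated; use a larger m")
    return -m * math.log(zeros / m)

# Hypothetical stream: 1,000 distinct values, each seen five times.
packets = [i % 1000 for i in range(5000)]
print(linear_counting(packets, m=4096))
```

The estimate stays usable only while some bits remain zero, which is the estimation-range limit that sampling can relax by reducing how many items reach the bitmap.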

Other approaches to bivariate ranked set sampling

  • Al-Saleh, Mohammad Fraiwan;Alshboul, Hadeel Mohammad
    • Communications for Statistical Applications and Methods / v.25 no.3 / pp.283-296 / 2018
  • Ranked set sampling, as introduced by McIntyre (Australian Journal of Agriculture Research, 3, 385-390, 1952), dealt with the estimation of the mean of one population. To deal with two or more variables, different forms of bivariate and multivariate ranked set sampling were suggested. For a technique to be useful, it should be easy to implement in practice. Bivariate ranked set sampling, as introduced by Al-Saleh and Zheng (Australian & New Zealand Journal of Statistics, 44, 221-232, 2002), is not easy to implement in practice because it requires judgment ranking of each combination of the order statistics of the two characteristics. This paper investigates two modifications that make the method easier to use. The first modification is based on ranking one variable and noting the rank of the other variable for one cycle, and doing the reverse for another cycle. The second approach is based on ranking one variable and giving the second variable the same rank (concomitant order statistic) for one cycle, and doing the reverse for the other cycle. The two procedures are investigated for estimating the means of some well-known distributions. It is shown that the suggested approaches can be used in practice and can be more efficient than simple random sampling (SRS). A real data set is used to illustrate the procedure.
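
For orientation, the basic univariate ranked set sampling scheme can be sketched as below. It assumes perfect ranking on the measured value itself and is not one of the bivariate modifications studied in the paper; the function name and parameters are illustrative.

```python
import random
import statistics

def ranked_set_sample(draw, m, cycles, rng):
    """Univariate RSS: in each cycle, draw m sets of m units, rank each set,
    and keep the i-th smallest unit from the i-th set."""
    sample = []
    for _ in range(cycles):
        for i in range(m):
            ranked = sorted(draw(rng) for _ in range(m))
            sample.append(ranked[i])
    return sample

rng = random.Random(0)
draw = lambda r: r.gauss(0.0, 1.0)  # hypothetical population: standard normal
rss = ranked_set_sample(draw, m=3, cycles=4, rng=rng)
print(len(rss), statistics.fmean(rss))
```

Only m × cycles units are actually measured, yet the ranking step spreads them across the distribution, which is the source of the efficiency gain over SRS with the same number of measurements.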

A new structural reliability analysis method based on PC-Kriging and adaptive sampling region

  • Yu, Zhenliang;Sun, Zhili;Guo, Fanyi;Cao, Runan;Wang, Jian
    • Structural Engineering and Mechanics / v.82 no.3 / pp.271-282 / 2022
  • Active learning surrogate models based on adaptive sampling strategies are increasingly popular in reliability analysis. However, most existing sampling strategies use trial and error to determine the size of the Monte Carlo (MC) candidate sample pool that satisfies the requirement on the coefficient of variation of the failure probability, which reduces the computational efficiency of the reliability analysis. To avoid this defect, a new method for determining the optimal size of the MC candidate sample pool is proposed, together with a new structural reliability analysis method (PCK-ASR) that combines the polynomial chaos-based Kriging model (PC-Kriging) with an adaptive sampling region. First, based on the lower limit of the confidence interval, a new method for estimating the optimal size of the MC candidate sample pool is proposed. Second, based on the upper limit of the confidence interval, an adaptive sampling region strategy similar to the radial centralized sampling method is developed. Then, the k-means++ clustering technique and the learning function LIF are used to complete the adaptive design of experiments (DoE). Finally, the effectiveness and accuracy of the PCK-ASR method are verified on three numerical examples and one practical engineering example.
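
The coefficient-of-variation requirement mentioned in this abstract is commonly written as CoV = sqrt((1 - Pf) / (N · Pf)) for a crude MC estimate. The sketch below uses that textbook relation to size the candidate pool; it is not the confidence-interval-based method the paper proposes, and the function names are illustrative.

```python
import math

def cov_of_pf(pf, n):
    """Coefficient of variation of a crude MC failure-probability estimate."""
    return math.sqrt((1 - pf) / (n * pf))

def required_pool_size(pf, target_cov):
    """Smallest pool size N whose CoV does not exceed target_cov,
    given a (guessed or previously estimated) failure probability pf."""
    return math.ceil((1 - pf) / (pf * target_cov ** 2))

# Hypothetical target: Pf around 1e-3 with a 5 % coefficient of variation.
n_pool = required_pool_size(1e-3, 0.05)
print(n_pool, cov_of_pf(1e-3, n_pool))
```

Because Pf is unknown before the analysis, this formula needs an initial guess, which is the trial-and-error loop the abstract describes.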