• Title/Summary/Keyword: Sampling methods

A Comparison of Ensemble Methods Combining Resampling Techniques for Class Imbalanced Data (데이터 전처리와 앙상블 기법을 통한 불균형 데이터의 분류모형 비교 연구)

  • Lee, Hee-Jae;Lee, Sungim
    • The Korean Journal of Applied Statistics / v.27 no.3 / pp.357-371 / 2014
  • Many studies address imbalanced data, in which the class distribution is highly skewed. To deal with this problem, previous studies have used resampling techniques that correct the skewness of the class distribution in each sampled subset through under-sampling, over-sampling, or hybrid sampling such as SMOTE. Ensemble methods have also alleviated the problem of class-imbalanced data. In this paper, we compare around a dozen algorithms that combine ensemble methods and resampling techniques, using simulated data sets generated by the Backbone model, in which the imbalance rate can be controlled. Results on various real imbalanced data sets are also presented to compare the effectiveness of the algorithms. As a result, we highly recommend combining resampling techniques with ensemble methods for imbalanced data in which the proportion of the minority class is less than 10%. We also find that each ensemble method has a well-matched sampling technique. Algorithms that combine bagging or random forest ensembles with random under-sampling tend to perform well, whereas the boosting ensemble appears to perform better with over-sampling. All ensemble methods combined with SMOTE perform well in most situations.
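
The two resampling ideas named in this abstract can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the function names `random_undersample` and `smote_like`, the toy data, and the choice of `k=5` neighbours are all assumptions, and the interpolation step follows the general SMOTE idea rather than any specific library.

```python
import numpy as np

def random_undersample(X, y, rng):
    """Keep every minority row (label 1) and an equally sized random subset of majority rows (label 0)."""
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    kept = rng.choice(majority, size=len(minority), replace=False)
    idx = np.concatenate([minority, kept])
    return X[idx], y[idx]

def smote_like(X_min, n_new, k, rng):
    """Create n_new synthetic minority points by interpolating between a
    minority point and one of its k nearest minority neighbours."""
    # Pairwise Euclidean distances among minority points.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)               # a point is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    pick = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))              # position along the connecting segment
    return X_min[base] + gap * (X_min[pick] - X_min[base])

# Hypothetical toy data: 950 majority points and 50 minority points (5% minority).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (950, 2)), rng.normal(2.0, 1.0, (50, 2))])
y = np.concatenate([np.zeros(950, dtype=int), np.ones(50, dtype=int)])

X_bal, y_bal = random_undersample(X, y, rng)
X_syn = smote_like(X[y == 1], n_new=900, k=5, rng=rng)
print(np.bincount(y_bal), X_syn.shape)
```

In a bagging-style ensemble, `random_undersample` would be called once per base learner so that each learner sees a different balanced subset.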

Comparison of Toluene Diisocyanate Concentrations Collected with Different Sampling Methods by Work Process (시료채취 방법에 따른 작업 공정별 Toluene diisocyanates 포집농도 비교)

  • Kim, Sung Ho;Won, Jong Uk;Kim, Chi Nyon;Jung, Woo Jin;Roh, Jaehoon
    • Journal of Korean Society of Occupational and Environmental Hygiene / v.23 no.2 / pp.95-102 / 2013
  • Objectives: The aim of this study is to present an appropriate sampling method for individual exposure assessment, based on a comparison of toluene diisocyanate (TDI) concentrations collected with different sampling methods by work process type. Methods: Two plants handling TDIs in the Incheon area were selected. Samples were taken during spray painting, drying, grinding, and foaming, processes in which TDIs are generated in different forms. Airborne TDIs were sampled with an open-face cassette holder, a modified 2-piece cassette holder, and an impinger, all operated simultaneously at the same locations. Results: For spray painting and foaming, the collected TDI concentrations were highest with the impinger, followed by the modified 2-piece cassette holder and then the open-face cassette holder. In all processes except drying, collected TDI concentrations were higher with the modified 2-piece cassette holder than with the open-face cassette holder. Conclusions: Based on these results, the modified 2-piece cassette holder was found to be a more appropriate sampling method than the open-face cassette holder when taking individual samples of TDIs from spray painting, grinding, and foaming processes. In particular, the modified 2-piece cassette holder is considered appropriate for individual exposure assessment of the spray painting process, which shows higher collected concentrations than the other processes.

Comparison of Sampling and Wall-to-Wall Methodologies for Reporting the GHG Inventory of the LULUCF Sector in Korea (LULUCF 부문 산림 온실가스 인벤토리 구축을 위한 Sampling과 Wall-to-Wall 방법론 비교)

  • Park, Eunbeen;Song, Cholho;Ham, Boyoung;Kim, Jiwon;Lee, Jongyeol;Choi, Sol-E;Lee, Woo-Kyun
    • Journal of Climate Change Research / v.9 no.4 / pp.385-398 / 2018
  • Although the importance of developing a reliable and systematic GHG inventory has increased, GIS/RS-based national-scale analysis of the LULUCF (Land Use, Land-Use Change and Forestry) sector remains insufficient in the context of the Paris Agreement. In this study, the change in $CO_2$ storage of forest land due to land use change from 2000 to 2010 is estimated using two GIS/RS methodologies, the Sampling and Wall-to-Wall methods. The Sampling method uses various imagery together with sampling data, and the Wall-to-Wall method uses land cover maps. The land use matrices of these methodologies and the national cadastral statistics are classified into six land-use categories (Forest land, Cropland, Grassland, Wetlands, Settlements, and Other land). The difference in area between the results of the Sampling method and the cadastral statistics decreases as the sample plot distance decreases. However, the difference is not significant for sample plot distances under 2 km. For the 2000s, the Wall-to-Wall method showed results similar to sampling at distances under 2 km, except for the Settlements category. $CO_2$ storage estimated with the Wall-to-Wall method is higher than that estimated with the Sampling method. Accordingly, the Wall-to-Wall method would be more advantageous than the Sampling method for GHG inventory assessment when sufficient spatial data are available. These results can contribute to establishing an annual reporting system for the national greenhouse gas inventory in the LULUCF sector.
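
The contrast between the two approaches can be illustrated on a synthetic raster. The sketch below is a toy illustration under stated assumptions, not the study's workflow: the raster is randomly generated, the six integer codes merely stand in for the six land-use categories, and `area_fractions` and `systematic_sample` are hypothetical helper names. "Wall-to-wall" here means counting every cell, and "sampling" means reading only cells on a regular grid with a given spacing.

```python
import numpy as np

def area_fractions(classes, n_classes):
    """Share of cells in each land-use class."""
    counts = np.bincount(classes.ravel(), minlength=n_classes)
    return counts / counts.sum()

def systematic_sample(raster, spacing):
    """Read only the cells on a regular grid with the given spacing."""
    return raster[::spacing, ::spacing]

# Hypothetical land cover map: 40 x 40 random patches, each 25 x 25 cells.
rng = np.random.default_rng(1)
coarse = rng.integers(0, 6, size=(40, 40))
raster = np.kron(coarse, np.ones((25, 25), dtype=int))   # 1000 x 1000 cells

wall_to_wall = area_fractions(raster, 6)                  # uses every cell
for spacing in (200, 50, 10):
    sampled = area_fractions(systematic_sample(raster, spacing), 6)
    # Largest absolute difference in class share between the two methods.
    print(spacing, np.abs(sampled - wall_to_wall).max())
```

Varying `spacing` shows how the sampled shares move relative to the full count as the grid becomes denser or sparser.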

Multistage Point and Confidence Interval Estimation of the Shape Parameter of Pareto Distribution

  • Hamdy, H.I.;Son, M.S.;Gharraph, M.K.;Rashad, A.M.
    • Communications for Statistical Applications and Methods / v.10 no.3 / pp.1069-1086 / 2003
  • This article presents the asymptotic theory of the triple sampling procedure as it pertains to estimating the shape parameter of the Pareto distribution. Both point and confidence interval estimation are considered within a unified inference framework. We show that this group sampling technique possesses the efficiency of the purely sequential procedure of Anscombe (1953) and Chow and Robbins (1965), while reducing the number of sampling operations by utilizing the two-stage procedure of Stein (1945). The analysis reveals that the technique performs excellently as far as accuracy is concerned. The present problem differs from those considered by many authors in multistage sampling in that the final-stage sample size and the parameter estimate become highly correlated; we therefore adopted a different approach.
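
The general shape of a triple sampling scheme can be sketched as follows. This is a simplified illustration under several assumptions and is not the procedure analysed in the article: the scale parameter is taken as known, the target is a fixed-width interval of half-width `d` based on the asymptotic variance `alpha**2 / n` of the maximum likelihood estimator, no bias-correction terms are applied, and the names `shape_mle`, `triple_sampling`, `m`, and `gamma` are hypothetical.

```python
import math
import numpy as np

def shape_mle(x, scale):
    """MLE of the Pareto shape for a known scale: n / sum(log(x / scale))."""
    return len(x) / np.sum(np.log(x / scale))

def triple_sampling(draw, scale, d, z=1.96, m=20, gamma=0.5):
    """Three-stage sampling for an interval of half-width d around the shape estimate.

    draw(n) must return n new observations; gamma is the fraction of the
    estimated optimal sample size collected by the end of stage two.
    """
    # Stage 1: pilot sample of size m gives a first estimate of the optimal size.
    x = draw(m)
    n_opt = (z * shape_mle(x, scale) / d) ** 2
    # Stage 2: sample up to a fraction gamma of that estimated optimal size.
    n1 = max(m, math.floor(gamma * n_opt) + 1)
    x = np.concatenate([x, draw(n1 - m)])
    # Stage 3: re-estimate the optimal size and collect the remainder.
    n_opt = (z * shape_mle(x, scale) / d) ** 2
    n = max(n1, math.floor(n_opt) + 1)
    x = np.concatenate([x, draw(n - n1)])
    est = shape_mle(x, scale)
    return est, (est - d, est + d), (m, n1, n)

# Hypothetical example: Pareto data with shape 3 and scale 1.
rng = np.random.default_rng(42)
true_shape, scale = 3.0, 1.0
draw = lambda n: scale * (1.0 + rng.pareto(true_shape, n))
est, interval, sizes = triple_sampling(draw, scale, d=0.3)
print(est, interval, sizes)
```

Only three sampling operations are needed, in contrast to a purely sequential rule that re-checks the stopping condition after every observation.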

On the Sampling and Transport of Radioactive Aerosols from Waste Thermal Process

  • Yang, Hee-Chul;Kim, Joon-Hyung;Kang, Yong
    • Nuclear Engineering and Technology / v.29 no.4 / pp.269-279 / 1997
  • The errors associated with incorrect sampling and transport of radioactive aerosols from radwaste thermal process off-gas are analyzed, and the conditions for representative sampling and correct transport of radioactive aerosols for off-gas system evaluation are discussed. A method for estimating sampling errors for individual radionuclides is proposed and applied to simulated vitrification melter aerosols. Prediction methods for particle deposition in the sample transport tube under laminar as well as turbulent flow conditions are also illustrated by example calculations with simulated incinerator off-gas. From the results of the example calculations and plots, instrumental and operational conditions that minimize errors in a radioactive aerosol sampling system are recommended, together with correction methods for nonideal sampling and transport.

Chorionic villus sampling

  • Shim, Soon-Sup
    • Journal of Genetic Medicine / v.11 no.2 / pp.43-48 / 2014
  • Chorionic villus sampling has gained importance as a tool for early cytogenetic diagnosis with the shift toward first trimester screening. First trimester screening using nuchal translucency and biomarkers is effective. Chorionic villus sampling is generally performed at 10-12 weeks by either the transcervical or the transabdominal approach. There are two methods of analysis: the direct method and the culture method. While the direct method may prevent maternal cell contamination, the culture method may be more representative of the true fetal karyotype. Mosaicism, which occurs in approximately 1% of cases, is a concern, and mosaic results require genetic counseling and follow-up amniocentesis or fetal blood sampling. In terms of complications, procedure-related pregnancy loss rates may be the same as those for amniocentesis when the procedure is undertaken in experienced centers. When the procedure is performed after 9 weeks' gestation, the risk of limb reduction is not greater than the risk in the general population. At present, chorionic villus sampling is the gold standard method for early fetal karyotyping; however, we anticipate that improvements in noninvasive prenatal testing methods, such as cell-free fetal DNA testing, will reduce the need for invasive procedures in the near future.

Fast Volume Visualization Techniques for Ultrasound Data

  • Kwon, Koo-Joo;Shin, Byeong-Seok
    • Journal of Biomedical Engineering Research / v.27 no.1 / pp.6-13 / 2006
  • Ultrasound visualization is a typical diagnostic method for examining organs, soft tissues, and fetuses. Ultrasound data are difficult to visualize because their quality may be degraded by artifacts and speckle noise, and because they are gathered with non-linear sampling. Rendering is also slow, since additional data structures or procedures cannot be used in the rendering stage. In this paper, we use several visualization methods for fast rendering of ultrasound data. The first method, denoted adaptive ray sampling, reduces the number of samples by adjusting the sampling interval in empty space. Second, we use an early ray termination scheme with a sufficiently wide sampling interval and a low opacity threshold during color compositing. Last, we use bilinear interpolation instead of trilinear interpolation for sampling in transparent regions. We conclude that our method reduces the rendering time without loss of image quality in comparison with conventional methods.
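
Early ray termination, the second technique mentioned above, can be sketched for a single ray. This is a generic front-to-back compositing loop, not the authors' renderer: the function name `composite_ray`, the scalar colour values, and the cutoff of 0.95 are assumptions made for illustration.

```python
import numpy as np

def composite_ray(colors, alphas, opacity_cutoff=0.95):
    """Front-to-back compositing of samples along one ray.

    Stops as soon as the accumulated opacity reaches opacity_cutoff,
    because later samples would be almost completely hidden.
    """
    color = 0.0
    opacity = 0.0
    steps = 0
    for c, a in zip(colors, alphas):
        color += (1.0 - opacity) * a * c      # contribution attenuated by what lies in front
        opacity += (1.0 - opacity) * a
        steps += 1
        if opacity >= opacity_cutoff:         # early ray termination
            break
    return color, opacity, steps

# Hypothetical ray: 100 samples, each with opacity 0.3 and unit intensity.
colors = np.ones(100)
alphas = np.full(100, 0.3)
color, opacity, steps = composite_ray(colors, alphas)
print(color, opacity, steps)
```

A lower cutoff ends the loop sooner and saves more samples, at the cost of discarding more of the contribution from deeper samples.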

Experimental Consideration of Multi-order Sampling for Digital Beamforming (디지털 빔포밍을 위한 다차 샘플링 방법의 실험적 고찰)

  • 나병윤;정목근
    • Journal of Biomedical Engineering Research / v.19 no.2 / pp.105-112 / 1998
  • In this paper, several bandwidth sampling methods are compared using experimental results, including "multi-order sampling", which was proposed for envelope detection in RF ultrasonic signals. The "quadrature sampling method" and the "second-order sampling method" are compared with it. The image produced by the second-order sampling method contains too much error compared with that of quadrature sampling. However, the multi-order sampling method, especially the 5th-order sampling method, shows quite good envelope detection. This means that a more economical digital beamforming system with quite good performance can be built by adopting the multi-order sampling method.

Implementation of Quality and Reliability Sampling Inspection (품질 및 신뢰성 샘플링 검사의 활용)

  • Choi, Sung-Woon
    • Journal of the Korea Safety Management & Science / v.8 no.5 / pp.243-251 / 2006
  • This paper proposes various quality and reliability sampling inspection methods to support a competitive global outsourcing strategy in purchasing and subcontracting. The study also presents an implementation strategy that can be used efficiently and effectively in the enterprise. Quality sampling inspection schemes extend reliability inspection techniques with minor changes.

Modified Adaptive Cluster Sampling Designs

  • Park, Jeong-Soo;Kim, Youn-Woo;Son, Chang-Kyoon
    • Communications for Statistical Applications and Methods / v.14 no.1 / pp.57-69 / 2007
  • Adaptive cluster sampling is known as a sampling design for rare, clustered populations. Three modified adaptive cluster sampling designs are proposed. The adjusted Hansen-Hurwitz estimator and the Horvitz-Thompson estimator are considered. The efficiency of the proposed sampling designs is examined in a Monte Carlo simulation study.
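
For orientation, the basic adaptive cluster sampling idea and a Hansen-Hurwitz type estimator can be sketched on a small grid. This follows the commonly cited form in which each initially sampled unit contributes the mean of its network; it is not necessarily the adjusted estimator or any of the modified designs proposed in the article. The helper names `network_of` and `hansen_hurwitz_mean`, the 4-cell neighbourhood, and the toy grid are assumptions.

```python
import numpy as np

def network_of(grid, start, threshold=0):
    """Cells in the network containing start.

    A cell with y > threshold belongs to the connected group (4-neighbourhood)
    of such cells; any other cell forms a network of size one.
    """
    if grid[start] <= threshold:
        return [start]
    rows, cols = grid.shape
    seen, stack = {start}, [start]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and grid[nr, nc] > threshold:
                seen.add((nr, nc))
                stack.append((nr, nc))
    return sorted(seen)

def hansen_hurwitz_mean(grid, initial_units, threshold=0):
    """Average, over the initial sample, of the mean y-value of each unit's network."""
    w = [np.mean([grid[cell] for cell in network_of(grid, u, threshold)])
         for u in initial_units]
    return float(np.mean(w))

# Hypothetical rare, clustered population on a 5 x 5 grid.
grid = np.zeros((5, 5))
grid[1, 1], grid[1, 2], grid[2, 1] = 4, 2, 6      # one cluster of three cells

initial_units = [(0, 0), (1, 1), (4, 4), (2, 1)]  # initial sample of four units
print(hansen_hurwitz_mean(grid, initial_units))
```

Units that hit the cluster trigger sampling of the whole network, which is what makes the design attractive for rare, clustered populations.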