• Title/Summary/Keyword: sampling set

Heterogeneous Ensemble of Classifiers from Under-Sampled and Over-Sampled Data for Imbalanced Data

  • Kang, Dae-Ki;Han, Min-gyu
    • International journal of advanced smart convergence
    • /
    • v.8 no.1
    • /
    • pp.75-81
    • /
    • 2019
  • Data imbalance is a common problem that causes serious difficulties in the machine learning process. Sampling is one of the effective methods for addressing it. Over-sampling increases the number of instances, so on imbalanced data it is applied to the minority instances. Under-sampling reduces the number of instances and is usually performed on the majority data. We apply under-sampling and over-sampling to imbalanced data to generate sampled data sets. From these sampled data sets and the original data set, we construct a heterogeneous ensemble of classifiers, to which we apply five different algorithms. Experimental results on an intrusion detection dataset, used as an imbalanced dataset, show that the approach is effective.
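
The two sampling steps described in this abstract can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes plain random under-sampling and random over-sampling with replacement, a majority class larger than the minority class, and a simple majority vote to combine classifiers. The function names are hypothetical, and the abstract does not say which sampling variants or combination rule the paper uses.

```python
import random
from collections import Counter


def _split_indices(y, majority, minority):
    """Return the row indices of the majority and minority classes."""
    maj = [i for i, label in enumerate(y) if label == majority]
    mino = [i for i, label in enumerate(y) if label == minority]
    return maj, mino


def random_under_sample(X, y, majority, minority, seed=0):
    """Randomly drop majority rows until both classes have equal size.

    Assumes the majority class has at least as many rows as the minority class.
    """
    rng = random.Random(seed)
    maj, mino = _split_indices(y, majority, minority)
    keep = rng.sample(maj, len(mino)) + mino
    return [X[i] for i in keep], [y[i] for i in keep]


def random_over_sample(X, y, majority, minority, seed=0):
    """Randomly duplicate minority rows (with replacement) until classes balance."""
    rng = random.Random(seed)
    maj, mino = _split_indices(y, majority, minority)
    extra = [rng.choice(mino) for _ in range(len(maj) - len(mino))]
    keep = maj + mino + extra
    return [X[i] for i in keep], [y[i] for i in keep]


def majority_vote(predictions):
    """Combine the labels predicted by several classifiers for one instance.

    Majority voting is an assumption here, not something the abstract states.
    """
    return Counter(predictions).most_common(1)[0][0]
```

In such a setup, each base classifier would be trained on one of the three data sets (original, under-sampled, over-sampled) and the per-instance predictions passed to `majority_vote`.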

A Case study of an optimal design with structured sampling and simulation

  • Park, Hongjoon;Youngcook Jun
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집)
    • /
    • 2002.10a
    • /
    • pp.46.4-46
    • /
    • 2002
  • This study was motivated by the question of whether structured sampling with an orthogonal array can be validated for the optimal design of a pin. The Taguchi method based on orthogonal arrays, one of the structured sampling methods, has the advantage of low cost and time savings in experiments. However, the method has been applied only in limited areas, especially for mechanical problems. In this study, we examined whether structured sampling is useful for the optimal design of mechanical elements. For the experiment, we first set up a mechanical problem concerning the determination of optimal parameters associated with a crack in a pin occurring inside a hole. We then calculated combination of...
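
To make the idea of structured sampling with an orthogonal array concrete, the sketch below uses the standard L4 array (4 runs, three two-level factors) and checks its balance property. The choice of L4 and the 0/1 level coding are assumptions made for illustration. The abstract does not state which array or factors the study used.

```python
from collections import Counter
from itertools import combinations, product

# Standard L4 orthogonal array: 4 runs for 3 two-level factors (levels coded 0/1).
L4 = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]


def is_orthogonal(array, levels=2):
    """True if every pair of columns contains each level combination equally often."""
    n_cols = len(array[0])
    expected = len(array) / levels ** 2
    for a, b in combinations(range(n_cols), 2):
        counts = Counter((row[a], row[b]) for row in array)
        for pair in product(range(levels), repeat=2):
            if counts[pair] != expected:
                return False
    return True


# A full factorial design for the same three factors needs 2**3 runs.
full_factorial_runs = 2 ** 3
```

Here the array covers three factors in 4 runs instead of the 8 a full factorial would need, which is the kind of cost and time saving the abstract refers to.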

Comparison of Latin Hypercube Sampling and Simple Random Sampling Applied to Neural Network Modeling of HfO2 Thin Film Fabrication

  • Lee, Jung-Hwan;Ko, Young-Don;Yun, Il-Gu;Han, Kyong-Hee
    • Transactions on Electrical and Electronic Materials
    • /
    • v.7 no.4
    • /
    • pp.210-214
    • /
    • 2006
  • In this paper, two sampling methods, Latin hypercube sampling (LHS) and simple random sampling, were compared with the aim of improving the modeling speed of a neural network model. The sampling methods were used to generate the initial weight and bias sets. Electrical characteristic data for $HfO_2$ thin films were used as the modeling data. Ten initial parameter sets (initial weights and biases) were generated with each of LHS and simple random sampling. Modeling was performed with the generated initial parameters, and the number of epochs was measured. The other network parameters were fixed. The 20 minimum epoch numbers obtained iteratively for LHS and simple random sampling were analyzed by a nonparametric method because of their non-normality.
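
The difference between the two sampling methods can be sketched as follows. This is a minimal illustration on the unit hypercube with hypothetical function names. How the paper maps samples to actual weight and bias ranges is not stated in the abstract, so that step is left out.

```python
import random


def simple_random_sample(n_samples, n_dims, rng):
    """Draw each coordinate independently and uniformly from [0, 1)."""
    return [[rng.random() for _ in range(n_dims)] for _ in range(n_samples)]


def latin_hypercube_sample(n_samples, n_dims, rng):
    """Draw points so that, in every dimension, each of the n_samples
    equal-width strata of [0, 1) contains exactly one point."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each stratum [k/n, (k+1)/n).
        column = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    # Transpose the per-dimension columns into a list of points.
    return [list(point) for point in zip(*columns)]


rng = random.Random(0)
lhs = latin_hypercube_sample(10, 3, rng)  # e.g. 10 initial parameter sets
srs = simple_random_sample(10, 3, rng)
```

With simple random sampling, some strata may receive several points and others none. LHS spreads the ten parameter sets evenly along every dimension.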

Network Reliability Estimation Using Simulation and Network Reduction Techniques (시뮬레이션과 네트워크 축소기법을 이용한 네트워크 신뢰도 추정)

  • Seo, Jae-Jun;Jeon, Chi-Hyeok
    • ETRI Journal
    • /
    • v.14 no.4
    • /
    • pp.19-27
    • /
    • 1992
  • Since, as is well known, direct computation of the reliability of a large-scale, complex network generally requires exponential time, a variety of alternative methods that estimate network reliability by simulation have been proposed. Monte Carlo sampling is the major approach to estimating network reliability by simulation. In this paper, a dynamic Monte Carlo sampling method, called the conditional minimal cut set (CMCS) algorithm, is suggested. The CMCS algorithm simulates a minimal cut set composed of arcs originating from the (conditional) source node until s-t connectedness is confirmed, and then reduces the network on the basis of the states of the simulated arcs. We develop the importance sampling estimator and the total hazard estimator and compare the performance of these simulation estimators. The CMCS algorithm is found to be useful in reducing the variance of the network reliability estimator.
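
As a point of reference for the simulation approach, the sketch below shows plain (crude) Monte Carlo estimation of s-t reliability. It is not the CMCS algorithm, and it does not include the importance sampling or total hazard estimators from the paper. It assumes undirected arcs that fail independently with a common working probability `p_up`. The function names and the five-arc bridge network are illustrative choices.

```python
import random


def is_connected(n_nodes, working_edges, s, t):
    """Depth-first search from s over working undirected edges; True if t is reached."""
    adjacency = {v: [] for v in range(n_nodes)}
    for u, v in working_edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False


def crude_mc_reliability(n_nodes, edges, p_up, s, t, n_trials, seed=0):
    """Estimate P(s and t are connected) as the fraction of sampled
    network states in which a working s-t path exists."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Sample the state of every arc independently.
        working = [e for e in edges if rng.random() < p_up]
        hits += is_connected(n_nodes, working, s, t)
    return hits / n_trials


# Hypothetical bridge network: source 0, terminal 3, five arcs.
bridge = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
estimate = crude_mc_reliability(4, bridge, 0.9, 0, 3, 20000)
```

Variance-reduction schemes such as the one in the abstract aim for a smaller estimator variance than this baseline at the same number of trials.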

SIMPLE RANKED SAMPLING SCHEME: MODIFICATION AND APPLICATION IN THE THEORY OF ESTIMATION OF ERLANG DISTRIBUTION

  • RAFIA GULZAR;IRSA SAJJAD;M. YOUNUS BHAT;SHAKEEL UL REHMAN
    • Journal of applied mathematics & informatics
    • /
    • v.41 no.2
    • /
    • pp.449-468
    • /
    • 2023
  • This paper studies the estimation of the parameters of the Erlang distribution based on ranked set sampling (RSS) and some of its modifications. We consider maximum likelihood (ML) and Bayesian techniques to estimate the shape and scale parameters of the Erlang distribution based on RSS and modifications such as ERSS, MRSS, and MRSSu. The derivation for the unknown parameters is presented using the normal approximation to the asymptotic distribution of the ML estimators. Because of the complexity of the integral involved, the Bayes estimators of the unknown parameters are obtained using an MCMC method. Further, we compare the MSE of estimation under the different sampling schemes for different set sizes and cycle sizes. A real-life data application is also given to illustrate the efficiency of the proposed scheme.

Conditional Sampling Measurement to Identify Flame Structures in Turbulent Combustion (난류 화염 구조 규명을 위한 조건 평균 측정법)

  • Huh Kang Y.
    • Journal of the Korean Society of Visualization
    • /
    • v.2 no.1
    • /
    • pp.8-11
    • /
    • 2004
  • Conditional sampling measurement is required to obtain conditional averages as well as unconditional Favre averages and thereby resolve the different flame structures of turbulent combustion. A Favre average can be obtained as an integral of the conditional average and the Favre PDF in terms of the mixture fraction, which is the preferred sampling variable in diffusion-controlled turbulent combustion. MILD combustion data are presented as an example of a conditionally averaged data set and are compared with CMC calculation results.
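
The integral relation mentioned in this abstract can be written out as follows. The notation is an assumption, since the abstract gives no symbols: $\eta$ is the sample-space variable of the mixture fraction $\xi$, $Q(\eta)$ the density-weighted conditional average of a scalar $\phi$, $P(\eta)$ the PDF of the mixture fraction, and $\tilde{P}(\eta)$ its Favre PDF.

```latex
\begin{align}
  Q(\eta) &= \frac{\langle \rho \phi \mid \xi = \eta \rangle}
                  {\langle \rho \mid \xi = \eta \rangle}, \\
  \tilde{P}(\eta) &= \frac{\langle \rho \mid \xi = \eta \rangle \, P(\eta)}
                          {\bar{\rho}}, \\
  \tilde{\phi} &= \frac{\overline{\rho \phi}}{\bar{\rho}}
               = \int_0^1 Q(\eta) \, \tilde{P}(\eta) \, d\eta .
\end{align}
```

The last line follows by substituting the first two definitions into $\overline{\rho \phi} = \int_0^1 \langle \rho \phi \mid \xi = \eta \rangle \, P(\eta) \, d\eta$.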

Sampling Plans Based on Truncated Life Test for a Generalized Inverted Exponential Distribution

  • Singh, Sukhdev;Tripathi, Yogesh Mani;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems
    • /
    • v.14 no.2
    • /
    • pp.183-195
    • /
    • 2015
  • In this paper, we propose a two-stage group acceptance sampling plan for the generalized inverted exponential distribution under a truncated life test. Median life is taken as the quality parameter. Design parameters are obtained to ensure that the true median life is longer than a given specified life at given levels of consumer's risk and producer's risk. We also explore situations in which design parameters based on median lifetime can be used for other percentile points. Tables and specific examples are reported to explain the proposed plans. Finally, a real data set is analyzed to implement the plans in practical situations, and some suggestions are given.

A RENDERING ALGORITHM FOR HYBRID SCENE REPRESENTATION

  • Tien, Yen;Chou, Yun-Fung;Shih, Zen-Chung
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2009.01a
    • /
    • pp.17-22
    • /
    • 2009
  • In this paper, we discuss two fundamental issues of hybrid scene representation: construction and rendering. A hybrid scene consists of triangular meshes and point-set models. Considering the maturity of modeling techniques for triangular meshes, we suggest that generating a point-set model from a triangular mesh may be an easier and more economical approach. We improve stratified sampling by introducing the concept of priority. Our method is flexible in that the importance criteria can easily be changed by substituting priority functions. While many works have been devoted to blending the rendering results of points and triangles, our work renders point-set models and triangular meshes individually. We propose a novel way to eliminate depth occlusion artifacts and to texture a point-set model. Finally, we implement our rendering algorithm with the new features of shader model 4.0, and it turns out to be easily integrated with existing rendering techniques for triangular meshes.

Using ranked auxiliary covariate as a more efficient sampling design for ANCOVA model: analysis of a psychological intervention to buttress resilience

  • Jabrah, Rajai;Samawi, Hani M.;Vogel, Robert;Rochani, Haresh D.;Linder, Daniel F.;Klibert, Jeff
    • Communications for Statistical Applications and Methods
    • /
    • v.24 no.3
    • /
    • pp.241-254
    • /
    • 2017
  • Drawing a sample can be costly or time consuming in some studies. However, it may be possible to rank the sampling units according to some baseline auxiliary covariates, which are easily obtainable, and/or cost efficient. Ranked set sampling (RSS) is a method to achieve this goal. In this paper, we propose a modified approach of the RSS method to allocate units into an experimental study that compares L groups. Computer simulation estimates the empirical nominal values and the empirical power values for the test procedure of comparing L different groups using modified RSS based on the regression approach in analysis of covariance (ANCOVA) models. A comparison to simple random sampling (SRS) is made to demonstrate efficiency. The results indicate that the required sample sizes for a given precision are smaller under RSS than under SRS. The modified RSS protocol was applied to an experimental study. The experimental study was designed to obtain a better understanding of the pathways by which positive experiences (i.e., goal completion) contribute to higher levels of happiness, well-being, and life satisfaction. The use of the RSS method resulted in a cost reduction associated with smaller sample size without losing the precision of the analysis.
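
The basic ranked set sampling procedure and its comparison with simple random sampling can be sketched as below. This illustrates standard balanced RSS only, not the modified allocation scheme or the ANCOVA test proposed in the paper. The population model (a cheap covariate plus noise as the costly outcome), the set size, and the function names are all hypothetical.

```python
import random
import statistics


def ranked_set_sample(draw_unit, rank_key, measure, set_size, n_cycles):
    """Balanced RSS: in each cycle draw set_size sets of set_size units,
    rank each set by the cheap covariate, and measure only the unit with
    rank r in the r-th set."""
    sample = []
    for _ in range(n_cycles):
        for r in range(set_size):
            units = sorted((draw_unit() for _ in range(set_size)), key=rank_key)
            sample.append(measure(units[r]))
    return sample


rng = random.Random(0)


def draw_unit():
    # Hypothetical unit: (cheap covariate, costly outcome correlated with it).
    x = rng.gauss(0, 1)
    return (x, x + 0.3 * rng.gauss(0, 1))


def mean_rss():
    # 3 ranks x 4 cycles = 12 measured outcomes.
    values = ranked_set_sample(draw_unit, lambda u: u[0], lambda u: u[1], 3, 4)
    return statistics.fmean(values)


def mean_srs():
    # Simple random sample with the same number of measured outcomes.
    return statistics.fmean(draw_unit()[1] for _ in range(12))


# Compare the variance of the sample mean over repeated sampling.
var_rss = statistics.variance([mean_rss() for _ in range(1000)])
var_srs = statistics.variance([mean_srs() for _ in range(1000)])
```

Under this toy model, the sample mean from RSS should vary less than the mean from SRS for the same number of measured units, in line with the abstract's claim that RSS needs smaller samples for a given precision.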

Sub-Nyquist Nonuniform Sampling and Perfect Reconstruction of Speech Signals (음성신호의 Sub-Nyquist 비균일 표준화 및 완전 복구에 관한 연구)

  • Lee, He-Young
    • Speech Sciences
    • /
    • v.12 no.2
    • /
    • pp.153-170
    • /
    • 2005
  • Sub-Nyquist nonuniform sampling (SNNS) and a perfect reconstruction (PR) formula are proposed as a systematic method for obtaining a minimal representation of a speech signal. In the proposed method, the instantaneous sampling frequency (ISF) varies with the least upper bound of the spectral support of the speech signal in the time-frequency domain (TFD). A definition is given for the instantaneous bandwidth (IB), which determines the ISF and is used to generate the set of samples that represents a continuous-time signal perfectly. The spectral characteristics of the sampled data generated by the sub-Nyquist nonuniform sampling method are also analyzed. Because it follows the time-varying instantaneous bandwidth of the speech signal, the proposed method does not generate redundant samples.
