• Title/Summary/Keyword: sampling methods

Optimal Design of the Adaptive Searching Estimation in Spatial Sampling

  • Pyong Namkung;Byun, Jong-Seok
    • Communications for Statistical Applications and Methods / v.8 no.1 / pp.73-85 / 2001
  • A spatial population in a plane area, such as an animal or aerial population, has certain relationships among regions located within a fixed distance of a selected region. We consider adaptive searching estimation in spatial sampling for such a population. Adaptive searching estimation depends on the values observed at sample points during the survey and on the nature of the surfaces under investigation. In this paper we study estimation by adaptive searching in spatial sampling, with the aim of estimating the area that possesses a particular characteristic in a spatial population. From the viewpoint of adaptive searching, we empirically compare systematic sampling with stratified sampling in spatial sampling using simulated data.
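
The comparison of systematic and stratified spatial sampling for area estimation can be illustrated with a minimal sketch. It does not implement the adaptive searching step from the paper; the unit-square study area, the disc-shaped region, and all function names are assumptions made only for illustration.

```python
import math
import random

def in_region(x, y):
    """Hypothetical characteristic: the point lies in a disc of radius 0.3 centred at (0.5, 0.5)."""
    return (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.3 ** 2

def systematic_estimate(k, rng):
    """Systematic sampling: a k x k grid sharing one random offset."""
    step = 1.0 / k
    ox, oy = rng.random() * step, rng.random() * step
    hits = sum(in_region(ox + i * step, oy + j * step)
               for i in range(k) for j in range(k))
    return hits / (k * k)

def stratified_estimate(k, rng):
    """Stratified sampling: one independent random point in each of the k x k cells."""
    step = 1.0 / k
    hits = sum(in_region((i + rng.random()) * step, (j + rng.random()) * step)
               for i in range(k) for j in range(k))
    return hits / (k * k)

rng = random.Random(0)            # fixed seed so the run is repeatable
true_area = math.pi * 0.3 ** 2    # area of the assumed disc
sys_est = systematic_estimate(20, rng)
str_est = stratified_estimate(20, rng)
print(f"true={true_area:.4f} systematic={sys_est:.4f} stratified={str_est:.4f}")
```

Both estimators use the proportion of sample points falling inside the region as the estimate of its area, since the study area here has unit area.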

Heterogeneous Ensemble of Classifiers from Under-Sampled and Over-Sampled Data for Imbalanced Data

  • Kang, Dae-Ki;Han, Min-gyu
    • International journal of advanced smart convergence / v.8 no.1 / pp.75-81 / 2019
  • The data imbalance problem is common and causes serious problems in the machine learning process. Sampling is one of the effective methods for addressing it. Over-sampling increases the number of instances, so on imbalanced data it is applied to the minority instances. Under-sampling reduces the number of instances and is usually performed on the majority data. We apply under-sampling and over-sampling to imbalanced data to generate sampled data sets. From these sampled data sets and the original data set, we construct a heterogeneous ensemble of classifiers, to which we apply five different algorithms. Experimental results on an intrusion detection dataset, used as an imbalanced dataset, show that our approach is effective.
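
The two resampling operations described above can be sketched in a few lines. This is a minimal illustration of plain random under-sampling and random over-sampling, not the authors' procedure; the function names and the toy 90:10 data are assumptions, and the sketch assumes the majority class is at least as large as the minority class.

```python
import random

def random_under_sample(majority, minority, rng):
    """Keep all minority instances and a random subset of the majority of equal size."""
    kept = rng.sample(majority, len(minority))
    return kept + minority

def random_over_sample(majority, minority, rng):
    """Keep everything and duplicate random minority instances until the classes balance."""
    extra = rng.choices(minority, k=len(majority) - len(minority))
    return majority + minority + extra

rng = random.Random(42)
majority = [(i, 0) for i in range(90)]   # hypothetical (feature, label) pairs, label 0
minority = [(i, 1) for i in range(10)]   # hypothetical minority class, label 1

under = random_under_sample(majority, minority, rng)
over = random_over_sample(majority, minority, rng)
print(len(under), len(over))
```

Each resampled set, together with the original data, could then be passed to a separate classifier to form an ensemble of the kind the abstract describes.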

Comparison of Occurrences of Coleoptera by Three Sampling Methods in Mt. Yeonyeop Area, Korea (채집법에 따른 연엽산 일대 딱정벌레목의 출현상 비교 분석)

  • Jeong Jong-Kook;Lee Seung-Il;Choi Jae-Seok;Kwon Oh-Kil
    • Korean Journal of Environmental Biology / v.23 no.3 s.59 / pp.228-237 / 2005
  • To compare the occurrence of Coleoptera across sampling methods (light trap, pitfall trap, and sweeping), we collected samples every month from April to September 2004 at Mt. Yeonyeop, Gangwon-do, Korea. Species composition, abundance, and dry weight differed completely among the sampling methods. We collected 151 species in 35 families (690 individuals) by sweeping, 148 species in 30 families (689 individuals) by light trap, and 112 species in 18 families (1,674 individuals) by pitfall trap. The dry weight of the collected samples was about 181.46 g for pitfall trap, 39.85 g for light trap, and 10.89 g for sweeping. Small-sized beetles with relatively high flight ability, such as Coccinellidae, Nitidulidae, and Scarabaeidae, were collected in the light trap. Species diversity was high in July. Unlike the light trap samples, the pitfall trap samples were large saprophagous or carnivorous beetles such as Carabidae, Silphidae, and Staphylinidae. The pitfall trap yielded a relatively higher number of individuals and lower species diversity than the other methods. The main samples collected by sweeping were small carnivorous or herbivorous beetles such as Chrysomelidae, Curculionidae, and Coccinellidae. The peak of species diversity occurred in May. Similarity calculated with Jaccard's index was 0.07 between light trap and pitfall trap, 0.10 between light trap and sweeping, and 0.01 between pitfall trap and sweeping; similarity among the sampling methods was therefore relatively low. In conclusion, the efficiency of each sampling method differed significantly with respect to the species composition of Coleoptera. This study emphasizes the necessity of using all three sampling methods in diversity research.
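
As a reminder of how the reported similarity values are defined, Jaccard's index is the number of species shared by two samples divided by the number of species found in either. A minimal sketch follows; the species labels are placeholders, not data from the study.

```python
def jaccard(a, b):
    """Jaccard similarity: |A ∩ B| / |A ∪ B| for two collections of species names."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty samples
    return len(a & b) / len(union)

# Placeholder species lists for two trap types (illustrative only).
light_trap = {"sp_A", "sp_B", "sp_C"}
pitfall_trap = {"sp_B", "sp_C", "sp_D"}
similarity = jaccard(light_trap, pitfall_trap)
print(similarity)
```

A value near 0, like those in the abstract, means the two methods captured almost entirely different species.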

Comparison the Diagnostic Value of Dilatation and Curettage Versus Endometrial Biopsy by Pipelle - a Clinical Trial

  • Sanam, Moradan;Majid, Mir Mohammad Khani
    • Asian Pacific Journal of Cancer Prevention / v.16 no.12 / pp.4971-4975 / 2015
  • Background: Several methods have been presented for evaluating the endometrium in patients with abnormal uterine bleeding, including minimally invasive and invasive approaches such as diagnostic curettage or endometrial biopsy by Pipelle. Many studies have compared the two methods, diagnostic curettage and outpatient endometrial biopsy. This investigation compared sampling adequacy, endometrial histopathology, failure rates, duration, and costs between diagnostic curettage in a hospital and endometrial biopsy. Materials and Methods: This single-blind clinical trial was performed on 130 patients older than 35 years who were referred to Amir training hospital in 2013 for elective diagnostic curettage because of abnormal uterine bleeding. For all eligible patients, an endometrial sample was taken by Pipelle without anesthesia or dilatation. Diagnostic curettage was then performed with a sharp curette under general anesthesia. Sampling duration was calculated and both samples were sent to the same pathologist. The diagnostic values of the two methods in the diagnosis of normal endometrium, endometrial hyperplasia, and carcinoma were compared, as were their costs. Data analysis was performed with SPSS (version 16.0). Chi-square, Fisher, and Pearson tests were used, and P values less than 0.05 were considered statistically significant. Results: The two methods agreed on sampling adequacy in 88% of cases and on pathological results in 94%. A specificity of 100% and a sensitivity of 90% were recorded for detection of proliferative endometrium, secretory endometrium, and simple hyperplasia without atypia, and 100% for cancer. The diagnostic accuracy of Pipelle compared with curettage was over 97%, so the failure rate in this study was below 5%. The sensitivity of Pipelle for detection of atrophic endometrium was below 50%. Duration and cost were lower for Pipelle than for curettage. Conclusions: Given the high agreement and cohesion coefficient between curettage and Pipelle on sampling adequacy and histopathological findings (except atrophic endometrium), together with the low failure rate, sampling duration, and cost, Pipelle can be introduced as a suitable alternative to diagnostic curettage.
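
For readers less familiar with the diagnostic measures quoted above, sensitivity and specificity are simple ratios from a 2x2 table that compares the test (here Pipelle) against the reference method (here curettage). The sketch below uses invented counts chosen only to reproduce a 90% sensitivity and 100% specificity; they are not the trial's data.

```python
def sensitivity(true_pos, false_neg):
    """Share of reference-positive cases that the test also calls positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of reference-negative cases that the test also calls negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts, for illustration only.
tp, fn, tn, fp = 9, 1, 20, 0
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```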

Sampling, Surveillance and Forecasting of Insect Population for Integrated Pest Management in Sericulture

  • Singh, R.N.;Maheshwari, M.;Saratchandra, B.
    • International Journal of Industrial Entomology and Biomaterials / v.8 no.1 / pp.17-26 / 2004
  • Pest monitoring through field surveys and surveillance helps in forecasting the build-up of pest populations. It reduces the pesticide load and forms the basis of Integrated Pest Management (IPM) in sericulture. Common sampling techniques for quantifying pest populations and the damage they cause are reviewed, emphasizing the need for quick and simple sampling methods. Various direct and indirect sampling methods for establishing pest populations are discussed, along with ways to use indirect sampling under an IPM programme in sericulture. The use of pheromone lures and traps is an important component of integrated pest management, which calls for integrating all available methods in a cost-effective and environmentally friendly manner that offers consistent efficacy. Silkworms feed on a variety of silk host plants and spin cocoons. Each silk host plant is attacked in the field by a number of insect pest species. Several pests are common to mulberry, tasar, oak tasar, muga, and eri host plants, but pest status and seasonal abundance differ from crop to crop. Key pests are serious, persistent species that occur perennially, cause considerable yield loss every year over large areas, and require control measures. Minor pests occur regularly, but sudden increases in their populations are not known. Occasional pests are sporadic but can cause substantial damage. Silk losses due to attack by all the pests have not been calculated. Information on pest biology, ecology, and current control practices is available, but the period of outbreak of major pests and predators on silkworms and their host plants needs to be reinvestigated. Forecasting of pests and predators based on surveillance information may provide an opportunity to minimize losses, and in particular to reduce the expenditure involved in pest management.

Comparison of resampling methods for dealing with imbalanced data in binary classification problem (이분형 자료의 분류문제에서 불균형을 다루기 위한 표본재추출 방법 비교)

  • Park, Geun U;Jung, Inkyung
    • The Korean Journal of Applied Statistics / v.32 no.3 / pp.349-374 / 2019
  • A class imbalance problem arises when one class outnumbers the other by a large proportion in binary data. Approaches such as transforming the training data have been studied to solve this problem. In this study, among the methods for dealing with imbalance in classification, we compared resampling methods. We sought a way to detect the minority class in the data more effectively. Through simulation, a total of 20 methods were compared, covering over-sampling, under-sampling, and combined over- and under-sampling. Logistic regression, support vector machine, and random forest models, which are commonly used in classification problems, served as classifiers. The simulation results showed that the random under-sampling (RUS) method had the highest sensitivity with an accuracy over 0.5. The next most sensitive method was the adaptive synthetic sampling approach, an over-sampling method. This indicated that the RUS method was suitable for finding minority class values. Results from applying the methods to some real data sets were similar to those of the simulation.

How to Select Polling Places in Exit Poll? (출구조사의 투표소 표집방안 비교)

  • Cho, Sung-Kyum;Kim, Ji-Yun
    • Survey Research / v.5 no.2 / pp.3-30 / 2004
  • In Korea, bellwether voting places have been selected for exit polls based on past voting results. Sometimes, stratification of voting places has been used to improve exit poll performance. The sampled voting places are intended to mirror the general voters of the entire electoral district, but few studies have examined which sampling method works better. This study compared four sampling methods: bellwether voting place sampling, random sampling, stratified bellwether sampling, and systematic sampling from ordered voting places. When we applied the four methods to the 2004 general election data, systematic sampling from ordered voting places outperformed the other three methods. We also found that sampling more than nine voting places contributes little to the accuracy of the estimation.
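
Systematic sampling from an ordered list, the best-performing method in this study, can be sketched as follows. The abstract does not state how the voting places were ordered, so ordering by a past vote share is an assumption here, as are the function name and the toy data.

```python
import random

def systematic_sample(ordered_items, n, rng):
    """Take n items at a fixed interval from an already ordered list, after a random start."""
    interval = len(ordered_items) / n
    start = rng.random() * interval
    last = len(ordered_items) - 1
    # min() guards against a floating-point edge case at the end of the list.
    return [ordered_items[min(int(start + i * interval), last)] for i in range(n)]

# Hypothetical frame: 45 voting places, assumed already sorted by past vote share.
places = [f"P{i:02d}" for i in range(45)]
rng = random.Random(7)
sample = systematic_sample(places, 9, rng)   # nine places, echoing the abstract's finding
print(sample)
```

Because the draws are spread evenly along the ordering, the sample covers the full range of the ordering variable, which a simple random sample of the same size does not guarantee.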

An Evaluation of Sampling Design for Estimating an Epidemiologic Volume of Diabetes and for Assessing Present Status of Its Control in Korea (우리나라 당뇨병의 역학적 규모와 당뇨병 관리현황 파악을 위한 표본설계의 평가)

  • Lee, Ji-Sung;Kim, Jai-Yong;Baik, Sei-Hyun;Park, Ie-Byung;Lee, June-Young
    • Journal of Preventive Medicine and Public Health / v.42 no.2 / pp.135-142 / 2009
  • Objectives: An appropriate sampling strategy for estimating the epidemiologic volume of diabetes was evaluated through simulation. Methods: We analyzed about 250 million medical insurance claims submitted to the Health Insurance Review & Assessment Service in 2003 with diabetes as a principal or subsequent diagnosis at least once per year. The database was restructured into a 'patient-hospital profile' with 3,676,164 cases and then into a 'patient profile' of 2,412,082 observations. The patient profile data were then used to test the validity of a proposed sampling frame and sampling methods for developing diabetes-related epidemiologic indices. Results: The simulation study showed that a stratified two-stage cluster sampling design with a total sample size of 4,000 would provide an estimate of 57.04% (95% prediction range, 49.83-64.24%) for the treatment prescription rate of diabetes. The proposed sampling design first stratifies the nation by area into "metropolitan/city/county" and by hospital type into "tertiary/secondary/primary/clinic" with proportions of 5:10:10:75. Hospitals are then randomly selected within the strata as the primary sampling unit, followed by a random selection of patients within the hospitals as the secondary sampling unit. The difference between the estimate and the parameter value was projected to be less than 0.3%. Conclusions: The proposed sampling scheme will be applied to a subsequent nationwide field survey, not only to estimate the epidemiologic volume of diabetes but also to assess the present status of nationwide diabetes control.
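
The design described above can be sketched in two steps: allocate the total sample across hospital-type strata, then draw hospitals and patients within a stratum. The abstract does not say exactly what the 5:10:10:75 proportion is applied to, so using it to split the total sample of 4,000 is an assumption, as are the function names and the toy sampling frame.

```python
import random

def allocate(total, weights):
    """Split a total sample size across strata in proportion to integer weights.
    Integer division is used, so totals that do not divide evenly would lose a few units."""
    weight_sum = sum(weights.values())
    return {name: total * w // weight_sum for name, w in weights.items()}

def two_stage_sample(hospitals, n_hospitals, n_patients, rng):
    """Stage 1: draw hospitals (primary units). Stage 2: draw patients within each."""
    chosen = rng.sample(sorted(hospitals), n_hospitals)
    return {h: rng.sample(hospitals[h], min(n_patients, len(hospitals[h])))
            for h in chosen}

weights = {"tertiary": 5, "secondary": 10, "primary": 10, "clinic": 75}
allocation = allocate(4000, weights)

# Hypothetical frame for one stratum: 10 hospitals with 30 patients each.
frame = {f"H{i}": [f"H{i}-patient{j}" for j in range(30)] for i in range(10)}
drawn = two_stage_sample(frame, 4, 5, random.Random(1))
print(allocation)
print({h: len(p) for h, p in drawn.items()})
```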