Search | Korea Science

Ensemble variable selection using genetic algorithm

Seogyoung, Lee;Martin Seunghwan, Yang;Jongkyeong, Kang;Seung Jun, Shin
- Communications for Statistical Applications and Methods / v.29 no.6 / pp.629-640 / 2022
Variable selection is one of the most crucial tasks in supervised learning, such as regression and classification. The best subset selection is straightforward and optimal but not practically applicable unless the number of predictors is small. In this article, we propose directly solving the best subset selection via the genetic algorithm (GA), a popular stochastic optimization algorithm based on the principle of Darwinian evolution. To further improve the variable selection performance, we propose to run multiple GA to solve the best subset selection and then synthesize the results, which we call ensemble GA (EGA). The EGA significantly improves variable selection performance. In addition, the proposed method is essentially the best subset selection and hence applicable to a variety of models with different selection criteria. We compare the proposed EGA to existing variable selection methods under various models, including linear regression, Poisson regression, and Cox regression for survival data. Both simulation and real data analysis demonstrate the promising performance of the proposed method.
https://doi.org/10.29220/CSAM.2022.29.6.629
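
The idea in this abstract (several independent GA runs over predictor subsets, then a synthesis step) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the fitness criterion, the genetic operators, or the synthesis rule, so BIC for a linear model, uniform crossover, bit-flip mutation, and a selection-frequency threshold are all assumed here, and the names `bic_linear`, `ga_subset`, and `ensemble_ga` are hypothetical.

```python
import numpy as np

def bic_linear(X, y, mask):
    """BIC of a least-squares fit on the columns where mask is True (assumed criterion)."""
    n = len(y)
    k = int(mask.sum())
    # Design matrix: intercept plus the selected columns.
    Z = np.column_stack([np.ones(n), X[:, mask]])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = float(np.sum((y - Z @ beta) ** 2))
    return n * np.log(rss / n) + (k + 1) * np.log(n)

def ga_subset(X, y, rng, pop_size=30, generations=40, mutation_rate=0.05):
    """One GA run: returns the boolean mask with the lowest BIC found."""
    p = X.shape[1]
    pop = rng.random((pop_size, p)) < 0.5  # random initial subsets
    for _ in range(generations):
        scores = np.array([bic_linear(X, y, m) for m in pop])
        order = np.argsort(scores)
        parents = pop[order[: pop_size // 2]]  # keep the better half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cross = rng.random(p) < 0.5            # uniform crossover
            child = np.where(cross, a, b)
            flip = rng.random(p) < mutation_rate   # bit-flip mutation
            children.append(child ^ flip)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([bic_linear(X, y, m) for m in pop])
    return pop[np.argmin(scores)]

def ensemble_ga(X, y, n_runs=10, threshold=0.5, seed=0):
    """Run the GA n_runs times and keep variables chosen in at least `threshold` of the runs."""
    rng = np.random.default_rng(seed)
    masks = np.array([ga_subset(X, y, rng) for _ in range(n_runs)])
    freq = masks.mean(axis=0)  # per-variable selection frequency
    return freq, freq >= threshold
```

Because the GA only needs a score for each candidate subset, swapping `bic_linear` for a criterion based on a Poisson or Cox likelihood would leave the rest of the sketch unchanged, which matches the abstract's point that the approach applies across models and selection criteria.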

Optimization of Swine Breeding Programs Using Genomic Selection with ZPLAN+

Lopez, B.M.;Kang, H.S.;Kim, T.H.;Viterbo, V.S.;Kim, H.S.;Na, C.S.;Seo, K.S.
- Asian-Australasian Journal of Animal Sciences / v.29 no.5 / pp.640-645 / 2016
The objective of this study was to evaluate the present conventional selection program of a swine nucleus farm and compare it with a new selection strategy employing genomic enhanced breeding value (GEBV) as the selection criteria. The ZPLAN+ software was employed to calculate and compare the genetic gain, total cost, return and profit of each selection strategy. The first strategy reflected the current conventional breeding program, which was a progeny test system (CS). The second strategy was a selection scheme based strictly on genomic information (GS1). The third scenario was the same as GS1, but the selection by GEBV was further supplemented by the performance test (GS2). The last scenario was a mixture of genomic information and progeny tests (GS3). The results showed that the accuracy of the selection index of young boars of GS1 was 26% higher than that of CS. On the other hand, both GS2 and GS3 gave 31% higher accuracy than CS for young boars. The annual monetary genetic gain of GS1, GS2 and GS3 was 10%, 12%, and 11% higher, respectively, than that of CS. As expected, the discounted costs of genomic selection strategies were higher than those of CS. The costs of GS1, GS2 and GS3 were 35%, 73%, and 89% higher than those of CS, respectively, assuming a genotyping cost of $120. As a result, the discounted profit per animal of GS1 and GS2 was 8% and 2% higher, respectively, than that of CS while GS3 was 6% lower. Comparison among genomic breeding scenarios revealed that GS1 was more profitable than GS2 and GS3. The genomic selection schemes, especially GS1 and GS2, were clearly superior to the conventional scheme in terms of monetary genetic gain and profit.
https://doi.org/10.5713/ajas.15.0842

Evaluating the Performance of Four Selections in Genetic Algorithms-Based Multispectral Pixel Clustering

Kutubi, Abdullah Al Rahat;Hong, Min-Gee;Kim, Choen
- Korean Journal of Remote Sensing / v.34 no.1 / pp.151-166 / 2018
This paper compares the performance of four selection methods used when genetic algorithms (GAs) are applied to automatically optimize multispectral pixel clusters for unsupervised classification of KOMPSAT-3 data, since selection, one of the three main types of operators alongside crossover and mutation, is the driving force that determines the overall operation of clustering GAs. Experimental results demonstrate that tournament selection performs better than the other selections, especially in terms of the number of generations and the convergence rate. However, it is computationally more expensive than elitism selection, which has the slowest convergence rate in the comparison and a lower probability of reaching optimum cluster centers than the other selections. Rank-based selection and proportional roulette wheel selection show similar performance in the average Euclidean distance of the pixel clustering, even though rank-based selection is computationally much more expensive than proportional roulette. With respect to finding the global optimum, tournament selection has a higher potential to reach it before rank-based selection, which spends a lot of computational time on fitness smoothing. The tournament selection-based clustering GA is used to successfully classify the KOMPSAT-3 multispectral data, achieving sufficient thematic accuracy (namely, a Kappa coefficient of 0.923).
https://doi.org/10.7780/kjrs.2018.34.1.11
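
Two of the selection operators compared in this abstract, tournament selection and proportional roulette wheel selection, can be sketched generically as below. This is an illustrative sketch only: the function names, the tournament size `k`, and the convention that a larger fitness value is better are assumptions, and the paper's own fitness function for pixel clustering is not reproduced here.

```python
import random

def tournament_select(population, fitness, k=3, rng=random):
    """Draw k distinct individuals at random and return the fittest of them."""
    contestants = rng.sample(range(len(population)), k)
    winner = max(contestants, key=lambda i: fitness[i])
    return population[winner]

def roulette_select(population, fitness, rng=random):
    """Return one individual with probability proportional to its non-negative fitness."""
    return rng.choices(population, weights=fitness, k=1)[0]
```

Tournament selection depends only on the ordering of fitness values within each tournament, whereas roulette wheel selection depends on their magnitudes, so the two respond differently when fitness values are rescaled.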

The Controlled Selection: Do Algorithms for Optimal Sampling Plan Exist?

Kim, Sun-Woong;Ryu, Jae-Bok;Yum, Joon-Keun
- Proceedings of the Korean Statistical Society Conference / 2002.11a / pp.175-178 / 2002
A number of controlled selection methods, which have some advantages for practical surveys in considering controls beyond stratification, have been developed throughout the last half-century. With respect to the optimization of the sampling plan, it is obvious that we may use optimal controlled selection in preference to satisfactory controlled selection. However, there are currently certain restrictions on the employment of optimal controlled selection. We present further research to improve an algorithm for optimal controlled selection and to develop standard software.

Selection Responses for Milk, Fat and Protein Yields in Zimbabwean Holstein Cattle

Mandizha, S.;Makuza, S.M.;Mhlanga, F.N.
- Asian-Australasian Journal of Animal Sciences / v.13 no.7 / pp.883-887 / 2000
One way of evaluating the effectiveness of a dairy breeding program is to measure response to selection. This may be direct or indirect. The objectives of this study were to estimate expected progress for direct selection on milk, fat and protein yields; to estimate the expected correlated responses on indirect selection for milk, fat and protein yields in Zimbabwean Holstein cattle and to establish the effect of selection intensity on responses. The Animal Model contained fixed effects of herd, year of calving, calving month, dry period, milking frequency and additive effects pertaining to cows, sires and dams. AIREML software package was used to analyse the data. The genetic and phenotypic parameters obtained in this study were used to compute direct and correlated responses to selection. Because of the higher heritabilities in first parity, genetic progress was found to be greater when selection was practised on first parity cows as compared to later lactations. It is therefore recommended that older cows in the herd be replaced with improved heifers so as to enhance genetic progress.
https://doi.org/10.5713/ajas.2000.883
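
For orientation, the standard textbook expressions for the direct and correlated responses to selection are given below. The abstract does not state which formulas were used, so this is an assumption about the usual quantitative-genetics definitions, not a quotation from the paper. Here $i$ is the selection intensity, $h_X^2$ and $h_Y^2$ are the heritabilities of the selected trait $X$ and the correlated trait $Y$, $\sigma_{P}$ denotes a phenotypic standard deviation, and $r_A$ is the additive genetic correlation between the two traits.

```latex
R_X = i \, h_X^2 \, \sigma_{P_X}
\qquad
CR_Y = i \, h_X \, h_Y \, r_A \, \sigma_{P_Y}
```

Both expressions scale linearly with $i$ and increase with heritability, which is consistent with the abstract's finding that the higher first-parity heritabilities give greater genetic progress.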

Bayesian estimation for finite population proportion under selection bias via surrogate samples

Choi, Seong Mi;Kim, Dal Ho
- Journal of the Korean Data and Information Science Society / v.24 no.6 / pp.1543-1550 / 2013
In this paper, we study Bayesian estimation for the finite population proportion in binary data under selection bias. We use a Bayesian nonignorable selection model to accommodate the selection mechanism. We compare four possible estimators of the finite population proportion based on data analysis as well as Monte Carlo simulation. It turns out that the nonignorable selection model might be useful for weakly biased samples.
https://doi.org/10.7465/jkdi.2013.24.6.1543

DEVELOPMENT OF KNOWLEDGE BASED SELECTION PROCESS FOR FINISHING MATERIALS AT BUILDING DESIGN PHASE

Su-Ho Yun;Hyun-Soo Park;Gyu-Tae Noh;Hye-Rin Lee;Kyo-Jin Koo
- International conference on construction engineering and project management / 2011.02a / pp.209-212 / 2011
Selection of finishing materials in the design stage is an important management factor in terms of use safety and satisfaction, and work cost and process. However, selection of materials in the design stage is usually conducted without related guidelines or a set process, and instead depends on the experience of the architect or the advice of materials company employees. Therefore, the aim of this study was to develop a finishing materials selection process that can be used by an architect. Materials selection rules collected through interviews with experts and five office building cases were used as knowledge. In addition, another aim of the study was to propose a prototype system interface for use in the field.

Distributed Relay Selection Algorithm for Cooperative Communication

Oo, Thant Zin;Hong, Choong-Seon
- Proceedings of the Korean Information Science Society Conference / 2011.06d / pp.213-214 / 2011
This paper presents a distributed relay selection algorithm for cooperative communication. The algorithm separates the decision making into two simple steps, decision making for employing cooperative communication and decision making for relay selection.

Simultaneous outlier detection and variable selection via difference-based regression model and stochastic search variable selection

Park, Jong Suk;Park, Chun Gun;Lee, Kyeong Eun
- Communications for Statistical Applications and Methods / v.26 no.2 / pp.149-161 / 2019
In this article, we suggest the following approaches to simultaneous variable selection and outlier detection. First, we determine possible candidates for outliers using properties of an intercept estimator in a difference-based regression model, and the information of outliers is reflected in the multiple regression model by adding mean shift parameters. Second, we select the best model from the model including the outlier candidates as predictors using stochastic search variable selection. Finally, we evaluate our method using simulations and real data analysis, which yield promising results. In addition, we need to develop our method to make robust estimates. We will also extend it to the nonparametric regression model for simultaneous outlier detection and variable selection.
https://doi.org/10.29220/CSAM.2019.26.2.149

Nonlinear Feature Transformation and Genetic Feature Selection: Improving System Security and Decreasing Computational Cost

Taghanaki, Saeid Asgari;Ansari, Mohammad Reza;Dehkordi, Behzad Zamani;Mousavi, Sayed Ali
- ETRI Journal / v.34 no.6 / pp.847-857 / 2012
Intrusion detection systems (IDSs) have an important effect on system defense and security. Recently, most IDS methods have used transformed features, selected features, or original features. Both feature transformation and feature selection have their advantages. Neighborhood component analysis feature transformation and genetic feature selection (NCAGAFS) is proposed in this research. NCAGAFS is based on soft computing and data mining and uses the advantages of both transformation and selection. This method transforms features via neighborhood component analysis and chooses the best features with a classifier based on a genetic feature selection method. This novel approach is verified using the KDD Cup99 dataset, demonstrating higher performance than other well-known methods under various classifiers.
https://doi.org/10.4218/etrij.12.1812.0032
