• Title/Abstract/Keywords: Ridge regression

Search Result 118, Processing Time 0.03 seconds

Genome-association analysis of Korean Holstein milk traits using genomic estimated breeding value

  • Shin, Donghyun;Lee, Chul;Park, Kyoung-Do;Kim, Heebal;Cho, Kwang-hyeon
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.30 no.3
    • /
    • pp.309-319
    • /
    • 2017
  • Objective: Holsteins are known as the world's highest milk-producing dairy cattle. The purpose of this study was to identify genetic regions strongly associated with milk traits (milk production, fat, and protein) using Korean Holstein data. Methods: This study was performed using single nucleotide polymorphism (SNP) chip data (Illumina BovineSNP50 Beadchip) of 911 Korean Holstein individuals. We inferred genomic estimated breeding values based on best linear unbiased prediction (BLUP) and ridge regression using BLUPF90 and R. We then performed a genome-wide association study and identified genetic regions related to milk traits. Results: We identified 9, 6, and 17 significant genetic regions related to milk production, fat, and protein, respectively. These genes are newly reported as being associated with milk traits in Holsteins. Conclusion: This study complements recent Holstein genome-wide association studies that identified other SNPs and genes as the most significant variants. These results will help to expand the knowledge of the polygenic nature of milk production in Holsteins.

Memory retrieval with a DNA computing (DNA 연산을 이용한 기억 인출 시뮬레이션)

  • Kim Joon-Shik;Lee Eun-Seok;Noh Yung-Kyun;Zhang Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2006.06a
    • /
    • pp.34-36
    • /
    • 2006
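  • This study discusses the feasibility of using DNA computing for associative memory retrieval, an implicit memory retrieval process in which the memory strength for a particular object gradually increases through repeated exposure, without conscious effort. For example, exposure to a target word strengthens the observing system's memory of that word, while the memory of a similar word decays slowly and the most dissimilar word is forgotten early. These words and their alphabet letters are encoded as DNA base sequences, and hybridization results are obtained through simulated annealing. Through supervised learning in the form of ridge regression, the count distribution of the DNA fragments is adjusted so that one target word is produced more often. As an experimental example, with the three words 'tic', 'tac', and 'toe' as items, DNA annealing simulations confirm that the count of the repeatedly stimulated target word increases. In addition, 'tic', which shares 't' and 'c' with 'tac', decays more slowly than 'toe', which shares only 't'. These experiments confirm that the implicit retrieval process of associative memory can be represented at the molecular level.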

Optimization of Tri-enzyme Extraction Procedures for the Microbiological Assay of Folate in Red Kidney Bean and Roasted Peanut Using Response Surface Methodology

  • Choi, Young-Min;Eitenmiller, Ronald R.;Kim, Seon-Hee;Lee, Jun-Soo
    • Food Science and Biotechnology
    • /
    • v.18 no.1
    • /
    • pp.31-35
    • /
    • 2009
  • Total folate content was determined by microbiological assay using Lactobacillus casei ssp. rhamnosus (ATCC 7469) with a 96-well microplate technique. Using roasted peanut and red kidney beans as representative legume samples, response surface methodology (RSM) was applied to optimize the trienzyme procedures for the determination of folate in legumes. After response surface regression (RSREG), a second-order polynomial equation was fitted to the experimental data. Ridge analysis showed that the optimal digestion times were <2 hr for Pronase® and $\alpha$-amylase, and <5 hr for conjugase to obtain maximal folate values for legume samples. This study confirms that established digestion times for cereal products (AOAC Method 2004.05) of 3 hr for protease and 2 hr for $\alpha$-amylase are applicable to legumes. Conjugase treatment can be reduced from 16 to 5 hr and the conjugase level from 20 to 5 mg per sample, providing significant cost savings.

Wild Boar (Sus scrofa corranus Heude) Habitat Modeling Using GIS and Logistic Regression (GIS와 로지스틱 회귀분석을 이용한 멧돼지 서식지 모형 개발)

  • 서창완;박종화
    • Spatial Information Research
    • /
    • v.8 no.1
    • /
    • pp.85-99
    • /
    • 2000
  • Accurate information on the habitat distribution of protected fauna is essential for habitat management in Korea, a country with very high development pressure. The objectives of this study were to develop a habitat suitability model of wild boar based on GIS and logistic regression, to create a habitat distribution map, and to prepare the basis for habitat management of the country's endangered and protected species. The modeling process of this research had the following three steps. First, a GIS database of environmental factors related to the use and availability of wild boar habitat was built. Wild boar locations were collected by radio-telemetry and GPS. Second, environmental factors affecting the habitat use and availability of wild boars were identified through chi-square tests. Third, a habitat suitability model based on logistic regression was developed, and the validity of the model was tested. Finally, a habitat assessment map was created by utilizing a rule-based approach. The results of the study were as follows. First, distinct differences in wild boar habitat use by season and habitat type were found; however, no differences by sex and activity type were found. Second, habitat availability analysis showed that elevation, aspect, forest type, and forest age were significant natural environmental factors affecting wild boar habitat selection, but the effects of slope, ridge/valley, water, and solar radiation could not be identified. Finally, the habitat at cutoff value of 0.5. The model validation showed that the inside validation site had a classification accuracy of 73.07% for total habitat and 80.00% for cover habitat, and the outside validation site had a classification accuracy of 75.00% for total habitat.

Late Quaternary Sequence Stratigraphy in Kyeonggi Bay, Mid-eastern Yellow Sea (황해 중동부 경기만의 후기 제4기 순차층서 연구)

  • Kwon, Yi-Kyun
    • Journal of the Korean earth science society
    • /
    • v.33 no.3
    • /
    • pp.242-258
    • /
    • 2012
  • The Yellow Sea has responded sensitively to high-amplitude sea-level fluctuations during the late Quaternary. The repeated inundation and exposure have produced distinct transgression-regression successions with extensive exposure surfaces in Kyeonggi Bay. The late Quaternary strata consist of four seismic stratigraphic units, considered as depositional sequences (DS-1, DS-2, DS-3, and DS-4). DS-1 was interpreted as ridge-forming sediments of tidal-flat and estuarine channel-fill facies, formed during the Holocene highstand. DS-2 consists of shallow-marine facies in the offshore area, formed during the regression of the Marine Isotope Stage (MIS)-3 period. DS-3 comprises the lower transgressive facies and the upper highstand tidal-flat facies in proximal ridges, and forced regression facies in distal ridges and the offshore area. The lowermost DS-4 rests on acoustic basement rocks and is considered to be shallow-marine and shelf deposits formed before the MIS-6 lowstand. This study suggests six depositional stages. During the first stage (Stage-A, the MIS-6 lowstand), the Yellow Sea shelf was subaerially exposed with intensive fluvial incision and weathering. The subsequent rapid and high-amplitude rise of sea level in Stage-B until the MIS-5e highstand produced transgressive deposits in the lowermost part of the MIS-5 sequence, and the successive regression during MIS-5d to -5a and the MIS-4 lowstand formed the upper part of the MIS-5 sequence in Stage-C. During Stage-D, from the MIS-4 lowstand to the MIS-3c highstand period, the transgressive MIS-3 sequence formed in a subtidal environment characterized by repetitive fluvial incision and channel-fill deposition in exposed areas. The subsequent sea-level fall culminating in the last glacial maximum (Stage-E) produced shallow-marine regressive deposits of the MIS-3 sequence in the offshore distal area, whereas it formed fluvial channel-fills and floodplain deposits in the proximal area. After the last glacial maximum, the overall Yellow Sea shelf was inundated by the Holocene transgression and highstand (Stage-F), forming the Holocene transgressive shelf sands and tidal ridges.

Prediction of Postoperative Lung Function in Lung Cancer Patients Using Machine Learning Models

  • Oh Beom Kwon;Solji Han;Hwa Young Lee;Hye Seon Kang;Sung Kyoung Kim;Ju Sang Kim;Chan Kwon Park;Sang Haak Lee;Seung Joon Kim;Jin Woo Kim;Chang Dong Yeo
    • Tuberculosis and Respiratory Diseases
    • /
    • v.86 no.3
    • /
    • pp.203-215
    • /
    • 2023
  • Background: Surgical resection is the standard treatment for early-stage lung cancer. Since postoperative lung function is related to mortality, predicted postoperative lung function is used to determine the treatment modality. The aim of this study was to evaluate the predictive performance of linear regression and machine learning models. Methods: We extracted data from the Clinical Data Warehouse and developed three sets: set I, the linear regression model; set II, machine learning models omitting the missing data; and set III, machine learning models imputing the missing data. Six machine learning models, the least absolute shrinkage and selection operator (LASSO), Ridge regression, ElasticNet, Random Forest, eXtreme gradient boosting (XGBoost), and the light gradient boosting machine (LightGBM), were implemented. The forced expiratory volume in 1 second measured 6 months after surgery was defined as the outcome. Five-fold cross-validation was performed for hyperparameter tuning of the machine learning models. The dataset was split into training and test datasets at a 70:30 ratio. Implementation was done after dataset splitting in set III. Predictive performance was evaluated by $R^2$ and mean squared error (MSE) in the three sets. Results: A total of 1,487 patients were included in sets I and III and 896 patients were included in set II. In set I, the $R^2$ value was 0.27, and in set II, LightGBM was the best model with the highest $R^2$ value of 0.5 and the lowest MSE of 154.95. In set III, LightGBM was the best model with the highest $R^2$ value of 0.56 and the lowest MSE of 174.07. Conclusion: The LightGBM model showed the best performance in predicting postoperative lung function.
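
The evaluation protocol described in this abstract (a 70:30 train/test split, five-fold cross-validation for hyperparameter tuning, and scoring by $R^2$ and MSE) can be sketched for ridge regression as follows. This is a minimal illustration and not the authors' pipeline: the data are synthetic, and the single-predictor model, the `lambdas` grid, and all function names are assumptions made for the example.

```python
import random

def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge fit for one predictor: slope = Sxy / (Sxx + lam)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    return slope, my - slope * mx  # the intercept is left unpenalized

def predict(model, xs):
    slope, intercept = model
    return [slope * x + intercept for x in xs]

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

def r2(ys, preds):
    my = sum(ys) / len(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return 1.0 - ss_res / ss_tot

def cv_mse(xs, ys, lam, k=5):
    """Mean validation MSE over k folds (fold f holds out every k-th sample)."""
    n = len(xs)
    scores = []
    for fold in range(k):
        val = list(range(fold, n, k))
        held_out = set(val)
        train = [i for i in range(n) if i not in held_out]
        model = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], lam)
        scores.append(mse([ys[i] for i in val], predict(model, [xs[i] for i in val])))
    return sum(scores) / k

# Synthetic data (assumption): y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1.0) for x in xs]

# 70:30 train/test split; samples are already in random order.
x_train, x_test = xs[:70], xs[70:]
y_train, y_test = ys[:70], ys[70:]

# Tune the penalty on the training set only, then score once on the test set.
lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam = min(lambdas, key=lambda lam: cv_mse(x_train, y_train, lam))
model = fit_ridge_1d(x_train, y_train, best_lam)
test_preds = predict(model, x_test)
test_r2 = r2(y_test, test_preds)
test_mse = mse(y_test, test_preds)
print(best_lam, round(test_r2, 3), round(test_mse, 3))
```

Tuning on the training folds and touching the test set only once keeps the reported $R^2$ and MSE from being optimistically biased by the hyperparameter search.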

Optimization for the Post-Harvest Induction of trans-Resveratrol by Soaking Treatment in Raw Peanuts (침지조작에 의한 레스베라트롤 증가조건의 최적화)

  • Lee, Seon-Sook;Seo, Sun-Jung;Lee, Boo-Yong;Lee, Hee-Bong;Lee, Junsoo
    • Journal of the Korean Society of Food Science and Nutrition
    • /
    • v.34 no.4
    • /
    • pp.567-571
    • /
    • 2005
  • In this study, the effects of varying the amount of water, soaking time at $25^{\circ}C$ and drying time after soaking at $45^{\circ}C$ on the induction of resveratrol were evaluated to optimize the soaking treatment by response surface methodology (RSM). After response surface regression (RSREG), a second-order polynomial equation was fitted to the experimental data. The analysis of variance showed that the model appeared to be adequate ($R^2=0.9547$) with no significant lack of fit (p>0.1). Statistical analysis showed that amount of water and soaking time were significant factors, whereas drying time was not. Ridge analysis showed that the optimized parameters were $67.15\%$ for amount of water, 19.58 hr for soaking time, and 65.56 hr for drying time. It was confirmed that resveratrol content increased from $0.1\;{\mu}g/g$ to $4.55\;{\mu}g/g$ under the optimized conditions. In addition, the experimental values at the optimized condition agreed with the values predicted by ridge analysis. Analytical method validation parameters such as accuracy, precision, and specificity were calculated to ensure the method's validity.

Application of the Response Surface Methodology and Process Optimization to the Electrochemical Degradation of Rhodamine B and N, N-Dimethyl-4-nitrosoanilin Using a Boron-doped Diamond Electrode (Boron-doped Diamond 전극을 이용한 Rhodamine B와 N, N-Dimethyl-4-nitrosoanilin의 전기화학적 분해에 반응표면분석법의 적용과 공정 최적화)

  • Kim, Dong-Seog;Park, Young-Seek
    • Journal of Environmental Health Sciences
    • /
    • v.36 no.4
    • /
    • pp.313-322
    • /
    • 2010
  • The aim of this research was to apply experimental design methodology to optimize the conditions for the electrochemical oxidation of Rhodamine B (RhB) and N, N-Dimethyl-4-nitrosoaniline (RNO, indicative of the OH radical). The electrochemical oxidation reactions of RhB degradation were mathematically described as a function of the parameters current ($X_1$), NaCl dosage ($X_2$) and pH ($X_3$) and modeled by use of the central composite design. The application of response surface methodology (RSM) yielded the following regression equations, which are empirical relationships between the removal efficiency of RhB and RNO and the test variables in coded units: RhB removal efficiency (%) = $94.21+7.02X_1+10.94X_2-16.06X_3+3.70X_1X_3+9.05X_2X_3-3.46X_1^2-4.67X_2^2-7.09X_3^2$; RNO removal efficiency (%) = $54.78+13.33X_1+14.93X_2-16.90X_3$. The model predictions agreed well with the experimentally observed results. Graphical response surface and contour plots were used to locate the optimum point. The estimated ridge of maximum response for RhB degradation using canonical analysis was 100.0% (optimal conditions: current, 0.80 A; NaCl dosage, 2.97%; and pH, 6.37).
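
As a worked illustration of the RhB regression equation above, the following sketch locates the stationary point of the fitted quadratic, which is the first step of a canonical analysis. The coefficients are transcribed from the abstract; the function names and the small pure-Python solver are assumptions for the example, and the coded-to-actual unit mapping is not given in the abstract, so the result stays in coded units.

```python
# Coefficients transcribed from the RhB regression equation (coded units).
B0 = 94.21
LINEAR = [7.02, 10.94, -16.06]              # X1, X2, X3
QUADRATIC = [-3.46, -4.67, -7.09]           # X1^2, X2^2, X3^2
INTERACTION = {(0, 2): 3.70, (1, 2): 9.05}  # X1*X3, X2*X3

def predict_rhb(x):
    """Evaluate the second-order model at coded settings x = [X1, X2, X3]."""
    y = B0 + sum(b * xi for b, xi in zip(LINEAR, x))
    y += sum(q * xi ** 2 for q, xi in zip(QUADRATIC, x))
    y += sum(c * x[i] * x[j] for (i, j), c in INTERACTION.items())
    return y

def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

# The gradient of the model is LINEAR + H x, where H is the Hessian.
H = [[0.0] * 3 for _ in range(3)]
for i, q in enumerate(QUADRATIC):
    H[i][i] = 2 * q
for (i, j), c in INTERACTION.items():
    H[i][j] = H[j][i] = c

# Setting the gradient to zero gives H x = -LINEAR.
stationary = solve(H, [-b for b in LINEAR])
print([round(v, 3) for v in stationary], round(predict_rhb(stationary), 2))
```

With these rounded coefficients the stationary point falls near (0.75, 0.69, -0.50) in coded units with a predicted value of about 104.6%. The Hessian is negative definite, so this is a maximum, but a prediction above 100% is not physically attainable, so this unconstrained point should not be equated with the optimum reported in the abstract.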

Application of the Central Composite Design and Response Surface Methodology to the Treatment of Dye using Electrocoagulation/flotation Process (전기응집/부상 공정을 이용한 염료 처리에 중심합성설계와 반응표면분석법의 적용)

  • Kim, Dong-Seog;Park, Young-Seek
    • Journal of Korean Society on Water Environment
    • /
    • v.26 no.1
    • /
    • pp.35-43
    • /
    • 2010
  • Experimental design and response surface methodology (RSM) have been applied to the investigation of the electrocoagulation/flotation of dye wastewater. The electrocoagulation/flotation reactions were mathematically described as a function of the parameters current (A), NaCl concentration (B), initial RhB concentration (C) and time (D), modeled by use of the central composite design (CCD). The application of RSM using the CCD yielded the following regression equation, which is an empirical relationship between the RhB removal (%) and the test variables: RhB removal (%) = $-300.42+129.21{\cdot}Current+46.99{\cdot}NaCl-0.11{\cdot}RhB+43.71{\cdot}Time-5.67{\cdot}Current{\cdot}NaCl-3.18{\cdot}Current{\cdot}Time-2.41{\cdot}NaCl{\cdot}Time-19.79{\cdot}Current^2-2.27{\cdot}NaCl^2-1.59{\cdot}Time^2$. The model predictions agreed well with the experimentally observed results ($R^{2}=0.9728$). The estimated ridge of maximum response for RhB removal (%) using canonical analysis was 99.4% (optimal conditions: A, 1.77 A; NaCl concentration, 2.23 g/L; RhB concentration, 56.12 mg/L; time, 9.98 min). To confirm this optimum condition, three additional experiments were performed, and the RhB removal values (%) obtained were within the range of 86.87% (95% PI low) to 111.93% (95% PI high).
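
A quick plug-in check of the regression equation above at the reported optimum can be sketched as follows. The coefficients and operating conditions are taken from the abstract; the function name is hypothetical, the Time coefficient is read as +43.71 (an assumption about the printed sign), and the inputs are assumed to be in the units the abstract reports (A, g/L, mg/L, min).

```python
def rhb_removal(current, nacl, rhb, time):
    """Fitted second-order model for RhB removal (%), as given in the abstract."""
    return (
        -300.42
        + 129.21 * current
        + 46.99 * nacl
        - 0.11 * rhb
        + 43.71 * time  # assumption: the Time term enters with a plus sign
        - 5.67 * current * nacl
        - 3.18 * current * time
        - 2.41 * nacl * time
        - 19.79 * current ** 2
        - 2.27 * nacl ** 2
        - 1.59 * time ** 2
    )

# Reported optimum: 1.77 A, 2.23 g/L NaCl, 56.12 mg/L RhB, 9.98 min.
predicted = rhb_removal(current=1.77, nacl=2.23, rhb=56.12, time=9.98)
print(round(predicted, 2))
```

Under that reading the model returns roughly 99.3%, close to the reported 99.4% and inside the stated 95% prediction interval; the small gap is consistent with the coefficients being rounded to two decimals.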

Disinfection of E. coli Using Electro-UV Complex Process: Disinfection Characteristics and Optimization by the Design of Experiment Based on the Box-Behnken Technique (전기-UV 복합 공정을 이용한 E. coli 소독 : 실험계획법중 박스-벤켄법을 이용한 소독 특성 및 최적화)

  • Kim, Dong-Seog;Park, Young-Seek
    • Journal of Environmental Science International
    • /
    • v.19 no.7
    • /
    • pp.889-900
    • /
    • 2010
  • Experimental design and response surface methodology (RSM) have been applied to the investigation of the electro-UV complex process for the disinfection of E. coli in water. The disinfection reactions of the electro-UV process were mathematically described as a function of the parameters power ($X_1$), NaCl dosage ($X_2$), initial pH ($X_3$) and disinfection time ($X_4$), modeled by use of the Box-Behnken technique. The application of RSM using the Box-Behnken technique yielded the following regression equation, which is an empirical relationship between the residual E. coli number and the test variables in actual variables: Ln (CFU) = 23.57 - 0.87 power - 1.87 NaCl dosage - 2.13 pH - 2.84 time - 0.09 power × time - 0.07 NaCl dosage × pH + 0.14 pH × time + 0.03 power$^2$ + 0.47 (NaCl dosage)$^2$ + 0.20 pH$^2$ + 0.33 time$^2$. The model predictions agreed well with the experimentally observed results ($R^2$ = 0.9987). Graphical response surface and contour plots were used to locate the optimum point. The estimated ridge of maximum response for E. coli disinfection using canonical analysis was Ln 1.06 CFU (optimal conditions: power, 15.40 W; NaCl dosage, 1.95 g/L; pH, 5.94; and time, 4.67 min). To confirm this optimum condition, three additional experiments were performed, and the residual E. coli numbers obtained were Ln 1.05, Ln 1.10 and Ln 1.12. These values were within the range of 0.62 (95% PI low) to 1.50 (95% PI high), which confirmed the reproducibility of the model.