• Title/Summary/Keyword: stepwise addition

Apartment Price Prediction Using Deep Learning and Machine Learning (딥러닝과 머신러닝을 이용한 아파트 실거래가 예측)

  • Hakhyun Kim;Hwankyu Yoo;Hayoung Oh
    • KIPS Transactions on Software and Data Engineering / v.12 no.2 / pp.59-76 / 2023
  • Since the onset of COVID-19, apartment prices have risen at an unusual pace, and in such an uncertain real estate market, price prediction research is very important. In this paper, we build a large dataset of 870,000 records covering 2015 to 2020 by collecting and crawling data from various real estate sites, gather as many variables as possible, and create a model to predict the actual transaction prices of apartments in the future. The study first addresses multicollinearity by removing and combining variables. Five variable selection methods are then used to extract meaningful independent variables: Forward Selection, Backward Elimination, Stepwise Selection, L1 Regularization, and Principal Component Analysis (PCA). Four machine learning and deep learning algorithms, namely a deep neural network (DNN), XGBoost, CatBoost, and Linear Regression, are trained after hyperparameter optimization, and their predictive power is compared. In an additional experiment, the number of nodes and layers of the DNN is varied to find the most appropriate configuration. Finally, the best-performing model is used to predict actual apartment transaction prices in 2021, and the predictions are compared with the actual 2021 data. We are confident that machine learning and deep learning can help investors make the right decisions when purchasing homes in various economic situations.
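Forward selection, one of the variable selection methods named in the abstract above, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy data, the helper names (`solve`, `rss`, `forward_selection`), and the stopping rule (a hypothetical `min_gain` threshold on the relative drop in residual sum of squares) are all introduced here. Real studies more often stop on p-values or an information criterion.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def rss(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the chosen columns."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def forward_selection(X, y, min_gain=0.05):
    """Greedily add the column that lowers RSS most; stop when the relative gain is small."""
    selected, remaining = [], list(range(len(X[0])))
    current = rss(X, y, selected)
    while remaining:
        best = min(remaining, key=lambda c: rss(X, y, selected + [c]))
        new = rss(X, y, selected + [best])
        if current == 0 or (current - new) / current < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        current = new
    return selected


# Hypothetical toy data: y depends on columns 0 and 2 only; column 1 is irrelevant.
x0 = [1, 2, 3, 4, 5, 6, 7, 8]
x1 = [1, -1, -1, 1, 1, -1, -1, 1]
x2 = [2, 0, 1, 3, 0, 2, 3, 1]
noise = [0.1, -0.1, 0.1, -0.1, -0.1, 0.1, -0.1, 0.1]
X = [list(row) for row in zip(x0, x1, x2)]
y = [3 * a - 2 * c + e for a, c, e in zip(x0, x2, noise)]

selected = forward_selection(X, y)
print(selected)
```

Backward elimination runs the same loop in reverse, starting from all columns, and stepwise selection alternates the two so that a variable added early can be dropped later.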

Temporal distribution analysis of design rainfall by significance test of regression coefficients (회귀계수의 유의성 검정방법에 따른 설계강우량 시간분포 분석)

  • Park, Jin Heea;Lee, Jae Joon
    • Journal of Korea Water Resources Association / v.55 no.4 / pp.257-266 / 2022
  • Inundation damage is increasing every year due to localized heavy rain and a growing number of rainfall events exceeding the design frequency. Accordingly, the importance of hydraulic structures for flood control and defense is also increasing. Hydraulic structures are designed according to their purpose and performance, and the flood discharge is an important design factor. In Korea, however, design rainfall is used as input data for the hydrological analysis behind hydraulic structure design, because of the lack of sufficient data and the low reliability of observation data. Accurate probability rainfall and its temporal distribution are important factors in estimating the design rainfall. In practice, the regression equation of the temporal distribution of design rainfall is calculated from the cumulative rainfall percentage of Huff's quartile method, and a 6th-order polynomial regression equation, which shows high overall accuracy, is applied uniformly. In this study, an optimized regression equation of the temporal distribution is derived using variable selection methods according to the principle of parsimony in statistical modeling, and the derived equation is verified through a significance test. The results indicate that it is most appropriate to derive the regression equation of the temporal distribution using the stepwise selection method, which has the advantages of both forward selection and backward elimination.
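As a hedged illustration of the setup described in this abstract, the sixth-order polynomial and the coefficient significance test can be written as below. The notation is an assumption introduced here, not taken from the paper: $P$ is the cumulative rainfall percentage, $T$ the cumulative time percentage, $a_k$ the regression coefficients, $n$ the number of data points, and $p$ the number of polynomial terms retained in the model.

```latex
P(T) = \sum_{k=0}^{6} a_k T^{k},
\qquad
H_0 : a_k = 0,
\qquad
t_k = \frac{\hat{a}_k}{\mathrm{SE}(\hat{a}_k)} \sim t_{\,n-p-1}
```

Under stepwise selection, a term $T^k$ enters the model when its test is significant and is removed again if it later becomes insignificant, so the final equation may keep fewer than all six powers.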

The Influence of Professional Identity, Role Conflict, and Job Stress on Job Satisfaction of Nurses in the General Hospital Wards

  • Su-Kyung Kim;Sun-Yeun Hong
    • Journal of the Korea Society of Computer and Information / v.28 no.10 / pp.187-195 / 2023
  • This paper aims to identify the effects of professional identity, role conflict, and job stress on the job satisfaction of nurses in general hospital wards. The subjects were 193 nurses working in general hospitals in K district. Data were analyzed using frequency, percentage, mean, standard deviation, t-test, one-way ANOVA, Scheffe test, Pearson's correlation coefficient, and stepwise multiple regression. The results are as follows. First, among the general characteristics, years of work in general hospital wards showed a statistically significant difference in professional identity: the fewer the working years, the higher the professional identity. Second, professional identity showed a positive correlation with job satisfaction and role conflict, and a negative correlation with job stress; that is, the higher the professional identity, the higher the job satisfaction and role conflict, but the lower the job stress. Role conflict correlated positively with job stress: the higher the role conflict, the higher the job stress. Job stress correlated negatively with job satisfaction: the higher the job stress, the lower the job satisfaction. Third, the factors affecting the job satisfaction of nurses working in general hospital wards were job stress and professional identity, which together had an explanatory power of 38%. Thus, to improve the job satisfaction of nurses, it is necessary to develop programs that foster a positive professional identity. It is also necessary to recognize the need to relieve work-related job stress and to secure diverse human and material support resources.

Transient Behaviors of a Two-Stage Biofilter Packed with Immobilized Microorganisms when Treating a Mixture of Odorous Compounds (미생물 포괄고정화 담체를 이용한 이단 바이오필터에서의 오염부하량 동적 부하변동시 복합악취 제거효율 변화특성)

  • NamGung, Hyeong-Kyu;Shin, Seung-Kyu;Hwang, Sun-Jin;Song, Ji-Hyeon
    • Journal of Korean Society of Environmental Engineers / v.32 no.12 / pp.1126-1133 / 2010
  • A two-stage biofilter was constructed and used to determine the removal efficiency under dynamic loading of a mixture of odorous compounds including benzene, toluene, p-xylene, ammonia, and hydrogen sulfide. A yeast strain, Candida tropicalis, and a sulfur-oxidizing bacterial (SOB) strain, Acidithiobacillus caldus sp., were immobilized in polyurethane media and packed in the two-stage biofilter. The dynamic loading experiments consisted of (1) stepwise loading variation of all the odorous compounds (total EC test), (2) stepwise loading variation of each odorous compound, and (3) intermittent loading variation with 2-day-off and 3-day-on. The total EC test showed that the maximum elimination capacity was $61\;g/m^3/hr$ for total VOCs, and 5.2 and $9.1\;g/m^3/hr$ for ammonia and hydrogen sulfide, respectively. In addition, inhibition between VOCs was observed when the loading of each individual VOC was varied. In particular, the stepwise increase in toluene loading decreased the benzene and p-xylene removal efficiencies by about 30% and 25%, respectively. However, inhibition between organic and inorganic compounds was not observed. The intermittent loading variation with 2-day-off and 3-day-on showed that more than 95% of the overall removal efficiency was restored within two days after the loading resumed. Consequently, the two-stage biofilter packed with immobilized microorganisms showed advantages over conventional biofilters for the simultaneous treatment of a mixture of organic and inorganic odorous compounds.

Assessment of Dietitian's Nutritional Quality Management for School Food Service (학교급식 영양사의 영양적 품질관리 수행도 평가)

  • Ryu, Kyung;Woo, Chang-Nam;Kim, Woon-Ju
    • Journal of the Korean Society of Food Science and Nutrition / v.35 no.2 / pp.238-247 / 2006
  • The purpose of this study was to evaluate school dietitians' performance of nutritional quality control, and further to establish effective and objective standards of nutritional quality control. Data came from the responses of 200 school dietitians in the Chungbuk area to a structured questionnaire based on total quality management (TQM). The questionnaire consisted of four fields: (1) performance of nutritional quality control, (2) performance of stepwise food production to maximize nutrient preservation rate, (3) management of documents and records related to nutritional quality control, and (4) other related matters. The items were measured on a five-point Likert scale ranging from 'strongly agree' to 'strongly disagree'. First, the analysis indicated that school dietitians performed worst on 'human resource management', moderately on 'nutritional quality control', and best on 'leadership'. Second, the analysis of stepwise food production to maximize nutrient preservation rate showed that dietitians made considerable efforts to maximize the nutrients of cooked food, but that most nutrient destruction may be caused by heating during cooking. Third, the systematic use of documents and records for nutritional quality control was not sufficiently accomplished, especially in the production phase of food. In addition, the Pearson correlation coefficient indicated significant relationships between performance of nutritional quality control and performance of stepwise food production to maximize nutrient preservation rate, and between performance of nutritional quality control and management of documents and records related to nutritional quality control. Finally, the findings suggest that more effort should be made to carefully establish TQM-based standards for the improvement of nutritional quality.

Mini-Array of Multiple Tumor-associated Antigens (TAAs) in the Immunodiagnosis of Esophageal Cancer

  • Qin, Jie-Jie;Wang, Xiao-Rui;Wang, Peng;Ren, Peng-Fei;Shi, Jian-Xiang;Zhang, Hong-Fei;Xia, Jun-Fen;Wang, Kai-Juan;Song, Chun-Hua;Dai, Li-Ping;Zhang, Jian-Ying
    • Asian Pacific Journal of Cancer Prevention / v.15 no.6 / pp.2635-2640 / 2014
  • Sera of cancer patients may contain antibodies that react with a unique group of autologous cellular antigens called tumor-associated antigens (TAAs). The present study aimed to determine whether a mini-array of multiple TAAs would enhance antibody detection and be a useful approach in esophageal cancer detection and diagnosis. Our mini-array of multiple TAAs consisted of eleven full-length recombinant proteins: p53, p16, Imp1, CyclinB1, C-myc, RalA, p62, Survivin, Koc, CyclinD1, and CyclinE. Enzyme-linked immunosorbent assays (ELISA) were used to detect autoantibodies against the eleven selected TAAs in 174 sera from patients with esophageal cancer, as well as 242 sera from normal individuals. In addition, positive ELISA results were confirmed by Western blotting. In a parallel screening trial, with the successive addition of antigens to a final total of eleven TAAs, there was a stepwise increase in positive antibody reactions. The eleven TAAs were the best parallel combination, and the sensitivity and specificity in diagnosing esophageal cancer were 75.3% and 81.0%, respectively. The positive and negative predictive values were 74.0% and 82.0%, respectively, indicating that the parallel assay of eleven TAAs raised the diagnostic precision significantly. In addition, the levels of antibodies to seven antigens, comprising p53, Imp1, C-myc, RalA, p62, Survivin, and CyclinD1, differed significantly across stages of esophageal cancer, which suggests that autoantibodies may be involved in the pathogenesis and progression of esophageal cancer. Overall, this study further supports our previous hypothesis that a combination of antibodies might achieve higher sensitivity for the diagnosis of certain types of cancer. A customized mini-array of multiple carefully selected TAAs is able to enhance autoantibody detection in the immunodiagnosis of esophageal cancer, and autoantibodies to TAAs might be reference indicators of clinical stage.
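The four diagnostic figures reported in this abstract are linked by simple arithmetic, which the short check below illustrates. The group sizes (174 patient sera, 242 normal sera) come from the abstract; the counts of 131 true positives and 196 true negatives are an assumption, inferred here as the whole numbers that reproduce the reported sensitivity and specificity. The function name `diagnostic_metrics` is likewise introduced only for this sketch.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV and NPV as fractions."""
    return {
        "sensitivity": tp / (tp + fn),  # positives found among patients
        "specificity": tn / (tn + fp),  # negatives found among normal sera
        "ppv": tp / (tp + fp),          # positive results that are true
        "npv": tn / (tn + fn),          # negative results that are true
    }


patients, normals = 174, 242  # group sizes stated in the abstract
tp, tn = 131, 196             # assumed counts, inferred from the reported rates

m = diagnostic_metrics(tp, patients - tp, tn, normals - tn)
for name, value in m.items():
    print(f"{name}: {value:.1%}")
```

With these assumed counts, the predictive values follow directly from the same 2x2 table as the sensitivity and specificity, so all four reported percentages are mutually consistent.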

Dissolution and Duodenal Permeation Characteristics of Lovastatin from Bile Salt Solid Dispersions (담즙산염과의 고체분산체로부터 로바스타틴의 용출 및 십이지장 점막 투과 특성)

  • Chun, In-Koo
    • Journal of Pharmaceutical Investigation / v.39 no.2 / pp.97-106 / 2009
  • Although lovastatin (LS) is widely used in the treatment of hypercholesterolemia, its bioavailability is known to be around 5%. This study aimed to increase the solubility and dissolution-permeation rates of LS using solid dispersions (SDs) with bile salts. The solubilities of LS in water, aqueous bile salt solutions, and non-aqueous vehicles were determined, and the effects of bile salts on the permeation of LS from SDs across a cellulose membrane or duodenal mucosa were evaluated using a horizontal permeation system. SDs were prepared at various ratios of LS to carriers, such as sodium deoxycholate (SDC), sodium glycocholate (SGC) and/or 2-hydroxypropyl-$\beta$-cyclodextrin (HPCD). The addition of bile salts (25 mM) to water markedly increased the solubility of LS through micellar solubilization. Some non-aqueous vehicles were effective in solubilizing LS. Differential scanning calorimetric studies showed that the crystallinity of LS in SDs disappeared, indicating the formation of an amorphous state. The SDs showed markedly enhanced dissolution compared with their physical mixtures (PMs) and the drug alone. In the dissolution-permeation studies using a cellulose membrane, the donor and receptor solutions were maintained under sink conditions using pH 7.0 phosphate buffer containing 0.05% sodium lauryl sulfate (SLS). The flux of LS alone was nearly the same as that of the LS-SDC-HPCD (1:3:6) PM. However, the flux of the LS-SDC-HPCD (1:3:6) SD increased slightly compared with the drug alone and the PM, suggesting that entrapment of LS in micelles does not significantly hinder permeation across the cellulose membrane.
In the dissolution-duodenal permeation studies using a LS-HPCD-SDC (1:3:6) SD, the addition of various bile salts to the donor solutions (25 mM) markedly enhanced the permeation of LS, and the fluxes were found to be $0.69{\pm}0.41$, $0.87{\pm}0.51$, $0.84{\pm}0.46$, $0.47{\pm}0.17$ and $0.68{\pm}0.32{\mu}g/cm^2/hr$ for sodium cholate (SC), SDC, SGC, sodium taurodeoxycholate (STDC) and sodium taurocholate (STC), respectively. A stepwise increase of the donor SGC concentration increased the flux dose-dependently. From the relationship between donor SGC concentration and flux, the concentration of SGC initiating permeation across the duodenal mucosa was calculated to be 11.1 mM, which is nearly the same as the critical micelle concentration (CMC, 11.6 mM) of SGC. However, with no addition of bile salts and below the CMC, the permeation was very limited and erratic, indicating that LS itself is very poorly permeable. Higher proportions of bile salt in the SD, such as LS-SDC or LS-SGC (1 : 49 and 1 : 69), showed highly promoted fluxes. In conclusion, SD systems with bile salts, which may form micelles in intestinal fluids, might be a promising means of providing enhanced dissolution and intestinal permeation of the practically insoluble and non-absorbable LS.

Development and application of prediction model of hyperlipidemia using SVM and meta-learning algorithm (SVM과 meta-learning algorithm을 이용한 고지혈증 유병 예측모형 개발과 활용)

  • Lee, Seulki;Shin, Taeksoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.111-124 / 2018
  • This study aims to develop a classification model for predicting the occurrence of hyperlipidemia, one of the chronic diseases. Prior studies applying data mining techniques to disease prediction can be classified into model design studies for predicting cardiovascular disease and studies comparing disease prediction results. In the foreign literature, studies predicting cardiovascular disease were predominant among those using data mining techniques. Domestic studies were not much different, although they mainly focused on hypertension and diabetes. Since hyperlipidemia is a chronic disease of high importance, like hypertension and diabetes, this study selected hyperlipidemia as the disease to be analyzed. We developed a model for predicting hyperlipidemia using SVM and meta-learning algorithms, which are already known to have excellent predictive power. To achieve the purpose of this study, we used the Korea Health Panel 2012 data set. The Korea Health Panel produces basic data on the level of health expenditure, health level, and health behavior, and has conducted an annual survey since 2008. In this study, 1,088 patients with hyperlipidemia were randomly selected from the hospitalized, outpatient, emergency, and chronic disease data of the Korea Health Panel in 2012, and 1,088 non-patients were also randomly extracted, for a total of 2,176 people. Three methods were used to select input variables for predicting hyperlipidemia. First, a stepwise method was performed using logistic regression. Among the 17 variables, the categorical variables (except for length of smoking) were expressed as dummy variables, which are treated as separate variables relative to the reference group, and these variables were analyzed.
Six variables (age, BMI, education level, marital status, smoking status, gender), excluding income level and smoking period, were selected at a significance level of 0.1. Second, C4.5, a decision tree algorithm, was used. The significant input variables were age, smoking status, and education level. Finally, genetic algorithms were used. For SVM, the input variables selected by genetic algorithms consisted of 6 variables: age, marital status, education level, economic activity, smoking period, and physical activity status. For the artificial neural network, the input variables selected by genetic algorithms consisted of 3 variables: age, marital status, and education level. Based on the selected variables, we compared SVM, meta-learning algorithms, and other prediction models for hyperlipidemia patients, and compared the classification performances using TP rate and precision. The main results of the analysis are as follows. First, the accuracy of the SVM was 88.4% and the accuracy of the artificial neural network was 86.7%. Second, the accuracy of classification models using the input variables selected through the stepwise method was slightly higher than that of classification models using all variables. Third, the precision of the artificial neural network was higher than that of SVM when only the three input variables selected by decision trees were used. For classification models based on the input variables selected through the genetic algorithm, the classification accuracy of SVM was 88.5% and that of the artificial neural network was 87.9%. Finally, this study indicated that stacking, the meta-learning algorithm proposed in this study, performs best when it uses the predicted outputs of SVM and MLP as input variables of SVM, which serves as the meta classifier. The purpose of this study was to predict hyperlipidemia, one of the representative chronic diseases.
To do this, we used SVM and meta-learning algorithms, which are known to have high accuracy. As a result, the classification accuracy for hyperlipidemia with stacking as the meta learner was higher than that of other meta-learning algorithms. However, the predictive performance of the meta-learning algorithm proposed in this study is the same as that of SVM, the best-performing single model (88.6%). The limitations of this study are as follows. First, various variable selection methods were tried, but most variables used in the study were categorical dummy variables. With a large number of categorical variables, models well suited to categorical variables, such as decision trees, may fit better than general models such as neural networks, so the results may differ if continuous variables are used. Despite these limitations, this study is significant in predicting hyperlipidemia with hybrid models such as meta-learning algorithms, which have not been studied previously. The improvement in model accuracy obtained by applying various variable selection techniques is also meaningful. In addition, our proposed model is expected to be effective for the prevention and management of hyperlipidemia.

A Study on Utilization Strategy of Big Data for Local Administration by Analyzing Cases (사례분석을 통한 지방행정의 빅데이터 활용 전략)

  • Noh, Kyoo-Sung
    • Journal of Digital Convergence / v.12 no.1 / pp.89-97 / 2014
  • As the value of Big Data is recognized and Government 3.0 has been announced, interest in Big Data is growing. However, it will not be easy for each public institution or local government to apply Big Data systematically and achieve successful results while lacking a specific plan or strategy. This study therefore suggests strategies for using Big Data after organizing the areas in which local governments can utilize it. As a result, the utilization areas of Big Data in local administration are divided into four: recognizing and responding to abnormal phenomena, predicting and responding to the near future, responding to analyzed situations and developing new policy (administration services), and citizen-customized services. In addition, the following strategies for using Big Data are suggested: a stepwise approach, user requirements analysis, implementation based on critical success factors, pilot projects, result evaluation, performance-based incentives, and building common infrastructure.

Risk Factors Associated with Clinical Insomnia in Chronic Low Back Pain: A Retrospective Analysis in a University Hospital in Korea

  • Kim, Shin Hyung;Sun, Jong Min;Yoon, Kyung Bong;Moon, Joo Hwa;An, Jong Rin;Yoon, Duck Mi
    • The Korean Journal of Pain / v.28 no.2 / pp.137-143 / 2015
  • Background: Insomnia is increasingly recognized as a clinically important symptom in patients with chronic low back pain (CLBP). In this retrospective study, we determined risk factors associated with clinical insomnia in CLBP patients in a university hospital in Korea. Methods: Data from 481 CLBP patients were analyzed. The Insomnia Severity Index (ISI) was used to determine the presence of clinical insomnia (ISI score ${\geq}15$). Patients' demographics and pain-related factors were evaluated by logistic regression analysis to identify risk factors for clinical insomnia in CLBP. Results: 43% of patients reported mild to severe insomnia after the development of back pain, and 20% of patients met the criteria for clinically significant insomnia (ISI score ${\geq}15$). In a stepwise multivariate analysis, high pain intensity, the presence of comorbid musculoskeletal pain and neuropathic pain components, and a high level of depression were strongly associated with clinical insomnia in CLBP. Among these factors, the presence of comorbid musculoskeletal pain other than back pain was the strongest determinant, with the highest odds ratio of 8.074 (95% CI 4.250 to 15.339) for predicting clinical insomnia. Conclusions: Insomnia should be addressed as an integral part of pain management in CLBP patients with these risk factors, especially in patients suffering from CLBP with comorbid musculoskeletal pain.
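The odds ratio and confidence interval quoted in this abstract can be related with a small calculation. The sketch below assumes, as is usual for logistic regression output but not stated in the abstract, that the interval is a Wald interval symmetric on the log-odds scale with a critical value of 1.96. Under that assumption the point estimate is the geometric mean of the two bounds, and the standard error of the log odds ratio can be recovered from the interval width. The function name `log_scale_summary` is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Recover the point estimate and log-scale standard error from a CI.

    Assumes the interval is exp(log_or +/- z * se), i.e. a Wald interval.
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    point = math.exp((log_lo + log_hi) / 2)  # geometric mean of the bounds
    se = (log_hi - log_lo) / (2 * z)         # half-width on the log scale / z
    return point, se


# Bounds reported in the abstract for comorbid musculoskeletal pain.
point, se = log_scale_summary(4.250, 15.339)
print(f"implied odds ratio: {point:.3f}, SE of log(OR): {se:.3f}")
```

The implied point estimate agrees closely with the reported odds ratio of 8.074, which supports reading the interval as symmetric on the log scale rather than on the raw odds-ratio scale.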