• Title/Summary/Keyword: Prediction of variables

Development and Testing of a RIVPACS-type Model to Assess the Ecosystem Health in Korean Streams: A Preliminary Study (저서성 대형무척추동물을 이용한 RIVPACS 유형의 하천생태계 건강성 평가법 국내 하천 적용성)

  • Da-Yeong Lee;Dae-Seong Lee;Joong-Hyuk Min;Young-Seuk Park
    • Korean Journal of Ecology and Environment / v.56 no.1 / pp.45-56 / 2023
  • RIVPACS, which gives a simple but clear evaluation based on the macroinvertebrate community, is widely used in stream ecosystem assessment. This preliminary study was conducted to develop a RIVPACS-type model suitable for streams throughout Korea. Reference streams were classified into two types (upstream and downstream), and prediction models for macroinvertebrates were developed at the family level. The upstream model was built with a 7:3 split between training and test data, and the downstream model was built using a leave-one-out method. Model variables were selected by non-metric multidimensional scaling, and seven were chosen: elevation, slope, annual average temperature, stream width, forest ratio in land use, riffle ratio in hydrological characteristics, and boulder ratio in substrate composition. A total of 3,224 sites were classified as upstream or downstream by stream order, and the community composition of each site was predicted for 30 macroinvertebrate families. Expected (E) and observed (O) fauna were compared using the ASPT biotic index, which is computed by dividing the BMWPK score by the number of families in a community. EQR values (i.e., O/E) for ASPT were used to assess stream condition. Finally, EQR was compared with BMI, an index commonly used in such assessments. The average observed ASPT was 4.82 (±2.04 SD) and the average expected ASPT was 6.30 (±0.79 SD), so the expected ASPT was higher than the observed one. EQR was generally higher than BMI.
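
The ASPT and EQR arithmetic described in this abstract can be sketched as follows. This is only an illustration of the two ratios (score total divided by family count, then O/E); the family names and scores are hypothetical placeholders, not real BMWPK values, and the function names are assumptions rather than anything from the paper.

```python
def aspt(family_scores):
    """Average Score Per Taxon: total score divided by the number of families."""
    if not family_scores:
        raise ValueError("community has no scored families")
    return sum(family_scores.values()) / len(family_scores)


def eqr(observed, expected):
    """Ecological quality ratio O/E for an index value such as ASPT."""
    if expected <= 0:
        raise ValueError("expected value must be positive")
    return observed / expected


# Hypothetical per-family scores, for illustration only (not real BMWPK values).
observed_families = {"Family A": 3, "Family B": 5, "Family C": 4}
expected_families = {"Family A": 3, "Family B": 5, "Family D": 8, "Family E": 8}

o = aspt(observed_families)  # (3 + 5 + 4) / 3
e = aspt(expected_families)  # (3 + 5 + 8 + 8) / 4
print(f"O = {o:.2f}, E = {e:.2f}, EQR = {eqr(o, e):.2f}")
```

An EQR below 1 means the observed community scores lower than the model expects for a reference-condition site of the same type.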

Prediction of Maximal Oxygen Uptake Ages 18~34 Years (18~34 남성의 최대산소 섭취량 추정)

  • Jeon, Yoo-Joung;Im, Jae-Hyeng;Lee, Byung-Kun;Kim, Chang-Hwan;Kim, Byeong-Wan
    • 한국체육학회지인문사회과학편 / v.51 no.3 / pp.373-382 / 2012
  • The purpose of this study was to predict VO2max from body indices and submaximal metabolic responses. The subjects were 250 men aged 18 to 34, randomly divided into two groups: 179 in the sample group and 71 in the cross-validation group. They underwent maximal exercise testing with the Bruce protocol, and metabolic responses were measured at the end of the first (3 minutes) and second (6 minutes) stages. To predict VO2max, stepwise multiple regression analysis was applied to the sample group. Model 1 uses weight, 6-minute HR, and 6-minute VO2 (R=0.64, SEE=4.74, CV=11.7%, p<.01), and its equation is VO2max (ml/kg/min) = 72.256 - 0.340(Weight) - 0.220(6minHR) + 0.013(6minVO2). Model 2 uses weight, 6-minute HR, 6-minute VO2, and 6-minute VCO2 (R=0.67, SEE=4.59, CV=11.3%, p<.01), and its equation is VO2max (ml/kg/min) = 68.699 - 0.277(Weight) - 0.206(6minHR) + 0.020(6minVO2) - 0.009(6minVCO2). Neither model showed multicollinearity. Model 2 showed a higher correlation than Model 1. In cross-validation with the 71 men, measured and estimated VO2max were significantly correlated for both models (R=0.53 and 0.56, p<.01). Both models are valid and practical given their simplicity and utility, and Model 2 is the more accurate.
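
The two regression equations in this abstract can be written out directly as functions. This is a minimal sketch: the coefficients are copied from the abstract, but the input units (weight in kg, heart rate in beats/min, VO2 and VCO2 in ml/min) are assumptions, since the abstract does not state them, and the function names are hypothetical.

```python
def vo2max_model1(weight, hr_6min, vo2_6min):
    """Model 1 from the abstract; returns estimated VO2max in ml/kg/min.

    Assumed units (not stated in the abstract): weight in kg,
    hr_6min in beats/min, vo2_6min in ml/min.
    """
    return 72.256 - 0.340 * weight - 0.220 * hr_6min + 0.013 * vo2_6min


def vo2max_model2(weight, hr_6min, vo2_6min, vco2_6min):
    """Model 2 from the abstract; adds 6-minute VCO2 (assumed ml/min)."""
    return (68.699 - 0.277 * weight - 0.206 * hr_6min
            + 0.020 * vo2_6min - 0.009 * vco2_6min)


# Hypothetical subject, for illustration only.
subject = {"weight": 70, "hr_6min": 150, "vo2_6min": 2000, "vco2_6min": 1800}
print(round(vo2max_model1(subject["weight"], subject["hr_6min"], subject["vo2_6min"]), 2))
print(round(vo2max_model2(**subject), 2))
```

Because the equations were fitted on men aged 18 to 34, applying them outside that group would be extrapolation.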

A Study on Data-driven Modeling Employing Stratification-related Physical Variables for Reservoir Water Quality Prediction (취수원 수질예측을 위한 성층 물리변수 활용 데이터 기반 모델링 연구)

  • Hyeon June Jang;Ji Young Jung;Kyung Won Joo;Choong Sung Yi;Sung Hoon Kim
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.143-143 / 2023
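  • Recently, manganese has exceeded the drinking water quality standard (0.05 mg/L or less) at regional water intake sources such as Daecheong Dam (2017) and Pyeongnim Dam (2019), prompting numerous civil complaints and highlighting the importance of manganese management at water sources. High manganese concentrations occur particularly often during the winter turnover period; water treatment plants currently handle manganese by installing filters in the inflow section and replacing them periodically. However, when a large amount of high-concentration manganese flows in over a short period, process management at the plant becomes difficult because of limits on treatment capacity, so the response system needs to be advanced through prediction. This study targeted Juam Dam, a regional water source, and applied and compared various machine learning techniques to improve the accuracy of manganese prediction and extend the prediction horizon; model accuracy was improved by optimizing the independent variables and hyperparameters. The machine learning models used turbidity by depth, reservoir water level, pH, water temperature, electrical conductivity, DO, chlorophyll-a, and meteorological and hydrological data as independent variables, and the manganese concentration flowing into the Hwasun water treatment plant as the dependent variable, with measured values of each variable used as training data. To improve the accuracy of the data-driven models, the PEA (Potential Energy Anomaly) was introduced as an indicator of the degree of stratification and used in the data analysis. The analysis confirmed that the manganese inflow concentration varies with the seasonal cycle and that high-concentration manganese from the bottom layer flows in at the time of winter turnover and during turbulence generation in the summer rainy season. Because the patterns of change in manganese concentration differ between the two periods, building and training a separate prediction model for each season improved prediction accuracy. In a performance comparison of various machine learning models, Gradient Boosting Machine performed best in winter and eXtreme Gradient Boosting in summer, and these were adopted as the inference models. In short-term water quality prediction with the selected models, tracking and prediction of the timing of turnover improved prediction efficiency by about 15% or more compared with applying the existing data model alone. This study is expected to support a rapid response system for water treatment plants through machine-learning prediction of manganese concentration and to contribute to more efficient water treatment processes; as follow-up research, the authors plan to improve model reliability by using past time-series data and linking with physical models.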
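
The abstract does not say how the PEA was computed. As a hedged illustration, the sketch below uses a commonly used definition, PEA = (1/H) ∫ (ρ̄ − ρ(z)) g z dz over the water column from z = −H to z = 0, where ρ̄ is the depth-averaged density; this choice of formula, the trapezoidal integration, and all names and profile values are assumptions, not the authors' method.

```python
G = 9.81  # gravitational acceleration, m/s^2


def trapezoid(ys, xs):
    """Trapezoidal integral of ys over xs (xs must be increasing)."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
               for i in range(len(xs) - 1))


def potential_energy_anomaly(z, rho):
    """PEA in J/m^3 for a density profile.

    z   : heights in metres, increasing from the bottom (-H) to the surface (0)
    rho : water density in kg/m^3 at each height
    """
    if len(z) != len(rho) or len(z) < 2:
        raise ValueError("z and rho must have the same length (at least 2)")
    depth = z[-1] - z[0]
    rho_mean = trapezoid(rho, z) / depth  # depth-averaged density
    integrand = [(rho_mean - r) * G * zi for r, zi in zip(rho, z)]
    return trapezoid(integrand, z) / depth


# Hypothetical 10 m water column, for illustration only.
z = [-10, -8, -6, -4, -2, 0]
mixed = [1000.0] * 6                                       # fully mixed
stratified = [1000.0, 999.6, 999.2, 998.8, 998.4, 998.0]   # lighter water on top

print(potential_energy_anomaly(z, mixed))
print(potential_energy_anomaly(z, stratified))
```

Under this definition a fully mixed column gives a PEA of zero and a stably stratified column gives a positive value, so a falling PEA could serve as a model feature signalling approaching turnover.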

Predicting Oxygen Uptake for Men with Moderate to Severe Chronic Obstructive Pulmonary Disease (COPD환자에서 6분 보행검사를 이용한 최대산소섭취량 예측)

  • Kim, Changhwan;Park, Yong Bum;Mo, Eun Kyung;Choi, Eun Hee;Nam, Hee Seung;Lee, Sung-Soon;Yoo, Young Won;Yang, Yun Jun;Moon, Joung Wha;Kim, Dong Soon;Lee, Hyang Yi;Jin, Young-Soo;Lee, Hye Young;Chun, Eun Mi
    • Tuberculosis and Respiratory Diseases / v.64 no.6 / pp.433-438 / 2008
  • Background: Measurement of maximum oxygen uptake in patients with chronic obstructive pulmonary disease (COPD) has been used to determine exercise intensity and to estimate the patient's response to treatment during pulmonary rehabilitation. However, cardiopulmonary exercise testing is not widely available in Korea. The 6-minute walk test (6MWT) is a simple method of measuring a patient's exercise capacity. With a standardized protocol, it provides highly reliable data and reflects fluctuations in exercise capacity relatively well. The primary objective of this study was to develop a regression equation for estimating peak oxygen uptake ($VO_2$) in men with moderate to very severe COPD from the results of a 6MWT. Methods: A total of 33 male patients with moderate to very severe COPD agreed to participate in this study. Pulmonary function testing, cardiopulmonary exercise testing, and a 6MWT were performed on their first visits. The index of work ($6M_{work}$ = 6-minute walk distance [6MWD] $\times$ body weight) was calculated for each patient. Variables closely related to the peak $VO_2$ were identified through correlation analysis, and an equation to predict the peak $VO_2$ was generated from these variables by multiple linear regression. Results: The peak $VO_2$ averaged $1{,}015 \pm 392$ ml/min, and the mean 6MWD was $516 \pm 195$ meters. The $6M_{work}$ (r=.597) was better correlated with the peak $VO_2$ than the 6MWD (r=.415). The other variables highly correlated with the peak $VO_2$ were $FEV_1$ (r=.742), DLco (r=.734), and FVC (r=.679). The derived prediction equation was $VO_2$ (ml/min) = $(274.306 \times FEV_1) + (36.242 \times DLco) + (0.007 \times 6M_{work}) - 84.867$. Conclusion: When measurement of the peak $VO_2$ is not possible, we consider the 6MWT a simple alternative. A trial on a much larger scale is needed to validate our prediction equation.
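
The derived prediction equation can be sketched as a small function. The coefficients come from the abstract; the units of $FEV_1$ (litres) and DLco (ml/min/mmHg) and the units of $6M_{work}$ (metres times kilograms) are assumptions, because the abstract does not state them, and the function name is hypothetical.

```python
def peak_vo2_copd(fev1_l, dlco, walk_distance_m, weight_kg):
    """Estimated peak VO2 (ml/min) from the equation quoted in the abstract.

    Assumed units (not stated in the abstract): FEV1 in litres,
    DLco in ml/min/mmHg, 6-minute walk distance in metres, weight in kg.
    """
    six_m_work = walk_distance_m * weight_kg  # 6M_work = 6MWD x body weight
    return 274.306 * fev1_l + 36.242 * dlco + 0.007 * six_m_work - 84.867


# Hypothetical patient, for illustration only.
print(round(peak_vo2_copd(fev1_l=1.0, dlco=10.0, walk_distance_m=500, weight_kg=60), 1))
```

The equation was derived from 33 patients, so the output is best read as a rough estimate pending the larger validation the authors call for.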

Predicting the Performance of Recommender Systems through Social Network Analysis and Artificial Neural Network (사회연결망분석과 인공신경망을 이용한 추천시스템 성능 예측)

  • Cho, Yoon-Ho;Kim, In-Hwan
    • Journal of Intelligence and Information Systems / v.16 no.4 / pp.159-172 / 2010
  • A recommender system is one possible way to help customers find the items they would like to purchase. A variety of recommendation techniques have been developed to date. One of the most successful is Collaborative Filtering (CF), which has been used in applications such as recommending Web pages, movies, music, articles, and products. CF identifies customers whose tastes are similar to those of a given customer and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to improve the performance of recommender systems. Broadly, there are memory-based CF algorithms, model-based CF algorithms, and hybrid CF algorithms, which combine CF with content-based techniques or other recommender systems. While many researchers have focused on improving CF performance, theoretical justification of CF algorithms is lacking; that is, little is known about how CF works. Furthermore, the relative performance of CF algorithms is known to be domain and data dependent. Implementing and launching a CF recommender system is time-consuming and expensive, and a system unsuited to the given domain gives customers poor-quality recommendations that easily annoy them. Predicting the performance of CF algorithms in advance is therefore of practical importance. In this study, we propose an efficient approach to predicting the performance of CF. Social Network Analysis (SNA) and an Artificial Neural Network (ANN) are applied to develop the prediction model. CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. SNA facilitates exploration of the topological properties of the network structure that are implicit in the data for CF recommendations. The ANN model is developed from an analysis of network topology measures: network density, inclusiveness, clustering coefficient, network centralization, and Krackhardt's efficiency. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Inclusiveness refers to the number of nodes included within the various connected parts of the social network. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. Krackhardt's efficiency characterizes how dense the social network is beyond what is barely needed to keep the social group even indirectly connected. These social network measures are the input variables of the ANN model, and the output variable is recommendation accuracy measured by the F1-measure. To evaluate the effectiveness of the ANN model, sales transaction data from H department store, one of the well-known department stores in Korea, were used. A total of 396 experimental samples were gathered, of which 40%, 40%, and 20% were used for training, test, and validation, respectively. Five-fold cross-validation was also conducted to enhance the reliability of the experiments. The input variable measuring process consists of three steps: analysis of customer similarities, construction of a social network, and analysis of social network patterns. NetMiner 3 and UCINET 6.0 were used for SNA, and Clementine 11.1 for ANN modeling. In the experiments, the ANN model had an estimated accuracy of 92.61% and an RMSE of 0.0049. The prediction model can thus help decide whether CF is useful for a given application with certain data characteristics.
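
Two of the network measures named in this abstract, density and clustering coefficient, can be sketched in plain Python for a small undirected customer network. The abstract does not say which clustering definition was used; the average of local clustering coefficients below is one common choice and is an assumption, as are the function names and the toy network.

```python
from itertools import combinations


def density(adj):
    """Links present as a proportion of the maximum possible number of links."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # each link is stored twice
    return 2 * m / (n * (n - 1))


def average_clustering(adj):
    """Mean local clustering coefficient (nodes with < 2 neighbours count as 0)."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count links that exist between pairs of this node's neighbours.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


# Hypothetical customer network: a, b, c form a triangle; d is linked only to c.
network = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(round(density(network), 3), round(average_clustering(network), 3))
```

In the study's setup, values like these would be computed per sample network and fed to the ANN as input features alongside the other measures.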

Predicting the Pre-Harvest Sprouting Rate in Rice Using Machine Learning (기계학습을 이용한 벼 수발아율 예측)

  • Ban, Ho-Young;Jeong, Jae-Hyeok;Hwang, Woon-Ha;Lee, Hyeon-Seok;Yang, Seo-Yeong;Choi, Myong-Goo;Lee, Chung-Keun;Lee, Ji-U;Lee, Chae Young;Yun, Yeo-Tae;Han, Chae Min;Shin, Seo Ho;Lee, Seong-Tae
    • Korean Journal of Agricultural and Forest Meteorology / v.22 no.4 / pp.239-249 / 2020
  • Rice flour varieties have been developed to replace wheat, and consumption of rice flour has been encouraged. However, damage from pre-harvest sprouting has occurred because of adverse weather during the ripening period. It is therefore necessary to develop a system that predicts the pre-harvest sprouting rate in order to minimize this damage. Rice cultivation experiments were conducted from 2017 to 2019 with three rice flour varieties at six regions in Gangwon-do, Chungcheongbuk-do, and Gyeongsangbuk-do. The survey items were the heading date and pre-harvest sprouting at the harvest date. Daily mean temperature, relative humidity, and rainfall were collected from the Automated Synoptic Observing System (ASOS) station with the same region name. Gradient Boosting Machine (GBM), a machine learning model, was used to predict the pre-harvest sprouting rate, with mean temperature, relative humidity, and total rainfall as training input variables. To establish the period related to pre-harvest sprouting, experiments were also run for periods from days after the heading date (DAH) to the subsequent period (DA2H). The data were divided into a training set and a validation set for calibrating the period related to pre-harvest sprouting, and a test set for validation. The training and validation sets gave the highest score for a period of 22 DAH and 24 DA2H. On the test set, the model tended to overpredict the pre-harvest sprouting rate where it was below 3.0%, but overall prediction performance was high ($R^2$=0.76). The pre-harvest sprouting rate can therefore be expected to be readily predicted from weather components for a specific period using machine learning.

Detecting the Climate Factors related to Dry Matter Yield of Whole Crop Maize (사일리지용 옥수수의 건물수량에 영향을 미치는 기후요인 탐색)

  • Peng, Jing-lun;Kim, Moon-ju;Kim, Young-ju;Jo, Mu-hwan;Nejad, Jalil Ghassemi;Lee, Bae-hun;Ji, Do-hyeon;Kim, Ji-yung;Oh, Seung-min;Kim, Byong-wan;Kim, Kyung-dae;So, Min-jeong;Park, Hyung-soo;Sung, Kyung-il
    • Korean Journal of Agricultural and Forest Meteorology / v.17 no.3 / pp.261-269 / 2015
  • The purpose of this research was to identify, through exploratory data analysis, the climate factors related to year-to-year changes in the dry matter yield (DMY) of whole crop maize (WCM). The dataset (124 varieties; n=993 in 7 provinces) was prepared by deleting and correcting insufficient and repetitive records in the results (124 varieties; n=1027 in 7 provinces) of the import adaptation experiments conducted by the National Agricultural Cooperative Federation. WCM was classified into early-maturity (25 varieties, n=200), mid-maturity (40 varieties, n=409), late-maturity (27 varieties, n=234), and others (32 varieties, n=150) based on relative maturity and days to silking. To determine climate factors, 6 weather variables were generated from weather data. SPSS 21.0 was used for descriptive statistics and the Shapiro-Wilk test. Mean DMY by year was classified into upper and lower groups, and a statistically significant difference in DMY was found between the two groups (p<0.05). Statistical analysis of the climate variables showed that Seeding-Harvesting Accumulated Growing Degree Days (SHAGDD), Seeding-Harvesting Precipitation (SHP), and Seeding-Harvesting Hours of Sunshine (SHH) differed significantly between the two groups (p<0.05), whereas Seeding-Harvesting Number of Days with Precipitation (SHDP) had no significant effect on DMY (p>0.05). These results indicate that SHAGDD, SHP, and SHH are related to the DMY of WCM. However, the $R^2$ values of the three variables could not be compared; this requires regression analysis, which, together with a DMY prediction model, is left for future study.

Impact of Communication Competence and Empathy Abilities on Interpersonal Relationship Abilities among Dental Hygiene Students (일부 치위생 전공 대학생의 의사소통능력과 공감능력이 대인관계능력에 미치는 영향)

  • Kim, Sun-Ju;Kim, Han-Hong
    • Journal of dental hygiene science / v.13 no.3 / pp.304-313 / 2013
  • The purpose of this study was to examine the influence of the communication competence and empathy abilities of dental hygiene students on their interpersonal relationship abilities. The subjects were 578 students majoring in dental hygiene at five randomly selected colleges: three in North Chungcheong province, one in the city of Daejeon, and one in South Gyeongsang province. Data were gathered using structured questionnaires from April 1 to May 7, 2013. The major findings were as follows: 1. The respondents had mean scores of $3.23 \pm 0.49$, $85.80 \pm 10.12$, and $83.27 \pm 8.37$ in interpersonal relationship abilities, communication competence, and empathy abilities, respectively. 2. Communication competence, empathy abilities, and interpersonal relationship abilities differed significantly by general characteristics, namely age, academic year, clinical practice experience, and satisfaction with major. 3. Interpersonal relationship abilities had a strong, significant positive correlation with communication competence, empathy abilities, and the subfactors of both. 4. Analysis of the variables affecting interpersonal relationship abilities showed that they were influenced by age, clinical practice experience, communication competence, and empathy abilities. These variables predicted interpersonal relationship abilities with an explanatory power of 57.2%. These findings suggest that communication competence and empathy abilities influence interpersonal relationship abilities. Curricula and educational programs should therefore be developed with these variables in mind to ensure stable college lives and successful relationship building for dental hygiene students, who are on the way to adulthood and will serve as health care personnel in the future.

Prediction of Nitrate Contamination of Groundwater in the Northern Nonsan area Using Multiple Regression Analysis (다중 회귀 분석을 이용한 논산 북부 지역 지하수의 질산성 질소 오염 예측)

  • Kim, Eun-Young;Koh, Dong-Chan;Ko, Kyung-Seok;Yeo, In-Wook
    • Journal of Soil and Groundwater Environment / v.13 no.5 / pp.57-73 / 2008
  • Nitrate concentrations of up to 49 mg/L (as $NO_3$-N) were measured in shallow and bedrock groundwater of the northern Nonsan area, and 22% of the samples exceeded the drinking water standard. Nitrate concentrations differed significantly among land use groups. To predict nitrate concentration in groundwater, multiple regression analysis was carried out using hydrogeologic parameters of soil media, topography, and land use, each categorized into several groups; well depth and altitude; and field parameters of temperature, pH, DO, and EC. Hydrogeologic parameters were quantified as the area proportions of each category within circular buffers centered on the wells. Regression was performed on all combinations of variables, and the most relevant model was selected based on the adjusted coefficient of determination (Adj. $R^2$). Regression using hydrogeologic parameters with varying buffer radii showed the highest Adj. $R^2$ at 50 m and 300 m for shallow and bedrock groundwater, respectively. Shallow groundwater had a higher Adj. $R^2$ than bedrock groundwater, indicating higher susceptibility to the hydrogeologic properties of the surface environment near the well. Land use and soil media were the major explanatory variables for shallow and bedrock groundwater, respectively, and residential area was a major variable in both. Regression involving both hydrogeologic and field parameters showed that EC, paddy, and pH were the major variables in shallow groundwater, whereas DO, EC, and natural area were the major variables in bedrock groundwater. Field parameters had much higher explanatory power than hydrogeologic parameters, suggesting that routinely measured field parameters can provide important information on each well when assessing nitrate contamination. The most relevant buffer radii can be applied to estimating the travel time of contaminants from the surface environment to wells.
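
Model selection by adjusted $R^2$, as described in this abstract, can be sketched with the standard formula Adj. $R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1)$ for n samples and p predictors. The candidate models and their $R^2$ values below are hypothetical placeholders, not results from the paper, and the function name is an assumption.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors + 1")
    return 1 - (1 - r2) * (n_samples - 1) / dof


# Hypothetical candidate models: name -> (plain R^2, number of predictors).
n = 30
candidates = {
    "land use only": (0.40, 3),
    "land use + soil media": (0.45, 6),
    "all parameters": (0.46, 10),
}

# Pick the candidate with the highest adjusted R^2.
best = max(candidates, key=lambda name: adjusted_r2(candidates[name][0], n, candidates[name][1]))
for name, (r2, p) in candidates.items():
    print(f"{name}: adj. R^2 = {adjusted_r2(r2, n, p):.3f}")
print("selected:", best)
```

Plain $R^2$ never decreases when predictors are added, so the adjustment is what lets an all-combinations search prefer a smaller model when extra variables contribute little.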

Shipping Industry Support Plan based on Research of Factors Affecting on the Freight Rate of Bulk Carriers by Sizes (부정기선 운임변동성 영향 요인 분석에 따른 우리나라 해운정책 지원 방안)

  • Cheon, Min-Soo;Mun, Ae-ri;Kim, Seog-Soo
    • Journal of Korea Port Economic Association / v.36 no.4 / pp.17-30 / 2020
  • In the shipping industry, it is essential to predict freight rate volatility preemptively through market monitoring. Considering that freight rates have already started to fall, shipping companies' losses will soon be uncontrollable. This study therefore quantified and analyzed the factors affecting the freight rates of bulk carriers, which are more volatile than container freight rates, with the aim of contributing to future shipping market monitoring. We performed an analysis using a vector error correction model and estimated the influence of six independent variables on the charter rates of bulk carriers by size: Handy Size, Supramax, Panamax, and Cape Size. The six independent variables included the bulk carrier fleet volume, iron ore traffic volume, LIBOR interest rate, bunker oil price, and Euro-Dollar exchange rate. The dependent variables were the Handy Size (32,000 DWT) spot charter rate, the Supramax 6 T/C average charter rate, the Panamax (75,000 DWT) spot charter rate, and the Cape Size (170,000 DWT) spot charter rate. The study examined charter rates by bulk carrier size, unlike existing studies of specific ship types or of fares for oil tankers and chemical carriers rather than bulk carriers. The findings revealed that the influencing factors differed for each ship size. The LIBOR interest rate had a significant effect on all four ship types, and the iron ore traffic volume had a significant effect on three ship types. The LIBOR rate showed a negative (-) relationship with Handy Size, Supramax, Panamax, and Cape Size. Iron ore traffic volume had a linear influence on three ship types, all except Panamax. The size of shipping companies differed depending on their characteristics. These findings are expected to contribute to the establishment of management strategies for shipping companies by analyzing the factors influencing changes in the freight rates of charterers, which have a profound effect on the management performance of shipping companies.