• Title/Summary/Keyword: Random Error

Comparative Assessment of Linear Regression and Machine Learning for Analyzing the Spatial Distribution of Ground-level NO2 Concentrations: A Case Study for Seoul, Korea (서울 지역 지상 NO2 농도 공간 분포 분석을 위한 회귀 모델 및 기계학습 기법 비교)

  • Kang, Eunjin;Yoo, Cheolhee;Shin, Yeji;Cho, Dongjin;Im, Jungho
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.6_1
    • /
    • pp.1739-1756
    • /
    • 2021
  • Atmospheric nitrogen dioxide (NO2) is mainly caused by anthropogenic emissions. It contributes to the formation of secondary pollutants and ozone through chemical reactions, and adversely affects human health. Although ground stations that monitor NO2 concentrations in real time are operated in Korea, it is difficult to analyze the spatial distribution of NO2 concentrations from them, especially over areas with no stations. Therefore, this study conducted a comparative experiment of spatial interpolation of NO2 concentrations based on two linear regression methods (i.e., multiple linear regression (MLR) and regression kriging (RK)) and two machine learning approaches (i.e., random forest (RF) and support vector regression (SVR)) for the year 2020. The four approaches were compared using leave-one-out cross-validation (LOOCV). The daily LOOCV results showed that MLR, RK, and SVR produced an average daily index of agreement (IOA) of 0.57, which was higher than that of RF (0.50). The average daily normalized root mean square error of RK was 0.9483%, which was slightly lower than those of the other models. MLR, RK, and SVR showed similar seasonal distribution patterns, and the dynamic range of the resultant NO2 concentrations from these three models was similar, while that from RF was relatively small. Multivariate linear regression approaches are expected to be promising methods for spatial interpolation of ground-level NO2 concentrations and other parameters in urban areas.
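
The evaluation described in this abstract (leave-one-out cross-validation scored with an index of agreement and a normalized RMSE) can be sketched as follows. This is only an illustration on synthetic data, not the authors' code: the predictors are hypothetical, the IOA is assumed to be Willmott's form, and the RMSE is assumed to be normalized by the observed mean, since the abstract does not state either choice.

```python
import numpy as np

def index_of_agreement(obs, pred):
    """Willmott's index of agreement (1 = perfect match); assumed form."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    obs_mean = obs.mean()
    num = np.sum((pred - obs) ** 2)
    den = np.sum((np.abs(pred - obs_mean) + np.abs(obs - obs_mean)) ** 2)
    return 1.0 - num / den

def nrmse_percent(obs, pred):
    """RMSE divided by the observed mean, in percent (normalization is an assumption)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    rmse = np.sqrt(np.mean((pred - obs) ** 2))
    return 100.0 * rmse / obs.mean()

def loocv_mlr(X, y):
    """Leave-one-out predictions from an ordinary least-squares MLR."""
    X = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    preds = np.empty(len(y))
    for i in range(len(y)):
        train = np.arange(len(y)) != i         # hold out station i
        coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        preds[i] = X[i] @ coef
    return preds

# Synthetic "stations": 30 sites with 3 hypothetical predictors
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = 20.0 + X @ np.array([3.0, -2.0, 1.0]) + rng.normal(scale=1.0, size=30)

preds = loocv_mlr(X, y)
ioa = index_of_agreement(y, preds)
nrmse = nrmse_percent(y, preds)
```

Each station is predicted by a model that never saw it, which is what makes LOOCV a fair proxy for interpolation over areas without stations.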

Estimation of TROPOMI-derived Ground-level SO2 Concentrations Using Machine Learning Over East Asia (기계학습을 활용한 동아시아 지역의 TROPOMI 기반 SO2 지상농도 추정)

  • Choi, Hyunyoung;Kang, Yoojin;Im, Jungho
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.2
    • /
    • pp.275-290
    • /
    • 2021
  • Sulfur dioxide (SO2) in the atmosphere is mainly generated from anthropogenic emission sources. It forms ultra-fine particulate matter through chemical reactions and has harmful effects on both the environment and human health. In particular, ground-level SO2 concentrations are closely related to human activities. Satellite observations such as TROPOMI (TROPOspheric Monitoring Instrument)-derived column density data can provide spatially continuous monitoring of ground-level SO2 concentrations. This study proposes a 2-step residual corrected model to estimate ground-level SO2 concentrations through the synergistic use of satellite data and numerical model output. Random forest machine learning was adopted in the 2-step residual corrected model. The proposed model was evaluated through three cross-validations (i.e., random, spatial, and temporal). The results showed that the model produced slopes of 1.14-1.25, R values of 0.55-0.65, and relative root-mean-square errors (rRMSE) of 58-63%, which were improved by 10% for slopes and 3% for R and rRMSE when compared to the model without residual correction. By country, model performance was slightly lower in Japan, where the sample size was small and the concentration level was relatively low, often resulting in overestimation. The spatial and temporal distributions of SO2 produced by the model agreed with those of the in-situ measurements, especially over the Yangtze River Delta in China and the Seoul Metropolitan Area in South Korea, which are highly dependent on the characteristics of anthropogenic emission sources. The model proposed in this study can be used for long-term monitoring of ground-level SO2 concentrations in both the spatial and temporal domains.

Estimation of Surface fCO2 in the Southwest East Sea using Machine Learning Techniques (기계학습법을 이용한 동해 남서부해역의 표층 이산화탄소분압(fCO2) 추정)

  • HAHM, DOSHIK;PARK, SOYEONA;CHOI, SANG-HWA;KANG, DONG-JIN;RHO, TAEKEUN;LEE, TONGSUP
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY
    • /
    • v.24 no.3
    • /
    • pp.375-388
    • /
    • 2019
  • Accurate evaluation of sea-to-air $CO_2$ flux and its variability is crucial to understanding the global carbon cycle and predicting atmospheric $CO_2$ concentration. $fCO_2$ observations are sparse in space and time in the East Sea. In this study, we derived high resolution time series of surface $fCO_2$ values in the southwest East Sea by feeding sea surface temperature (SST), salinity (SSS), chlorophyll-a (CHL), and mixed layer depth (MLD) values, from either satellite observations or numerical model outputs, to three machine learning models. The root mean square error of the best performing model, a Random Forest (RF) model, was $7.1\ \mu\mathrm{atm}$. Important parameters in predicting $fCO_2$ in the RF model were SST and SSS along with time information; CHL and MLD were much less important than the other parameters. The net $CO_2$ flux in the southwest East Sea, calculated from the $fCO_2$ predicted by the RF model, was $-0.76 \pm 1.15\ \mathrm{mol\,m^{-2}\,yr^{-1}}$, close to the lower bound of the previous estimates, which range from $-0.66$ to $-2.47\ \mathrm{mol\,m^{-2}\,yr^{-1}}$. The time series of $fCO_2$ predicted by the RF model showed significant variation even over an interval as short as a week. For accurate evaluation of the $CO_2$ flux in the Ulleung Basin, it is necessary to conduct high resolution in situ observations in spring, when $fCO_2$ changes rapidly.

Generation of Daily High-resolution Sea Surface Temperature for the Seas around the Korean Peninsula Using Multi-satellite Data and Artificial Intelligence (다종 위성자료와 인공지능 기법을 이용한 한반도 주변 해역의 고해상도 해수면온도 자료 생산)

  • Jung, Sihun;Choo, Minki;Im, Jungho;Cho, Dongjin
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.5_2
    • /
    • pp.707-723
    • /
    • 2022
  • Although satellite-based sea surface temperature (SST) is advantageous for monitoring large areas, spatiotemporal data gaps frequently occur due to various environmental or mechanical causes. Thus, it is crucial to fill in the gaps to maximize its usability. In this study, daily SST composite fields with a resolution of 4 km were produced through a two-step machine learning approach using polar-orbiting and geostationary satellite SST data. The first step was SST reconstruction based on the Data-Interpolating Convolutional Auto-Encoder (DINCAE) using multi-satellite-derived SST data. The second step corrected the reconstructed SST toward in situ measurements using a light gradient boosting machine (LGBM) to produce the final daily SST composite fields. The DINCAE model was validated using random masks for 50 days, whereas the LGBM model was evaluated using leave-one-year-out cross-validation (LOYOCV). The SST reconstruction accuracy was high, with an $R^2$ of 0.98 and a root-mean-square error (RMSE) of 0.97℃. The accuracy gain from the second step was also high when compared to in situ measurements, with an RMSE decrease of 0.21-0.29℃ and an MAE decrease of 0.17-0.24℃. The SST composite fields generated using all in situ data in this study were comparable with the existing data-assimilated SST composite fields. In addition, the LGBM model in the second step greatly reduced the overfitting that was reported as a limitation in the previous study that used random forest. The spatial distribution of the corrected SST was similar to those of existing high resolution SST composite fields, revealing that spatial details of oceanic phenomena such as fronts, eddies, and SST gradients were well simulated. This research demonstrated the potential to produce high resolution seamless SST composite fields using multi-satellite data and artificial intelligence.
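
The leave-one-year-out cross-validation (LOYOCV) mentioned in this abstract can be illustrated with a small splitting helper. This is a generic sketch, not the authors' pipeline: the function name `leave_one_year_out` and the sample years are hypothetical.

```python
import numpy as np

def leave_one_year_out(years):
    """Yield (held_out_year, train_idx, test_idx) for each distinct year."""
    years = np.asarray(years)
    for year in np.unique(years):
        test = np.flatnonzero(years == year)   # all samples from the held-out year
        train = np.flatnonzero(years != year)  # samples from every other year
        yield int(year), train, test

# Hypothetical acquisition years for six in situ match-ups
sample_years = [2019, 2019, 2020, 2020, 2021, 2021]
folds = list(leave_one_year_out(sample_years))
```

Holding out a whole year at a time keeps temporally adjacent samples from leaking between training and test sets, which a purely random split would not guarantee.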

Study on water quality prediction in water treatment plants using AI techniques (AI 기법을 활용한 정수장 수질예측에 관한 연구)

  • Lee, Seungmin;Kang, Yujin;Song, Jinwoo;Kim, Juhwan;Kim, Hung Soo;Kim, Soojun
    • Journal of Korea Water Resources Association
    • /
    • v.57 no.3
    • /
    • pp.151-164
    • /
    • 2024
  • In water treatment plants supplying potable water, managing the chlorine concentration in processes involving pre-chlorination or intermediate chlorination requires process control. To address this, research has been conducted on water quality prediction techniques utilizing AI technology. This study developed an AI-based predictive model for automating the process control of chlorine disinfection, targeting the prediction of residual chlorine concentration downstream of sedimentation basins in water treatment processes. The AI-based model, which learns from past water quality observation data to predict future water quality, offers a simpler and more efficient approach than complex physicochemical and biological water quality models. The model was tested by predicting the residual chlorine concentration downstream of the sedimentation basins at Plant, using multiple regression models and AI-based models such as Random Forest and LSTM, and the results were compared. For optimal prediction of residual chlorine concentration, the input-output structure of the AI model included the residual chlorine concentration upstream of the sedimentation basin, turbidity, pH, water temperature, electrical conductivity, inflow of raw water, alkalinity, NH3, etc. as independent variables, and the desired residual chlorine concentration of the effluent from the sedimentation basin as the dependent variable. The independent variables were selected from data observable at the water treatment plant that influence the residual chlorine concentration downstream of the sedimentation basin. The analysis showed that, for Plant, the model based on Random Forest had the lowest error compared to multiple regression models, neural network models, model trees, and other Random Forest models. The optimal predicted residual chlorine concentration downstream of the sedimentation basin presented in this study is expected to enable real-time control of chlorine dosing in previous treatment stages, thereby enhancing water treatment efficiency and reducing chemical costs.

Assessment of Visual satisfaction & Visual Function with Prescription Swimming goggles In-air and Underwater (도수 수경 착용시 실내와 수중에서의 시각적 만족도 및 시력 평가)

  • Chu, Byoung-Sun
    • Journal of Korean Ophthalmic Optics Society
    • /
    • v.18 no.4
    • /
    • pp.357-363
    • /
    • 2013
  • Purpose: To investigate visual function with prescription swimming goggles. Methods: 15 university students (mean age: $22 \pm 1.54$ years) participated, with a mean distance refractive error of RE: S-1.67 D/C-0.40 D, LE: S-1.70 D/C-0.37 D. Inclusion criteria were no ocular pathology, the ability to wear soft contact lenses to correct refractive error to emmetropia, and the ability to swim. Participants were fitted with contact lenses to correct all ametropia. Subjective satisfaction with visual acuity, asthenopia, and balance was measured using a questionnaire while wearing swimming goggles with cylinder (C+1.50 D, Ax $90^{\circ}$) compared with plano sphere outside the swimming pool area. Visual acuity was assessed using the same ETDRS chart. The prescription swimming goggle powers were assessed in random order and ranged from S+3.00 D to S-3.00 D in 0.50 D steps. Results: Subjective evaluation was significantly worse for the swimming goggles with cylinder than for the plano powered goggles for all 3 questions (visual acuity, asthenopia, and balance). Visual acuity was significantly affected by the power of the swimming goggles (p<0.05), but there was no significant difference between the in-air in-clinic and underwater in-swimming pool measures (p=0.173). However, visual acuity measured in the clinic was significantly better than underwater for some swimming goggle powers (+3.00, +1.00, +0.50, 0, -1.00 and -2.00 D). Conclusions: Wearing swimming goggles underwater may degrade visual acuity compared to in air, but as the difference is less than 1 line of Snellen acuity, it is unlikely to result in significant real-life effects. An incorrect cylinder correction was found to be detrimental, resulting in lower satisfaction scores. Considering the slippery floors of swimming pool areas, this can be a potential risk factor. Therefore, it is important to correct astigmatism as well as any other refractive error in swimming goggles.

Examination of Dose Change at the Junction at the Time of Treatment Using Multi-Isocenter Volumetric Modulated Arc Therapy (용적조절호형방사선치료(VMAT)의 다중치료중심(Multi- Isocenter)을 이용한 치료 시, 접합부(Junction)의 선량 변화에 대한 고찰)

  • Jung, Dong Min;Park, Kwang Soon;Ahn, Hyuk Jin;Choi, Yoon Won;Park, Byul Nim;Kwon, Yong Jae;Moon, Sung Gong;Lee, Jong Oon;Jeong, Tae Sik;Park, Ryeong Hwang;Kim, Se young;Kim, Mi Jung;Baek, Jong Geol;Cho, Jeong Hee
    • The Journal of Korean Society for Radiation Therapy
    • /
    • v.33
    • /
    • pp.9-14
    • /
    • 2021
  • This study examined dose change depending on the repositioning error of the junction at the time of treatment with multi-isocenter volumetric modulated arc therapy. A random treatment region was selected in the ArcCHECK phantom, and a treatment plan for multi-isocenter volumetric modulated arc therapy was established. Then, after setting the error of the junction at 0 ~ 4 mm in the X (left), Y (upper), and Z (inner and outer) directions, the area was irradiated using a linear accelerator; the point doses and gamma indexes obtained through the phantom were subsequently analyzed. When errors of 2 and 4 mm occurred in the X and Y directions, the gamma pass rates (point doses) were 99.3% (2.085) and 98% (2.079 Gy) in the former direction and 98.5% (2.088) and 95.5% (2.093 Gy) in the latter direction, respectively. In addition, when errors of 1, 2, and 4 mm occurred in the inner and outer parts of the Z direction, the gamma pass rates (point doses) were 94.8% (2.131), 82.6% (2.164), and 72.8% (2.22 Gy) in the former part and 93.4% (2.069), 90.6% (2.047), and 79.7% (1.962 Gy) in the latter part, respectively. In the X and Y directions, errors up to 4 mm were tolerable; however, in the Z direction, error values exceeding 1 mm were beyond the tolerance level. This suggests that, for high and low dose areas, the dose distribution is more sensitive to errors along the direction in which the treatment region progresses. If guidelines for set-up errors are established at the institutional level through continued research, it will be possible to provide good quality treatment using junctions.
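
The gamma pass rates quoted in this abstract come from a gamma-index comparison of measured and planned dose. A minimal 1D sketch of that metric is shown below on synthetic profiles. The 3%/3 mm acceptance criteria, the global normalization, and the Gaussian dose profile are assumptions chosen for illustration; the abstract does not state the criteria actually used.

```python
import numpy as np

def gamma_index_1d(x, dose_ref, dose_eval, dta_mm=3.0, dd_frac=0.03):
    """Global 1D gamma index of dose_eval against dose_ref on grid x (mm)."""
    x = np.asarray(x, float)
    dose_ref = np.asarray(dose_ref, float)
    dose_eval = np.asarray(dose_eval, float)
    dd_abs = dd_frac * dose_ref.max()              # global dose-difference criterion
    gamma = np.empty(len(x))
    for i in range(len(x)):
        dist = (x - x[i]) / dta_mm                 # scaled distance to every evaluated point
        diff = (dose_eval - dose_ref[i]) / dd_abs  # scaled dose difference to every evaluated point
        gamma[i] = np.sqrt(dist ** 2 + diff ** 2).min()
    return gamma

def pass_rate(gamma):
    """Percentage of points with gamma <= 1."""
    return 100.0 * np.mean(gamma <= 1.0)

# Synthetic profile on a 1 mm grid, and copies shifted by 1 mm and 10 mm
x = np.arange(0.0, 100.0, 1.0)
ref = 2.0 * np.exp(-((x - 50.0) / 20.0) ** 2)
shifted_1mm = 2.0 * np.exp(-((x - 51.0) / 20.0) ** 2)
shifted_10mm = 2.0 * np.exp(-((x - 60.0) / 20.0) ** 2)

rate_small = pass_rate(gamma_index_1d(x, ref, shifted_1mm))
rate_large = pass_rate(gamma_index_1d(x, ref, shifted_10mm))
```

A shift smaller than the distance-to-agreement criterion leaves every point passing, while a larger shift fails in the steep-gradient regions first, which mirrors the way the reported pass rates fall as the junction error grows.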

Estimation of Genetic Parameters for Litter Size and Sex Ratio in Yorkshire and Landrace Pigs (요크셔종과 랜드레이스종의 산자수 및 성비에 대한 유전모수 추정)

  • Lee, Kyung-Soo;Kim, Jong-Bok;Lee, Jeong-Koo
    • Journal of Animal Science and Technology
    • /
    • v.52 no.5
    • /
    • pp.349-356
    • /
    • 2010
  • This study was conducted to estimate heritabilities, repeatabilities, and rank correlation coefficients among breeding values for litter size and sex ratio of Yorkshire and Landrace pigs using various single-trait animal models. The analyses were carried out on data comprising 26,390 litters of Yorkshire and 26,173 litters of Landrace collected from 1998 to 2008 at a private swine breeding farm located in the central part of Korea. Five different analytical models were used for genetic parameter estimation. Model 1 was the simplest basic model, fitted with a year-month contemporary group fixed effect, a random additive genetic effect, and a random residual effect. Model 2 was similar to model 1 but with the permanent maternal environmental effect added as a random effect, and model 3 was similar to model 2 but with linear and quadratic effects of sow age added as fixed covariates. Model 4 was similar to model 2 except that parity was added as a fixed effect, and model 5 was similar to model 3 or model 4 but with the sow age covariate nested within the parity effect. The results obtained in this study are summarized as follows: The means and standard errors of total number of pigs born per litter (TNB) and number of pigs born alive per litter (NBA) were $11.35 \pm 0.02$ and $10.04 \pm 0.02$ for Yorkshire, and $10.97 \pm 0.02$ and $9.98 \pm 0.02$ for Landrace, respectively. The sex ratio (percentage of females per litter) was $45.75 \pm 0.11\%$ and $45.75 \pm 0.11\%$ for Yorkshire and Landrace, respectively. The heritability estimates of TNB (0.243) and NBA (0.192) from model 1 tended to be higher than those from the other models in both breeds. Differences in heritability and repeatability for TNB were not large among models 3, 4, and 5, and the same tendency of negligible differences among estimates from models 3, 4, and 5 was observed for NBA, where heritability and repeatability ranged from 0.096 to 0.099 and from 0.188 to 0.193, respectively, in Yorkshire, and from 0.092 to 0.098 and from 0.193 to 0.196, respectively, in Landrace. The heritability estimates for sex ratio were close to zero, ranging from 0.002 to 0.003 for TNB and from 0.001 to 0.003 for NBA over the models applied. The rank correlation coefficients of breeding values from model 1 with those from the other models (models 2, 3, 4, and 5), and of breeding values from model 2 with those from the other models (models 1, 3, 4, and 5), were highly positive but lower than the coefficients among breeding values from models 3, 4, and 5, which were approximately 0.99 for TNB and NBA in both breeds.
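
For orientation, the two variance ratios named in this abstract are conventionally defined as below for a single-trait repeatability animal model of the kind described as model 2. The notation is an assumption for illustration and is not taken from the paper: $b$ denotes fixed effects, $a$ additive genetic effects, $p$ permanent environmental effects, and $e$ residuals, with incidence matrices $X$, $Z$, and $W$.

$$y = Xb + Za + Wp + e$$

$$h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_p^2 + \sigma_e^2}, \qquad r = \frac{\sigma_a^2 + \sigma_p^2}{\sigma_a^2 + \sigma_p^2 + \sigma_e^2}$$

Under these definitions repeatability $r$ can never be smaller than heritability $h^2$, which is consistent with the reported ranges (for example, 0.096 to 0.099 versus 0.188 to 0.193 in Yorkshire). Model 1 corresponds to dropping the $Wp$ term, that is, $\sigma_p^2 = 0$.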

Study of Motion Effects in Cartesian and Spiral Parallel MRI Using Computer Simulation (컴퓨터 시뮬레이션을 이용한 직각좌표 및 나선주사 방식의 병렬 자기공명 영상에서 움직임 효과 연구)

  • Park, Sue-Kyeong;Ahn, Chang-Beom;Sim, Dong-Gyu;Park, Ho-Chong
    • Investigative Magnetic Resonance Imaging
    • /
    • v.12 no.2
    • /
    • pp.123-130
    • /
    • 2008
  • Purpose: Motion effects in parallel magnetic resonance imaging (MRI) are investigated. Parallel MRI is known to be robust to motion due to its reduced acquisition time. However, if involuntary motions such as cardiac or respiratory motion occur during the acquisition of parallel MRI, motion artifacts can be even worse than those in conventional (non-parallel) MRI. In this paper, we define several types of motion and investigate their effects in parallel MRI in comparison with conventional MRI. Materials and Methods: In order to investigate motion effects in parallel MRI, 5 types of motion are considered. Type-1 and 2 are periodic motions with different amplitudes and periods. Type-3 and 4 are segment-based linear motions, in which the object is stationary during each segment. Type-5 is a uniform random motion. For the simulation, Cartesian and spiral grid based parallel and non-parallel (conventional) MRI are used. Results: Based on the motions defined, motion artifacts in the parallel and non-parallel MRI are investigated. In the simulation, non-parallel MRI shows smaller root mean square error (RMSE) values than parallel MRI for the periodic (type-1 and 2) motions. Parallel MRI shows fewer motion artifacts for the linear (type-3 and 4) motions, where motion is reduced by the shorter acquisition time. Similar motion artifacts are observed for the random motion (type-5). Conclusion: In this paper, we simulate the motion effects in parallel MRI. Parallel MRI is effective in reducing motion artifacts when motion is reduced by the shorter acquisition time. However, conventional MRI shows better image quality than parallel MRI when fast periodic motions are involved.

Predicting Crime Risky Area Using Machine Learning (머신러닝기반 범죄발생 위험지역 예측)

  • HEO, Sun-Young;KIM, Ju-Young;MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.21 no.4
    • /
    • pp.64-80
    • /
    • 2018
  • In Korea, citizens have access only to general information about crime, so it is difficult for them to know how much they are exposed to it. If the police could predict crime-risk areas, it would be possible to respond to crime efficiently even with insufficient police and enforcement resources. However, there is no prediction system in Korea, and related research is scarce. Against this background, the final goal of this study is to develop an automated crime prediction system. As a first step, we built a big data set consisting of real local crime information and urban physical and non-physical data. Then, we developed a crime prediction model using machine learning methods. Finally, we assumed several possible scenarios, calculated the probability of crime, and visualized the results on a map to improve public understanding. Among the factors affecting crime occurrence identified in previous and case studies, the following data were processed into a big data set for machine learning: real crime information, weather information (temperature, rainfall, wind speed, humidity, sunshine, insolation, snowfall, cloud cover), and local information (average building coverage, average floor area ratio, average building height, number of buildings, average appraised land value, average area of residential buildings, average number of ground floors). Among supervised machine learning algorithms, the decision tree, random forest, and SVM models, which are known to be powerful and accurate in various fields, were used to construct the crime prediction model. The decision tree model, which had the lowest RMSE, was selected as the optimal prediction model. Based on this model, several scenarios were set for theft and violence cases, which are the most frequent in the case city J, and the probability of crime was estimated on a $250 \times 250$ m grid. As a result, we found that high crime-risk areas occur in three patterns in the case city J. The probability of crime was divided into three classes and visualized on a map with a $250 \times 250$ m grid. In conclusion, we developed a crime prediction model using a machine learning algorithm and visualized the crime-risk areas on a map, and the model can be recalculated and the result visualized as time and urban conditions change.