• Title/Summary/Keyword: Linear regression models

Application of Predictive Microbiology for Microbiological Shelf Life Estimation of Fresh-cut Salad with Short-term Temperature Abuse (PMP 모델을 활용한 시판 Salad의 Short-term Temperature Abuse 시 미생물학적 유통기한 예측에의 적용성 검토)

  • Lim, Jeong-Ho;Park, Kee-Jea;Jeong, Jin-Woong;Kim, Hyun-Soo;Hwang, Tae-Young
    • Food Science and Preservation / v.19 no.5 / pp.633-638 / 2012
  • The aim of this study was to investigate the growth of aerobic bacteria in fresh-cut salad over 72 h of storage that included short-term temperature abuse (exposure to $4{\sim}30^{\circ}C$ for 1, 2, and 3 h), and to develop predictive models for the growth of total viable cells (TVC) based on predictive food microbiology (PFM). The tool used, the Pathogen Modeling Program (PMP 7.0), predicts the growth of Aeromonas hydrophila (broth culture, aerobic) at pH 5.6, NaCl 2.5%, and sodium nitrite 150 ppm for 72 h. Linear models were fitted by linear regression analysis with the DMFit program to the results obtained at 5, 10, 20, and $30^{\circ}C$ for 72 h ($r^2$ > 0.9). Secondary models for the growth rate and lag time, as functions of storage temperature, were developed using a polynomial model. The initial contamination level of the fresh-cut salad was 5.6 log CFU/mL of TVC, and the growth rate of TVC during the 72 h of storage was 0.020~1.083 CFU/mL/h ($r^2$ > 0.9). The growth tendency of TVC was similar to that predicted by PMP (growth rate: 0.017~0.235 CFU/mL/h; $r^2=0.994{\sim}1.000$). The shelf life predicted with PMP was 24.1~626.5 h, whereas the estimated shelf life of the fresh-cut salads with short-term temperature abuse was 15.6~31.1 h. The predicted shelf life was more than twice the observed one. This result indicates a 'fail safe' model. Such an approach can be taken to a ludicrous extreme by adopting a model that always predicts growth of a pathogenic microorganism, even under conditions so strict that growth is actually impossible.
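
The primary-model step in the abstract above (a straight line fitted to log counts against storage time, with the slope read as the growth rate) can be sketched as follows. This is a minimal illustration, not the DMFit or PMP procedure: the observations, the spoilage limit of 7.0 log CFU/mL, and the function names `fit_line` and `shelf_life_hours` are all hypothetical, and lag time is ignored.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def shelf_life_hours(initial_log, rate, limit_log):
    """Hours until the fitted line reaches the limit (lag time ignored)."""
    if rate <= 0:
        raise ValueError("growth rate must be positive")
    return (limit_log - initial_log) / rate


# Hypothetical observations: storage time (h) and TVC (log CFU/mL).
hours = [0, 12, 24, 48, 72]
log_counts = [5.6, 5.9, 6.2, 6.8, 7.4]

intercept, slope = fit_line(hours, log_counts)
# Hypothetical spoilage limit of 7.0 log CFU/mL, chosen for illustration only.
print(round(slope, 3), round(shelf_life_hours(intercept, slope, 7.0), 1))
```

Repeating the fit at each storage temperature would give the growth rates that a secondary (polynomial) model then relates to temperature.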

Empirical Estimation and Diurnal Patterns of Surface PM2.5 Concentration in Seoul Using GOCI AOD (GOCI AOD를 이용한 서울 지역 지상 PM2.5 농도의 경험적 추정 및 일 변동성 분석)

  • Kim, Sang-Min;Yoon, Jongmin;Moon, Kyung-Jung;Kim, Deok-Rae;Koo, Ja-Ho;Choi, Myungje;Kim, Kwang Nyun;Lee, Yun Gon
    • Korean Journal of Remote Sensing / v.34 no.3 / pp.451-463 / 2018
  • Empirical/statistical models to estimate the ground-level particulate matter ($PM_{2.5}$) concentration from the Geostationary Ocean Color Imager (GOCI) Aerosol Optical Depth (AOD) product were developed and analyzed for 2015 in Seoul, South Korea. In constructing the AOD-$PM_{2.5}$ models, two vertical correction methods, using the planetary boundary layer height and the vertical ratio of aerosol, and a humidity correction method using the hygroscopic growth factor were applied to the respective models. The vertical correction for AOD and the humidity correction for $PM_{2.5}$ concentration played an important role in improving the overall estimation accuracy. Multiple linear regression (MLR) models with additional meteorological factors (wind speed, visibility, and air temperature) affecting the AOD-$PM_{2.5}$ relationship were constructed for the whole year and for each season. As a result, the determination coefficients of the MLR models increased significantly compared with those of the empirical models. We also analyzed the seasonal, monthly, and diurnal characteristics of the AOD-$PM_{2.5}$ model. When the MLR model was constructed seasonally, the tendency of the whole-year model to underestimate high $PM_{2.5}$ cases was reduced. The monthly and diurnal patterns of observed and estimated $PM_{2.5}$ were similar. The results of this study, which estimates surface $PM_{2.5}$ concentration from geostationary satellite AOD, are expected to be applicable to the future GK-2A and GK-2B satellites.

A Study on the Determinants of Land Price in a New Town (신도시 택지개발사업지역에서 토지가격 결정요인에 관한 연구)

  • Jeong, Tae Yun
    • Korea Real Estate Review / v.28 no.1 / pp.79-90 / 2018
  • The purpose of this study was to identify the pricing factors of residential land in new towns by estimating a pricing model for residential land. For this purpose, hedonic equations for each quantile of the conditional distribution of land prices were estimated using quantile regression and the sale price data of Jangyu New Town in Gimhae. Quantile regression models the relation between a set of explanatory variables and each quantile of land price. As a result, differences in the effects of land characteristics across price quantiles were confirmed. The number of years elapsed since the completion of land construction enters the model as a quadratic term because its impact may give rise to a non-linear price pattern. Age appears to decrease the price for a certain number of years after construction and to increase it afterward. In the quantile regression estimates, land age has a statistically significant impact on land price at conventional significance levels, and the turning point comes earlier for the low quantiles than for the higher quantiles. The positive effect of land use for combined commercial and residential purposes was found to be the largest. Land adjoining more than two roads is in greater demand, because the amount of sunshine improves in that case. Square-shaped land appears to be preferred to irregularly shaped land, because square land is favorable for development. The variables for land used for commercial and residential purposes have a greater impact on low-priced residential land, because such land tends to be used mostly for rental housing and has different characteristics from land for residential houses. Residential land prices have different characteristics depending on the price level, and this needs to be considered in the evaluation of collateral value and in the drafting of real estate policy.
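
The turning point mentioned in the abstract above follows directly from the quadratic age term. As a sketch, assume the hedonic equation at quantile $\tau$ has the form below, where $P$ is the land price (or its logarithm), $\mathbf{x}$ collects the other characteristics, and the symbols $\beta_0, \beta_1, \beta_2, \boldsymbol{\gamma}$ are notation introduced here rather than taken from the paper:

```latex
Q_\tau(P \mid \text{age}, \mathbf{x})
  = \beta_0(\tau) + \beta_1(\tau)\,\text{age} + \beta_2(\tau)\,\text{age}^2
    + \mathbf{x}^{\top}\boldsymbol{\gamma}(\tau)

\frac{\partial Q_\tau}{\partial\,\text{age}}
  = \beta_1(\tau) + 2\,\beta_2(\tau)\,\text{age} = 0
\quad\Longrightarrow\quad
\text{age}^{*}(\tau) = -\frac{\beta_1(\tau)}{2\,\beta_2(\tau)}
```

With $\beta_1(\tau) < 0$ and $\beta_2(\tau) > 0$, the price falls until $\text{age}^{*}(\tau)$ and rises afterward. Because the coefficients are estimated separately for each quantile, the turning point can differ between low-priced and high-priced land.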

A Study on the Evaluative Models and Indicators for Diagnosis of Urban Visual Landscape - Focusing on Seoul City - (도시경관 진단을 위한 평가모델 및 지표개발 연구 - 서울시를 중심으로 -)

  • Kim, Seung-Ju;Im, Seung-Bin
    • Journal of the Korean Institute of Landscape Architecture / v.37 no.1 / pp.78-86 / 2009
  • Recently, problems have emerged in the urban visual landscape as a result of continuous economic growth and industrial development. At the same time, the public has begun to recognize the importance of visual resources and the necessity of visual landscape conservation and improvement. Therefore, the development of evaluative indicators for systematic visual landscape planning and design is urgent. The purpose of this study is to develop evaluative models and indicators for the diagnosis of urban visual landscapes. The study selected 18 physical indicators (statistical data) through a literature review and conducted field and questionnaire surveys in 12 autonomous districts of Seoul and at surrounding major mountains and rivers (e.g., Mt. Nam and the Han River). The questionnaire asked about scenic beauty. A linear regression analysis between the mean scenic beauty scores and the physical indicator scores yielded the scenic beauty prediction model. According to this study, the most important indicators in urban visual landscapes are 'Greens', 'Park', and 'the number of apartment buildings (higher than 20 stories)'. Based on the results, greens and parks should be priority elements to consider in urban landscape planning and design. Moreover, since the number of apartment buildings higher than 20 stories has a negative correlation with the scenic beauty score, it can be used as basic data for landscape planning. Because the scenic beauty prediction models and evaluative indicators suggest a direction for urban management, each indicator can serve as basic data for visual landscape planning and design. If physical indicators and case studies are added in subsequent studies, the scenic beauty prediction models and evaluative indicators could become more comprehensive and systematic. Moreover, the development of three-dimensional (3D) physical indicators (e.g., results from visual district analysis and view surface analysis) could be expected to yield more general and varied results.

Physical Fitness, Leisure Time Physical Activity, and Serum Lipid Levels in Middle-Aged Male Workers (중년 남성 근로자에서 신체 적합도, 여가중 신체 활동과 혈중 지질 농도)

  • Kim, Jang-Rak;Nam, Bock-Dong;Kim, Ju-Ho;Lee, Song-Kwon;Moon, Joong-Kap;Lee, Jang-Ho;Hong, Dae-Yong
    • Journal of Preventive Medicine and Public Health / v.29 no.2 s.53 / pp.173-186 / 1996
  • This is a cross-sectional study to evaluate the relationships between physical fitness, leisure time physical activity, and serum lipid levels in middle-aged male workers. Physical fitness was measured by a step test score, and leisure time physical activity was self-reported on a questionnaire. Serum total cholesterol was negatively related to physical fitness (r=-0.27) and positively related to obesity index (r=0.27). Leisure time physical activity, however, was negatively related to total cholesterol (r=-0.20) only in subjects whose total cholesterol levels were above 170 mg/dl. High density lipoprotein (HDL) cholesterol was positively related to physical fitness (r=0.15), negatively to obesity index (r=-0.22), and positively to weekly alcohol consumption (r=0.14). The total cholesterol/HDL cholesterol ratio was related to physical fitness (r=-0.23), obesity index (r=0.32), total cigarette index (r=0.13), weekly alcohol consumption (r=-0.13), and vegetable preference (r=0.13). Physical fitness was also related to leisure time physical activity (r=0.19) and obesity index (r=-0.18). In multiple linear regression models, physical fitness (beta=-0.23) and obesity index (beta=0.18) were significantly associated with total cholesterol; obesity index (beta=-0.25) with HDL cholesterol; and obesity index (beta=0.30), physical fitness (beta=-0.16), and vegetable preference (beta=0.14) with the total cholesterol/HDL cholesterol ratio. In conclusion, because physical fitness has a stronger relationship with serum lipid levels than leisure time physical activity does, and the association between physical fitness and leisure time physical activity is modest, physical fitness should be included as an important variable alongside activity in future epidemiologic studies.
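
To illustrate how regression coefficients of this kind relate to the pairwise correlations, the sketch below solves the normal equations for a two-predictor regression on z-scored variables. It assumes the reported "beta" values are standardized coefficients, which the abstract does not state, and the function name `standardized_betas` is hypothetical. The paper's models presumably included further covariates, so this two-predictor reduction is illustrative only and need not reproduce the reported values.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized coefficients for a two-predictor linear regression.

    Solves the normal equations on z-scored variables:
        beta1 + r_12 * beta2 = r_y1
        r_12 * beta1 + beta2 = r_y2
    """
    det = 1.0 - r_12 ** 2
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    beta1 = (r_y1 - r_12 * r_y2) / det
    beta2 = (r_y2 - r_12 * r_y1) / det
    return beta1, beta2


# Correlations quoted in the abstract for total cholesterol:
# fitness r = -0.27, obesity index r = 0.27, fitness vs obesity index r = -0.18.
beta_fitness, beta_obesity = standardized_betas(-0.27, 0.27, -0.18)
print(round(beta_fitness, 2), round(beta_obesity, 2))
```

Because fitness and obesity index are themselves correlated, each coefficient is smaller in magnitude than the corresponding raw correlation: part of each predictor's association with cholesterol is shared with the other.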

Prediction of Correct Answer Rate and Identification of Significant Factors for CSAT English Test Based on Data Mining Techniques (데이터마이닝 기법을 활용한 대학수학능력시험 영어영역 정답률 예측 및 주요 요인 분석)

  • Park, Hee Jin;Jang, Kyoung Ye;Lee, Youn Ho;Kim, Woo Je;Kang, Pil Sung
    • KIPS Transactions on Software and Data Engineering / v.4 no.11 / pp.509-520 / 2015
  • The College Scholastic Ability Test (CSAT) is the primary test for evaluating the academic achievement of high-school students and is used by most universities for admission decisions in South Korea. Because its level of difficulty is a significant issue for both students and universities, the government makes a great effort to keep the difficulty level consistent every year. However, the actual level of difficulty has fluctuated significantly, which causes many problems for university admission. In this paper, we build two types of data-driven prediction models to predict the correct answer rate and to identify significant factors for the CSAT English test from accumulated CSAT test data, unlike traditional methods that depend on experts' judgments. First, we derive candidate question-specific factors that can influence the correct answer rate, such as position, EBS-relation, and readability, from the annual CSAT practice tests and CSATs of 10 years. In addition, we derive context-specific factors by employing topic modeling, which identifies the underlying topics in the text. Then, the correct answer rate is predicted by multiple linear regression, and the level of difficulty is predicted by a classification tree. The experimental results show that the level-of-difficulty (difficult/easy) classification model achieves 90% accuracy, whereas the error rate for the correct answer rate is below 16%. Points and problem category are found to be critical for predicting the correct answer rate. The correct answer rate is also influenced by some of the topics discovered by topic modeling. Based on our study, it will be possible to predict the range of the expected correct answer rate at both the question level and the entire-test level, which will help CSAT examiners control the level of difficulty.

Spatio-temporal Water Quality Variations at Various Streams of Han-River Watershed and Empirical Models of Serial Impoundment Reservoirs (한강수계 하천에서의 시공간적 수질변화 특성 및 연속적 인공댐호의 경험적 모델)

  • Jeon, Hye-Won;Choi, Ji-Woong;An, Kwang-Guk
    • Korean Journal of Ecology and Environment / v.45 no.4 / pp.378-391 / 2012
  • The objective of this study was to determine temporal patterns and longitudinal gradients of water chemistry at eight artificial reservoirs and ten streams within the Han-River watershed, along the main axis from the headwaters to the downstream reaches, during 2009~2010. We also evaluated chemical relations and their variations among major trophic variables such as total nitrogen (TN), total phosphorus (TP), and chlorophyll-a (CHL-a), and determined the effects of the intense summer monsoon and annual precipitation on algal growth using empirical regression models. Stream water quality in terms of TN, TP, and other parameters degraded toward the downstream reaches and was especially affected by point sources from wastewater disposal plants near Jungrang Stream. In contrast, summer river runoff and rainwater improved the stream water quality in terms of TP, TN, and ionic content, measured as conductivity (EC), in the downstream reach. Empirical linear regression models of log-transformed CHL-a against log-transformed TN, TP, and TN : TP mass ratios in five reservoirs indicated that the variation in TP accounted for 33.8% ($R^2$=0.338, p<0.001, slope=0.710) of the variation in CHL-a, whereas the variation in TN accounted for only 21.4% ($R^2$=0.214, p<0.001). Overall, our study suggests that primary production, estimated as CHL-a, was determined more by ambient phosphorus loading than by nitrogen in the lentic systems of artificial reservoirs, and that stream water quality in the lotic ecosystems was influenced more by the locations of point sources on tributary streams and by intense seasonal rainfall than by the presence of artificial dam reservoirs along the main axis of the watershed.
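
The log-log regression described in the abstract above can be sketched in a few lines. This is a minimal illustration with made-up TP and CHL-a values, not the study's data, and the function name `loglog_fit` is hypothetical. For a single predictor, $R^2$ equals the squared Pearson correlation of the log-transformed variables, which is what the function returns.

```python
import math


def loglog_fit(x, y):
    """Fit log10(y) = intercept + slope * log10(x); return (intercept, slope, r2)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    syy = sum((b - my) ** 2 for b in ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return intercept, slope, r2


# Hypothetical TP and CHL-a values (ug/L), not data from the study.
tp = [10.0, 20.0, 35.0, 50.0, 80.0, 120.0]
chl = [2.1, 4.8, 4.0, 9.5, 7.2, 15.0]
intercept, slope, r2 = loglog_fit(tp, chl)
print(round(slope, 3), round(r2, 3))
```

A slope below 1 on the log-log scale, such as the reported 0.710 for TP, means CHL-a rises less than proportionally with the nutrient concentration.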

Tibial Torsion in Children of the Jeju Area (제주지역 소아의 경골 염전)

  • Song, Dong Ho;Eun, Baik-Lin;Park, Sang Hee;Lee, Joon Young;Tockgo, Young Chang
    • Clinical and Experimental Pediatrics / v.48 no.1 / pp.75-80 / 2005
  • Purpose: Internal tibial torsion is prevalent in East Asian countries such as Korea and Japan, where sitting on the floor is common behavior. Internal tibial torsion or excessive lateral tibial torsion may cause esthetic, functional, or psychological problems and may also induce degenerative arthritis in older age. The purpose of this study is to measure tibial torsion in children of the Jeju area. Methods: Tibial torsion was measured in 1,042 lower extremities of 521 children from one to 12 years of age. The transmalleolar angles were analyzed by age group in six-month intervals. Quadratic and linear regression models were used to fit the pattern of change in the mean transmalleolar angle. Age seven, which gave the highest coefficient of determination in the quadratic regression analysis, was used as the cut-off point for fitting different statistical models. Results: The mean transmalleolar angle was $0.10{\pm}5.79^{\circ}$ in all children, $0.90{\pm}5.49^{\circ}$ in males, and $-0.80{\pm}5.97^{\circ}$ in females. The value was $4.25{\pm}4.04^{\circ}$ at one year of age, gradually decreased to its lowest level of $-1.98^{\circ}$ at four years and seven months of age, increased again with age until it reached $0.67{\pm}1.10^{\circ}$ at seven years of age, and stayed at that level thereafter. Conclusion: Internal tibial torsion in infancy is known to correct spontaneously during normal development. In this study, however, the mean transmalleolar angle in children of the Jeju area decreased each year after one year of age, reached its lowest value at four years and seven months, increased gradually until age seven, and then persisted at that level, about $10^{\circ}$ less than in Western children, without further correction. These findings suggest that tibial torsion might be caused by lifestyle, especially sitting on the feet. To prevent abnormalities of joints and gait, early diagnosis of tibial torsion in childhood, with posture correction or early treatment when needed, seems necessary.

Development of Prediction Model for the Na Content of Leaves of Spring Potatoes Using Hyperspectral Imagery (초분광 영상을 이용한 봄감자의 잎 Na 함량 예측 모델 개발)

  • Park, Jun-Woo;Kang, Ye-Seong;Ryu, Chan-Seok;Jang, Si-Hyeong;Kang, Kyung-Suk;Kim, Tae-Yang;Park, Min-Jun;Baek, Hyeon-Chan;Song, Hye-Young;Jun, Sae-Rom;Lee, Su-Hwan
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.4 / pp.316-328 / 2021
  • In this study, a leaf Na content prediction model for spring potato was established using a 400-1000 nm hyperspectral sensor, with the goal of developing a multispectral sensor for salinity monitoring in reclaimed land. The irrigation conditions were standard, drought, and salinity (2, 4, 8 dS/m), and the irrigation amount was calculated from the amount of evaporation. Leaf Na content was measured in the first and second weeks after the start of irrigation in the vegetative, tuber formative, and tuber growing periods, respectively. The leaf reflectance was converted from 5 nm to 10 nm, 25 nm, and 50 nm of FWHM (full width at half maximum) based on 10 nm wavelength intervals. Using the variable importance in projection of partial least squares regression (PLSR-VIP), ten band ratios were selected as the variables for predicting salinity damage levels from the Na content of spring potato leaves. MLR (multiple linear regression) models were estimated by removing the band ratios one by one, starting with the lowest weight among the ten band ratios. Model performance was compared not only by $R^2$ and MAPE but also by the number of band ratios and the optimal FWHM, in order to develop a compact multispectral sensor. An FWHM of 25 nm was advantageous for predicting leaf Na content in spring potatoes in the first and second weeks of the vegetative and tuber formative periods and in the second week of the tuber growing period. The 15 selected bandpass filters, mainly in the red and red-edge regions, were 430/440, 490/500, 500/510, 550/560, 570/580, 590/600, 640/650, 650/660, 670/680, 680/690, 690/700, 700/710, 710/720, 720/730, and 730/740 nm.
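
The two performance measures named in the abstract above, $R^2$ and MAPE, can be computed as in the sketch below when comparing candidate models with different numbers of band ratios. The observed values, the two sets of predictions, and the function names are hypothetical and chosen only for illustration; they are not results from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed must be non-zero)."""
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


# Hypothetical leaf Na contents and two candidate models' predictions.
na_observed = [0.20, 0.40, 0.50, 0.80, 1.00]
candidates = {
    "10 band ratios": [0.22, 0.38, 0.52, 0.78, 1.01],
    "3 band ratios": [0.25, 0.36, 0.55, 0.74, 1.05],
}
for name, predicted in candidates.items():
    print(name, round(r_squared(na_observed, predicted), 3),
          round(mape(na_observed, predicted), 1))
```

A model with fewer band ratios needs fewer bandpass filters, so a small loss in $R^2$ or MAPE may be an acceptable trade for a more compact sensor.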

Nonlinear Vector Alignment Methodology for Mapping Domain-Specific Terminology into General Space (전문어의 범용 공간 매핑을 위한 비선형 벡터 정렬 방법론)

  • Kim, Junwoo;Yoon, Byungho;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.127-146 / 2022
  • As word embedding has shown excellent performance in various deep learning-based natural language processing tasks, research on the advancement and application of word, sentence, and document embedding is being actively conducted. Among these topics, cross-language transfer, which enables semantic exchange between different languages, is growing alongside the development of embedding models. Academic interest in vector alignment is growing with the expectation that it can be applied to various embedding-based analyses. In particular, vector alignment is expected to be applied to mapping between specialized domains and generalized domains. In other words, it should become possible to map the vocabulary of specialized fields such as R&D, medicine, and law into the space of a pre-trained language model learned from a huge volume of general-purpose documents, or to provide a clue for mapping vocabulary between different specialized fields. However, the linear vector alignment that has mainly been studied in academia assumes statistical linearity and therefore tends to simplify the vector space. It essentially assumes that different types of vector spaces are geometrically similar, which causes inevitable distortion in the alignment process. To overcome this limitation, we propose a deep learning-based vector alignment methodology that effectively learns the nonlinearity of the data. The proposed methodology consists of the sequential learning of a skip-connected autoencoder and a regression model that align the specialized word embeddings expressed in each space to the general embedding space. Finally, through inference with the two trained models, the specialized vocabulary can be aligned in the general space. To verify the performance of the proposed methodology, an experiment was performed on a total of 77,578 documents in the field of 'health care' among national R&D tasks performed from 2011 to 2020. The results confirmed that the proposed methodology outperformed the existing linear vector alignment in terms of cosine similarity.
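
The evaluation criterion named in the abstract above, cosine similarity between aligned specialized vectors and their general-space counterparts, can be sketched as follows for a linear map. This does not implement the proposed skip-connected autoencoder, and it does not learn the map. It only scores a given matrix; the 2-D vectors, the rotation matrix, and all function names are hypothetical.

```python
import math


def apply_linear_map(weights, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_alignment_score(weights, source_vectors, target_vectors):
    """Average cosine similarity between mapped source vectors and their targets."""
    scores = [
        cosine_similarity(apply_linear_map(weights, s), t)
        for s, t in zip(source_vectors, target_vectors)
    ]
    return sum(scores) / len(scores)


# Hypothetical 2-D embeddings; the map is a 90-degree rotation chosen for illustration.
rotation = [[0.0, -1.0], [1.0, 0.0]]
specialized = [[1.0, 0.0], [0.0, 2.0]]
general = [[0.0, 1.0], [-1.0, 0.0]]
print(mean_alignment_score(rotation, specialized, general))
```

A single matrix applies the same transformation everywhere in the space, which is the geometric-similarity assumption the abstract identifies as the limitation of linear alignment.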