• Title/Abstract/Keywords: Linear predictive model

288 search results (processing time: 0.024 s)

Optimization of Predictors of Ewing Sarcoma Cause-specific Survival: A Population Study

  • Cheung, Min Rex
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 15, No. 10
    • /
    • pp.4143-4145
    • /
    • 2014
  • Background: This study used receiver operating characteristic (ROC) curves to analyze Surveillance, Epidemiology and End Results (SEER) Ewing sarcoma (ES) outcome data. The aim was to identify and optimize ES-specific survival prediction models and sources of survival disparities. Materials and Methods: This study analyzed socio-economic, staging and treatment factors available in the SEER database for ES. Data from 1844 patients diagnosed between 1973 and 2009 were used. For the risk modeling, each factor was fitted by a Generalized Linear Model to predict the outcome (bone and joint specific death, yes/no). The area under the ROC curve was computed. Similar strata were combined to construct the most parsimonious models. Results: The mean follow-up time (S.D.) was 74.48 (89.66) months. 36% of the patients were female. The mean (S.D.) age was 18.7 (12) years. SEER staging had the highest ROC area (S.D.) among the factors tested, 0.616 (0.032). The four-level risk scheme (local, regional, distant, un-staged) was simplified to a three-tiered model: non-metastatic (I and II) versus metastatic (III) versus un-staged. The ROC area (S.D.) of the three-tiered model was 0.612 (0.008). Several other biologic factors were also predictive of ES-specific survival, but the socio-economic factors tested here were not. Conclusions: ROC analysis measured and optimized the performance of ES survival prediction models. Optimized models will provide a more efficient way to stratify patients for clinical trials.
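The area under the ROC curve mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: it uses an ordinal stage code directly as the risk score instead of fitted Generalized Linear Model probabilities, and the function name `roc_auc` and the toy data are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (Mann-Whitney form).

    scores: risk score per patient (higher means higher predicted risk).
    labels: 1 if the event occurred, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # event case ranked above non-event case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not SEER data): 0 = local, 1 = regional, 2 = distant.
stage = [0, 0, 1, 1, 2, 2, 2, 0]
died = [0, 0, 0, 1, 1, 1, 0, 1]
auc = roc_auc(stage, died)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a factor with no discriminating power and 1.0 to perfect separation, which is the scale on which the reported values of about 0.61 should be read.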

FORECAST OF SOLAR PROTON EVENTS WITH NOAA SCALES BASED ON SOLAR X-RAY FLARE DATA USING NEURAL NETWORK

  • Jeong, Eui-Jun;Lee, Jin-Yi;Moon, Yong-Jae;Park, Jongyeop
    • Journal of the Korean Astronomical Society
    • /
    • Vol. 47, No. 6
    • /
    • pp.209-214
    • /
    • 2014
  • In this study we develop a set of solar proton event (SPE) forecast models for the NOAA scales using a Multi Layer Perceptron (MLP), a neural network method, with GOES solar X-ray flare data from 1976 to 2011. Our MLP models are the first attempt to forecast the SPE scales by a neural network method. Combinations of X-ray flare class, impulsive time, and location are used as input data. We run a number of trials, varying the number of layers and nodes as well as the combinations of input data. To find the best model, we use the sum of F-scores weighted by SPE scale, where the F-score is the harmonic mean of PODy (recall) and precision (positive predictive value), in order to minimize both misses and false alarms. We find that the MLP models are much better than the multiple linear regression model and that the one-layer MLP model gives the best result.
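The model-selection score described in this abstract can be sketched in Python as follows. The abstract does not state the weights or the event counts, so the scale labels, weights and counts below are hypothetical placeholders chosen only to show the arithmetic.

```python
def f_score(hits, misses, false_alarms):
    """Harmonic mean of recall (PODy) and precision for one event class."""
    if hits == 0:
        return 0.0
    recall = hits / (hits + misses)
    precision = hits / (hits + false_alarms)
    return 2.0 * precision * recall / (precision + recall)


def weighted_f_sum(counts_by_scale, weights):
    """Sum of per-scale F-scores, each multiplied by its scale weight."""
    return sum(weights[scale] * f_score(*counts)
               for scale, counts in counts_by_scale.items())


# Hypothetical (hits, misses, false alarms) per scale and hypothetical weights.
counts = {"S1": (8, 2, 2), "S2": (3, 1, 3)}
weights = {"S1": 1.0, "S2": 2.0}
total = weighted_f_sum(counts, weights)
print(f"weighted F-score sum = {total:.2f}")
```

Because the F-score falls when either misses or false alarms increase, maximizing this sum penalizes both kinds of error, with stronger events counting more if they are given larger weights.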

Rapid seismic vulnerability assessment by new regression-based demand and collapse models for steel moment frames

  • Kia, M.;Banazadeh, M.;Bayat, M.
    • Earthquakes and Structures
    • /
    • Vol. 14, No. 3
    • /
    • pp.203-214
    • /
    • 2018
  • Predictive demand and collapse fragility functions are two essential components of probabilistic seismic demand analysis and are commonly developed from statistics that require enormous, costly and time-consuming data gathering. Although this approach might be justified for research purposes, it is not appealing for practical applications because of its computational cost. Thus, in this paper, Bayesian regression-based demand and collapse models are proposed to eliminate the need for time-consuming analyses. The demand model, developed in the form of a linear equation, predicts the overall maximum inter-story drift of low- to mid-rise regular steel moment resisting frames (SMRFs), while the collapse model, expressed mathematically as a lognormal cumulative distribution function, provides the probability of collapse for a given spectral acceleration at the fundamental period of the structure. Next, as an application, the proposed demand and collapse functions are implemented in a seismic fragility analysis to develop fragility curves and, consequently, seismic demand curves for three example buildings. The accuracy of the proposed models, taking the reduction in computation into account, is compared with results obtained directly from Incremental Dynamic Analysis, which is a computer-intensive procedure.
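A collapse model expressed as a lognormal cumulative distribution function can be sketched in Python as below. This shows only the general functional form; the parameter names `median_sa` and `beta` and the numeric values are assumptions for illustration, not the regression-based parameters proposed in the paper.

```python
import math


def collapse_probability(sa, median_sa, beta):
    """Lognormal CDF: P(collapse | Sa = sa).

    sa:        spectral acceleration at the fundamental period (g).
    median_sa: spectral acceleration with 50% collapse probability (g).
    beta:      logarithmic standard deviation (dispersion), > 0.
    """
    if sa <= 0.0:
        return 0.0
    z = (math.log(sa) - math.log(median_sa)) / beta
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical parameters, for illustration only.
for sa in (0.5, 1.0, 1.5, 2.0, 3.0):
    p = collapse_probability(sa, median_sa=1.5, beta=0.4)
    print(f"Sa = {sa:.1f} g -> P(collapse) = {p:.3f}")
```

By construction the curve passes through 0.5 at the median and rises monotonically with spectral acceleration; a larger `beta` flattens it.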

Is Health Locus of Control a Modifying Factor in the Health Belief Model for Prediction of Breast Self-Examination?

  • Tahmasebi, Rahim;Noroozi, Azita
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 17, No. 4
    • /
    • pp.2229-2233
    • /
    • 2016
  • Background: Breast cancer is one of the most common cancers among women in the world. Early detection is necessary to improve outcomes and decrease related costs. The aim of this study was to assess the predictive power of health locus of control as a modifying factor in the Health Belief Model (HBM) for prediction of breast self-examination (BSE). Materials and Methods: In this cross-sectional study, 400 women were selected through convenience sampling from health centers. Data were collected using part of Champion's HBM scale (CHBMS), the Health Locus of Control Scale and a self-administered questionnaire. Data were analyzed in SPSS using the independent t-test, the chi-square test, and logistic and linear regression models. Results: The results showed that 10.9% of the participants reported performing BSE regularly. Health locus of control, as a modifying factor, did not act as a predictor of BSE. In this study, perceived self-efficacy was the strongest predictor of BSE performance (Exp (B) =1.863) with a direct effect, while awareness had both direct and indirect influence. Conclusions: To increase BSE, it is necessary to improve self-efficacy, especially in young women, and to increase knowledge about cancer.

A Comparative Study of Estimation by Analogy using Data Mining Techniques

  • Nagpal, Geeta;Uddin, Moin;Kaur, Arvinder
    • Journal of Information Processing Systems
    • /
    • Vol. 8, No. 4
    • /
    • pp.621-652
    • /
    • 2012
  • Software estimations provide an inclusive set of directives for software project developers, project managers, and management in order to produce more realistic estimates based on deficient, uncertain, and noisy data. A range of estimation models is being explored in industry, as well as in academia for research purposes, but choosing the best model is difficult. Estimation by Analogy (EbA) is a form of case-based reasoning that uses fuzzy logic, grey system theory, machine-learning techniques, and similar methods for optimization. This research compares the estimation accuracy of some conventional data mining models with that of a hybrid model. The data mining models under consideration include linear regression models, such as ordinary least squares and ridge regression, and nonlinear models, such as neural networks, support vector machines, and multivariate adaptive regression splines. A precise and comprehensible predictive model based on the integration of GRA and regression has been introduced and compared. Empirical results have shown that regression used with GRA gives outstanding results, indicating that the methodology has great potential and can be used as a candidate approach for software effort estimation.

Fatigue reliability analysis of steel bridge welding member by fracture mechanics method

  • Park, Yeon-Soo;Han, Suk-Yeol;Suh, Byoung-Chul
    • Structural Engineering and Mechanics
    • /
    • Vol. 19, No. 3
    • /
    • pp.347-359
    • /
    • 2005
  • This paper attempts to develop an analytical model for estimating fatigue damage using a linear elastic fracture mechanics method. The stress history on a welding member when a truck passes over a bridge was defined as a block loading, and crack closure theory was used. These theories explain the influence of a load on a structure. This study analyzed the stress range frequency considering both dead load stress and crack opening stress. A probabilistic method was applied to the stress range frequency distribution, and its distribution parameters were obtained by the Maximum Likelihood Method and Determinant. Monte Carlo simulation, which generates random variates (stress ranges), output the failure block loadings. The probability distribution of the failure block loadings was likewise obtained by the Maximum Likelihood Method and Determinant. From this, the fatigue reliability against fatigue failure of a welding member can be calculated. The failure block loading divided by the average daily truck traffic gives the predicted remaining life in days. Fatigue reliability analysis was carried out with the proposed model for the welding member of the bottom flange of a cross beam and the vertical stiffener of a steel box bridge. Results showed that the primary factor affecting failure time was crack opening stress, so determining the crack opening stress was important when using the proposed model. Failure times were also given for reliability levels of 50%, 90%, and 99.9%.
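The remaining-life relation stated in this abstract (failure block loadings divided by average daily truck traffic) can be sketched in Python as below. The empirical-quantile step is a simplification introduced here for illustration, since the paper fits a probability distribution by maximum likelihood instead; all names and numbers are hypothetical.

```python
def remaining_life_days(failure_block_loadings, avg_daily_truck_traffic):
    """Predicted remaining life in days: block loadings to failure / trucks per day."""
    return failure_block_loadings / avg_daily_truck_traffic


def life_at_reliability(simulated_loadings, reliability, avg_daily_truck_traffic):
    """Life (days) that roughly a fraction `reliability` of simulated members exceed.

    Uses a plain empirical quantile of the simulated failure block loadings.
    """
    ordered = sorted(simulated_loadings)
    index = min(int((1.0 - reliability) * len(ordered)), len(ordered) - 1)
    return remaining_life_days(ordered[index], avg_daily_truck_traffic)


# Hypothetical simulation output: block loadings to failure for ten runs.
samples = [1_000_000 * k for k in range(1, 11)]
adtt = 1000  # hypothetical average daily truck traffic

print(remaining_life_days(3_650_000, adtt))      # 3650.0 days
print(life_at_reliability(samples, 0.5, adtt))   # median-based life
```

Asking for a higher reliability selects a lower quantile of the simulated loadings, so the corresponding life estimate is shorter or equal.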

Prediction of Larix kaempferi Stand Growth in Gangwon, Korea, Using Machine Learning Algorithms

  • Hyo-Bin Ji;Jin-Woo Park;Jung-Kee Choi
    • Journal of Forest and Environmental Science
    • /
    • Vol. 39, No. 4
    • /
    • pp.195-202
    • /
    • 2023
  • In this study, we sought to compare and evaluate the accuracy and predictive performance of machine learning algorithms for estimating the growth of individual Larix kaempferi trees in Gangwon Province, Korea. We employed linear regression, random forest, XGBoost, and LightGBM algorithms to predict tree growth using monitoring data organized based on different thinning intensities. Furthermore, we compared and evaluated the goodness-of-fit of these models using metrics such as the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). The results revealed that XGBoost provided the highest goodness-of-fit, with an R² value of 0.62 across all thinning intensities, while also yielding the lowest values for MAE and RMSE, thereby indicating the best model fit. When predicting the growth volume of individual trees after 3 years using the XGBoost model, the agreement was exceptionally high, reaching approximately 97% for all stand sites in accordance with the different thinning intensities. Notably, in non-thinned plots, the predicted volumes were approximately 2.1 m³ lower than the actual volumes; however, the agreement remained highly accurate at approximately 99.5%. These findings will contribute to the development of growth prediction models for individual trees using machine learning algorithms.
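The three goodness-of-fit metrics named in this abstract follow standard definitions, shown in the Python sketch below. The function name `regression_metrics` and the sample values are hypothetical and unrelated to the study's data.

```python
import math


def regression_metrics(actual, predicted):
    """Return R^2, MAE and RMSE for paired observed and predicted values."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(a - p) for a, p in zip(actual, predicted)) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Hypothetical observed and predicted growth values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

A higher R² and lower MAE and RMSE indicate a better fit; RMSE penalizes large individual errors more heavily than MAE, as the single large miss in the toy data shows.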

Prediction of compressive strength of sustainable concrete using machine learning tools

  • Lokesh Choudhary;Vaishali Sahu;Archanaa Dongre;Aman Garg
    • Computers and Concrete
    • /
    • Vol. 33, No. 2
    • /
    • pp.137-145
    • /
    • 2024
  • Experimentally determining concrete's compressive strength for a given mix design is time-consuming and difficult. The goal of the current work is to propose the best-performing predictive model among different machine learning algorithms, namely Gradient Boosting Machine (GBM), Stacked Ensemble (SE), Distributed Random Forest (DRF), Extremely Randomized Trees (XRT), Generalized Linear Model (GLM), and Deep Learning (DL), that can forecast the compressive strength of a ternary geopolymer concrete mix without carrying out any experimental procedure. A geopolymer mix uses supplementary cementitious materials obtained as industrial by-products instead of cement. The input variables used for assessing the best machine learning algorithm include not only the individual ingredient quantities but also the molarity of the alkali activator and the age of testing. Based on a range of statistical parameters used to measure the effectiveness of the models in forecasting the compressive strength of the ternary geopolymer concrete mix, GBM was found to perform better than all other algorithms. A sensitivity analysis carried out towards the end of the study suggests that the GBM model predicts results close to the experimental conditions, with an accuracy between 95.6% and 98.2% for the testing and training datasets.

Application of Google Search Queries for Predicting the Unemployment Rate for Koreans in Their 30s and 40s

  • 정재운;황진호
    • Journal of Digital Convergence
    • /
    • Vol. 17, No. 9
    • /
    • pp.135-145
    • /
    • 2019
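  • Amid a prolonged recession, Korea's youth unemployment rate has remained at a high level of around 10% for several years, while the unemployment rate of people in their 30s and 40s, the core economically active population, has recently been rising. To expand and strengthen the government's existing youth-centered employment promotion and unemployment welfare policies to cover diverse age groups, including people in their 30s and 40s, research on unemployment forecasting models for each age group is needed. This study therefore sought to develop an unemployment rate forecasting model specific to Koreans in their 30s and 40s, using unemployment rate data from Statistics Korea and Google search terms. Using the unemployment rate data and a seasonal autoregressive integrated moving average model, a baseline model (Model 1) was estimated as a multiple linear regression model, and Google search query information was then added to Model 1 to obtain an improved model (Model 2). As a result, for both the 30s and the 40s age groups, Model 2, which additionally used Google search queries, showed better predictive power than Model 1. This indicates that web search queries remain useful for improving unemployment rate forecasting models for Korea. Although further research is needed for practical application, this study is expected to contribute to research on age-specific unemployment rate forecasting.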

Verification of Cardiac Electrophysiological Features as a Predictive Indicator of Drug-Induced Torsades de pointes

  • 유예담;정다운;;임기무
    • The Korean Society of Medical and Biological Engineering: Journal of Biomedical Engineering Research
    • /
    • Vol. 43, No. 1
    • /
    • pp.19-26
    • /
    • 2022
  • The Comprehensive in vitro Proarrhythmic Assay (CiPA) project was launched to solve a problem of the hERG assay: because of its high sensitivity, it classifies some low-risk drugs as high-risk. CiPA presented a protocol to predict drug toxicity using physiological data calculated from an in-silico model. In this study, features calculated through the in-silico model are analyzed for their correlation with changes in the action potential in the near future, and the features are verified through their predictive performance on different drug datasets. Using the O'Hara Rudy model modified by Dutta et al., Pearson correlation analysis was performed between 13 features (dVm/dtmax, APpeak, APresting, APD90, APD50, APDtri, Capeak, Caresting, CaD90, CaD50, CaDtri, qNet, qInward) calculated at 100 pacing and dVm/dtmax_repol calculated at 1,000 pacing, and linear regression analysis was performed on each of the 12 training drugs, the 16 verification drugs, and all 28 drugs. Indicators showing a high coefficient of determination (R²) in the training drug dataset were qNet 0.93, APresting 0.83, APDtri 0.78, Caresting 0.76, dVm/dtmax 0.63, and APD90 0.61. Indicators showing a high coefficient of determination in the verification drug dataset were APDtri 0.94, APD90 0.92, APD50 0.85, CaD50 0.84, qNet 0.76, and CaD90 0.64. Indicators with a high coefficient of determination for all 28 drugs were qNet 0.78, APD90 0.74, and qInward 0.59. The predictive performance of the indicators varies with the drug dataset, and qNet showed consistently high performance of 0.7 or more on the training drug dataset, the verification drug dataset, and the entire drug dataset.
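The link between the Pearson correlation and the coefficient of determination used in this abstract can be sketched in Python as below. For a simple linear regression with one predictor, R² equals the square of the Pearson correlation coefficient. The function names and the feature values are hypothetical and are not taken from the in-silico model.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simple_regression_r2(x, y):
    """R^2 of a one-predictor least-squares line, which equals r squared."""
    return pearson_r(x, y) ** 2


# Hypothetical feature values per drug (x) and a target quantity (y).
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
target = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(feature, target)
print(f"r = {r:.3f}, R^2 = {simple_regression_r2(feature, target):.3f}")
```

Because R² discards the sign of the correlation, a feature that decreases strongly with the target scores as highly as one that increases with it.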