• Title/Summary/Keyword: Models, statistical

An Alternative Simplified Approach in Solving for the Inelastic Buckling Strengths of Singly Symmetric Non-Compact Stepped I-Beams (일축대칭 비조밀 스텝 I형보의 비탄성 좌굴강도 산정을 위한 단순방법)

  • Alolod, Shane;Park, Jong Sup
    • KSCE Journal of Civil and Environmental Engineering Research / v.39 no.1 / pp.123-134 / 2019
  • This paper proposes a new design equation for the inelastic lateral torsional buckling (LTB) of singly symmetric stepped I-beams with non-compact flange sections. The proposed equation was generated using a finite element program, ABAQUS, and a statistical program, MINITAB. The parameters used were the stepped-beam parameters $\alpha$, $\beta$, and $\gamma$ and the length-to-height ratio ($L_b/h$) of the beam. The proposed equation was further validated by experimental tests in which beams were subjected to four-point bending and supported by rollers and lateral braces near the end supports. In addition, finite element models were simulated using the same parameters as in the experimental tests to verify the test results. The LTB capacity calculated from the proposed equation was shown to be accurate and conservative in comparison with the values obtained from the FEM and the actual tests, making it a reliable and safe approach for calculating the buckling capacities of singly symmetric stepped beams with non-compact flange sections.

A new sample selection model for overdispersed count data (과대산포 가산자료의 새로운 표본선택모형)

  • Jo, Sung Eun;Zhao, Jun;Kim, Hyoung-Moon
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.733-749 / 2018
  • Sample selection arises as a result of the partial observability of the outcome of interest in a study. Heckman introduced a sample selection model to analyze such data and proposed a full maximum likelihood estimation method under the assumption of normality. Recently, sample selection models for binomial and Poisson response variables have been proposed. Based on the theory of symmetry-modulated distributions, we extend these to a model for overdispersed count data. In the absence of sample selection, this type of data is often modeled using the negative binomial distribution. Hence, we propose a sample selection model for overdispersed count data using the negative binomial distribution. A real data application is presented. Simulation studies reveal that our estimation method based on the profile log-likelihood is stable.
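
To make "overdispersed" concrete: a Poisson count has variance equal to its mean, whereas a negative binomial count with mean mu and dispersion r has variance mu + mu^2 / r. The sketch below checks this numerically using only the standard library. It assumes the common mean/dispersion (NB2) parameterization, and the values of `mu` and `r` are hypothetical; it does not reproduce the paper's sample selection model or its profile log-likelihood estimation.

```python
import math


def neg_binomial_pmf(y: int, mu: float, r: float) -> float:
    """P(Y = y) for a negative binomial with mean mu and dispersion r (NB2 form)."""
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def moments(pmf, upper: int = 2000):
    """Approximate the mean and variance of a count pmf by summing over 0..upper-1."""
    probs = [pmf(y) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


# Hypothetical parameter values, chosen only for illustration.
mu, r = 3.0, 2.0
mean, var = moments(lambda y: neg_binomial_pmf(y, mu, r))

# The variance exceeds the mean, which a Poisson model cannot represent.
print(f"mean={mean:.4f}, variance={var:.4f}, mu + mu^2/r={mu + mu**2 / r:.4f}")
```

As r grows, the extra term mu^2 / r shrinks and the distribution approaches the Poisson case.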

Distinct cell subtype composition using gene expression data in oral cancer (유전자 발현 데이터 기반 구강암에서의 세포 조성 차이 분석)

  • Rhee, Je-Keun
    • Journal of the Korea Convergence Society / v.10 no.8 / pp.59-65 / 2019
  • Cancer tissues contain various subtypes of cells, but it is hard to confirm their composition experimentally. Here, we estimated the cell composition of each sample from gene expression data using statistical machine learning approaches, namely two different regression models, and investigated whether the cell composition differed between cancer and normal tissue. We found that CD8 T cells and neutrophils were increased in oral cancer tissues compared to normal tissues. In addition, we applied t-SNE, an unsupervised learning method, to verify whether normal tissue and oral cancer tissue can be clustered by the derived cell composition. Moreover, we showed that it is possible to distinguish oral cancer from normal tissue using several supervised classification algorithms. The study should help improve the understanding of immune cell infiltration in oral cancer.

Consumer behavior prediction using Airbnb web log data (에어비앤비(Airbnb) 웹 로그 데이터를 이용한 고객 행동 예측)

  • An, Hyoin;Choi, Yuri;Oh, Raeeun;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.32 no.3 / pp.391-404 / 2019
  • Customers' fixed characteristics have often been used to predict customer behavior. As customer activities move from offline to online, it has recently become possible to track customer web logs and to collect large amounts of web log data; however, researchers have focused only on organizing the log data or describing its technical characteristics. In this study, we predict the decision-making time until each customer makes the first reservation, using Airbnb customer data provided by the Kaggle website. This data set includes basic customer information, such as gender and age, as well as web logs. We use various methodologies to find the optimal model and compare prediction errors for cases with and without web log data. We consider six models, including Lasso, SVM, Random Forest, and XGBoost, to explore the effectiveness of the web log data. As a result, we choose Random Forest as our optimal model, with a misclassification rate of about 20%. In addition, we confirm that using web log data in our study doubles the prediction accuracy in predicting customer behavior compared to not using it.

Multi-tissue observation of the long non-coding RNA effects on sexually biased gene expression in cattle

  • Yoon, Joon;Kim, Heebal
    • Asian-Australasian Journal of Animal Sciences / v.32 no.7 / pp.1044-1051 / 2019
  • Objective: Recent studies have implied that gene expression has high tissue-specificity, and it is therefore essential to investigate gene expression in a variety of tissues when performing transcriptomic analysis. In addition, the gradual growth of long non-coding RNA (lncRNA) annotation databases has correspondingly increased the importance and proportion of mapped reads. Methods: We employed simple statistical models to detect the sexually biased/dimorphic genes and their conjugate lncRNAs in 40 RNA-seq samples across two factors: sex and tissue. We employed two quantification pipelines: mRNA annotation only and mRNA+lncRNA annotation. Results: The tissue-specific sexually dimorphic genes are affected by the addition of lncRNA annotation at a non-negligible level. In addition, many lncRNAs are expressed in a more tissue-specific fashion and with greater variation between tissues compared to protein-coding genes. Due to the genic region lncRNAs, the differentially expressed gene list changes, which causes certain sexually biased genes to become ambiguous across the tissues. Conclusion: A past study reported that tissue-specific patterns can be seen throughout the differentially expressed genes between sexes in cattle. Using the same dataset, this study used a more recent reference and added conjugate lncRNA information, which revealed alterations of the differentially expressed gene lists that result in an apparent distinction in the downstream analysis and interpretation. We firmly believe such misquantification of genic lncRNAs can be critical in both future and past studies.

Bigdata Prediction Support Service for Citizen Data Scientists (시민 데이터과학자를 위한 빅데이터 예측 지원 서비스)

  • Chang, Jae-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.2 / pp.151-159 / 2019
  • With the arrival of the era of big data, which is the foundation of the fourth industry, most related industries are developing solutions that focus on data storage, statistical analysis, and visualization technologies. However, for big data technology to spread, it is necessary to develop prediction analysis technologies using artificial intelligence. At present, these advanced technologies are accessible only to a small group of experts known as data scientists. For big data-related industries to develop, non-experts, called citizen data scientists, should be able to easily access the big data analysis process at low cost, because they have insight into their own data. In this paper, we propose a system for analyzing big data and building business models with the support of an easy-to-use analysis system that requires no knowledge of high-level data science. We also define the necessary components and environment for the prediction analysis system and present the overall service plan.

Principal selected response reduction in multivariate regression (다변량회귀에서 주선택 반응변수 차원축소)

  • Yoo, Jae Keun
    • The Korean Journal of Applied Statistics / v.34 no.4 / pp.659-669 / 2021
  • Multivariate regression often appears in longitudinal or functional data analysis. Since multivariate regression involves multi-dimensional response variables, it is more strongly affected by the so-called curse of dimensionality than univariate regression. To overcome this issue, Yoo (2018) and Yoo (2019a) proposed three model-based response dimension reduction methodologies. According to various numerical studies in Yoo (2019a), the default method suggested in Yoo (2019a) is the least sensitive to the simulated models, but it is not the best one. To address this issue, the paper proposes a selection algorithm that compares the other two methods with the default one. This approach is called principal selected response reduction. Various simulation studies show that the proposed method provides more accurate estimation results than the default one by Yoo (2019a), confirming the practical and empirical usefulness of the proposed method over the default one.

Penalized variable selection in mean-variance accelerated failure time models (평균-분산 가속화 실패시간 모형에서 벌점화 변수선택)

  • Kwon, Ji Hoon;Ha, Il Do
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.411-425 / 2021
  • The accelerated failure time (AFT) model represents a linear relationship between the log-survival time and covariates. We are interested in inference on covariate effects on the variation of survival times in the AFT model. Thus, we need to model the variance as well as the mean of survival times. We call the resulting model the mean and variance AFT (MV-AFT) model. In this paper, we propose a variable selection procedure for the regression parameters of the mean and variance in the MV-AFT model using a penalized likelihood function. For the variable selection, we study four penalty functions: least absolute shrinkage and selection operator (LASSO), adaptive lasso (ALASSO), smoothly clipped absolute deviation (SCAD), and hierarchical likelihood (HL). With this procedure we can select important covariates and estimate the regression parameters at the same time. The performance of the proposed method is evaluated using simulation studies. The proposed method is illustrated with a clinical example dataset.
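
As a rough illustration of how three of the penalties named above differ, the sketch below implements the textbook forms of the LASSO, adaptive lasso, and SCAD penalties for a single coefficient. The function names, the default `a = 3.7`, and the example values are assumptions made for illustration. The HL penalty and the way the paper combines penalties with the MV-AFT likelihood are not described in the abstract and are not shown.

```python
def lasso_penalty(theta: float, lam: float) -> float:
    """LASSO penalty: grows linearly with |theta|."""
    return lam * abs(theta)


def adaptive_lasso_penalty(theta: float, lam: float,
                           theta_init: float, gamma: float = 1.0) -> float:
    """Adaptive lasso: LASSO reweighted by an initial estimate theta_init (must be nonzero)."""
    return lam * abs(theta) / abs(theta_init) ** gamma


def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty: linear near zero, quadratic blend, then constant for large |theta|."""
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


# Hypothetical coefficient values: LASSO keeps penalizing large coefficients,
# while SCAD levels off once |theta| exceeds a * lam.
for theta in (0.5, 2.0, 10.0):
    print(theta, lasso_penalty(theta, 1.0), scad_penalty(theta, 1.0))
```

The flat tail of SCAD is what reduces the bias on large coefficients relative to LASSO, at the cost of a non-convex objective.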

Impact of viewing conditions on the performance assessment of different computer monitors used for dental diagnostics

  • Hastie, Thomas;Venske-Parker, Sascha;Aps, Johan K.M.
    • Imaging Science in Dentistry / v.51 no.2 / pp.137-148 / 2021
  • Purpose: This study aimed to assess the computer monitors used for analysis and interpretation of digital radiographs within the clinics of the Oral Health Centre of Western Australia. Materials and Methods: In total, 135 computer monitors (3 brands, 6 models) were assessed by analysing the same radiographic image of a combined 13-step aluminium step wedge and the Artinis CDDent 1.0® (Artinis Medical Systems B.V.®, Elst, the Netherlands) test object. The number of steps and cylindrical objects observed on each monitor was recorded along with the monitor's make, model, position relative to the researcher's eye level, and proximity to the nearest window. The number of window panels blocked by blinds, the outside weather conditions, and the number of ceiling lights over the surgical suite/cubicle were also recorded. MedCalc® version 19.2.1 (MedCalc Software Ltd®, Ostend, Belgium, https://www.medcalc.org; 2020) was used for statistical analyses (Kruskal-Wallis test and stepwise regression analysis). The level of significance was set at P<0.05. Results: Stepwise regression analysis showed that only the monitor brand and proximity of the monitor to a window had a significant impact on the monitor's performance (P<0.05). The Kruskal-Wallis test showed significant differences (P<0.05) in monitor performance for all variables investigated, except for the weather and the clinic in which the monitors were placed. Conclusion: The vast performance variation between computer monitors implies the need for a review of monitor selection, calibration, and viewing conditions.
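
For readers unfamiliar with the Kruskal-Wallis test mentioned above, the following is a minimal standard-library sketch of its H statistic: pool the observations, rank them (ties share the average rank), and compare the rank sums of the groups. The monitor scores are hypothetical, the tie correction factor is omitted for brevity, and the study itself used MedCalc rather than this code.

```python
def midranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, without the tie correction factor."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical "steps visible" scores for three monitor brands.
brand_scores = [[9, 10, 10, 11], [11, 12, 12, 13], [8, 9, 9, 10]]
print(kruskal_wallis_h(brand_scores))
```

Under the null hypothesis, H is approximately chi-squared distributed with (number of groups - 1) degrees of freedom, which is how a P value would be obtained.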

Evaluating Accuracy according to the Evaluator and Equipment Using Electronic Apex Locators

  • Yu, Beom-Young;Son, Keunbada;Lee, Kyu-Bok
    • Journal of Korean Dental Science / v.13 no.2 / pp.52-58 / 2020
  • Purpose: Using two types of electronic apex locators, this study aimed to investigate the differences in accuracy according to the evaluator and equipment. Materials and Methods: Artificial teeth of the lower first premolars and two mandibular acrylic models (A and B) were used in this study. In the artificial teeth, the pulp chamber was opened and the access cavity was prepared. Using calibrated digital Vernier calipers, the distance from the top of the cavity to the root apex was measured to assess the actual distance between two artificial teeth. The evaluation was conducted by 20 dentists, and each evaluator repeated measurements for each electronic apex locator five times. The actual distance from the top of the cavity to the root apex was compared with the distance measured using the electronic measuring equipment. For statistical analysis, the Friedman test and the Mann-Whitney U-test were conducted and the differences between groups were analyzed (α=0.05). Results: As for the accuracy of measurement according to the two types of electronic apex locators, the measurement error was 0.4753 mm for Dentaport ZX and 0.3321 mm for E-Cube Plus. Moreover, the electronic apex locators Dentaport ZX and E-Cube Plus showed statistically significant differences (P<0.05). As for the difference in the accuracy of the two types of electronic apex locators according to the evaluator, the resulting values differed depending on the evaluator and showed a statistically significant difference (P<0.001). Conclusion: The electronic apex locator E-Cube Plus showed higher accuracy than Dentaport ZX. Nevertheless, both types of electronic apex locators showed 100% accuracy in locating the region within ±0.5 mm of the root apex. Furthermore, the two electronic apex locators gave different values depending on the evaluator.
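
The two accuracy summaries reported above (a mean measurement error in millimetres and the share of readings within ±0.5 mm) can be computed as in the sketch below. It assumes that "measurement error" means the mean absolute difference from the caliper-measured distance, which the abstract does not state explicitly, and the readings are hypothetical rather than the study's data.

```python
def mean_absolute_error(measured, actual):
    """Average absolute difference between readings and the reference distance (mm)."""
    return sum(abs(m - actual) for m in measured) / len(measured)


def share_within_tolerance(measured, actual, tol=0.5):
    """Fraction of readings that fall within +/- tol mm of the reference distance."""
    return sum(abs(m - actual) <= tol for m in measured) / len(measured)


# Hypothetical reference distance and repeated readings from one evaluator (mm).
actual_distance = 20.0
readings = [19.6, 20.3, 20.5, 19.9]

print(mean_absolute_error(readings, actual_distance))
print(share_within_tolerance(readings, actual_distance))
```

A device can have a larger mean error than another and still place every reading inside the tolerance zone, which is consistent with the abstract's two conclusions.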