• Title/Abstract/Keywords: Missing Value

Survival Analysis of Gastric Cancer Patients with Incomplete Data

  • Moghimbeigi, Abbas;Tapak, Lily;Roshanaei, Ghodaratolla;Mahjub, Hossein
    • Journal of Gastric Cancer, v.14 no.4, pp.259-265, 2014
  • Purpose: Survival analysis of gastric cancer patients requires knowledge of the factors that affect survival time. This paper analyzes the survival of patients with incompletely registered data by using imputation methods. Materials and Methods: Three missing-data imputation methods, namely regression, the expectation maximization algorithm, and multiple imputation (MI) using Markov chain Monte Carlo methods, were applied to the data of cancer patients referred to the cancer institute at Imam Khomeini Hospital in Tehran from 2003 to 2008. The data included demographic variables, survival times, and the censoring variable for 471 patients with gastric cancer. After imputation was used to account for missing covariate data, the data were analyzed using a Cox regression model and the results were compared. Results: The mean patient survival time after diagnosis was $49.1 \pm 4.4$ months. In the complete case analysis, which used information from 100 of the 471 patients, very wide and uninformative confidence intervals were obtained for the chemotherapy and surgery hazard ratios (HRs). However, after imputation, the maximum confidence interval widths for the chemotherapy and surgery HRs were 8.470 and 0.806, respectively. The minimum width corresponded to MI. Furthermore, the minimum Bayesian and Akaike information criterion values were obtained with MI (-821.236 and -827.866, respectively). Conclusions: Missing value imputation increased the precision and accuracy of the estimates. In addition, MI yielded better results than the expectation maximization algorithm and simple regression imputation.
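
The abstract above does not say how the estimates from the multiply imputed data sets were combined. A common choice is Rubin's rules, and the following is a minimal sketch under that assumption. The function name `pool_rubin` and the numbers are hypothetical illustrations, not values from the paper.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, sqrt(total)


# Hypothetical log hazard ratios for one covariate from m = 5 imputed data sets.
log_hrs = [0.42, 0.38, 0.47, 0.40, 0.43]
ses = [0.11, 0.12, 0.10, 0.11, 0.12]
pooled, pooled_se = pool_rubin(log_hrs, [s ** 2 for s in ses])
```

The pooled standard error is larger than the average per-imputation standard error because the between-imputation term carries the extra uncertainty due to the missing values.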

A longitudinal study for child aggression with Korea Welfare Panel Study data (한국복지패널 자료를 이용한 아동기 공격성에 대한 경시적 자료 분석)

  • Choi, Nayeon;Huh, Jib
    • Journal of the Korean Data and Information Science Society, v.25 no.6, pp.1439-1447, 2014
  • Most of the literature on Korean child aggression is based on cross-sectional data sets. Although one related study uses a longitudinal data set, it assumes that the repeated measurements in the longitudinal data are mutually independent. A proper longitudinal data analysis of Korean child aggression is therefore necessary. This study analyzes the effects of child development outcomes, including academic achievement, self-esteem, depression and anxiety, delinquency, victimization by peers, abuse by parents, and internet usage time, on child aggression, using Korea Welfare Panel Study data observed three times between 2006 and 2012. Since the Korea Welfare Panel Study data have missing values, they are assumed to be missing at random. A linear mixed effects model with restricted maximum likelihood estimation is considered.
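
For orientation, a linear mixed effects model for repeated measurements can be written as below. The abstract does not state the random-effects structure, so the random-intercept form is an assumption, with $y_{ij}$ standing for the aggression score of child $i$ at wave $j$ and $\mathbf{x}_{ij}$ for the covariates.

```latex
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2),
\qquad \varepsilon_{ij} \sim N(0, \sigma^2),
\qquad j = 1, 2, 3
```

The shared intercept $b_i$ induces correlation among the measurements of the same child, which is the dependence that an independence assumption ignores. Under missing at random, likelihood-based estimation can use every observed wave of a child without discarding children who miss a wave.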

Preference Prediction System using Similarity Weight granted Bayesian estimated value and Associative User Clustering (베이지안 추정치가 부여된 유사도 가중치와 연관 사용자 군집을 이용한 선호도 예측 시스템)

  • 정경용;최성용;임기욱;이정현
    • Journal of KIISE:Software and Applications, v.30 no.3_4, pp.316-325, 2003
  • User preference prediction methods based on existing collaborative filtering techniques use the nearest-neighbor method on users' preferences for items and derive user similarity from the Pearson correlation coefficient. They therefore neither reflect the contents of items nor solve the sparsity problem. This study proposes a preference prediction system that uses similarity weights granted Bayesian estimated values and associative user clustering to address the problems of existing collaborative preference prediction methods. The proposed method groups users by genre using the Association Rule Hypergraph Partitioning algorithm, and classifies a new user into one of these genres with a Naive Bayes classifier to solve the sparsity problem in the collaborative filtering system. In addition, to obtain the similarity between the users belonging to the classified genre and the new user, this study assigns different estimated values to the items that users have rated through Naive Bayes learning. Applying the preferences with estimated values to the existing Pearson correlation coefficient improves prediction precision by reducing the prediction error caused by missing values. To evaluate its performance, the proposed method is compared with existing collaborative filtering techniques. The results show that the proposed method improves prediction accuracy by solving the problems of existing collaborative filtering techniques.
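
To illustrate why missing ratings matter for the baseline, here is a minimal sketch of a Pearson similarity restricted to co-rated items. It shows the plain baseline only, not the paper's Bayesian-weighted variant. The function name and the ratings are hypothetical, and `None` is assumed to mark a missing rating.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over the items both users rated; None marks a missing rating."""
    pairs = [(a, b) for a, b in zip(ratings_a, ratings_b)
             if a is not None and b is not None]
    if len(pairs) < 2:
        return 0.0  # too little overlap to estimate a correlation
    mean_a = sum(a for a, _ in pairs) / len(pairs)
    mean_b = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant rating vector has no defined correlation
    return cov / sqrt(var_a * var_b)


# Hypothetical ratings of five items; the first user has two missing values.
alice = [5, 3, None, 4, None]
bob = [4, 2, 5, 3, 1]
sim = pearson_similarity(alice, bob)
```

Only three of the five items contribute here, so the similarity rests on little evidence. Sparse rating matrices make this the typical case, which is the problem the abstract targets.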

Performance Improvement of Collaborative Filtering System Using Associative User's Clustering Analysis for the Recalculation of Preference and Representative Attribute-Neighborhood (선호도 재계산을 위한 연관 사용자 군집 분석과 Representative Attribute-Neighborhood를 이용한 협력적 필터링 시스템의 성능향상)

  • Jung, Kyung-Yong;Kim, Jin-Su;Kim, Tae-Yong;Lee, Jung-Hyun
    • The KIPS Transactions:PartB, v.10B no.3, pp.287-296, 2003
  • Much research has focused on collaborative filtering techniques in recommender systems. However, these studies suffer from the first-rater problem and the sparsity problem. The main purpose of this paper is to solve these problems. We propose a user preference prediction method that uses Bayesian estimated values and associative user clustering for the recalculation of preference. In addition, to address the shortcoming that item attributes are not considered, we use the Representative Attribute-Neighborhood method, which finds similar neighbors for prediction by extracting the representative attributes that most affect preference. We improve efficiency by applying associative user clustering analysis to the collaborative filtering algorithm in order to calculate the preference for a specific item within the cluster's item vector. Furthermore, to address the sparsity and first-rater problems, associative users are clustered by genre using the Association Rule Hypergraph Partitioning algorithm, and new users are classified into one of these genres by a Naive Bayes classifier. To obtain the similarity between the users belonging to the classified genre and new users, this paper assigns different estimated values to the items that users have evaluated through Naive Bayes learning. Applying the preferences granted these estimated values to the Pearson correlation coefficient yields higher accuracy because the errors caused by missing values are reduced. We evaluate our method on a large collaborative filtering database of user ratings, and it significantly outperforms previously proposed methods.
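
Once similarities are available, neighborhood-based collaborative filtering usually predicts a rating as the target user's mean plus a similarity-weighted average of the neighbors' deviations from their own means. The sketch below shows that standard formula, not the recalculated preference proposed in the paper. All names and numbers are hypothetical.

```python
def predict_rating(target_mean, neighbors):
    """Predict a rating from (similarity, neighbor_rating, neighbor_mean) triples.

    A neighbor_rating of None means the neighbor has not rated the item.
    """
    numerator = 0.0
    denominator = 0.0
    for sim, rating, neighbor_mean in neighbors:
        if rating is None:
            continue  # a missing rating contributes nothing
        numerator += sim * (rating - neighbor_mean)
        denominator += abs(sim)
    if denominator == 0:
        return target_mean  # no usable neighbor: fall back to the user's mean
    return target_mean + numerator / denominator


# Hypothetical neighbors; the third has not rated the item.
pred = predict_rating(3.5, [(0.9, 5, 4.0), (0.5, 2, 3.0), (0.8, None, 3.5)])
```

A new item that nobody has rated leaves every neighbor with `None`, so the prediction collapses to the user's mean. That is the first-rater problem named in the abstract.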

Corporate Social Responsibility and Firm Performance: the Moderating Role of Top Management Team Characteristics and Heterogeneity

  • Meng, La-Mei;Byun, Hae-Young
    • Asia-Pacific Journal of Business, v.12 no.2, pp.39-60, 2021
  • Purpose - This paper explores whether the characteristics and heterogeneity of the top management team (TMT) moderate the relationship between corporate social responsibility (CSR) and corporate value. Design/methodology/approach - The literature review collects, organizes, and analyzes the literature on the characteristics and heterogeneity of the TMT, the effect of CSR, and corporate value. We analyze the contributions and limitations of existing research, assess the current state of research, and develop the research content of this article. The empirical analysis is based on data from Chinese A-share listed companies from 2001 to 2017, which allows us to study the moderating effect of the characteristics and heterogeneity of the TMT on the relationship between CSR and corporate value. Findings - TMT age, education level, overseas background, and compensation have a positive moderating effect on the relationship between CSR and corporate market value. The comprehensive heterogeneity of the TMT also has a positive effect on CSR and financial performance. Research implications or Originality - Research on the relationship between CSR and corporate value is still inconclusive. Some studies have found a positive relationship, others a negative relationship, and still others report mixed findings. This study attempts to clarify this problem by adding potentially missing variables related to TMT characteristics and heterogeneity and by investigating causal effects.

A motion-adaptive de-interlacing method using an efficient spatial and temporal interpolation (효율적인 시공간 보간을 통한 움직임 기반의 디인터레이싱 기법)

  • Lee, Seong-Gyu;Lee, Dong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP, v.38 no.5, pp.556-566, 2001
  • This paper proposes a motion-adaptive de-interlacing algorithm based on EBMF (Edge Based Median Filter) and AMPDF (Adaptive Minimum Pixel Difference Filter). To compensate for the 'motion missing' error, which is an important factor in motion-adaptive methods, we use AMPDF, which estimates an accurate value using different thresholds after classifying the input image into four classes. To interpolate moving diagonal edges efficiently, we also use EBMF, which selects a candidate pixel according to the edge information. Finally, to increase performance, we adopt an adaptive interpolation after classifying the input image into moving, stationary, and boundary regions. Simulation results show that the proposed method performs better than existing methods.
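
The general idea behind motion-adaptive de-interlacing can be sketched as follows: use temporal interpolation where the scene is stationary and spatial interpolation where it moves. This is a generic illustration, not the EBMF or AMPDF filters of the paper. The function name and the threshold value are hypothetical.

```python
def deinterlace_pixel(above, below, prev_same, next_same, motion_threshold=10):
    """Fill one missing pixel of an interlaced field.

    above, below: pixels on the lines above and below in the current field.
    prev_same, next_same: pixels at the same position in the previous and
    next fields, which carry the line that is missing in the current field.
    """
    motion = abs(prev_same - next_same)
    if motion < motion_threshold:
        # Stationary region: average the neighboring fields (temporal).
        return (prev_same + next_same) // 2
    # Moving region: average the neighboring lines (spatial).
    return (above + below) // 2


still = deinterlace_pixel(100, 120, 50, 52)    # small frame difference
moving = deinterlace_pixel(100, 120, 50, 200)  # large frame difference
```

A single frame difference can miss fast or periodic motion, since the two fields can match even though the scene changed in between. The 'motion missing' error mentioned in the abstract concerns this kind of detection failure.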

Occlusal rehabilitation of posterior fixed prostheses: A clinical report

  • Yeo In-Sung;Yang Jae-Ho
    • The Journal of Korean Academy of Prosthodontics, v.39 no.3, pp.313-318, 2001
  • Masticatory function is more important than esthetics in posterior fixed restorations. The usual technique (mounting the casts on a semi-adjustable articulator, etc.) cannot satisfy all patients with their restorations. For example, the functionally generated path technique can be an easier and more satisfactory method for the restoration of group function. This clinical report describes various approaches to occlusal restoration of relatively simple posterior fixed prostheses according to patients' occlusal patterns. The 3-unit bridge restoration is one of the most popular treatment options in prosthodontics. Because dentists have much experience with it, they restore a missing span of one tooth mechanically, that is, without special consideration. While esthetics is important in making an anterior 3-unit fixed prosthesis, mastication is the greater focus in posterior 3-unit bridge restoration. Many dentists are concerned with various aspects of esthetics, such as tooth morphology, value, chroma, hue, translucency, and surface texture, but they do not usually consider various methods of restoring occlusion. They treat a one-tooth-missing area in a similar way even though patients have a variety of occlusal patterns. Three cases of 3- or 4-unit bridge restoration are presented here. They show some methods of restoring patients' occlusal patterns.

Registry Metadata Quality Assessment by the Example of re3data.org Schema

  • Kim, Suntae;Choi, Myung-Seok
    • International Journal of Knowledge Content Development & Technology, v.7 no.2, pp.41-51, 2017
  • Research data repositories (RDR) have become increasingly widespread all over the world. To expand repository services and build an inbound linking strategy, organizations list their repositories in so-called global registries. Accordingly, the entries in such registries should be carefully described by the related metadata. In this study, I explore the metadata schema of re3data.org. I collect and analyze descriptions of the listed repositories and offer suggestions for possible improvements to the metadata schema. To accomplish this, I develop a crawler program that collects the necessary data from re3data.org. Based on the analysis results, I identify two issues in which required elements are missing, one issue in which a required element value is missing when the corresponding property is applied, five inconsistency issues with the re3data controlled vocabulary, six issues with undescribed optional elements, and two inconsistency issues between elements and their attributes that do not pair with each other. I believe this discussion can facilitate improvements to the existing re3data.org schema and further help researchers who analyze data repository trends.
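
A check for missing required elements, one of the issue types counted above, might look like the following sketch. The element names in `REQUIRED` and the sample records are hypothetical placeholders and are not taken from the actual re3data.org schema or from the paper's crawler.

```python
def find_missing_required(record, required):
    """Return the required element names that are absent or empty in one record."""
    return [name for name in required
            if record.get(name) in (None, "", [])]


# Hypothetical required elements and harvested records.
REQUIRED = ["repositoryName", "repositoryURL", "type"]
records = [
    {"repositoryName": "Repo A", "repositoryURL": "https://example.org/a",
     "type": "disciplinary"},
    {"repositoryName": "Repo B", "repositoryURL": "", "type": "institutional"},
    {"repositoryName": "Repo C"},
]
report = {r["repositoryName"]: find_missing_required(r, REQUIRED) for r in records}
```

Treating an empty string the same as an absent element covers both a missing element and a present element with a missing value.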

Household, personal, and financial determinants of surrender in Korean health insurance

  • Shim, Hyunoo;Min, Jung Yeun;Choi, Yang Ho
    • Communications for Statistical Applications and Methods, v.28 no.5, pp.447-462, 2021
  • In insurance, the surrender rate is an important variable that threatens the sustainability of insurers and determines the profitability of the contract. Unlike other actuarial assumptions that determine the cash flow of an insurance contract, however, it is characterized by endogenous variables such as people's economic, social, and subjective decisions. Therefore, a microscopic approach is required to identify and analyze the factors that determine the lapse rate. Specifically, micro-level characteristics, including the individual, demographic, microeconomic, and household characteristics of policyholders, are necessary for the analysis. In this study, we select panel survey data from the Korean Retirement Income Study (KReIS), which cover many diverse dimensions, to determine which variables have a decisive effect on lapse, and we apply the lasso regularized regression model to analyze them empirically. As the data contain many missing values, they are imputed using the random forest method. Among the household variables, we find that the absence of old dependents, the presence of young dependents, and employed family members increase the surrender rate. Among the individual variables, divorce, non-urban residential areas, apartment-type housing, non-ownership of a home, and a bad relationship with siblings increase the lapse rate. Finally, among the financial variables, low income, low expenditure, the presence of children who incur child care expenditure, not expecting a bequest from a spouse, not holding public health insurance, and expecting to benefit from a retirement pension increase the lapse rate. Some of these findings are consistent with those in the literature.
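
The variable selection effect of the lasso comes from soft thresholding, which sets small coefficients exactly to zero. The sketch below uses coordinate descent with a squared-error loss as a simplifying assumption, since the abstract does not state the loss used for the lapse outcome. The data and function names are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam and clip to exactly zero inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return beta


# Hypothetical data: y depends strongly on the first feature, weakly on the second.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.05, 1.95, -1.95, -2.05]
beta = lasso_coordinate_descent(X, y, lam=0.1)
```

With this penalty the weak second coefficient is set to zero while the strong first one is only shrunk, which is how the lasso reduces a large survey to a short list of determinants.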

Application Examples Applying Extended Data Expression Technique to Classification Problems (패턴 분류 문제에 확장된 데이터 표현 기법을 적용한 응용 사례)

  • Lee, Jong Chan
    • Journal of the Korea Convergence Society, v.9 no.12, pp.9-15, 2018
  • The main goal of extended data expression is to develop a data structure suitable for common problems in ubiquitous environments. The most important feature of this method is that attribute values can be represented with probabilities. The next feature is that each event in the training data has a weight value that represents its importance. After this data structure was developed, an algorithm that can learn from it was devised. Since then, this algorithm has been applied to problems in various fields with good results. This paper first introduces the extended data expression technique, UChoo, and the rule refinement method, which form the theoretical basis. It then introduces examples of application areas such as rule refinement, missing data processing, the BEWS problem, and ensemble systems.
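
The two features described above, probabilistic attribute values and per-event weights, can be sketched as a small data structure. This is a hypothetical illustration and not the UChoo implementation. Representing a missing value as a uniform distribution over the possible values is an assumption made here for the example.

```python
from dataclasses import dataclass


@dataclass
class Event:
    attributes: dict     # attribute name -> {value: probability}
    label: str           # class label of the event
    weight: float = 1.0  # importance of the event


def weighted_value_counts(events, attribute):
    """Sum weight * probability for each (value, label) pair of one attribute."""
    counts = {}
    for event in events:
        for value, prob in event.attributes[attribute].items():
            key = (value, event.label)
            counts[key] = counts.get(key, 0.0) + event.weight * prob
    return counts


# Hypothetical events; the second has a missing "outlook" spread uniformly.
events = [
    Event({"outlook": {"sunny": 1.0}}, "play", weight=1.0),
    Event({"outlook": {"sunny": 0.5, "rain": 0.5}}, "stay", weight=2.0),
]
counts = weighted_value_counts(events, "outlook")
```

Fractional counts of this kind can feed an entropy-based split criterion in a decision tree learner, so an event with a missing value still contributes to every branch in proportion to its probabilities.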