• Title/Summary/Keyword: Use Statistics

A Study on the Survey Modes Used in the 2020 Korean Crime Victim Survey

  • Sunyeong Heo
    • Journal of Integrative Natural Science / v.17 no.3 / pp.75-86 / 2024
  • Surveys are an important data collection method for national statistics in Korea. As of July 22, 2024, based on data provided by the MDIS of Statistics Korea, approximately 92% of national statistics are compiled using the survey method, and about 85% of these rely primarily on face-to-face interviews for data collection. However, with the increase in single-person households, the survey nonresponse rate has been rising each year. This study examined respondent behavior across the different survey modes used in the 2020 Korean Crime Victim Survey, whose raw data on survey response modes are publicly available on the MDIS website of Statistics Korea. The results show that, overall, the younger the age group, the higher the proportion of self-administered responses, and this proportion is higher for females than for males except in the over-60 group. Notably, respondents in their twenties show a markedly higher percentage of self-administered (TAPI) responses for both males and females. Furthermore, tablet responses account for a higher proportion than paper responses across all age groups, except for females in their teens, regardless of whether the survey is interviewer-administered or self-administered.
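
The kind of mode-by-demographic comparison summarized above can be reproduced with a simple cross-tabulation. The sketch below is a minimal Python illustration; the column names (`sex`, `age_group`, `response_mode`) and the toy records are hypothetical stand-ins for the MDIS microdata, not the actual layout of the 2020 Korean Crime Victim Survey file.

```python
import pandas as pd

# Hypothetical stand-in for the survey microdata: each record carries the
# respondent's sex, age group, and the response mode actually used.
df = pd.DataFrame({
    "sex": ["F", "F", "M", "M", "F", "M", "F", "M"],
    "age_group": ["20s", "20s", "20s", "40s", "40s", "60s+", "teens", "teens"],
    "response_mode": ["TAPI", "TAPI", "TAPI", "interviewer-tablet",
                      "interviewer-paper", "interviewer-paper", "paper", "TAPI"],
})

# Share of each response mode within every sex-by-age-group cell, analogous to
# comparing self-administered (TAPI) proportions across demographic groups.
mode_share = (
    pd.crosstab([df["sex"], df["age_group"]], df["response_mode"], normalize="index")
    .round(2)
)
print(mode_share)
```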

Standardized polytomous discrimination index using concordance (부합성을 이용한 표준화된 다항판별지수)

  • Choi, Jin Soo;Hong, Chong Sun
    • Journal of the Korean Data and Information Science Society / v.27 no.1 / pp.33-44 / 2016
  • There are many situations in clinical decision making and credit assessment where the outcome to be predicted has more than two categories. Five kinds of statistics based on concordance have been proposed and used for such polytomous problems. However, these statistics are defined without an exact distinction among the categories, so it is difficult to apply both the pair and the set approaches and to understand what the statistics mean; consequently, they cannot be compared and analyzed. In this paper, the polytomous confusion matrix is standardized and the concordance statistic is represented in terms of this confusion matrix. The five concordance-based statistics are then defined on this basis. With the methods proposed in this paper, we can not only explain their meanings but also compare and analyze these statistics. Based on various data sets, the properties of these five statistics are explored and explained.
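
As a rough illustration of a concordance-type summary for a polytomous classifier, the sketch below averages, over all pairs of categories, the proportion of observation pairs whose predicted scores order them correctly. This is a generic construction written for this listing, not necessarily one of the five statistics studied in the paper, and the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

def pairwise_concordance(y, scores):
    """Average, over unordered category pairs (a, b), of the probability that a
    class-b observation gets a higher class-b score than a class-a observation
    (ties count one half) -- a generic concordance-type summary."""
    classes = np.unique(y)
    stats = []
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            col = int(np.where(classes == b)[0][0])
            s_a = scores[y == a, col]   # class-b score for class-a observations
            s_b = scores[y == b, col]   # class-b score for class-b observations
            diff = s_b[:, None] - s_a[None, :]
            stats.append((diff > 0).mean() + 0.5 * (diff == 0).mean())
    return float(np.mean(stats))

# Simulated 3-category example: scores are noisy versions of the true class.
y = rng.integers(0, 3, size=300)
scores = rng.normal(size=(300, 3))
scores[np.arange(300), y] += 1.5                       # true class tends to score highest
scores = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

print("pairwise concordance:", round(pairwise_concordance(y, scores), 3))
```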

Development and Application of a Learning System based on Web 2.0 for Promoting Statistics Literacy (통계 활용 능력 향상을 위한 웹 2.0 기반 학습 시스템 개발 및 적용)

  • Lee, Jun-Hee
    • Journal of The Korean Association of Information Education / v.13 no.1 / pp.85-95 / 2009
  • The web is shifting from being a medium in which information is transmitted and consumed into a platform on which content is created, shared, and repurposed. Web 2.0 applications are those that make the most of the intrinsic advantages of that platform. This thesis proposes a learning system based on Web 2.0 for promoting statistics literacy. For this purpose, students were asked to create UCC (User Created Contents) in order to learn the principles of statistics and the practical use of statistical data. To examine the educational effectiveness of this approach, an experimental study was conducted with the proposed content and method on one first-year class at a university located in OO city. As a result, the students understood basic principles of statistics, perceived the importance of statistics literacy, and responded positively to statistics. In conclusion, it is confirmed that effective education can be provided to students if statistics literacy is taught with an appropriate educational method.

A Method to Predict the Number of Clusters

  • Chae, Seong-San;William D. Warde
    • Journal of the Korean Statistical Society / v.20 no.2 / pp.162-176 / 1991
  • The problem of determining the number of clusters, K, is the main objective of this study. Attention is focused on the use of Rand (1971)'s $C_{k}$ statistic with some agglomerative clustering algorithms (ACA) defined in the ($\beta$, $\pi$) plane for predicting the number of clusters within a given set of data. The (k, $C_{k}$) plots for k = 1, 2, …, N are explored by a Monte Carlo study. Based on its performance, the use of $C_{k}$ with the pair of ACA, (-.5, .75) and (-.25, .0), is recommended for predicting the number of clusters present within a set of data.
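
Neither Rand (1971)'s $C_{k}$ statistic nor the flexible-beta ($\beta$, $\pi$) family of agglomerative algorithms is reproduced here. As a loose stand-in, the sketch below runs an ordinary agglomerative clustering (SciPy's average linkage) and prints a simple internal criterion (within-cluster sum of squares) for k = 1, ..., 10, the same kind of (k, criterion) exploration the abstract describes.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)

# Simulated data with three well-separated clusters.
X = np.vstack([rng.normal(loc=c, scale=0.4, size=(50, 2))
               for c in ((0, 0), (4, 0), (2, 4))])

# Agglomerative clustering; 'average' linkage is a stand-in for the paper's
# flexible-beta (beta, pi) algorithms, which SciPy does not expose directly.
Z = linkage(pdist(X), method="average")

def wss(X, labels):
    """Total within-cluster sum of squares for a given flat clustering."""
    return sum(((X[labels == g] - X[labels == g].mean(axis=0)) ** 2).sum()
               for g in np.unique(labels))

# Examine the (k, criterion) sequence and look for an elbow, in the spirit of
# the (k, C_k) plots explored in the paper.
for k in range(1, 11):
    labels = fcluster(Z, t=k, criterion="maxclust")
    print(f"k = {k:2d}   WSS = {wss(X, labels):8.2f}")
```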

Note on Use of $R^2$ for No-intercept Model

  • Do, Jong-Doo;Kim, Tae-Yoon
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.661-668 / 2006
  • There have been some controversies about the use of the coefficient of determination for the linear no-intercept model. One definition of the coefficient of determination, $R^2 = \sum \hat{y}^2 / \sum y^2$, is widely accepted only for linear no-intercept models, although Kvalseth (1985) demonstrated some possible pitfalls of using such an $R^2$. The main objective of this note is to report that $R^2$ is not a desirable measure of fit for the no-intercept linear model. In fact, it is found that the mean square error (MSE) can replace $R^2$ efficiently in most cases where selection of a no-intercept model is at issue.
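
The definition quoted above, $R^2 = \sum \hat{y}^2 / \sum y^2$, and its comparison with the MSE are easy to check numerically. The sketch below fits a no-intercept least-squares line to simulated data and reports both quantities; it illustrates the formulas only, not the note's specific examples.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data where a no-intercept line is a questionable model:
# the true relationship has a clearly nonzero intercept.
x = rng.uniform(1, 10, size=100)
y = 5.0 + 0.3 * x + rng.normal(scale=0.5, size=100)

# No-intercept least-squares slope: beta = sum(x*y) / sum(x^2).
beta = (x @ y) / (x @ x)
y_hat = beta * x
resid = y - y_hat

# R^2 for the no-intercept model as quoted in the abstract: sum(y_hat^2) / sum(y^2).
r2_no_intercept = (y_hat @ y_hat) / (y @ y)
# Mean squared error, the measure the note recommends instead.
mse = (resid @ resid) / (len(y) - 1)

print(f"no-intercept R^2 = {r2_no_intercept:.3f}")   # can look deceptively high
print(f"MSE              = {mse:.3f}")               # reveals the poor fit directly
```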

Industrial Waste Database Analysis Using Data Mining Techniques

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.455-465 / 2006
  • Data mining is a method for finding useful information in large amounts of data stored in databases. It is used to discover hidden knowledge, unexpected patterns, and new rules in massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques: the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. These outputs can be used for environmental preservation and environmental improvement.
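
A compressed Python analogue of the three techniques named above is sketched below on a toy table: scikit-learn's KMeans for clustering, a CART decision tree as a stand-in for C5.0 (which has no standard Python implementation), and a hand-computed support/confidence check in place of a full Apriori run. The column names and thresholds are invented, not taken from the industrial waste database.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)

# Toy stand-in for the industrial waste database.
df = pd.DataFrame({
    "waste_tons": rng.gamma(shape=2.0, scale=50.0, size=200),
    "recycled_tons": rng.gamma(shape=2.0, scale=20.0, size=200),
    "incinerated": rng.integers(0, 2, size=200),
    "hazardous": rng.integers(0, 2, size=200),
})

# 1) Clustering with k-means (as in the paper).
km = KMeans(n_clusters=3, n_init=10, random_state=0)
df["cluster"] = km.fit_predict(df[["waste_tons", "recycled_tons"]])

# 2) Decision tree: CART here as a stand-in for the paper's C5.0 algorithm.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(df[["waste_tons", "recycled_tons", "incinerated"]], df["hazardous"])

# 3) One association rule "incinerated -> hazardous", checked by hand
#    (support and confidence are the quantities Apriori searches over).
both = ((df["incinerated"] == 1) & (df["hazardous"] == 1)).mean()
support_lhs = (df["incinerated"] == 1).mean()
print("cluster sizes:", np.bincount(df["cluster"]))
print(f"rule support = {both:.2f}, confidence = {both / support_lhs:.2f}")
```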

Bayesian Variable Selection in the Proportional Hazard Model with Application to DNA Microarray Data

  • Lee, Kyeon-Eun;Mallick, Bani K.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.357-360 / 2005
  • In this paper we consider the well-known semiparametric proportional hazards (PH) models for survival analysis. These models are usually used with few covariates and many observations (subjects). However, for a typical setting of gene expression data from DNA microarrays, we need to consider the case where the number of covariates p exceeds the number of samples n. For a given vector of response values, which are times to event (death or censoring times), and p gene expressions (covariates), we address the issue of how to reduce the dimension by selecting the significant genes. This approach enables us to estimate the survival curve when n << p. In our approach, rather than fixing the number of selected genes, we assign a prior distribution to this number. The approach creates additional flexibility by allowing the imposition of constraints, such as bounding the dimension via a prior, which in effect works as a penalty. To implement our methodology, we use a Markov chain Monte Carlo (MCMC) method. We demonstrate the use of the methodology on diffuse large B-cell lymphoma (DLBCL) complementary DNA (cDNA) data.
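
The sketch below is a deliberately simplified caricature of the idea rather than the paper's hierarchical Bayesian model: a Metropolis sampler over gene-inclusion indicators, scoring each subset by its maximized Cox partial likelihood with a BIC-style penalty plus a Poisson prior on the subset size. All data are simulated; the point is only to show how a prior on the number of selected genes can steer an MCMC search when n << p.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

def neg_partial_loglik(beta, X, time, event):
    """Negative Cox partial log-likelihood (assuming no tied event times)."""
    order = np.argsort(-time)                   # descending time: risk set = prefix
    Xo, eo = X[order], event[order]
    eta = Xo @ beta
    log_risk = np.logaddexp.accumulate(eta)     # log of sum of exp(eta) over each risk set
    return -np.sum((eta - log_risk)[eo == 1])

def subset_score(idx, X, time, event, lam=2.0):
    """Maximized partial log-likelihood of a gene subset, BIC-style penalty,
    plus a Poisson(lam) log-prior on the subset size -- a crude stand-in for
    the marginal likelihood used in a full Bayesian treatment."""
    k = len(idx)
    if k == 0:
        ll = -neg_partial_loglik(np.zeros(0), X[:, :0], time, event)
    else:
        res = minimize(neg_partial_loglik, np.zeros(k),
                       args=(X[:, idx], time, event), method="BFGS")
        ll = -res.fun
    log_prior = k * np.log(lam) - np.sum(np.log(np.arange(1, k + 1)))
    return ll + log_prior - 0.5 * k * np.log(event.sum())

# Simulated n << p survival data: only the first two genes matter.
n, p = 60, 200
X = rng.normal(size=(n, p))
hazard = np.exp(1.0 * X[:, 0] - 1.0 * X[:, 1])
time = rng.exponential(1.0 / hazard)
event = (rng.uniform(size=n) < 0.8).astype(int)             # roughly 20% censoring

# Metropolis sampler over inclusion indicators: flip one gene at a time.
gamma = np.zeros(p, dtype=bool)
score = subset_score(np.where(gamma)[0], X, time, event)
counts = np.zeros(p)
for it in range(500):
    j = rng.integers(p)
    prop = gamma.copy()
    prop[j] = ~prop[j]
    new_score = subset_score(np.where(prop)[0], X, time, event)
    if np.log(rng.uniform()) < new_score - score:
        gamma, score = prop, new_score
    counts += gamma

print("top genes by inclusion frequency:", np.argsort(-counts)[:5])
```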

Environmental Survey Data Analysis by Data Fusion Techniques

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1201-1208 / 2006
  • Data fusion is generally defined as the use of techniques that combine data from multiple sources and gather that information in order to achieve inferences. Data fusion is also called data combination or data matching. It is divided into five types: exact matching, judgemental matching, probability matching, statistical matching, and data linking. Currently, Gyeongnam province conducts a social survey of provincial residents every year, but analysis is limited because the different surveys are carried out on a three-year cycle. In this paper, we study data fusion of environmental survey data using a SAS macro. The data fusion outputs can be used for environmental preservation and environmental improvement.
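
The SAS macro itself is not reproduced here. As a generic illustration of one of the five types listed above, statistical matching, the sketch below matches each recipient record to its nearest donor record on the common variables and copies a donor-only variable across; the file layouts and variable names are invented for the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)

# Two hypothetical survey files sharing common variables (age, income) but
# each carrying one variable the other lacks.
recipient = pd.DataFrame({
    "age": rng.integers(20, 70, size=8),
    "income": rng.normal(300, 80, size=8).round(),
    "env_satisfaction": rng.integers(1, 6, size=8),          # only in the recipient file
})
donor = pd.DataFrame({
    "age": rng.integers(20, 70, size=12),
    "income": rng.normal(300, 80, size=12).round(),
    "recycling_rate": rng.uniform(0, 1, size=12).round(2),   # only in the donor file
})

# Statistical matching: standardize the common variables, find the nearest
# donor for each recipient record, and fuse the donor-only variable across.
common = ["age", "income"]
mu, sd = donor[common].mean(), donor[common].std()
r_std = (recipient[common] - mu) / sd
d_std = (donor[common] - mu) / sd
dist = ((r_std.values[:, None, :] - d_std.values[None, :, :]) ** 2).sum(axis=2)
nearest = dist.argmin(axis=1)

fused = recipient.copy()
fused["recycling_rate"] = donor["recycling_rate"].values[nearest]
print(fused)
```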

Modelling the capture of spray droplets by barley

  • Cox, S.J.;Salt, D.W.;Lee, B.E.;Ford, M.G.
    • Wind and Structures / v.5 no.2_3_4 / pp.127-140 / 2002
  • This paper presents some of the results of a project whose aim has been to produce a full simulation model that would determine the efficacy of pesticides for use by both farmers and the bio-chemical industry. The work presented here describes how crop architecture can be mathematically modelled and how the mechanics of pesticide droplet capture can be simulated, so that, if a wind-assisted droplet-trajectory model is assumed, droplet deposition patterns on crop surfaces can be predicted. This achievement, when combined with biological response models, will then enable the efficacy of pesticide use to be predicted.

The Use of Generalized Gamma-Polynomial Approximation for Hazard Functions

  • Ha, Hyung-Tae
    • The Korean Journal of Applied Statistics / v.22 no.6 / pp.1345-1353 / 2009
  • We introduce a simple methodology, the so-called generalized gamma-polynomial approximation, based on a moment-matching technique for approximating survival and hazard functions in the context of parametric survival analysis. We use the generalized gamma-polynomial approximation to approximate the density and distribution functions of convolutions and finite mixtures of random variables, from which the approximated survival and hazard functions are obtained. This technique provides very accurate approximations to the target functions, while being computationally efficient and easy to implement. In addition, the generalized gamma-polynomial approximations are very stable in the middle range of the target distributions, whereas saddlepoint approximations are often unstable in a neighborhood of the mean.
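
The full generalized gamma-polynomial machinery (a gamma base density corrected by a moment-matched polynomial term) is not reproduced here. The sketch below keeps only its simplest ingredient: matching the first two moments of a convolution of random variables with a plain gamma distribution and reading off the approximate survival and hazard functions. It is a toy version of the idea, not the paper's method.

```python
import numpy as np
from scipy import stats

# Target: the convolution (sum) of two independent exponential lifetimes with
# different rates, whose moments are known in closed form.
rate1, rate2 = 1.0, 0.4
mean = 1 / rate1 + 1 / rate2
var = 1 / rate1**2 + 1 / rate2**2

# Moment-matched gamma base density: shape and scale chosen so the first two
# moments agree with the target (the polynomial adjustment term is omitted).
shape = mean**2 / var
scale = var / mean
approx = stats.gamma(a=shape, scale=scale)

t = np.linspace(0.5, 8.0, 6)
surv_hat = approx.sf(t)                      # approximate survival S(t)
hazard_hat = approx.pdf(t) / approx.sf(t)    # approximate hazard h(t) = f(t)/S(t)

# Monte Carlo check against the exact convolution.
rng = np.random.default_rng(6)
samples = rng.exponential(1 / rate1, 200_000) + rng.exponential(1 / rate2, 200_000)
surv_mc = np.array([(samples > x).mean() for x in t])

for x, s_hat, s_mc, h in zip(t, surv_hat, surv_mc, hazard_hat):
    print(f"t = {x:4.1f}   S_approx = {s_hat:.3f}   S_mc = {s_mc:.3f}   h_approx = {h:.3f}")
```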