• Title/Summary/Keyword: Statistical Analysis Data

A Comparative Study on Spatial Lattice Data Analysis - A Case Where Outlier Exists - (공간 격자데이터 분석에 대한 우위성 비교 연구 - 이상치가 존재하는 경우 -)

  • Kim, Su-Jung;Choi, Seung-Bae;Kang, Chang-Wan;Cho, Jang-Sik
    • Communications for Statistical Applications and Methods / v.17 no.2 / pp.193-204 / 2010
  • Recently, researchers in the various fields where spatial analysis is needed have become more interested in spatial statistics. For data with spatial correlation, methodologies that account for the correlation are required, and methods for spatial data analysis have been developed accordingly. Lattice data, one type of spatial data, are analyzed in three steps: (1) definition of the spatial neighborhood, (2) definition of the spatial weights, and (3) analysis using spatial models. The present paper shows that a spatial statistical analysis method is superior to a general statistical method in terms of estimation, as measured by the trimmed mean squared error statistic, when the spatial lattice data being analyzed include outliers. To demonstrate the validity and usefulness of this result, we perform a small simulation study and present an empirical example using crime data from BusanJin-Gu, Korea.
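
The first two steps of the procedure above, plus the evaluation statistic, can be sketched as follows. This is only an illustration under stated assumptions: rook contiguity on a regular grid, row-standardized weights, and a trimmed mean squared error that discards the largest squared errors are common choices, but the abstract does not say which definitions the paper uses. All function names are hypothetical.

```python
def rook_neighbors(n_rows, n_cols):
    """Map each grid cell index to the cells that share an edge with it.

    Assumption: rook contiguity on a regular lattice (not stated in the abstract).
    """
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cell = r * n_cols + c
            nbrs[cell] = [
                rr * n_cols + cc
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < n_rows and 0 <= cc < n_cols
            ]
    return nbrs


def row_standardized_weights(nbrs):
    """Give each neighbor of cell i the weight 1 / (number of neighbors of i)."""
    return {i: {j: 1.0 / len(js) for j in js} for i, js in nbrs.items()}


def trimmed_mse(observed, fitted, trim=0.1):
    """Mean of the squared errors after dropping the largest `trim` fraction.

    Assumption: one-sided trimming of the largest squared errors.
    """
    sq = sorted((o - f) ** 2 for o, f in zip(observed, fitted))
    keep = max(1, len(sq) - int(trim * len(sq)))
    return sum(sq[:keep]) / keep
```

Because the largest squared errors are dropped before averaging, a single extreme outlier has little influence on the comparison between a spatial and a non-spatial model.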

Statistical Issues in Genomic Cohort Studies (유전체 코호트 연구의 주요 통계학적 과제)

  • Park, So-Hee
    • Journal of Preventive Medicine and Public Health / v.40 no.2 / pp.108-113 / 2007
  • When conducting large-scale cohort studies, numerous statistical issues arise across study design, data collection, data analysis and interpretation. In genomic cohort studies, these statistical problems become more complicated and need to be dealt with carefully. Rapid technical advances in genomic studies produce an enormous amount of data to be analyzed, and traditional statistical methods are no longer sufficient to handle these data. In this paper, we review several important statistical issues that occur frequently in large-scale genomic cohort studies, including measurement error and its relevant correction methods, cost-efficient design strategies for main cohort and validation studies, inflated Type I error, gene-gene and gene-environment interactions, and time-varying hazard ratios. It is very important to employ appropriate statistical methods in order to make the best use of valuable cohort data and produce valid and reliable study results.

A Brief Guide to Statistical Analysis and Presentation for the Plant Pathology Journal

  • Jeon, Junhyun
    • The Plant Pathology Journal / v.38 no.3 / pp.175-181 / 2022
  • Statistical analysis of data is an integral part of research projects in all scientific disciplines, including plant pathology. Appropriate design, application and interpretation of statistical analysis are therefore central to publishing and properly evaluating studies in plant pathology. A survey of research works published in the Plant Pathology Journal, however, casts doubt on whether the journal meets the high standard of statistical analysis required for scientific rigor and reproducibility. Here I first describe, based on the survey of published works, what mistakes are commonly made and what components are often lacking in statistical analysis and in the interpretation of its results. Next, I provide possible remedies and suggestions to help guide researchers in preparing manuscripts and reviewers in evaluating manuscripts submitted to the Plant Pathology Journal. This guide does not aim to delineate the technical and practical details of particular statistical methods or approaches.

Study on Improving Oriental Medicine Statistical System for Multidimensional Statistical Data

  • Yea, Sang-Jun;Kim, Chul;Kim, Jin-Hyun;Jang, Hyun-Chul;Kim, Sang-Kyun;Song, Mi-Young
    • International Journal of Contents / v.7 no.3 / pp.13-18 / 2011
  • Oriental medicine statistics are essential in research planning, research evaluation, and policy decisions based on objective data. However, integrated administration of such statistics is not presently possible in the oriental medicine field, which has been slow to incorporate information and communication technology. In an effort to address this problem, the Korea Institute of Oriental Medicine (KIOM) developed an oriental medicine statistical system in 2009, and the system has been offered in OASIS, its traditional medicine information portal. However, in a 2010 survey of OASIS users, respondents reported a need for a system in which various statistical data can be extracted through an interactive approach to multidimensional data. An analysis of the functions of the existing system found that it needs to support arraying and arithmetic analysis of Stats Value, Drill Up & Drill Down, and Pivot, and that the existing DB schema should be redesigned to this end. Based on this analysis, we redesigned the database into a structure that is applicable to the reverse pivot algorithm. We used J2EE/JSP and a Flex framework to design and develop an oriental medicine statistical system that can provide multidimensional statistical data. Because the improved system is planned to be offered through OASIS of KIOM, the utilization and value of oriental medicine statistical data are expected to increase.
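
To make the Pivot and Drill Up operations mentioned above concrete, here is a generic sketch of both on a flat list of records. It is not the system's reverse pivot algorithm, which the abstract does not describe. The record fields (`year`, `region`, `clinics`) and the function names are hypothetical.

```python
from collections import defaultdict


def pivot(records, row_key, col_key, value_key):
    """Cross-tabulate records: one row per row_key value, one column per col_key value."""
    table = defaultdict(lambda: defaultdict(float))
    for rec in records:
        table[rec[row_key]][rec[col_key]] += rec[value_key]
    return {row: dict(cols) for row, cols in table.items()}


def drill_up(records, keep_keys, value_key):
    """Aggregate to a coarser level by summing over every dimension not in keep_keys."""
    totals = defaultdict(float)
    for rec in records:
        totals[tuple(rec[k] for k in keep_keys)] += rec[value_key]
    return dict(totals)


# Hypothetical sample data, for illustration only.
records = [
    {"year": 2009, "region": "A", "clinics": 10.0},
    {"year": 2009, "region": "B", "clinics": 5.0},
    {"year": 2010, "region": "A", "clinics": 12.0},
]
by_year_region = pivot(records, "year", "region", "clinics")
by_year = drill_up(records, ["year"], "clinics")
```

Storing the data as one record per combination of dimension values, as in `records`, is what lets the same data be pivoted or rolled up along any dimension.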

A small review and further studies on the LASSO

  • Kwon, Sunghoon;Han, Sangmi;Lee, Sangin
    • Journal of the Korean Data and Information Science Society / v.24 no.5 / pp.1077-1088 / 2013
  • High-dimensional data analysis arises in almost all scientific areas, evolving with the development of computing capabilities, and has encouraged penalized estimation methods that play important roles in statistical learning. Over the past years, various penalized estimation methods have been developed, and the least absolute shrinkage and selection operator (LASSO) proposed by Tibshirani (1996) has shown outstanding ability, earning a leading place in the development of penalized estimation. In this paper, we first introduce a number of recent advances in high-dimensional data analysis using the LASSO. The topics include various statistical problems such as variable selection and grouped or structured variable selection under sparse high-dimensional linear regression models. Several unsupervised learning methods, including inverse covariance matrix estimation, are also presented. In addition, we address further studies on new applications, which may establish a guideline on how to use the LASSO for the statistical challenges of high-dimensional data analysis.
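
The LASSO minimizes squared error plus an L1 penalty on the coefficients, which shrinks some of them exactly to zero and so performs variable selection. A minimal sketch of one standard way to compute it, cyclic coordinate descent with soft-thresholding, follows. The objective scaling (1/(2n)) times the residual sum of squares plus lam times the L1 norm, the function names, and the toy data are assumptions for illustration and are not taken from the review. There is no intercept, and the columns are not standardized.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly 0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, starting from beta = 0
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(x * x for x in col) / n
            if z == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that excludes feature j
            rho = sum(x * (r + x * beta[j]) for x, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - x * delta for x, r in zip(col, resid)]
                beta[j] = new
    return beta


# Toy data with orthogonal columns: the weak second feature is dropped.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_cd(X, y, lam=0.2)
```

In this toy example the second coefficient is set exactly to zero, while the first is shrunk below its least-squares value of 2.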

Statistical Analysis of Ion Components in Rainwater (濕性大氣成分에 對한 統計的解析)

  • 李敏熙;韓義正;元良洙;辛燦基
    • Journal of Korean Society for Atmospheric Environment / v.2 no.1 / pp.41-54 / 1986
  • Methods for averaging the pH of rainwater and for assessing site representation were studied. Statistical analysis was attempted regarding the effects of ionic components on pH, using a total of 847 data points obtained over two years, 1984 and 1985. The outcome of the study may be summarized as follows: 1. Methods for averaging pH: the volume-weighted method is considered acceptable provided that precipitation is measured at the same time the samples are taken. Without precipitation data, a simple averaging method should be the next choice. 2. Site representation: a statistical method for optimizing a monitoring network was applied to the collected data. Because of the limited amount of data, no definite conclusion could be reached, but the method may serve as a good guide once the database becomes more reliable. 3. A good correlation appears to exist between conductivity and ionic components in rainwater. It would therefore be possible, to a certain extent, to estimate ionic concentrations from conductivity measurements using correlation equations. 4. The acidity of rainwater is affected by $SO_4^{2-}$, $NO_3^-$, $Cl^-$ and $NH_4^+$, with $SO_4^{2-}$ being the most significant, as demonstrated by standardized regression analysis.
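
The two averaging methods in item 1 can be sketched as follows. One assumption is labeled loudly here: because pH is logarithmic, this sketch averages hydrogen-ion concentrations and converts back to pH, which is a common convention. The abstract does not state whether the paper averages concentrations or pH values directly. The function names are hypothetical.

```python
import math


def volume_weighted_ph(ph_values, volumes):
    """Precipitation-volume-weighted mean pH.

    Assumption: average the H+ concentrations (10 ** -pH), weighted by the
    precipitation volume of each sample, then convert back to pH.
    """
    h_conc = [10.0 ** -ph for ph in ph_values]
    mean_h = sum(h * v for h, v in zip(h_conc, volumes)) / sum(volumes)
    return -math.log10(mean_h)


def simple_mean_ph(ph_values):
    """Unweighted fallback for when precipitation volumes are unavailable."""
    mean_h = sum(10.0 ** -ph for ph in ph_values) / len(ph_values)
    return -math.log10(mean_h)
```

With equal volumes the two functions agree. They diverge when a small, very acidic shower would otherwise count as much as a large, mild one.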

Change Analysis with the Sample Fourier Coefficients

  • Jaehee Kim
    • Communications for Statistical Applications and Methods / v.3 no.1 / pp.207-217 / 1996
  • The problem of detecting a change in independent data is considered. The asymptotic distribution of the sample change process based on the sample Fourier coefficients is shown to be a Brownian bridge process. We suggest using dynamic statistics, such as a sample Brownian bridge, and graphs as statistical animation. Graphs, including change PP plots, are given by way of illustration with simulated data.

Statistical bioinformatics for gene expression data

  • Lee, Jae-K.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2001.08a / pp.103-127 / 2001
  • Gene expression studies require statistical experimental designs and validation before laboratory confirmation. Various clustering approaches, such as hierarchical clustering, K-means, and SOM, are commonly used for unsupervised learning on gene expression data. Several classification methods, such as gene voting, SVM, or discriminant analysis, are used for supervised learning, where a well-defined response classification is possible. Estimating gene-condition interaction effects requires advanced, computationally intensive statistical approaches.
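
As an illustration of one of the clustering approaches named above, the following is a minimal K-means sketch using Lloyd's algorithm. The starting centers are supplied by the caller so the result is deterministic. The function name and the toy two-dimensional "expression profiles" are hypothetical and are not taken from the talk.

```python
def kmeans(points, centers, n_iter=20):
    """Lloyd's algorithm: alternate nearest-center assignment and center update."""
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its nearest center
        labels = [
            min(
                range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])),
            )
            for p in points
        ]
        # Update step: move each center to the mean of its members
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # leave an empty cluster's center where it is
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical profiles: two genes low in both conditions, two genes high in both.
profiles = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans(profiles, centers=[(0.0, 0.0), (10.0, 10.0)])
```

In practice the result depends on the starting centers, so K-means is usually run from several initializations.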

Efficiency of Aggregate Data in Non-linear Regression

  • Huh, Jib
    • Communications for Statistical Applications and Methods / v.8 no.2 / pp.327-336 / 2001
  • This work concerns estimating a non-linear regression function using aggregate data. In much empirical research, data are aggregated for various reasons before statistical analysis. In a traditional parametric approach, a linear estimation of the non-linear function with aggregate data can result in unstable parameter estimators. A more serious consequence is bias in the estimation of the non-linear function. The approach we employ is kernel regression smoothing. We describe the conditions under which aggregate data can be used to estimate the regression function efficiently. Numerical examples illustrate our findings.
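
Kernel regression smoothing estimates the regression function at a point as a locally weighted average of the responses. The sketch below uses the Nadaraya-Watson estimator with a Gaussian kernel. Both choices are assumptions, since the abstract does not say which kernel smoother or kernel the paper uses. The function name and arguments are hypothetical.

```python
import math


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Estimate the regression function at x0 as a kernel-weighted mean of ys.

    Assumption: Gaussian kernel. Larger bandwidths give smoother estimates.
    If x0 is very far from every x, the weights can underflow to zero.
    """
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
```

With aggregate data, `xs` and `ys` would hold group-level averages rather than individual observations. The paper's conditions for when that is efficient are not reproduced here.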

A Necessity of Measurement Customer Satisfaction to NSO Products for Enhancing Quality

  • Choi, Kyung-Ho
    • Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.781-790 / 2005
  • Nowadays, statistical data with coherence, accuracy and timeliness are necessary for governments, companies and research centers in decision making and research. In other words, the importance of statistical data quality is steadily increasing. In this paper, we therefore argue for the necessity of measuring customer satisfaction with NSO products in order to enhance quality. We construct a measurement scale for customer satisfaction based on the statistical quality indicators. We also recommend the use of a structural equation model in relational analysis for improving statistical quality.
