• Title/Summary/Keyword: representative statistics


Comparison of Neural Network Techniques for Text Data Analysis

  • Kim, Munhee;Kang, Kee-Hoon
    • International Journal of Advanced Culture Technology / v.8 no.2 / pp.231-238 / 2020
  • Generally, sequential data refers to data having continuity. Text data, a representative type of unstructured data, is also sequential in that the meaning of a following word or context depends on the meaning of the preceding words. Many techniques for analyzing sequential data such as text have been proposed. This paper introduces four neural network methods: 1D-CNN, LSTM, BiLSTM, and C-LSTM. Using these methods, IMDb movie review data were classified into two classes to compare the techniques in terms of accuracy and analysis time.

Design-based and model-based Inferences in Survey Sampling (표본조사에서 설계기반추론과 모형기반추론)

  • Kim Kyu-Seong
    • The Korean Journal of Applied Statistics / v.18 no.3 / pp.673-687 / 2005
  • We investigate design-based and model-based inference, the two usual inferential methods in survey sampling. While design-based inference rests on the randomization principle, model-based inference is based on the likelihood principle as well as the conditionality principle. The two approaches have long been in dispute, and the dispute has not yet been resolved. In this paper we review some issues concerning the two approaches and compare their advantages and disadvantages from several viewpoints.

Adjusting sampling bias in case-control genetic association studies

  • Seo, Geum Chu;Park, Taesung
    • Journal of the Korean Data and Information Science Society / v.25 no.5 / pp.1127-1135 / 2014
  • Genome-wide association studies (GWAS) are designed to discover genetic variants, such as single nucleotide polymorphisms (SNPs), that are associated with complex human traits. Although there is increasing interest in applying GWAS methodologies to population-based cohorts, many published GWAS have adopted a case-control design, which raises the issue of sampling bias in both case and control samples. Because cases and controls have unequal selection probabilities, the samples are not representative of the population they are purported to represent. Non-random sampling in a case-control study can therefore lead to inconsistent and biased estimates of SNP-trait associations. In this paper, we propose inverse-probability-of-sampling weights based on disease prevalence to eliminate case-control sampling bias in estimation and testing for association between SNPs and quantitative traits. We apply the proposed method to data from the Korea Association Resource project and show that the standard estimators applied to the weighted data yield unbiased estimates.
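
The weighting idea in this abstract can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the authors' implementation: it assumes the disease prevalence is known, that cases and controls are each simple random samples from their population strata, and that the SNP-trait association is estimated by weighted least squares. The function names and the toy values are hypothetical.

```python
import numpy as np

def sampling_weights(is_case, prevalence):
    """Inverse-probability-of-sampling weights for a case-control sample.

    Assumption: `prevalence` (population share of cases) is known, so each
    group is re-weighted from its sample share to its population share.
    """
    is_case = np.asarray(is_case, dtype=bool)
    case_fraction = is_case.mean()  # share of cases in the sample
    w_case = prevalence / case_fraction
    w_control = (1.0 - prevalence) / (1.0 - case_fraction)
    return np.where(is_case, w_case, w_control)

def weighted_fit(genotype, trait, weights):
    """Weighted least-squares fit of trait ~ intercept + genotype."""
    x = np.asarray(genotype, dtype=float)
    y = np.asarray(trait, dtype=float)
    w = np.asarray(weights, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    # Solve the weighted normal equations (X' W X) beta = X' W y.
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)  # [intercept, slope]

# Toy usage: 2 cases, 2 controls, assumed prevalence of 10%.
w = sampling_weights([1, 1, 0, 0], prevalence=0.1)
beta = weighted_fit([0, 1, 2, 1], [1.0, 3.0, 5.0, 3.0], w)
```

With these weights the cases are down-weighted and the controls up-weighted so that the weighted sample mimics the population mix; the weights sum to the sample size.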

Joint latent class analysis for longitudinal data: an application on adolescent emotional well-being

  • Kim, Eun Ah;Chung, Hwan;Jeon, Saebom
    • Communications for Statistical Applications and Methods / v.27 no.2 / pp.241-254 / 2020
  • This study proposes generalized models of joint latent class analysis (JLCA) for longitudinal data using two approaches: a JLCA with latent profile (JLCPA) and a JLCA with latent transition (JLTA). Our models reflect cross-sectional as well as longitudinal dependence among multiple latent classes and track multiple class sequences over time. For identifiability and meaningful inference, the EM algorithm produces maximum-likelihood estimates under local independence assumptions. As an empirical analysis, we apply our models to track the joint patterns of depression and anxiety among US adolescents and show that both JLCPA and JLTA identify three adolescent emotional well-being subgroups. In addition, JLCPA classifies two representative profiles for these emotional well-being subgroups across time, and these profiles show different tendencies depending on the parent-adolescent relationship subgroups.

Comparison of control charts for individual observations (개별 관측치에 대한 관리도 비교)

  • Lee, Sungim
    • The Korean Journal of Applied Statistics / v.35 no.2 / pp.203-215 / 2022
  • In this paper, we consider control charts for monitoring changes in the population mean of sequentially observed individual data. The most representative control charts are Shewhart's individual control chart, the exponentially weighted moving average (EWMA) control chart, and their combined control chart. We compare their performance in a simulation study and, through real data analysis, show how to apply the control charts in practice and investigate the problems of each chart.
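
As a rough illustration of the two charts named in this abstract, the sketch below computes textbook limits for a Shewhart individuals chart and for an EWMA chart. It is a minimal sketch, not the paper's code: the moving-range estimate of sigma (divided by 1.128), the smoothing constant `lam=0.2`, and the limit width `L=3.0` are conventional defaults assumed here, and the function names are hypothetical.

```python
import numpy as np

def shewhart_individual_limits(x, k=3.0):
    """Return (LCL, center, UCL) for an individuals chart.

    Sigma is estimated from the average moving range of two consecutive
    points, divided by the constant d2 = 1.128.
    """
    x = np.asarray(x, dtype=float)
    center = x.mean()
    sigma = np.abs(np.diff(x)).mean() / 1.128
    return center - k * sigma, center, center + k * sigma

def ewma_chart(x, mu0, sigma, lam=0.2, L=3.0):
    """Return (z, LCL, UCL) for an EWMA chart.

    z_t = lam * x_t + (1 - lam) * z_{t-1}, started at z_0 = mu0, with the
    exact time-varying control limits.
    """
    x = np.asarray(x, dtype=float)
    z = np.empty_like(x)
    prev = mu0
    for i, value in enumerate(x):
        prev = lam * value + (1.0 - lam) * prev
        z[i] = prev
    t = np.arange(1, len(x) + 1)
    half_width = L * sigma * np.sqrt(
        lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
    )
    return z, mu0 - half_width, mu0 + half_width

# Toy usage: flag points whose EWMA statistic leaves the limits.
data = [10.1, 9.8, 10.3, 10.0, 11.5, 11.8, 12.1]
z, lcl, ucl = ewma_chart(data, mu0=10.0, sigma=0.3)
out_of_control = (z < lcl) | (z > ucl)
```

A smaller `lam` gives more weight to past observations, which is why EWMA charts are usually described as more sensitive to small sustained shifts than the Shewhart chart.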

A Multivariate GARCH Analysis on International Stock Market Integration: Korean Market Case

  • Kim, Namhyoung
    • Management Science and Financial Engineering / v.21 no.1 / pp.31-39 / 2015
  • Financial integration is a phenomenon in which global financial markets are closely connected with each other. This article investigates the integration of the Korean stock market with other stock markets using a multivariate GARCH analysis. We chose a total of seven countries, including Korea, based on export volume, and then chose major stock indices that can be regarded as representative of those countries' stock markets. The empirical analysis has shown that countries' financial integration.

A Systematic Approach to Quality Measurement of Official Statistics (국가통계 품질측정을 위한 체계적 접근 - 표본조사의 품질평가지표 개발을 중심으로)

  • 이동명;김설희
    • Proceedings of the Korean Association for Survey Research Conference / 2002.11a / pp.111-127 / 2002
  • As the use of official statistics has recently increased, so has the need for objective quality assessment. Since mean square error (MSE) and response rate, which have been regarded as representative quality indicators, have limits in their use, there is demand for new quality indicators that reflect the various requirements of users. In this paper, for sample surveys conducted by governmental agencies, the flow of procedures in statistics production is analyzed using the input and output of each procedure. On that basis, we discuss how to identify and develop quality indicators for each procedure, with some examples of indicators. Finally, a quality index based on the results of the quality assessment would be calculated by a weighting method according to the size of the deviation of the statistical measures.


A Statistical Analysis on Temperature Change and Climate Variability in Korea (한국의 기온변화와 기온변동성에 대한 통계적 연구)

  • Kim, Hyun-Chul;Choi, Seung-Kyung;Yun, Bo-Ra
    • Communications for Statistical Applications and Methods / v.18 no.1 / pp.1-12 / 2011
  • We analyzed 50 years of observed temperature data from 5 representative locations in Korea to verify global warming and an increase in climate variability. We found some level of global warming, but we could not disregard the effects of urbanization. In addition, we found no evidence of an increase in climate variability.

Performance comparison for automatic forecasting functions in R (R에서 자동화 예측 함수에 대한 성능 비교)

  • Oh, Jiu;Seong, Byeongchan
    • The Korean Journal of Applied Statistics / v.35 no.5 / pp.645-655 / 2022
  • In this paper, we investigate automatic functions for time series forecasting in R and compare their performance. For exponential smoothing models and ARIMA (autoregressive integrated moving average) models, we focus on the representative time series forecasting functions in R: forecast::ets(), forecast::auto.arima(), smooth::es() and smooth::auto.ssarima(). To compare their forecast performance, we use the M3-Competition data, consisting of 3,003 time series, and adopt 3 accuracy measures. We confirm that each of the four automatic forecasting functions has strengths and weaknesses in flexibility and convenience for time series modeling, forecasting accuracy, and execution time.
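
The abstract does not name its three accuracy measures. As a hedged illustration of how forecasts from such functions are typically scored, the sketch below implements two measures often used with M3-style data, sMAPE and MASE; choosing these two is an assumption, not a statement about the paper. The function names and the toy numbers are hypothetical.

```python
import numpy as np

def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(2.0 * np.abs(a - f) / (np.abs(a) + np.abs(f)))

def mase(actual, forecast, train, m=1):
    """Mean absolute scaled error.

    The forecast MAE is scaled by the in-sample MAE of the seasonal naive
    forecast with period m (m=1 is the plain naive forecast).
    """
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    y = np.asarray(train, dtype=float)
    scale = np.mean(np.abs(y[m:] - y[:-m]))
    return np.mean(np.abs(a - f)) / scale

# Toy usage: score a two-step forecast against held-out values.
train = [1.0, 2.0, 4.0, 7.0]
actual = [10.0, 12.0]
forecast = [9.0, 14.0]
scores = {"sMAPE": smape(actual, forecast), "MASE": mase(actual, forecast, train)}
```

A MASE below 1 means the forecast beat the in-sample naive benchmark on average, which makes it comparable across series of different scales.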

Effective Detection Techniques for Gradual Scene Changes on MPEG Video (MPEG 영상에서의 점진적 장면전환에 대한 효과적인 검출 기법)

  • 윤석중;지은석;김영로;고성제
    • The Journal of Korean Institute of Communications and Information Sciences / v.24 no.8B / pp.1577-1585 / 1999
  • In this paper, we propose detection methods for gradual scene changes such as dissolve, pan, and zoom. The proposed method for detecting a dissolve region uses scene features based on spatial statistics of the image. The spatial statistics that define shot boundaries are derived from squared means within each local area. We also propose a camera motion detection method that uses four representative motion vectors in the background. The representative motion vectors are derived from macroblock motion vectors extracted directly from MPEG streams. To reduce the implementation time, we use DC sequences rather than fully decoded MPEG video. In addition, to detect the gradual scene change region precisely, we use all types of MPEG frames (I, P, and B frames). Simulation results show that the proposed detection methods perform better than existing methods.
