Search | Korea Science

Multiple imputation and synthetic data (다중대체와 재현자료 작성)

Kim, Joungyoun;Park, Min-Jeong
- The Korean Journal of Applied Statistics / v.32 no.1 / pp.83-97 / 2019
As society develops, the dissemination of microdata has increased in response to the diverse analytical needs of users. Analysis of microdata for policy making, academic purposes, and similar uses is highly desirable in terms of value creation. However, providing microdata that remain useful carries a risk of exposing personal information. Several methods have been considered to protect personal information while preserving the usefulness of the data. One such method is to generate and use synthetic data. This paper aims to clarify synthetic data by exploring the methodologies and precautions related to it. To this end, we first explain multiple imputation, the Bayesian predictive model, and the Bayesian bootstrap, which are the foundations of synthetic data. We then link these concepts to the construction of fully and partially synthetic data. To illustrate the creation of synthetic data, we review a real longitudinal synthetic data example based on sequential regression multivariate imputation.
https://doi.org/10.5351/KJAS.2019.32.1.083
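The abstract above names multiple imputation and the Bayesian bootstrap as building blocks. The following is a minimal sketch of how the two can fit together for a single variable, not the authors' procedure. It assumes values are missing completely at random, and the function names (`bayesian_bootstrap_draw`, `rubin_combine`, `multiply_impute_mean`) are hypothetical. The variance formula is the standard missing-data version of Rubin's rules; the combining rules for fully and partially synthetic data differ.

```python
import random
import statistics


def bayesian_bootstrap_draw(observed, n_missing, rng):
    """Draw n_missing imputations from the observed values (Bayesian bootstrap)."""
    n = len(observed)
    # n - 1 sorted uniforms split [0, 1] into n gaps; the gaps are one draw
    # of selection probabilities from a flat Dirichlet distribution.
    cuts = sorted(rng.random() for _ in range(n - 1))
    probs = [b - a for a, b in zip([0.0] + cuts, cuts + [1.0])]
    return rng.choices(observed, weights=probs, k=n_missing)


def rubin_combine(estimates, variances):
    """Pool m completed-data estimates with Rubin's rules for missing data."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # mean within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q_bar, total


def multiply_impute_mean(values, m=20, seed=0):
    """Estimate a mean from data with None entries using m imputations (m >= 2)."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    n_missing = len(values) - len(observed)
    estimates, variances = [], []
    for _ in range(m):
        completed = observed + bayesian_bootstrap_draw(observed, n_missing, rng)
        estimates.append(statistics.fmean(completed))
        # Estimated variance of the mean for this completed dataset.
        variances.append(statistics.variance(completed) / len(completed))
    return rubin_combine(estimates, variances)
```

Each call to `bayesian_bootstrap_draw` uses fresh selection probabilities, so the spread between the m estimates reflects uncertainty about the distribution of the missing values as well as sampling noise.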

Imputation of Multiple Missing Values by Normal Mixture Model under Markov Random Field: Application to Imputation of Pixel Values of Color Image (마코프 랜덤 필드 하에서 정규혼합모형에 의한 다중 결측값 대체기법: 색조영상 결측 화소값 대체에 응용)

Kim, Seung-Gu
- Communications for Statistical Applications and Methods / v.16 no.6 / pp.925-936 / 2009
There are many approaches to imputing missing values in the i.i.d. case. However, imputation techniques for the Markov random field (MRF) case are hard to find. In this paper, we show that, under several practical assumptions, imputation under an MRF reduces to imputation by fitting a normal mixture model (NMM). Our multivariate normal mixture model based approach under MRF is applied to impute the missing pixel values of a 3-variate (R, G, B) color image, providing a technique to smooth the imputed values.
https://doi.org/10.5351/CKSS.2009.16.6.925

The Implementation Directions and an Analysis of Assistive Devices and Alternative Formats to Improve Accessibility for Disabled People (장애인 접근성 향상을 위한 보조기기 및 대체자료 분석과 구현 방향)

Rim, Myunghwan;Gil, Younhee;Jeon, Gwangil
- The Journal of the Korea Contents Association / v.15 no.7 / pp.664-673 / 2015
Assistive devices for disabled people are gaining attention, including from an industrial perspective, as a result of policies and support for disabled people, regulations enacted to improve accessibility, technological innovation, and product development. Recently, ICT products such as screen readers for visually impaired people, braille displays, screen magnifiers, and text converters have made it more convenient to access the internet through touch and hearing, to use electronic publishing content, and to send e-mail. Even so, in the rapidly changing era of smart digital media, accessibility for visually impaired people remains poor, and assistive devices and alternative formats need improvement. Therefore, from the perspective of research and development innovation, this study analyzes the current state and structure of alternative formats and assistive devices for visually impaired people and proposes implementation directions for improving accessibility. In the future, various types of digital information are expected to be converted into customized, realistic forms and distributed through dedicated products for disabled people or through smart devices.
https://doi.org/10.5392/JKCA.2015.15.07.664

A Study on Missing Data Imputation for Water Demand in 112 Block of Yoengjong Island, Korea (영종도 112블록 AMI 물 수요량 결측 자료 보정기법 연구)

Koo, Kang Min;Han, Kuk Heon;Yum, Kyung Taek;Jun, Kyung Soo
- Proceedings of the Korea Water Resources Association Conference / 2019.05a / pp.3-3 / 2019
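As climate change brings hard-to-predict events such as torrential rain and drought, the need for technology that supplies clean water reliably is growing. In response, the smart water grid, which combines IoT with existing water management systems, acquires demand and supply information in real time and thereby makes it possible to improve the efficiency of water management. Real-time demand data can be used to forecast water demand and thus determine the optimal water supply. Here, the core technology of the smart water grid is quality control of the data acquired in real time. In Block 112 of Yeongjong Island, the study area, 528 AMI smart meters remotely read water demand at one-hour intervals. Data collected through the AMI sensors installed at each customer site may contain errors, which arise from communication failures, meter faults, meter replacement, and similar causes. Because demand data are the basic input to hydraulic analysis of water distribution networks, missing values increase non-sampling error and reduce statistical power and accuracy. To clean the collected data into a usable form and impute the missing values, this study uses only fully observed data (complete data) to estimate hourly demand patterns by pipe diameter, usage type, and day of the week. The imputation of the missing data is then compared with and validated against existing methods such as mean imputation and hot deck imputation.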
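The two baseline methods named in this abstract, mean imputation and hot deck imputation, can be sketched as follows. This is an illustration only, not the authors' code: the record layout (dictionaries with a `demand` field that is `None` when missing) and the function names are assumptions, and the sketch assumes every imputation class contains at least one observed value.

```python
import random
import statistics
from collections import defaultdict


def group_observed(records, key):
    """Collect the observed demand values of each imputation class."""
    groups = defaultdict(list)
    for rec in records:
        if rec["demand"] is not None:
            groups[key(rec)].append(rec["demand"])
    return groups


def mean_impute(records, key):
    """Fill each missing demand with the mean of its class."""
    groups = group_observed(records, key)
    return [
        rec["demand"] if rec["demand"] is not None
        else statistics.fmean(groups[key(rec)])
        for rec in records
    ]


def hot_deck_impute(records, key, seed=0):
    """Fill each missing demand with a randomly chosen donor from its class."""
    rng = random.Random(seed)
    groups = group_observed(records, key)
    return [
        rec["demand"] if rec["demand"] is not None
        else rng.choice(groups[key(rec)])
        for rec in records
    ]
```

Here `key` is any function that maps a record to its class, for example `lambda rec: (rec["hour"], rec["weekday"])` to group readings by hour and day of the week. Mean imputation shrinks the variability of the filled series, while hot deck imputation keeps values that were actually observed.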

An Empirical Analysis of Longitudinal Missing-Data Patterns in Panel Data (패널자료의 종단적 결측패턴에 관한 실증분석 연구)

Son, Chang-Gyun
- Proceedings of the Korean Association for Survey Research Conference / 2011.10a / pp.273-285 / 2011
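In longitudinal studies such as panel surveys, nonresponse (missing data) arises in each survey wave over time for reasons such as panel aging. This paper uses statistical models to analyze the nonresponse patterns of a specific panel group. Based on this analysis of nonresponse patterns, an appropriate method can be chosen for analyzing longitudinal data with missing values, and, where nonresponse imputation is needed, an appropriate imputation method can be determined. Unlike cross-sectional surveys, panel surveys offer a variety of auxiliary information in each wave. If this auxiliary information can be used for nonresponse imputation, then, in contrast to panel data that still contain missing values, traditional statistical methods can be expected to be applied to produce standard results.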

Comparison of binary data imputation methods in clinical trials (임상시험에서 이분형 결측치 처리방법의 비교연구)

An, Koosung;Kim, Dongjae
- The Korean Journal of Applied Statistics / v.29 no.3 / pp.539-547 / 2016
We discuss how to handle missing binary data in clinical trials. We describe the patterns in which missing data occur and introduce imputation methods for missing binary data, including a modified method. A simulation is performed for each method by modifying actual data. The simulation conditions are controlled by the response rate and the missing value rate. We list the simulation results for each method and discuss them at the end of the paper.
https://doi.org/10.5351/KJAS.2016.29.3.539

Missing Imputation Methods Using the Spatial Variable in Sample Survey (표본조사에서 공간 변수(SPATIAL VARIABLE)를 이용한 결측 대체(MISSING IMPUTATION)의 효율성 비교)

Lee Jin-Hee;Kim Jin;Lee Kee-Jae
- The Korean Journal of Applied Statistics / v.19 no.1 / pp.57-67 / 2006
In sample surveys, nonresponse tends to occur inevitably. If we use information from respondents only, the estimates will be biased. To overcome this, various nonresponse imputation methods have been studied. If there are few auxiliary variables for imputing missing values, or if spatial autocorrelation exists between respondents and nonrespondents, spatial autocorrelation can be used for imputation. In this paper, we apply several nonresponse imputation methods, including spatial imputation, to the 2002 farm household economy data of Gangwon-do as an example. Numerical simulations show that spatial imputation is more efficient than the other methods.
https://doi.org/10.5351/KJAS.2006.19.1.057

Imputation Method using the Space-Time Model in Sample Survey (공간-시계열 모형을 이용한 결측대체 방법에 대한 연구)

Lee, Jin-Hee;Shin, Key-Il
- The Korean Journal of Applied Statistics / v.20 no.3 / pp.499-514 / 2007
It is common practice to use auxiliary variables to impute missing values arising from item nonresponse in surveys. Sometimes there are few auxiliary variables for missing value imputation, but if spatial and time autocorrelations exist, these correlations should be used for better results. Recently, Lee et al. (2006) showed that spatial autocorrelation, when present, can be used efficiently for missing value imputation, using the 2002 farm household economy data of Gangwon-do. In this paper, we present an evaluation of spatial and space-time nonresponse imputation methods when spatial and time autocorrelations exist, using monthly data for 2000-2002 from the same source used by Lee et al. (2006). Numerical simulations show that the space-time imputation method is more efficient than the other method.
https://doi.org/10.5351/KJAS.2007.20.3.499

Popularity-based Eviction Functions in Cache Managements (캐쉬 관리를 위한 인기도 기반의 대체 기준치에 관한 연구)

홍진선;이상호
- Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.55-57 / 2001
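Cache replacement algorithms are one way to overcome the limited storage capacity of a cache. To address the inaccuracy and insufficiency of the eviction criteria used by many existing replacement algorithms, we propose popularity. Popularity is the normalized rank of popular search queries, a statistic derived from a large volume of data. The popular queries on which popularity is based are sensitive to the passage of time, reflect society-wide trends, and overlap heavily. Popularity is computed by first calculating a single popularity and a cumulative popularity for each search engine and then merging them all. The result, called the merged popularity, is assigned to any given query as a decimal value between 0 and 1. Popularity can be applied to cache replacement in a meta search engine and can be used in problem domains where information about the input tendencies of many data items is available.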
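A popularity-driven eviction policy of the kind this abstract describes might look like the sketch below. The abstract does not give the formulas, so everything specific here is an assumption: rank is normalized as (n - position) / n, engines are merged by simple averaging, cumulative popularity is left out, and the names `rank_popularity`, `merged_popularity`, and `PopularityCache` are hypothetical.

```python
def rank_popularity(ranked_terms):
    """Map a ranked list of popular queries to scores in (0, 1]; rank 1 scores 1.0."""
    n = len(ranked_terms)
    return {term: (n - i) / n for i, term in enumerate(ranked_terms)}


def merged_popularity(rankings_per_engine):
    """Average per-engine scores; a term absent from an engine's list counts as 0."""
    totals = {}
    for ranked in rankings_per_engine:
        for term, score in rank_popularity(ranked).items():
            totals[term] = totals.get(term, 0.0) + score
    k = len(rankings_per_engine)
    return {term: total / k for term, total in totals.items()}


class PopularityCache:
    """Fixed-capacity cache (capacity >= 1) that evicts the least popular query."""

    def __init__(self, capacity, popularity):
        self.capacity = capacity
        self.popularity = popularity  # query -> merged popularity in [0, 1]
        self.store = {}

    def put(self, query, result):
        if query not in self.store and len(self.store) >= self.capacity:
            # Queries without a popularity score are treated as score 0.
            victim = min(self.store, key=lambda q: self.popularity.get(q, 0.0))
            del self.store[victim]
        self.store[query] = result

    def get(self, query):
        return self.store.get(query)
```

Unlike recency-based policies such as LRU, the eviction decision here depends on an external statistic rather than on the cache's own access history, which is the design point the abstract emphasizes.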

A two-sample test with interval censored competing risk data using multiple imputation (다중대체방법을 이용한 구간 중도 경쟁 위험 모형에서의 이표본 검정)

Kim, Yuwon;Kim, Yang-Jin
- The Korean Journal of Applied Statistics / v.30 no.2 / pp.233-241 / 2017
Interval censored data frequently occur in observational studies where subjects are followed periodically. In this paper, we propose a test statistic to compare the cumulative incidence functions (CIFs) of two groups with interval censored failure time data in the presence of competing risks. Gray (1988) suggested a test statistic for right censored data that motivated the well-known Fine and Gray subdistribution hazard model. A multiple imputation technique is used to adapt Gray's test statistic to interval censored data. The power and size of the suggested method are investigated through diverse simulation schemes. The main merit of the suggested method is that it is simple to implement with existing software for right censored data. The method is illustrated by analyzing Bangkok's HIV cohort dataset.
https://doi.org/10.5351/KJAS.2017.30.2.233
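The idea of turning interval censored records into right censored ones by multiple imputation can be sketched as below. This is a deliberately simplified stand-in, not the paper's imputation model: it assumes the event time is uniform within its observed interval, and the function name and record layout `(left, right, cause)` are hypothetical. Cause 0 marks a subject who was still event-free at the last visit.

```python
import math
import random


def impute_interval_censored(intervals, m=5, seed=0):
    """Return m datasets of (time, cause) pairs in right censored form.

    intervals: list of (left, right, cause). Use right = math.inf and
    cause = 0 for subjects still event-free at their last visit.
    """
    rng = random.Random(seed)
    datasets = []
    for _ in range(m):
        completed = []
        for left, right, cause in intervals:
            if math.isinf(right):
                # No event observed: stays right censored at the last visit.
                completed.append((left, 0))
            else:
                # Assumption: event time is uniform within [left, right].
                completed.append((rng.uniform(left, right), cause))
        datasets.append(completed)
    return datasets
```

Each of the m datasets can then be passed to existing software for right censored competing risks data, and the resulting test statistics pooled across imputations.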

