• Title/Summary/Keyword: representative statistics

A study on collecting representative food samples for the 10th Korean standard foods composition table (국가표준식품성분 데이터베이스 대표시료 선정을 위한 표본설계)

  • Kim, Jinheum;Hwang, Hae-Won;Cho, Yu Jung;Park, Jinwoo
    • The Korean Journal of Applied Statistics / v.33 no.2 / pp.215-228 / 2020
  • Under Article 19, Paragraph 1 of the Food Industry Promotion Act, the Rural Development Administration renews the Korean foods composition table every five years. Ahead of the publication of the tenth revision of the table in 2021, this paper suggests methods for collecting representative samples of 182 highly consumed foods in Korea. Food markets are categorized by distribution channel into supermarkets and local markets. Eight samples are collected from each category by applying the stratified multi-stage sampling of the National Food and Nutrient Analysis Program (NFNAP). The NFNAP was implemented in 1997 as a collaborative food composition research effort between the National Institutes of Health (NIH) and the US Department of Agriculture (USDA) to secure reliable estimates of the nutrient content of foods and beverages consumed by the US population. The supermarkets selected for collecting representative food samples are Emart Kayang, Homeplus Siheung, Lottemart Dongducheon, Emart Suwon, Lottemart Dunsan, Lottemart Yeosu, Emart Ulsan, and Hanaroclub Ulsan. The selected local markets are Doksandongusijang in Geumcheon-gu and Pungnapsijang in Songpa-gu, Seoul; Ilsansijang in Ilsanseo-gu, Goyang; Unamsijang in Buk-gu, Gwangju; Beopdongsijang in Daedeok-gu, Daejeon; Bongnaesijang in Yeongdo-gu and Jwadongjaeraesijang in Haeundae-gu, Busan; and Jungangsijang in Jinhae-gu, Changwon.

A Strategy Through Segmentation Using Factor and Cluster Analysis: focusing on corporations having a special status (요인분석과 군집분석을 통한 세분화 및 전략방향 제시: 특수법인 사례를 중심으로)

  • Cho, Yong-Jun;Kim, Yeong-Hwa
    • The Korean Journal of Applied Statistics / v.20 no.1 / pp.23-38 / 2007
  • In general, how a corporation approaches segmentation depends on whether target variables exist. In this paper, for the case with no target variables, a segmentation-based strategy is proposed for corporations having a special status, based on the management index. When segmentation relies on cluster analysis, however, classifying on many variables makes the resulting segments difficult to characterize. Therefore, representative factors are first extracted by factor analysis, and a segmentation method using two-step cluster analysis is then applied to these representative factors. As a result, six segments are found, and a strategy is proposed for each group that strengthens its prominent factors and compensates for its weak ones.

A Study on Analyzing the Difference Factors Occurred in the Secondary School Mathematics Teachers on the Mathematical Knowledge of Teaching and on Exploring the Enhancement on the Statistical Literacy (수학 중등 교사들 간의 수학교수지식(MKT) 차이 발생 요인 분석 및 이를 통한 통계적 소양 신장 방안)

  • Kim, Seul Bi;Hwang, Hye Jeang
    • East Asian mathematical journal / v.39 no.2 / pp.141-166 / 2023
  • The purpose of this study is to assess the MKT (Mathematical Knowledge for Teaching) of in-service mathematics teachers in statistics (representative values and degree of scattering) through a comparative analysis of the sub-elements of MKT. It also examines the factors that cause differences in the subjects' MKT. To this end, test items on MKT in statistics were developed, and data were collected from 12 in-service secondary mathematics teachers and analyzed. The analysis of the MKT test sheets showed that the teachers scored high on CCK (Common Content Knowledge) and SCK (Specialized Content Knowledge), whereas they scored low on KCS (Knowledge of Content and Students) and KCT (Knowledge of Curriculum and Teaching). In addition, the results showed only slight differences in the MKT elements between middle school and high school teachers.

A Robust Approach of Regression-Based Statistical Matching for Continuous Data

  • Sohn, Soon-Cheol;Jhun, Myoung-Shic
    • The Korean Journal of Applied Statistics / v.25 no.2 / pp.331-339 / 2012
  • Statistical matching is a methodology used to merge microdata from two (or more) files into a single matched file, the variants of which have been extensively studied. Among existing studies, we focused on Moriarity and Scheuren's (2001) method, which is a representative method of statistical matching for continuous data. We examined this method and proposed a revision to it by using a robust approach in the regression step of the procedure. We evaluated the efficiency of our revised method through simulation studies using both simulated and real data, which showed that the proposed method has distinct advantages over existing alternatives.

A Pattern Consistency Index for Detecting Heterogeneous Time Series in Clustering Time Course Gene Expression Data (시간경로 유전자 발현자료의 군집분석에서 이질적인 시계열의 탐지를 위한 패턴일치지수)

  • Son, Young-Sook;Baek, Jang-Sun
    • The Korean Journal of Applied Statistics / v.18 no.2 / pp.371-379 / 2005
  • In this paper, we propose a pattern consistency index for detecting heterogeneous time series that deviate from the representative pattern of each cluster in clustering time course gene expression data using the Pearson correlation coefficient. We examine its usefulness by applying this index to serum time course gene expression data from microarrays.

A Study on the Spread of Apartments Using the Correlation Analysis among Housing Statistics Indices (주택통계 지표간의 상관분석을 이용한 아파트 주거 확산에 관한 고찰)

  • Choi, Jung-Min
    • Journal of the Korean housing association / v.17 no.5 / pp.77-86 / 2006
  • This study seeks clues to the background and causes of the mass supply of apartments in Korea through a correlation analysis of 30 indices extracted from representative housing statistics data. From the perspective of housing flow, the supply ratio of apartments is closely related to the 'average floor area ratio' and 'the construction amount of Dagagu housing and Dasedae housing'. From the perspective of housing stock, by contrast, the supply amount of apartments is strongly related to 'housing redevelopment construction' and 'housing construction by public sector'. These indices are deeply involved in the spread of apartments. However, because the indices used in the analysis are highly correlated with one another and indices related to housing policy or the housing system are absent, no single critical index for the spread of apartments was found.

A Classification Method Using Data Reduction

  • Uhm, Daiho;Jun, Sung-Hae;Lee, Seung-Joo
    • International Journal of Fuzzy Logic and Intelligent Systems / v.12 no.1 / pp.1-5 / 2012
  • Data reduction has been widely used in data mining to make analysis more convenient. Principal component analysis (PCA) and factor analysis (FA) are popular techniques. PCA and FA reduce the number of variables to avoid the curse of dimensionality, in which computing time increases exponentially with the number of variables. Many methods have therefore been published for dimension reduction. Data augmentation is another approach to analyzing data efficiently. The support vector machine (SVM) algorithm is a representative technique for dimension augmentation. The SVM maps the original data to a high-dimensional feature space to obtain the optimal decision plane. Both data reduction and augmentation have been used to solve diverse problems in data analysis. In this paper, we compare the strengths and weaknesses of dimension reduction and augmentation for classification and propose a classification method using data reduction. We carry out comparative experiments to verify the performance of the proposed method.

A Sampling Design for Health Index Survey

  • Ryu, Jea-Bok;Lee, Kay-O;Kim, Young-Won
    • Communications for Statistical Applications and Methods / v.9 no.2 / pp.565-576 / 2002
  • We propose a new sampling design for the 2001 Health Index Survey in Seoul. In this stratified two-stage sampling design, the enumeration district (ED) of the 2000 Population and Housing Census is used as the primary sampling unit and the Gu is used as the stratification variable, in order to obtain sub-domain estimates for the 25 Gus as well as a population estimate for Seoul. To obtain a representative sample, the sample EDs are selected systematically after the EDs are ordered by location and property. Imputation methods for item nonresponse are also suggested.
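
The first-stage selection described in this abstract (stratify by Gu, order the EDs, then select systematically) might be sketched as follows. This is only an illustrative sketch, not the authors' procedure: the frame, the field names `gu`, `location`, and `property`, the equal allocation per stratum, and both function names are assumptions made for the example.

```python
import random
from collections import defaultdict


def systematic_sample(units, n, rng):
    """Pick n units from an ordered list using a random start and a fixed interval."""
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and the number of units")
    step = len(units) / n          # fractional sampling interval (>= 1)
    start = rng.random() * step    # random start in [0, step)
    # min() guards against a floating-point index landing on len(units)
    return [units[min(int(start + i * step), len(units) - 1)] for i in range(n)]


def stratified_systematic_sample(frame, n_per_stratum, seed=0):
    """Group EDs by Gu, order each group, and sample systematically within it."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for ed in frame:
        strata[ed["gu"]].append(ed)
    sample = {}
    for gu, eds in strata.items():
        # Ordering before selection spreads the sample across the ordering variables.
        eds.sort(key=lambda ed: (ed["location"], ed["property"]))
        sample[gu] = systematic_sample(eds, n_per_stratum, rng)
    return sample


# Hypothetical toy frame: 3 strata with 20 EDs each.
frame = [
    {"id": f"{gu}-{i}", "gu": gu, "location": i // 5, "property": i % 3}
    for gu in ("Gu-A", "Gu-B", "Gu-C")
    for i in range(20)
]
sample = stratified_systematic_sample(frame, n_per_stratum=4)
for gu, eds in sample.items():
    print(gu, [ed["id"] for ed in eds])
```

Because the interval is at least one unit, the selected positions within each stratum are always distinct.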

A Regression based Unconstraining Demand Method in Revenue Management (수입관리에서 회귀모형 기반 수요 복원 방법)

  • Lee, JaeJune;Lee, Woojoo;Kim, Junghwan
    • The Korean Journal of Applied Statistics / v.28 no.3 / pp.467-475 / 2015
  • Accurate demand forecasting is a crucial component of revenue management (RM). The booking data of departed flights are used to forecast demand for future flights; however, booking requests that were denied are omitted from the departed-flight data. In statistical terms, denied booking requests can be interpreted as censored observations. Unconstraining demand is therefore an important step in forecasting the true demand for future flights. Several unconstraining methods have been introduced, and a method based on expectation maximization is considered superior. In this study, we propose a new unconstraining method based on a regression model that can accommodate such censored data. Through a simulation study, the performance of the proposed method was compared with that of two representative unconstraining methods widely used in RM.
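
To see why censoring matters here, the following small simulation shows that averaging bookings capped at capacity understates mean demand. It is a sketch under assumed values and does not implement the regression-based method or any unconstraining method from the abstract: the normal demand distribution, its parameters, and the capacity of 100 are all hypothetical.

```python
import random
import statistics

rng = random.Random(42)
capacity = 100                     # hypothetical booking limit per flight

# Hypothetical true demand for 10,000 departed flights.
true_demand = [rng.gauss(95, 15) for _ in range(10_000)]

# Requests beyond capacity are denied, so only min(demand, capacity) is recorded.
observed = [min(d, capacity) for d in true_demand]

censored_share = sum(d >= capacity for d in true_demand) / len(true_demand)
naive_mean = statistics.fmean(observed)    # ignores censoring
true_mean = statistics.fmean(true_demand)  # unknown in practice

print(f"censored flights: {censored_share:.1%}")
print(f"naive mean of observed bookings: {naive_mean:.2f}")
print(f"mean of true demand: {true_mean:.2f}")
```

The gap between the two means is the bias an unconstraining method tries to remove.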

Investigations into Coarsening Continuous Variables

  • Jeong, Dong-Myeong;Kim, Jay-J.
    • The Korean Journal of Applied Statistics / v.23 no.2 / pp.325-333 / 2010
  • Protection against disclosure of survey respondents' identifiable and/or sensitive information is a prerequisite for statistical agencies that release microdata files from their sample surveys. Coarsening is one of the popular methods for protecting the confidentiality of the data. Grouped data can be released in the form of microdata or tabular data. Rather than releasing the data in tabular form only, making microdata available to the public with interval codes and their representative values greatly enhances the utility of the data. It allows researchers to compute covariances between variables, build statistical models, or run a variety of statistical tests on the data. It may be conjectured that the variance of the interval data is lower than that of the ungrouped data, in the sense that the coarsened data do not have the within-interval variance. This conjecture is investigated using the uniform and triangular distributions. Traditionally, the midpoint is used to represent all the values in an interval. This approach implicitly assumes that the data are uniformly distributed within each interval. However, this assumption may not hold, especially in the last interval of economic data. In this paper, we use three distributional assumptions (uniform, Pareto, and lognormal) in the last interval and either the midpoint or the median for the other intervals, applied to wage and food costs from Statistics Korea's 2006 Household Income and Expenditure Survey (HIES) data, and compare these approaches in terms of the first two moments.
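
The variance conjecture in this abstract can be checked exactly in the simplest setting. The sketch below assumes X is Uniform(0, 1), coarsened into k equal-width intervals and coded by midpoint. This setting is chosen for illustration and is not necessarily the paper's, and the function name is hypothetical. The variance of the midpoint-coded variable plus the within-interval variance recovers the full variance of 1/12.

```python
from fractions import Fraction


def midpoint_variance_uniform(k):
    """Variance of X ~ Uniform(0, 1) after midpoint coding into k equal-width intervals."""
    width = Fraction(1, k)
    midpoints = [width * (2 * j + 1) / 2 for j in range(k)]
    mean = sum(midpoints) / k      # each interval has probability 1/k
    return sum((m - mean) ** 2 for m in midpoints) / k


total = Fraction(1, 12)            # variance of Uniform(0, 1)
for k in (2, 5, 10):
    between = midpoint_variance_uniform(k)
    within = Fraction(1, 12 * k * k)   # variance of a uniform on an interval of width 1/k
    assert between + within == total   # law of total variance
    print(k, between, float(between / total))
```

The coarsened variance is always below 1/12 in this setting, and the shortfall shrinks as the number of intervals grows.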