• Title/Summary/Keyword: ordering of categories (범주의 순서화)


Ordering Variables and Categories on the Mosaic Plot (모자이크 플롯에서 변수와 범주의 순서화)

  • Lee, Moon-Joo;Huh, Myung-Hoe
    • The Korean Journal of Applied Statistics / v.21 no.5 / pp.875-888 / 2008
  • Mosaic plots, proposed by Hartigan and Kleiner (1981, 1984), are very useful for visualizing categorical data. In a mosaic plot, multi-way classified cell frequencies are represented by rectangles with proportional areas. The plot is easy to understand while preserving the information contained in the data. The plot's appearance, however, changes substantially depending on the order in which the variables are put into the plot and on the order of the categories within each variable. In this study, we propose algorithms for ordering the variables and categories of categorical data to be explored via mosaic plots. We demonstrate our methods on three well-known datasets: Titanic, Housing and PreSex.
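
  To make the dependence on ordering concrete, the following minimal sketch computes the tile geometry of a two-way mosaic plot on the unit square. It does not implement the ordering algorithms proposed in the paper; the function name `mosaic_tiles` and the toy counts are assumptions made for illustration only.

```python
def mosaic_tiles(table, row_order, col_order):
    """Return {(row, col): (x, y, width, height)} for a two-way mosaic plot.

    The unit square is first cut into vertical strips whose widths are
    proportional to the row-variable totals; each strip is then cut
    into tiles whose heights are the conditional proportions of the
    column variable within that strip.
    """
    total = sum(sum(cols.values()) for cols in table.values())
    tiles = {}
    x = 0.0
    for r in row_order:
        row_total = sum(table[r].values())
        width = row_total / total
        y = 0.0
        for c in col_order:
            height = table[r][c] / row_total if row_total else 0.0
            tiles[(r, c)] = (x, y, width, height)
            y += height
        x += width
    return tiles


# Hypothetical 2 x 2 table of counts (not taken from the paper).
counts = {"A": {"yes": 30, "no": 10}, "B": {"yes": 20, "no": 40}}

tiles = mosaic_tiles(counts, ["A", "B"], ["yes", "no"])
swapped = mosaic_tiles(counts, ["B", "A"], ["yes", "no"])
```

  Each tile's area (width times height) equals its cell's share of the total count regardless of ordering, but swapping the category order moves the tiles, which is why the same table can look quite different.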

Developing of Exact Tests for Order-Restrictions in Categorical Data (범주형 자료에서 순서화된 대립가설 검정을 위한 정확검정의 개발)

  • Nam, Jusun;Kang, Seung-Ho
    • The Korean Journal of Applied Statistics / v.26 no.4 / pp.595-610 / 2013
  • Tests of order-restricted alternative hypotheses in $2 \times k$ contingency tables are applicable to various fields, including medicine, sociology, and business administration. Most testing methods have been developed based on large-sample theory. When the sample size is small or unbalanced, the Type I error rate of such large-sample tests can differ considerably from the nominal level of 5%. In this paper, an exact testing method is introduced for order-restricted alternative hypotheses in categorical data, particularly for small or extremely unbalanced samples. Power and exact p-values are calculated.
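
  As an illustration of what an exact test for an ordered alternative can look like, the sketch below computes a one-sided exact p-value for an increasing trend in a $2 \times k$ table by enumerating every table with the observed margins. This is a generic conditional trend test, not necessarily the procedure developed in the paper; the function name `exact_trend_pvalue` and the default equally spaced scores are assumptions.

```python
from itertools import product
from math import comb


def exact_trend_pvalue(successes, totals, scores=None):
    """One-sided exact p-value for an increasing trend in a 2 x k table.

    successes[j] is the number of successes in ordered category j and
    totals[j] is the category size. Conditional on both margins, the
    success counts follow a multivariate hypergeometric distribution
    under the null hypothesis of no trend.
    """
    k = len(totals)
    scores = list(range(k)) if scores is None else list(scores)
    m, n = sum(successes), sum(totals)
    t_obs = sum(s * x for s, x in zip(scores, successes))

    favourable = 0  # total weight of tables at least as extreme as observed
    for x in product(*(range(t + 1) for t in totals)):
        if sum(x) != m:
            continue  # keep only tables with the observed row margin
        weight = 1
        for t, xi in zip(totals, x):
            weight *= comb(t, xi)
        if sum(s * xi for s, xi in zip(scores, x)) >= t_obs:
            favourable += weight
    return favourable / comb(n, m)


# Hypothetical data: 3 ordered groups of size 3 with 0, 1, 3 successes.
p_value = exact_trend_pvalue([0, 1, 3], [3, 3, 3])
```

  Full enumeration is feasible only for small tables, which is precisely the setting where large-sample approximations are least reliable.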

Development of Core Components of Projected Clustering for High-Dimensional Categorical Data (고차원 범주형 데이터를 위한 투영 군집화 기법의 핵심 요소 개발)

  • Kim Min-Ho;Ramakrishna R.S.
    • Proceedings of the Korean Information Science Society Conference / 2006.06b / pp.181-183 / 2006
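  • This paper deals with the clustering of high-dimensional categorical data. The limitations underlying existing similarity (dissimilarity) measures for categorical data objects stem from the absence of an ordering like that of numerical data and from the high dimensionality and sparsity of the data; projected clustering is a technique that can overcome these limitations effectively. This paper addresses projected clustering that can handle high-dimensional categorical data effectively, and proposes its core components: a definition of cluster dimensions and a measure of cluster cohesion.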


Data Priority-Based Timestamp-Ordering Protocol for Transactions (트랜잭션을 위한 데이터 우선순위 기반형 시간소인 순서화 기법)

  • Yun, Seok-Hwan;Kim, Pyeong-Jung;Park, Ji-Eun;Lee, Jae-Yeong;Lee, Dong-Hyeon;Gung, Sang-Hwan
    • The Transactions of the Korea Information Processing Society / v.4 no.5 / pp.1196-1210 / 1997
  • Among transaction scheduling algorithms, the timestamp-ordering protocol assigns timestamps to transactions as they enter the system and schedules them based on those timestamps. This can cause priority reversion, in which a transaction with higher priority is processed after a transaction with lower priority. To prevent this reversion, we suggest a data priority-based timestamp-ordering protocol that groups transactions into constant time intervals based on their entry points and then orders them by priority within the same timestamp group. To evaluate the performance of this protocol, we constructed a simulation environment with a real-time database system and compared the protocol with others. We verified that the performance of the proposed protocol is superior to that of the timestamp-ordering protocol under conditions of high load and high data conflict.
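
  The grouping idea in this abstract can be sketched as follows. This is only an illustration of ordering by fixed-width timestamp group and then by priority within a group; the abstract does not say how priorities are derived from the data, so the `priority` field, the `Txn` class, and the `schedule` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    name: str
    arrival: float  # time at which the transaction entered the system
    priority: int   # larger value = more urgent (assumed convention)


def schedule(txns, interval):
    """Order transactions by timestamp group, then by priority.

    Transactions arriving within the same fixed-width interval share a
    timestamp group; inside a group, higher priority goes first and
    arrival time breaks ties.
    """
    def key(t):
        group = int(t.arrival // interval)
        return (group, -t.priority, t.arrival)

    return sorted(txns, key=key)


txns = [Txn("T1", 0.1, 1), Txn("T2", 0.4, 5), Txn("T3", 1.2, 9)]
order = [t.name for t in schedule(txns, interval=1.0)]
```

  Plain timestamp ordering would run T1 before T2. Here T2 moves ahead of T1 because both fall into the same group and T2 has the higher priority, while T3 stays last because it belongs to a later group.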


Permutation p-values for specific-category kappa measure of agreement (특정 범주에 대한 평가자간 카파 일치도의 퍼뮤테이션 p값)

  • Um, Yonghwan
    • Journal of the Korean Data and Information Science Society / v.27 no.4 / pp.899-910 / 2016
  • Asymptotic tests are often not suitable for the analysis of sparse ordered contingency tables, as asymptotic p-values may either overestimate or underestimate the true p-values. In this paper, we describe permutation procedures in which we compute exact or resampling p-values for a weighted specific-category agreement in ordered $k \times k$ contingency tables. We use the weighted specific-category kappa proposed by Kvålseth to measure the extent to which two independent raters agree on the specific categories. We carried out comparison studies between exact p-values, resampling p-values and asymptotic p-values using $3 \times 3$ contingency data (real and artificial data sets) and $4 \times 4$ artificial contingency data.
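
  The resampling idea can be sketched as below. Note the substitution: the code uses the ordinary unweighted Cohen's kappa as a stand-in, not Kvålseth's weighted specific-category kappa studied in the paper. The function names, the number of resamples, and the toy ratings are assumptions.

```python
import random
from collections import Counter


def cohen_kappa(r1, r2):
    """Unweighted Cohen's kappa for two equally long rating lists."""
    n = len(r1)
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n
    m1, m2 = Counter(r1), Counter(r2)
    p_exp = sum(m1[c] * m2[c] for c in m1) / (n * n)
    if p_exp == 1.0:
        return 0.0  # degenerate case: both raters use a single category
    return (p_obs - p_exp) / (1.0 - p_exp)


def resampling_pvalue(r1, r2, n_resamples=2000, seed=0):
    """Monte Carlo permutation p-value for H0: no agreement beyond chance.

    The second rater's ratings are shuffled, which keeps both marginal
    distributions fixed while breaking the pairing between raters.
    """
    rng = random.Random(seed)
    observed = cohen_kappa(r1, r2)
    shuffled = list(r2)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        if cohen_kappa(r1, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)


# Hypothetical ratings: two raters agreeing perfectly on 12 items.
ratings = [1] * 4 + [2] * 4 + [3] * 4
p_value = resampling_pvalue(ratings, ratings)
```

  An exact permutation p-value would replace the random shuffles with a full enumeration of the permutation distribution, which is practical only for small tables.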

A Big Data Analysis by Between-Cluster Information using k-Modes Clustering Algorithm (k-Modes 분할 알고리즘에 의한 군집의 상관정보 기반 빅데이터 분석)

  • Park, In-Kyoo
    • Journal of Digital Convergence / v.13 no.11 / pp.157-164 / 2015
  • This paper describes subspace clustering of categorical data for convergence and integration. Because conventional evaluation measures are designed for numerical data, they are likely to be limited on categorical data by the absence of ordering, high dimensionality, and sparsity of frequencies. Hence, a conditional entropy measure is proposed to evaluate the cohesion among attributes within each cluster. We propose a new objective function that reflects the optimal clustering, so that the within-cluster dispersion is minimized and the between-cluster separation is enhanced. We performed experiments on five real-world datasets, comparing the performance of our algorithm with four algorithms, using three evaluation metrics: accuracy, F-measure and adjusted Rand index. According to the experiments, the proposed algorithm outperforms the algorithms considered in the evaluation on these metrics.

Bayesian ordinal probit semiparametric regression models: KNHANES 2016 data analysis of the relationship between smoking behavior and coffee intake (베이지안 순서형 프로빗 준모수 회귀 모형 : 국민건강영양조사 2016 자료를 통한 흡연양태와 커피섭취 간의 관계 분석)

  • Lee, Dasom;Lee, Eunji;Jo, Seogil;Choi, Taeryeon
    • The Korean Journal of Applied Statistics / v.33 no.1 / pp.25-46 / 2020
  • This paper presents ordinal probit semiparametric regression models using the Bayesian Spectral Analysis Regression (BSAR) method. Ordinal probit regression models ordinal responses, usually with more than two categories, by linking the probability of falling into each category to a combination of available covariates through a probit link (the inverse of the normal cumulative distribution function). The Bayesian probit model facilitates posterior sampling by introducing a normally distributed latent variable; the responses are then categorized by cut-off points according to the value of the latent variable. In this paper, we extend the latent variable approach to a semiparametric model for Bayesian ordinal probit regression with nonparametric functions, using the BSAR method, which is based on a spectral representation of Gaussian processes. The latent variable is decomposed into a parametric component and a nonparametric component, with or without a shape constraint, for modeling ordinal responses and predicting outcomes more flexibly. We illustrate the proposed methods with simulation studies in comparison with existing methods, and with a real data analysis of the Korean National Health and Nutrition Examination Survey (KNHANES) 2016 that investigates the nonparametric relationship between smoking behavior and coffee intake.
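
  The cut-off mechanism described above can be sketched in a few lines. This shows only the standard ordinal probit category probabilities for a given linear predictor, not the BSAR model or its posterior sampler; the names `ordinal_probit_probs`, `eta`, and `cutpoints`, and the numeric values, are assumptions for illustration.

```python
from math import erf, inf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def ordinal_probit_probs(eta, cutpoints):
    """Category probabilities of an ordinal probit model.

    The latent variable is eta plus standard normal noise, and the
    response falls in category k when the latent value lies between
    consecutive cut-off points, so
    P(y = k) = Phi(c_k - eta) - Phi(c_{k-1} - eta),
    with the outermost cut-offs at minus and plus infinity.
    """
    bounds = [-inf] + list(cutpoints) + [inf]
    cdf = [normal_cdf(b - eta) for b in bounds]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


# Hypothetical predictor value and increasing cut-offs (3 categories).
probs = ordinal_probit_probs(eta=0.3, cutpoints=[-0.5, 1.0])
```

  In a semiparametric model of the kind the abstract describes, `eta` would be the sum of a parametric part and a nonparametric function of a covariate rather than a single number.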

Effect of Training Sequence Control in On-line Learning for Multilayer Perceptron (다계층 퍼셉트론의 온라인 학습에서 학습 순서 제어의 효과)

  • Lee, Jae-Young;Kim, Hwang-Soo
    • Journal of KIISE:Software and Applications / v.37 no.7 / pp.491-502 / 2010
  • When human beings acquire and develop knowledge through education, their prior knowledge influences the next learning process. As this is a fact that should be considered in machine learning, we need to examine the effects of controlling the order of training sequence on machine learning. In this research, the role of the supervisor is extended to control the order of training samples, in addition to just instructing the target values for classification problems. The supervisor sequences the training examples categorized by SOM to the learning model which in this case is MLP. The proposed method is distinguished in that it selects the most instructive example from categories formed by SOM to assist the learning progress, while others use SOM only as a preprocessing method for training samples. The result shows that the method is effective in terms of the number of samples used and time taken in training.

A Study on Reinterpretation and Categorization of Normative Meaning of Tradition (전통의 규범적 의미에 대한 재해석과 범주화)

  • Yoon, Young-don;Sim, Seungwoo;Chi, Chun-Ho;Han, Sung Gu
    • The Journal of Korean Philosophical History / no.50 / pp.333-361 / 2016
  • The purpose of this study is to reinterpret and categorize the normative meaning of tradition. The normative meaning of tradition, which plays a key role as an action-guiding power, is the main source of morality. According to the ecological-cultural approach to the diachronic transition of traditional values, a traditional value leads a dynamic life of origin, acculturation, transformation, and distortion, depending on social change over time. Traditional values need to be reinterpreted and categorized so that they can contribute to the attributes and competencies of democratic citizens in future society. The normative meaning of traditional values applicable to Korea's future society can be reinterpreted from its origin as revealed in the classics. The discussion in this paper proceeds as follows. First, we investigate the dynamic change of traditional values from the ecological-cultural perspective and seek the possibility of a modern reinterpretation of loyalty and filial piety as representative traditional values. Finally, we treat the categorization of traditional values and its significance within the frame of Korean values, which include both Western values and Korean traditional values.

Representation Model of Spatio-Temporal Synchronization for Sketching Scenario (시나리오의 스케치를 위한 시간 공간 동기화 표현 모델)

  • 하수철;성해경
    • Proceedings of the Korean Society for Emotion and Sensibility Conference / 1998.04a / pp.38-43 / 1998
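  • When a sketch of a new scenario is incorporated into a game space that is at a different stage of development or has already been developed, conceptual synchronization of time (temporal) and space (spatial) is required. In this paper, we divide the representation space of game scenario sketches into categories, and discuss a representation model that extends the notation for synchronizing the temporal relationships of the game scene order with the spatial concepts of dynamic scenes.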
