• Title/Summary/Keyword: Skewed Data


Bit-map-based Spatial Data Transmission Scheme

  • OH, Gi Oug
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.8
    • /
    • pp.137-142
    • /
    • 2019
  • This paper proposes a bitmap-based spatial data transmission scheme for mobile environments, where data are frequently created and used and must be transmitted rapidly over the network. Earlier studies that used clustering algorithms focused on providing services with spatial data and can cause delays because they do not consider transmission speed. The proposed scheme provides rapid service to users by converting spatial data to bits, so that more data fit into each MTU (maximum transmission unit). In the experiment, we compared arithmetically the default data, composed of 16 bytes, with spatial data converted to a bitmap; for the simulation, we created virtual data and compared network transmission speed and conversion time. The virtual data were generated from a standard normal distribution and a skewed distribution to compare the difference in reading time. The experiment showed that bitmap conversion and network transmission were 2.5 and 8 times faster, respectively.

Statistical Studies on the Derivation of Design Low Flows (II) (설계갈수량의 유도를 위한 수문통계학적 연구(II))

  • 이순혁;박명근;박종국
    • Magazine of the Korean Society of Agricultural Engineers
    • /
    • v.34 no.4
    • /
    • pp.39-47
    • /
    • 1992
  • Reasonable design low flows were sought through a comparative analysis of design low flows derived by the Power and SMEMAX transformations, which normalize skewed distributions, and by the Type III extremal distribution presented in the first report of this study, using annual low flows in five watersheds of the main river basins in Korea. The results were analyzed and summarized as follows. 1. Basic statistics of annual low flows for the selected watersheds were calculated using the Power and SMEMAX transformations. 2. The Power transformation was found to be the best for normalizing skewed distributions among the transformations considered, which included the log, square root, and SMEMAX transformations. 3. Design low flows for the selected watersheds were derived by the Power and SMEMAX transformations. 4. Judging by the relative suitability of the Type III extremal distribution and the Power and SMEMAX transformations, design low flows from all methods are close to the observed data for return periods within 10 years. For return periods over 10 years, those from the Power transformation can be regarded as the most reasonable, since they lie midway between the values of the Type III extremal distribution and the SMEMAX transformation and are also closer to the observed data than the others.
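As a rough illustration of normalizing a skewed sample with a power transformation, the sketch below picks the exponent that brings the sample skewness closest to zero. It assumes the Box-Cox form of the power transformation, which the abstract does not specify; the `flows` values, the grid of exponents, and all function names are hypothetical, and the SMEMAX transformation is not shown.

```python
import math

def skewness(xs):
    """Sample skewness m3 / m2**1.5 using biased central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def power_transform(xs, lam):
    """Box-Cox form (assumed here): (x**lam - 1) / lam, or log(x) when lam is 0."""
    if abs(lam) < 1e-12:
        return [math.log(x) for x in xs]
    return [(x ** lam - 1.0) / lam for x in xs]

def best_lambda(xs, grid=None):
    """Grid-search the exponent that minimizes the absolute skewness."""
    if grid is None:
        grid = [i / 20 for i in range(-40, 41)]  # -2.0 to 2.0 in steps of 0.05
    return min(grid, key=lambda lam: abs(skewness(power_transform(xs, lam))))

# Hypothetical positively skewed annual low flows (must be strictly positive).
flows = [1.2, 1.5, 1.7, 2.0, 2.1, 2.4, 2.9, 3.5, 4.8, 7.9, 12.3]
lam = best_lambda(flows)
transformed = power_transform(flows, lam)
print(lam, skewness(flows), skewness(transformed))
```

Because the grid includes an exponent of 1, which only shifts the data, the selected transformation can never be more skewed than the raw sample.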


Dynamic Distributed Grid Scheme to Manage the Location-Information of Moving Objects in Spatial Networks (공간 네트워크에서 이동객체의 위치정보 관리를 위한 동적 분산 그리드 기법)

  • Kim, Young-Chang;Hong, Seung-Tae;Jo, Kyung-Jin;Chang, Jae-Woo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.12
    • /
    • pp.948-952
    • /
    • 2009
  • Recently, a new distributed grid scheme, called DS-GRID (distributed S-GRID), has been proposed to manage the location information of moving objects in a spatial network[1]. However, because DS-GRID uses uniform grid cells, it cannot handle skewed data, which frequently occur in real applications. To solve this problem, we propose a dynamic distributed grid scheme that splits a grid cell dynamically based on the density of moving objects. In addition, we propose a k-nearest neighbor processing algorithm for the proposed scheme. Finally, the performance analysis shows that our scheme achieves better retrieval and update performance than DS-GRID when the moving objects are skewed.
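The idea of splitting a cell when it becomes too dense can be sketched as follows. This is only an illustration under stated assumptions: it uses a quadtree-style four-way split with a fixed `capacity` threshold on plain 2-D coordinates, whereas the abstract does not say how cells are split, and it ignores the spatial network and the distribution across servers. The `Cell` class and its parameters are hypothetical.

```python
class Cell:
    """A grid cell that splits into four equal sub-cells when it gets too dense."""

    def __init__(self, x0, y0, x1, y1, capacity=4, depth=0, max_depth=8):
        self.x0, self.y0, self.x1, self.y1 = x0, y0, x1, y1
        self.capacity = capacity      # assumed density threshold per cell
        self.depth = depth
        self.max_depth = max_depth    # guards against endless splitting
        self.points = []
        self.children = None

    def insert(self, x, y):
        if self.children is not None:
            self._child_for(x, y).insert(x, y)
            return
        self.points.append((x, y))
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            self._split()

    def _split(self):
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        args = (self.capacity, self.depth + 1, self.max_depth)
        # Child order matches the index computed in _child_for.
        self.children = [
            Cell(self.x0, self.y0, mx, my, *args),
            Cell(mx, self.y0, self.x1, my, *args),
            Cell(self.x0, my, mx, self.y1, *args),
            Cell(mx, my, self.x1, self.y1, *args),
        ]
        pending, self.points = self.points, []
        for px, py in pending:
            self._child_for(px, py).insert(px, py)

    def _child_for(self, x, y):
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        return self.children[(1 if x >= mx else 0) + (2 if y >= my else 0)]

    def leaves(self):
        if self.children is None:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]

# Hypothetical skewed workload: a dense cluster plus a few scattered objects.
cluster = [(0.05 + 0.01 * i, 0.05 + 0.007 * j) for i in range(5) for j in range(4)]
sparse = [(0.9, 0.9), (0.6, 0.2), (0.3, 0.8)]
root = Cell(0.0, 0.0, 1.0, 1.0)
for px, py in cluster + sparse:
    root.insert(px, py)
print(len(root.leaves()), max(leaf.depth for leaf in root.leaves()))
```

With this workload the cells around the cluster end up much smaller than the cells holding the scattered objects, which is the behavior a uniform grid cannot provide.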

A Study on the Choice of Dependent Variables of Momentum Equations in the General Curvilinear Coordinate (일반곡률좌표계 운동량방정식의 종속변수 선정에 관한 연구)

  • Kim, Tak-Su;Kim, Won-Gap;Kim, Cheol-Su;Choe, Yeong-Don
    • Transactions of the Korean Society of Mechanical Engineers B
    • /
    • v.25 no.11
    • /
    • pp.1500-1508
    • /
    • 2001
  • This paper shows that the choice of dependent variables in non-orthogonal curvilinear coordinates is as important as the choice of convective scheme and turbulence model in computational fluid dynamics. Cartesian, physical covariant, and physical contravariant velocity components were each tested as the dependent variables of the momentum equations in a staggered grid system. For the flow past a circular cylinder, results were computed with each of the three variables and compared with experimental data. For the skewed driven cavity flow, results were computed to check the grid dependency of the variables. In the cylinder flow, the results obtained with the Cartesian and physical contravariant velocity components show nearly the same accuracy. In the skewed driven cavity flow, the Cartesian and contravariant components predicted the same number of vortices. The vortex strength in the Cartesian component case is about 30% lower than in the other two cases.

RAH-tree : An Efficient Index Scheme for Spatial Data with Skewed Access Patterns (RAH-tree : 편향 접근 패턴을 갖는 공간 데이터에 대한 효율적인 색인 기법)

  • Choi Keun-Ha;Lee Seung-Joong;Jung Sungwon
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2005.07b
    • /
    • pp.31-33
    • /
    • 2005
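  • With the advance of GPS and PDAs, applications that handle spatial data, such as location-based services (LBS), car navigation systems (CNS), and geographic information systems (GIS), have spread rapidly. These applications have used height-balanced index schemes to index the desired data. However, although every spatial object has a different access frequency, existing spatial index schemes do not take access frequency into account. Moreover, existing indexing methods for spatial objects that consider only frequency provide skew according to access frequency but do not reflect the locality of spatial objects. In this paper, we propose a scheme that builds a skewed index tree reflecting the access frequency of densely clustered spatial objects. For the whole region, which is distributed in heterogeneous clusters, we modify Zahn's clustering algorithm to divide it into multi-level subregions. For the subregions obtained in this way, an index tree is built using spatial proximity and the sum of access frequencies. By building the skewed index tree top-down over the multi-level region, the scheme enables fast search for spatial objects with high access frequency.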


A Study on Cost Rate Analysis Methodology of Credit Card Value Proposition (신용카드 부가서비스 요율 분석 방법론에 대한 연구)

  • Lee, Chan-Kyung;Roh, Hyung-Bong
    • Journal of Korean Society for Quality Management
    • /
    • v.46 no.4
    • /
    • pp.797-820
    • /
    • 2018
  • Purpose: This study seeks an appropriate cost rate analysis methodology for credit card value propositions in Korea. It argues that methodologies based on probability distributions are more suitable than methodologies based on data mining. The analysis model constructed for cost rate estimation is called the VCPM model. Methods: The model includes two major variables, denoted S and P. S is the monthly credit card usage amount. P is the proportion of the usage amount at special merchants to the whole monthly usage amount. The distributions assumed for P are positively skewed distributions such as the exponential, gamma, and lognormal. The major inputs to the model are also derived from S and P: E(S) and the aggregate proportion of the usage amount at special merchants to the total monthly usage amount. Results: When the credit card's value proposition is a general discount, the VCPM model fits well and generates a reasonable cost rate (denoted R). However, the model does not seem to work well for other types of credit cards. Conclusion: The VCPM model is reliable for calculating the cost rate of credit cards with a positively skewed distribution of P, which are general discount cards. However, another model should be built for cards with other types of distributions of P.

A Novel Air Indexing Scheme for Window Query in Non-Flat Wireless Spatial Data Broadcast

  • Im, Seok-Jin;Youn, Hee-Yong;Choi, Jin-Tak;Ouyang, Jinsong
    • Journal of Communications and Networks
    • /
    • v.13 no.4
    • /
    • pp.400-407
    • /
    • 2011
  • Various air indexing and data scheduling schemes for wireless broadcast of spatial data have been developed for energy efficient query processing. The existing schemes are not effective when the clients' data access patterns are skewed to some items. It is because the schemes are based on flat broadcast that does not take the popularity of the data items into consideration. In this paper, thus, we propose a data scheduling scheme letting the popular items appear more frequently on the channel, and grid-based distributed index for non-flat broadcast (GDIN) for window query processing. The proposed GDIN allows quick and energy efficient processing of window query, matching the clients' linear channel access pattern and letting the clients access only the queried data items. The simulation results show that the proposed GDIN significantly outperforms the existing schemes in terms of access time, tuning time, and energy efficiency.

Spatial Partitioning using Hilbert Space Filling Curve for Spatial Query Optimization (공간 질의 최적화를 위한 힐버트 공간 순서화에 따른 공간 분할)

  • Whang, Whan-Kyu;Kim, Hyun-Guk
    • The KIPS Transactions:PartD
    • /
    • v.11D no.1
    • /
    • pp.23-30
    • /
    • 2004
  • In order to approximate the spatial query result size we partition the input rectangles into subsets and estimate the query result size based on the partitioned spatial area. In this paper we examine query result size estimation in skewed data. We examine the existing spatial partitioning techniques such as equi-area and equi-count partitioning, which are analogous to the equi-width and equi-height histograms used in relational databases, and examine the other partitioning techniques based on spatial indexing. In this paper we propose a new spatial partitioning technique based on the Hilbert space filling curve. We present a detailed experimental evaluation comparing the proposed technique and the existing techniques using synthetic as well as real-life datasets. The experiments showed that the proposed partitioning technique based on the Hilbert space filling curve achieves better query result size estimation than the existing techniques for space query size, bucket numbers, skewed data, and spatial data size.
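One way to picture partitioning along a Hilbert space-filling curve is sketched below: each rectangle is reduced to a grid cell, the cells are sorted by their position on the curve, and the sorted list is cut into buckets of roughly equal count. The index computation is the standard bit-manipulation algorithm for the Hilbert curve; reducing a rectangle to a single cell, the equal-count cut, and the names `hilbert_index` and `hilbert_partition` are assumptions for illustration, not details taken from the paper.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) on the Hilbert curve over an n-by-n grid (n a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so every sub-curve has the same orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def hilbert_partition(cells, n, num_buckets):
    """Sort cells along the curve and cut them into buckets of roughly equal count."""
    ordered = sorted(cells, key=lambda c: hilbert_index(n, c[0], c[1]))
    size = -(-len(ordered) // num_buckets)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

# Hypothetical skewed input: most cells crowd one corner of an 8-by-8 grid.
cells = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2), (2, 1), (0, 2), (7, 7), (6, 1), (3, 5)]
buckets = hilbert_partition(cells, 8, 3)
for bucket in buckets:
    print(bucket)
```

Consecutive positions on the curve are always neighboring cells, so each bucket tends to cover a compact region even when the data are skewed.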

Nonparametric two sample tests for scale parameters of multivariate distributions

  • Chavan, Atul R;Shirke, Digambar T
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.4
    • /
    • pp.397-412
    • /
    • 2020
  • In this paper, a notion of data depth is used to propose nonparametric multivariate two sample tests for the difference between scale parameters. Data depth can be used to measure the centrality or outlyingness of a multivariate data point relative to a data cloud. A difference in the scale parameters indicates a difference in the depth values of a multivariate data point. By observing this fact on a depth vs depth plot (DD-plot), we propose nonparametric multivariate two sample tests for scale parameters of multivariate distributions. The p-values of the proposed tests are obtained using Fisher's permutation approach. The power performance of the proposed tests is reported for a few symmetric and skewed multivariate distributions and compared with existing tests. An illustration with real-life data is also provided.
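The permutation approach to obtaining a p-value can be sketched generically. The example below is not the depth-based test of the paper: it uses a univariate stand-in statistic, the absolute log ratio of sample variances, purely to show how group labels are reshuffled and how the p-value is counted. The function names, the seed, and the sample data are hypothetical.

```python
import math
import random

def variance(xs):
    """Unbiased sample variance."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

def scale_statistic(a, b):
    """Stand-in scale statistic (assumption): absolute log ratio of sample variances."""
    return abs(math.log(variance(a) / variance(b)))

def permutation_p_value(a, b, statistic, num_permutations=999, seed=0):
    """Monte Carlo permutation p-value for a two-sample statistic."""
    rng = random.Random(seed)
    observed = statistic(a, b)
    pooled = list(a) + list(b)
    count = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        if statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            count += 1
    # Counting the observed arrangement itself keeps the p-value above zero.
    return (count + 1) / (num_permutations + 1)

# Hypothetical samples with the same center but very different scale.
narrow = [i / 10 for i in range(-5, 5)]
wide = [10 * x for x in narrow]
p = permutation_p_value(narrow, wide, scale_statistic)
print(p)
```

Any two-sample statistic can be passed in as `statistic`, so a depth-based statistic could be substituted without changing the permutation loop.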

Comprehensive comparison of normality tests: Empirical study using many different types of data

  • Lee, Chanmi;Park, Suhwi;Jeong, Jaesik
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.5
    • /
    • pp.1399-1412
    • /
    • 2016
  • We compare many normality tests that draw on different sources of information extracted from the given data: the Anderson-Darling test, Kolmogorov-Smirnov test, Cramér-von Mises test, Shapiro-Wilk test, Shapiro-Francia test, Lilliefors test, Jarque-Bera test, D'Agostino's D, Doornik-Hansen test, Energy test, and Martinez-Iglewicz test. For the purpose of comparison, the tests are applied to various types of data generated from skewed distributions, asymmetric distributions, and distributions with different lengths of support. We then summarize the comparison results in terms of type I error control and power. The best test depends on the shape of the distribution of the data, implying that no single test is the most powerful for all distributions.
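To make one of the listed tests concrete, the following sketch computes the Jarque-Bera statistic, JB = n/6 · (S² + (K − 3)²/4), from the sample skewness S and kurtosis K, together with its asymptotic p-value from the chi-square distribution with two degrees of freedom, whose survival function is exp(−JB/2). The two samples are hypothetical, and the asymptotic p-value is known to be rough for samples this small, so the numbers are illustrative only.

```python
import math

def jarque_bera(xs):
    """Return (JB statistic, asymptotic p-value) for a sample."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Chi-square with 2 degrees of freedom has survival function exp(-x / 2).
    return jb, math.exp(-jb / 2.0)

# Hypothetical samples: one symmetric, one strongly right-skewed.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
skewed = [math.exp(i / 4) for i in range(20)]

jb_sym, p_sym = jarque_bera(symmetric)
jb_skw, p_skw = jarque_bera(skewed)
print(jb_sym, p_sym)
print(jb_skw, p_skw)
```

The statistic reacts only to skewness and kurtosis, which is one reason its power differs from that of tests based on the empirical distribution function, such as Anderson-Darling or Kolmogorov-Smirnov.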