• Title/Summary/Keyword: Skewed Data

Two-dimensional Tracer Tests in Natural Rivers Using Radioisotope (방사성 동위원소를 이용한 자연하천의 2차원 추적자 실험)

  • Seo, Il Won;Baek, Kyong Oh;Jeon, Tae Myong
    • KSCE Journal of Civil and Environmental Engineering Research / v.26 no.2B / pp.161-170 / 2006
  • A tracer test technique using a radioisotope was proposed to investigate pollutant mixing characteristics in rivers. The main advantages of a radioisotope as a tracer in field tests are that it can be detected easily and that its detection range is quite large. In addition, the amount of radioisotope sorbed by the bed material and the biota is expected to be minimal. Field tracer tests were conducted at seven sites in natural rivers with various meandering patterns. Based on the acquired data, the behavior of the tracer cloud in the intermediate field was examined two-dimensionally, and dispersion coefficients were calculated using several evaluation methods. The results revealed that the tracer cloud was transported skewed toward the outer bank and that dispersion coefficients in bends were larger than those in straight reaches.

An Analysis Regarding Trends of Dualism in Korean Agriculture (농업생산 양극화 추이에 대한 연구)

  • Sung, Jae-Hoon;Woo, Sung-Hwi
    • The Journal of Industrial Distribution & Business / v.8 no.6 / pp.87-95 / 2017
  • Purpose - The structural changes of Korean agriculture are complex because of heterogeneous production processes and farm characteristics. This study analyzed trends of dualism in Korean agriculture over the period 2000-15 using farm-level data, to clarify the specific trends in terms of farm income, farm size, and farm operators' age. The results help explain the features of structural change in Korean agriculture in more depth. Research design, data, and methodology - We used farm-level data from South Korea: the Agricultural Census and the Farm Household Economy Survey. As measures of inequality, we used size-weighted quantiles and normalized Gini coefficients as well as the mean and conventional quantiles. Size-weighted quantiles are more robust to changes in the number of small farms, but more sensitive to changes in the distribution of farm size. Thus, they are more useful for identifying trends of dualism in Korean agriculture. Results - The results show that the farmland distribution of crop farms became more skewed and dispersed. However, the herd distribution of livestock farms became more concentrated. Specifically, their mean and 1st quantile increased more rapidly than their size-weighted 2nd and 3rd quantiles. Gini coefficients of livestock farms for herd distribution decreased by 0.1 on average. For income distribution, the results indicate that polarization of farm household, agricultural, and non-agricultural income became more severe. However, we also found that the distribution of transfer income became steadily more concentrated. The results imply that transfer income, including subsidies, would reduce farm income polarization. Lastly, during the study period, Korean farms aged over time, and their age distribution became more concentrated. Conclusions - The structure of Korean agriculture has been changing, even though its absolute size decreased over time. Land (herd) distribution became more dispersed (concentrated). Inequality in agricultural income became more severe, which made farm household income more polarized even though transfer income would reduce income gaps among farms. Lastly, farms continue to age regardless of farm type, and this might affect structural change in Korean agriculture in the future.
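
The Gini coefficient named above as an inequality measure can be illustrated with a minimal Python sketch. It uses the standard rank-based formula for non-negative values. The function name `gini` and the sample values are hypothetical, and the normalization and size weighting applied in the study are not reproduced here.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical farm sizes: identical farms versus one farm holding everything.
equal_farms = [1, 1, 1, 1]
one_large_farm = [0, 0, 0, 1]
print(gini(equal_farms), gini(one_large_farm))
```

A falling value, such as the 0.1 decrease reported for herd distribution, indicates that the distribution moved toward the equal case.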

A Cell-based Indexing for Managing Current Location Information of Moving Objects (이동객체의 현재 위치정보 관리를 위한 셀 기반 색인 기법)

  • Lee, Eung-Jae;Lee, Yang-Koo;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.11D no.6 / pp.1221-1230 / 2004
  • In mobile environments, the locations of moving objects such as vehicles, airplanes, and users of wireless devices change continuously over time. To process moving object information efficiently, a database system must handle large volumes of data and manage its indexes efficiently. However, previous research on indexing methods focused mainly on query performance and paid little attention to update operations for moving objects. In this paper, we propose a novel moving object indexing method named ACAR-Tree. To handle frequent updates of moving object locations efficiently while maintaining query performance, the proposed method is based on a fixed grid structure with an auxiliary R-Tree. This hybrid structure overcomes the poor update performance of the R-Tree, which is caused by reorganization of the tree. The proposed method also deals efficiently with skewed or Gaussian data distributions by using the auxiliary R-Tree. Experimental results with various data sizes and data distributions show that the proposed method reduces the index size and improves update and query performance compared with the R-Tree indexing method.
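
To illustrate why a fixed grid makes location updates cheap, here is a minimal Python sketch of a grid-only index. It is not the ACAR-Tree: the auxiliary R-Tree is omitted, and the class name `GridIndex`, its methods, and the cell size are all hypothetical.

```python
from collections import defaultdict


class GridIndex:
    """Fixed-grid index keeping only the current cell of each moving object."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # cell coordinates -> object ids
        self.where = {}                # object id -> current cell

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def update(self, obj_id, x, y):
        """Record a new position; return True only if the object changed cell."""
        new_cell = self._cell(x, y)
        old_cell = self.where.get(obj_id)
        if old_cell == new_cell:
            return False  # movement inside one cell needs no restructuring
        if old_cell is not None:
            self.cells[old_cell].discard(obj_id)
        self.cells[new_cell].add(obj_id)
        self.where[obj_id] = new_cell
        return True

    def objects_in_cell(self, x, y):
        """Ids of the objects currently in the cell containing (x, y)."""
        return set(self.cells.get(self._cell(x, y), set()))
```

Each update touches at most two cells, so no tree reorganization is triggered. A plain grid degrades when many objects crowd into a few cells, which is the skewed case the abstract assigns to the auxiliary R-Tree.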

Selectivity Estimation for a Spatio-Temporal Overlap Join (시공간 겹침 조인 연산을 위한 선택도 추정 기법)

  • Lee, Myoung-Sul;Lee, Jong-Yun
    • Journal of KIISE:Databases / v.35 no.1 / pp.54-66 / 2008
  • A spatio-temporal join is an expensive operation that is commonly used in spatio-temporal database systems. To generate an efficient query plan for queries involving spatio-temporal join operations, it is crucial to estimate the selectivity of the join operations accurately. Given two datasets $S_1,\;S_2$ of discrete data and a timestamp $t_q$, a spatio-temporal join retrieves all pairs of objects that intersect each other at $t_q$. The selectivity of the join operation equals the number of retrieved pairs divided by the cardinality of the Cartesian product $S_1{\times}S_2$. In this paper, we propose a spatio-temporal histogram that estimates the selectivity of a spatio-temporal join by extending the existing geometric histogram. Using a wide spectrum of both uniform and skewed datasets, we show that the proposed method, called Spatio-Temporal Histogram, can accurately estimate the selectivity of a spatio-temporal join. Our contributions can be summarized as follows. First, selectivity estimation of a spatio-temporal join for discrete data is attempted for the first time. Second, we propose an efficient maintenance method that reconstructs histograms using compression of spatial statistical information during the lifespan of discrete data.
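
The selectivity definition above can be made concrete with a brute-force Python sketch. It computes the exact value that a histogram would only estimate. It assumes, purely for illustration, that each object at $t_q$ is an axis-aligned rectangle `(xmin, ymin, xmax, ymax)`; the function names and sample data are hypothetical.

```python
from itertools import product


def intersects(a, b):
    """True if rectangles a and b, given as (xmin, ymin, xmax, ymax), overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def join_selectivity(s1, s2):
    """Intersecting pairs divided by the size of the Cartesian product of s1 and s2."""
    if not s1 or not s2:
        return 0.0
    hits = sum(1 for a, b in product(s1, s2) if intersects(a, b))
    return hits / (len(s1) * len(s2))


# Hypothetical snapshots of two datasets at the query timestamp.
s1 = [(0, 0, 2, 2), (10, 10, 11, 11)]
s2 = [(1, 1, 3, 3), (20, 20, 21, 21)]
print(join_selectivity(s1, s2))  # one intersecting pair out of four
```

The exact computation costs one comparison per pair, which is why a query optimizer relies on an estimate instead.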

Analysis of Two-Dimensional Pollutant Transport in Meandering Streams (사행하천에서 오염물질의 2차원 거동특성 해석)

  • Oh, Jung-Sun;Seo, Il-Won;Kim, Young-Han
    • Journal of Korea Water Resources Association / v.37 no.12 / pp.979-991 / 2004
  • In this study, the 2-D depth-averaged models RMA2 and RMA4 were employed to simulate the two-dimensional mixing characteristics of pollutants in natural streams. The velocity and depth were first calculated using RMA2, a 2-D hydrodynamic model, and the resulting flow field was then input to RMA4, a 2-D water quality model, to compute the concentration field. The RMA models were verified using velocity and concentration data measured in an S-curved meandering channel. The results showed that the RMA2 model simulated well the phenomenon that the maximum velocity line is located at the inner bank of the meandering channel, and that the RMA4 model reproduced the general mixing behavior and the separation of tracer clouds well. In comparisons of model simulations with data measured in field experiments, the RMA2 model simulated well the general flow field and the tendency, found in the field experiments, for the maximum velocity line to skew toward the outer bank. The RMA4 simulations showed that the center of the tracer cloud tends to follow the path along which the maximum velocity occurs. In this study, the dispersion coefficients were fine-tuned based on the measured coefficients calculated from field concentration data, and the results show reasonable agreement with predictive equations.

Studies on the Variation Pattern of Water Resources and their Generation Models by Simulation Technique (Simulation Technique에 의한 수자원의 변동양상 및 그 모의발생모델에 관한 연구)

  • Lee, Sun-Tak;An, Gyeong-Su;Lee, Ui-Rak
    • Water for future / v.9 no.2 / pp.87-100 / 1976
  • These studies aim to analyze the systematic variation pattern of water resources in Korean river catchments and to develop simulation models from stochastic analysis of monthly and annual hydrologic data, i.e. rainfall and streamflow, the main elements of water resources. The analysis used monthly and annual rainfall records from Seoul, Taegu, Pusan, and Kwangju and streamflow records from the main gauging stations on the Han, Nakdong, and Geum rivers. Firstly, the systematic variation pattern of annual streamflow was found from the exponential relationship between the standard deviations and mean values of log-annual runoff. Secondly, the stochastic characteristics of the annual rainfall and streamflow series were studied using the correlogram, and the Monte Carlo method and a single-season model of 1st-order Markov type were applied and compared in the simulation of annual hydrologic series. In the simulation, the single-season Markov-type model gave better results than the LN-model, and the simulated data fit the historical data well. However, the LN-model gave considerably better results in the simulation of annual rainfall. Thirdly, the stochastic characteristics of the monthly rainfall and streamflow series were studied using correlogram and spectrum analysis, and Model-C, a Markov-type model with transformed skewed random numbers that was developed and applied by the first author for the synthesis of monthly perennial streamflow, was used in the simulation of monthly hydrologic series. The simulation showed that Model-C fits well over an extended area of Korea and is applicable to monthly rainfall as well as monthly streamflow.

Trend Analysis of Extreme Precipitation Using Quantile Regression (Quantile 회귀분석을 이용한 극대강수량 자료의 경향성 분석)

  • So, Byung-Jin;Kwon, Hyun-Han;An, Jung-Hee
    • Journal of Korea Water Resources Association / v.45 no.8 / pp.815-826 / 2012
  • Underestimation of trends by existing ordinary regression (OR) based trend analysis is a well-known problem. The OR method, based on least squares, approximates the conditional mean of the response variable given certain values of the time t, and it usually assumes normality, that is, that the distribution of the data is not dissimilar from a normal distribution. In this regard, this study proposed quantile regression (QR), which aims at estimating either the conditional median or other quantiles of the response variable. This study assesses trends in annual daily maximum rainfall series over 64 weather stations using both the OR and QR approaches. The QR method indicates that 47 stations out of 67 weather stations show a strong upward trend at the 5% significance level, while the OR method identifies a significant trend at only 13 stations. This is mainly because the OR method estimates the conditional mean of the response variable. Unlike the OR method, the QR method allows trends to be detected flexibly, since QR is designed to estimate conditional quantiles of the response variable. The proposed QR method can be effectively applied to estimate hydrologic trends for non-normal or skewed data.
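
The difference between estimating a mean and estimating a quantile can be seen in a minimal Python sketch of the check (pinball) loss that quantile regression minimizes. This covers only the intercept-only case, not a regression on time as in the study; the function names and the sample values are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss for quantile level tau: asymmetric penalty on the residual."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(ys, estimate, tau):
    return sum(pinball_loss(y - estimate, tau) for y in ys)


def best_constant(ys, tau):
    """Observation minimizing the total check loss, i.e. a sample tau-quantile."""
    # For a constant fit, the minimum is attained at one of the observations.
    return min(sorted(ys), key=lambda c: total_loss(ys, c, tau))


# Hypothetical right-skewed sample: one extreme value dominates the mean.
rain = [1, 2, 3, 4, 100]
mean = sum(rain) / len(rain)         # least-squares constant fit
median = best_constant(rain, 0.5)    # tau = 0.5 recovers the median
upper = best_constant(rain, 0.9)     # an upper quantile tracks the extreme
print(mean, median, upper)
```

With skewed data the mean sits far from the bulk of the observations, while each quantile level describes a different part of the distribution. That is the flexibility the abstract attributes to QR.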

Volatility of Export Volume and Export Value of Gwangyang Port (광양항의 수출물동량과 수출액의 변동성)

  • Mo, Soo-Won;Lee, Kwang-Bae
    • Journal of Korea Port Economic Association / v.31 no.1 / pp.1-14 / 2015
  • The standard GARCH model, which imposes symmetry on the conditional variance, tends to fail to capture some important features of the data. This paper therefore introduces models that capture the asymmetric effect: the EGARCH model and the GJR model. We provide a systematic comparison of volatility models focusing on the asymmetric effect of news on volatility. Specifically, three diagnostic tests are provided: the sign bias test, the negative size bias test, and the positive size bias test. This paper shows that there is significant evidence of a GARCH-type process in the data, as shown by the Ljung-Box Q statistic on the squared residuals. The estimated unconditional density function of the squared residuals is clearly skewed to the left and markedly leptokurtic compared with the standard normal distribution. Volatility clustering is also clearly reinforced by the plot of the squared residuals of export volume and value. The unconditional variance of both export volume and export value indicates that large shocks of either sign tend to be followed by large shocks, and small shocks of either sign tend to follow small shocks. The estimated news impact curve of export volume for the GARCH model also suggests that $h_t$ is overestimated for large negative and positive shocks. The conditional variance equation of the GARCH model for export volume contains two parameters, ${\alpha}$ and ${\beta}$, that are insignificant, indicating that the GARCH model is a poor characterization of the conditional variance of export volume. The conditional variance equation of the EGARCH model for export value, however, shows a positive sign for parameter ${\delta}$, which is contrary to our expectation, while in the GJR model parameters ${\alpha}$ and ${\beta}$ are insignificant and ${\delta}$ is marginally significant. This indicates that the asymmetric volatility models are poor characterizations of the conditional variance of export value. It is concluded that the asymmetric EGARCH and GJR models are appropriate for explaining the volatility of export volume, while the symmetric standard GARCH model is good for capturing the volatility.
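
The asymmetry that separates the GJR model from the standard GARCH model can be sketched in Python as a one-step variance update. This uses the textbook GJR-GARCH(1,1) recursion and is an assumption: the abstract does not give the paper's exact parameterization. The name `delta` is chosen here for the asymmetry term, and the parameter values are hypothetical, not estimates from the paper.

```python
def gjr_next_variance(h_prev, shock, omega, alpha, beta, delta):
    """One step of a textbook GJR-GARCH(1,1) conditional variance recursion.

    With delta = 0 this reduces to the symmetric standard GARCH(1,1) update.
    """
    negative = 1.0 if shock < 0 else 0.0
    return omega + alpha * shock ** 2 + delta * negative * shock ** 2 + beta * h_prev


# Hypothetical parameter values, chosen only to show the asymmetry.
params = dict(omega=0.1, alpha=0.1, beta=0.8, delta=0.05)
after_good_news = gjr_next_variance(1.0, 2.0, **params)
after_bad_news = gjr_next_variance(1.0, -2.0, **params)
print(after_good_news, after_bad_news)
```

Plotting the next-period variance against the shock for a fixed previous variance gives the news impact curve mentioned in the abstract. The curve is symmetric around zero only when `delta` is zero.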

The fundamental frequency (f0) distribution of Korean speakers in a dialogue corpus using Praat and R (Praat과 R로 분석한 한국인 대화 음성 말뭉치의 fundamental frequency(f0)값 분포)

  • Byunggon Yang
    • Phonetics and Speech Sciences / v.15 no.3 / pp.17-25 / 2023
  • This study examines the fundamental frequency (f0) distribution of 2,740 Korean speakers in a dialogue speech corpus. Praat and R were used to collect and analyze the acoustic f0 data after removing extreme values based on the interquartile f0 range of the intonational phrases produced by each speaker. Results showed that the average f0 value of all speakers was 185 Hz and the median value was 187 Hz. The f0 data showed a positive skewness of 0.11 and a kurtosis of -0.09, which is close to the normal distribution. The pitch values of daily conversations varied over a range of 238 Hz. Further examination of the male and female groups showed distinct median f0 values: 114 Hz for males and 199 Hz for females. A t-test between the two groups yielded a significant difference. The skewness representing the distribution shape was 1.24 for the male group and 0.58 for the female group. The kurtosis was 5.21 for the male group and 3.88 for the female group, and the male group values appeared leptokurtic. A regression analysis between the median f0 and age yielded a slope of 0.15 for the male group and -0.586 for the female group, which indicated a divergent relationship. In conclusion, a normative f0 distribution of different Korean age and sex groups can be examined in a conversational speech corpus recorded by a very large number of participants. However, more rigorous data might be required to define the relation between age and f0 values.
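
The shape statistics reported above can be illustrated with a minimal Python sketch of moment-based skewness and excess kurtosis. The abstract does not say which estimator or R function the study used, so this is only one common convention; the function name and the sample values are hypothetical.

```python
def shape_statistics(xs):
    """Moment-based skewness and excess kurtosis (both 0 for a normal distribution)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    if m2 == 0:
        raise ValueError("all values are identical")
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


# Hypothetical samples: a symmetric one and one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 10]
print(shape_statistics(symmetric))
print(shape_statistics(right_tailed))
```

A positive skewness means a longer right tail, and values of both statistics near zero are what "close to the normal distribution" refers to under this convention.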

Outlier Detection Techniques for Biased Opinion Discovery (편향된 의견 문서 검출을 위한 이상치 탐지 기법)

  • Yeon, Jongheum;Shim, Junho;Lee, Sanggoo
    • The Journal of Society for e-Business Studies / v.18 no.4 / pp.315-326 / 2013
  • Users of social media post various types of opinions, such as product reviews and movie reviews. Customers commonly rely on these opinions when making decisions. However, as the use of opinions has grown, distorted feedback has also increased. For example, exaggerated positive opinions are posted to promote target products, as are negative opinions that are far from common evaluations. Finding these biased opinions is important for keeping social media reliable. Opinion mining (or sentiment analysis) techniques have been developed to determine the sentiment polarity of opinionated documents, and these techniques can be used to find biased opinions. However, previous techniques have some drawbacks: they categorize text only as positive or negative, and they need a large amount of training data to build the classifier. In this paper, we propose methods for discovering biased opinions, which are skewed from the overall common opinions. The methods are based on angle-based outlier detection and personalized PageRank, and can be applied without training data. We analyze the performance of the proposed techniques by presenting experimental results on a movie review dataset.