• Title/Summary/Keyword: Multivariate Stream Data

Evaluation of Multivariate Stream Data Reduction Techniques (다변량 스트림 데이터 축소 기법 평가)

  • Jung, Hung-Jo;Seo, Sung-Bo;Cheol, Kyung-Joo;Park, Jeong-Seok;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.889-900 / 2006
  • Although sensor networks differ in user requests and data characteristics from one application area to another, existing research on the stream data transmission problem focuses on improving the performance of individual methods rather than on the inherent characteristics of stream data. In this paper, we introduce a hierarchical or distributed sensor network architecture and data model, and then evaluate multivariate data reduction methods suited to user requirements and data features so that reduction methods can be applied selectively. To assess the relative performance of the multivariate data reduction methods, we used conventional techniques such as Wavelet, HCL (Hierarchical Clustering), Sampling, and SVD (Singular Value Decomposition), together with experimental data sets such as multivariate time series, synthetic data, and robot execution failure data. The experimental results show that SVD and Sampling are superior to Wavelet and HCL with respect to relative error ratio and execution time. In particular, because the relative error ratio of each data reduction method varies with the data characteristics, selecting the reduction method per experimental data set gives good performance. The findings reported in this paper can serve as a useful guideline for the design and construction of sensor network applications that involve multivariate stream data.
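
As a rough illustration of how a Sampling reduction might be scored by a relative error ratio, the sketch below keeps every `step`-th reading and rebuilds the series by holding each kept reading constant. The abstract does not define the error measure or the reconstruction rule, so both, along with the function names and the toy data, are assumptions made for this example.

```python
def sample_reduce(series, step):
    """Keep every `step`-th reading of a multivariate series (a list of tuples)."""
    return series[::step]


def reconstruct(sampled, step, length):
    """Rebuild a full-length series by repeating each kept reading `step` times."""
    return [sampled[i // step] for i in range(length)]


def relative_error(original, approx):
    """Assumed definition: total absolute error divided by total absolute signal."""
    num = sum(abs(o - a)
              for row_o, row_a in zip(original, approx)
              for o, a in zip(row_o, row_a))
    den = sum(abs(o) for row in original for o in row)
    return num / den if den else 0.0


# Hypothetical two-attribute stream, not data from the paper.
series = [(1.0, 10.0), (2.0, 10.0), (3.0, 12.0), (4.0, 12.0)]
sampled = sample_reduce(series, step=2)
approx = reconstruct(sampled, step=2, length=len(series))
print(relative_error(series, approx))
```

A larger `step` sends less data but usually raises the error, which is the trade-off the evaluation above is measuring.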

A Sliding Window-based Multivariate Stream Data Classification (슬라이딩 윈도우 기반 다변량 스트림 데이타 분류 기법)

  • Seo, Sung-Bo;Kang, Jae-Woo;Nam, Kwang-Woo;Ryu, Keun-Ho
    • Journal of KIISE:Databases / v.33 no.2 / pp.163-174 / 2006
  • In distributed wireless sensor networks, it is difficult to transmit and analyze the entire stream data because of limited network, power, and processor resources. It is therefore appropriate to classify the continuous stream data first and then apply alternative stream data processing. We propose a classification framework for continuous multivariate stream data. The proposed approach works in two steps. In the preprocessing step, it takes a sliding window of multivariate stream data as input and discretizes the data in the window into a string of symbols that characterize the signal changes. In the classification step, it uses a standard text classification algorithm to classify the discretized data in the window. We evaluated both supervised and unsupervised classification algorithms. For supervised classification, we tested a Bayesian classifier and SVM; for unsupervised classification, we tested Jaccard, TFIDF, Jaro, and Jaro-Winkler. In our experiments, SVM and TFIDF outperformed the other classification methods. In particular, we observed that classification accuracy improves when the correlation of attributes is considered along with the n-gram tokens of symbols.
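
The two-step idea can be sketched as follows: cut the stream into sliding windows, turn each window into a string of change symbols, and compare windows through their n-gram tokens. The symbol alphabet (`U`, `D`, `S`), the threshold, and the single-attribute simplification are assumptions for illustration only; the abstract does not specify the paper's discretization scheme.

```python
def sliding_windows(stream, size, step=1):
    """Yield consecutive windows of `size` readings, advancing by `step`."""
    for start in range(0, len(stream) - size + 1, step):
        yield stream[start:start + size]


def discretize(window, threshold=0.5):
    """Encode each change between readings as U (up), D (down), or S (steady)."""
    symbols = []
    for prev, cur in zip(window, window[1:]):
        delta = cur - prev
        if delta > threshold:
            symbols.append("U")
        elif delta < -threshold:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def ngrams(text, n=2):
    """Return the set of character n-grams of a symbol string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical single-attribute stream.
stream = [1, 2, 3, 3, 2, 1, 1, 2]
strings = [discretize(w) for w in sliding_windows(stream, size=5)]
print(strings[0], jaccard(ngrams(strings[0]), ngrams(strings[1])))
```

For multivariate data, one option is to discretize each attribute separately and combine the token sets, which is one way to bring attribute correlation into the comparison.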

Cryptanalysis of LILI-128 with Overdefined Systems of Equations (과포화(Overdefined) 연립방정식을 이용한 LILI-128 스트림 암호에 대한 분석)

  • 문덕재;홍석희;이상진;임종인;은희천
    • Journal of the Korea Institute of Information Security & Cryptology / v.13 no.1 / pp.139-146 / 2003
  • In this paper we demonstrate a cryptanalysis of the stream cipher LILI-128. Our approach is to solve an overdefined system of multivariate equations. The LILI-128 keystream generator [8] is an LFSR-based synchronous stream cipher with a 128-bit key. The cipher consists of two parts, the “CLOCK CONTROL” part and the “DATA GENERATION” part. We focus on the “DATA GENERATION” part. This part uses the function $f_d$, which satisfies third-order correlation immunity, high nonlinearity, and balancedness. However, this function does not have a high nonlinear order (i.e., a high degree in its algebraic normal form). We exploit this property of the function $f_d$. We reduce the problem of recovering the secret key of LILI-128 to the problem of solving a largely overdefined system of multivariate equations of degree K=6. In our best version of the XL-based cryptanalysis we have the parameter D=7. Our fastest cryptanalysis of LILI-128 requires $2^{110.7}$ CPU clocks. This complexity can be achieved using only $2^{26.3}$ keystream bits.

A Study on Measuring the Similarity Among Sampling Sites in Lake Yongdam with Water Quality Data Using Multivariate Techniques (다변량기법을 활용한 용담호 수질측정지점 유사성 연구)

  • Lee, Yosang;Kwon, Sehyug
    • Journal of Environmental Impact Assessment / v.18 no.6 / pp.401-409 / 2009
  • Multivariate statistical approaches that classify sampling sites by measuring their similarity from water quality data, and that characterize the resulting clusters, are discussed with a view to an optimal water quality monitoring network. For the empirical study, two years of data (2005, 2006) were collected in Yongdam reservoir at 9 sampling sites, combining 2 depth levels and 7 important water quality variables. The similarity among sampling sites is measured by Euclidean distances between the water quality variables, and the sites are classified by hierarchical clustering. The clustered sites are discussed using principal component variables, in view of their geographical characteristics and of reducing the number of measuring sites. The nine sampling sites are clustered as follows: sites 5, 6, and 7 form one cluster characterized by shallow water depth and location on the main stream. Sites 2 and 4 are clustered into the same group by hydraulic characteristics derived from the main stream, but their patterns of water quality change appear to differ because site 2 is near the dam. Sites 3, 8, and 9 each stand alone because they lie on different tributaries.
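
A minimal sketch of grouping sites by Euclidean distance with agglomerative hierarchical clustering is shown below. The abstract does not name the linkage criterion, so single linkage is assumed here, and the site labels and two-variable measurements are made up for illustration rather than taken from the Yongdam data.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length measurement vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def agglomerate(points, k):
    """Merge the two closest clusters (single linkage, assumed) until k remain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[label] for label in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical sites, each described by two standardized variables.
sites = {"A": (0, 0), "B": (0, 1), "C": (5, 5), "D": (5, 6), "E": (10, 0)}
print(agglomerate(sites, k=3))
```

In practice the variables would be standardized first so that no single water quality measure dominates the distance.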

Assessment of Water Quality in the Miho Stream Using Multivariate Statistics (다변량 통계기법을 이용한 미호천 본류 수질특성 평가)

  • Yoon, Hyeyoung;Kim, Jeehyun;Chae, Minhee;Cho, Yoonhae;Cheon, Seuk
    • Journal of Environmental Impact Assessment / v.28 no.4 / pp.373-386 / 2019
  • This study investigates the spatial characteristics of the Miho stream, the main tributary of the Geum River system, and identifies the main factors influencing its water quality using water quality analysis and multivariate analysis. Seven main sites in the Miho stream water system were selected as survey subjects, and 16 items from 2012 to 2017, including temperature and weather data, were used for multivariate analysis. According to the water quality analysis, the six-year average concentrations of BOD and COD corresponded to grade 3 (normal) under the environmental water quality standard for rivers. The concentrations of nitrogen and phosphorus were highest at the upstream site, then decreased, and then increased again owing to hydrogeological and geomorphological effects. Cluster analysis of spatial and water quality characteristics yielded three clusters, with pollution sources having the greatest impact. Principal component analysis and factor analysis on each cluster and on the main stream extracted three to four major components. For the main stream, Cluster 1, and Cluster 3, the first principal factor included nitrogen and seasonal factors; for Cluster 2, the first factor included nitrogen and water temperature. Nitrogen is the principal factor affecting water quality in the Miho stream.

Factor analysis of the trend of stream quality in Nakdong River

  • Kim, Kyong-Mu;Lee, In-Rak;Kim, Jong-Tae
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1201-1210 / 2008
  • The goal of this paper is to investigate the trend of stream quality and the quality of water in the Nakdong River by means of factor analysis. It uses fourteen different monthly time series, such as pH, BOD, COD, SS, and TN, from thirty-four measurement points on the Nakdong River, covering Jan. 1998 to Dec. 2006. The factor analysis shows that factor 1, attributed to organic water pollution and including BOD, COD, TN, and EC, accounts for 29.288%, and that factor 2, attributed to sewage and seasonal variation and including SS, accounts for 16.467%.


A Comparative Study on the Multivariate Thomas-Fiering and Matalas Model (다변량 Thomas-Fiering 모형과 Matalas 모형의 비교연구)

  • 이주헌;이은태
    • Water for future / v.24 no.4 / pp.59-66 / 1991
  • The purpose of the synthetic generation of monthly river flows from short-term observed data by means of multivariate stochastic models is to provide abundant input data to water resources systems whose performance and operation policy are to be determined beforehand. In this study, multivariate Thomas-Fiering and Matalas models for synthetic generation based on stream flows in a neighboring basin were employed to check whether they can be applied to the modeling of monthly flows. Statistical parameters estimated by the method of moments and by Fourier series analysis, respectively, were used to reproduce the statistical features. For comparison, the statistical parameters of the monthly flows generated by each model were compared with those of the observed monthly flows. The results suggest that the Matalas model can be adopted for the synthetic generation of monthly river flows.


Evaluation of Water Quality for the Han River Tributaries Using Multivariate Analysis (다변량 통계 분석기법을 이용한 한강수계 지천의 수질 평가)

  • Kim, Yo-Yong;Lee, Si-Jin
    • Journal of Korean Society of Environmental Engineers / v.33 no.7 / pp.501-510 / 2011
  • In this study, the water pollution sources of 14 major tributaries of the Han River and the water quality characteristics of each target stream were evaluated from water quality data for January 2007 to December 2009 (14 data sets) using the statistical package SPSS-17.0. Cluster analysis over time and space for each stream produced 4 groups for the spatial variation, in which the type and density of pollution sources in the basins had the greatest impact on the grouping. Cluster analysis for the time variation, to which rainfall, temperature, and eutrophication were shown to contribute, produced 2 groups: summer to fall (July-Oct.) and winter to early summer (Nov.-June). Four factors were found to be responsible for the data structure, explaining 71-90% of the total variance of the data set depending on the stream; they included organic matter, nutrients, and bacterial contamination. Factor analysis showed that the main factors (water pollutants) changed with the season, with a different pattern for each stream. This study demonstrated that the water quality of each stream can be assessed usefully when the factors and the pollution sources of the basin are evaluated together.

A Study on the Stochastic Modeling for Stream Flow Generation (하천유량의 모의발생을 위한 추계학적 모형의 적용에 관한 연구)

  • Lee, Joo-Heon
    • Journal of the Korean Society of Hazard Mitigation / v.1 no.2 s.2 / pp.115-121 / 2001
  • The purpose of the synthetic generation of monthly river flows from short-term observed data by means of stochastic models is to provide abundant input data to water resources systems whose performance and operation policy are to be determined beforehand. In this study, a multivariate autoregressive model was applied to generate monthly flows at multiple sites, taking into account the correlations between the sites. The model performance was examined by statistical comparison of the historical and generated monthly series in terms of mean, variance, skewness, and correlation coefficients. The results showed that the model-generated flows were statistically similar to the historical flows.


A Study on Measuring the Similarity Among Sampling Sites in Lake (저수지 수질조사 지점간 유사성 분석)

  • Lee, Yo-Sang;Koh, Deuk-Koo;Lee, Hyun-Seok
    • Proceedings of the Korea Water Resources Association Conference / 2010.05a / pp.957-961 / 2010
  • Multivariate statistical approaches are used to classify sampling sites by measuring their similarity from water quality data. For the empirical study, two years of data were collected in a reservoir at 9 sampling sites, combining 2 depth levels and 7 important water quality variables. The similarity among sampling sites is measured by Euclidean distances between the water quality variables, and the sites are classified by hierarchical clustering. The clustered sites are discussed using principal component variables, in view of their geographical characteristics and of reducing the number of measuring sites. The nine sampling sites are clustered as follows: sites 5, 6, and 7 form one cluster characterized by shallow water depth and location on the main stream. Sites 2 and 4 are clustered into the same group by hydraulic characteristics derived from the main stream, but their patterns of water quality change appear to differ because site 2 is near the dam. Sites 3, 8, and 9 each stand alone because they lie on different tributaries.
