• Title/Summary/Keyword: Multiple Stream Data

Implementing stream processing functionalities of Splash (Splash의 스트림 프로세싱 기능 구현)

  • Ahn, Jaeho;Noh, Soonhyun;Hong, Seongsoo
    • Proceedings of the Korean Society of Computer Information Conference / 2019.01a / pp.377-380 / 2019
  • To help meet an application's demanding system timing constraints, we are developing Splash, a real-time stream processing language for embedded AI applications. Splash is a graphical programming language in which applications are designed as data flow graphs, from which code is later generated automatically. The generated code is compiled and executed on top of the Splash runtime system. The Splash runtime system supports two aspects of an application. First, it supports the basic stream processing functions required for an application to operate on multiple streams of data. Second, it supports the checking and handling of user-configured timing constraints. In this paper, we explain the implementation of the first aspect of the Splash runtime system, which is being developed using a real-time communication middleware called DDS.

Iceberg Query Evaluation Technical Using a Cuboid Prefix Tree (큐보이드 전위트리를 이용한 빙산질의 처리)

  • Han, Sang-Gil;Yang, Woo-Sock;Lee, Won-Suk
    • Journal of KIISE:Databases / v.36 no.3 / pp.226-234 / 2009
  • A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Because of these characteristics, it is impossible to save all the data elements of a data stream. It is therefore necessary to define a new synopsis structure that stores summary information about the stream. For this purpose, this paper proposes a cuboid prefix tree that can be effectively employed in evaluating an iceberg query over data streams. A cuboid prefix tree stores only those itemsets that consist of the grouping attributes used in a GROUP BY query. In addition, a cuboid prefix tree can compute multiple iceberg queries simultaneously by sharing their common sub-expressions. A cuboid prefix tree evaluates an iceberg query over an infinitely generated data stream while efficiently reducing memory usage and processing time, which is verified by a series of experiments.
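
The idea of answering several iceberg queries from one shared prefix structure can be sketched as follows. This is a minimal illustration and not the paper's cuboid prefix tree: the class names, the fixed attribute order, and the sample records are assumptions, and no memory pruning is performed.

```python
class Node:
    """One prefix-tree node: a count plus children keyed by the next attribute value."""
    def __init__(self):
        self.count = 0
        self.children = {}


class PrefixTree:
    """Hypothetical prefix tree over grouping attributes (assumed fixed order)."""
    def __init__(self, attrs):
        self.attrs = attrs
        self.root = Node()

    def insert(self, record):
        """Increment the count of every prefix of the record's grouping values."""
        node = self.root
        for attr in self.attrs:
            node = node.children.setdefault(record[attr], Node())
            node.count += 1

    def iceberg(self, depth, min_count):
        """Return groups over the first `depth` attributes with count >= min_count."""
        results = {}

        def walk(node, prefix):
            if len(prefix) == depth:
                if node.count >= min_count:
                    results[prefix] = node.count
                return
            for value, child in node.children.items():
                walk(child, prefix + (value,))

        walk(self.root, ())
        return results


# Made-up sample stream for illustration only.
stream = [
    {"region": "east", "item": "pen"},
    {"region": "east", "item": "pen"},
    {"region": "east", "item": "ink"},
    {"region": "west", "item": "pen"},
]
tree = PrefixTree(["region", "item"])
for rec in stream:
    tree.insert(rec)

# Two iceberg queries share the "region" prefix of the same tree.
by_region = tree.iceberg(1, 3)        # GROUP BY region HAVING COUNT(*) >= 3
by_region_item = tree.iceberg(2, 2)   # GROUP BY region, item HAVING COUNT(*) >= 2
```

Both queries read counts from the same nodes, which is the sense in which common sub-expressions are shared in this sketch.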

Development of Relational Formula between Groundwater Pumping Rate and Streamflow Depletion (지하수 양수량과 하천수 감소량간 상관관계식 개발)

  • Kim, Nam Won;Lee, Jeongwoo;Lee, Jung Eun;Won, You Seung
    • Journal of Korea Water Resources Association / v.45 no.12 / pp.1243-1258 / 2012
  • The objective of this study is to develop a relational formula for estimating the streamflow depletion due to groundwater pumping near a stream, derived statistically from simulated data. The integrated surface water and groundwater model SWAT-MODFLOW was applied to the Sinduncheon and Juksancheon watersheds to obtain streamflow depletion data under various pumping conditions. Through multiple regression analyses of the simulated streamflow depletion data, a relational formula was obtained between the streamflow depletion rate and various factors such as pumping rate, distance between well and stream, hydraulic properties in and near the stream, and amount of rainfall. The derived relational formula is easy to apply for assessing the effects of groundwater pumping on a nearby stream, and is expected to be a tool for estimating the streamflow contribution to the pumped water.

Efficient Processing of Multidimensional Vessel USN Stream Data using Clustering Hash Table (클러스터링 해쉬 테이블을 이용한 다차원 선박 USN 스트림 데이터의 효율적인 처리)

  • Song, Byoung-Ho;Oh, Il-Whan;Lee, Seong-Ro
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.6 / pp.137-145 / 2010
  • A digital vessel has to manage the digital data from its various sensors accurately and efficiently. In a sensor network, however, it is difficult to transmit and analyze the entire data stream because of limited network, power, and processor resources. It is therefore suitable to apply alternative stream data processing after classifying the continuous stream data. In this paper, we propose an efficient processing method that deploys several sensors (temperature, humidity, lighting, voice), processes queries over a sliding window on the input stream, pre-clusters the data using a multiple Support Vector Machine (SVM) algorithm, and manages the summarized information in a hash table. Processing performance improves because the hash table is used for storing and searching, and memory usage is reduced, so the hash table can be maintained in memory. Using 35,912 data sets, we obtained efficient results in terms of the accuracy rate and processing performance of the proposed method.
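
A sliding window whose contents are summarized in an in-memory hash table might look like the sketch below. It is only an illustration of the general pattern: the `classify` threshold is a hypothetical stand-in for the SVM pre-clustering step, and the window size, sensor name, and readings are invented.

```python
from collections import deque, defaultdict


def classify(value):
    """Hypothetical stand-in for the SVM pre-clustering step (simple threshold)."""
    return "high" if value >= 30.0 else "normal"


class WindowSummary:
    """Count-based sliding window summarized per (sensor, cluster) in a hash table."""
    def __init__(self, size):
        self.size = size
        self.window = deque()                        # raw entries still in the window
        self.table = defaultdict(lambda: [0, 0.0])   # key -> [count, sum]

    def add(self, sensor, value):
        key = (sensor, classify(value))
        self.window.append((key, value))
        entry = self.table[key]
        entry[0] += 1
        entry[1] += value
        if len(self.window) > self.size:
            # Evict the oldest reading and undo its contribution to the summary.
            old_key, old_value = self.window.popleft()
            old = self.table[old_key]
            old[0] -= 1
            old[1] -= old_value
            if old[0] == 0:
                del self.table[old_key]

    def mean(self, sensor, label):
        """Answer a query from the summary alone; None if the group is empty."""
        entry = self.table.get((sensor, label))
        return entry[1] / entry[0] if entry else None


summary = WindowSummary(size=3)
for reading in [20.0, 35.0, 22.0, 24.0]:
    summary.add("temp", reading)
normal_mean = summary.mean("temp", "normal")   # window now holds 35, 22, 24
high_mean = summary.mean("temp", "high")
summary.add("temp", 21.0)                      # evicts 35, the only "high" reading
high_after = summary.mean("temp", "high")
```

Queries are answered from the hash table alone, so the raw window only has to be touched on insertion and eviction.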

Feature Based Decision Tree Model for Fault Detection and Classification of Semiconductor Process (반도체 공정의 이상 탐지와 분류를 위한 특징 기반 의사결정 트리)

  • Son, Ji-Hun;Ko, Jong-Myoung;Kim, Chang-Ouk
    • IE interfaces / v.22 no.2 / pp.126-134 / 2009
  • As product quality and yield are essential factors in semiconductor manufacturing, monitoring the main manufacturing steps is a critical task. For this purpose, fault detection and classification (FDC) is used to diagnose fault states in the processes by monitoring the data streams collected by equipment sensors. This paper proposes an FDC model based on a decision tree, which provides if-then classification rules for causal analysis of the processing results. Unlike previous decision tree approaches, we reflect the structural aspect of the data stream in FDC. To do so, we segment the data stream into multiple subregions, define structural features for each subregion, and select the features that have high relevance to the results of the process and low redundancy with other features. As a result, we can construct a simple but highly accurate FDC model. Experiments using a data stream collected from an etching process show that the proposed method is able to classify normal and abnormal states with high accuracy.
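
The segmentation step described above can be sketched as follows. The equal-length split and the three features (mean, range, slope) are assumptions chosen for illustration; the abstract does not specify which structural features or segmentation rule the authors use, and the feature selection step is not shown.

```python
def segment_features(signal, n_segments):
    """Split `signal` into equal-length subregions and summarize each one.

    Trailing samples that do not fill a whole segment are ignored, and each
    segment is assumed to contain at least two samples.
    """
    length = len(signal) // n_segments
    features = []
    for i in range(n_segments):
        seg = signal[i * length:(i + 1) * length]
        features.append({
            "mean": sum(seg) / len(seg),
            "range": max(seg) - min(seg),
            # Average change per sample between the first and last point.
            "slope": (seg[-1] - seg[0]) / (len(seg) - 1),
        })
    return features


# Made-up sensor trace: a ramp followed by a plateau.
trace = [1.0, 2.0, 3.0, 4.0, 10.0, 10.0, 10.0, 10.0]
feats = segment_features(trace, 2)
```

The resulting per-segment feature vectors are the kind of input a decision tree learner could then consume.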

A High-Speed Data Processing Algorithm for RFID Input Data Stream Using Multi-Buffer (RFID 입력 데이터 스트림에 대한 다중 버퍼 기반의 고속 데이터 처리 알고리즘)

  • Han, Soo;Park, Sang-Hyun;Shin, Seung-Ho
    • Journal of the Korea Society of Computer and Information / v.13 no.2 / pp.79-85 / 2008
  • The middleware that provides RFID-based ubiquitous application services should process constantly arriving input data in real time, and obtain and deliver the answers to the queries of the application service. Studies on developing a Data Stream Management System (DSMS) have been conducted in order to process the large amounts of stream data that arrive constantly in this way. Previous algorithms on data streams mostly focused on reducing the average error between the answers to successive queries, and discard data according to its priority when a load occurs. This article presents the necessity of studies on DSMS and high-speed data processing, proposes an algorithm that enables high-speed data processing using buffers and prompt query answering, and tests through simulations the data processing rate and whether buffers are generated in accordance with the proposed algorithm, in both single-buffer and multiple-buffer settings.

SIMULATION OF REGIONAL DAILY FLOW AT UNGAGED SITES USING INTEGRATED GIS-SPATIAL INTERPOLATION (GIS-SI) TECHNIQUE

  • Lee, Ju-Young;Krishinamursh, Ganeshi
    • Water Engineering Research / v.6 no.2 / pp.39-48 / 2005
  • The Brazos River is one of the longest rivers contained entirely in the state of Texas, flowing over 700 miles from northwest Texas to the Gulf of Mexico. Today, the Brazos River Authority and the Texas Commission on Environmental Quality are interested in drought protection plans, waterpower projects, and system-wide appropriation of water and water rights within the Brazos River Basin to meet the future water needs of customers such as farmers and local residents. This paper is primarily intended to provide data for engineering guidelines and an easy-to-use geological mapping tool. In the Brazos River basin, many stream-flow gage stations are not working, so they cannot provide enough stream-flow data to develop the Probable Maximum Flood (PMF) used in evaluating proposed and existing dams and other impounding structures. The integrated GIS-Spatial Interpolation (GIS-SI) tool is composed of two parts: (1) an extended GIS technique (a new interface for hydrological regionalization parameters plus classical GIS mapping skills) and (2) a spatial interpolation technique using weighting factors from the kriging method. These are obtained from the relationship among the location and elevation of the geological watershed and existing stream-flow datasets. The GIS-SI technique makes it easy to compute parameters such as drainage area, mean daily/monthly/annual precipitation, and weighted values, which serve as the independent variables of multiple linear regressions for simulation at ungaged stream-flow sites. In this study, the GIS-SI technique is applied to the Brazos River basin in Texas. Treating the sites of Palo Pinto, Bryan, and Needville as ungaged, the simulated daily/monthly/annual time series are compared with the observed time series. The simulated series are highly correlated with, and well fitted to, the observed series.

Adaptive Upstream Backup Scheme based on Throughput Rate in Distributed Spatial Data Stream System (분산 공간 데이터 스트림 시스템에서 연산 처리율 기반의 적응적 업스트림 백업 기법)

  • Jeong, Weonil
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.10 / pp.5156-5161 / 2013
  • In distributed spatial data stream processing, processed tuples of downstream nodes are replicated to the upstream node in order to increase the utilization of distributed nodes and to recover the whole system in case of system failure. However, when the data input rate increases and multiple downstream nodes share the operation result of the upstream node, the data stored in output queues as a backup can be lost, since deletion operations may be delayed by delays in tuple processing at the upstream node. In this paper, an adaptive upstream backup scheme based on operation throughput in a distributed spatial data stream system is proposed. This method can cut down the average load rate of nodes through efficient spatial operation migration as it processes spatio-temporal data streams, and it can minimize data loss by flexibly changing the backup mode. The experiments show that the proposed approach can prevent data loss and can decrease CPU utilization by 20% on average through node monitoring.
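
For orientation, the basic upstream backup idea that such schemes build on can be sketched as below: the upstream node keeps emitted tuples in an output queue until every downstream consumer has acknowledged them. This is a generic sketch, not the adaptive scheme proposed in the paper; the class, method names, and node identifiers are hypothetical.

```python
from collections import deque


class UpstreamBackup:
    """Output queue that retains tuples until all downstream nodes acknowledge them."""
    def __init__(self, downstream_ids):
        self.queue = deque()                          # (seq, item) pairs not yet safe to drop
        self.acked = {d: 0 for d in downstream_ids}   # highest seq acknowledged per node
        self.next_seq = 1

    def emit(self, item):
        """Record an output tuple and return its sequence number."""
        seq = self.next_seq
        self.queue.append((seq, item))
        self.next_seq += 1
        return seq

    def ack(self, downstream_id, seq):
        """Register an acknowledgement and trim tuples every node has processed."""
        self.acked[downstream_id] = max(self.acked[downstream_id], seq)
        safe = min(self.acked.values())
        while self.queue and self.queue[0][0] <= safe:
            self.queue.popleft()

    def replay(self, downstream_id):
        """Tuples a recovering downstream node still needs."""
        last = self.acked[downstream_id]
        return [item for seq, item in self.queue if seq > last]


backup = UpstreamBackup(["d1", "d2"])
for item in ["a", "b", "c"]:
    backup.emit(item)
backup.ack("d1", 3)   # d1 has processed everything
backup.ack("d2", 1)   # d2 lags behind, so "b" and "c" must be retained
pending_d2 = backup.replay("d2")
pending_d1 = backup.replay("d1")
```

Because trimming waits for the slowest consumer, a lagging downstream node makes the queue grow, which is the pressure the abstract describes when several downstream nodes share one upstream result.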

An Efficient M-way Stream Join Algorithm Exploiting a Bit-vector Hash Table (비트-벡터 해시 테이블을 이용한 효율적인 다중 스트림 조인 알고리즘)

  • Kwon, Tae-Hyung;Kim, Hyeon-Gyu;Lee, Yu-Won;Kim, Myoung-Ho
    • Journal of KIISE:Databases / v.35 no.4 / pp.297-306 / 2008
  • MJoin was proposed as an algorithm for efficiently joining multiple data streams whose characteristics change unpredictably. It extends a symmetric hash join to handle multiple data streams. Whenever a tuple arrives from a remote stream source, MJoin checks whether all of the hash tables have matching tuples. However, when a join involves many data streams with low join selectivity, the performance of this checking process is significantly influenced by the order in which the hash tables are checked. In this paper, we propose the BiHT-Join algorithm, which extends MJoin to conduct this check in constant time regardless of the join order. BiHT-Join maintains a bit-vector that represents the existence of tuples in the streams and decides whether a join will succeed by comparing bit-vectors. Based on this decision, BiHT-Join conducts a hash join only for tuples that will join successfully. Our experimental results show that the proposed BiHT-Join provides better performance than MJoin in the processing of multiple streams.
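
One way to read the bit-vector check is sketched below: each join key carries a bit mask recording which streams currently hold a tuple with that key, so a single integer comparison decides whether probing the hash tables is worthwhile. This is an illustrative interpretation, not the BiHT-Join algorithm itself; the class name and sample tuples are assumptions, and window expiry is omitted.

```python
from collections import defaultdict
from itertools import product


class BitVectorJoin:
    """Symmetric multi-way hash join guarded by a per-key bit vector (sketch)."""
    def __init__(self, n_streams):
        self.n = n_streams
        self.full = (1 << n_streams) - 1   # mask with one bit set per stream
        self.tables = [defaultdict(list) for _ in range(n_streams)]
        self.bits = defaultdict(int)       # key -> bit i set if stream i holds the key

    def insert(self, stream, key, payload):
        """Store the tuple and return the join results it produces, if any."""
        self.tables[stream][key].append(payload)
        self.bits[key] |= 1 << stream
        if self.bits[key] != self.full:
            return []                      # some stream has no match: skip probing
        # Combine the new tuple with matching tuples from every other stream.
        parts = [[payload] if i == stream else self.tables[i][key]
                 for i in range(self.n)]
        return [tuple(combo) for combo in product(*parts)]


join = BitVectorJoin(3)
r1 = join.insert(0, "k", "a")    # only stream 0 has key "k" so far
r2 = join.insert(1, "k", "b")    # stream 2 still missing
r3 = join.insert(2, "k", "c")    # all three streams match
r4 = join.insert(0, "k", "a2")   # new tuple joins with existing "b" and "c"
```

The mask comparison costs the same no matter how many hash tables exist or in what order they would otherwise be probed.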

Efficient Processing of Continuous Join Queries between a Data Stream and Multiple Relations for Real-Time Analysis of E-Commerce Data (전자상거래 데이터의 실시간 분석을 위한 데이터 스트림과 다수 릴레이션 간의 효율적인 연속 조인 처리 기법)

  • Kim, Haeri;Lee, Ki Yong
    • The Journal of Society for e-Business Studies / v.18 no.3 / pp.159-175 / 2013
  • Recently, as e-commerce data has become available in real time, the demand for real-time analysis of e-commerce data has increased significantly. In the real-time analysis of e-commerce data, it is very important to efficiently process continuous join queries between an e-commerce data stream and large disk-based relations. In this paper, we propose an efficient method for processing a continuous join query between an e-commerce data stream and multiple disk-based relations. The proposed method improves the service rate significantly while substantially reducing the amount of required memory. Through analysis and various experiments, we show the efficiency of the proposed method compared with the previous one in terms of service rate and memory usage.