• Title/Summary/Keyword: Big Stream

Performance Evaluation and Analysis of Multiple Scenarios of Big Data Stream Computing on Storm Platform

  • Sun, Dawei;Yan, Hongbin;Gao, Shang;Zhou, Zhangbing
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.7
    • /
    • pp.2977-2997
    • /
    • 2018
  • In the big data era, fresh data grows rapidly every day. More than 30,000 gigabytes of data are created every second, and the rate is accelerating. Many organizations rely heavily on real-time streaming, and big data stream computing helps them spot opportunities and risks in real-time big data. Storm, one of the most common online stream computing platforms, has been used for big data stream computing, with response times ranging from milliseconds to sub-seconds. The performance of Storm plays a crucial role in different application scenarios; however, few studies have evaluated it. In this paper, we investigate the performance of Storm under different application scenarios. Our experimental results show that the throughput and latency of Storm are greatly affected by the number of instances of each vertex in the task topology and by the amount of available resources in the data center. The fault-tolerant mechanism of Storm works well in most big data stream computing environments. We therefore suggest that a dynamic topology, an elastic scheduling framework, and a memory-based fault-tolerant mechanism are necessary for providing high-throughput, low-latency services on the Storm platform.

Hazelcast Vs. Ignite: Opportunities for Java Programmers

  • Maxim, Bartkov;Tetiana, Katkova;S., Kruglyk Vladyslav;G., Murtaziev Ernest;V., Kotova Olha
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.2
    • /
    • pp.406-412
    • /
    • 2022
  • Storing large amounts of data has been a major problem since the beginning of computing history. Big Data has greatly improved business processes by identifying customers' needs with prediction models based on web and social media searches. The main purpose of big data stream processing frameworks is to let programmers query a continuous stream directly without dealing with the lower-level mechanisms. In other words, programmers write stream-processing code using these runtime libraries (also called Stream Processing Engines). This is achieved by taking large volumes of data and analyzing them with Big Data frameworks. Streaming platforms are an emerging technology that deals with continuous streams of data. Several Big Data streaming platforms are freely available on the Internet. However, selecting the most appropriate one is not easy for programmers. In this paper, we present a detailed description of two state-of-the-art and highly popular streaming frameworks: Apache Ignite and Hazelcast. In addition, the performance of these frameworks is compared using selected attributes. Different types of databases are commonly used to store data, and data streaming technologies were developed to process data continuously in real time. With today's large-scale distributed applications handling enormous amounts of data, these databases are no longer viable. Consequently, Big Data was introduced to store, process, and analyze data at high speed and to cope with the daily growth in users and data.

Squall: A Real-time Big Data Processing Framework based on TMO Model for Real-time Events and Micro-batch Processing (Squall: 실시간 이벤트와 마이크로-배치의 동시 처리 지원을 위한 TMO 모델 기반의 실시간 빅데이터 처리 프레임워크)

  • Son, Jae Gi;Kim, Jung Guk
    • Journal of KIISE
    • /
    • v.44 no.1
    • /
    • pp.84-94
    • /
    • 2017
  • Recently, the importance of velocity, one of the characteristics of big data (5V: Volume, Variety, Velocity, Veracity, and Value), has been emphasized in data processing, which has led to several studies on real-time stream processing, a technology for quick and accurate processing and analysis of big data. In this paper, we propose the Squall framework, which uses Time-triggered Message-triggered Object (TMO) technology, a model that is widely used for processing real-time big data. We also describe the Squall framework and its operation on a single node. TMO is an object model that supports irregular real-time processing triggered by certain conditions as well as regular periodic processing over a certain amount of time. The Squall framework can support real-time event streams of big data and micro-batch processing with outstanding performance compared to Apache Storm and Spark Streaming. However, further development is needed for real-time stream processing across multiple nodes, which most frameworks commonly support. In conclusion, the advantages of the TMO model can overcome the drawbacks of Apache Storm and Spark Streaming in real-time big data processing. The TMO model has potential as a useful model for real-time big data processing.

Context Inference and Sensor Data Classification of Big Data Stream Environment (빅데이터 스트림 환경에서의 센서 데이터 분류와 상황추론)

  • Ryu, Chang-Kun
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.9 no.10
    • /
    • pp.1079-1085
    • /
    • 2014
  • The analysis of a variable, continuous big data stream should ultimately achieve context awareness. This study presents a novel method of context inference for variable data streams from sensor motes. To assess the sensor data, we calculated the difference between measured values within each time window and determined the belief value of each focal element. Dempster-Shafer evidence theory proved beneficial for calculating and assessing the situation factors used in context inference.
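
To make the evidence-combination step concrete, the following is a minimal sketch of Dempster's rule of combination, the core operation of Dempster-Shafer theory. It is not the paper's implementation: the hypotheses (`hot`, `normal`), the two sensor mass functions, and all numeric values are invented for illustration, and the abstract does not say how the paper derives masses from the per-window differences.

```python
from itertools import product


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination.

    A mass function maps a focal element (a frozenset of hypotheses)
    to its mass; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            # Mass assigned to contradictory evidence
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("the two sources are in total conflict")
    # Renormalize so the fused masses sum to 1 again
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


def belief(m, hypothesis):
    """Belief in a hypothesis: total mass of focal elements contained in it."""
    return sum(w for focal, w in m.items() if focal <= hypothesis)


# Hypothetical frame of discernment and sensor evidence (illustrative values)
HOT = frozenset({"hot"})
NORMAL = frozenset({"normal"})
EITHER = HOT | NORMAL  # ignorance: the sensor cannot tell

sensor_a = {HOT: 0.6, EITHER: 0.4}
sensor_b = {HOT: 0.5, NORMAL: 0.3, EITHER: 0.2}

fused = combine(sensor_a, sensor_b)
print({tuple(sorted(k)): round(v, 4) for k, v in fused.items()})
print("belief(hot) =", round(belief(fused, HOT), 4))
```

Here the conflicting mass (one sensor says `hot`, the other `normal`) is discarded and the remainder renormalized, so two moderately confident sources yield a stronger fused belief in `hot` than either gives alone.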

Real-Time IoT Big-data Processing for Stream Reasoning (스트림-리즈닝을 위한 실시간 사물인터넷 빅-데이터 처리)

  • Yun, Chang Ho;Park, Jong Won;Jung, Hae Sun;Lee, Yong Woo
    • Journal of Internet Computing and Services
    • /
    • v.18 no.3
    • /
    • pp.1-9
    • /
    • 2017
  • Smart Cities intelligently manage numerous infrastructures, including Smart-City IoT devices, and provide a variety of smart-city applications to citizens. To supply the information these applications need, Smart Cities require a function that intelligently processes the large-scale streamed big data constantly generated by a large number of IoT devices. To provide smart services in a Smart City, the Smart-City Consortium uses stream reasoning, which requires real-time processing of big data. However, there are limitations associated with real-time processing of large-scale streamed big data in Smart Cities. In this paper, we introduce part of our research on cloud-computing-based real-time distributed parallel processing for stream reasoning over IoT big data in Smart Cities. The Smart-City Consortium previously developed and introduced a smart-city middleware. In the research for this paper, we made cloud-computing-based real-time distributed parallel processing available on the cloud computing platform of that middleware. This paper introduces a real-time distributed parallel processing method and system for stream reasoning with IoT big data transmitted from various sensors in Smart Cities, and evaluates the performance of the system in which the method is implemented.

A novel window strategy for concept drift detection in seasonal time series (계절성 시계열 자료의 concept drift 탐지를 위한 새로운 창 전략)

  • Do Woon Lee;Sumin Bae;Kangsub Kim;Soonhong An
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.05a
    • /
    • pp.377-379
    • /
    • 2023
  • Concept drift detection on data streams is a major issue in maintaining the performance of machine learning models. Since an online stream is a function of time, classical statistical methods are hard to apply. In the particular case of seasonal time series, however, a novel window strategy based on Fourier analysis makes it possible to adapt the classical methods to the series. We explore an adaptation of the KS-test to periodic time series and show that this strategy handles a complicated time series as an ordinary tabular dataset. We verify that detection with this strategy ranks second in time delay and shows the best false alarm rate and detection accuracy compared with arbitrary window sizes.
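
One way such a window strategy could work is sketched below, under stated assumptions: the window length is set to the dominant period found by a discrete Fourier transform, and consecutive period-aligned windows are compared with the two-sample Kolmogorov-Smirnov statistic. The abstract does not give the paper's actual procedure, so the function names, the toy seasonal pattern, and the level shift are all hypothetical.

```python
import cmath
import math


def dominant_period(series):
    """Period (in samples) of the strongest non-constant DFT component."""
    n = len(series)
    mean = sum(series) / n
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum((x - mean) * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series))
        if abs(coeff) > best_power:
            best_k, best_power = k, abs(coeff)
    return round(n / best_k)


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    for v in sorted(set(a) | set(b)):
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


# Toy seasonal series: six stable cycles, then two cycles with a level shift
season = [10, 12, 15, 19, 22, 24, 23, 20, 16, 13, 11, 9]
series = season * 6 + [x + 8 for x in season] * 2

period = dominant_period(series[:72])  # estimated on the stable history

# Period-aligned windows hold one full cycle each, so seasonality cancels out
stable = ks_statistic(series[0:period], series[period:2 * period])
drifted = ks_statistic(series[0:period], series[6 * period:7 * period])

# An arbitrary window size compares different phases of the same cycle
misaligned = ks_statistic(series[0:5], series[5:10])

print(period, stable, drifted, misaligned)
```

With windows that each span a full cycle, the statistic stays at zero while the series is stable and jumps once the level shifts. The misaligned five-sample windows report a nonzero gap even though nothing has drifted, which is the kind of false alarm the abstract attributes to arbitrary window sizes.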

A Study on The Change Characteristic of Basin Topographical Parameters According to the Threshold Area of Stream Creation (하천생성 임계면적의 변화에 따른 유역의 지형관련 매개변수들의 특성분석)

  • Ahn, Seung-Seop;Lee, Jeung-Seok;Kim, Jong-Ho;Lim, Kee-Seok
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.8 no.2
    • /
    • pp.10-20
    • /
    • 2005
  • The threshold area of stream creation has a very sensitive effect on runoff analysis models that use the branching characteristics of streams. In this study, therefore, we examined how topographical parameters change with the threshold area of stream creation. The study basin was the upper basin of the Kumho water gauge station, located in the middle reach of the Kumho river. A 1:25,000 digital map gridded on a $10 \times 10$ m mesh was used. The topographical parameters investigated include the number of streams by order, length, area, slope, basin relief, sinuosity ratio, drainage density, and total stream length. The analysis showed that the threshold value for first-order streams has a very large effect on the topographical parameters of the basin. When the threshold area of stream creation was below $0.10\,\mathrm{km}^2$, the parameters changed greatly, whereas above $0.10\,\mathrm{km}^2$ they changed very little.

Implementation of Real-time Data Stream Processing for Predictive Maintenance of Offshore Plants (해양플랜트의 예지보전을 위한 실시간 데이터 스트림 처리 구현)

  • Kim, Sung-Soo;Won, Jongho
    • Journal of KIISE
    • /
    • v.42 no.7
    • /
    • pp.840-845
    • /
    • 2015
  • In recent years, Big Data has been a topic of great interest for the production and operation of offshore plants as well as for enterprise resource planning. The ability to predict future equipment performance from historical results can be useful for shifting assets to more productive areas. Specifically, a centrifugal compressor is one of the major pieces of equipment in offshore plants. This machinery is very dangerous because it can explode on failure, so its performance must be monitored in real time. In this paper, we present a stream data processing architecture that can be used to compute the performance of the centrifugal compressor. Our system consists of two major components: a virtual tag stream generator and a real-time data stream manager. To make the system scalable, we use a parallel programming approach that exploits multi-core CPUs to process the massive amount of stream data. In addition, we provide experimental evidence that demonstrates improvements in stream data processing for the centrifugal compressor.

An Eclogical Study on the Aquatic Animals in Jungrang Stream of Seoul (중랑천의 수서동물에 관한 생태학적 연구)

  • 배경석;박종태;조기찬;길혜경;신재영
    • Journal of Environmental Health Sciences
    • /
    • v.23 no.2
    • /
    • pp.89-97
    • /
    • 1997
  • Most urban streams in Korea have had their channel forms altered and suffer from the direct inflow of domestic sewage and other pollutants. It is therefore hard to maintain the structure and function of these ecosystems. The present study examined the life-sustaining capacity of a typical urban stream in Seoul (Jungrang stream) through community analysis of its aquatic animals. The aquatic animals comprised 31 species in 18 families, 8 orders, 5 classes, and 4 phyla. The number of species fluctuated widely by season, from 8 species in winter to 24 in autumn. The major dominant species in Jungrang stream were Tubificidae sp.1, Chironomidae sp.1, Chironomidae sp.2, and Physa acuta; these pollution-tolerant species had very high dominance indices. However, a few individuals of the benthic macroinvertebrates Cercion hieroglyphicum, Ischnura asiatica, Ranatra chinensis, Helochares striatus, and Agabus japonicus appeared. Fry of the fish Carassius auratus and Silurus asotus also occurred. We can therefore infer the possibility that these species grow and spawn in the stream. Because of its channel form, Jungrang stream has few natural riffle areas, ponds, and water-grass areas. Aquatic animals in Jungrang stream have suffered from the reduced self-purification capacity and from mass production of attached algae on the stream bed. A long-term survey is needed to analyze fluctuations in the life-sustaining capacity of Jungrang stream.


Study on the Sensor Gateway for Receive the Real-Time Big Data in the IoT Environment (IoT 환경에서 실시간 빅 데이터 수신을 위한 센서 게이트웨이에 관한 연구)

  • Shin, Seung-Hyeok
    • Journal of Advanced Navigation Technology
    • /
    • v.19 no.5
    • /
    • pp.417-422
    • /
    • 2015
  • The service scale of an IoT environment is determined by the number of sensors. As the number of sensors increases, so does the amount of data generated by the IoT environment. Some studies address reliable network operation and dynamic data buffering under network congestion. Other studies address processing stream data in connectionless network environments. In this study, we propose a sensor gateway for processing big data in the IoT environment. To this end, we review RESTful design for the sensor middleware and apply a double-buffer algorithm to process the stream data efficiently. Finally, to evaluate the proposed system, we generate big data traffic using an MJPEG stream carried by HTTP over TCP, receive the images with the open-source media player VLC, and compare the throughput performance.
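
As an illustration of the double-buffer idea, here is a minimal sketch, assuming a single consumer thread and any number of producer threads. The class name `DoubleBuffer` and its methods are hypothetical; the abstract does not describe the gateway's actual buffer design.

```python
import threading


class DoubleBuffer:
    """Producers fill the front buffer while the consumer empties the back one."""

    def __init__(self):
        self._front = []  # receives incoming sensor data
        self._back = []   # handed to the consumer on each swap
        self._lock = threading.Lock()

    def put(self, item):
        """Called by receiver threads; holds the lock only for one append."""
        with self._lock:
            self._front.append(item)

    def drain(self):
        """Swap the buffers and return everything received since the last swap.

        Assumes a single consumer: after the swap only this caller touches
        the back buffer, so it can be copied and cleared without the lock.
        """
        with self._lock:
            self._front, self._back = self._back, self._front
        batch = list(self._back)
        self._back.clear()
        return batch


buf = DoubleBuffer()
for frame in ("frame-1", "frame-2", "frame-3"):
    buf.put(frame)

first = buf.drain()   # the three frames received so far
buf.put("frame-4")    # arrives while the first batch is being processed
second = buf.drain()
print(first, second)
```

The swap itself is a constant-time pointer exchange, so receivers are blocked only briefly no matter how long the consumer takes to process a batch.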