• Title/Summary/Keyword: Big data Processing

Design and Implementation of Big Data Streaming Query Processing System for Realtime Power Plant Sensor data (실시간 발전소 시설 장비 센서 데이터에 대한 빅데이터 스트리밍 질의 처리 시스템 설계 및 구현)

  • Um, Jung-Ho;Yu, Chan Hee;Sarda, Komal;Park, Kyongseok
    • Annual Conference of KIPS / 2020.11a / pp.88-91 / 2020
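  • Power generation facilities must operate without interruption year-round, and a failure causes enormous losses, so hundreds of thousands of sensors are installed on power plant equipment. In this paper, we design and implement a big data streaming query processing system for efficient sensor data collection, facility monitoring, and failure prediction. We also design an encoding scheme for efficient management of real-time data collection; measurements of data transmission performance show that it improves data processing performance by 12% on average and by up to 32% compared with transmitting data as strings. In addition, we measure window query processing performance over streaming data and confirm an average aggregate query processing time of about 0.97 seconds. In future work, we plan to apply an AI inference model for failure detection to the proposed big data streaming query processing system.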
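
The abstract does not say which window type the system uses, so the following is only a minimal sketch of one common form of window aggregate query: an average per sensor over fixed, non-overlapping (tumbling) windows. The function name, the reading layout, and the sensor IDs are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def tumbling_window_avg(readings, window_seconds):
    """Average value per (window_start, sensor_id) over fixed, non-overlapping windows.

    `readings` is an iterable of (timestamp_seconds, sensor_id, value) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, sensor_id, value in readings:
        # Align each timestamp to the start of the window that contains it.
        window_start = (timestamp // window_seconds) * window_seconds
        key = (window_start, sensor_id)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical readings: (epoch seconds, sensor id, measured value).
readings = [
    (0, "temp-01", 20.0),
    (3, "temp-01", 22.0),
    (7, "temp-01", 30.0),
    (8, "temp-02", 5.0),
]
averages = tumbling_window_avg(readings, window_seconds=5)
for key in sorted(averages):
    print(key, averages[key])
```

A production streaming engine would evaluate such a query incrementally as data arrives; this batch version only illustrates how readings are assigned to windows.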

Big Data Management in Structured Storage Based on Fintech Models for IoMT using Machine Learning Techniques (기계학습법을 이용한 IoMT 핀테크 모델을 기반으로 한 구조화 스토리지에서의 빅데이터 관리 연구)

  • Kim, Kyung-Sil
    • Advanced Industrial SCIence / v.1 no.1 / pp.7-15 / 2022
  • The Internet of Medical Things (IoMT) extends IoT to medical settings, where large amounts of medical data must be processed. The collected medical data are stored in the cloud in a structured form for processing. However, the huge volume of healthcare data is difficult to handle, so an appropriate scheme for structured healthcare data is needed. This paper suggests a machine learning model for processing the structured healthcare data collected from the IoMT and, to handle the vast range of that data, proposes an MTGPLSTM model. The proposed model integrates a linear regression model for processing healthcare information. On top of it, an outlier model based on the FinTech model is implemented to evaluate and predict the COVID-19 healthcare dataset collected from the IoMT. The MTGPLSTM model includes a regression model to predict and evaluate planning schemes for preventing the spread of infection. Its performance is compared with that of LR, SVR, RFR, and LSTM on data sizes of 1 GB, 2 GB, and 3 GB. The comparative analysis shows that the proposed MTGPLSTM model achieves about 4% lower MAPE and RMSE values on the worldwide data; for China, it achieves a minimum MAPE of 0.97, about 6% lower than the existing classifiers.
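
For reference, the two error metrics named in the abstract can be computed as in the sketch below, which uses the standard textbook definitions of MAPE and RMSE. The sample values are invented for illustration and have no connection to the paper's dataset or results; the abstract does not state whether its MAPE figures are percentages.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))


# Invented example values, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mape(actual, predicted))
print(rmse(actual, predicted))
```

MAPE is scale-free, which makes it convenient for comparing regions with very different case counts, while RMSE penalizes large individual errors more heavily.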

Storm-Based Dynamic Tag Cloud for Real-Time SNS Data (실시간 SNS 데이터를 위한 Storm 기반 동적 태그 클라우드)

  • Son, Siwoon;Kim, Dasol;Lee, Sujeong;Gil, Myeong-Seon;Moon, Yang-Sae
    • KIPS Transactions on Software and Data Engineering / v.6 no.6 / pp.309-314 / 2017
  • Collecting, storing, and analyzing SNS (social network service) data is generally difficult because the data have big data characteristics: they are generated very quickly and mix structured and unstructured forms. In this paper, we propose a new data visualization framework that runs on Apache Storm and is useful for real-time, dynamic analysis of SNS data. Apache Storm is a representative big data software platform that processes and analyzes real-time streaming data in a distributed environment. Using Storm, we collect and aggregate real-time Twitter data and dynamically visualize the aggregated results as a tag cloud. In addition to the Storm-based collection and aggregation functions, we design and implement a Web interface in which a user enters keywords of interest and views the tag cloud visualization related to those keywords. Finally, we show empirically that this approach lets users intuitively grasp how a subject of interest changes in SNS data, and that the visualized results can be applied to many other services such as thematic trend analysis, product recommendation, and customer needs identification.
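
The aggregation behind a tag cloud can be sketched in plain Python as below. This stands in for the logic only and does not use the Apache Storm API; in a Storm topology the counting would typically be spread across bolts. The function names, sample posts, and font-size range are all hypothetical.

```python
from collections import Counter


def count_keywords(posts, keywords):
    """Count how many posts mention each tracked keyword (case-insensitive)."""
    tracked = {keyword.lower() for keyword in keywords}
    counts = Counter()
    for text in posts:
        words = set(text.lower().split())
        for keyword in tracked & words:
            counts[keyword] += 1
    return counts


def tag_sizes(counts, min_size=10, max_size=40):
    """Scale counts linearly to font sizes for display in a tag cloud."""
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = (high - low) or 1  # avoid division by zero when all counts are equal
    return {
        tag: min_size + (count - low) * (max_size - min_size) / span
        for tag, count in counts.items()
    }


# Hypothetical posts and user-selected keywords.
posts = ["Storm handles streams", "storm and kafka", "kafka kafka storm", "spark only"]
counts = count_keywords(posts, ["storm", "kafka", "spark"])
sizes = tag_sizes(counts)
print(sizes)
```

Recomputing the sizes whenever the counts change is what makes the cloud dynamic: a keyword that gains mentions grows relative to the others.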

Development of Rainfall Information Production Technology Using Optical Sensors (Estimation of Real-Time Rainfall Information Using Optima Rainfall Intensity Technique) (광학센서를 이용한 강우정보 생산기법 개발 (최적 강우강도 기법을 이용한 실시간 강우정보 산정))

  • Lee, Byung-Hyun;Kim, Byung-Sik;Lee, Young-Mi;Oh, Cheong-Hyeon;Choi, Jung-Ryel;Jun, Weon-Hyouk
    • Journal of Environmental Science International / v.30 no.12 / pp.1101-1111 / 2021
  • In this study, we sought to produce actual rainfall information by applying machine learning techniques that account for the effects of wiper operation, within the W-S-R (Wiper-Signal-Rainfall) relationship methods used to produce sensor-based rain information in real time. To this end, we used the gradient descent and threshold map methods to pre-process the cumulative value of the signal difference before and after wiper operation, using four sensitive channels of optical sensors that collected rain sensor data under five rainfall conditions in indoor artificial rainfall experiments. These methods produced rainfall information by calculating the average threshold value for each rainfall condition and channel, creating a threshold map corresponding to a 4 (channels) × 5 (rainfall conditions) grid, and applying Optima Rainfall Intensity, one of the big data processing techniques. To verify the proposed results, applicability was evaluated by comparison with rainfall observations.

A GPU-based Filter Algorithm for Noise Improvement in Realtime Ultrasound Images (실시간 초음파 영상에서 노이즈 개선을 위한 GPU 기반의 필터 알고리즘)

  • Cho, Young-Bok;Woo, Sung-Hee
    • Journal of Digital Contents Society / v.19 no.6 / pp.1207-1212 / 2018
  • Ultrasound imaging emits ultrasonic pulses and receives the reflected waves to construct the image needed for diagnosis. When the signal becomes weak, noise is generated and slight differences in brightness occur. In addition, ultrasound images characteristically fluctuate because of breathing and change with motion in real time. Such noise makes the images difficult to recognize and diagnose visually during analysis. In this paper, morphological features are automatically extracted from acquired ultrasound images using image processing techniques, and a fast GPU-based filter is implemented on a cloud big data processing platform for image processing. The GPU-based high-performance filter ran 4.7 times faster than the CPU-based version, with a PSNR of 37.2 dB, which is very close to the original.
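
The PSNR figure quoted above compares a filtered image against its original. A minimal sketch of the standard PSNR formula for 8-bit grayscale images follows; the tiny 2 × 2 images are invented for illustration, and the paper's own evaluation pipeline is not described in the abstract.

```python
import math


def psnr(original, processed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    Each image is a list of rows of pixel intensities.
    """
    squared_error = 0
    pixel_count = 0
    for row_a, row_b in zip(original, processed):
        for a, b in zip(row_a, row_b):
            squared_error += (a - b) ** 2
            pixel_count += 1
    mse = squared_error / pixel_count
    if mse == 0:
        return math.inf  # identical images have no noise to measure
    return 10 * math.log10(max_value ** 2 / mse)


# Invented 2 x 2 images, for illustration only.
original = [[100, 100], [100, 100]]
filtered = [[101, 99], [100, 100]]
value = psnr(original, filtered)
print(round(value, 2))
```

Higher PSNR means the processed image deviates less from the reference, so the metric indicates how much a denoising filter altered the underlying image.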

Automatic Processing Techniques of Rotorcraft Flight Data Using Data Mining (회전익항공기 운동모델 개발을 위한 데이터마이닝을 이용한 비행데이터 자동 처리 기법)

  • Oh, Hyeju;Jo, Sungbeom;Choi, Keeyoung;Roh, Eun-Jung;Kang, Byung-Ryong
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.46 no.10 / pp.823-832 / 2018
  • In general, the fidelity of an aircraft dynamic model is verified by comparison with flight test results of the target aircraft, so reference flight data for performance comparison must be extracted. Extracting useful data from the vast quantity of noisy flight test data requires a great deal of time and manpower. Processing flight data is especially complex for rotorcraft because they exhibit highly nonlinear characteristics, such as coupling and wake interference effects, and perform varied maneuvers such as hover and backward flight. This study defines flight data processing criteria for rotorcraft and provides procedures and methods for automated processing of static and dynamic flight data using data mining techniques. Finally, the presented methods are validated using flight data.

Design of Trajectory Data Indexing and Query Processing for Real-Time LBS in MapReduce Environments (MapReduce 환경에서의 실시간 LBS를 위한 이동궤적 데이터 색인 및 검색 시스템 설계)

  • Chung, Jaehwa
    • Journal of Digital Contents Society / v.14 no.3 / pp.313-321 / 2013
  • The recent proliferation of mobile smart devices has led to the big data era, and the importance of location-based services (LBS) is increasing with the exponential growth of trajectory-related data. Processing trajectory data requires parallel processing platforms such as cloud computing and MapReduce. Research based on MapReduce is in progress, but because MapReduce relies on batch processing and a simple key-value structure, applying the MapReduce framework to real-time LBS is difficult. Therefore, based on a detailed analysis of the properties of MapReduce, we propose a system design with efficient indexing and search techniques suited to real-time services.

Implementation of a pet product recommendation system using big data (빅 데이터를 활용한 애완동물 상품 추천 시스템 구현)

  • Kim, Sam-Taek
    • Journal of the Korea Convergence Society / v.11 no.11 / pp.19-24 / 2020
  • With the recent rapid increase in the number of pets, there is a need for an integrated, personalized pet product recommendation service, such as feed recommendation based on pet health checks and various collected data. This paper implements a product recommendation system that uses big data to support personalized services covering the collection, pre-processing, analysis, and management of pet-related data. First, sensor information from devices worn by pets, customer purchase patterns, and SNS information are collected and stored in a database. A platform that provides customized recommendation services, such as feed production and pet health management, is then implemented using statistical analysis. The platform provides information to customers by outputting information on products similar to the product being analyzed and, finally, the results of the recommendation analysis.

A biometric information collecting system for biomedical big data analysis (생체 의학 빅 데이터 분석을 위한 생체 정보 수집 시스템)

  • Lim, Damsub;Hong, Sunhag;Ku, Mino;Min, Dugki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2013.10a / pp.513-516 / 2013
  • In this paper, we present an information collecting system for the medical information management domain. The proposed system performs a systematic process of collection, transmission, and management to support the development of intelligent medical information systems and medical big data processing systems. It consists of low-power biomedical sensors, biomedical information collecting devices, and storage systems. Currently, almost all biomedical information about patients is collected manually by staff such as nurses and doctors, so the collected biometric data can be error-prone. Because there is not enough medical information to build big data, it is difficult to improve the quality of medical services and research. The proposed system addresses problems such as error-prone biometric data and greatly extends the range of biometric data that can be collected. Furthermore, the system makes it possible to build real-time biomedical analysis systems, such as a real-time patient diagnosis system, and to establish a strategy for rapidly changing future medical markets.

A MapReduce-Based Workflow BIG-Log Clustering Technique (맵리듀스기반 워크플로우 빅-로그 클러스터링 기법)

  • Jin, Min-Hyuck;Kim, Kwanghoon Pio
    • Journal of Internet Computing and Services / v.20 no.1 / pp.87-96 / 2019
  • In this paper, we propose a MapReduce-supported clustering technique for collecting and classifying distributed workflow enactment event logs as a preprocessing tool. We call these distributed workflow enactment event logs Workflow BIG-Logs because they satisfy and fit well with the 5V properties of big data: Volume, Velocity, Variety, Veracity, and Value. The clustering technique is devised specifically for the preprocessing phase of a workflow process mining and analysis algorithm that operates on the Workflow BIG-Logs. In other words, it uses the MapReduce framework as a Workflow BIG-Logs processing platform, supports the IEEE XES standard data format, and is dedicated to the preprocessing phase of the $\rho$-Algorithm, a typical workflow process mining algorithm based on structured information control nets. More precisely, the Workflow BIG-Logs can be classified into two types of clustering patterns, activity-based and performer-based, and we implement an activity-based clustering pattern algorithm on the MapReduce framework. Finally, we verify the proposed clustering technique through an experimental study on the workflow enactment event log dataset released by the BPI Challenges.
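
The idea of activity-based clustering in a map/reduce style can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' algorithm: the event fields (`case`, `activity`, `performer`, `timestamp`) are hypothetical simplifications and are not XES attribute names, and the shuffle step that a MapReduce runtime would perform is simulated with a dictionary.

```python
from collections import defaultdict


def map_event(event):
    """Map step: emit (activity, event) so that events are keyed by activity name."""
    yield event["activity"], event


def reduce_events(activity, events):
    """Reduce step: collect one activity's events, ordered by timestamp."""
    return activity, sorted(events, key=lambda e: e["timestamp"])


def cluster_by_activity(log):
    # Simulated shuffle phase: group mapped pairs by key, as a MapReduce runtime would.
    groups = defaultdict(list)
    for event in log:
        for key, value in map_event(event):
            groups[key].append(value)
    return dict(reduce_events(key, values) for key, values in groups.items())


# Hypothetical workflow enactment events.
log = [
    {"case": "c1", "activity": "approve", "performer": "kim", "timestamp": 3},
    {"case": "c1", "activity": "submit", "performer": "lee", "timestamp": 1},
    {"case": "c2", "activity": "submit", "performer": "park", "timestamp": 2},
]
clusters = cluster_by_activity(log)
for activity in sorted(clusters):
    print(activity, [e["case"] for e in clusters[activity]])
```

Switching the map key from `event["activity"]` to `event["performer"]` would give the performer-based variant that the abstract mentions as the second clustering pattern.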