• Title/Abstract/Keywords: Distributed data collection

234 search results (processing time: 0.029 s)

그리드 정보검색 시스템을 위한 OGSA-DAI 기반 확장된 Collection Manager 서비스 설계 (Design of Advanced Collection Manager Service for Grid-IR System Based on OGSA-DAI component)

  • 김혁호;김양우
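    • 한국정보처리학회:학술대회논문집 (Proceedings of the Korea Information Processing Society Conference)
    • /
    • 한국정보처리학회 2009년도 춘계학술발표대회 (Korea Information Processing Society 2009 Spring Conference)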
    • /
    • pp.846-848
    • /
    • 2009
  • Interest in accessing and integrating massive distributed data resources has increased recently. This paper presents the Advanced Collection Manager (CM) service, built on OGSA-DAI components, which can access and integrate distributed data resources. The Advanced CM service supports data resources of various types and, in cooperation with other services in the Grid Information Retrieval (Grid-IR or GIR) system, can query, update, transform, and deliver data. As a result, it can access and manage data resources more flexibly and efficiently.

분산 시스템의 성능 모니터링과 레포팅 툴의 아키텍처 모델링 (Distributed System Architecture Modeling of a Performance Monitoring and Reporting Tool)

  • 김기;최은미
    • 한국시뮬레이션학회논문지 (Journal of the Korea Society for Simulation)
    • /
    • Vol. 12, No. 3
    • /
    • pp.69-81
    • /
    • 2003
  • Managing a cluster of distributed server systems involves several aspects: configuration management, fault management, performance management, and user management. System performance monitoring and reporting play an important role in performance and fault management. In this paper, we present a distributed system architecture model of a performance monitoring and reporting tool. We introduce the architecture of four subsystems: node agent, data collection, performance management and reporting, and DB schema. The performance-related information collected from distributed servers is categorized into performance counters, event data for system status changes, service quality, and system configuration data. To analyze this performance information, we evaluate data correlation in a number of ways. Using results from a real company site and from a simulated artificial workload, we show an example of performance collection and analysis. Because our reporting tool detects system faults and node component failures and analyzes performance through resource usage and service quality, it can provide information for server load balancing in the short term, and for diagnosing the causes of system faults and deciding on system scale-out and scale-up in the long term.
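
The abstract does not say which correlation measure the tool uses. As a minimal sketch, assuming a plain Pearson coefficient between two performance counters sampled at the same instants (the counter names and values are made up for illustration), the evaluation could look like this:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)

# Hypothetical samples: CPU utilisation (%) and mean response time (ms)
cpu_percent = [20, 35, 50, 65, 80]
response_ms = [110, 130, 170, 220, 300]
r = pearson(cpu_percent, response_ms)
print(f"correlation between CPU load and response time: {r:.3f}")
```

A coefficient close to 1 would suggest that response time degrades as CPU load rises, which is the kind of relationship a load-balancing decision could build on.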

Improvement of IoT sensor data loss rate of wireless network-based smart factory management system

  • Tae-Hyung Kim;Young-Gon Kim
    • International journal of advanced smart convergence
    • /
    • Vol. 12, No. 2
    • /
    • pp.173-181
    • /
    • 2023
  • Data collection is an essential element in the construction and operation of a smart factory. The quality of data collection is strongly influenced by network conditions, and existing wireless network systems for IoT inevitably lose data depending on wireless signal strength. This data loss increases system instability, because decisions are made on incorrect data. In this study, I designed a distributed MQTT IoT smart sensor and gateway structure that supports wireless multicasting for smooth sensor data collection. With this structure, significant results were obtained for service latency and packet data loss rate even in a wireless environment, unlike the MQTT QoS-based system. This study therefore makes it possible to implement a data collection management system optimized for the domestic smart factory manufacturing environment, one that prevents data loss and delay caused by abnormal data generation and minimizes the need for management personnel.

도커 기반의 실시간 데이터 연계 및 처리 환경을 고려한 빅데이터 관리 플랫폼 개발 (Development of Big-data Management Platform Considering Docker Based Real Time Data Connecting and Processing Environments)

  • 김동길;박용순;정태윤
    • 대한임베디드공학회논문지 (IEMEK Journal of Embedded Systems and Applications)
    • /
    • Vol. 16, No. 4
    • /
    • pp.153-161
    • /
    • 2021
  • Handling continuous, unstructured data requires real-time access and management that stays flexible under changing conditions. A platform can be built to collect, store, and process data on a local server or across multiple servers. The former, centralized method is easy to control but is prone to overload because all processing runs in one unit; the latter, distributed method processes data in parallel, so it responds quickly and scales system capacity easily, but its design is complex. This paper takes the distributed approach and provides data collection and processing on a single platform to derive significant insights from the varied data held by an enterprise or agency; the results are presented intuitively on dashboards, and Spark is used to improve distributed processing performance. All services are deployed and managed with Docker. All of the data used in this study was collected through Kafka. For a 4.4-gigabyte file, processing in Spark cluster mode took 2 minutes 15 seconds, about 3 minutes 19 seconds faster than in local mode.
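
The abstract reports only the cluster-mode time and the difference from local mode. The short calculation below works out the implied local-mode time and speedup, assuming the "3 minutes 19 seconds faster" figure is the plain difference in elapsed time and that a gigabyte means 1000 MB (both are assumptions, not statements from the paper):

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to total seconds."""
    return minutes * 60 + seconds

def fmt(total_seconds):
    """Format a whole number of seconds as 'M min S s'."""
    m, s = divmod(total_seconds, 60)
    return f"{m} min {s} s"

cluster_s = to_seconds(2, 15)      # reported Spark cluster-mode time
saved_s = to_seconds(3, 19)        # reported improvement over local mode
local_s = cluster_s + saved_s      # implied local-mode time
speedup = local_s / cluster_s

file_mb = 4.4 * 1000               # assumption: decimal gigabytes
cluster_mb_per_s = file_mb / cluster_s
local_mb_per_s = file_mb / local_s

print("implied local-mode time:", fmt(local_s))
print(f"speedup: {speedup:.2f}x")
print(f"throughput: {cluster_mb_per_s:.1f} MB/s (cluster) vs {local_mb_per_s:.1f} MB/s (local)")
```

Under these assumptions the local run takes 5 minutes 34 seconds, so cluster mode is roughly 2.5 times faster.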

Implementation of Efficient Distributed Crawler through Stepwise Crawling Node Allocation

  • Kim, Hyuntae;Byun, Junhyung;Na, Yoseph;Jung, Yuchul
    • 한국정보기술학회 영문논문지 (English-language journal of the Korean Institute of Information Technology)
    • /
    • Vol. 10, No. 2
    • /
    • pp.15-31
    • /
    • 2020
  • Increased use of the Internet has led to the creation of many websites, and the number of documents distributed through them has grown proportionally. However, collecting newly updated documents quickly is not easy. Web crawling has been used to continuously collect and manage new documents, but existing single-node crawling systems show limited performance, and distributed crawlers face the problem of managing crawling nodes effectively. This study proposes an efficient distributed crawler based on stepwise crawling node allocation, which identifies each website's properties and establishes crawling policies from them in order to collect a large number of documents from multiple websites. By repeatedly visiting a website, the proposed crawler can calculate the number of documents it contains, compare data collection time and the amount of data collected for different numbers of allocated nodes, and automatically allocate the optimal number of nodes to each website. In an experiment applying the proposed and single-node methods to 12 different websites, the proposed crawler's data collection time decreased significantly compared with that of a single-node crawler, because it applied a data collection policy suited to each website. The work rate of the proposed model was also confirmed to have increased.
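
The abstract does not give the allocation rule itself. One plausible reading of "stepwise" allocation is sketched below: add one node at a time to a website and stop once the extra node no longer shortens collection time by a meaningful fraction. The function names, the `min_gain` threshold, and the simulated timing model are all hypothetical stand-ins, not the authors' method.

```python
def allocate_nodes(measure_time, max_nodes, min_gain=0.10):
    """Raise the node count one step at a time; stop when the relative
    time saving from the extra node falls below min_gain."""
    best_nodes = 1
    best_time = measure_time(1)
    for nodes in range(2, max_nodes + 1):
        elapsed = measure_time(nodes)
        gain = (best_time - elapsed) / best_time
        if gain < min_gain:
            break
        best_nodes, best_time = nodes, elapsed
    return best_nodes, best_time

def simulated_time(docs, overhead_s, per_doc_s=0.5):
    """Hypothetical timing model standing in for real trial crawls:
    work is shared across nodes, but each node adds coordination cost."""
    def measure(nodes):
        return docs * per_doc_s / nodes + overhead_s * nodes
    return measure

# Two made-up websites with very different document counts
sites = {
    "small-site": simulated_time(docs=200, overhead_s=20),
    "large-site": simulated_time(docs=20000, overhead_s=20),
}
plan = {name: allocate_nodes(m, max_nodes=6)[0] for name, m in sites.items()}
print(plan)
```

With this toy model the small site stops at two nodes, because coordination overhead outweighs the benefit of a third, while the large site uses all six available nodes.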

Implementation of AIoT Edge Cluster System via Distributed Deep Learning Pipeline

  • Jeon, Sung-Ho;Lee, Cheol-Gyu;Lee, Jae-Deok;Kim, Bo-Seok;Kim, Joo-Man
    • International journal of advanced smart convergence
    • /
    • Vol. 10, No. 4
    • /
    • pp.278-288
    • /
    • 2021
  • Many recent IoT systems are cloud-based, so the large, continuous volumes of data collected from sensor nodes are processed on a data server through the cloud. However, in the centralized configuration of large-scale cloud computing, computational processing must be performed at a physical location where data collection and processing take place, and the need for edge computers to reduce the network load of the cloud system is gradually expanding. In this paper, a cluster system consisting of six inexpensive Raspberry Pi boards was constructed to perform fast data processing. We propose a "Kubernetes cluster system (KCS)" that handles large-scale data collection and analysis through model distribution and a data pipeline. To evaluate performance, a deep learning ensemble model was built, and accuracy, processing performance, and processing time were compared and analyzed for the proposed KCS and for model distribution. The ensemble model achieved excellent accuracy, but the KCS implemented as a data pipeline proved superior in processing speed.

Segmentation and Classification of Lidar data

  • Tseng, Yi-Hsing;Wang, Miao
    • 대한원격탐사학회:학술대회논문집 (Proceedings of the Korean Society of Remote Sensing Conference)
    • /
    • 대한원격탐사학회 (Korean Society of Remote Sensing) 2003, Proceedings of ACRS 2003 ISRS
    • /
    • pp.153-155
    • /
    • 2003
  • Laser scanning has become a viable technique for collecting large amounts of accurate 3D point data densely distributed over the scanned object surface. The inherent 3D nature of the sub-randomly distributed point cloud provides abundant spatial information. Extracting valuable spatial information from laser-scanned data, such as digital elevation models, building models, and vegetation volumes, has become an active research topic. The sub-randomly distributed point cloud should be segmented and classified before spatial information is extracted. This paper investigates some existing segmentation methods and then proposes an octree-based split-and-merge segmentation method that divides lidar data into clusters belonging to 3D planes. The lidar data can then be classified based on the derived attributes of the extracted 3D planes. Test results on both ground and airborne lidar data show the potential of this method for extracting spatial features from lidar data.
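
To make the split half of a split-and-merge scheme concrete, here is a rough sketch of an octree split: a cell is subdivided into octants until the points in it fit a plane within a tolerance. The plane test (a least-squares fit of z = a*x + b*y + c, which cannot represent vertical planes), the tolerance, and the stopping limits are simplifying assumptions, and the merge phase that would join neighbouring coplanar cells is omitted. This is not the authors' implementation.

```python
from math import sqrt

def plane_rms(points):
    """RMS residual of a least-squares fit z = a*x + b*y + c."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    mz = sum(p[2] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    sxz = sum((p[0] - mx) * (p[2] - mz) for p in points)
    syz = sum((p[1] - my) * (p[2] - mz) for p in points)
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        a = b = 0.0  # degenerate footprint: fall back to a horizontal plane
    else:
        a = (sxz * syy - syz * sxy) / det
        b = (syz * sxx - sxz * sxy) / det
    sq = sum((p[2] - mz - a * (p[0] - mx) - b * (p[1] - my)) ** 2 for p in points)
    return sqrt(sq / n)

def split(points, lo, hi, tol=0.01, min_points=4, depth=0, max_depth=8):
    """Return leaf point sets; a cell stops splitting once it is planar."""
    if len(points) <= min_points or depth >= max_depth or plane_rms(points) <= tol:
        return [points]
    mid = tuple((l + h) / 2 for l, h in zip(lo, hi))
    octants = {}
    for p in points:
        key = tuple(p[i] >= mid[i] for i in range(3))  # which octant p falls in
        octants.setdefault(key, []).append(p)
    leaves = []
    for key, pts in octants.items():
        child_lo = tuple(mid[i] if key[i] else lo[i] for i in range(3))
        child_hi = tuple(hi[i] if key[i] else mid[i] for i in range(3))
        leaves.extend(split(pts, child_lo, child_hi, tol, min_points, depth + 1, max_depth))
    return leaves

# Synthetic "step" surface: z = 0 on one half, z = 1 on the other
points = [((i + 0.5) / 10, (j + 0.5) / 10, 0.0 if i < 5 else 1.0)
          for i in range(10) for j in range(10)]
leaves = split(points, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
print(len(leaves), "planar cells")
```

On this synthetic step the root cell is not planar, so it is split once, and each resulting non-empty octant contains points from a single flat level. A merge pass would then combine the octants that share the same plane.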

분산 멀티미디어 데이터베이스에 대한 수집 융합 알고리즘 (Collection Fusion Algorithm in Distributed Multimedia Databases)

  • 김덕환;이주흥;이석룡;정진완
    • 한국정보과학회논문지:데이타베이스 (Journal of KIISE: Databases)
    • /
    • Vol. 28, No. 3
    • /
    • pp.406-417
    • /
    • 2001
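  • As multimedia databases on the Web have developed, the need for retrieval over distributed multimedia data has grown. However, research so far has mainly addressed selecting text databases distributed on the Web and merging the query results from the selected text databases, while multimedia databases have received little attention. Multimedia databases on the Web are autonomous and heterogeneous and are searched mainly by content. The collection fusion problem in multimedia databases concerns merging the results of content-based retrieval from heterogeneous multimedia databases on the Web. This problem is critical for retrieval in distributed multimedia databases but has not yet been studied. This paper proposes new algorithms that handle collection fusion for heterogeneous multimedia databases on the Web. The paper uses a heuristic method for estimating the number of objects to retrieve from each database and an algorithm based on linear regression. Experiments demonstrate the effectiveness of these algorithms. They can serve as a basis for future distributed content-based retrieval algorithms over multimedia databases on the Web.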
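
The abstract gives no details of either algorithm. The sketch below only illustrates how linear regression can make similarity scores from heterogeneous databases comparable before merging: each database gets a fitted line that maps its local scores onto a common scale, and the merged list is then ranked on that scale. The database names, calibration pairs, and scores are invented for illustration, and this should not be read as the authors' algorithm.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fuse(results, calibration, k):
    """Map each database's local scores to a common scale, then take the top k."""
    merged = []
    for db, hits in results.items():
        slope, intercept = fit_line(*calibration[db])
        for obj_id, local_score in hits:
            merged.append((slope * local_score + intercept, db, obj_id))
    merged.sort(reverse=True)
    return [(db, obj_id) for _, db, obj_id in merged[:k]]

# Hypothetical calibration pairs: (local scores, scores on the common scale)
calibration = {
    "A": ([0.2, 0.5, 0.9], [0.2, 0.5, 0.9]),   # already on a 0..1 scale
    "B": ([10, 50, 90], [0.1, 0.5, 0.9]),      # reports scores on a 0..100 scale
}
# Hypothetical content-based retrieval results per database
results = {
    "A": [("a1", 0.8), ("a2", 0.4)],
    "B": [("b1", 95), ("b2", 30)],
}
top = fuse(results, calibration, k=3)
print(top)
```

Without the per-database mapping, every hit from database B would outrank every hit from A simply because B reports larger raw numbers.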

A Secure Healthcare System Using Holochain in a Distributed Environment

  • Jong-Sub Lee;Seok-Jae Moon
    • International Journal of Internet, Broadcasting and Communication
    • /
    • Vol. 15, No. 4
    • /
    • pp.261-269
    • /
    • 2023
  • We propose a Holochain-based security and privacy protection system for resource-constrained IoT healthcare systems. Analysis and performance evaluation confirmed that the proposed system operates effectively in the IoT healthcare environment. The system consists of four main layers aimed at the secure collection, transmission, storage, and processing of important medical data in IoT healthcare environments. The first, the perception layer, consists of various IoT devices such as wearable devices, sensors, and other medical devices. These devices collect patient health data and pass it on to the network layer. The second, the network connectivity layer, assigns an IP address to the collected data and ensures that the data is transmitted reliably over the network. Transmission takes place via standardized protocols, which ensures data reliability and availability. The third, the distributed cloud layer, is a Holochain-based distributed data store that holds important medical information collected from resource-limited IoT devices. This layer manages data integrity and access control and allows users to share data securely. Finally, the fourth, the application layer, provides useful information and services to end users, patients, and healthcare professionals. The structuring and presentation of data and the interaction between applications are managed at this layer. In contrast to traditional centralized or blockchain-based systems, this structure aims to provide security, privacy, and resource efficiency suitable for IoT healthcare systems. In summary, we design and propose a Holochain-based security and privacy protection system for a better IoT healthcare system.

도로 침수영역의 탐색을 위한 빅데이터 분석 시스템 연구 (A Study on the Big Data Analysis System for Searching of the Flooded Road Areas)

  • 송영미;김창수
    • 한국멀티미디어학회논문지 (Journal of Korea Multimedia Society)
    • /
    • Vol. 18, No. 8
    • /
    • pp.925-934
    • /
    • 2015
  • As global warming makes natural disasters more frequent, the risk of flooding from typhoons and torrential rain has also increased. In particular, sudden torrential rain floods roads, causing vehicle damage and personal injury. Because a road can become submerged within moments, rapid data collection and quick-response systems need to be studied. Our research proposes a big data analysis system that searches for road areas flooded by torrential rain, based on information gathered through a variety of collection methods. The flooded-road data draws on SNS data, meteorological data, road link data, and other sources. The big data analysis system is implemented as a distributed processing system on the Hadoop platform.