• Title/Summary/Keyword: Sharding technology

Distributed Processing System for Aggregate/Analytical Functions on CUBRID Shard Distributed Databases (큐브리드 샤드 분산 데이터베이스에서 집계/분석 함수의 분산 처리 시스템 개발)

  • Won, Jiseop;Kang, Suk;Jo, Sunhwa;Kim, Jinho
    • KIISE Transactions on Computing Practices / v.21 no.8 / pp.537-542 / 2015
  • Database sharding is a technique for storing and querying data by horizontally dividing one logical table across multiple databases. Analyzing sharded data with aggregate or analytic functions requires a step that integrates the partial results from each shard database. In this paper, we introduce the design and implementation of a distributed processing system for aggregation and analysis on the CUBRID Shard distributed database, which is an open source database management system. The implemented system accelerates analysis across multiple shards of partitioned tables, and it aggregates more efficiently on sharded distributed databases than on a stand-alone database.
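
The integration step described in this abstract can be sketched as follows. This is a minimal illustration in plain Python, not the paper's system or the CUBRID API: the names `Partial`, `partial_aggregate`, and `merge_partials` are hypothetical, and each shard is modeled as a list of numeric values. The point it shows is that an average cannot be merged from per-shard averages, so each shard returns a count and a sum instead.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Partial:
    """Partial aggregate computed locally on one shard (hypothetical type)."""
    count: int
    total: float
    minimum: float
    maximum: float


def partial_aggregate(rows: Sequence[float]) -> Optional[Partial]:
    """Compute the partial result a single shard would return for its rows."""
    if not rows:
        return None  # an empty shard contributes nothing
    return Partial(len(rows), sum(rows), min(rows), max(rows))


def merge_partials(partials: Sequence[Optional[Partial]]) -> Optional[dict]:
    """Integrate per-shard partial results into the global aggregates."""
    parts = [p for p in partials if p is not None]
    if not parts:
        return None
    count = sum(p.count for p in parts)
    total = sum(p.total for p in parts)
    return {
        "count": count,
        "sum": total,
        "avg": total / count,  # global AVG = global SUM / global COUNT
        "min": min(p.minimum for p in parts),
        "max": max(p.maximum for p in parts),
    }


# Three shards of one logical column; the third shard holds no rows.
shards = [[1, 2, 3], [4, 5], []]
result = merge_partials([partial_aggregate(s) for s in shards])
```

Averaging the per-shard averages here (2.0 and 4.5) would give 3.25, whereas merging counts and sums gives the correct global average.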

Performance Evaluation: Parameter Sharding approaches for DNN Models with a Very Large Layer (불균형한 DNN 모델의 효율적인 분산 학습을 위한 파라미터 샤딩 기술 성능 평가)

  • Choi, Ki-Bong;Ko, Yun-Yong;Kim, Sang-Wook
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.881-882 / 2020
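  • Recent advances in deep learning have made it possible to solve many problems that existing machine learning techniques could not solve successfully. Because training a deep learning model requires a very large amount of computation, distributed training, in which multiple nodes train the model together, has been studied. Parameter-server-based methods are representative distributed training approaches, but they have the limitation that the parameter server node can become the training bottleneck. In this paper, we introduce parameter sharding techniques that address this parameter server bottleneck, compare the training performance of each technique, and analyze the results.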
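
To make the bottleneck concrete, the sketch below compares two generic ways of spreading parameters over several parameter servers. It is an illustration under assumptions, not the specific techniques evaluated in the paper: `shard_by_layer` and `shard_evenly` are hypothetical names, and a model is reduced to a list of per-layer parameter counts.

```python
from typing import List


def shard_by_layer(layer_sizes: List[int], num_servers: int) -> List[int]:
    """Assign each whole layer to the currently least-loaded server.

    Returns the number of parameters held by each server.
    """
    loads = [0] * num_servers
    for size in sorted(layer_sizes, reverse=True):
        target = loads.index(min(loads))  # least-loaded server so far
        loads[target] += size
    return loads


def shard_evenly(layer_sizes: List[int], num_servers: int) -> List[int]:
    """Split the flattened parameter vector into near-equal slices,
    ignoring layer boundaries.
    """
    total = sum(layer_sizes)
    base, extra = divmod(total, num_servers)
    return [base + (1 if i < extra else 0) for i in range(num_servers)]


# A model with one very large layer (sizes are made up for illustration).
layers = [1000, 50, 30, 20]
per_layer_loads = shard_by_layer(layers, 4)
even_loads = shard_evenly(layers, 4)
```

With whole-layer assignment, the server holding the large layer carries 1000 of the 1100 parameters and remains a bottleneck. Splitting across layer boundaries gives every server 275.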

Multi-blockchain System based on Directed Acyclic Graph for Increasing Data Throughput (데이터 처리량 향상을 위한 유향 비순환 그래프 기반의 멀티블록체인 시스템)

  • CHEN, Hao-Tian;Kim, Tae Woo;Park, Jong Hyuk
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.25-28 / 2021
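  • A blockchain is a database that is decentralized, tamper-resistant, traceable, and jointly maintained by its nodes; it can realize a peer-to-peer communication network that solves the problem of trust in communication between nodes that do not trust each other. In recent years, blockchain technology has continued to develop and has drawn attention as an important technology for solving data security problems. Applications of blockchain are expanding from the original domain of digital currency to finance, government affairs, and industrial manufacturing. Owing to its characteristics, however, a blockchain performs far worse than distributed data communication and has limited throughput. In this paper, we survey recent research on blockchain security architecture and performance analysis and, by comparison with previously studied techniques, consider how to improve performance while maintaining the safety of the blockchain. We then propose a method that strengthens safety and performance using a directed acyclic graph (DAG) and sharding. The proposed system uses a DAG to gain tamper resistance and faster processing, and it uses sharding to increase data throughput. Finally, the proposed system is compared with existing blockchains in terms of stability and data throughput.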

Research Trends on Distributed Storage Technology for Blockchain Transaction Data (블록체인 트랜잭션 데이터 분산 저장 기술 동향)

  • Choi, B.J.;Kim, C.S.;Lee, M.C.
    • Electronics and Telecommunications Trends / v.37 no.3 / pp.85-96 / 2022
  • Blockchain technology, which can decentralize business ecosystems using secure transactions without trusted intermediaries, has recently been in the spotlight. Full nodes play an important role in maintaining decentralization in that they independently verify transactions using their full historical transaction data. However, the storage a full node needs for historical data is continuously increasing, and it has thus become harder for users to run a full node because of high storage costs. In this paper, we investigate research trends on reducing the costs of storing blockchain transaction data so that nodes with low storage requirements can be used in the blockchain network.

A Study on Blockchain Networking for Internet of Things (사물인터넷을 위한 블록체인 네트워킹에 대한 연구)

  • Lee, Il-Gu
    • Journal of Digital Convergence / v.16 no.8 / pp.201-210 / 2018
  • High expectations are placed on the blockchain-based Internet of Things (IoT), in which IoT and blockchain technology are combined to obtain trust on the Internet, where trust appears impossible to obtain. However, applying current blockchain-based IoT technology to real-world scenarios appears to be significantly more difficult owing to limitations regarding scalability and security. In this paper, the difficulties of implementing blockchain networking technologies for IoT and digital businesses are investigated, and practical solutions such as sharding, off-chain processing, de-identification, and P2P cryptocurrency exchange are explored. In further work, a blockchain platform for IoT that provides scalability and security will be implemented according to these research results and compared with conventional blockchain platforms.

Research on Sharding Model for Enabling Cross Heterogeneous Blockchain Transactions (이기종 블록체인간 거래를 위한 샤딩모델 연구)

  • Hong, Sunghyuck
    • Journal of Digital Convergence / v.19 no.5 / pp.315-320 / 2021
  • As blockchain platforms for various purposes are developed and the blockchain ecosystem grows, interoperability problems are emerging because each blockchain operates in isolation. In this study, we introduce interchain and sidechain technologies, which are blockchains that connect blockchains, and explain examples of heterogeneous blockchain transactions and functions that apply them. In addition, blockchain, artificial intelligence, and IoT technologies, which are drawing attention in the fourth industrial revolution, are converging and developing beyond their own individual development. In this regard, we present processes for combining artificial intelligence or IoT with blockchain, and propose a model that can operate without intervention by applying the combination of blockchain, artificial intelligence, and IoT to processes for trading and exchange between heterogeneous blockchains.

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from computer system inspection and process optimization to customized optimization for users. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, in existing computing environments it is difficult to realize flexible storage expansion for a massive amount of unstructured log data and to execute the considerable number of functions needed to categorize and analyze the stored data. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for unstructured log data that are difficult to process with the existing computing infrastructure's analysis tools and management system. The proposed system uses an IaaS (Infrastructure as a Service) cloud environment to provide flexible expansion of computing resources such as storage space and memory when storage must be extended or log data increase rapidly. Moreover, to overcome the processing limits of the existing analysis tools when real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data.
Furthermore, because HDFS (Hadoop Distributed File System) stores data by replicating the blocks of the aggregated log data, the proposed system offers automatic restore functions so that it can continue operating after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data. Further, strict schemas like those of relational databases make it difficult to add nodes when rapidly growing data must be distributed across various nodes. NoSQL does not provide the complex computations that relational databases may provide but can easily expand the database by distributing it across nodes when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. NoSQL data models are usually classified as key-value, column-oriented, and document-oriented types. Of these, MongoDB, a representative document-oriented database with a schema-free structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module.
When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies the data according to the type of log data and distributes them to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, the Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and provided in real time by the log graph generator module. The log data aggregated per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are processed in parallel by the Hadoop-based analysis module. A comparative evaluation of log data insertion and query performance is carried out against a log data processing system that uses only MySQL; this evaluation demonstrates the proposed system's superiority. Moreover, an optimal chunk size is confirmed through an insert performance evaluation of MongoDB for various chunk sizes.
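
The routing performed by the log collector module can be pictured with the short sketch below. It is only an assumption-laden illustration of the data flow described in the abstract: the record fields (`type`, `realtime`), the function names, and the in-memory lists standing in for the MySQL and MongoDB modules are all hypothetical, and no real database client is used.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def choose_store(record: dict) -> str:
    """Pick the destination module for one log record.

    Records flagged as needing real-time analysis go to the MySQL module;
    all other records go to the MongoDB module for aggregated analysis.
    """
    return "mysql" if record.get("realtime", False) else "mongodb"


def collect(records: Iterable[dict]) -> Dict[str, Dict[str, List[dict]]]:
    """Classify records by log type and distribute them to the two stores."""
    stores: Dict[str, Dict[str, List[dict]]] = {
        "mysql": defaultdict(list),
        "mongodb": defaultdict(list),
    }
    for record in records:
        stores[choose_store(record)][record["type"]].append(record)
    return stores


# Made-up sample records; field names and values are illustrative only.
sample = [
    {"type": "transfer", "realtime": True, "msg": "ok"},
    {"type": "login", "realtime": False, "msg": "user signed in"},
    {"type": "login", "msg": "user signed out"},
]
routed = collect(sample)
```

In a real deployment the two dictionaries would be replaced by writes to the respective databases. The sketch only shows the classify-then-distribute step.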