• Title/Summary/Keyword: 컬럼-지향 데이터베이스

Search Result 9, Processing Time 0.02 seconds

Processing of Sensor Network Data using Column-Oriented Database (세로-지향 데이터베이스를 이용한 센서 네트워크 데이터 처리)

  • Oh, Byung-Jung;Kim, Kyung-Ho;Kim, Jae-Kyung;Kim, Kyung-Chang
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2012.06c
    • /
    • pp.66-68
    • /
    • 2012
  • 센서 네트워크에서는 센서들의 배터리의 교체나 충전이 어렵기 때문에 배터리의 수명을 최대한으로 연장하는 것이 중요하다. 본 논문에서는 센서 네트워크에서 세로-지향 데이터베이스 시스템을 적용하여 통신비용을 절감하기 위한 전략을 소개한다. 세로-지향 데이터베이스 시스템의 적용은 데이터를 컬럼으로 저장하기 때문에 질의에 해당하는 컬럼만을 불러와 메시지의 길이를 줄여 통신비용의 감소 효과를 가져와 센서들의 배터리 수명을 연장하는 효과를 가져 온다. 아울러 기존의 가로-지향 데이터베이스 시스템의 처리 방식과 비교하여 어떠한 차이점이 있는지 기술하였다.

Search Performance Improvement of Column-oriented Flash Storages using Segmented Compression Index (분할된 압축 인덱스를 이용한 컬럼-지향 플래시 스토리지의 검색 성능 개선)

  • Byun, Siwoo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.14 no.1
    • /
    • pp.393-401
    • /
    • 2013
  • Most traditional databases exploit record-oriented storage model where the attributes of a record are placed contiguously in hard disk to achieve high performance writes. However, for search-mostly datawarehouse systems, column-oriented storage has become a proper model because of its superior read performance. Today, flash memory is largely recognized as the preferred storage media for high-speed database systems. In this paper, we introduce fast column-oriented database model and then propose a new column-aware index management scheme for the high-speed column-oriented datawarehouse system. Our index management scheme which is based on enhanced $B^+$-Tree achieves high search performance by embedded flash index and unused space compression in internal and leaf nodes. Based on the results of the performance evaluation, we conclude that our index management scheme outperforms the traditional scheme in the respect of the search throughput and response time.

Column-aware Transaction Management Scheme for Column-Oriented Databases (컬럼-지향 데이터베이스를 위한 컬럼-인지 트랜잭션 관리 기법)

  • Byun, Si-Woo
    • Journal of Internet Computing and Services
    • /
    • v.15 no.4
    • /
    • pp.125-133
    • /
    • 2014
  • The column-oriented database storage is a very advanced model for large-volume data analysis systems because of its superior I/O performance. Traditional data storages exploit row-oriented storage where the attributes of a record are placed contiguously in hard disk for fast write operations. However, for search-mostly datawarehouse systems, column-oriented storage has become a more proper model because of its superior read performance. Recently, solid state drive using MLC flash memory is largely recognized as the preferred storage media for high-speed data analysis systems. The features of non-volatility, low power consumption, and fast access time for read operations are sufficient grounds to support flash memory as major storage components of modern database servers. However, we need to improve traditional transaction management scheme due to the relatively slow characteristics of column compression and flash operation as compared to RAM memory. In this research, we propose a new scheme called Column-aware Multi-Version Locking (CaMVL) scheme for efficient transaction processing. CaMVL improves transaction performance by using compression lock and multi version reads for efficiently handling slow flash write/erase operation in lock management process. We also propose a simulation model to show the performance of CaMVL. Based on the results of the performance evaluation, we conclude that CaMVL scheme outperforms the traditional scheme.

Column-aware Polarization Scheme for High-Speed Database Systems (고속 데이터베이스 시스템을 위한 컬럼-인지 양분화 기법)

  • Byun, Si-Woo
    • Journal of Internet Computing and Services
    • /
    • v.13 no.3
    • /
    • pp.83-91
    • /
    • 2012
  • Recently, column-oriented storage has become a progressive model for high-speed database systems because of its superior I/O performance. In this paper, we analysis traditional raw-oriented storage model and then propose a new column-aware storage management model using flash memory drive and assist drive to improve the effective performance of the high-speed column-oriented database system. Our storage management scheme called column-aware polarization improves the performance of update operation by dividing and compressing table columns into active-columns or inactive-columns, and balancing congested update operations using a assist drive in high workload periods. The results obtained from experimental tests show that our scheme improves the update throughput of column-oriented storage by 19 percent, and the response time by up to 49 percent.

Cross Compressed Replication Scheme for Large-Volume Column Storages (대용량 컬럼 저장소를 위한 교차 압축 이중화 기법)

  • Byun, Siwoo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.14 no.5
    • /
    • pp.2449-2456
    • /
    • 2013
  • The column-oriented database storage is a very advanced model for large-volume data analysis systems because of its superior I/O performance. Traditional data storages exploit row-oriented storage where the attributes of a record are placed contiguously in hard disk for fast write operations. However, for search-mostly datawarehouse systems, column-oriented storage has become a more proper model because of its superior read performance. Recently, solid state drive using MLC flash memory is largely recognized as the preferred storage media for high-speed data analysis systems. In this paper, we introduce fast column-oriented data storage model and then propose a new storage management scheme using a cross compressed replication for the high-speed column-oriented datawarehouse system. Our storage management scheme which is based on two MLC SSD achieves superior performance and reliability by the cross replication of the uncompressed segment and the compressed segment under high workloads of CPU and I/O. Based on the results of the performance evaluation, we conclude that our storage management scheme outperforms the traditional scheme in the respect of update throughput and response time of the column segments.

Shadow Recovery for Column-based Databases (컬럼-기반 데이터베이스를 위한 그림자 복구)

  • Byun, Si-Woo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.4
    • /
    • pp.2784-2790
    • /
    • 2015
  • The column-oriented database storage is a very advanced model for large-volume data transactions because of its superior I/O performance. Traditional data storages exploit row-oriented storage where the attributes of a record are placed contiguously in hard disk for fast write operations. However, for search-mostly data warehouse systems, column-oriented storage has become a more proper model because of its superior read performance. Recently, solid state drive using flash memory is largely recognized as the preferred storage media for high-speed data analysis systems. In this research, we propose a new transaction recovery scheme for a column-oriented database environment which is based on a flash media file system. We improved traditional shadow paging schemes by reusing old data pages which are supposed to be invalidated in the course of writing a new data page in the flash file system environment. In order to reuse these data pages, we exploit reused shadow list structure in our column-oriented shadow recovery(CoSR) scheme. CoSR scheme minimizes the additional storage overhead for keeping shadow pages and minimizes the I/O performance degradation caused by column data compression of traditional recovery schemes. Based on the results of the performance evaluation, we conclude that CoSR outperforms the traditional schemes by 17%.

Cloud-based Intelligent Management System for Photovoltaic Power Plants (클라우드 기반 태양광 발전단지 통합 관리 시스템)

  • Park, Kyoung-Wook;Ban, Kyeong-Jin;Song, Seung-Heon;Kim, Eung-Kon
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.7 no.3
    • /
    • pp.591-596
    • /
    • 2012
  • Recently, the efficient management system for photovoltaic power plants has been required due to the continuously increasing construction of photovoltaic power plants. In this paper, we propose a cloud-based intelligent management system for many photovoltaic power plants. The proposed system stores the measured data of power plants using Hadoop HBase which is a column-oriented database, and processes the calculations of performance, efficiency, and prediction the amount of power generation by parallel processing based on Map-Reduce model. And, Web-based data visualization module allows the administrator to provide information in various forms.

Livestock Disease Forecasting and Smart Livestock Farm Integrated Control System based on Cloud Computing (클라우드 컴퓨팅기반 가축 질병 예찰 및 스마트 축사 통합 관제 시스템)

  • Jung, Ji-sung;Lee, Meong-hun;Park, Jong-kweon
    • Smart Media Journal
    • /
    • v.8 no.3
    • /
    • pp.88-94
    • /
    • 2019
  • Livestock disease is a very important issue in the livestock industry because if livestock disease is not responded quickly enough, its damage can be devastating. To solve the issues involving the occurrence of livestock disease, it is necessary to diagnose in advance the status of livestock disease and develop systematic and scientific livestock feeding technologies. However, there is a lack of domestic studies on such technologies in Korea. This paper, therefore, proposes Livestock Disease Forecasting and Livestock Farm Integrated Control System using Cloud Computing to quickly manage livestock disease. The proposed system collects a variety of livestock data from wireless sensor networks and application. Moreover, it saves and manages the data with the use of the column-oriented database Hadoop HBase, a column-oriented database management system. This provides livestock disease forecasting and livestock farm integrated controlling service through MapReduce Model-based parallel data processing. Lastly, it also provides REST-based web service so that users can receive the service on various platforms, such as PCs or mobile devices.

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services
    • /
    • v.14 no.6
    • /
    • pp.71-84
    • /
    • 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. Furthermore, because the HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic restore functions for the system to continually operate after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based Mongo DB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as the MySQL databases have complex schemas that are inappropriate for processing unstructured log data. Further, strict schemas like those of relational databases cannot expand nodes in the case wherein the stored data are distributed to various nodes when the amount of data rapidly increases. NoSQL does not provide the complex computations that relational databases may provide but can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. The data models of the NoSQL are usually classified as Key-Value, column-oriented, and document-oriented types. Of these, the representative document-oriented data model, MongoDB, which has a free schema structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies data according to the type of log data and distributes it to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require a real-time log data analysis are stored in the MySQL module and provided real-time by the log graph generator module. The aggregated log data per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are parallel-distributed and processed by the Hadoop-based analysis module. A comparative evaluation is carried out against a log data processing system that uses only MySQL for inserting log data and estimating query performance; this evaluation proves the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.