• Title/Summary/Keyword: distributed data storage

Search results: 296

The Creation and Placement of VMs and Tasks in Virtualized Hadoop Cluster Environments

  • Kim, Tae-Won; Chung, Hae-jin; Kim, Joon-Mo
    • Journal of Korea Multimedia Society, v.15 no.12, pp.1499-1505, 2012
  • Recently, distributed processing systems for big data have been actively investigated owing to the development of high-speed network and storage technologies. In addition, virtualization, which enables efficient use of system resources through server consolidation, has been increasingly recognized. However, configuring a distributed big data processing system in a virtual machine environment raises many problems. In this paper, we experimented with optimizing I/O bandwidth through the creation and placement of VMs and tasks in a Hadoop cluster composed in a virtual environment, and we evaluated the results. These results will be used in a study on developing a Hadoop scheduler that supports I/O bandwidth balancing in virtual environments.
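To make the bandwidth-balancing idea concrete, here is a minimal placement heuristic in the same spirit: assign each task to the physical host with the most spare disk I/O bandwidth. This is an illustrative sketch, not the authors' scheduler; the `Host` class, capacities, and demands are invented for the example.

```python
# Hypothetical I/O-bandwidth-balancing placement heuristic; illustrative only,
# not the paper's implementation. All names and numbers are invented.
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    io_capacity_mbps: float          # total disk I/O bandwidth of the physical host
    tasks: list = field(default_factory=list)

    def io_load(self) -> float:
        # Sum of bandwidth demands of tasks already placed on this host's VMs.
        return sum(demand for _, demand in self.tasks)

def place_task(hosts: list[Host], task: str, demand_mbps: float) -> Host:
    """Place a task on the host with the most spare I/O bandwidth."""
    best = max(hosts, key=lambda h: h.io_capacity_mbps - h.io_load())
    best.tasks.append((task, demand_mbps))
    return best

hosts = [Host("host-1", 400.0), Host("host-2", 400.0)]
for i, demand in enumerate([120.0, 90.0, 150.0, 60.0]):
    h = place_task(hosts, f"map-task-{i}", demand)
    print(f"map-task-{i} ({demand} MB/s) -> {h.name}")
```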

Study of Optimization through Performance Analysis of Parallel Distributed Filesystem (병렬 분산파일시스템의 성능 분석을 통한 최적화 연구)

  • Yoon, JunWeon; Song, Ui-Sung
    • Journal of Digital Contents Society, v.17 no.5, pp.409-416, 2016
  • Recently, big data has become a major issue, and universities, industries, and research institutes have made efforts to collect and analyze diverse data. Such collections include data accumulated in the past that cannot be analyzed immediately but still holds potential value, and semantic analysis of large volumes of collected data can yield valuable results. Demand for high-performance storage systems that can handle large amounts of data is therefore increasing worldwide. Such a system must also provide a parallel distributed file system stable enough for multiple users to run a variety of analyses concurrently against the accumulated data. In this study, we examine the I/O bandwidth of the storage system and the performance of its metadata handling in order to provide a stable file system, and we propose a method for configuring the optimal environment.
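The I/O-bandwidth side of such an analysis ultimately reduces to timing data movement. A minimal sketch of that measurement, assuming an ordinary POSIX filesystem, is shown below; production benchmarks such as IOR control caching and parallelism far more carefully.

```python
# Minimal sequential-write bandwidth probe: bytes written / elapsed seconds.
# A sketch for illustration; real filesystem benchmarks are far more careful.
import os
import time

def sequential_write_bandwidth(path: str, total_mb: int = 256, block_kb: int = 1024) -> float:
    block = os.urandom(block_kb * 1024)
    n_blocks = (total_mb * 1024) // block_kb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())          # force data to disk so the timing is honest
    elapsed = time.perf_counter() - start
    os.remove(path)
    return total_mb / elapsed          # MB/s

print(f"sequential write: {sequential_write_bandwidth('/tmp/io_probe.bin'):.1f} MB/s")
```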

Performance Optimization of Big Data Center Processing System - Big Data Analysis Algorithm Based on Location Awareness

  • Zhao, Wen-Xuan; Min, Byung-Won
    • International Journal of Contents, v.17 no.3, pp.74-83, 2021
  • A location-aware algorithm is proposed in this study to optimize the performance of distributed big data processing systems that suffer from low data reliability and poor application performance. Compared with previous algorithms, the location-aware data block placement algorithm uses data block placement and node data recovery strategies to improve data application performance and reliability. Simulation and actual cluster tests showed that the proposed location-aware placement algorithm can greatly improve data reliability and shorten the real-time application processing time of I/O interfaces.
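The abstract does not give the algorithm itself, but a typical location-aware placement rule spreads replicas across failure domains. The sketch below shows one such rule, in the spirit of HDFS rack awareness rather than the paper's exact method; the topology map and node names are invented.

```python
# Location-aware replica placement sketch: first replica near the writer,
# remaining replicas on distinct remote racks for failure independence.
# Topology and node names are hypothetical.
import random

rack_of = {
    "dn1": "rack-A", "dn2": "rack-A",
    "dn3": "rack-B", "dn4": "rack-B",
    "dn5": "rack-C",
}

def place_replicas(writer: str, replication: int = 3) -> list[str]:
    local_rack = rack_of[writer]
    same_rack = [n for n in rack_of if rack_of[n] == local_rack]
    other_racks = [n for n in rack_of if rack_of[n] != local_rack]
    replicas = [random.choice(same_rack)]     # 1st replica close to the writer
    random.shuffle(other_racks)
    seen_racks = set()
    for node in other_racks:                  # spread the rest across racks
        if rack_of[node] not in seen_racks:
            replicas.append(node)
            seen_racks.add(rack_of[node])
        if len(replicas) == replication:
            break
    return replicas

print(place_replicas("dn1"))
```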

Quality monitoring of distributed materials from Glycyrrhizae Radix, Atractylodis Rhizoma Alba according to storage period (감초, 백출 유통품의 보관기간별 품질 모니터링)

  • Chun, Jin-Mi; Jang, Seol; Shim, Ji-Hoon; Lee, A-Yeong; Jeon, Won-Kyung; Lee, Hye-Won; Choo, Byung-Kil; Kim, Ho-Kyoung
    • Korean Journal of Oriental Medicine, v.12 no.3 s.18, pp.79-90, 2006
  • This study monitored the quality of distributed materials from Glycyrrhizae Radix (26 samples) and Atractylodis Rhizoma Alba (24 samples) according to storage periods of 1 to 3 years. We evaluated identification, purity, loss on drying, ash, acid-insoluble ash, extract content, essential oil content, assay, and microbial contamination. As a result, all 26 Glycyrrhizae Radix samples satisfied the standards of the K.P. (Korean Pharmacopoeia) and the WHO microbial contamination limits. Of the 24 Atractylodis Rhizoma Alba samples, 2 did not satisfy the K.P. and WHO microbial contamination limit standards. The results provide basic data for the quality control of herbal medicines in storage.


Development of the design methodology for large-scale database based on MongoDB

  • Lee, Jun-Ho; Joo, Kyung-Soo
    • Journal of the Korea Society of Computer and Information, v.22 no.11, pp.57-63, 2017
  • The recent explosion of big data is characterized by continuous data generation, large volume, and unstructured formats. Existing relational database technologies are inadequate for such big data because of limited processing speed and the significant cost of storage expansion. Thus, big data processing technologies, normally based on distributed file systems, distributed database management, and parallel processing, have emerged as core technologies for implementing big data repositories. In this paper, we propose a design methodology for large-scale databases based on MongoDB that extends the information engineering methodology built on the E-R data model.
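As a concrete illustration of the kind of mapping such a methodology produces (this example is not taken from the paper), a 1:N relationship in an E-R model can be collapsed into a single MongoDB document with an embedded array, trading the relational join for one document read. Requires a local mongod and the pymongo package.

```python
# Hypothetical mapping of a 1:N E-R relationship (Order 1--N OrderLine)
# to one MongoDB document with an embedded array. Illustrative data only.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
orders = client["shop"]["orders"]

orders.insert_one({
    "_id": 1001,
    "customer": "J. Lee",
    "placed": "2017-11-01",
    "lines": [                                  # embedded N-side of the relationship
        {"sku": "A-17", "qty": 2, "price": 9.5},
        {"sku": "B-03", "qty": 1, "price": 24.0},
    ],
})

# One document read returns the whole order; no join is needed.
print(orders.find_one({"_id": 1001}))
```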

An Efficient Design and Implementation of an MdbULPS in a Cloud-Computing Environment

  • Kim, Myoungjin; Cui, Yun; Lee, Hanku
    • KSII Transactions on Internet and Information Systems (TIIS), v.9 no.8, pp.3182-3202, 2015
  • Flexibly expanding the storage capacity required to process a large amount of rapidly increasing unstructured log data is difficult in a conventional computing environment. In addition, implementing a log processing system providing features that categorize and analyze unstructured log data is extremely difficult. To overcome such limitations, we propose and design a MongoDB-based unstructured log processing system (MdbULPS) for collecting, categorizing, and analyzing log data generated from banks. The proposed system includes a Hadoop-based analysis module for reliable parallel-distributed processing of massive log data. Furthermore, because the Hadoop distributed file system (HDFS) stores data by generating replicas of collected log data in block units, the proposed system offers automatic system recovery against system failures and data loss. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. To evaluate the proposed system, we conducted three different performance tests on a local test bed of twelve nodes: comparing our system with a MySQL-based approach, comparing it with an HBase-based approach, and changing the chunk size option. The experiments showed that our system performs better at processing unstructured log data.
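A minimal sketch of the collect-categorize-store path the abstract describes: parse a raw log line into a structured document and insert it into MongoDB. The log format, category rule, and collection names below are assumptions for illustration, not the authors' system.

```python
# Hypothetical ingest step: categorize an unstructured log line, then store it
# as a MongoDB document. Requires pymongo and a running mongod.
import re
from datetime import datetime, timezone
from pymongo import MongoClient

LOG_RE = re.compile(r"(?P<ts>\S+ \S+) (?P<level>\w+) (?P<msg>.*)")

def categorize(line: str) -> dict:
    m = LOG_RE.match(line)
    # Unparsable lines are kept raw so nothing is silently dropped.
    doc = m.groupdict() if m else {"raw": line, "level": "UNPARSED"}
    doc["ingested"] = datetime.now(timezone.utc)
    return doc

logs = MongoClient("mongodb://localhost:27017")["bank"]["logs"]
logs.insert_one(categorize("2015-08-01 12:00:01 ERROR ATM-7 timeout"))
```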

Issues in structural health monitoring employing smart sensors

  • Nagayama, T.; Sim, S.H.; Miyamori, Y.; Spencer, B.F. Jr.
    • Smart Structures and Systems, v.3 no.3, pp.299-320, 2007
  • Smart sensors densely distributed over structures can provide rich information for structural monitoring using their onboard wireless communication and computational capabilities. However, issues such as time synchronization error, data loss, and dealing with large amounts of harvested data have limited the implementation of full-fledged systems. Limited network resources (e.g. battery power, storage space, bandwidth, etc.) make these issues quite challenging. This paper first investigates the effects of time synchronization error and data loss, aiming to clarify requirements on synchronization accuracy and communication reliability in SHM applications. Coordinated computing is then examined as a way to manage large amounts of data.
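The time-synchronization error the paper investigates is usually quantified with the classic two-way timestamp exchange. A sketch of the offset estimate follows, assuming a symmetric link delay; the timestamps are invented for illustration.

```python
# Classic two-way timestamp exchange (NTP-style) between a gateway and a
# sensor node. Example numbers are made up; symmetric delay is assumed.
def clock_offset(t1: float, t2: float, t3: float, t4: float) -> float:
    """
    t1: request sent (gateway clock)   t2: request received (sensor clock)
    t3: reply sent (sensor clock)      t4: reply received (gateway clock)
    Assuming symmetric link delay, the sensor's offset from the gateway is
        offset = ((t2 - t1) + (t3 - t4)) / 2
    """
    return ((t2 - t1) + (t3 - t4)) / 2.0

# Example: sensor clock runs ~5 ms ahead, one-way delay ~2 ms.
print(f"estimated offset: {clock_offset(0.000, 0.007, 0.008, 0.005) * 1e3:.1f} ms")
```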

Dynamic Cluster Management of Hadoop Distributed Filesystem (하둡 분산 파일시스템의 동적 클러스터 관리 기법)

  • Ryu, Wooseok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2016.10a, pp.435-437, 2016
  • The Hadoop Distributed File System (HDFS) is a file system for distributed processing of big data that replicates data across distributed data nodes. An HDFS cluster scales well up to thousands of nodes, but it assumes an exclusive, dedicated cluster with numerous nodes for big data processing; general-purpose office worker machines are rarely considered as candidate cluster members. This paper discusses this problem and proposes a dynamic cluster management technique to increase the storage capacity and analytic performance of a Hadoop cluster. The proposed technique can dynamically add legacy systems to the cluster and remove them from it depending on their availability.
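One plausible way to realize such dynamic membership on stock HDFS (a sketch under assumptions, not the paper's implementation) is to rewrite the NameNode's exclude file from an availability probe and ask the NameNode to re-read it. `hdfs dfsadmin -refreshNodes` is a real HDFS command; the probe, file path, and host list below are illustrative.

```python
# Hypothetical decommission/recommission loop for office PCs in an HDFS cluster.
import subprocess

OFFICE_NODES = ["desk-01", "desk-02", "desk-03"]     # hypothetical worker PCs
EXCLUDE_FILE = "/etc/hadoop/conf/dfs.exclude"        # assumed exclude-file path

def is_available(host: str) -> bool:
    # Placeholder probe: one ping; a real policy would check idle time, load, etc.
    return subprocess.call(["ping", "-c", "1", "-W", "1", host],
                           stdout=subprocess.DEVNULL) == 0

def refresh_cluster() -> None:
    busy = [h for h in OFFICE_NODES if not is_available(h)]
    with open(EXCLUDE_FILE, "w") as f:
        f.write("\n".join(busy))                     # excluded nodes get decommissioned
    subprocess.run(["hdfs", "dfsadmin", "-refreshNodes"], check=True)

refresh_cluster()
```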


Controller Backup and Replication for Reliable Multi-domain SDN

  • Mao, Junli; Chen, Lishui; Li, Jiacong; Ge, Yi
    • KSII Transactions on Internet and Information Systems (TIIS), v.14 no.12, pp.4725-4747, 2020
  • Software-defined networking (SDN) is considered one of the most promising paradigms for future networks. To solve the scalability and performance problems that a single, centralized controller suffers from, a distributed multi-controller architecture is adopted, forming a multi-domain SDN. In a multi-domain SDN network, ensuring a reliable control plane is of great importance. In this paper, we address the reliability of multi-domain SDN against controller failure from the perspectives of backup controller deployment and controller replication. We first propose a placement algorithm for backup controllers that considers both reliability and cost. We then propose a controller replication mechanism based on shared data storage to resolve inconsistency between the active and standby controllers, together with a shared data storage layout method that considers both reliability and performance. In addition, a fault recovery and repair process is designed on top of the controller backup and shared data storage mechanisms. Simulations show that our approach can recover from and repair controller failures, and evaluation results show that the proposed backup controller placement approach is more effective than other methods.
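The paper's placement algorithm is not reproduced here, but the following greedy sketch captures the stated reliability/cost trade-off: repeatedly choose the backup site with the best risk reduction per unit cost. All probabilities, costs, and coverage sets are invented, and independent controller failures are assumed.

```python
# Greedy backup-controller placement sketch; illustrative, not the paper's method.
domains = ["d1", "d2", "d3"]
# Failure probability if a domain is served only by its active controller.
p_fail = {"d1": 0.10, "d2": 0.08, "d3": 0.12}
# Candidate backup sites: site -> (deployment cost, domains it can protect).
candidates = {
    "s1": (3.0, {"d1", "d2"}),
    "s2": (2.0, {"d2", "d3"}),
    "s3": (4.0, {"d1", "d2", "d3"}),
}

def risk(covered: set) -> float:
    # A backed-up domain fails only if both controllers fail (independence assumed).
    return sum(p_fail[d] ** 2 if d in covered else p_fail[d] for d in domains)

def greedy_backups(budget: float) -> list[str]:
    chosen, covered, spent = [], set(), 0.0
    pool = dict(candidates)
    while pool:
        # Pick the biggest risk reduction per unit cost.
        best = max(pool, key=lambda s: (risk(covered) - risk(covered | pool[s][1])) / pool[s][0])
        cost, doms = pool.pop(best)
        if spent + cost > budget or doms <= covered:
            break
        chosen.append(best)
        covered |= doms
        spent += cost
    return chosen

print(greedy_backups(budget=5.0))
```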

Robust and Auditable Secure Data Access Control in Clouds

  • Karpagadeepa, S.; Vijayakumar, P.
    • International Journal of Computer Science & Network Security, v.24 no.5, pp.95-102, 2024
    • 2024
  • In distributed computing, accessible encryption strategy over Auditable data is a hot research field. Be that as it may, most existing system on encoded look and auditable over outsourced cloud information and disregard customized seek goal. Distributed storage space get to manage is imperative for the security of given information, where information security is executed just for the encoded content. It is a smaller amount secure in light of the fact that the Intruder has been endeavored to separate the scrambled records or Information. To determine this issue we have actualize (CBC) figure piece fastening. It is tied in with adding XOR each plaintext piece to the figure content square that was already delivered. We propose a novel heterogeneous structure to evaluate the issue of single-point execution bottleneck and give a more proficient access control plot with a reviewing component. In the interim, in our plan, a CA (Central Authority) is acquainted with create mystery keys for authenticity confirmed clients. Not at all like other multi specialist get to control plots, each of the experts in our plan deals with the entire trait set independently. Keywords: Cloud storage, Access control, Auditing, CBC.