• Title/Summary/Keyword: Database Cluster System

Design and Implementation of a Benchmarking System Based on ArangoDB (ArangoDB기반 벤치마킹 시스템 설계 및 구현)

  • Choi, Do-Jin;Baek, Yeon-Hee;Lee, So-Min;Kim, Yun-A;Kim, Nam-Young;Choi, Jae-Young;Lee, Hyeon-Byeong;Lim, Jong-Tae;Bok, Kyoung-Soo;Song, Seok-Il;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.21 no.9 / pp.198-208 / 2021
  • ArangoDB is a NoSQL database system widely used in applications that store large amounts of data. To apply a new NoSQL database system such as ArangoDB to real work environments, we need a benchmarking system that can evaluate its performance. In this paper, we design and implement an ArangoDB-based benchmarking system that measures kernel-level performance as well as application-level performance. We partially modify YCSB to measure the performance of a NoSQL database system in a cluster environment. We also define three real-world workload types by analyzing existing materials. We demonstrate the feasibility of the proposed system by benchmarking the three workload types. We derive the workloads available in ArangoDB and show that performance at the kernel layer as well as the application layer can be visualized through this benchmarking. We expect that benchmarking with this system will make applicability and risk reviews possible in environments that need to migrate data from an existing database engine to ArangoDB.

Image Clustering using Improved Neural Network Algorithm (개선된 신경망 알고리즘을 이용한 영상 클러스터링)

  • 박상성;이만희;유헌우;문호석;장동식
    • Journal of Institute of Control, Robotics and Systems / v.10 no.7 / pp.597-603 / 2004
  • When retrieving from a large database of image data, clustering is essential for fast retrieval. However, it is difficult to cluster a large number of images adequately. Moreover, current similarity-based retrieval methods offer uncertain retrieval accuracy and take a long time. In this paper, we propose an image retrieval system that incorporates the Fuzzy ART neural network algorithm to remedy these shortcomings. The system uses color and texture as the features required for retrieval and normalizes each of them. We adopt Fuzzy ART as the neural network that receives the normalized input vector, and we propose an improved Fuzzy ART algorithm. An implementation tested with 200 images shows a retrieval ratio of approximately 83%.
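
As a rough illustration of the clustering step described in this abstract, the following is a minimal sketch of the standard Fuzzy ART procedure (complement coding, choice function, vigilance test, weight update). It is not the authors' improved variant, which the abstract does not specify. The function names, the parameter defaults (`rho`, `alpha`, `beta`), and the two-dimensional feature vectors are assumptions chosen for illustration.

```python
def complement_code(x):
    """Complement coding: append (1 - x_i) so every input has the same L1 norm."""
    return list(x) + [1.0 - v for v in x]


def fuzzy_and(a, b):
    """Component-wise minimum (the fuzzy AND operator)."""
    return [min(p, q) for p, q in zip(a, b)]


def fuzzy_art(samples, rho=0.75, alpha=0.001, beta=1.0):
    """Cluster normalized feature vectors (values in [0, 1]) with standard Fuzzy ART.

    rho   -- vigilance: higher values produce more, tighter categories
    alpha -- small choice parameter that breaks ties toward larger weights
    beta  -- learning rate (1.0 is "fast learning")
    Returns (labels, weights).
    """
    weights = []  # one weight vector per category
    labels = []
    for x in samples:
        i = complement_code(x)

        def choice(j):
            # Choice function T_j = |I ^ w_j| / (alpha + |w_j|)
            return sum(fuzzy_and(i, weights[j])) / (alpha + sum(weights[j]))

        chosen = None
        # Try categories in order of decreasing choice value.
        for j in sorted(range(len(weights)), key=choice, reverse=True):
            matched = fuzzy_and(i, weights[j])
            # Vigilance test: |I ^ w_j| / |I| >= rho
            if sum(matched) / sum(i) >= rho:
                weights[j] = [beta * m + (1.0 - beta) * w
                              for m, w in zip(matched, weights[j])]
                chosen = j
                break
        if chosen is None:
            # No category passed the vigilance test: create a new one.
            weights.append(i)
            chosen = len(weights) - 1
        labels.append(chosen)
    return labels, weights


# Hypothetical normalized (color, texture) features for four images.
features = [[0.10, 0.10], [0.12, 0.10], [0.90, 0.90], [0.88, 0.92]]
labels, weights = fuzzy_art(features)
print(labels)
```

Raising `rho` toward 1.0 makes the vigilance test stricter and splits the data into more categories, which is the main tuning knob in this sketch.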

Web 2.0 Cluster based Process and Performance Management System Modeling (Web 2.0 Cluster 기반의 공정 및 성과관리 시스템 모델 구축)

  • Ahn, Jae-Gyu;Ong, Ho-Kyoung;Kim, Dae-Young
    • Proceedings of the Korean Institute Of Construction Engineering and Management / 2007.11a / pp.892-898 / 2007
  • This study aims to implement an efficient process management system for small and medium-sized (local) construction companies and a performance management system for the Korean construction industry. The process management system, which follows Lean Construction, is built on a Web 2.0 platform and forms clusters of numerous general contractors and subcontractors, enabling mutually organic process management. In addition, the system lets them compare and analyze project performance during or after a project by collecting and accumulating the large volume of data generated over the course of the project. These performance management cases will help in process planning for similar upcoming projects. This study is expected to reduce somewhat the burden that Korean small and medium-sized (local) construction companies face when implementing a complicated process management protocol and system for web-based process management, and to enable accurate performance management using highly reliable data accumulated in the database.

Parallel Data Mining with Distributed Frequent Pattern Trees (분산형 FP트리를 활용한 병렬 데이터 마이닝)

  • 조두산;김동승
    • Proceedings of the IEEK Conference / 2003.07c / pp.2561-2564 / 2003
  • Data mining is an effective method for discovering useful information, such as rules and previously unknown patterns, in large databases. The discovery of association rules is an important data mining problem. We have developed a new parallel mining algorithm, called Distributed Frequent Pattern Tree (DFPT), that detects association rules on a distributed shared-nothing parallel system. The DFPT algorithm is devised for parallel execution of the FP-growth algorithm. Because it eliminates the need to generate candidate items, it requires only two full disk scans of the database. We achieved good workload balancing throughout the mining process by distributing the work equally to all processors. We implemented the algorithm on a PC cluster system and observed that it outperformed the Improved Count Distribution scheme.
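
The "two scans" property mentioned in this abstract comes from the way FP-growth builds its tree. The sketch below shows that construction on a single node: the first pass counts item supports, and the second pass inserts each transaction's frequent items into a prefix tree. It does not reproduce the distributed DFPT scheme or its load balancing, which the abstract does not detail. The names `FPNode` and `build_fp_tree` and the sample transactions are hypothetical.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, its count, and its children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree with exactly two passes over the transactions."""
    # Scan 1: count in how many transactions each item appears.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item: c for item, c in counts.items() if c >= min_support}

    root = FPNode(None)
    # Scan 2: insert the frequent items of each transaction, ordered by
    # descending support (ties broken by item name), so that common
    # prefixes share nodes.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


# Hypothetical market-basket data.
baskets = [["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]]
tree, supports = build_fp_tree(baskets, min_support=2)
print(supports)
```

No candidate itemsets are generated at any point; frequent patterns are later mined from the tree itself, which is why the database is read only twice.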

Microbial Diversity Information Facility: Bacteriology Insight Orienting System (BIOS)

  • Shimura, Junko;Shimizu, Hideyuki;Tsuruwaka, Keiji;Moritani, Yukimitsu;Miyazaki, Kenji;Tsugita, Akira;Watanabe, Makoto M.
    • Proceedings of the Korean Society for Applied Microbiology Conference / 2000.04a / pp.135-141 / 2000
  • Global biodiversity is a common interest of humankind for better health and the sustainable development of society. To provide access to and analysis of microbial diversity information, the Bacteriology Insight Orienting System (BIOS) has been developed. As of March 2000, BIOS contains 6402 species and subspecies names of bacteria and archaea and 2606 names of cyanobacteria. The web-based analytical tool of BIOS provides windows for comparing the results of phylogenetic analysis based on 16S rDNA sequences with the results of cluster analysis on proteome profiling. Sequence data and two-dimensional gel electrophoresis analysis data were accumulated in the BIOS database for cyanobacteria reclassification and taxonomy. (BIOS URL: http://www-sp2000ao.nies.go.jp/bios/index.html).

A Comparative Study of Social Network Tools for Analysing Chinese Elites

  • Lee, HeeJeong Jasmine;Kim, In
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.10 / pp.3571-3587 / 2021
  • To accurately analyse and forecast the social networks of China's political, economic and social power elites, it is necessary to develop a database that collates their information. The development of such a database involves three stages: data definition, data collection and data quality maintenance. The present study recommends distinctive solutions for overcoming the challenges that occur in existing comparable databases. We used organizational and event factors to identify the Chinese power elites to be included in the database, and used their memberships, social relations and interactions in combination with flows data collection methodologies to determine the associations between them. The system can be used to determine the optimal relationship path (i.e., the shortest path) to reach a target elite and to identify the most important power elite in a social network (e.g., by degree, closeness and eigenvector centrality) or a community (e.g., a clique or a cluster). We used three social network analysis tools (i.e., R, UCINET and NetMiner) to find the important nodes in the network and compared the centrality rankings produced by each tool. We found that the three tools provide slightly different centrality results. This is because different tools use different algorithms, and even within the same tool there are various libraries that provide the same functionality (i.e., ggraph, igraph and sna in R, which provide different functions to calculate centrality). As the results may not be the same (i.e., the centrality rankings indicating the most important nodes can vary), we recommend a comparison test using different tools to get accurate results.
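
To make the measures named in this abstract concrete, the following sketch computes shortest-path distances, degree centrality and closeness centrality on a small undirected, unweighted graph. It is an illustrative implementation under one common normalization, not the code of R, UCINET or NetMiner; normalization conventions differ between implementations, which is one plausible source of the ranking differences the abstract reports. The graph and the function names are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_centrality(graph):
    """Number of neighbours divided by the maximum possible (n - 1)."""
    n = len(graph)
    return {u: len(nbrs) / (n - 1) for u, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable node count divided by the sum of distances to those nodes."""
    result = {}
    for u in graph:
        dist = bfs_distances(graph, u)
        total = sum(dist.values())
        result[u] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


# Hypothetical star-shaped network: "a" is tied to everyone else.
network = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
print(degree_centrality(network))
print(closeness_centrality(network))
```

Running the same graph through a second implementation and comparing the rankings, rather than the raw scores, is the kind of cross-check the abstract recommends.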

An Intelligent Agent System using Multi-View Information Fusion (다각도 정보융합 방법을 이용한 지능형 에이전트 시스템)

  • Rhee, Hyun-Sook
    • Journal of the Korea Society of Computer and Information / v.19 no.12 / pp.11-19 / 2014
  • In this paper, we design an intelligent agent system whose core components are a data mining module and an information fusion module, and we investigate its potential as a medical expert system. In the data mining module, the fuzzy neural network OFUN-NET analyzes multi-view data and produces a fuzzy cluster knowledge base. The information fusion module and the application module provide the diagnosis result with a possibility degree, along with information useful for diagnosis, such as uncertainty decision status or detection of asymmetry. We also present experimental results on a BI-RADS-based feature data set selected from the DDSM benchmark database. The results show higher classification accuracy than conventional methods and demonstrate the feasibility of the system as a computer-aided diagnosis system.

Hierarchical Browsing Interface for Geo-Referenced Photo Database (위치 정보를 갖는 사진집합의 계층적 탐색 인터페이스)

  • Lee, Seung-Hoon;Lee, Kang-Hoon
    • Journal of the Korea Computer Graphics Society / v.16 no.4 / pp.25-33 / 2010
  • With the popularization of digital photography, people are now capturing and storing far more photos than ever before. However, the enormous number of photos often discourages users from identifying the photos they want. In this paper, we present a novel method for fast and intuitive browsing through large collections of geo-referenced photographs. Given a set of photos, we construct a hierarchical structure of clusters such that each cluster includes a set of spatially adjacent photos and its sub-clusters divide the photo set disjointly. For each cluster, we pre-compute its convex hull and the corresponding polygon area. At run-time, this pre-computed data allows us to efficiently visualize only the fraction of the clusters that are inside the current view and have easily recognizable sizes with respect to the current zoom level. Each cluster is displayed as a single polygon representing its convex hull instead of every photo location included in the cluster. Users can quickly move from cluster to cluster by simply selecting any cluster of interest. Our system automatically pans and zooms the view until the currently selected cluster fits into the view at a moderate size. Our user study demonstrates that these new visualization and interaction techniques can significantly improve the capability of navigating large collections of geo-referenced photos.
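
The per-cluster pre-computation described in this abstract (convex hull plus polygon area) can be sketched as follows. This version uses the monotone chain algorithm for the hull and the shoelace formula for the area; the abstract does not say which algorithms the authors used, so both are assumptions. Photo locations are treated as planar (x, y) pairs, which ignores map projection, and the sample coordinates are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone chain convex hull; returns vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(polygon):
    """Area of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    for k in range(len(polygon)):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % len(polygon)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical photo locations in one cluster; (2, 1) lies inside the hull.
photos = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
hull = convex_hull(photos)
print(hull, polygon_area(hull))
```

With the area stored per cluster, a viewer can skip clusters whose on-screen size at the current zoom level would be too small or too large to recognize, without touching the individual photo locations.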

Implementation of Data processing of the High Availability for Software Architecture of the Cloud Computing (클라우드 서비스를 위한 고가용성 대용량 데이터 처리 아키텍쳐)

  • Lee, Byoung-Yup;Park, Junho;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.13 no.2 / pp.32-43 / 2013
  • These days, a growing number of IT research institutions foresee cloud services as the predominant IT service of the near future, and some leading IT vendors already provide actual cloud services. Regardless of the physical location of the service and the system environment, a cloud service can provide users with storage services and the use of data and software. On the other hand, cloud services face challenges as well. Although a cloud service has an edge in that IT resources can be used freely without being confined by hardware, availability remains a problem to be solved. Hence, this paper addresses these issues by examining the prerequisites of cloud computing for distributed file systems, the open-source Hadoop distributed file system, in-memory database technology, and high-availability database systems. The paper also fleshes out a high-availability architecture for large-scale distributed data management from the perspective of cloud services, using distributed file systems currently used in the cloud computing market.

Distributed File Systems Architectures of the Large Data for Cloud Data Services (클라우드 데이터 서비스를 위한 대용량 데이터 처리 분산 파일 아키텍처 설계)

  • Lee, Byoung-Yup;Park, Jun-Ho;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.12 no.2 / pp.30-39 / 2012
  • These days, some IT vendors have already entered the cloud computing market, and they are expanding their territory in that market on the basis of their hardware and software technology and through collaboration between hardware and software vendors. The distributed file system is a key technology for cloud computing, which must maintain performance and safety for high-level service requests as well as for data storage. This paper introduces distributed file systems for cloud computing and explains how they are used with technologies such as memory databases, the Hadoop file system, and high-availability database systems. It then defines, as a reference, a very large distributed processing architecture for each kind of distributed file system, using technologies found in the current cloud computing market.