• Title/Summary/Keyword: Data-intensive processing


LDBAS: Location-aware Data Block Allocation Strategy for HDFS-based Applications in the Cloud

  • Xu, Hua; Liu, Weiqing; Shu, Guansheng; Li, Jing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.204-226 / 2018
  • Big data processing applications have gradually been migrated into the cloud due to the advantages of cloud computing. The Hadoop Distributed File System (HDFS) is one of the fundamental support systems for big data processing on MapReduce-like frameworks such as Hadoop and Spark. Since HDFS is not aware of the co-location of virtual machines in the cloud, its default block allocation scheme does not fit cloud environments well, which manifests in two ways: data reliability loss and performance degradation. In this paper, we present a novel location-aware data block allocation strategy (LDBAS). LDBAS jointly optimizes data reliability and performance for upper-layer applications by allocating data blocks according to the locations and differing processing capacities of virtual nodes in the cloud. We apply LDBAS to two stages of data allocation in HDFS in the cloud (initial data allocation and data recovery) and design the corresponding algorithms. Finally, we implement LDBAS in an actual Hadoop cluster and evaluate its performance with the benchmark suite BigDataBench. The experimental results show that LDBAS guarantees the designed data reliability while reducing the job execution time of I/O-intensive applications in Hadoop by 8.9% on average, and up to 11.2%, compared with the original Hadoop in the cloud.
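
The following is a minimal, illustrative sketch (not the paper's actual LDBAS algorithm) of the core idea of location-aware replica placement: choosing replica holders so that no two replicas share a physical host, while preferring virtual nodes with higher processing capacity. Node names, host names, and capacities are hypothetical.

```python
# Illustrative location-aware replica placement, assuming co-location and
# capacity information is available for each virtual node.

from dataclasses import dataclass

@dataclass
class VirtualNode:
    name: str
    physical_host: str   # co-location information the default HDFS placer lacks
    capacity: float      # relative processing capacity of the virtual node

def place_replicas(nodes, replication=3):
    """Pick `replication` nodes on distinct physical hosts, capacity-first."""
    chosen, used_hosts = [], set()
    for node in sorted(nodes, key=lambda n: n.capacity, reverse=True):
        if node.physical_host in used_hosts:
            continue  # avoid the reliability loss of co-located replicas
        chosen.append(node)
        used_hosts.add(node.physical_host)
        if len(chosen) == replication:
            break
    return chosen

if __name__ == "__main__":
    cluster = [
        VirtualNode("vm-a1", "host-A", 1.0),
        VirtualNode("vm-a2", "host-A", 0.8),
        VirtualNode("vm-b1", "host-B", 0.9),
        VirtualNode("vm-c1", "host-C", 0.6),
    ]
    print([n.name for n in place_replicas(cluster)])  # ['vm-a1', 'vm-b1', 'vm-c1']
```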

Mutable Encryption for Oblivious Data Access in Cloud Storage

  • Ahmad, Mahmood; Hussain, Shujjat; Pervez, Zeeshan; Lee, Sungyoung; Chung, Tae Choong
    • Proceedings of the Korea Information Processing Society Conference / 2013.05a / pp.157-158 / 2013
  • Data privacy and access control policies are prime concerns in computer clouds when sensitive data are involved. Authorized access is ensured with the help of secret keys given to a range of valid users. Granting access is trivial, but revoking user access is tricky and compute-intensive: to revoke a user and render his data access ineffective, the data owner has to compute a new set of keys for the remaining valid users. This is impractical where user revocation is frequent. Time-based revocation is another way to deal with this issue, in which the data access key expires automatically; however, it rests on the strong assumption that the expiry time can be determined in advance. In this paper, we propose mutable encryption for oblivious data access in cloud storage, in which the access key becomes ineffective after a threshold number of accesses defined by the data owner. The novelty of the proposed solution lies in combining mutable encryption with oblivious data access.
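
As an illustration of the threshold-based key expiry described above, the sketch below re-encrypts the stored blob under a fresh key once an owner-defined number of reads is reached, so previously issued keys stop working. It is not the paper's mutable-encryption scheme and does not provide oblivious access; it assumes the third-party `cryptography` package, and the class and method names are hypothetical.

```python
# Toy model of an access key that becomes ineffective after a use threshold.
from cryptography.fernet import Fernet

class ThresholdAccessStore:
    """Data-owner side: the issued key stops working after `threshold` reads."""

    def __init__(self, plaintext: bytes, threshold: int):
        self._threshold = threshold
        self._uses = 0
        self._key = Fernet.generate_key()
        self._blob = Fernet(self._key).encrypt(plaintext)

    def issue_key(self) -> bytes:
        """Hand the currently valid key to an authorized user."""
        return self._key

    def read(self, key: bytes) -> bytes:
        """Serve the data; after `threshold` reads the blob is re-encrypted
        under a fresh key, so previously issued keys become ineffective."""
        data = Fernet(key).decrypt(self._blob)   # raises InvalidToken for stale keys
        self._uses += 1
        if self._uses >= self._threshold:
            self._key = Fernet.generate_key()
            self._blob = Fernet(self._key).encrypt(data)
            self._uses = 0                       # the fresh key gets its own budget
        return data

if __name__ == "__main__":
    store = ThresholdAccessStore(b"sensitive record", threshold=2)
    key = store.issue_key()
    store.read(key)
    store.read(key)            # second read triggers key rotation
    try:
        store.read(key)        # stale key: decryption now fails
    except Exception as exc:
        print("old key rejected:", type(exc).__name__)
```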

A Column-Aware Index Management Using Flash Memory for Read-Intensive Databases

  • Byun, Si-Woo; Jang, Seok-Woo
    • Journal of Information Processing Systems / v.11 no.3 / pp.389-405 / 2015
  • Most traditional database systems exploit a record-oriented model in which the attributes of a record are placed contiguously on a hard disk to achieve high-performance writes. However, for read-mostly data warehouse systems, the column-oriented database has become a more suitable model because of its superior read performance. Today, flash memory is widely recognized as the preferred storage medium for high-speed database systems. In this paper, we introduce a column-oriented database model based on flash memory and then propose a new column-aware flash indexing scheme for high-speed column-oriented data warehouse systems. Our index management scheme, which uses an enhanced B+-Tree, achieves superior search performance by indexing embedded segments and packing unused space in internal and leaf nodes. Based on the performance results of two test databases, we conclude that column-aware flash index management outperforms the traditional scheme in terms of mixed-operation throughput and response time.
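
The sketch below illustrates, with hypothetical data, why a column-oriented layout favors the read-mostly scans targeted by the proposed index: a scan over one attribute touches only that column instead of every record. It is not the paper's B+-Tree scheme.

```python
# Row store vs. column store for a single-attribute scan (illustrative only).

rows = [
    {"id": i, "price": i * 1.5, "qty": i % 7, "note": "x" * 20}
    for i in range(1000)
]

# Row-oriented: every attribute of a record is read even if only one is needed.
def sum_price_row_store(records):
    return sum(r["price"] for r in records)

# Column-oriented: attributes are stored contiguously per column, so a scan
# touches only the column it needs (better read locality on flash).
columns = {name: [r[name] for r in rows] for name in rows[0]}

def sum_price_column_store(cols):
    return sum(cols["price"])

assert sum_price_row_store(rows) == sum_price_column_store(columns)
```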

SLA-Aware Resource Management for Cloud based Multimedia Service

  • Hasan, Md. Sabbir; Islam, Md. Motaharul; Park, Jun Young; Huh, Eui-Nam
    • Proceedings of the Korea Information Processing Society Conference / 2013.05a / pp.171-174 / 2013
  • Virtualization technology has opened a new era in the fields of data-intensive, grid, and cloud computing. Today's data centers are smarter than ever by leveraging virtualization, and dynamic consolidation of virtual machines (VMs) allows efficient resource management through live migration of VMs across hosts. Moreover, each client typically has a service level agreement (SLA), which constrains the energy-performance trade-off, since aggressive consolidation may degrade performance beyond what was negotiated. In this paper, we propose a cloud-based CDN approach to VM allocation that aims to maximize client-level SLA satisfaction. Our experimental results demonstrate a significant improvement in SLA satisfaction.
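
A minimal sketch of an SLA-aware consolidation check in the spirit of the abstract: a VM is packed onto the fullest host that still stays under an SLA-derived utilization headroom. The headroom value, host names, and demand figures are hypothetical, not from the paper.

```python
# Illustrative SLA-aware VM placement decision (not the paper's algorithm).

def can_consolidate(host_util: float, vm_demand: float, sla_headroom: float = 0.8) -> bool:
    """Return True if adding the VM keeps the host under the SLA headroom."""
    return host_util + vm_demand <= sla_headroom

def pick_target(hosts, vm_demand):
    """Prefer the fullest host that still satisfies the SLA bound
    (packs VMs tightly to save energy without violating the agreement)."""
    feasible = [h for h in hosts if can_consolidate(h["util"], vm_demand)]
    return max(feasible, key=lambda h: h["util"], default=None)

if __name__ == "__main__":
    hosts = [{"name": "h1", "util": 0.55},
             {"name": "h2", "util": 0.75},
             {"name": "h3", "util": 0.30}]
    # h2 would exceed the 0.8 headroom with a 0.2 demand, so h1 is chosen.
    print(pick_target(hosts, vm_demand=0.2))
```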

Pig Face Recognition Using Deep Learning (딥러닝을 이용한 돼지 얼굴 인식)

  • MA, RUIHAN; Kim, Sang-Cheol
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.493-494 / 2022
  • The shift of livestock farming toward intensive production creates a rising need to recognize individual animals, such as cows and pigs, for high traceability. In this paper, we present a non-invasive, biometrics-based approach to pig face identification built on a deep-learning classification model. First, we build a ROS-based data collection block to collect face image data for 10 pigs. Second, we propose a preprocessing block that uses the SSIM metric to filter out collected images with high mutual similarity. Third, we employ an improved image classification model (ViT), pretrained and then fine-tuned, to recognize individual pig faces. Finally, the proposed method achieves an accuracy of about 98.66%.
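
The preprocessing step described above can be approximated as follows: drop frames whose SSIM against the most recently kept frame exceeds a threshold. This sketch uses scikit-image and a hypothetical threshold; it is not the authors' exact pipeline.

```python
# Illustrative SSIM-based near-duplicate filtering of collected frames.
import numpy as np
from skimage.metrics import structural_similarity as ssim

def filter_similar(frames, threshold=0.9):
    """Keep a frame only if its SSIM to the last kept frame is below threshold."""
    kept = []
    for frame in frames:
        if not kept:
            kept.append(frame)
            continue
        score = ssim(kept[-1], frame, data_range=frame.max() - frame.min())
        if score < threshold:
            kept.append(frame)
    return kept

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.random((64, 64))
    frames = [base, base + 0.001, rng.random((64, 64))]  # 2nd is a near-duplicate
    print(len(filter_similar(frames)))  # 2
```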

Method of preventing Pressure Ulcer and EMR data preprocess

  • Kim, Dowon; Kim, Minkyu; Kim, Yoon; Han, Seon-Sook; Heo, Jungwon; Choi, Hyun-Soo
    • Journal of the Korea Society of Computer and Information / v.27 no.12 / pp.69-76 / 2022
  • This paper proposes a method of refining and processing time-series data from the Medical Information Mart for Intensive Care (MIMIC-IV) v2.0 dataset. The significance of the processing method was validated through a machine-learning-based pressure ulcer early warning system trained on a dataset processed with the proposed method. The implemented system alerts medical staff 12 and 24 hours before a lesion occurs. In conjunction with the Electronic Medical Record (EMR) system, it informs the medical staff of a patient's risk of pressure ulcer development in real time to support clinical decisions and, further, enables efficient allocation of medical resources. Among several machine learning models, the GRU model showed the best performance, with an AUROC of 0.831 for the 12-hour horizon and 0.822 for the 24-hour horizon.
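
A minimal sketch of a GRU-based risk classifier over EMR time series, in the spirit of the model the paper evaluates. The feature count, hidden size, and single-horizon output head are assumptions for illustration, not details taken from the paper.

```python
# Illustrative GRU classifier for early-warning risk prediction (PyTorch).
import torch
import torch.nn as nn

class PressureUlcerGRU(nn.Module):
    def __init__(self, n_features: int, hidden_size: int = 64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)  # risk logit for one horizon

    def forward(self, x):            # x: (batch, time_steps, n_features)
        _, h_n = self.gru(x)         # h_n: (1, batch, hidden_size)
        return self.head(h_n[-1])    # (batch, 1) logit; sigmoid gives risk

if __name__ == "__main__":
    model = PressureUlcerGRU(n_features=20)
    vitals = torch.randn(8, 48, 20)           # 8 patients, 48 hourly steps
    risk = torch.sigmoid(model(vitals))
    print(risk.shape)                          # torch.Size([8, 1])
```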

Parallel LDPC Decoding on a Heterogeneous Platform using OpenCL

  • Hong, Jung-Hyun; Park, Joo-Yul; Chung, Ki-Seok
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.6 / pp.2648-2668 / 2016
  • Modern mobile devices are equipped with various accelerated processing units to handle computationally intensive applications; therefore, Open Computing Language (OpenCL) has been proposed to fully take advantage of the computational power in heterogeneous systems. This article introduces a parallel software decoder of Low Density Parity Check (LDPC) codes on an embedded heterogeneous platform using an OpenCL framework. The LDPC code is one of the most popular and strongest error correcting codes for mobile communication systems. Each step of LDPC decoding has different parallelization characteristics. In the proposed LDPC decoder, steps suitable for task-level parallelization are executed on the multi-core central processing unit (CPU), and steps suitable for data-level parallelization are processed by the graphics processing unit (GPU). To improve the performance of OpenCL kernels for LDPC decoding operations, explicit thread scheduling, vectorization, and effective data transfer techniques are applied. The proposed LDPC decoder achieves high performance and high power efficiency by using heterogeneous multi-core processors on a unified computing framework.
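
As a language-agnostic illustration of the per-check and per-variable steps that the paper parallelizes with OpenCL, the sketch below implements simple hard-decision bit-flipping decoding in NumPy. The paper's decoder and code parameters differ; the tiny parity-check matrix here is only an example.

```python
# Illustrative hard-decision bit-flipping decoding against a parity-check matrix.
import numpy as np

def bit_flip_decode(H, r, max_iters=20):
    """Decode received hard bits r against parity-check matrix H (mod-2)."""
    c = r.copy()
    for _ in range(max_iters):
        syndrome = H @ c % 2                  # check-node step (data parallel)
        if not syndrome.any():
            return c                          # all parity checks satisfied
        failures = H.T @ syndrome             # per-variable count of failed checks
        c[failures == failures.max()] ^= 1    # variable-node step: flip worst bits
    return c

if __name__ == "__main__":
    # (7,4) Hamming parity-check matrix as a small stand-in example
    H = np.array([[1, 1, 0, 1, 1, 0, 0],
                  [1, 0, 1, 1, 0, 1, 0],
                  [0, 1, 1, 1, 0, 0, 1]])
    codeword = np.zeros(7, dtype=int)          # the all-zero codeword is valid
    received = codeword.copy()
    received[2] ^= 1                           # single bit error
    print(bit_flip_decode(H, received))        # recovers the all-zero codeword
```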

Intensive Monitoring Survey of Nearby Galaxies

  • Choi, Changsu; Im, Myungshin; Sung, Hyun-Il
    • The Bulletin of The Korean Astronomical Society / v.40 no.1 / pp.79.1-79.1 / 2015
  • We describe our ongoing project, the Intensive Monitoring Survey of Nearby Galaxies. This survey is designed to study transients, such as supernovae (SNe), in nearby galaxies. Our targets are 50 UV-bright (MUV < -18.4) and nearby (d < 50 Mpc) galaxies selected from a GALEX catalog, whose star formation rates are higher than those of normal galaxies. The high star formation in these galaxies ensures that core-collapse supernova explosions occur more frequently in them than in normal galaxies. By monitoring them with a short cadence of a few hours, we expect to discover about 5 SNe per year. Most importantly, we hope to construct very early light curves in the rising phase for some of them, which will enable a better understanding of the physical properties of the progenitor stars and the explosion mechanism. To enable such high-cadence observations, we constructed a worldwide telescope network covering the northern and southern hemispheres and a wide range of longitudes (Korea, the US, Australia, Uzbekistan, and Spain). A data reduction pipeline and detection and classification algorithms are being developed for efficient processing of the data. Using the network of telescopes, we expect to observe not only SNe but also other transients such as GRBs, asteroids, variable AGNs, and gravitational-wave optical counterparts.
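
As a rough illustration of the detection step mentioned above, the sketch below flags pixels whose difference from a reference image exceeds a significance threshold. It is not the survey's actual pipeline; the threshold and the injected source are hypothetical.

```python
# Illustrative image-subtraction transient detection.
import numpy as np

def detect_transients(new_image, reference, n_sigma=5.0):
    """Return pixel coordinates whose residual exceeds n_sigma of the noise."""
    diff = new_image - reference
    sigma = np.std(diff)
    return np.argwhere(diff > n_sigma * sigma)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ref = rng.normal(100.0, 1.0, size=(128, 128))
    new = ref + rng.normal(0.0, 1.0, size=(128, 128))
    new[40, 80] += 50.0                        # injected point source
    print(detect_transients(new, ref))         # ~[[40, 80]]
```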


Low-Power Data Cache Architecture and Microarchitecture-level Management Policy for Multimedia Application (멀티미디어 응용을 위한 저전력 데이터 캐쉬 구조 및 마이크로 아키텍쳐 수준 관리기법)

  • Yang Hoon-Mo; Kim Cheong-Gil; Park Gi-Ho; Kim Shin-Dug
    • The KIPS Transactions:PartA / v.13A no.3 s.100 / pp.191-198 / 2006
  • Today's battery-operated portable consumer devices tend to integrate more and more multimedia processing capability. In such devices, multimedia systems-on-chip handle specific algorithms that require intensive processing and consume significant power. As a result, the power efficiency of multimedia processing devices is becoming increasingly important. In this paper, we propose a reconfigurable data cache architecture, in which data allocation is constrained with software support, and evaluate its performance and power efficiency. Compared with conventional cache architectures, power consumption is reduced significantly, while the miss rate of the proposed architecture remains very close to that of conventional caches. The reconfigurable data cache reduces power consumption by 33.2%, 53.3%, and 70.4% compared with direct-mapped, 2-way, and 4-way caches, respectively.
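
To make the cache comparison concrete, here is a minimal LRU set-associative cache simulator that reports miss rate for different associativities. The cache geometry and the access trace are hypothetical and unrelated to the paper's benchmarks or its reconfigurable design.

```python
# Illustrative miss-rate simulation for direct-mapped, 2-way, and 4-way caches.
from collections import OrderedDict

def miss_rate(trace, num_sets=64, ways=2, line_size=32):
    """Simulate an LRU set-associative cache over a list of byte addresses."""
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // line_size
        s = sets[block % num_sets]
        if block in s:
            s.move_to_end(block)          # refresh LRU position on a hit
        else:
            misses += 1
            if len(s) >= ways:
                s.popitem(last=False)     # evict the least-recently-used block
            s[block] = True
    return misses / len(trace)

if __name__ == "__main__":
    # cyclic access over 96 cache lines: conflicts heavily when direct-mapped,
    # but fits once two ways per set are available
    trace = [(i % 96) * 32 for i in range(10000)]
    for ways in (1, 2, 4):
        print(f"{ways}-way miss rate: {miss_rate(trace, ways=ways):.3f}")
```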

Classification of Terrestrial LiDAR Data through a Technique of Combining Heterogeneous Data (이기종 측량자료의 융합기법을 통한 지상 라이다 자료의 분류)

  • Kim, Dong-Moon; Kim, Seong-Hoon
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.9 / pp.4192-4198 / 2011
  • Terrestrial LiDAR is a high-precision positioning technique for monitoring the behavior and change of structures and natural slopes, but classifying the positioning data (surface vs. vegetation, or structure vs. vegetation) has depended on subjective, labor-intensive manual work. This causes problems, including low reliability of the classification and long operation hours, due to the varied surface characteristics of geographical and natural features. To solve these problems, this study developed a technique that uses the NDVI, a major index for monitoring surface change (including vegetation), to categorize land cover, fuses the result with the terrestrial LiDAR data, and classifies the points by category. Applying the developed technique showed a fusion accuracy of 94%, despite partial misclassification of 0.003% along the boundaries between categories. The technique took less time for data processing than the old manual approach and improved accuracy, increasing its utility across a range of fields.
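
A minimal sketch of the NDVI-based labeling idea: compute NDVI from co-registered near-infrared and red reflectance and label each LiDAR point as vegetation or surface by thresholding. The threshold and sample values are hypothetical, and this is not the authors' full fusion workflow.

```python
# Illustrative NDVI thresholding for labeling co-registered LiDAR points.
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index per point."""
    return (nir - red) / (nir + red + eps)

def classify_points(points_xyz, nir, red, veg_threshold=0.3):
    """Label each co-registered LiDAR point as vegetation or ground/structure."""
    labels = np.where(ndvi(nir, red) > veg_threshold, "vegetation", "surface")
    return list(zip(points_xyz, labels))

if __name__ == "__main__":
    pts = np.array([[0.0, 0.0, 1.2], [1.0, 0.5, 0.1]])
    nir = np.array([0.62, 0.30])   # reflectance sampled at each point
    red = np.array([0.10, 0.28])
    for p, label in classify_points(pts, nir, red):
        print(p, label)            # first point: vegetation; second: surface
```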