Big Data Processing Technology


An Efficient Implementation of Mobile Raspberry Pi Hadoop Clusters for Robust and Augmented Computing Performance

  • Srinivasan, Kathiravan;Chang, Chuan-Yu;Huang, Chao-Hsi;Chang, Min-Hao;Sharma, Anant;Ankur, Avinash
    • Journal of Information Processing Systems / v.14 no.4 / pp.989-1009 / 2018
  • Rapid advances in science and technology, with the exponential development of smart mobile devices, workstations, supercomputers, smart gadgets and network servers, have been witnessed over the past few years. The sudden increase in the Internet population and the manifold growth in Internet speeds have occasioned the generation of an enormous amount of data, now termed 'big data'. Given this scenario, storing data on local servers or a personal computer is an issue, which can be resolved by utilizing cloud computing. At present, several cloud computing service providers are available to resolve big data issues. This paper establishes a framework that builds Hadoop clusters on the new single-board computer (SBC) Mobile Raspberry Pi. These clusters offer facilities for storage as well as computing. Regular data centers require large amounts of energy for operation, need cooling equipment, and occupy prime real estate. This energy consumption and these physical space constraints can be addressed by employing Mobile Raspberry Pi Hadoop clusters, which provide a cost-effective, low-power, high-speed solution along with micro-data-center support for big data. Hadoop provides the required modules for the distributed processing of big data by deploying map-reduce programming approaches. In this work, the performance of SBC clusters and a single computer was compared. The experimental data show that the SBC clusters outperform a single computer by around 20%. Furthermore, the cluster's processing speed for large volumes of data can be enhanced by increasing the number of SBC nodes. Data storage is accomplished using the Hadoop Distributed File System (HDFS), which offers more flexibility and greater scalability than a single computer system.
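The map-reduce programming approach the clusters rely on can be sketched outside Hadoop itself; the plain-Python word-count below is a minimal illustration of the pattern (map each record to key-value pairs, shuffle by key, reduce by summing), not the paper's code.

```python
# Framework-free sketch of map-reduce: map emits (word, 1) pairs,
# the reduce phase groups them by key and sums the counts (Hadoop
# performs the grouping in its shuffle step across cluster nodes).
from collections import defaultdict

def map_phase(lines):
    """Emit (word, 1) for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Group pairs by key and sum the counts."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["big data needs big clusters", "raspberry pi clusters run hadoop"]
result = reduce_phase(map_phase(lines))
print(result["big"])       # 2
print(result["clusters"])  # 2
```

On a real Hadoop cluster the same two functions become the Mapper and Reducer, and HDFS supplies the input splits.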

A Keyword-Based Big Data Analysis for Individualized Health Activity: Focusing on Methodological Approach

  • Kim, Han-Byul;Bae, Geun-Pyo;Huh, Jun-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.540-543 / 2017
  • The emerging big data used across the 21st-century global digital economy will make it possible to solve some of the major issues in our society and economy. One of the main areas where big data can be quite useful is medicine and health. IT technology is used extensively in this area and is expected to expand its field of application further. However, there is still room for improvement in the use of big data, as it is difficult to search the unstructured data it contains and to compile statistics on them; this limits the wider application of big data. Depending on the data collection and analysis method, the results obtained from big data can vary, and some of them can be positive or negative, so it is essential that big data be handled adequately and appropriately for a given purpose. Therefore, in this study a big data set was constructed by applying a crawling technique for data mining and was analyzed with R. The data were also visualized for easier recognition, and this was effective in developing an individualized health plan from different angles.
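The keyword-statistics step that follows crawling can be sketched as below; the paper performs this analysis in R, so this Python version with a made-up HTML snippet and stopword list is only an equivalent illustration.

```python
# Hypothetical sketch: strip markup from a crawled page and count
# keyword frequencies, the raw material for keyword-based analysis.
import re
from collections import Counter

def extract_keywords(html, stopwords=frozenset({"the", "and", "of"})):
    text = re.sub(r"<[^>]+>", " ", html)        # drop HTML tags
    words = re.findall(r"[a-z]+", text.lower()) # tokenize to lowercase words
    return Counter(w for w in words if w not in stopwords)

page = "<p>Walking and jogging: jogging improves health, walking too.</p>"
freq = extract_keywords(page)
print(freq["walking"], freq["jogging"])  # 2 2
```

In R the equivalent step would typically use a text-mining package to build the term-frequency table before visualization.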

Integration of Cloud and Big Data Analytics for Future Smart Cities

  • Kang, Jungho;Park, Jong Hyuk
    • Journal of Information Processing Systems / v.15 no.6 / pp.1259-1264 / 2019
  • Nowadays, cloud computing and big data analytics are at the center of many industries' concerns as they seek the potential benefits of building future smart cities. The integration of cloud computing and big data analytics is the main driver of massive adoption in many organizations, as it avoids the potential complexities of on-premise big data systems. With these two technologies, the manufacturing industry, healthcare systems, education, academia, etc. are developing rapidly and will gain various benefits as they expand their domains. In this issue, we present a summary of 18 high-quality articles accepted after a rigorous review process in the field of cloud computing and big data analytics.

Adaptive Boundary Correction based Particle Swarm Optimization for Activity Recognition (사용자 행동인식을 위한 적응적 경계 보정기반 Particle Swarm Optimization 알고리즘)

  • Heo, Seonguk;Kwon, Yongjin;Kang, Kyuchang;Bae, Changseok
    • Proceedings of the Korea Information Processing Society Conference / 2012.11a / pp.1166-1169 / 2012
  • This paper proposes an algorithm for user activity recognition that supplements the boundary-based data classification of the conventional PSO (Particle Swarm Optimization) algorithm, using a correction based on vector-length comparison to address problems caused by the data collection environment. The conventional PSO algorithm creates a boundary from the minimum and maximum values of the data and classifies data using that boundary. However, when PSO is used for activity recognition, some activities fall outside the boundary, depending on the environment in which the activity data are collected, and therefore cannot be classified. To compensate for this, the vector lengths between out-of-boundary data and the representative data of each activity are computed, and the data are classified to the activity with the minimum length. Experimental results show that the improved method achieves average improvements over the conventional PSO method of 1% for sitting, 7% for walking, and 7% for standing.
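The corrective step described above can be sketched as follows: a sample falling outside every class boundary is assigned to the activity whose representative vector is nearest in Euclidean distance. The boundaries and representative vectors here are illustrative values, not the paper's learned parameters.

```python
# Sketch of boundary classification with a nearest-representative fallback.
import math

def in_boundary(sample, low, high):
    return all(l <= x <= h for x, l, h in zip(sample, low, high))

def classify(sample, boundaries, representatives):
    # boundaries: label -> (low, high) corner vectors
    # representatives: label -> representative feature vector
    for label, (low, high) in boundaries.items():
        if in_boundary(sample, low, high):
            return label
    # Correction: out-of-boundary sample goes to the nearest representative.
    return min(representatives,
               key=lambda lbl: math.dist(sample, representatives[lbl]))

boundaries = {"sit":  ((0.0, 0.0), (1.0, 1.0)),
              "walk": ((2.0, 2.0), (3.0, 3.0))}
representatives = {"sit": (0.5, 0.5), "walk": (2.5, 2.5)}

print(classify((0.2, 0.8), boundaries, representatives))  # sit (inside boundary)
print(classify((1.6, 1.7), boundaries, representatives))  # walk (nearest rep.)
```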

Big Data Processing Scheme of Distribution Environment (분산환경에서 빅 데이터 처리 기법)

  • Jeong, Yoon-Su;Han, Kun-Hee
    • Journal of Digital Convergence / v.12 no.6 / pp.311-316 / 2014
  • With the popularity of smartphones, the data stored in and accessed through social network services are increasing. Big data processing technology is one of the most important technologies in big data services, but its security remains immature. In this paper, we propose a hash-chain-based data processing technique that applies a double hash so that users can easily access data distributed across multiple servers in big data services. The proposed technique ties the kind, function, and characteristics of big data to a hash chain so that high-throughput data processing is supported. Furthermore, it performs access control for big data in distributed processing by binding data attribute information to connection information through the hash chain, addressing the security vulnerability to eavesdroppers that arises between tokens and data nodes.
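The hash-chain idea can be illustrated as below: each link hashes the previous digest together with a data block's attribute information, so tampering with any block breaks every later link. A double SHA-256 stands in for the paper's double-hash step; all names and the attribute format are illustrative assumptions.

```python
# Illustrative double-hash chain over data-block attribute strings.
import hashlib

def double_hash(data: bytes) -> bytes:
    # Double hash assumed here as SHA-256 applied twice.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def build_chain(blocks, seed=b"genesis"):
    digest, chain = double_hash(seed), []
    for attrs in blocks:
        # Each link depends on the previous digest and this block's attributes.
        digest = double_hash(digest + attrs.encode())
        chain.append(digest.hex())
    return chain

chain = build_chain(["type=log;node=a", "type=log;node=b"])
tampered = build_chain(["type=log;node=X", "type=log;node=b"])
print(chain[1] != tampered[1])  # True: a change in block 0 alters later links
```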

A Study on the Crime Prevention Smart System Based on Big Data Processing (빅데이터 처리 기반의 범죄 예방 스마트 시스템에 관한 연구)

  • Kim, Won
    • Journal of the Korea Convergence Society / v.11 no.11 / pp.75-80 / 2020
  • Since the Fourth Industrial Revolution, important technologies such as big data analysis, robotics, the Internet of Things, and artificial intelligence have been used in various fields. Generally speaking, big data technology is understood to consist of a gathering stage for enormous data, an analyzing and processing stage, and a distributing stage. Until now, crime records, one kind of useful big-sized data, have been utilized to obtain investigation information after crimes occur. If crime records are instead utilized to predict crimes, it is believed that the frequency of crime can be lowered by processing big-sized crime records in a big data framework. In this research, a design is proposed in which a smart system provides users of smart devices with crime occurrence probabilities by processing crime records with big data analysis. Specifically, the proposed system guides safer routes by displaying crime occurrence probabilities on a digital map on a smart device. Experimental results for a smart application covering a small local area show that its usefulness for crime prevention is quite good.
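One simple way to turn past records into per-cell occurrence probabilities for a map display is a frequency estimate; the sketch below makes that concrete with made-up grid-cell identifiers (the paper does not specify its estimator, so this is only an assumed baseline).

```python
# Hypothetical sketch: aggregate past incidents per map grid cell and
# report each cell's share of the total as an occurrence estimate.
from collections import Counter

def cell_probabilities(records):
    """records: iterable of grid-cell ids, one per past incident."""
    counts = Counter(records)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}

records = ["A1", "A1", "A1", "B2"]   # three incidents in A1, one in B2
probs = cell_probabilities(records)
print(probs["A1"])  # 0.75
```

A route planner could then prefer paths through cells with low estimated probability.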

Small Sample Face Recognition Algorithm Based on Novel Siamese Network

  • Zhang, Jianming;Jin, Xiaokang;Liu, Yukai;Sangaiah, Arun Kumar;Wang, Jin
    • Journal of Information Processing Systems / v.14 no.6 / pp.1464-1479 / 2018
  • In face recognition, the number of available training samples for a single category is sometimes insufficient, so the performance of models trained by convolutional neural networks is not ideal. A small-sample face recognition algorithm based on a novel Siamese network, which does not need rich samples for training, is proposed in this paper. The algorithm designs and realizes a new Siamese network model, SiameseFace1, which takes pairs of face images as inputs and maps them to a target space so that the $L_2$ norm distance in the target space can represent the semantic distance in the input space. The mapping is represented by the neural network in supervised learning. Moreover, a more lightweight Siamese network model, SiameseFace2, is designed to reduce the network parameters without losing accuracy. We also present a new method to generate training data and expand the number of training samples for a single category in the AR and Labeled Faces in the Wild (LFW) datasets, which improves the recognition accuracy of the models. Four loss functions are adopted to carry out experiments on the AR and LFW datasets. The results show that the contrastive loss function combined with the new Siamese network model in this paper can effectively improve the accuracy of face recognition.
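The contrastive loss on the $L_2$ distance between the two embeddings a Siamese network produces can be written out in a few lines: similar pairs (label 1) are pulled together, dissimilar pairs (label 0) are pushed beyond a margin. This is the standard formulation, shown with toy embeddings rather than the paper's trained network.

```python
# Contrastive loss: y*d^2 + (1-y)*max(0, margin - d)^2 on the L2 distance d.
import math

def contrastive_loss(emb_a, emb_b, y, margin=1.0):
    d = math.dist(emb_a, emb_b)  # L2 distance in the target space
    return y * d ** 2 + (1 - y) * max(0.0, margin - d) ** 2

same = contrastive_loss((0.1, 0.2), (0.1, 0.2), y=1)  # identical pair -> 0
diff = contrastive_loss((0.0, 0.0), (0.3, 0.4), y=0)  # d = 0.5 < margin
print(same)  # 0.0
print(diff)  # (1.0 - 0.5)**2 = 0.25
```

In training, this loss is minimized over many image pairs so that the learned target space separates identities by distance.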

A propose of Big data quality elements (빅 데이터의 품질 요소 제안)

  • Choi, Sang-Kyoon;Jeon, Soon-Cheon
    • Journal of Advanced Navigation Technology / v.17 no.1 / pp.9-15 / 2013
  • Big data has become a key engine of new value creation and problem solving as a more data-centric era begins in earnest. To take advantage of big data, this paper proposes quality elements for securing the quality of big data, defines the quality of each element, and argues for a per-element quality strategy. To this end, it draws on big data case studies, big data resource plans, and elements such as knowledge, analytical skills, and big data processing technology, and from these defines big data quality, its quality elements, and a quality strategy. By securing the quality of large amounts of data and reinterpreting those data, companies can strengthen their competitiveness and extract data for various strategies.

New Medical Image Fusion Approach with Coding Based on SCD in Wireless Sensor Network

  • Zhang, De-gan;Wang, Xiang;Song, Xiao-dong
    • Journal of Electrical Engineering and Technology / v.10 no.6 / pp.2384-2392 / 2015
  • The technical development and practical application of big data for health is a hot topic under the banner of big data, and big-data medical image fusion is one of its key problems. A new fusion approach with coding based on the Spherical Coordinate Domain (SCD) in a Wireless Sensor Network (WSN) for big-data medical images is proposed in this paper. In this approach, the three high-frequency coefficients in the wavelet domain of a medical image are pre-processed. This pre-processing strategy can reduce the redundancy ratio of big-data medical images. Firstly, the high-frequency coefficients are transformed to the spherical coordinate domain to reduce the correlation within the same scale. Then, a multi-scale model product (MSMP) is used to control the shrinkage function so that small wavelet coefficients and some noise are removed. The high-frequency parts in the spherical coordinate domain are coded by an improved SPIHT algorithm. Finally, based on the multi-scale edges of the medical image, it can be fused and reconstructed. Experimental results indicate that the novel approach is effective and very useful for the transmission of big-data medical images, especially in wireless environments.
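The first step, mapping the three high-frequency wavelet coefficients at one position to spherical coordinates, is a generic Cartesian-to-spherical transform: the three subband values become a 3D vector $(x, y, z)$ mapped to $(r, \theta, \phi)$, concentrating energy in the radius. The sketch below shows that transform and its inverse with toy values, not the paper's full coding pipeline.

```python
# Cartesian <-> spherical transform for a triple of wavelet coefficients.
import math

def to_spherical(x, y, z):
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r else 0.0  # polar angle
    phi = math.atan2(y, x)                  # azimuth
    return r, theta, phi

def to_cartesian(r, theta, phi):
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

r, th, ph = to_spherical(1.0, 2.0, 2.0)
print(round(r, 6))  # 3.0
x, y, z = to_cartesian(r, th, ph)
print(round(x, 6), round(y, 6), round(z, 6))  # 1.0 2.0 2.0 (round trip)
```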

Big Data Management Scheme using Property Information based on Cluster Group in adopt to Hadoop Environment (하둡 환경에 적합한 클러스터 그룹 기반 속성 정보를 이용한 빅 데이터 관리 기법)

  • Han, Kun-Hee;Jeong, Yoon-Su
    • Journal of Digital Convergence / v.13 no.9 / pp.235-242 / 2015
  • Social network technology has been increasing interest in big data services and their development. However, data stored on distributed servers rather than on a central server is not easy to find and extract. In this paper, we propose a big data management technique that minimizes the time needed to process requested information between the content server and the management server providing big data services. The proposed technique links in-group data by classifying data into groups according to the type, features, and characteristics of the big data and applying a hash chain to the attribute information. Furthermore, to improve the processing speed of data extracted from the distributed servers, multi-attribute index information, including the recording time of the generated data, is attached to the data for classification. Experimental results show that the average seek time improved by an average of 14.6% as the number of cluster groups increased, and the data processing time by keyword count was reduced by an average of 13%.
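The grouping idea can be sketched as an index keyed by hashed attribute information: a lookup then touches only the matching cluster group instead of scanning every distributed record. The structure, record identifiers, and attribute format below are illustrative assumptions, not the paper's implementation.

```python
# Illustrative attribute-hash index over cluster groups of records.
import hashlib
from collections import defaultdict

def attr_key(attrs: dict) -> str:
    # Canonicalize attributes so equal attribute sets hash identically.
    canon = ";".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(canon.encode()).hexdigest()[:8]

index = defaultdict(list)  # hashed attribute key -> cluster group of record ids

def insert(record_id, attrs):
    index[attr_key(attrs)].append(record_id)

def lookup(attrs):
    return index.get(attr_key(attrs), [])

insert("r1", {"type": "video", "feature": "hd"})
insert("r2", {"type": "video", "feature": "hd"})
insert("r3", {"type": "text", "feature": "utf8"})
print(lookup({"type": "video", "feature": "hd"}))  # ['r1', 'r2']
```

Chaining the per-group keys (each key hashed together with the previous one) would give the hash-chain linkage between groups that the abstract describes.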