• Title/Summary/Keyword: Big Data Processing Technology

A Study of Big Data Information Systems Building and Cases (빅데이터 정보시스템의 구축 및 사례에 관한 연구)

  • Lee, Choong Kwon
    • Smart Media Journal / v.4 no.3 / pp.56-61 / 2015
  • Although many successful big data cases have been reported, building big data information systems is still difficult. From the perspective of technology, builders need to understand the whole process of systems development, ranging from collecting, storing, processing, and analyzing data to presenting and using information. From the perspective of business, by contrast, builders need to understand the value of the proposed big data project and explain it to the top managers who have to decide on the risky investment. This study proposes a 5W1H framework that can help builders understand the issues related to the development of big data information systems. In addition, real-world big data cases are illustrated by applying the framework. The framework is expected to help builders understand and manage big data projects and to lead managers to make better decisions about investing in the development of information systems.

A Study on Structural Holes of Privacy Protection for Life Logging Service as analyzing/processing of Big-Data (빅데이터 분석/처리에 따른 생활밀착형 서비스의 프라이버시 보호 측면에서의 구조혈 연구)

  • Kang, Jang-Mook;Song, You-Jin
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.14 no.1 / pp.189-193 / 2014
  • SNS (Social Network Service) has evolved into a life-friendly service through its combination with local services. Unlike existing mobile services, life-friendly services are expected to be personalized by gathering local information, location information, and social network service information. Gathering these various kinds of information requires big data technology and cloud technology. Effective algorithms for this have already been researched; however, privacy protection models have not been researched enough for life-friendly services or for environments that use big data. In this paper, the privacy issue that arises from the big data technology of life-friendly services is dealt with in terms of the 'structural hole'.

PPNC: Privacy Preserving Scheme for Random Linear Network Coding in Smart Grid

  • He, Shiming;Zeng, Weini;Xie, Kun;Yang, Hongming;Lai, Mingyong;Su, Xin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.3 / pp.1510-1532 / 2017
  • In smart grids, privacy implications for individuals and their families are an important issue because of fine-grained usage data collection. Many utility companies use wireless communications to obtain information. Network coding is exploited in smart grids to enhance network performance in terms of throughput, delay, robustness, and energy consumption. However, random linear network coding introduces a new challenge for privacy preservation because of the encoding of data and the updating of coefficients in forwarder nodes. We propose a distributed privacy-preserving scheme for random linear network coding in smart grids that considers the converged-flow character of the smart grid and exploits a homomorphic encryption function to decrease the complexity in the forwarder node. It offers a data-confidentiality privacy-preserving feature, which can efficiently thwart traffic analysis. The data of the packet is encrypted, and the tag of the packet is encrypted by a homomorphic encryption function. The forwarder node applies random linear coding to the encrypted data and directly processes the ciphertext tags based on the homomorphism feature. Extensive security analysis and performance evaluations demonstrate the validity and efficiency of the proposed scheme.
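
The homomorphism idea mentioned in this abstract can be illustrated with a toy example. The sketch below is not the PPNC scheme: the function `tag`, the modulus `P`, and the base `G` are assumptions chosen only to show how a forwarder could combine protected values with the same coefficients it applies to the data, without seeing the data itself. The parameters are far too small to be secure.

```python
# Toy illustration of a multiplicatively homomorphic function under linear coding.
# P, G, and all function names are hypothetical; this is not the paper's scheme.
P = 2**61 - 1   # a Mersenne prime used as the modulus (illustrative size only)
G = 3           # assumed base for the toy function

def tag(x: int) -> int:
    """Toy homomorphic function H(x) = G^x mod P."""
    return pow(G, x, P)

def combine_data(coeffs, packets):
    """Linear combination of payloads; exponents are reduced mod P-1."""
    return sum(c * m for c, m in zip(coeffs, packets)) % (P - 1)

def combine_tags(coeffs, tags):
    """Combine tags without the payloads: product of tag_i^c_i mod P."""
    out = 1
    for c, t in zip(coeffs, tags):
        out = (out * pow(t, c, P)) % P
    return out

packets = [5, 11, 42]   # example payloads
coeffs = [2, 7, 3]      # example coding coefficients
# The tag of the coded payload equals the combination of the individual tags.
assert tag(combine_data(coeffs, packets)) == combine_tags(coeffs, [tag(m) for m in packets])
```

The equality holds because H(a·x + b·y) = H(x)^a · H(y)^b for this function, which is the kind of property that lets an intermediate node process protected values directly.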

Study of Efficient Algorithm for Deduplication of Complex Structure (복잡한 구조의 데이터 중복제거를 위한 효율적인 알고리즘 연구)

  • Lee, Hyeopgeon;Kim, Young-Woon;Kim, Ki-Young
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.14 no.1 / pp.29-36 / 2021
  • The amount of data generated has been growing exponentially, and the complexity of data has been increasing owing to the advancement of information technology (IT). Big data analysts and engineers have therefore been actively conducting research to minimize the analysis targets for faster processing and analysis of big data. Hadoop, which is widely used as a big data platform, provides various processing and analysis functions, including minimization of analysis targets through Hive, which is a subproject of Hadoop. However, Hive uses a vast amount of memory for data deduplication because it is implemented without considering the complexity of data. Therefore, an efficient algorithm has been proposed for data deduplication of complex structures. The performance evaluation results demonstrated that the proposed algorithm reduces the memory usage and data deduplication time by approximately 79% and 0.677%, respectively, compared to Hive. In the future, performance evaluation based on a large number of data nodes is required for a realistic verification of the proposed algorithm.
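
As a rough illustration of what deduplicating records with a complex (nested) structure involves, the sketch below hashes a canonical serialization of each record and keeps only the digests in memory. This is a generic technique, not the algorithm proposed in the paper, and the names `canonical_digest` and `deduplicate` are hypothetical.

```python
import hashlib
import json

def canonical_digest(record) -> str:
    """Hash a nested record after serializing it with sorted keys,
    so that key order does not affect the result."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def deduplicate(records):
    """Yield each structurally distinct record once, in first-seen order.
    Only fixed-size digests are stored, not the records themselves."""
    seen = set()
    for record in records:
        digest = canonical_digest(record)
        if digest not in seen:
            seen.add(digest)
            yield record

rows = [
    {"id": 1, "tags": ["a", "b"], "meta": {"x": 1, "y": 2}},
    {"meta": {"y": 2, "x": 1}, "tags": ["a", "b"], "id": 1},  # same content, different key order
    {"id": 2, "tags": []},
]
unique = list(deduplicate(rows))
assert unique == [rows[0], rows[2]]
```

Storing digests rather than whole records is one simple way to bound memory use; list order is treated as significant here, which is an assumption of this sketch.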

Advanced Resource Management with Access Control for Multitenant Hadoop

  • Won, Heesun;Nguyen, Minh Chau;Gil, Myeong-Seon;Moon, Yang-Sae
    • Journal of Communications and Networks / v.17 no.6 / pp.592-601 / 2015
  • Multitenancy has gained growing importance with the development and evolution of cloud computing technology. In a multitenant environment, multiple tenants with different demands can share a variety of computing resources (e.g., CPU, memory, storage, network, and data) within a single system, while each tenant remains logically isolated. This multitenancy concept offers enterprises that require similar environments for data processing and management highly efficient and cost-effective systems without wasting computing resources. In this paper, we propose a novel approach supporting multitenancy features for Apache Hadoop, a large-scale distributed system commonly used for processing big data. We first analyze the Hadoop framework, focusing on "yet another resource negotiator (YARN)", which is responsible for managing resources, application runtime, and access control in the latest version of Hadoop. We then define the problems for supporting multitenancy and formally derive the requirements to solve these problems. Based on these requirements, we design the details of multitenant Hadoop. We also present experimental results to validate the data access control and to evaluate the performance enhancement of multitenant Hadoop.

Real-time Processing of Manufacturing Facility Data based on Big Data for Smart-Factory (스마트팩토리를 위한 빅데이터 기반 실시간 제조설비 데이터 처리)

  • Hwang, Seung-Yeon;Shin, Dong-Jin;Kwak, Kwang-Jin;Kim, Jeong-Joon;Park, Jeong-Min
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.5 / pp.219-227 / 2019
  • Manufacturing methods have changed from labor-intensive methods to technology-intensive methods centered on manufacturing facilities. As manufacturing facilities replace human labor, the importance of monitoring and managing them is emphasized. In addition, big data technology has recently emerged as an important technology for discovering new value from limited data. These changes in the manufacturing industries have increased the need for smart factories that combine IoT, information and communication technologies, sensor data, and big data. In this paper, we present strategies for existing domestic manufacturing factories to become big data-based smart factories through technologies for the real-time distributed storage and processing of manufacturing facility data in MongoDB and for visualization using R programming.

Spatial Big Data Query Processing System Supporting SQL-based Query Language in Hadoop (Hadoop에서 SQL 기반 질의언어를 지원하는 공간 빅데이터 질의처리 시스템)

  • Joo, In-Hak
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.10 no.1 / pp.1-8 / 2017
  • In this paper, we present a spatial big data query processing system that can store spatial data in Hadoop and query the data with an SQL-based query language. The system stores large-scale spatial data in an HDFS-based storage system and supports spatial queries expressed in an SQL-based query language extended for spatial data processing. The query language supports the standard spatial data types and functions defined in the OGC simple feature model. This paper presents the development of the core functions of the system, including query language parsing, query validation, query planning, and connection with the storage system. We compare the performance of the proposed system with that of an existing system, and our experiments show that the proposed system achieves about a 58% improvement in query execution time over the existing system when executing region queries on spatial data stored in Hadoop.
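
To make the notion of a region query concrete, the sketch below filters point features by a rectangular region in plain Python. It is a minimal in-memory illustration, not the system described in the paper; the `Point` and `BBox` types and the `region_query` function are hypothetical, and points on the boundary are treated as inside.

```python
from typing import Iterable, List, NamedTuple

class Point(NamedTuple):
    name: str
    x: float
    y: float

class BBox(NamedTuple):
    min_x: float
    min_y: float
    max_x: float
    max_y: float

def region_query(points: Iterable[Point], box: BBox) -> List[Point]:
    """Return the points that fall inside the box (boundaries inclusive)."""
    return [
        p for p in points
        if box.min_x <= p.x <= box.max_x and box.min_y <= p.y <= box.max_y
    ]

points = [
    Point("a", 1.0, 1.0),
    Point("b", 5.0, 5.0),
    Point("c", 2.5, 0.5),
]
hits = region_query(points, BBox(0.0, 0.0, 3.0, 3.0))
assert [p.name for p in hits] == ["a", "c"]
```

A system working on large data sets would normally avoid scanning every point, for example by partitioning or indexing the data spatially; this sketch shows only the predicate being evaluated.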

Research on the Development of Big Data Analysis Tools for Engineering Education (공학교육 빅 데이터 분석 도구 개발 연구)

  • Kim, Younyoung;Kim, Jaehee
    • Journal of Engineering Education Research / v.26 no.4 / pp.22-35 / 2023
  • As information and communication technology has developed remarkably, it has become possible to analyze various types of large-volume data generated at close to real-time speed, and reliable value creation has become possible on that basis. Such big data analysis is becoming an important means of supporting decision-making based on scientific figures. The purpose of this study is to develop a big data analysis tool that can analyze the large amounts of data generated through engineering education. The tasks of this study are as follows. First, a database is designed to store the information of entries in the National Creative Capstone Design Contest. Second, the pre-processing required for analysis with the big data analysis tool is checked. Finally, the data are analyzed using the developed big data analysis tool. In this study, 1,784 works submitted to the National Creative Capstone Design Contest from 2014 to 2019 were analyzed. When the top 10 words were selected through topic analysis, 'robot' ranked first from 2014 to 2019, and energy, drones, ultrasound, solar energy, and IoT appeared with high frequency. This result seems to reflect the current core topics and technology trends of the 4th Industrial Revolution. In addition, it seems that, due to the nature of the Capstone Design Contest, students majoring in electrical/electronic, computer/information and communication engineering, mechanical engineering, and chemical/new materials engineering, who can submit complete products for problem solving, were selected. The significance of this study is that its results can be used in the field of engineering education as basic data for developing educational content and teaching methods that reflect industry and technology trends. Furthermore, it is expected that the results of big data analysis related to engineering education can be used as a means of preparing preemptive countermeasures when establishing education policies that reflect social changes.
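
Selecting the most frequent words from a set of submissions can be sketched as a simple word count. This is a simplification of the topic analysis mentioned in the abstract, not the tool the study developed; the function `top_words`, the tokenization rule, and the sample texts are all assumptions made for illustration.

```python
import re
from collections import Counter

def top_words(documents, n=10, stopwords=frozenset()):
    """Count lowercase alphabetic tokens across documents and
    return the n most frequent as (word, count) pairs."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts.most_common(n)

# Made-up sample titles, used only to exercise the function.
docs = [
    "Robot arm with IoT sensor",
    "Solar robot",
    "Drone and robot",
]
assert top_words(docs, n=1, stopwords=frozenset({"and", "with"})) == [("robot", 3)]
```

Real submissions would need language-specific tokenization (for example, for Korean text) and a curated stopword list before the counts become meaningful.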

A Meta Analysis of the Edible Insects (식용곤충 연구 메타 분석)

  • Yu, Ok-Kyeong;Jin, Chan-Yong;Nam, Soo-Tai;Lee, Hyun-Chang
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.182-183 / 2018
  • Big data analysis is the process of discovering meaningful correlations, patterns, and trends in large data sets stored in existing data warehouse management tools and of creating new value. In addition, it refers to technology that extracts new value from large volumes of structured and unstructured data and analyzes the results. Most big data analysis methods, such as data mining, machine learning, natural language processing, and pattern recognition, are those already used in statistics and computer science. Global research institutes have identified big data as the most notable new technology since 2011.


An Insight Study on Keyword of IoT Utilizing Big Data Analysis (빅데이터 분석을 활용한 사물인터넷 키워드에 관한 조망)

  • Nam, Soo-Tai;Kim, Do-Goan;Jin, Chan-Yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.146-147 / 2017
  • Big data analysis is a technique for effectively analyzing unstructured data, such as data from the Internet, social network services, web documents generated in the mobile environment, e-mail, and social data, as well as well-formed structured data in a database. Most big data analysis techniques are data mining, machine learning, natural language processing, and pattern recognition, which were already used in statistics and computer science. Global research institutes have identified the analysis of big data as the most noteworthy new technology since 2011. Therefore, companies in most industries are making efforts to create new value through the application of big data. In this study, we performed the analysis using Social Matrics, a big data analysis tool of Daum Communications. We analyzed public perceptions of the keyword "Internet of things" over the one-month period as of October 8, 2017. The results of the big data analysis are as follows. First, the top related search keyword for "Internet of things" was found to be "technology" (995). This study suggests theoretical implications based on the results.
