• Title/Summary/Keyword: HADOOP

Search Result 395, Processing Time 0.023 seconds

ICT Utilization for Optimization of SME Decision Making (중소기업 의사결정 최적화를 위한 ICT 활용 방안)

  • Park, Ji-Young;Kim, Kyung-Ihl
    • Journal of Convergence for Information Technology
    • /
    • v.8 no.1
    • /
    • pp.275-280
    • /
    • 2018
  • Companies are now rapidly entering the realm of the realtime economy named 'Now Economy'. 'Now Economy' features the measurement and assessment, accelerating speed of decision making about business. According to this, companies intend to change their disposition to be able to make quick and accurate decision by gathering informations rapidly and correctly, and then by processing that. Applications of ICT can be possible to change the new decision system of companies. In this thesis, the new decision system through amalgamations of BPMS, Mobile, Cloud Service, Hadoop, BI and AI is presented. It will be able to make decision quickly and accurately by collecting all information between the most efficiently managed process and formal and informal data inside company through this, and then by combining changes with situations outside company.

Design of a Platform for Collecting and Analyzing Agricultural Big Data (농업 빅데이터 수집 및 분석을 위한 플랫폼 설계)

  • Nguyen, Van-Quyet;Nguyen, Sinh Ngoc;Kim, Kyungbaek
    • Journal of Digital Contents Society
    • /
    • v.18 no.1
    • /
    • pp.149-158
    • /
    • 2017
  • Big data have been presenting us with exciting opportunities and challenges in economic development. For instance, in the agriculture sector, mixing up of various agricultural data (e.g., weather data, soil data, etc.), and subsequently analyzing these data deliver valuable and helpful information to farmers and agribusinesses. However, massive data in agriculture are generated in every minute through multiple kinds of devices and services such as sensors and agricultural web markets. It leads to the challenges of big data problem including data collection, data storage, and data analysis. Although some systems have been proposed to address this problem, they are still restricted either in the type of data, the type of storage, or the size of data they can handle. In this paper, we propose a novel design of a platform for collecting and analyzing agricultural big data. The proposed platform supports (1) multiple methods of collecting data from various data sources using Flume and MapReduce; (2) multiple choices of data storage including HDFS, HBase, and Hive; and (3) big data analysis modules with Spark and Hadoop.

Analysis of Network Log based on Hadoop (하둡 기반 네트워크 로그 시스템)

  • Kim, Jeong-Joon;Park, Jeong-Min;Chung, Sung-Taek
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.17 no.5
    • /
    • pp.125-130
    • /
    • 2017
  • Since field control equipment such as PLC has no function to log key event information in the log, it is difficult to analyze the accident. Therefore, it is necessary to secure information that can analyze when a cyber accident occurs by logging the main event information of the field control equipment such as PLC and IED. The protocol analyzer is required to analyze the field control device (the embedded device) communication protocol for event logging. However, the conventional analyzer, such as Wireshark is difficult to process the data identification and extraction of the large variety of protocols for event logging is difficult analysis of the payload data based and classification. In this paper, we developed a system for Big Data based on field control device communication protocol payload data extraction for event logging of large studies.

Trend analysis of Open Source Technologies for Cloud Storage Infrastructure (클라우드 스토리지 인프라 구축을 위한 오픈 소스 기술 동향)

  • Bae, Yu-Mi;Jung, Sung-Jae;Bae, Jung-Min;Park, Jeong-Su;Sung, Kyung
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2013.05a
    • /
    • pp.263-266
    • /
    • 2013
  • The universal cloud computing environment, the increase of mobile devices, and the emergence of various web-based services require large amounts of storage space. With the widespread use of Web-based storage services, such as Google Drive, Naver Ndrive, Daum Cloud, there is a need for more storage space. Therefore, storage areas can be provided according to the needs of users of virtualized storage resources through a network, and a large, easy to extend, and royalty in a specific geographical location, cloud storage may be the limelight. In this paper, find out about the features of open source software technology, Hadoop, Swift, GlusterFS for Cloud Storage infrastructure.

  • PDF

UX Analysis for Mobile Devices Using MapReduce on Distributed Data Processing Platform (MapReduce 분산 데이터처리 플랫폼에 기반한 모바일 디바이스 UX 분석)

  • Kim, Sungsook;Kim, Seonggyu
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.9
    • /
    • pp.589-594
    • /
    • 2013
  • As the concept of web characteristics represented by openness and mind sharing grows more and more popular, device log data generated by both users and developers have become increasingly complicated. For such reasons, a log data processing mechanism that automatically produces meaningful data set from large amount of log records have become necessary for mobile device UX(User eXperience) analysis. In this paper, we define the attributes of to-be-analyzed log data that reflect the characteristics of a mobile device and collect real log data from mobile device users. Along with the MapReduce programming paradigm in Hadoop platform, we have performed a mobile device User eXperience analysis in a distributed processing environment using the collected real log data. We have then demonstrated the effectiveness of the proposed analysis mechanism by applying the various combinations of Map and Reduce steps to produce a simple data schema from the large amount of complex log records.

Design of Building Biomertic Big Data System using the Mi Band and MongoDB (Mi Band와 MongoDB를 사용한 생체정보 빅데이터 시스템의 설계)

  • Lee, Younghun;Kim, Yongil
    • Smart Media Journal
    • /
    • v.5 no.4
    • /
    • pp.124-130
    • /
    • 2016
  • Big data technologies are increasing the need for big data in many areas of the world. Recently, the health care industry has become increasingly aware of the importance of disease and health care services, as it has become increasingly immune to prevention and health care. To do this, we need a Big data system to collect and analyze the personal biometric data. In this paper, we design the biometric big data system using low cost wearable device. We collect basic biometric data, such as heart rate, step count and physical activity from Mi Band, and store the collected biometric data into MongoDB. Based on the results of this study, it is possible to build a big data system that can be used in actual medical environment by using Hadoop etc. and to use it in real medical service in connection with various wearable devices for medical information.

Performance Evaluation of Medical Big Data Analysis based on RHadoop (RHadoop 기반 보건의료 빅데이터 분석의 성능 평가)

  • Ryu, Woo-Seok
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.13 no.1
    • /
    • pp.207-212
    • /
    • 2018
  • As a data analysis tool which is becoming popular in the Big Data era, R is rapidly expanding its user range by providing powerful statistical analysis and data visualization functions. Major advantage of R is its functional scalability based on open source, but its scale scalability is limited, resulting in performance degrades in large data processing. RHadoop, one of the extension packages to complement it, can improve data analysis performance as it supports Hadoop platform-based distributed processing of programs written in R. In this paper, we evaluate the validity of RHadoop by evaluating the performance improvement of RHadoop in real medical big data analysis. Performance evaluation of the analysis of the medical history information, which is provided by National Health Insurance Service, using R and RHadoop shows that RHadoop cluster composed of 8 data nodes can improve performance up to 8 times compared with R.

K Nearest Neighbor Joins for Big Data Processing based on Spark (Spark 기반 빅데이터 처리를 위한 K-최근접 이웃 연결)

  • JIAQI, JI;Chung, Yeongjee
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.9
    • /
    • pp.1731-1737
    • /
    • 2017
  • K Nearest Neighbor Join (KNN Join) is a simple yet effective method in machine learning. It is widely used in small dataset of the past time. As the number of data increases, it is infeasible to run this model on an actual application by a single machine due to memory and time restrictions. Nowadays a popular batch process model called MapReduce which can run on a cluster with a large number of computers is widely used for large-scale data processing. Hadoop is a framework to implement MapReduce, but its performance can be further improved by a new framework named Spark. In the present study, we will provide a KNN Join implement based on Spark. With the advantage of its in-memory calculation capability, it will be faster and more effective than Hadoop. In our experiments, we study the influence of different factors on running time and demonstrate robustness and efficiency of our approach.

Big Data Management Scheme using Property Information based on Cluster Group in adopt to Hadoop Environment (하둡 환경에 적합한 클러스터 그룹 기반 속성 정보를 이용한 빅 데이터 관리 기법)

  • Han, Kun-Hee;Jeong, Yoon-Su
    • Journal of Digital Convergence
    • /
    • v.13 no.9
    • /
    • pp.235-242
    • /
    • 2015
  • Social network technology has been increasing interest in the big data service and development. However, the data stored in the distributed server and not on the central server technology is easy enough to find and extract. In this paper, we propose a big data management techniques to minimize the processing time of information you want from the content server and the management server that provides big data services. The proposed method is to link the in-group data, classified data and groups according to the type, feature, characteristic of big data and the attribute information applied to a hash chain. Further, the data generated to extract the stored data in the distributed server to record time for improving the data index information processing speed of the data classification of the multi-attribute information imparted to the data. As experimental result, The average seek time of the data through the number of cluster groups was increased an average of 14.6% and the data processing time through the number of keywords was reduced an average of 13%.

Hierarchical Visualization of Cloud-Based Social Network Service Using Fuzzy (퍼지를 이용한 클라우드 기반의 소셜 네트워크 서비스 계층적 시각화)

  • Park, Sun;Kim, Yong-Il;Lee, Seong Ro
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38B no.7
    • /
    • pp.501-511
    • /
    • 2013
  • Recently, the visualization method of social network service have been only focusing on presentation of visualizing network data, which the methods do not consider an efficient processing speed and computational complexity for increasing at the ratio of arithmetical of a big data regarding social networks. This paper proposes a cloud based on visualization method to visualize a user focused hierarchy relationship between user's nodes on social network. The proposed method can intuitionally understand the user's social relationship since the method uses fuzzy to represent a hierarchical relationship of user nodes of social network. It also can easily identify a key role relationship of users on social network. In addition, the method uses hadoop and hive based on cloud for distributed parallel processing of visualization algorithm, which it can expedite the big data of social network.