• Title/Summary/Keyword: HADOOP


Comparing Energy Efficiency of MPI and MapReduce on ARM based Cluster (ARM 클러스터에서 에너지 효율 향상을 위한 MPI와 MapReduce 모델 비교)

  • Maqbool, Jahanzeb;Rizki, Permata Nur;Oh, Sangyoon
    • Proceedings of the Korean Society of Computer Information Conference / 2014.01a / pp.9-13 / 2014
  • The performance of large-scale software applications has increased automatically over the last few decades under the influence of Moore's law, under which the number of transistors on a microprocessor roughly doubled every eighteen months. However, limits on on-chip transistors and heating issues led to the emergence of multicore processors. Energy-efficient ARM-based System-on-Chip (SoC) processors are being considered for future high-performance computing systems. In this paper, we present a case study of two widely used parallel programming models, MPI and MapReduce, on a distributed-memory cluster of ARM SoC development boards. The case study application, the Black-Scholes option pricing equation, was parallelized and evaluated in terms of power consumption and throughput. The results show that the Hadoop implementation has lower instantaneous power consumption than that of MPI, but MPI outperforms the Hadoop implementation by a factor of 1.46 in terms of the ratio of total power consumption to execution time.

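A minimal sketch of the kind of computation the entry above parallelizes, assuming the standard closed-form Black-Scholes price of a European call option. The abstract does not give the authors' code, so the function names (`black_scholes_call`, `price_option`) and the sample portfolio are hypothetical. Each option is priced independently, which is what makes the workload easy to split across MPI ranks or MapReduce map tasks; the built-in `map` here only stands in for that distributed map step.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot: float, strike: float, rate: float,
                       volatility: float, maturity: float) -> float:
    """Closed-form Black-Scholes price of a European call option."""
    sigma_sqrt_t = volatility * math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * volatility ** 2) * maturity) / sigma_sqrt_t
    d2 = d1 - sigma_sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def price_option(option: dict) -> tuple:
    """Map step (hypothetical): price one option, emit (id, price)."""
    price = black_scholes_call(option["spot"], option["strike"], option["rate"],
                               option["volatility"], option["maturity"])
    return option["id"], price


# Hypothetical sample portfolio; a real run would read many options from storage.
portfolio = [
    {"id": "A", "spot": 100.0, "strike": 100.0, "rate": 0.05,
     "volatility": 0.2, "maturity": 1.0},
    {"id": "B", "spot": 100.0, "strike": 120.0, "rate": 0.05,
     "volatility": 0.2, "maturity": 1.0},
]

# Options are independent, so this map could be distributed without coordination.
prices = dict(map(price_option, portfolio))
```

Because no option depends on another, the only communication a distributed version needs is collecting the resulting (id, price) pairs.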

Visualization Method of Social Networks Service using Message correlations based on Distributed Parallel Processing (메시지의 상관관계를 이용한 분산병렬처리 기반의 소셜 네트워크 서비스 시각화 방법)

  • Kim, Yong-Il;Park, Sun;Ryu, Gab-Sang
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.5 / pp.1168-1173 / 2013
  • This paper proposes a new cloud-based visualization method that uses the internal relationships of user correlations and the external relations of a social network to visualize the user relationship hierarchy. The method represents the user-focused relationship hierarchy on social networks well by means of a correlation matrix. The importance of an access node is reflected in the user relationship hierarchy by exploiting the external relations of the social network. Because the method represents the user relationship hierarchy extracted from social networks, its users can readily understand user relationships. In addition, the method uses Hadoop and Hive for distributed storage and parallel processing, and the computed results are visualized as a hierarchy graph using D3.

Marine Environment Monitoring System based Open Source (오픈소스 기반 해양환경 모니터링 시스템)

  • Park, Sun;Cha, ByungRae;Kim, Jongwon
    • Smart Media Journal / v.6 no.3 / pp.75-82 / 2017
  • Recently, marine monitoring technology has been actively studied, since the sea is a rich repository of natural resources that is attracting attention worldwide. In particular, marine environment data should be collected continuously in order to understand and analyze the marine environment; however, research on automatic monitoring of the marine environment in Korea is insufficient. In this paper, we propose a marine environment monitoring system based on open source. The proposed system can be designed as a scale-out system using a Hadoop-based time series database, so it can easily process the growing volume of collected data by scaling out computing resources. It can also be used to analyze marine data by visualizing the collected data.

A Study on Open API of Securities and Investment Companies in Korea for Activating Big Data

  • Ryu, Gui Yeol
    • International journal of advanced smart convergence / v.8 no.2 / pp.102-108 / 2019
  • Big data is associated with three key concepts: volume, variety, and velocity. Securities and investment services produce and store large amounts of text and numeric data. They also hold the most data per company on average in the US. Gartner found that the demand for big data in finance was 25%, which was the highest. Securities and investment companies therefore produce the largest amounts of data, such as text and numbers, and have the highest demand. In Korea, insurance companies and credit card companies use big data more actively than banks. Research on the use of big data in securities and investment companies has been found to be scarce. We surveyed 22 major securities and investment companies in Korea with the aim of activating big data. We found that they actively use AI for investment recommendations. As for the big data of securities and investment companies, we studied open APIs. Of the 22 major companies, only six offer open APIs. The user OS is 100% Windows, and the languages and tools used are mainly VB, C#, MFC, and Excel, all provided on Windows. Real-time analysis and decision making are difficult because developers cannot receive data directly using Hadoop, the big data platform. Development manuals are mainly provided on the Web, and only three companies provide them as files. Development documentation in file format is more convenient than the web type. To activate big data in the securities and investment fields, we found that these companies should support Linux, Java, and Python, and provide easy-to-view development manuals and videos, such as on YouTube.

Distributed Support Vector Machines for Localization on a Sensor Network (센서 네트워크에서 위치 측정을 위한 분산 지지 벡터 머신)

  • Moon, Sangook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.10a / pp.944-946 / 2014
  • Localization of a sensor network node using machine learning has recently been studied. The support vector machine algorithm is easy to implement in a high-level language that enables parallelism. In this paper, we implemented a support vector machine in Python and built a sensor network cluster with five Pi's. We also set up the Hadoop software framework to employ the MapReduce mechanism. We modified the existing support vector machine algorithm to fit the distributed Hadoop architecture for localization of a sensor node. In our experiment, we implemented the test sensor network with a variety of parameters and examined it in terms of proficiency, resource evaluation, and processing time.


Deep Learning-Based Smart Meter Wattage Prediction Analysis Platform

  • Jang, Seonghoon;Shin, Seung-Jung
    • International journal of advanced smart convergence / v.9 no.4 / pp.173-178 / 2020
  • With the fourth industrial revolution, in which people, objects, and information are connected as one, fields such as smart energy, smart cities, artificial intelligence, the Internet of Things, unmanned cars, and the robot industry are becoming mainstream, drawing attention to big data. Among them, the smart grid is a technology that maximizes energy efficiency by integrating information and communication technologies into the power grid, so that electricity usage, supply volume, and power line conditions can be known. Smart meters are equipment that monitors and communicates power usage. We start with the goal of building a virtual smart grid and constructing a virtual environment in which real-time data is generated, in order to accommodate large volumes of data that are individually small but regularly generated. A major part of the work is creating a software/hardware architecture deployment environment suitable for test operation of the system. It is necessary to identify the advantages and disadvantages of the software according to the characteristics of the collected data and to select sub-projects suitable for the purpose. The data was collected, loaded, processed, and analyzed by a big data platform based on the Hadoop ecosystem, and was used to predict power demand through deep learning.

Development of Software Education Support System using Learning Analysis Technique (학습분석 기법을 적용한 소프트웨어교육 지원 시스템 개발)

  • Jeon, In-seong;Song, Ki-Sang
    • Journal of The Korean Association of Information Education / v.24 no.2 / pp.157-165 / 2020
  • As interest in software education has increased, discussions on teaching, learning, and evaluation methods for it have also been active. One problem with teaching methods in software education is that the instructor cannot see, in real time, the code being written on the learner's computer, and is therefore limited in providing timely feedback to learners. To overcome this problem, in this study we developed a software education support system that applies a learning analysis technique to capture the learner's coding status in real time in a block-based programming environment and deliver it to the instructor, and that visualizes the data collected during learning through the Hadoop system. The system includes a presentation layer accessed by teachers and learners, a business layer that analyzes and structures code, and a DB layer that stores class information, account information, and learning information. The instructor can set the content to be learned in advance in the software education support system, and can compare and analyze learners' achievement through the computational thinking components rubric, based on data comparing the stored code with the students' code.

A Study on Procurement Audit Integration Real Time Monitoring System Using Process Mining Under Big Data Environment (빅 데이터 환경하에서 프로세스 마이닝을 이용한 구매 감사 통합 실시간 모니터링 시스템에 대한 연구)

  • Yoo, Young-Seok;Park, Han-Gyu;Back, Seung-Hoon;Hong, Sung-Chan
    • Journal of Internet Computing and Services / v.18 no.3 / pp.71-83 / 2017
  • In recent years, research that applies the strengths of process mining to the auditing work of business organizations has been actively pursued. On the other hand, there is insufficient research on the systematic and efficient analysis, using process mining, of the massive data generated in a big data environment, and on proactive monitoring of risk management from the audit side, which is one of the important management activities of a corporate organization. In this study, we intend to realize a Hadoop-based integrated real-time monitoring system for internal audit in order to detect abnormal symptoms and prevent accidents in advance. Through the integrated real-time monitoring system for purchasing audit, we intend to strengthen the delivery management of ordered materials, reduce purchasing costs, manage competing companies, prevent fraud, comply with regulations, and adhere to the internal control accounting system. As a result, by analyzing data efficiently using process mining on Hadoop-based systems, the enhanced integrated real-time monitoring of purchase audits can provide information that can be acted on immediately. From an integrated viewpoint, it becomes possible to manage the business status, and because a large amount of work is processed faster than with continuous monitoring, the quality of the purchase audit improves and innovation in the purchasing process follows.

Efficient Association Rule Mining based SON Algorithm for a Bigdata Platform (빅데이터 플랫폼을 위한 SON알고리즘 기반의 효과적인 연관 룰 마이닝)

  • Nguyen, Giang-Truong;Nguyen, Van-Quyet;Nguyen, Sinh-Ngoc;Kim, Kyungbaek
    • Journal of Digital Contents Society / v.18 no.8 / pp.1593-1601 / 2017
  • In a big data platform, association rule mining applications can bring benefits. For instance, in an agricultural big data platform, an association rule mining application could recommend specific products for farmers to grow, which could increase their income. The key process in association rule mining is frequent itemset mining, which finds sets of products that frequently occur together. Earlier approaches to this problem, e.g. Apriori, are not satisfactory because the huge number of possible sets can overload memory. To deal with this, the SON algorithm has been proposed, which divides the considered set into many smaller ones and handles them sequentially. On a single machine, however, the SON algorithm is very time consuming. In this paper, we present a method to find association rules in our Hadoop-based big data platform by parallelizing the SON algorithm. The entire association rule mining process, including pre-processing, SON-based frequent itemset mining, and association rule finding, is implemented on the Hadoop-based big data platform. Experiments with a real dataset confirm that the proposed method outperforms a brute-force method.
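The two-pass structure of SON can be sketched in a single process as follows. This is an illustration under stated assumptions, not the authors' Hadoop implementation: the function names, the `max_size` cap on itemset length, and the toy transactions are hypothetical, and the per-chunk step uses plain brute-force counting where a real system would typically run Apriori or a similar miner. The first pass finds itemsets that are frequent within at least one chunk (the candidates); the second pass counts those candidates over all data and keeps the ones that meet the global support threshold. Each chunk is processed independently in both passes, which is what allows the work to be spread over map tasks.

```python
import math
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(chunk, min_support, max_size=2):
    """Pass 1 (per chunk): itemsets whose support within this chunk
    reaches min_support, found by brute-force counting."""
    counts = Counter()
    for transaction in chunk:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    threshold = min_support * len(chunk)
    return {itemset for itemset, n in counts.items() if n >= threshold}


def son_frequent_itemsets(transactions, min_support, num_chunks=2, max_size=2):
    """Two-pass SON: local candidates first, then exact global counts."""
    chunk_size = max(1, math.ceil(len(transactions) / num_chunks))
    chunks = [transactions[i:i + chunk_size]
              for i in range(0, len(transactions), chunk_size)]

    # Pass 1: union of locally frequent itemsets. Any globally frequent
    # itemset must be frequent in at least one chunk, so none are missed.
    candidates = set()
    for chunk in chunks:
        candidates |= local_frequent_itemsets(chunk, min_support, max_size)

    # Pass 2: count every candidate over all chunks to drop false positives.
    totals = Counter()
    for chunk in chunks:
        for transaction in chunk:
            items = set(transaction)
            for candidate in candidates:
                if items.issuperset(candidate):
                    totals[candidate] += 1

    threshold = min_support * len(transactions)
    return {itemset: n for itemset, n in totals.items() if n >= threshold}


# Hypothetical toy data: each transaction is a list of items.
transactions = [
    ["milk", "bread"],
    ["milk", "eggs"],
    ["milk", "bread", "eggs"],
    ["bread"],
]
frequent = son_frequent_itemsets(transactions, min_support=0.5)
```

The second pass is needed because an itemset can look frequent inside one small chunk while being rare overall; only the exact global count decides.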

Performance Analysis of Distributed Parallel Processing Schemes for Large Data in Cloud Computing (클라우드 컴퓨팅에서의 대규모 데이터를 위한 분산 병렬 처리 기법의 성능분석)

  • Hong, Seung-Tae;Chang, Jae-Woo
    • Proceedings of the Korean Association of Geographic Information Studies Conference / 2010.09a / pp.111-118 / 2010
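  • Recently, in the IT field, research on cloud computing, which provides IT resources as services over the Internet, has been actively conducted. To provide efficient cloud computing, research on distributed data storage techniques and distributed parallel processing techniques, which store and manage massive amounts of data across numerous servers, is essential. To this end, this paper surveys representative distributed parallel processing techniques and compares and analyzes them. Finally, we build a Hadoop-based cluster and use it to evaluate the performance of distributed parallel processing techniques for large-scale data.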
