• Title/Summary/Keyword: distributed in-memory platform (분산 인메모리 플랫폼)

A Comparative Analysis of Recursive Query Algorithm Implementations based on High Performance Distributed In-Memory Big Data Processing Platforms (대용량 데이터 처리를 위한 고속 분산 인메모리 플랫폼 기반 재귀적 질의 알고리즘들의 구현 및 비교분석)

  • Kang, Minseo;Kim, Jaesung;Lee, Jaegil
    • Journal of KIISE / v.43 no.6 / pp.621-626 / 2016
  • Recursive query algorithms are used in many social network services, e.g., for reachability queries in social networks. As social network services have evolved, the size of social network data has grown, and it has become almost impossible to run a recursive query algorithm on a single machine. To solve this problem, we implement recursive queries on two popular distributed in-memory platforms, Spark and Twister. We evaluate the performance of the two implementations using 50 machines on Amazon EC2 and two real-world data sets, LiveJournal and ClueWeb. The results show that the recursive query algorithm performs better on Spark for the LiveJournal data set, which has a relatively high average degree but fewer vertices. In contrast, the recursive query on Twister outperforms Spark for the ClueWeb data set, which has a relatively low average degree but many vertices.
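
A reachability query of the kind mentioned in this abstract can be sketched as an iterative frontier expansion. The following is a minimal single-machine sketch in plain Python, not the authors' Spark or Twister implementation; the function name `reachable` and the edge-list input format are assumptions made for illustration. Each pass of the loop corresponds to one round of the recursive rule, which a distributed platform would presumably run as one iteration over the partitioned edge set.

```python
from collections import defaultdict


def reachable(edges, source):
    """Return the set of vertices reachable from `source` along directed edges.

    `edges` is an iterable of (src, dst) pairs (a hypothetical input format).
    """
    # Build an adjacency list: vertex -> set of direct successors.
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)

    visited = {source}
    frontier = {source}
    # Expand only the newly discovered vertices in each round, so no vertex
    # is expanded twice. The loop ends when a round discovers nothing new.
    while frontier:
        next_frontier = set()
        for vertex in frontier:
            next_frontier |= adjacency.get(vertex, set())
        frontier = next_frontier - visited
        visited |= frontier
    return visited


edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
print(sorted(reachable(edges, 1)))  # [1, 2, 3]
```

The number of rounds is bounded by the longest shortest path from the source, while the work per round depends on how many edges leave the current frontier.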

Framework Implementation of Image-Based Indoor Localization System Using Parallel Distributed Computing (병렬 분산 처리를 이용한 영상 기반 실내 위치인식 시스템의 프레임워크 구현)

  • Kwon, Beom;Jeon, Donghyun;Kim, Jongyoo;Kim, Junghwan;Kim, Doyoung;Song, Hyewon;Lee, Sanghoon
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.11 / pp.1490-1501 / 2016
  • In this paper, we propose an image-based indoor localization system using parallel distributed computing. To reduce the computation time for indoor localization, the scale-invariant feature transform (SIFT) algorithm is run in parallel using Apache Spark. To this end, we propose a novel image processing interface for Apache Spark. The experimental results show that the proposed system is about 3.6 times faster than the conventional system.

Design and Implementation of a Search Engine based on Apache Spark (아파치 스파크 기반 검색엔진의 설계 및 구현)

  • Park, Ki-Sung;Choi, Jae-Hyun;Kim, Jong-Bae;Park, Jae-Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.1 / pp.17-28 / 2017
  • Research on data has recently become very active as the value of data has grown. Web crawlers, programs that collect data, have drawn attention because they can be applied in a variety of fields. A web crawler can be defined as a tool that traverses web servers in an automated manner, analyzes web pages, and collects URLs. For big data processing, distributed web crawlers based on Hadoop MapReduce are widely used, but they are difficult to use and have performance constraints. Apache Spark, an in-memory computing platform, is an alternative to MapReduce. A search engine, one of the main applications of a web crawler, displays the information gathered by the crawler that matches the search keywords. If search engines adopted a Spark-based web crawler instead of a traditional MapReduce-based one, data collection would be faster.
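
The crawler definition in this abstract (traverse pages, analyze them, collect URLs) can be sketched as a breadth-first traversal. The following is a minimal single-process sketch in Python using only the standard library; it is not the paper's Spark-based design. The names `LinkExtractor`, `extract_links`, `crawl`, and the in-memory `FAKE_WEB` dictionary are hypothetical stand-ins, with `FAKE_WEB.get` taking the place of a real HTTP fetch so that the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


def crawl(fetch, seed_url, max_pages=100):
    """Visit pages breadth-first from `seed_url`; return URLs in visit order.

    `fetch` is any callable mapping a URL to its HTML text, or None on failure.
    """
    seen = {seed_url}
    queue = deque([seed_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # skip pages that could not be fetched
        visited.append(url)
        for link in extract_links(html):
            if link not in seen:  # enqueue each URL at most once
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical in-memory "web" used in place of real HTTP requests.
FAKE_WEB = {
    "http://example.com/": '<a href="http://example.com/a">A</a>'
                           '<a href="http://example.com/b">B</a>',
    "http://example.com/a": '<a href="http://example.com/b">B</a>',
    "http://example.com/b": '<a href="http://example.com/">home</a>',
}
print(crawl(FAKE_WEB.get, "http://example.com/"))
```

In a distributed setting, the fetch-and-extract step for each batch of queued URLs is the part that would plausibly be parallelized across workers, with the `seen` set used to deduplicate the results between rounds.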