• Title/Summary/Keyword: Map Reduce

The Method of Analyzing Firewall Log Data using MapReduce based on NoSQL (NoSQL기반의 MapReduce를 이용한 방화벽 로그 분석 기법)

  • Choi, Bomin;Kong, Jong-Hwan;Hong, Sung-Sam;Han, Myung-Mook
    • Journal of the Korea Institute of Information Security & Cryptology / v.23 no.4 / pp.667-677 / 2013
  • As a typical piece of network security equipment, the firewall is installed on most internal and external networks and handles a large volume of inbound and outbound packets. Analyzing the logs it stores can therefore provide important, fundamental data for network security research. However, as communications technology has developed and network speeds have increased, the amount of log data has grown into 'Massive Data' or 'Big Data'. Under this trend, analyzing log data with the traditional RDBMS database model has its limits. In this paper, through our method of analyzing firewall log data using MapReduce based on NoSQL, we find that introducing the NoSQL database model allows massive log data to be analyzed more effectively than with the traditional model. We demonstrate the excellent performance of NoSQL by comparing its data-processing performance with that of an existing RDBMS. The proposed method is also evaluated in experiments that detect three attack patterns, and it is shown to be highly effective.
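
As a rough illustration of the map/shuffle/reduce pattern this abstract refers to, the following minimal sketch counts denied packets per source address in plain Python. It is not the authors' method: the log format, the sample records, and all function names are assumptions made for the example, and no NoSQL store or Hadoop cluster is involved.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical log format (an assumption for this sketch):
# "timestamp src_ip dst_ip dst_port action"
SAMPLE_LOG = [
    "2013-01-01T00:00:01 10.0.0.5 192.168.1.10 22 DENY",
    "2013-01-01T00:00:02 10.0.0.5 192.168.1.10 23 DENY",
    "2013-01-01T00:00:03 10.0.0.7 192.168.1.11 80 ALLOW",
    "2013-01-01T00:00:04 10.0.0.5 192.168.1.10 25 DENY",
]


def map_denied(line):
    """Map phase: emit (src_ip, 1) for every denied packet."""
    _, src_ip, _, _, action = line.split()
    if action == "DENY":
        yield src_ip, 1


def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_count(key, values):
    """Reduce phase: sum the counts for one source address."""
    return key, sum(values)


def denied_counts(lines):
    """Run map, shuffle, and reduce over an iterable of log lines."""
    pairs = chain.from_iterable(map_denied(line) for line in lines)
    return dict(reduce_count(k, v) for k, v in shuffle(pairs).items())


counts = denied_counts(SAMPLE_LOG)
```

A source address with an unusually high denied count could then be flagged for closer inspection; which thresholds or attack patterns the paper actually uses is not stated in the abstract.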

Grid-based Index Generation and k-nearest-neighbor Join Query-processing Algorithm using MapReduce (맵리듀스를 이용한 그리드 기반 인덱스 생성 및 k-NN 조인 질의 처리 알고리즘)

  • Jang, Miyoung;Chang, Jae Woo
    • Journal of KIISE / v.42 no.11 / pp.1303-1313 / 2015
  • MapReduce provides high levels of system scalability and fault tolerance for large-scale data processing. A MapReduce-based k-nearest-neighbor (k-NN) join algorithm seeks to produce, for each point of one dataset, its k nearest neighbors from another dataset. The algorithm is considered important in big data analysis. However, the existing k-NN join query-processing algorithm suffers from a high index-construction cost that makes it unsuitable for processing big data. To solve this problem, we propose a new grid-based k-NN join query-processing algorithm. Our algorithm retrieves only the data neighboring a query cell and sends them to each MapReduce task, which reduces the overhead of data transmission and computation. Our performance analysis shows that our algorithm outperforms the existing scheme by up to seven-fold in terms of query-processing time, while also achieving a high degree of query-result accuracy.
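
The idea of restricting the search to data near a query cell can be sketched on a single machine as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: points are 2-D tuples, the grid is uniform with a caller-chosen `cell_size`, only the query cell and its eight surrounding cells are searched, and the function names are hypothetical. With that fixed 3x3 neighborhood the result is approximate whenever a true neighbor lies farther away.

```python
import heapq
import math
from collections import defaultdict


def cell_of(point, cell_size):
    """Return the (column, row) index of the grid cell containing point."""
    x, y = point
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def build_grid(points, cell_size):
    """Index points by grid cell."""
    grid = defaultdict(list)
    for p in points:
        grid[cell_of(p, cell_size)].append(p)
    return grid


def knn_join(r_points, s_points, k, cell_size):
    """For each point in r_points, find up to k nearest points of s_points,
    looking only at the query cell and its eight neighboring cells."""
    grid = build_grid(s_points, cell_size)
    result = {}
    for r in r_points:
        cx, cy = cell_of(r, cell_size)
        candidates = [
            s
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            for s in grid.get((cx + dx, cy + dy), [])
        ]
        result[r] = heapq.nsmallest(
            k, candidates, key=lambda s: math.dist(r, s)
        )
    return result
```

In a MapReduce setting, the per-cell candidate lists are what would be shipped to each task, so far fewer points travel over the network than in an all-pairs comparison.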

A Fast and Scalable Image Retrieval Algorithms by Leveraging Distributed Image Feature Extraction on MapReduce (MapReduce 기반 분산 이미지 특징점 추출을 활용한 빠르고 확장성 있는 이미지 검색 알고리즘)

  • Song, Hwan-Jun;Lee, Jin-Woo;Lee, Jae-Gil
    • Journal of KIISE / v.42 no.12 / pp.1474-1479 / 2015
  • With mobile devices showing marked improvement in performance in the age of the Internet of Things (IoT), there is demand for rapid processing of large amounts of multimedia big data. However, because research on image search has focused mainly on increasing accuracy despite environmental changes, progress on fast processing of high-resolution multimedia data queries has been slow. Hence, we suggest a new distributed image search algorithm that ensures both high accuracy and rapid response by using distributed image feature extraction based on MapReduce, and that solves the problem of memory scalability by using BIRCH indexing. In addition, we conducted experiments on the accuracy, processing time, and scalability of this algorithm to confirm its excellent performance.

Performance Analysis of Distributed Hadoop Systems (분산 하둡 시스템의 성능 비교 분석)

  • Bae, Byoung-Jin;Kim, Young-Joo;Kim, Young-Kuk
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.05a / pp.479-482 / 2014
  • Nowadays, open-source Hadoop systems are widely used to efficiently manage fast-growing big data. A Hadoop system consists of a distributed file system called HDFS (Hadoop Distributed File System) and a distributed parallel processing system called MapReduce. MapReduce reads and processes big data from HDFS, and then writes the processed results back to HDFS. The system structure behind this processing method differs according to the Hadoop version. Therefore, this paper presents performance analysis results for Hadoop systems. To this end, we devise a method for monitoring Hadoop systems and use it to measure the occurrence frequency of processes, threads, and variables generated within the Hadoop system itself. Using the measured results as analysis indicators, we help predict the internal performance of Hadoop systems.

A Fast Handoff between MAPs in Hierarchical Mobile IPv6 (HMIPv6에서의 고속 매크로 핸드오프 지원 방안)

  • Shin, Tea-Il;Mun, Ygung-Song
    • Journal of the Institute of Electronics Engineers of Korea TC / v.43 no.2 s.344 / pp.16-21 / 2006
  • The Internet Engineering Task Force (IETF) proposed Hierarchical Mobile IPv6 (HMIPv6) to support mobility efficiently. HMIPv6 was developed to reduce the signaling overhead and delay associated with Binding Updates in Mobile IPv6. However, HMIPv6 still needs further enhancement to support real-time applications, because it addresses only the latency incurred within a MAP. To provide seamless handoff, we propose a scheme that reduces latency when a Mobile Node changes its MAP. We also compare HMIPv6 with the proposed scheme using an analytical model.

Effects of the Mind Map for Emotional Labor and Burnout: A Survey of Nurses in Outpatient Departments of Cancer Hospitals (마인드맵이 감정노동과 소진에 미치는 효과: 암전문병원 외래간호사를 중심으로)

  • Lee, Jin A;Park, Seok Won;Kim, Kyeong Ji;Paik, Hyun Ok;Jeon, Eunyoung
    • Journal of Korean Academy of Nursing Administration / v.21 no.5 / pp.511-518 / 2015
  • Purpose: The purpose of this research was to develop a mind map for relief of emotional labor and burnout among nurses in outpatient departments of cancer hospitals and to evaluate its effect. Methods: We developed a mind map to reduce emotional labor and burnout. A quasi-experimental study with a nonequivalent control group pretest-posttest design was used. Data were collected from December 2012 to April 2013. Participants were 35 nurses working in the outpatient department of a cancer hospital. The experimental group participated in the mind map program biweekly for 10 weeks. Data were analyzed using the $\chi^2$-test, Mann-Whitney U test, paired t-test, and Wilcoxon signed-rank test with the SPSS 21.0 program. Results: The physical burnout and total burnout scores decreased significantly in the intervention group, which took the mind map program. Conclusion: Findings indicate that the mind map is an effective intervention to reduce burnout in outpatient department nurses.

Real-time Network Attack Pattern Analysis System using Snort Log on MapReduce Environment (MapReduce 환경에서 Snort 로그를 이용한 실시간 네트워크 공격패턴 분석 시스템)

  • Kang, Moon-Hwan;Jang, Jin-Su;Shin, Young-Sung;Chang, Jae-Woo
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.75-77 / 2017
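  • To respond quickly to security threats in the rapidly growing volume of network logs, companies analyze network security logs from firewalls, IDSs, and similar devices to identify those threats. Snort is one of the tools that collect network logs in order to respond to such threats. However, because security monitoring staff need a great deal of time to analyze the vast amount of security-related logs, there is a delay before monitoring results are reported and acted upon. To solve this problem, this paper proposes a real-time network attack pattern analysis system using Snort logs. Because the proposed system extracts and analyzes vast network logs using MapReduce distributed processing, which is effective for large-scale data processing, it can quickly recognize in real time whether a security threat has occurred.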

Design of Testbed for Agile Computing of MapReduce Applications by using Docker

  • Kang, Yunhee
    • International Journal of Contents / v.12 no.3 / pp.29-33 / 2016
  • Cloud computing makes extensive use of virtual machines that permit workloads, as well as resource usage, to be isolated from one another, and a hypervisor can be used to construct cloud computing infrastructure from virtual machines. However, the hypervisor has high resource usage when constructing virtual machines, which results in a waste of allocated resources when they are not active. Docker provides a more lightweight method to obtain agile computing resources, based on a container technique that handles this problem. In this study, we have chosen this specific tool due to the increasing popularity of MapReduce and cloud container technologies such as Docker. This study aims to automatically configure Twister workloads for container-driven clouds. Basically, this is the first attempt towards automatic configuration of Twister jobs on a container-based cloud platform VM for many workloads.

A study of MapReduce Algorithm for Bigdata (빅데이터 처리를 위한 맵리듀스 연구)

  • Kim, Man-Yun;Youn, Hee-Yong
    • Proceedings of the Korean Society of Computer Information Conference / 2014.07a / pp.341-342 / 2014
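  • With the explosive growth of data over the past ten years, we have entered the era of big data. In particular, as the development of social networks in recent years has increased the amount of data generated, Hadoop emerged as a system for processing it. With the advent of open-source Hadoop, anyone can now operate a system capable of processing large volumes of data that previously could not be stored or processed. Hadoop, a software framework for large-scale processing and analysis, is widely used as a representative technology of cloud computing. Hadoop is broadly divided into HDFS (Hadoop Distributed File System), which handles data storage, and MapReduce, which processes the data. This paper compares and analyzes the existing MapReduce and YARN, which is called the next-generation MapReduce, and presents the uses of MapReduce and ways to use it efficiently.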

GPS Data Partitioning Method for POI Extraction Based MapReduce (MapReduce 기반 POI를 추출하기 위한 GPS 데이터 분할 방법)

  • Oh, Joo-Seong;Jeon, Hye-Ji;Lee, Hye-Jin;Jeong, Min-A;Lee, Seong-Ro
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.1199-1201 / 2015
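  • Location-based services are used in many fields. To provide users with accurate information, large volumes of location data must be analyzed to extract and analyze POIs. In this paper, we use DBSCAN clustering as the method for extracting POIs and implement it in a MapReduce environment. We also propose a data partitioning method to improve the execution speed of the algorithm.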
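
The abstract does not describe the partitioning scheme, so the following is only one plausible sketch of how GPS points might be split for per-partition DBSCAN: a uniform latitude/longitude grid in which points within `eps_deg` of a cell border are copied into the adjacent cells, so that clusters spanning a border can later be merged. The grid layout, the parameter names `cell_deg` and `eps_deg`, and both functions are assumptions for illustration, and the sketch assumes `eps_deg` is smaller than `cell_deg`.

```python
import math
from collections import defaultdict


def partition_keys(lat, lon, cell_deg, eps_deg):
    """Return every grid cell whose eps-expanded extent contains the point.

    Assumes eps_deg < cell_deg, so checking the offsets -eps, 0, +eps
    reaches all affected cells.
    """
    keys = set()
    for dlat in (-eps_deg, 0.0, eps_deg):
        for dlon in (-eps_deg, 0.0, eps_deg):
            keys.add((
                math.floor((lat + dlat) / cell_deg),
                math.floor((lon + dlon) / cell_deg),
            ))
    return keys


def partition_points(points, cell_deg, eps_deg):
    """Map step: assign each (lat, lon) point to its partition(s)."""
    partitions = defaultdict(list)
    for lat, lon in points:
        for key in partition_keys(lat, lon, cell_deg, eps_deg):
            partitions[key].append((lat, lon))
    return dict(partitions)
```

Each partition could then be clustered independently in a reduce task, with the duplicated border points used to decide which local clusters belong together.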